Elements additions #62
* Adds a new ast element called MalformedQuery, this will be used by invenio_search to interpret malformed queries into google-like searches. Signed-off-by: Panos Paparrigopoulos <panos.paparrigopoulos@cern.ch>
This is a breaking syntax change. We need approval from more services. cc @egabancho @tiborsimko
@@ -1,7 +1,8 @@
# -*- coding: utf-8 -*-
#
# This file is part of Invenio-Query-Parser.
# Copyright (C) 2014, 2016 CERN.
#
not needed
We should also check that the Elastic query builder works with the new AST element.
38e4f68 to d8bf50c
@Panos512 Please test the new element with all walkers and contrib/elasticsearch. Thanks
In principle I don't see any issue with including this, just a few questions
and as @jirikuncar mentioned
Legacy from SPIRES. The two symbols were equivalent there, and they are still equivalent in the current INSPIRE: https://inspirehep.net/search?ln=en&ln=en&p=find+j+%22Phys.Rev.Lett.%2C105*%22 and https://inspirehep.net/search?ln=en&ln=en&p=find+j+%22Phys.Rev.Lett.%2C105%23%22
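The equivalence described here can be sketched as a normalization step. This is a minimal, hypothetical example and not code from this PR: `normalize_wildcards` and `wildcard_match` are illustrative names, matching with the standard-library `fnmatch` module is an assumption made only to keep the sketch runnable, and the record value in the usage note is made up.

```python
import fnmatch


def normalize_wildcards(pattern):
    """Map the SPIRES-style '#' wildcard onto '*' (assumed equivalent)."""
    return pattern.replace("#", "*")


def wildcard_match(pattern, text):
    """Return True if text matches pattern once '#' is treated as '*'.

    fnmatch also gives '?' and '[...]' a special meaning, so this is
    only a rough stand-in for what a search backend would do.
    """
    return fnmatch.fnmatchcase(text, normalize_wildcards(pattern))
```

With this sketch, `wildcard_match("Phys.Rev.Lett.,105#", "Phys.Rev.Lett.,105,2010")` and the same call with a trailing `*` give the same answer, which is the behaviour the two INSPIRE URLs illustrate.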
For our case we treat
The problem with accepting wildcards everywhere in a word is that there are some particles that contain wildcard characters (e.g. and as @jirikuncar mentioned
Apart from
IMHO this type of string should be escaped using quotes to avoid ambiguity, i.e.
Tests are indeed passing, but you didn't add any tests to the walkers to check that your change is not breaking them, which is what we are asking for. If this is so INSPIRE/SPIRES-centric, maybe you should consider updating contrib/spires instead (just a thought).
I'll take a look into this. Indeed it could be done like this, and it's the optimal solution for us. But are we sure that users are going to escape words with special characters? Maybe we could whitelist all known particles and ignore special characters in them (I could resurrect #39 for this).
That's true, I will update the walkers + tests.
The thing is that we need this functionality in the
d8bf50c to 01f50d6
* Adds wildcards (`*`, `#`) support to queries. Signed-off-by: Panos Paparrigopoulos <panos.paparrigopoulos@cern.ch>
01f50d6 to 2a16737
@jirikuncar I think @Panos512 has fulfilled the requests, and thus this PR is in a good state to be merged.
@@ -182,3 +186,7 @@ class RegexValue(Leaf):

class EmptyQuery(Leaf):
    pass


class MalformedQuery(Leaf):
Do you still need MalformedQuery?
Ideally yes. This is then needed to implement: inspirehep/inspire-next#1182
In that case I don't see any test for it.
@@ -80,6 +81,24 @@ def visit(self, node, op):
    def visit(self, node):
        return node.value

    @visitor(WildcardQuery)
    def visit(self, node):
        def query(keyword):
I don't see tests for this part.
Panos has now left. Can you maybe guide us on concretely what type of tests we would need to write?
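One possible shape for such tests, as a minimal sketch and not the project's actual test suite: build each new node, run it through every walker, and assert on the output. Everything below is a stand-in. The `Leaf`, `WildcardQuery` and `MalformedQuery` classes only mimic the names in the diff (treating `WildcardQuery` as a leaf is an assumption), `ReprWalker` is hypothetical and dispatches by class name instead of the `@visitor` decorator, and the query strings are made up.

```python
class Leaf(object):
    """Stand-in for the AST leaf base class referenced in the diff."""

    def __init__(self, value=None):
        self.value = value


class WildcardQuery(Leaf):
    """Stand-in; assumed here to carry the raw wildcard pattern."""


class MalformedQuery(Leaf):
    """Stand-in; assumed here to carry the unparsed query text."""


class ReprWalker(object):
    """Hypothetical walker that renders each node as a string."""

    def visit(self, node):
        handler = getattr(self, "visit_" + type(node).__name__, None)
        if handler is None:
            # A walker that was not updated for a new node fails loudly.
            raise NotImplementedError(type(node).__name__)
        return handler(node)

    def visit_WildcardQuery(self, node):
        return "WildcardQuery(%r)" % (node.value,)

    def visit_MalformedQuery(self, node):
        return "MalformedQuery(%r)" % (node.value,)


def test_walker_handles_new_leaves():
    walker = ReprWalker()
    assert walker.visit(WildcardQuery("Phys.Rev.Lett.,105*")) == \
        "WildcardQuery('Phys.Rev.Lett.,105*')"
    assert walker.visit(MalformedQuery("find a ellis and")) == \
        "MalformedQuery('find a ellis and')"


def test_unknown_node_is_rejected():
    class Unknown(Leaf):
        pass

    try:
        ReprWalker().visit(Unknown("x"))
    except NotImplementedError:
        pass
    else:
        raise AssertionError("expected NotImplementedError")
```

The same two checks would be repeated per walker, so that adding a node without updating a walker shows up as a failing test.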
        def query(keyword):
            fields = self.get_fields_for_keyword(keyword, mode='p')
            if len(fields) > 1:
                res = Q('bool', should=[
return Q('bool', should=[
                        query=value,
                        default_field=k,
                        analyze_wildcard=True) for k in fields])
            else:
- else
+ return Q('query_string',
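The early-return shape suggested in these two comments can be sketched as follows. This is a hedged illustration, not the PR's code: it builds plain dicts in Elasticsearch query DSL form instead of calling `Q`, `build_wildcard_query` is a hypothetical name, and using the first field in the single-field branch (and assuming `fields` is never empty) is a guess, since that part of the diff is not shown.

```python
def build_wildcard_query(value, fields):
    """Build a query body for a wildcard value over one or more fields.

    Assumes `fields` is a non-empty list of field names.
    """
    def query_string(field):
        # One query_string clause restricted to a single field.
        return {
            "query_string": {
                "query": value,
                "default_field": field,
                "analyze_wildcard": True,
            }
        }

    if len(fields) > 1:
        # Several candidate fields: a match in any of them is enough.
        return {"bool": {"should": [query_string(f) for f in fields]}}
    # Single field: return the clause directly, no else branch needed.
    return query_string(fields[0])
```

Returning from each branch removes the intermediate `res` variable and the `else`, which is what the review comments ask for.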
Closing PR as it's been stale for a long time.