-
Notifications
You must be signed in to change notification settings - Fork 346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query construction #91
Query construction #91
Conversation
Possible to construct term/boost/boolean queries
pyserini/search/pyquerybuilder.py
Outdated
|
||
|
||
def get_boolean_query_builder(): | ||
""" Get a BooleanQueryBuilder object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra space here, per PEP 8. and below.
pyserini/search/pyquerybuilder.py
Outdated
|
||
|
||
def get_clause_should(): | ||
""" Get a BooleanClause.Occur.SHOULD statement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8 - docstring comments should end in periods.
pyserini/search/pysearch.py
Outdated
@@ -72,14 +72,14 @@ class SimpleSearcher: | |||
def __init__(self, index_dir: str): | |||
self.object = JSimpleSearcher(JString(index_dir)) | |||
|
|||
def search(self, q: str, k: int=10, t: int=-1, | |||
def search(self, q, k: int=10, t: int=-1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can still provide a type hint here? using Union
?
pyserini/search/pyquerybuilder.py
Outdated
return JQueryGeneratorUtils.getBooleanClauseShould() | ||
|
||
|
||
def get_clause_must(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to implement an Enum for these? Like here? https://github.com/castorini/pyserini/blob/master/pyserini/pyclass.py#L103
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm. Wait, or maybe somehow bind them to singleton symbols?
Dunno...
tests/test_querybuilding.py
Outdated
hits1 = self.searcher.search(bq1) | ||
hits2 = self.searcher.search('information retrieval') | ||
|
||
self.assertEqual(hits1[0].docid, hits2[0].docid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you iterate through all the hits to check all scores and docids?
self.assertEqual(hits1[0].docid, hits2[0].docid) | ||
self.assertEqual(hits1[0].score, hits2[0].score) | ||
|
||
boost3 = pyquerybuilder.get_boost_query(term_query1, 2.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a test case where you assign both weight 2 - docid (in entire hits) should be the same, but scores different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think this is a lot cleaner!
Make it possible to construct queries in Pyserini