Searchcollection #114
pyserini/search/pysearch.py
Outdated
import re
import argparse
import sys
sys.path.insert(0,'./')
Is the sys.path.insert(0,'./') line necessary? I don't think so...
pyserini/search/pysearch.py
Outdated
def main(index, topics, output):
Move this down into __main__
pyserini/search/pysearch.py
Outdated
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Create a ArcHydro schema')
"ArcHydro schema"??
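For illustration, a parser with a description that matches what the script does might look like the sketch below. The description text and the --index and --output flag names are assumptions, inferred from the main(index, topics, output) signature; only --topics is mentioned in this review. The explicit argument list is there only to keep the sketch runnable on its own.

```python
import argparse

# Description and flag names are assumptions for illustration, not the PR's final wording.
parser = argparse.ArgumentParser(
    description='Search an index with a set of topics and write a TREC run file.')
parser.add_argument('--index', required=True, help='path to the index')
parser.add_argument('--topics', required=True, help='topic id or path to a topics file')
parser.add_argument('--output', required=True, help='path of the run file to write')

# A real script would call parser.parse_args() with no arguments under the
# `if __name__ == "__main__":` guard; the list below stands in for sys.argv.
args = parser.parse_args(['--index', 'my-index', '--topics', 'robust04', '--output', 'run.txt'])
print(args.index, args.topics, args.output)
```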
pyserini/search/pysearch.py
Outdated
with open(topics, 'r') as content_file:
    content = content_file.read()
result = re.findall(r'(?<=<top>)(.+?)(?=<desc>)', content, flags=re.S)
You can load topics directly: https://github.com/castorini/pyserini/blob/master/tests/test_loadtopics.py
So allow something like --topics robust04?
Should I keep allowing a file path for --topics as well?
Interpret it as a topic id first, per above; if that lookup fails, interpret it as a file path.
got it. thanks
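A minimal sketch of the id-then-path fallback discussed in this thread. It does not call pyserini: fake_get_topics and fake_read_topics_file are hypothetical stand-ins for a built-in topic lookup and a topics-file parser, and resolve_topics is an illustrative name, not something in the PR. The sketch treats both None and an empty result as "not a known id", since the thread does not say which one the lookup returns.

```python
import os


def fake_get_topics(name):
    """Hypothetical stand-in for a built-in topic lookup; returns None for unknown ids."""
    known = {'robust04': {'301': {'title': 'example query'}}}
    return known.get(name)


def fake_read_topics_file(path):
    """Hypothetical stand-in for a topics-file parser: one query per non-empty line."""
    with open(path) as f:
        return {str(i): {'title': line.strip()}
                for i, line in enumerate(f, start=1) if line.strip()}


def resolve_topics(spec, get_builtin=fake_get_topics, read_file=fake_read_topics_file):
    """Try `spec` as a topic id first; if that fails, fall back to a file path."""
    topics = get_builtin(spec)
    if topics:  # None or an empty dict both mean "not a known id"
        return topics
    if os.path.isfile(spec):
        return read_file(spec)
    raise ValueError(f'{spec!r} is neither a known topic id nor a readable file')


print(resolve_topics('robust04'))
```

With this shape, --topics robust04 and --topics path/to/topics.txt go through the same code path, and an unknown value fails with a clear error instead of an empty run.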
pyserini/search/pysearch.py
Outdated
    parser = argparse.ArgumentParser(description='Create a ArcHydro schema')

def main(index, topics, output):
    searcher = SimpleSearcher(index)
I don't think we need an extra method definition? Just move the code down into __main__?
pyserini/search/pysearch.py
Outdated
searcher = SimpleSearcher(index)
my_list = []
if topics == 'robust04':
    topics_dic = get_topics('robust04')
I think you can just do something like get_topics(topics) and check for None?
pyserini/search/pysearch.py
Outdated
hits = searcher.search(search, 1000)
for i in range(0, len(hits)):
    my_list.append(f'{number} Q0 {hits[i].docid.strip()} {i + 1} {hits[i].score:.6f} Anserini')
with open(output, 'w') as f:
I think you can move the with to higher scope, you don't need to append to my_list, and just directly write out?
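Roughly what that could look like, as a sketch rather than the final code: the output file is opened once around the whole loop and each line is written as soon as its hit is ranked, so no intermediate list is needed. Hit and fake_search are hypothetical stand-ins for the searcher and its results, the topics dict is made-up sample data, and the logic is wrapped in a write_run function only so the example can run on its own.

```python
import os
import tempfile
from collections import namedtuple

# Hypothetical stand-in for a search hit; only docid and score are used here.
Hit = namedtuple('Hit', ['docid', 'score'])


def fake_search(query, k):
    """Hypothetical stand-in for searcher.search(query, k)."""
    return [Hit('DOC-A ', 2.5), Hit('DOC-B', 1.25)][:k]


def write_run(output_path, topics, search, k=1000, tag='Anserini'):
    """Write one TREC-style run line per hit, without buffering lines in a list."""
    with open(output_path, 'w') as out:  # opened once, outside the topic loop
        for topic_id, query in topics.items():
            for rank, hit in enumerate(search(query, k), start=1):
                out.write(f'{topic_id} Q0 {hit.docid.strip()} {rank} {hit.score:.6f} {tag}\n')


# Made-up sample topics, keyed by topic id.
topics = {'301': 'first example query', '302': 'second example query'}
run_path = os.path.join(tempfile.mkdtemp(), 'run.txt')
write_run(run_path, topics, fake_search)
```

Using enumerate(..., start=1) also replaces the range(0, len(hits)) indexing and the manual i + 1 for the rank.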
No description provided.