(Team 4) Design: Keyword match operator #31
Comments
Per our discussion in the lecture today, please do the following:
Please contact me to schedule a face-to-face meeting to discuss the details. Also, I wonder whether you are still interested in solving the problem of "finding documents similar to a document."
2. Modified DictionaryMatcher and KeyWordMatcher to use the methods in the Utils class.
[Issue #31] (Team 4) keyword matcher refactoring
[Issue #31] (Team 4) Enabling positional indexing in Lucene for TEXT type
@akshaybetala and @prakul: please finish the documentation and performance test. Thanks.
The initial performance numbers (time) for the index-based search operator were very high (https://github.com/TextDB/textdb/wiki/CS290-2016S-Task:-Keyword-Match-Operator). Here is @prakul's answer:
I still wonder why the overhead is so high. Pinging @sandeepreddy602 @rajesh9625 @inkudo @zuozhi for their thoughts.
These are the results I got when I ran DictionaryMatcher with the index-based search operator (PHRASEOPERATOR) on my machine.
Machine configuration: MacBook Pro, 2.7 GHz Intel Core i5, 8 GB 1867 MHz DDR3
Performance results for DictionaryMatcher with PHRASEOPERATOR:
Dictionary: {"medical","medication","medicare","medicaid"}
Lucene query time for me has always been under a second.
I just ran the query "medicine" with the keyword operator on 1 million records. The numbers I get are pretty normal:
So is "7192.9650 seconds" the real number or a typo? And are we talking about "Lucene Query time" here or the total time in general?
@rajesh9625: please include more interesting entities in the dictionary, with more variety and multi-keyword entities. @zuozhi: I think we are talking about total query time, since that's what a user experiences.
This task is done.
Team 4:
Please do the following:
Add @prakul to this issue.