Fix removing tokens that appear in query file but not index file from query sets #15

Merged
ekzhu merged 4 commits into ekzhu:master from the fix_query branch
Aug 30, 2022

Conversation

innovate-invent
Contributor

I believe this is an effective fix, but I am not entirely sure what the consequences of using negative indices are.

Resolves #13

@ekzhu
Owner

ekzhu commented Jul 29, 2022

Thanks for the pull request. I think a more robust solution is to modify the similarity function to take the set sizes as additional arguments, so that the size used in the computation can differ from the number of tokens passed in. For example, we can use the actual query set size rather than the size of the subset of query tokens that exist in the index.
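A minimal sketch of that idea, for illustration only: the function name `jaccard_from_sizes` and the toy vocabulary are hypothetical and are not taken from this repository's code. It assumes the similarity is computed from an overlap count plus explicitly supplied set sizes, so that the caller can pass the full query size even after unknown tokens have been dropped.

```python
def jaccard_from_sizes(overlap, size_a, size_b):
    """Jaccard similarity from an overlap count and explicit set sizes.

    Hypothetical helper: the sizes are arguments, so they need not equal
    the number of tokens that were actually intersected.
    """
    union = size_a + size_b - overlap
    return overlap / union if union else 0.0


# Toy vocabulary built from the index file (assumed for illustration).
vocabulary = {"a": 0, "b": 1, "c": 2}
indexed_set = {vocabulary[t] for t in ["a", "b", "c"]}

# The query contains two tokens ("x", "y") that never appear in the index.
query_tokens = ["a", "b", "x", "y"]
known = {vocabulary[t] for t in query_tokens if t in vocabulary}
overlap = len(known & indexed_set)

# Using the size of the filtered query overstates the similarity.
biased = jaccard_from_sizes(overlap, len(known), len(indexed_set))

# Using the actual query size accounts for the dropped tokens.
corrected = jaccard_from_sizes(overlap, len(set(query_tokens)), len(indexed_set))

print(biased, corrected)
```

Here the filtered query yields 2/3, while passing the real query size of 4 yields 2/5, which matches the Jaccard similarity of the original token sets.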

@ekzhu
Owner

ekzhu commented Jul 29, 2022

I made the required changes. Can you help me verify that they are correct by adding a unit test for your scenario? Thanks!

@innovate-invent
Contributor Author

I ran the test on master and on this branch: it fails on master and passes here.

@ekzhu ekzhu merged commit 1a3eadd into ekzhu:master Aug 30, 2022
@innovate-invent innovate-invent deleted the fix_query branch August 30, 2022 19:10
Development

Successfully merging this pull request may close these issues.

Incorrect output during Query