Update Ranking and Index Documentation #19

Open
xeoncross opened this issue Mar 18, 2022 · 1 comment
Labels
documentation Improvements or additions to documentation

Comments

@xeoncross

The Index File Format and Search Result Ranking documents are not defined very clearly.

Could these two critical documents be expanded? For example, the domain IDF-TF score calculation is shown, but it's not clear whether that score is part of the index file format or stored elsewhere. It appears that each data record might be grouped by domain, with the IDF-TF score stored in the record header, but that doesn't make sense because the score is supposed to be per term.
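To illustrate why a single score per domain header seems odd, here is a minimal sketch of the textbook TF-IDF calculation, which produces one score per (document, term) pair. This is the generic formula, not necessarily what this project computes; the function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return {doc_id: {term: score}} using the textbook tf * log(N / df).

    Assumption: this is the generic formula, not this project's ranking code.
    """
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))
    scores = {}
    for doc_id, terms in docs.items():
        tf = Counter(terms)
        scores[doc_id] = {
            term: (count / len(terms)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
    return scores

# Hypothetical sample data, only for illustration.
docs = {
    "example.com/a": ["search", "engine", "index"],
    "example.com/b": ["search", "ranking"],
}
scores = tf_idf(docs)
```

Every term in every document gets its own value here, so a single score stored once per domain would have to be something else, such as an aggregate.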

Likewise, it's not clear whether the 8 * n bytes of keys represent the document terms that are indexed at each position in the data record. Because the index header contains these 64-bit integer (int64) pointers to locations inside the data record, I assume there must be many index files: they look immutable, since constructing one requires knowing the contents of all data records up front.
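One possible reading of that layout, sketched in Python to make the question concrete. Everything here is an assumption rather than the documented format: a leading 8-byte count, then n 8-byte keys, then n 8-byte offsets pointing into the data section, all little-endian. The names `build_index` and `read_header` are hypothetical.

```python
import struct

def build_index(records):
    """Serialize {uint64 key: payload bytes} into one blob.

    Assumed layout (not the documented format):
    count (8 bytes) | n keys (8 * n bytes) | n offsets (8 * n bytes) | payloads
    """
    keys = sorted(records)
    n = len(keys)
    header_size = 8 + 16 * n
    offsets, body, pos = [], b"", header_size
    for key in keys:
        # Each offset depends on the sizes of all earlier payloads.
        offsets.append(pos)
        body += records[key]
        pos += len(records[key])
    header = struct.pack(f"<Q{n}Q{n}Q", n, *keys, *offsets)
    return header + body

def read_header(blob):
    """Return {key: absolute offset of its payload} from the assumed header."""
    (n,) = struct.unpack_from("<Q", blob, 0)
    keys = struct.unpack_from(f"<{n}Q", blob, 8)
    offsets = struct.unpack_from(f"<{n}Q", blob, 8 + 8 * n)
    return dict(zip(keys, offsets))

blob = build_index({7: b"abc", 3: b"de"})
header = read_header(blob)
```

Under this reading the offsets can only be written once every payload size is known, which is why the files look write-once and why adding data would mean producing a new file.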

@joscul joscul added the documentation Improvements or additions to documentation label Mar 25, 2022
@joscul
Collaborator

joscul commented Mar 25, 2022

Yes, the documentation for the index file format and result ranking is not accurate. I will go through them.

@joscul joscul self-assigned this Mar 25, 2022