Shard/Index introduce versioning #1833

Closed
6 tasks done
etiennedi opened this issue Feb 24, 2022 · 3 comments
Labels
autoclosed: Closed by the bot. We still want this, but it didn't quite make the latest prioritization round

Comments

@etiennedi
Member

etiennedi commented Feb 24, 2022

  • Anything that had no versioning before is read as version 1
  • Anything written with the new index is stored as version 2
  • Future index changes will increment the number further
  • This way we can fall back to the old mechanisms after a breaking change (such as switching doc ids from LittleEndian to BigEndian)
  • When the version is 1, keep reading and writing with little endian in all places; if the version is at least 2, use big endian
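
The version-to-endianness rule above can be sketched roughly as follows. This is a minimal illustration, not the actual Weaviate code: the constant and function names are hypothetical, and the 8-byte `uint64` doc id layout is an assumption. It also shows that only the big-endian encoding keeps bytewise order in line with numeric order, which may be why the v1 code paths need explicit sorting (the issue does not state this directly).

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Hypothetical version constants. The issue only says that unversioned data
// is read as 1 and data written with the new index is stored as 2.
const (
	versionLegacy    uint16 = 1
	versionBigEndian uint16 = 2
)

// docIDByteOrder picks the byte order for doc ids from the index version:
// little endian for v1, big endian for v2 and anything newer.
func docIDByteOrder(version uint16) binary.ByteOrder {
	if version >= versionBigEndian {
		return binary.BigEndian
	}
	return binary.LittleEndian
}

// encodeDocID writes a doc id as 8 bytes (assumed layout) using the byte
// order that belongs to the given version.
func encodeDocID(version uint16, id uint64) []byte {
	buf := make([]byte, 8)
	docIDByteOrder(version).PutUint64(buf, id)
	return buf
}

// decodeDocID reads a doc id that was written by encodeDocID with the same
// version.
func decodeDocID(version uint16, buf []byte) uint64 {
	return docIDByteOrder(version).Uint64(buf)
}

// bytewiseLess reports whether the encoding of a sorts before the encoding
// of b when the raw bytes are compared.
func bytewiseLess(version uint16, a, b uint64) bool {
	return bytes.Compare(encodeDocID(version, a), encodeDocID(version, b)) < 0
}

func main() {
	fmt.Printf("v1: % x\n", encodeDocID(versionLegacy, 256))
	fmt.Printf("v2: % x\n", encodeDocID(versionBigEndian, 256))

	// 255 < 256 numerically; compare how each encoding orders the raw bytes.
	fmt.Println("v1 bytewise 255 < 256:", bytewiseLess(versionLegacy, 255, 256))
	fmt.Println("v2 bytewise 255 < 256:", bytewiseLess(versionBigEndian, 255, 256))
}
```

Keeping the decision in a single function such as `docIDByteOrder` means every read and write path consults the same rule, so a later version bump only has to touch one place.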

TODOs

  • initialize version on shard startup
    • if no file exists and no data is present -> latest version
    • if no file exists and data is present -> assume v1
    • in all other cases read version from disk
  • error if trying to use BM25 before v2
  • use LittleEndian for v1, BigEndian for v2 when reading and writing doc ids in inverted index
  • if v1 sort rows in MapList
  • if v1 sort rows in map cursor
  • if v1 sort rows in map compaction
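
The startup rules and the BM25 guard from the list above could look roughly like this. It is only a sketch under stated assumptions: the function names, the 2-byte little-endian file format, and the choice to persist the resolved version immediately are all hypothetical and not taken from the Weaviate code base.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// latestVersion is the version new shards start with (2 at the time of the issue).
const latestVersion uint16 = 2

// decideVersion applies the three startup rules from the TODO list.
func decideVersion(fileExists, dataPresent bool, onDisk uint16) uint16 {
	switch {
	case fileExists:
		return onDisk // in all other cases read version from disk
	case dataPresent:
		return 1 // no file, but data is present: assume v1
	default:
		return latestVersion // no file and no data: latest version
	}
}

// writeVersion stores the version as 2 little-endian bytes (assumed format).
func writeVersion(path string, v uint16) error {
	buf := make([]byte, 2)
	binary.LittleEndian.PutUint16(buf, v)
	return os.WriteFile(path, buf, 0o600)
}

// resolveVersion reads the version file or, if it is missing, decides the
// version and persists it. Persisting here is an assumption: without it, an
// empty shard started at the latest version would look like v1 after data
// has been written and the process restarts.
func resolveVersion(path string, dataPresent bool) (uint16, error) {
	raw, err := os.ReadFile(path)
	switch {
	case errors.Is(err, os.ErrNotExist):
		v := decideVersion(false, dataPresent, 0)
		return v, writeVersion(path, v)
	case err != nil:
		return 0, fmt.Errorf("read version file: %w", err)
	case len(raw) != 2:
		return 0, fmt.Errorf("version file has %d bytes, want 2", len(raw))
	}
	return decideVersion(true, dataPresent, binary.LittleEndian.Uint16(raw)), nil
}

// checkBM25Supported returns an error when BM25 is requested on an index
// older than v2.
func checkBM25Supported(version uint16) error {
	if version < 2 {
		return fmt.Errorf("BM25 requires index version 2 or higher, got %d", version)
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "shard-version")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Data present but no version file: the shard is treated as v1.
	path := filepath.Join(dir, "version")
	v, err := resolveVersion(path, true)
	fmt.Println(v, err, checkBM25Supported(v))
}
```

Splitting the pure decision (`decideVersion`) from the file handling keeps the three rules easy to test without touching disk.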
@etiennedi
Member Author

etiennedi commented Feb 25, 2022

To find some of the affected places, grep for `TODO: gh-1833`

etiennedi added a commit that referenced this issue Feb 25, 2022
This still requires version checking as outlined in #1833
etiennedi added a commit that referenced this issue Feb 27, 2022
- So far it decides to use the correct endianness for the inverted index
  based on the version.
- It does not yet introduce runtime sorting for v1 indices
- Skipping the tests for now, as it seems I can't currently run them on my
  M1 Mac; will run them from an Intel Mac after pushing
etiennedi added a commit that referenced this issue Feb 27, 2022
etiennedi added a commit that referenced this issue Feb 27, 2022
etiennedi added a commit that referenced this issue Feb 28, 2022
For backward compatibility.

Also make sure versioner file and proplen tracker are deleted when
DELETEing the class.
@stale

stale bot commented Apr 30, 2022

Thank you for your contribution to Weaviate. This issue has not received any activity in a while and has therefore been marked as stale. Stale issues will eventually be autoclosed. This does not mean that we are ruling out working on this issue, but it most likely has not been prioritized highly enough in recent months.
If you believe that this issue should remain open, please leave a short reply. This lets us know that the issue is not abandoned and acts as a reminder for our team to consider prioritizing this again.
Please also consider if you can make a contribution to help with the solution of this issue. If you are willing to contribute, but don't know where to start, please leave a quick message and we'll try to help you.
Thank you, The Weaviate Team

stale bot added the autoclosed label Apr 30, 2022
@etiennedi
Member Author

This is complete and can be closed :-)
