-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DB schema: optimize for tail view, keep track of insertion order for entities (with index, ideally via primary key) #789
Comments
Adding a thought from #786 to here:
|
Btw, I liked https://supabase.com/blog/choosing-a-postgres-primary-key quite a bit when I started to read about about this back then in March. |
Let's just go ahead with using UUID type 7 for the primary key before we stall on this for even longer. Interesting perspective from https://www.percona.com/blog/store-uuid-optimized-way/
I am glad they discuss monotonicity to a great extent in the RFC draft: https://www.ietf.org/archive/id/draft-ietf-uuidrev-rfc4122bis-02.html#name-monotonicity-and-counters. For CPython someone rather brutally dropped a minimal pull request: python/cpython#29824 -- but there isn't even an issue on the tracker discussing the feature/enhancement to the stdlib. The new standard is still in are not complete yet. https://pypi.org/project/uuid7/ looks good, though, https://github.com/stevesimmons/uuid7 has a decent README. The latest public progress report by the working group pushing towards standardization of the new UUID types can be found in uuid6/uuid6-ietf-draft#122. The current version of the RFC that is supposed to replace RFC 4122 is from 2023-02-16 and can be found at https://www.ietf.org/archive/id/draft-ietf-uuidrev-rfc4122bis-02.html. The work is happening in this repo: https://github.com/ietf-wg-uuidrev/rfc4122bis |
https://uuid.ramsey.dev/en/stable/rfc4122/version7.html is a lovely rendered version of the draft work. |
As I mentioned in #871 (comment), I think most of the work here is done, and we just need to do a manual cleanup of "old" data. Will close this ticket once that's done. |
After consulting with various stakeholders, there's nontrivial pushback against the idea of archiving or migrating old data, since deleting or modifying the primary key will break old URL links to benchmark results. I think we can still move forward with the requirement that any code that wants to take advantage of the UUID7 primary key will have to limit queries to data created after the date #1138 was deployed. (The latest date for the instances I manage is 2023-06-03.) So I'll call this ticket complete with that caveat. |
Generalization of #786.
Requirement for #622.
We want to have a solution for "give the last N inserted entities, fast" for all the things (including benchmark results).
Currently that's not easily possible because of a UUID4 type primary key.
I think we should cheaply keep track of insertion order via primary key. Could use simple auto-increment integer (non-fancy default in many systems, would have served us well here, yeah) or we could do something like UUID7 (might be a little too new, maybe no established tooling support as of today).
I think this task/goal/ticket should be tracked independently of tracking insertion time (#790).
The text was updated successfully, but these errors were encountered: