DB schema: optimize for tail view, keep track of insertion order for entities (with index, ideally via primary key) #789

jgehrcke · 2023-03-02T11:52:59Z

Generalization of #786.

Requirement for #622.

We want to have a solution for "give the last N inserted entities, fast" for all the things (including benchmark results).

Currently that's not easily possible because of a UUID4 type primary key.

I think we should cheaply keep track of insertion order via primary key. Could use simple auto-increment integer (non-fancy default in many systems, would have served us well here, yeah) or we could do something like UUID7 (might be a little too new, maybe no established tooling support as of today).

I think this task/goal/ticket should be tracked independently of tracking insertion time (#790).

jgehrcke · 2023-03-06T15:53:15Z

We want to have a solution for "give the last N inserted entities, fast" for all the things (including benchmark results).

Adding a thought from #786 to here:

I keep centering around the idea that most users would like to have a tail view, i.e. insertion order should play a central role: #789 ... Maybe in the future, a central UI concept for snappy DB queries/snappy UI population should involve the idea of showing tabular data (everywhere in the UI) for the last D days (could default to e.g. 10). That relates to #790.

jgehrcke · 2023-04-19T14:13:15Z

Btw, I liked https://supabase.com/blog/choosing-a-postgres-primary-key quite a bit when I started to read about about this back then in March.

jgehrcke · 2023-04-19T14:28:56Z

Let's just go ahead with using UUID type 7 for the primary key before we stall on this for even longer.

Interesting perspective from https://www.percona.com/blog/store-uuid-optimized-way/

The size of the UUID table is almost 50% bigger than Ordered UUID table and 30% bigger than the table with BIGINT as PRIMARY KEY. Comparing the Ordered UUID table BIGINT table, the time is taken to insert rows and the size are almost the same. But they may vary slightly based on the index structure.

I am glad they discuss monotonicity to a great extent in the RFC draft: https://www.ietf.org/archive/id/draft-ietf-uuidrev-rfc4122bis-02.html#name-monotonicity-and-counters.

For CPython someone rather brutally dropped a minimal pull request: python/cpython#29824 -- but there isn't even an issue on the tracker discussing the feature/enhancement to the stdlib.

The new standard is still in are not complete yet. https://pypi.org/project/uuid7/ looks good, though, https://github.com/stevesimmons/uuid7 has a decent README.

The latest public progress report by the working group pushing towards standardization of the new UUID types can be found in uuid6/uuid6-ietf-draft#122.

The current version of the RFC that is supposed to replace RFC 4122 is from 2023-02-16 and can be found at https://www.ietf.org/archive/id/draft-ietf-uuidrev-rfc4122bis-02.html.

The work is happening in this repo: https://github.com/ietf-wg-uuidrev/rfc4122bis

jgehrcke · 2023-04-19T14:29:21Z

https://uuid.ramsey.dev/en/stable/rfc4122/version7.html is a lovely rendered version of the draft work.

This is for issue #789.

austin3dickey · 2023-10-03T18:17:10Z

As I mentioned in #871 (comment), I think most of the work here is done, and we just need to do a manual cleanup of "old" data. Will close this ticket once that's done.

austin3dickey · 2023-10-04T18:30:06Z

After consulting with various stakeholders, there's nontrivial pushback against the idea of archiving or migrating old data, since deleting or modifying the primary key will break old URL links to benchmark results.

I think we can still move forward with the requirement that any code that wants to take advantage of the UUID7 primary key will have to limit queries to data created after the date #1138 was deployed. (The latest date for the instances I manage is 2023-06-03.) So I'll call this ticket complete with that caveat.

This was referenced Mar 2, 2023

db schema: store insertion/creation timestamp for entities #790

Open

api: GET /api/benchmarks/ -- improve specification, make it responsive (less DoSsy) #786

Closed

jgehrcke mentioned this issue Mar 7, 2023

app: introduce HistorySample type, change api/history output schema #811

Merged

jgehrcke added the DB schema ticket related to database schema label Mar 10, 2023

jgehrcke changed the title ~~DB schema: keep track of insertion order for entities (with index, ideally via primary key)~~ DB schema: optimize for tail view, keep track of insertion order for entities (with index, ideally via primary key) Mar 15, 2023

This was referenced Mar 15, 2023

API endpoints: allow for reading data in chunks (introduce pagination) #871

Closed

Streamlining API calls for performance (faster&smaller HTTP responses via filters etc) #746

Closed

jgehrcke self-assigned this Apr 19, 2023

jgehrcke added a commit that referenced this issue Apr 19, 2023

db insert: use uuid7 primary key to track insert order

b362658

This is for issue #789.

This was referenced Apr 19, 2023

db insert: use uuid7 primary key to track insert order #1138

Merged

DB schema: do not use user-given data in primary key (insert order invariant) #1139

Closed

jgehrcke added a commit that referenced this issue Apr 20, 2023

db insert: use uuid7 primary key to track insert order (#1138)

77a0a91

This is for issue #789.

deepsihag assigned austin3dickey and unassigned jgehrcke Sep 28, 2023

austin3dickey closed this as completed Oct 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DB schema: optimize for tail view, keep track of insertion order for entities (with index, ideally via primary key) #789

DB schema: optimize for tail view, keep track of insertion order for entities (with index, ideally via primary key) #789

jgehrcke commented Mar 2, 2023 •

edited

jgehrcke commented Mar 6, 2023

jgehrcke commented Apr 19, 2023

jgehrcke commented Apr 19, 2023

jgehrcke commented Apr 19, 2023

austin3dickey commented Oct 3, 2023

austin3dickey commented Oct 4, 2023

DB schema: optimize for tail view, keep track of insertion order for entities (with index, ideally via primary key) #789

DB schema: optimize for tail view, keep track of insertion order for entities (with index, ideally via primary key) #789

Comments

jgehrcke commented Mar 2, 2023 • edited

jgehrcke commented Mar 6, 2023

jgehrcke commented Apr 19, 2023

jgehrcke commented Apr 19, 2023

jgehrcke commented Apr 19, 2023

austin3dickey commented Oct 3, 2023

austin3dickey commented Oct 4, 2023

jgehrcke commented Mar 2, 2023 •

edited