meta: flatten meta types #1
Comments
👍 Definitely agree. I was always defensive of meta types just because they were so correct, but you have a much better argument. Composite types have always made development difficult, especially when prototyping in

What do you recommend as the approach to migration? I'm sure you don't want to lose the actual data export. I don't think we store meta types often, but bundle is chock full of them.

I'm not sure about performance or validation, but it would be nice to always use the text representation of meta types when storing them. That way you can decouple their usage from their internal structure. Say you want to rename a field in the type, you don't have to do a migration.

Along those lines, I can't remember if there is any sort of validation you can apply to types. It'll be necessary to make sure a text representation actually fits the structure of the type, but it would be even more impressive to actually validate that the id represents an actual database object (i.e. a row_id points to a real row).

This would also be a good time to make sure you're happy with the field names in the type, e.g.
Cool.

Retaining data: Yes, migrating that will be challenging. I started a branch. For bundle internals values like all the

Re: using text rep in bundle export: Yes! I think performance ramifications will be minimal. We'll probably get a speedup overall, because composite type comparisons and casting are extremely slow. I remember actually running into an instance where comparing one field_id to another was like 100x slower than casting both to text and comparing the strings.

Re: validation, certainly we should be validating the strings as containing valid values for all the idents and literals. There are quite a few places where, say, bundle contains a row_id that isn't live in the working copy. Having that level of exists() validation could be really handy in some scenarios, but other times it doesn't apply, like when a bundle is fetched but not checked out.
Quick summary of the refactor so far:
Overall, a very challenging but successful refactor. The end is in sight, just fix the widgets, and just build the tool necessary to do so. Mickey says yak shaving, ha. This issue was the last piece of low-level architecture that I know of that was absolutely wrong and touched every aspect of the system. Many lessons learned: it was a "correct" decision at the time, but it has plagued the architecture from the beginning. Much thanks but goodbye.
Meta identifiers have been unnested and are now flat, i.e. no meta type contains any other meta type. In the refactor, the entire process of how identifiers are created and maintained has changed: meta identifiers are now generated programmatically, instead of being maintained in a massive static SQL file. For each of the 26 meta-identifiers, the script creates 17 statements that flesh out the identifier, such as the type, constructor function, casts to and from json and jsonb, etc. As additional meta-identifiers are added to the system, the entire file can be easily regenerated using the generator system.
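As a rough illustration of the generator idea (not the actual script), here is a minimal Python sketch that expands a table of identifier specs into SQL statements. Everything in it is an assumption: the spec dictionary, the attribute names, the choice of a flat composite of `text` attributes, and the two templates shown. The real generator emits 17 statements per identifier, and the SQL below has not been run against a database.

```python
# Hypothetical identifier specs: type name -> flat list of text attributes.
# The real system has 26 identifiers; three are shown for illustration.
IDENTIFIERS = {
    "schema_id": ["name"],
    "relation_id": ["schema_name", "name"],
    "column_id": ["schema_name", "relation_name", "name"],
}

# Two illustrative templates (assumed shapes, not the project's actual SQL).
TYPE_TEMPLATE = "create type meta.{type_name} as ({attr_defs});"
CONSTRUCTOR_TEMPLATE = (
    "create function meta.{type_name}({attr_defs}) returns meta.{type_name} as $$\n"
    "    select row({attr_names})::meta.{type_name}\n"
    "$$ language sql immutable;"
)


def generate_statements(type_name, attrs):
    """Return the SQL statements for one identifier, one per template."""
    attr_defs = ", ".join(f"{attr} text" for attr in attrs)
    attr_names = ", ".join(attrs)
    return [
        template.format(
            type_name=type_name, attr_defs=attr_defs, attr_names=attr_names
        )
        for template in (TYPE_TEMPLATE, CONSTRUCTOR_TEMPLATE)
    ]


def generate_file(identifiers):
    """Concatenate the statements for every identifier into one SQL file body."""
    statements = []
    for type_name, attrs in identifiers.items():
        statements.extend(generate_statements(type_name, attrs))
    return "\n\n".join(statements) + "\n"


if __name__ == "__main__":
    print(generate_file(IDENTIFIERS))
```

Adding an identifier then means adding one entry to the spec table and regenerating, rather than hand-editing every statement that mentions it.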
To convert a bundle from nested-types format to flat-types format, here are some regexes:

rowset_row.csv:

rowset_row_field.csv:
The meta identifier type system has been a source of great joy and also many tears. Meta identifiers very nicely encapsulate the identifier of a database object (which can be in some cases four or five separate values) in a single value. This encapsulation makes life easier in many, many scenarios. Overall the concept of meta identifiers is a huge win. HOWEVER. The implementation has a number of problems:
1. They are highly composite
- `schema_id` has just a `name` value.
- `relation_id` has a relation `name` value and a `schema_id`.
- `column_id` has a column `name` value and a `relation_id`.
- `row_id` has a primary key `column_id`, and a primary key value `pk_value`.
- `field_id` has a `row_id` and a `column_id`.
The nesting here is hopefully obvious.
While correct in terms of information architecture, five-level-deep composite types are patently ridiculous to work with.
Creating one (without using the handy constructors) looks like:
Accessing some variable (without using a handy cast function) looks like:
schema_name := (((column_id).relation_id).schema_id).name
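To make the cost of the nesting concrete outside of SQL, here is a small Python analogy. The nested classes mirror the structure listed above; the flat class and its field names (`schema_name`, `relation_name`) are hypothetical, chosen only to show what an unnested value namespace could look like.

```python
from dataclasses import dataclass


# Nested layout, mirroring the composite types described above.
@dataclass(frozen=True)
class NestedSchemaId:
    name: str


@dataclass(frozen=True)
class NestedRelationId:
    schema_id: NestedSchemaId
    name: str


@dataclass(frozen=True)
class NestedColumnId:
    relation_id: NestedRelationId
    name: str


# Flat layout (hypothetical field names): every value is one hop away.
@dataclass(frozen=True)
class FlatColumnId:
    schema_name: str
    relation_name: str
    name: str


nested = NestedColumnId(NestedRelationId(NestedSchemaId("public"), "users"), "email")
flat = FlatColumnId("public", "users", "email")

# Three hops through intermediate objects versus a single attribute lookup.
nested_schema_name = nested.relation_id.schema_id.name
flat_schema_name = flat.schema_name
```

Both lookups yield the same schema name, but only the nested form forces the caller to know how the identifier is assembled internally.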
2. Their internal representation is awful
Unfortunately we cannot override this behavior, because in postgres, composite types do not have input or output functions, but all share a single global (and dumb) function.
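For a sense of what the shared composite output format does to nested identifiers, here is a rough Python emulation of its quoting rules as I understand them: a field is wrapped in double quotes when it is empty or contains quotes, backslashes, parentheses, commas, or whitespace, and embedded quotes and backslashes are doubled. The function name `record_out` is borrowed for readability; this is a simplified sketch, not PostgreSQL's code.

```python
def record_out(*fields):
    """Approximate the text form PostgreSQL prints for a composite value."""
    rendered = []
    for field in fields:
        if field is None:
            # NULL fields are printed as nothing at all.
            rendered.append("")
            continue
        text = str(field)
        needs_quotes = text == "" or any(
            ch in '"\\(),' or ch.isspace() for ch in text
        )
        if needs_quotes:
            # Double every backslash and double quote, then wrap in quotes.
            text = '"' + text.replace("\\", "\\\\").replace('"', '""') + '"'
        rendered.append(text)
    return "(" + ",".join(rendered) + ")"


schema_id = record_out("public")               # (public)
relation_id = record_out(schema_id, "users")   # ("(public)",users)
column_id = record_out(relation_id, "email")   # ("(""(public)"",users)",email)
row_id = record_out(record_out(relation_id, "id"), "42")

if __name__ == "__main__":
    for value in (schema_id, relation_id, column_id, row_id):
        print(value)
```

Each level of nesting re-quotes the level below it, so the number of double quotes roughly doubles per level. That is why a flat type, or a purpose-built text representation, is so much easier to read and to handle in client code.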
3. They cause havoc in datum.js
So, when endpoint's REST API selects rows, it checks to see if that row's type is JSON or a composite type, and if it is, sends the row over the wire as a JSON object. When pushing the object back to the database, ... really not sure what happens, but I doubt it works. Basically the REST API has never worked properly with composite types, which has been a massive bottleneck in terms of UI design of the schema admin, bundle admin, user admin, and more. They don't work, and the obtuseness of their structure makes designing a sane REST API to handle them...formidable.
We need to redesign them entirely, to be flat, non-composite types.
Functionality to retain
These patterns of usage should continue to work:
1. Constructors
2. Casts from one type to another
3. Casts to/from JSON
Improvements
1. Easily accessible (flat) value namespace
2. Casts to/from `text`
We already have casts to text for some types, though coverage is incomplete. Text representation should be readable, unique and unambiguous. We need to make use of `quote_ident` and `quote_literal` in our casting, and test against identifiers that contain quotes, slashes, etc.
Desired Outcomes
- `meta.*` from `datum.js`
- `meta.*` from `bundle`
These are the two biggest objectives still ahead, and this is the source of all the problems.
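The text-cast requirement listed under Improvements (readable, unique, unambiguous, and built on `quote_ident`) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `/`-separated format is hypothetical, the Python `quote_ident` is a simplified stand-in for the PostgreSQL function (it skips the reserved-keyword check), and the helper names are invented for this example.

```python
import re

_SAFE_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def quote_ident(name):
    """Simplified stand-in for PostgreSQL's quote_ident (no keyword check)."""
    if _SAFE_IDENT.fullmatch(name):
        return name
    return '"' + name.replace('"', '""') + '"'


def column_id_to_text(schema_name, relation_name, name):
    """Hypothetical text form: quoted identifiers joined by '/'."""
    return "/".join(quote_ident(part) for part in (schema_name, relation_name, name))


def split_idents(text):
    """Split on '/' outside double quotes and undo the quoting."""
    parts, buf, quoted, i = [], [], False, 0
    while i < len(text):
        ch = text[i]
        if quoted:
            if ch == '"':
                if text[i + 1:i + 2] == '"':
                    buf.append('"')  # doubled quote is a literal quote
                    i += 1
                else:
                    quoted = False
            else:
                buf.append(ch)
        elif ch == '"':
            quoted = True
        elif ch == "/":
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    if quoted:
        raise ValueError("unterminated quoted identifier")
    parts.append("".join(buf))
    return parts


def column_id_from_text(text):
    """Parse the text form, checking that it fits the column_id structure."""
    parts = split_idents(text)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a valid column_id: {text!r}")
    return tuple(parts)
```

Round-tripping awkward names is the interesting test: `column_id_to_text("my schema", 'we"ird', "a/b")` should parse back to the same three values, because the separator and the quote character are only meaningful outside a quoted identifier. A structural check like the one in `column_id_from_text` covers the "does the string fit the type" validation; checking that the object actually exists would need a database lookup and is out of scope here.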
Implementation Ramifications
This is a central pillar in the architecture, a high-flying buttress. Pulling it out will be highly impactful throughout the system. Test coverage will preserve functionality in bundle and meta, presuming tests don't rely too heavily on internals and instead use constructors and casts. Need an endpoint test suite.
Implementation Approach
The constructors for the types should be a pretty good indicator of how the flattened types should be structured. Do that.
We need to figure out what bundle is going to store in `blob.value` for, say, a `meta.column_id`, be it the text representation or ... whatever postgresql outputs for non-composite types...? Since these types won't be composite anymore, we have hope of writing input/output functions that do the right thing without the incessant explicit casting.