dvobject Table Dangers: 800,000 rows with no type checking and wasted space #777
Comments
See also #807.
Marking this as critical:
@michbarsinai can you comment? My recollection is that when you first did this, you had to use a single dvobject table because of how eclipselink handles inheritance (that it does not support the one-table-per-entity model). Am I recalling correctly, or has this changed?
I'll review this again after beta 8. Note that joins come at a cost as well. There may be some other solutions, e.g. "row objects".
This is not an insignificant change. In the tables where permissions reference the dvobject (RoleAssignment, DataverseRole, etc.), it means replacing one attribute with two:
Un-mixing the entities also makes things simpler:
It will be easier to make this change before 4.0 is live. After the data is migrated and the system is live, it will be much more difficult. That said, at this late date, this change will push milestone dates back.
With the new JOINED inheritance strategy, we now have both type safety and no more sparseness.
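For readers unfamiliar with the layout a JOINED strategy produces, here is a minimal sketch at the table level. It uses Python's sqlite3 as a stand-in for postgres, and the child table names (`datafile`, `dataverse`) and columns (`dtype`, `filename`, `alias`) are assumptions for illustration, not the project's actual schema.

```python
import sqlite3


def build_joined_schema(conn):
    """Create a base table plus one table per entity type (hypothetical names)."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        -- Shared columns live in the base table.
        CREATE TABLE dvobject (
            id    INTEGER PRIMARY KEY,
            dtype TEXT NOT NULL
        );
        -- Type-specific columns live in child tables that share the base id,
        -- so no row carries NULL columns belonging to another type.
        CREATE TABLE datafile (
            id       INTEGER PRIMARY KEY REFERENCES dvobject(id),
            filename TEXT NOT NULL
        );
        CREATE TABLE dataverse (
            id    INTEGER PRIMARY KEY REFERENCES dvobject(id),
            alias TEXT NOT NULL
        );
    """)


def add_datafile(conn, obj_id, filename):
    """Insert the base row first, then the type-specific row."""
    conn.execute("INSERT INTO dvobject (id, dtype) VALUES (?, 'DataFile')", (obj_id,))
    conn.execute("INSERT INTO datafile (id, filename) VALUES (?, ?)", (obj_id, filename))


def load_datafile(conn, obj_id):
    """Reading a full entity requires a join across the two tables."""
    return conn.execute(
        "SELECT d.id, d.dtype, f.filename "
        "FROM dvobject d JOIN datafile f ON f.id = d.id WHERE d.id = ?",
        (obj_id,),
    ).fetchone()
```

Reading an entity back takes a join, which is the cost mentioned earlier in the thread.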
3 tables in place; objects are added to the tables on create. Will open tickets if issues arise. Closing.
The sparseness has been removed for sure (yay!) but we don't get automatic type safety or data integrity checks/constraints. We should review the code and add
Related: #733
Background
A single postgres table, dvobject, is being used to store 3 distinct entities.
The table is not normalized and allows logical inconsistencies and anomalies.
Consider normalizing dvobject table and using a dvobject_types table.
Note: After migration, the dvobject table will contain nearly 800,000 rows (750k files, 54k studies, 750+ dataverses, etc.).
Reason 1 for normalization: No business logic enforcement / data integrity at database level
Reason 2 for normalization: Wasted space on a large table
Suggested Fix: Path to normalization
Consider normalizing and using a dvobject_types table
Use the dvobject_types table when storing object level roles/permissions
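A rough sketch of the suggestion above, using Python's sqlite3 as a stand-in for postgres. The column names and the three type names are assumptions for illustration only; the point is that a foreign key to `dvobject_types` lets the database itself reject an object with an unknown type.

```python
import sqlite3


def build_types_schema(conn):
    """Create a lookup table of object types and a dvobject table that references it."""
    # SQLite only enforces foreign keys when this pragma is on; postgres always does.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE dvobject_types (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE dvobject (
            id      INTEGER PRIMARY KEY,
            type_id INTEGER NOT NULL REFERENCES dvobject_types(id)
        );
        -- Hypothetical type names, one per entity kind.
        INSERT INTO dvobject_types (id, name) VALUES
            (1, 'dataverse'), (2, 'study'), (3, 'file');
    """)


def try_insert(conn, obj_id, type_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO dvobject (id, type_id) VALUES (?, ?)", (obj_id, type_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, `try_insert(conn, 1, 1)` succeeds, while an insert with a `type_id` that is not in `dvobject_types` is refused by the database rather than by application code. Tables that store roles or permissions could reference the same lookup table in the same way.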
Questions/Discussion