Pull fields out of binary BLOBs #975

Open
remerle opened this issue Sep 4, 2020 · 2 comments
Labels
  • area/core: Items relating to core functionality which may not be directly related to API or Portal behavior
  • kind/maintenance: Code cleanup, refactoring, upgrading to new shiny things
  • status/ready: the issue is ready to be worked on and prioritized

Comments

@remerle
Member

remerle commented Sep 4, 2020

In order to allow for metric gathering, as well as more sophisticated searching within the application, we need to move away from using BLOBs to store data. BLOBs are opaque and unqueryable.

  • Any place where a BLOB is not strictly necessary (a BLOB being necessary only where we'd otherwise need multiple tables to house the data), pull the data from the BLOB up into fields
  • Any place where a BLOB still makes sense, pull up any fields that may be important for querying (all fields where pulling them up into the table wouldn't necessitate extra tables to house the relationships)
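
As a rough illustration of the second bullet, here is a minimal sketch of keeping the BLOB while copying a few queryable fields up into columns. Everything in it is an assumption made for the example: the `record_set` table, its column names, the use of SQLite, and JSON standing in for the serialized payload. It is not the project's actual schema.

```python
import json
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
# Scalar fields that matter for querying become columns; the nested
# "records" list stays inside the BLOB because it would need its own table.
SCHEMA = """
CREATE TABLE record_set (
    id      TEXT PRIMARY KEY,
    zone_id TEXT NOT NULL,
    name    TEXT NOT NULL,
    type    TEXT NOT NULL,
    ttl     INTEGER NOT NULL,
    data    BLOB NOT NULL
)
"""


def save_record_set(conn, rs):
    """Store the full record as a BLOB and copy queryable fields into columns."""
    blob = json.dumps(rs).encode("utf-8")  # JSON is a stand-in for the real payload
    conn.execute(
        "INSERT INTO record_set (id, zone_id, name, type, ttl, data) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (rs["id"], rs["zone_id"], rs["name"], rs["type"], rs["ttl"], blob),
    )


def count_by_type(conn, zone_id):
    """A metrics-style query that never has to decode the BLOB."""
    cursor = conn.execute(
        "SELECT type, COUNT(*) FROM record_set "
        "WHERE zone_id = ? GROUP BY type ORDER BY type",
        (zone_id,),
    )
    return dict(cursor.fetchall())


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
sample = [
    {"id": "1", "zone_id": "z1", "name": "www", "type": "A", "ttl": 300,
     "records": ["10.0.0.1"]},
    {"id": "2", "zone_id": "z1", "name": "api", "type": "A", "ttl": 300,
     "records": ["10.0.0.2", "10.0.0.3"]},
    {"id": "3", "zone_id": "z1", "name": "docs", "type": "CNAME", "ttl": 600,
     "records": ["www.example.com."]},
]
for rs in sample:
    save_record_set(conn, rs)

print(count_by_type(conn, "z1"))
```

The trade-off is that the promoted columns duplicate data held in the BLOB, so a migration has to backfill them and every write has to keep both in step.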

Impact

  • This will be a breaking change which will require migration
  • It will be important to drop DynamoDB support (Remove DynamoDB module #971) since DynamoDB cannot easily support the types of queries that we'd want to perform against the new columns. It's possible, but it'd be ugly and the wrong tool for the job.

Some discussion on migration is in #960

@remerle remerle added kind/maintenance Code cleanup, refactoring, upgrading to new shiny things ⌚ pending-ready labels Sep 4, 2020
@remerle remerle added status/ready the issue is ready to be worked on and prioritized and removed ⌚ pending-ready labels Sep 4, 2020
@pauljamescleary
Contributor

A few things for consideration:

  1. Currently blobs are passed via commands (like RecordSetChange) in the message queue
  2. When loading very large zones for the purposes of doing a zone sync, we load all records at one time in order to perform a massive diff operation. This is fast for a small number of records, but for zones that have 100,000 records it can take a long time to complete. The protobufs are excellent in this regard because their very small size keeps this operation cheap. If we remove blobs, we need to optimize the zone sync process.

@remerle
Member Author

remerle commented Sep 8, 2020

This issue just looks to solve the opaqueness problem, so let's be sure to keep blobs where necessary for zone sync and queueing.

However, let's consider a hash-based approach to zone sync once we open up the hood. Follow up with a secondary issue to tackle that when/where it makes sense.
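
One possible shape for that hash-based approach, sketched under assumptions: each record is reduced to a digest over a canonical string, and the sync compares digests keyed by (name, type) instead of comparing full records. The function names, the canonical form, and the choice of SHA-256 are all hypothetical and would be settled in the follow-up issue.

```python
import hashlib


def record_digest(name, rtype, ttl, rdata):
    """Hash a canonical form of one record (the canonical form is an assumption)."""
    canonical = "|".join([name.lower(), rtype.upper(), str(ttl), rdata])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_by_digest(stored, incoming):
    """Compare two dicts mapping (name, type) -> digest.

    Returns sorted lists of keys that were added, removed, or changed.
    """
    added = sorted(k for k in incoming if k not in stored)
    removed = sorted(k for k in stored if k not in incoming)
    changed = sorted(
        k for k in incoming if k in stored and stored[k] != incoming[k]
    )
    return added, removed, changed


# Digests as they might be kept alongside the stored records.
stored = {
    ("www", "A"): record_digest("www", "A", 300, "10.0.0.1"),
    ("old", "A"): record_digest("old", "A", 300, "10.0.0.9"),
    ("docs", "CNAME"): record_digest("docs", "CNAME", 600, "www.example.com."),
}
# Digests computed from the records seen during a sync.
incoming = {
    ("www", "A"): record_digest("www", "A", 300, "10.0.0.2"),
    ("new", "A"): record_digest("new", "A", 300, "10.0.0.5"),
    ("docs", "CNAME"): record_digest("docs", "CNAME", 600, "www.example.com."),
}

added, removed, changed = diff_by_digest(stored, incoming)
print(added, removed, changed)
```

If the digest were stored as its own column, a sync would only need to load keys and digests for the diff, then fetch full records for the entries that actually differ.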

@remerle remerle added the area/core Items relating to core functionality which may not be directly related to API or Portal behavior label Sep 1, 2021