Fix DB constraint violations #1212
Labels
data-cleaning
Tasks related to cleaning & regularizing data during ETL.
epic
Any issue whose primary purpose is to organize other issues into a group.
metadata
Anything having to do with the content, formatting, or storage of metadata. Mostly datapackages.
sqlite
Issues related to interacting with sqlite databases
Description
In getting the SQLite/Parquet ETL (#1176) working, some previously enforced database constraints were relaxed. There are also new constraints that we want to impose on the structure of the DB to keep it tidy and to enable programmatic use of the relational structure, most immediately in the context of the entity harvesting & resolution process (#639). The work includes modifying our data processing so that every primary key is unique and contains no null values, making sure that foreign key relationships and data types are actually checked by the database, and making sure that appropriate NA/Null values are used in the DB.
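As a rough illustration of the kinds of constraints described above, here is a minimal sketch using Python's standard `sqlite3` module. The table and column names (`plants`, `generators`, `capacity_mw`) and the helper functions are hypothetical and are not taken from the project's schema. It relies on three SQLite behaviors: foreign keys are only enforced after `PRAGMA foreign_keys = ON` is set on the connection, a non-integer primary key column accepts NULL unless it is also declared `NOT NULL`, and column types are affinities rather than strict types, so a `CHECK` on `typeof()` is one way to have the database reject mistyped values.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory DB with PK, FK, and type constraints (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this is set per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE plants (
            -- NOT NULL is explicit: PRIMARY KEY alone does not forbid NULL here.
            plant_id TEXT NOT NULL PRIMARY KEY,
            plant_name TEXT
        );
        CREATE TABLE generators (
            plant_id TEXT NOT NULL REFERENCES plants (plant_id),
            generator_id TEXT NOT NULL,
            -- NULL is the missing-value marker; anything else must be stored as a real.
            capacity_mw REAL
                CHECK (capacity_mw IS NULL OR typeof(capacity_mw) = 'real'),
            PRIMARY KEY (plant_id, generator_id)
        );
        """
    )
    return conn


def violates(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the database rejects the statement with a constraint error."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    conn = make_db()
    add_plant = "INSERT INTO plants VALUES (?, ?)"
    add_gen = "INSERT INTO generators VALUES (?, ?, ?)"
    print("valid plant rejected:", violates(conn, add_plant, ("p1", "Plant One")))
    print("null PK rejected:", violates(conn, add_plant, (None, "No ID")))
    print("duplicate PK rejected:", violates(conn, add_plant, ("p1", "Again")))
    print("orphan FK rejected:", violates(conn, add_gen, ("p9", "g1", 10.0)))
    print("bad type rejected:", violates(conn, add_gen, ("p1", "g1", "n/a")))
    print("NULL value rejected:", violates(conn, add_gen, ("p1", "g1", None)))
```

Passing Python's `None` stores a real SQL NULL, whereas a placeholder string such as `"n/a"` is rejected by the `CHECK`, which is one way to keep missing values consistent at the database level.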
This work should be billed to the Sloan Metadata/Harvest Revamp project in Harvest.
Motivation
In Scope
Out of Scope