See old GitLab issue. This issue needs to be updated to reflect current data cataloguing plans. (Summer 2023)
We should consolidate all of our disparate data catalogues, inventories, and trackers into a single Excel sheet. I've created a template of what should be included:
new_data_catalog.xlsx
And I'm working to consolidate the following sheets:
warehouse_athena_map.xlsx
data_catalog_wiki.xlsx
data_catalog_warehouse.xlsx
update_inventory.xlsx
The final workbook should:
- Live at Data/Data_Dept_Catalog.xlsx in this repo
- Be linked from the Home and _sidebar wiki pages, and from a README note in the data architecture repo
- Be tracked using Git LFS
- Have its orange columns updated programmatically via daily API calls to AWS. GitLab CI + boto3 can accomplish this
- Be machine-readable: long format, no merged cells!
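
A rough sketch of what the daily boto3 refresh could look like, for discussion only. It assumes the orange columns hold table metadata that can be read from the AWS Glue Data Catalog (the catalog behind Athena); that is my assumption, not something the template settles. The function name `glue_table_rows`, the output field names, and the database name in the usage comment are all hypothetical.

```python
def glue_table_rows(glue_client, database_name):
    """Return one long-format row (a dict) per table in a Glue database.

    glue_client is expected to behave like boto3.client("glue").
    The output field names are placeholders, not the template's real columns.
    """
    rows = []
    # get_tables is paginated, so walk every page of results.
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        for table in page["TableList"]:
            storage = table.get("StorageDescriptor", {})
            updated = table.get("UpdateTime")
            rows.append({
                "database": database_name,
                "table": table["Name"],
                "s3_location": storage.get("Location", ""),
                "column_count": len(storage.get("Columns", [])),
                "last_updated": updated.isoformat() if updated else "",
            })
    return rows


# Usage inside a scheduled GitLab CI job (needs AWS credentials in the job):
#   import boto3
#   rows = glue_table_rows(boto3.client("glue"), "example_database")
```

Returning plain dicts, one per table, keeps the output in long format, so a later step can write it into the workbook without merged cells. Passing the client in as an argument also lets the function be exercised with a stub instead of live AWS calls.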