Consolidate data inventories and catalogues into single workbook #7

@dfsnow

See old GitLab issue. This issue needs to be updated to reflect current data cataloguing plans. (Summer 2023)

We should consolidate all of our disparate data catalogues, inventories, and trackers into a single Excel workbook. I've created a template of what should be included:

new_data_catalog.xlsx

And I'm working to consolidate the following sheets:

warehouse_athena_map.xlsx
data_catalog_wiki.xlsx
data_catalog_warehouse.xlsx
update_inventory.xlsx

The final workbook should:

  • Live at Data/Data_Dept_Catalog.xlsx in this repo
  • Be linked to from the Home and _sidebar wiki pages + from a readme note in the data architecture repo
  • Be tracked using Git LFS
  • Have its orange columns updated programmatically via daily API calls to AWS (GitLab's CI + boto3 can accomplish this)
  • Be machine-readable and in long format, with no merged cells
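
A rough sketch of what the programmatic update could look like, in case it helps scope the work. It assumes the orange columns hold table and column metadata that can be read from the AWS Glue Data Catalog (which Athena uses); that is a guess, not something the template confirms. The function name `catalog_rows`, the output field names, and the database name in the usage comment are all hypothetical. It only builds long-format rows (one row per table/column pair); writing them into the workbook and scheduling the job in CI are left out.

```python
from typing import Any, Dict, Iterator


def catalog_rows(glue_client: Any, database: str) -> Iterator[Dict[str, str]]:
    """Yield one long-format row per (table, column) in a Glue database.

    `glue_client` is expected to behave like boto3.client("glue").
    The output field names are placeholders, not the template's headers.
    """
    # get_tables is paginated, so walk every page rather than a single call.
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page["TableList"]:
            # Some tables (e.g. views) may lack a storage descriptor.
            descriptor = table.get("StorageDescriptor", {})
            for column in descriptor.get("Columns", []):
                yield {
                    "database": database,
                    "table": table["Name"],
                    "column": column["Name"],
                    "type": column.get("Type", ""),
                    "location": descriptor.get("Location", ""),
                }


# Example usage (needs boto3 and AWS credentials; database name is made up):
#   import boto3
#   rows = list(catalog_rows(boto3.client("glue"), "example_database"))
```

Passing the client in as an argument keeps the function testable with a stub, so a CI job could check the row-building logic without touching AWS.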
