Data pipeline storage organization

Aaron Collier edited this page Apr 18, 2023 · 3 revisions

DRAFT Storage organization for the vendor data loading pipeline

/dataloader

The root path, /dataloader, is a 250 GB mount intended as the shared storage for data processing.

Working data path

/dataloader/data/{org-id}/{dagRunId}

When data is automatically fetched from a vendor site based on a FOLIO interface ID, it is staged in the working data path described above, organized by org-id, interface-id, and DAG run ID.
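As an illustration only, the working data path could be assembled as in the sketch below. It follows the path template exactly as written above (org-id, then DAG run ID). The function name `working_data_path` and the validation rules are hypothetical and not part of the pipeline.

```python
from pathlib import PurePosixPath

# Root of the working data area, taken from the path template above.
DATA_ROOT = PurePosixPath("/dataloader/data")


def working_data_path(org_id: str, dag_run_id: str) -> PurePosixPath:
    """Build /dataloader/data/{org-id}/{dagRunId} (hypothetical helper)."""
    for name, value in (("org_id", org_id), ("dag_run_id", dag_run_id)):
        # Assumption: reject empty values and anything that could escape
        # the intended directory, such as separators or dot segments.
        if not value or "/" in value or value in (".", ".."):
            raise ValueError(f"invalid {name}: {value!r}")
    return DATA_ROOT / org_id / dag_run_id


if __name__ == "__main__":
    print(working_data_path("org-123", "manual__2023-04-18"))
```

Using `PurePosixPath` keeps the sketch free of filesystem access, so the path can be computed and checked before any directory is created on the mount.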

When data is manually uploaded, it will initially be placed in a temporary location and then moved into

Archive data path

/dataloader/archive/{year}/{month}/{day}/{org-id}/{interface-id}
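A minimal sketch of how the archive path might be derived from a date and the two identifiers. The function name `archive_data_path` is hypothetical, and zero-padding the month and day to two digits is an assumption; the template above does not specify the formatting.

```python
from datetime import date
from pathlib import PurePosixPath

# Root of the archive area, taken from the path template above.
ARCHIVE_ROOT = PurePosixPath("/dataloader/archive")


def archive_data_path(org_id: str, interface_id: str, day: date) -> PurePosixPath:
    """Build /dataloader/archive/{year}/{month}/{day}/{org-id}/{interface-id}."""
    return (
        ARCHIVE_ROOT
        / f"{day.year:04d}"
        / f"{day.month:02d}"  # assumption: two-digit, zero-padded month
        / f"{day.day:02d}"    # assumption: two-digit, zero-padded day
        / org_id
        / interface_id
    )


if __name__ == "__main__":
    print(archive_data_path("org-123", "iface-456", date(2023, 4, 18)))
```

With zero-padded segments, a plain lexical sort of the archive directories matches chronological order, which is one reason to prefer that formatting if the choice is still open.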