Better way of presenting DAP collections that contain a large number of files #15

Open
t83714 opened this issue Jan 24, 2022 · 0 comments
t83714 commented Jan 24, 2022

Better way of presenting DAP collections that contain a large number of files

A collection (dataset) on the DAP system might contain hundreds or even thousands of files.

e.g. https://doi.org/10.25919/5e30b5231c669 or https://data.gov.au/dataset/ds-dap-csiro%3A42600 which has 4000 files

The DAP connector currently presents each individual file as a "distribution" and, according to the code comment below, will "harvest only limited distributions with every formats data included":

// Read environment param distributionSize and harvest only limited distributions with every formats data included

Also, Collections might have a folder structure, e.g. https://doi.org/10.4225/08/57988C158CB9C or https://data.gov.au/dataset/ds-dap-csiro%3A16610/

We should probably consider a better way of presenting DAP collection files on Magda, e.g.:

  • Present all files of the same type as a single distribution.
    • e.g. The 4000 xyz format files in this collection would be presented as one distribution. The download link could be a POST request to the /ws/v2/collections/{id}/downloadzip API (https://data.csiro.au/dap/swagger-ui.html#/Collection%20Download/getDownloadZipUsingGET_1), which accepts a list of item IDs to include in the downloaded archive.
  • Or present all files in one folder as a single distribution.
  • Or mix the two options above
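
A rough sketch of how the three grouping options might look in code. Everything here is an assumption made for illustration: the `DapFile` and `GroupedDistribution` shapes, the `groupFiles` helper, and the idea of deriving format and folder from the file path are hypothetical and do not reflect the existing connector code or the actual DAP API response.

```typescript
// Hypothetical shape of a file record in a DAP collection.
// The real DAP API response may look different.
interface DapFile {
    id: string;
    path: string; // e.g. "survey/2019/tile_0001.xyz"
}

// Hypothetical grouped distribution: one entry stands for many files.
interface GroupedDistribution {
    title: string;
    format: string; // "" when files are grouped by folder only
    folder: string; // "" when files are grouped by format only
    fileIds: string[];
}

type GroupBy = "format" | "folder" | "both";

// Lower-cased file extension, or "unknown" when there is none.
function extensionOf(path: string): string {
    const name = path.substring(path.lastIndexOf("/") + 1);
    const dot = name.lastIndexOf(".");
    return dot > 0 ? name.substring(dot + 1).toLowerCase() : "unknown";
}

// Everything before the last "/", or "" for files at the root.
function folderOf(path: string): string {
    const slash = path.lastIndexOf("/");
    return slash >= 0 ? path.substring(0, slash) : "";
}

// Collapse individual files into one distribution per format, per folder,
// or per (folder, format) pair.
function groupFiles(files: DapFile[], by: GroupBy): GroupedDistribution[] {
    const groups: Record<string, GroupedDistribution> = {};
    for (const file of files) {
        const format = by === "folder" ? "" : extensionOf(file.path);
        const folder = by === "format" ? "" : folderOf(file.path);
        const key = JSON.stringify([folder, format]);
        if (!groups[key]) {
            const label = format ? `${format} files` : "all files";
            const title = folder ? `${folder}/ (${label})` : label;
            groups[key] = { title, format, folder, fileIds: [] };
        }
        groups[key].fileIds.push(file.id);
    }
    return Object.keys(groups).map((key) => groups[key]);
}
```

The `fileIds` of each group could then be used as the list of item IDs for the download-zip request mentioned above, so that a single distribution link downloads the whole group. How that list is actually passed to the API is not covered here.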