Documenting how to access data for benchmarking #121

Closed
matt-graham opened this issue Nov 22, 2023 · 0 comments


Raising as part of JOSS review openjournals/joss-reviews/issues/5901

The data files are stored on Git LFS, and the free LFS quota for this account seems to be regularly exceeded (see openjournals/joss-reviews#5901 (comment)). It would therefore be useful to document an alternative way of accessing the data, ideally via an open data repository that does not require an account to download from. The datasets have been made available on Kaggle (openjournals/joss-reviews#5901 (comment)), but this is not currently documented in this repository, and a Kaggle account is required to download them.

An open research data repository or archive such as Zenodo would be a better fit with the JOSS requirement that the software be stored in a repository that can be cloned without registration. I don't think this requirement strictly extends to data associated with the software, but from a FAIR data and reproducibility perspective a service like Zenodo is much better than Kaggle.

A potentially even nicer approach would be to use a tool like pooch to automate getting the data from a remote repository as part of running the benchmarks.
