This repo has a mixed bag of LICENSE, README, CITATION, .license, and similar files. On top of that, unrelated fields often end up in the same file, e.g. non-licensing info is included in the LICENSE file.
We should standardize the format so that a single convention is adopted across all datasets. For example, we could use `.toml` files to organize the metadata in tables that are easy to parse downstream. A standard convention would allow `pyvista/pyvista` to automatically retrieve this metadata and display it prominently in the Dataset Gallery docs. Ideally, CI would enforce this convention for every dataset.
PR #66 attempts something like this, but IMO it is still a bit ad hoc and does not let us programmatically retrieve the metadata fields that users should be including. The only thing mentioned in that PR that could probably be parsed reliably is the standard `SPDX-License-Identifier:` line.
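For reference, pulling out that one line is straightforward. The sketch below is only an illustration: the function name `find_spdx_identifier` is hypothetical, and it assumes the tag sits on its own line, optionally behind a `#` comment marker. It returns the raw expression without validating it against the SPDX license list.

```python
import re
from typing import Optional

# Matches a line such as "SPDX-License-Identifier: MIT", optionally
# preceded by a "#" comment marker. Assumes one tag per line.
SPDX_LINE = re.compile(
    r"^\s*(?:#\s*)?SPDX-License-Identifier:\s*(?P<expr>\S.*?)\s*$",
    re.MULTILINE,
)


def find_spdx_identifier(text: str) -> Optional[str]:
    """Return the first SPDX license expression in text, or None."""
    match = SPDX_LINE.search(text)
    return match.group("expr") if match else None


print(find_spdx_identifier("Some dataset notes\nSPDX-License-Identifier: MIT\n"))
```

Everything else in a free-form LICENSE or README file would still need manual reading, which is the gap a structured format would close.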