Avoid having too many datasets in a dataset series #276
Comments
For example, the Czech catalog presumes there can be a large number of datasets in a series and offers search over the members (basically, it filters all datasets down to the ones in the series and re-uses the normal faceted search functionality). So I would say it should be expected that there can be as many datasets in a series as there are datasets in a catalog. A specific example is a daily time series running for multiple years. Downloading all files is trivial, as you can search for them in the catalog using SPARQL and then download them.
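A minimal sketch of the SPARQL lookup mentioned above, assuming the catalog follows DCAT 3 and links members to their series with `dcat:inSeries`. The helper name `build_series_members_query`, the paging defaults, and the example series IRI are hypothetical; the actual graph layout of any given catalog may differ.

```python
from string import Template

# Hypothetical query: lists the datasets in one series together with
# the download URLs of their distributions (DCAT 3 vocabulary assumed).
QUERY_TEMPLATE = Template("""\
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT ?dataset ?downloadURL
WHERE {
  ?dataset a dcat:Dataset ;
           dcat:inSeries <$series_iri> ;
           dcat:distribution ?distribution .
  ?distribution dcat:downloadURL ?downloadURL .
}
ORDER BY ?dataset
LIMIT $limit
OFFSET $offset
""")


def build_series_members_query(series_iri, limit=1000, offset=0):
    """Return a paged SPARQL query for the members of a dataset series."""
    # Reject characters that are not allowed inside a SPARQL IRI reference,
    # so the IRI cannot break out of the angle brackets.
    if any(ch in series_iri for ch in '<>"{}|\\^` '):
        raise ValueError("series_iri contains characters not allowed in an IRI")
    return QUERY_TEMPLATE.substitute(
        series_iri=series_iri, limit=int(limit), offset=int(offset)
    )


if __name__ == "__main__":
    # Example IRI is made up for illustration.
    print(build_series_members_query("https://example.org/series/daily", limit=100))
```

The resulting query string would be sent to whatever SPARQL endpoint the catalog exposes; paging with `LIMIT`/`OFFSET` keeps each request bounded even when a series has thousands of members.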
Ok, thanks for the example, I did not expect that! Still, I think it is important to clarify that this may happen. @jakubklimek have you checked whether this is handled gracefully by data.europa.eu?
Actually, I don't see dataset series handled by data.europa.eu at all, for example this dataset series looks like an empty dataset 🤷🏻♂️
So far there are no restrictions on the size or organisation of a DatasetSeries. So unless there are specific requests, I would not restrict sizes.
During the webinar on 21 Nov 2023 the working group agreed not to impose any restrictions of this kind.
It is highly likely that data portals will assume that the datasets in a dataset series can be listed directly (that is, that there are quite few of them). That means there won't be a searchable UI for finding datasets inside a dataset series. This assumption is supported by scenarios such as one dataset being added to a dataset series per year. If this assumption is wrong, it should be stated, at least via an example.
We believe the specification should support this assumption by providing guidance on what to do when a series drifts towards containing many more datasets. For example:
If you find that you need to create hundreds of datasets in a dataset series, it would be better to consider other strategies, e.g.:
Note that data portals will likely hide datasets that are part of a dataset series and only give access to them via the series.