Add optional dataset category and frequency to align with aleph #109

Merged
merged 1 commit into from May 3, 2023


simonwoerpel
Contributor

This PR adds two optional properties to a dataset, frequency and category, both of which exist in the aleph dataset model (https://github.com/alephdata/aleph/blob/main/aleph/model/collection.py#L20).

It extends the nomenklatura.dataset.util.type_check function with an optional literal parameter to test for the membership of the value in this literal.
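To illustrate the idea, here is a minimal sketch of a type check with an optional membership test. It is not the actual nomenklatura implementation: the function name type_check_sketch, its signature, and the FREQUENCIES values are hypothetical and chosen only for this example.

```python
from typing import Any, Optional, Sequence, Type, TypeVar

T = TypeVar("T")


def type_check_sketch(
    value: Any,
    type_: Type[T],
    literal: Optional[Sequence[T]] = None,
) -> Optional[T]:
    """Return `value` if it has the expected type and, when `literal` is
    given, is one of the allowed values. None passes through unchanged,
    which keeps the property optional."""
    if value is None:
        return None
    if not isinstance(value, type_):
        raise TypeError(f"Expected {type_.__name__}, got {type(value).__name__}")
    if literal is not None and value not in literal:
        raise ValueError(f"Invalid value {value!r}, expected one of {list(literal)}")
    return value


# Hypothetical allowed values, for illustration only.
FREQUENCIES = ("daily", "weekly", "monthly")

checked = type_check_sketch("weekly", str, literal=FREQUENCIES)
```

Leaving literal unset keeps the check purely type-based, so existing callers would not need to change under this design.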

Additionally, the dataset tests are updated accordingly.

Hope that's a useful contribution, I personally definitely have the use case 🙃

@pudo
Member

pudo commented May 2, 2023

Very cool, thanks so much for pulling this together with tests and everything! One thing that I'm a bit torn about is whether it makes sense to replicate the enumerated values for categories in parallel to those in Aleph. The category scheme in aleph is broken to begin with (category error: leak vs. company registry, for example), so I wonder if we shouldn't just use it as a string field in Nomenklatura, which sometimes may happen to contain values aleph understands when fed via the API?

@simonwoerpel
Contributor Author

Yes, agreed on categories. If someone wants to integrate with aleph, they just need to make sure the values match...
For frequencies: I suggest using https://github.com/Sonic0/cron-converter for parsing and validating cron-like strings. What do you think? It is a port of a similar JS library, so the values could easily be adapted for rendering on websites...
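As a rough illustration of what validating a cron-like frequency string could involve, here is a standard-library-only sketch. It does not use or reproduce the cron-converter API, and the thread does not settle on a final format for frequency. The name is_valid_cron and the supported syntax (five numeric fields with *, lists, ranges and steps, no month or weekday names) are assumptions made for this example.

```python
import re

# Assumed (min, max) per field: minute, hour, day of month, month, day of week.
FIELD_RANGES = ((0, 59), (0, 23), (1, 31), (1, 12), (0, 6))

# One list item: "*" or "a" or "a-b", optionally followed by "/step".
PART = re.compile(r"^(\*|\d+(-\d+)?)(/(\d+))?$")


def is_valid_cron(expr: str) -> bool:
    """Return True if `expr` is a five-field, numeric cron-like expression."""
    fields = expr.split()
    if len(fields) != len(FIELD_RANGES):
        return False
    for field, (low, high) in zip(fields, FIELD_RANGES):
        for part in field.split(","):
            match = PART.match(part)
            if match is None:
                return False
            base, step = match.group(1), match.group(4)
            if step is not None and int(step) == 0:
                return False  # a step of zero never advances
            if base != "*":
                bounds = [int(b) for b in base.split("-")]
                if any(b < low or b > high for b in bounds):
                    return False
                if len(bounds) == 2 and bounds[0] > bounds[1]:
                    return False
    return True


print(is_valid_cron("0 3 * * *"))   # nightly at 03:00
print(is_valid_cron("daily"))       # not a cron-like string
```

A dedicated library would additionally handle names such as JAN or MON and compute the next run time, which this sketch deliberately leaves out.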

@pudo pudo merged commit 1646979 into master May 3, 2023
3 checks passed
pudo added a commit that referenced this pull request May 3, 2023
@pudo pudo deleted the metadata/dataset branch June 26, 2023 14:33