[MRG] Add a repo provider for CKAN datasets #1833

Open
wants to merge 4 commits into
base: main


u10313335

This PR adds a repository provider for datasets on repositories built upon CKAN, which is an open-source DMS (data management system) for powering data hubs and data portals.

This PR requires a content provider for repo2docker: jupyterhub/repo2docker#1336.

We have adopted this PR in our Binder service (as the CKAN dataset repository): https://binder.depositar.io/


welcome bot commented Mar 7, 2024

Thanks for submitting your first pull request! You are awesome! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

u10313335 changed the title from "[WIP] Add a repo provider for CKAN datasets" to "[MRG] Add a repo provider for CKAN datasets" on Mar 7, 2024
Collaborator

@yuvipanda yuvipanda left a comment

Love to see the binderhub set up! Yay! Apologies for the late response.

This is a companion review to jupyterhub/repo2docker#1336 (review). As with that one, there are a couple of minor stylistic issues.

But the fundamental question is one of versions and 'ref'. Should 'version' be a concept that is directly specifiable here, similar to 'ref' in git? I think you have enough knowledge of CKAN to help answer that.

Regardless, excited to find time to get this PR merged :)
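
To make the version question concrete, here is a minimal sketch of one way a resolved ref could be derived when no explicit version is given. It assumes the ref comes from the dataset's `metadata_modified` timestamp as returned by CKAN's `package_show` action; whether this PR does exactly that is an assumption, and `resolved_ref_from_package` is a hypothetical name.

```python
from datetime import datetime, timezone


def resolved_ref_from_package(json_body):
    """Hypothetical helper: derive a ref string from a package_show response."""
    # Assumption: the timestamp is a naive ISO 8601 string that can be read as UTC.
    modified = json_body["result"]["metadata_modified"]
    dt = datetime.fromisoformat(modified).replace(tzinfo=timezone.utc)
    # Whole seconds since the epoch serve as an opaque, sortable ref.
    return str(int(dt.timestamp()))


sample = {"result": {"metadata_modified": "2024-03-07T00:00:00"}}
ref = resolved_ref_from_package(sample)
```

With this approach the ref changes whenever the dataset metadata changes. A directly specifiable version would instead have to be supplied by the user and mapped onto whatever versioning the CKAN instance exposes.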

client = AsyncHTTPClient()

api = parsed_repo._replace(
    path=re.sub(self.url_regex, "/api/3/action/", parsed_repo.path)
Collaborator

Same comment about using regexes as in jupyterhub/repo2docker#1336 (comment).
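
The referenced comment is not reproduced in this thread. As a rough illustration only, one regex-free way to build the API URL is to split the path with plain string operations. This sketch assumes dataset pages live at `<site>/dataset/<id>` and that the target is CKAN's `package_show` action; `ckan_package_show_url` is a hypothetical helper, not code from this PR.

```python
from urllib.parse import urlencode, urlparse, urlunparse


def ckan_package_show_url(dataset_url):
    """Hypothetical helper: map a CKAN dataset page URL to its package_show API URL."""
    parsed = urlparse(dataset_url)
    # Everything before the last "/dataset/" segment is treated as the site prefix.
    prefix, sep, rest = parsed.path.rpartition("/dataset/")
    dataset_id = rest.strip("/").split("/")[0]
    if not sep or not dataset_id:
        return None  # not a recognizable dataset URL
    api = parsed._replace(
        path=f"{prefix}/api/3/action/package_show",
        query=urlencode({"id": dataset_id}),
        fragment="",
    )
    return urlunparse(api)


api_url = ckan_package_show_url("https://data.example.org/ckan/dataset/my-data/")
```

The `data.example.org` host is a placeholder. Keeping the prefix means instances mounted under a sub-path would still resolve to the right API root.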

Author

Thanks. I have made changes according to your suggestion.

except HTTPError:
    return None

def parse_date(json_body):
Collaborator

Given that this function is used only once, let's just inline that here. Otherwise the function gets redefined each time the parent function is called. Also adds another layer of indirection that's otherwise not needed.
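
A small sketch of the point about redefinition, using placeholder names rather than the PR's actual code: a nested `def` is executed on every call of the enclosing function, so each call builds a new function object, while the inlined form does the same work directly.

```python
def get_ref_nested(json_body):
    # This def statement executes on every call, creating a fresh function object.
    def parse_date(body):
        return body["date"]  # "date" is a placeholder key, not taken from the PR

    return parse_date(json_body), parse_date


def get_ref_inlined(json_body):
    # Same result with the single-use helper folded into the caller.
    return json_body["date"]


body = {"date": "2024-03-07"}
value_a, helper_a = get_ref_nested(body)
value_b, helper_b = get_ref_nested(body)
# helper_a and helper_b are distinct objects even though their code is identical.
```

The cost of re-creating the helper is small, so the main gain from inlining is readability: one less name to follow for logic that is used once.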

Author

Sure. I will fix this as soon as possible.

@u10313335
Author

Thank you again for the review!

> But the fundamental question is one of versions and 'ref'. Should 'version' be a concept here directly specifiable, similar to 'ref' in git? I think you have enough knowledge of CKAN to help answer that.

Please also find my comments at jupyterhub/repo2docker#1336 (comment).
