Enable the private datasets #39
Comments
It's not compatible with the cache for now. See #39 to re-establish the feature.
Previous code has been removed with b08e649, for reference |
Requested here: huggingface/datasets#3604 |
Also: asked for by the AutoNLP team, to be able to preview the training datasets |
https://github.com/huggingface/moon-landing/pull/2442 has added an endpoint to check the authentication: |
As we get more enterprise prospects and customers, it would be nice to be able to show off the dataset viewer on private datasets. So +1 on this feature request from me. |
Yes, it's in the list of the next features that we will implement. We're working on a roadmap to make it clearer for everybody what could be expected in the following months. Viewer for private models could not be reasonably implemented with the previous "artisanal" infrastructure because it would have fallen under the load, but now that we run in Kubernetes, it should work seamlessly. |
Required here for private hub: https://huggingface.slack.com/archives/CTKK32GE8/p1658236535144079?thread_ts=1658236048.820219&cid=CTKK32GE8 |
Priority level:
|
Requested on the forum: https://discuss.huggingface.co/t/the-dataset-preview-has-been-disabled-on-this-dataset/21339/5 |
Also here: https://discuss.huggingface.co/t/does-the-rest-api-work-with-private-repo/28987 (to consume the API) |
Hi! Chiming in to show interest. I completely understand this is not a top priority, but being able to have the viewer in private datasets would be super cool (also as a last final "sanity check" before making the repo public for example). Happy to help debug, if needed |
+1 to this feature |
+1 for this feature |
+1 |
+1 This would be very useful |
I see this issue has been tagged as a P2 -- are you able to give us a rough estimate of what that might mean in terms of when it could land? |
No, we will update here when we have an ETA. Meanwhile, see #39 (comment) |
+1 |
Is there a way we can explicitly enable this for "private datasets" (at our own risk)? At the moment my dataset has to be private, but I am less concerned about leakages etc. Would love to enable it |
No, it's not possible at the moment. It is not a replacement, but if it works for you, maybe you can set your dataset as gated: the dataset viewer works with them. |
What is the strictest possible gating? Can you restrict it to effectively be a private dataset? |
It's not the same as private, but if you opt to manually approve the requests to access the gated dataset (https://huggingface.co/docs/hub/datasets-gated#manual-approval), you would avoid giving public access to the data. |
+1 to enable dataset viewer on private datasets |
+1. As an enterprise user, I don't want to share the dataset publicly, but not having a preview is really frustrating and makes collaboration difficult. Currently, we are creating a tiny dataset (10 rows) and making that public just to have a preview and understand what is what. |
It's not compatible with the cache for now. See huggingface/dataset-viewer#39 to re-establish the feature.
+1 for this! The datasets filtering and sorting UI is very very barebones, so the viewer would help to navigate private datasets a lot |
Also, internal request: https://huggingface.slack.com/archives/C02EMARJ65P/p1702930353945389 |
+1 for viewing private datasets |
please continue +1'ing this issue (can be on the OP) so we get a sense for how many people/teams need this! 🙏 |
+1 |
This would be great for development of datasets, especially since the documentation for how the viewer/preview works is quite limited and currently requires piecing together information from multiple pages on the docs. While we're waiting on this to be developed (assuming it still will be), could the docs be updated with clearer requirements for the preview feature (i.e., more than just that the dataset needs to be public)? When I run the list parquet files query I'm able to see that the parquet has failed, but it's not clear why. Is there a way to get more information about what went wrong? For one of the datasets in question it is simply a collection of images in |
Done! The private datasets are now supported in the datasets-server, enabling the dataset viewer + parquet conversion on the dataset pages. [Video attachment: Enregistrement.de.l.ecran.2024-01-31.a.10.39.59.mov] Note that it's a paid feature, available as of today for Pro users and Enterprise orgs. Please give us feedback when you try this new feature! |
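For readers who want to consume the API on a private dataset (as asked earlier in the thread), here is a minimal sketch of a parquet-listing request that carries a user token. It is not taken from the datasets-server code: the base URL, the `/parquet` route with a `dataset` query parameter, and the `Authorization: Bearer` header are assumptions based on the public API documentation, and `build_parquet_request` is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

# Assumption: public base URL of the datasets-server API.
API_BASE = "https://datasets-server.huggingface.co"


def build_parquet_request(dataset: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request listing a dataset's parquet files.

    Hypothetical helper: the token is passed as a Bearer token, which is
    assumed to be how the API authenticates access to private datasets.
    """
    query = urllib.parse.urlencode({"dataset": dataset})
    return urllib.request.Request(
        f"{API_BASE}/parquet?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder values for illustration only.
request = build_parquet_request("my-org/my-private-dataset", "hf_xxx")
print(request.full_url)
```

Sending it with `urllib.request.urlopen(request)` should return a JSON body. Access still depends on the token's permissions and on the account plan mentioned above.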
Thank you, Sylvain! |
The code to pass the token is already present, but it's disabled (hardcoded):
https://github.com/huggingface/datasets-preview-backend/blob/df04ffba9ca1a432ed65e220cf7722e518e0d4f8/src/datasets_preview_backend/cache.py#L119-L120