| Field | Value |
|---|---|
| title | Github To Huggingface Dataset Migration Tool |
| emoji | |
| colorFrom | green |
| colorTo | gray |
| sdk | gradio |
| sdk_version | 5.20.0 |
| app_file | app.py |
| pinned | true |
| python_version | 3.11.6 |
| hf_oauth | true |
| hf_oauth_scopes | |
| license | apache-2.0 |
| thumbnail | |
| short_description | A web-based tool to migrate and analyze datasets with ease |

This web-based tool allows you to migrate and analyze datasets from GitHub to the Hugging Face Datasets Hub. It provides a user-friendly interface for importing GitHub repositories, analyzing their contents, and exporting them to the Hub with automatic dataset card generation and validation.
Key Features:
- Import GitHub Repositories: Easily import GitHub repositories containing datasets by providing their URLs.
- Dataset Analysis: Analyze the repository's structure, identify potential dataset files (CSV, JSON, etc.), and extract relevant metadata.
- Hugging Face Hub Integration: Seamlessly export datasets to the Hugging Face Datasets Hub with built-in validation and dataset card generation.
- AI-Powered Analysis: Leverage AI to generate summaries of the dataset, analyze data quality, and identify potential issues.
- Comprehensive Reports: Generate detailed reports on code quality, community engagement, technical metrics, and AI-powered insights.
The tool is meant for users who want to:
- Share their datasets with the Hugging Face community.
- Discover and access datasets from GitHub.
- Analyze and understand datasets before using them.
- Improve the quality and accessibility of their datasets.
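
To make the dataset-analysis feature more concrete, here is a minimal sketch of how candidate dataset files might be picked out of a cloned repository by file extension. It is not taken from the tool's source: the function name `find_dataset_files` and the extension list are assumptions for illustration (this README only names CSV and JSON explicitly).

```python
from pathlib import Path

# Hypothetical extension list; the README only names CSV and JSON explicitly.
DATASET_EXTENSIONS = {".csv", ".json", ".jsonl", ".tsv", ".parquet"}


def find_dataset_files(repo_dir: str) -> list[str]:
    """Return repo-relative paths of files whose extension suggests dataset content."""
    root = Path(repo_dir)
    matches = []
    for path in root.rglob("*"):
        # Skip directories and anything stored under the .git metadata folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        # Compare extensions case-insensitively so "DATA.CSV" is also found.
        if path.suffix.lower() in DATASET_EXTENSIONS:
            matches.append(path.relative_to(root).as_posix())
    return sorted(matches)
```

Extension matching is only a first pass. A real analysis step would likely also inspect file contents and sizes before treating a file as a dataset.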
How to Use:
- Provide GitHub Repository URL: Enter the URL of the GitHub repository containing the dataset you want to migrate.
- Analyze and Review: The tool will analyze the repository and extract relevant metadata. Review and edit the metadata as needed.
- Export to Hugging Face: Click the "Export to Hugging Face" button to initiate the migration process.
- Generate Report: Download a comprehensive analysis report to gain insights into the dataset.
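
The first step above depends on turning a pasted URL into a repository owner and name. The sketch below shows one way that validation could work using only the standard library. The function name `parse_github_url` is hypothetical, and the tool's own handling may differ.

```python
from urllib.parse import urlparse


def parse_github_url(url: str) -> tuple[str, str]:
    """Split a GitHub repository URL into (owner, repo); raise ValueError if malformed."""
    parsed = urlparse(url.strip())
    host = parsed.netloc.lower()
    if parsed.scheme not in ("http", "https") or host not in ("github.com", "www.github.com"):
        raise ValueError(f"Not a GitHub URL: {url!r}")
    # Drop empty segments produced by leading or trailing slashes.
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 2:
        raise ValueError(f"Expected https://github.com/<owner>/<repo>, got {url!r}")
    owner, repo = parts[0], parts[1]
    # Clone URLs often end in ".git"; strip that suffix from the repo name.
    return owner, repo.removesuffix(".git")
```

For example, `parse_github_url("https://github.com/octocat/Hello-World.git")` returns `("octocat", "Hello-World")`. Rejecting malformed input early lets the interface show a clear error before any analysis starts.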
- Technology Stack: This tool is built using Gradio, Python, and Hugging Face Transformers.
- AI Provider: AI-powered analysis is provided by Anthropic.
- Open Source: The code for this tool is available on GitHub.
We welcome contributions and feedback from the community to make this tool even better!
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference