A comprehensive platform for ML dataset management and code generation with Hugging Face integration.
- Dataset Management: Upload, explore, and manage machine learning datasets
- Data Visualization: Visualize dataset statistics and distributions
- Code Generation: Fine-tune models for code generation tasks
- Code Quality Tools: Improve code quality with integrated formatters, linters, and type checkers
- Frontend: Streamlit
- Backend: Python
- Database: SQLite (via SQLAlchemy)
- ML Integration: Hugging Face Transformers, Datasets
- Visualization: Plotly, Matplotlib
.
├── app.py # Main application entry point
├── components/ # UI components
│ ├── code_quality.py # Code quality tools
│ ├── dataset_preview.py # Dataset preview component
│ ├── dataset_statistics.py # Dataset statistics component
│ ├── dataset_uploader.py # Dataset upload component
│ ├── dataset_validation.py # Dataset validation component
│ ├── dataset_visualization.py # Dataset visualization component
│ └── fine_tuning/ # Fine-tuning components
│     ├── finetune_ui.py # Fine-tuning UI
│     └── model_interface.py # Model interface
├── database/ # Database configuration
│ ├── models.py # Database models
│ └── operations.py # Database operations
├── utils/ # Utility functions
│ ├── dataset_utils.py # Dataset utilities
│ ├── huggingface_integration.py # Hugging Face integration
│ └── smolagents_integration.py # smolagents integration
└── assets/ # Static assets
This application is designed to be deployed as a Hugging Face Space.
- Fork this repository
- Create a new Hugging Face Space
- Connect the forked repository to your Space
- The application will be deployed automatically
- Clone the repository
- Install dependencies:
pip install streamlit pandas numpy plotly matplotlib scikit-learn SQLAlchemy huggingface-hub datasets transformers torch
- Run the application:
streamlit run app.py
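Before running the app locally, it can help to confirm that every package from the install command is importable. The script below is a minimal sketch, not part of the repository: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced for illustration. It relies only on the standard library and maps each pip name to its import name, since the two differ for packages such as scikit-learn.

```python
from importlib.util import find_spec

# Hypothetical mapping: pip package name -> module name used in `import`.
# The pip names are taken from the install command above.
REQUIRED = {
    "streamlit": "streamlit",
    "pandas": "pandas",
    "numpy": "numpy",
    "plotly": "plotly",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "SQLAlchemy": "sqlalchemy",
    "huggingface-hub": "huggingface_hub",
    "datasets": "datasets",
    "transformers": "transformers",
    "torch": "torch",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be located."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Suggest a single install command covering only what is absent.
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All dependencies found.")
```

Saving this under any file name and running it with `python` before `streamlit run app.py` reports missing packages up front instead of failing at the first import inside the app.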
- .streamlit/config.toml: Streamlit configuration
- .streamlit/secrets.toml: Secrets and API keys
- huggingface-spacefile: Hugging Face Space configuration
To use the Hugging Face integration features, add your Hugging Face API token to .streamlit/secrets.toml:
[huggingface]
hf_token = "YOUR_HF_TOKEN"
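One way the application could read this token is sketched below. This is an illustration under stated assumptions rather than the project's actual code: `get_hf_token` and `PLACEHOLDER` are hypothetical names, and the function accepts any mapping shaped like the `[huggingface]` table above, which is how Streamlit exposes secrets through `st.secrets`.

```python
from collections.abc import Mapping

# The placeholder value shown in the secrets.toml example above.
PLACEHOLDER = "YOUR_HF_TOKEN"


def get_hf_token(secrets: Mapping) -> str:
    """Return the Hugging Face token from a secrets mapping.

    Raises RuntimeError with a readable message when the token is
    missing, empty, or still set to the placeholder value.
    """
    try:
        token = secrets["huggingface"]["hf_token"]
    except KeyError as exc:
        raise RuntimeError(
            "Add [huggingface] hf_token to .streamlit/secrets.toml"
        ) from exc
    if not token or token == PLACEHOLDER:
        raise RuntimeError(
            "hf_token in .streamlit/secrets.toml is empty or a placeholder"
        )
    return token
```

Inside the Streamlit app this would be called as `get_hf_token(st.secrets)` after `import streamlit as st`, and the result could then be passed to `huggingface_hub.login(token=...)`. Keeping the lookup in one function means a misconfigured token fails early with a clear message.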
This project is licensed under the MIT License - see the LICENSE file for details.