A scalable Streamlit-based machine learning pipeline platform specialized for cybersecurity purple-teaming, enabling advanced data processing and model training.
- Distributed Data Processing: Leverage Dask for handling large-scale datasets
- Interactive ML Pipeline: Build and customize machine learning workflows
- Real-time Visualization: Monitor model performance and data insights
- Cybersecurity Focus: Tailored for purple team operations and security analytics
- Dask: Distributed data processing
- Scikit-learn: ML model training and evaluation
- Streamlit: Interactive web interface
- Pandas/NumPy: Data manipulation and analysis
- Matplotlib/Seaborn: Data visualization
- Clone the repository:
  `git clone https://github.com/yourusername/cybersec-ml-pipeline.git`
  `cd cybersec-ml-pipeline`
- Install dependencies:
  `pip install -r requirements.txt`
- Run the application:
  `streamlit run app.py`
- Data Upload
  - Support for CSV and JSON formats
  - Automatic handling of large datasets using Dask
- Pipeline Configuration
  - Choose preprocessing steps
  - Configure model parameters
  - Select features for training
- Model Training
  - Interactive parameter tuning
  - Real-time performance metrics
  - Visual model evaluation
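The upload, configuration, and training steps above can be sketched in a few functions. This is a minimal illustration rather than the application's actual code: the function names (`pick_reader`, `load_dataset`, `build_pipeline`, `train`), the choice of `StandardScaler` and `RandomForestClassifier`, and the 80/20 split are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the dask.dataframe reader name.
READERS = {".csv": "read_csv", ".json": "read_json"}


def pick_reader(path):
    """Return the name of the Dask reader for a CSV or JSON file."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix or path}")
    return READERS[suffix]


def load_dataset(path):
    """Load a file lazily as a Dask DataFrame; partitions are read on demand."""
    import dask.dataframe as dd  # imported here so pick_reader works without Dask

    # Note: dd.read_json expects line-delimited records by default.
    reader = getattr(dd, pick_reader(path))
    return reader(str(path))


def build_pipeline(n_estimators=100, max_depth=None):
    """Chain a scaling step and a classifier; the model choice is an assumption."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    model = RandomForestClassifier(
        n_estimators=n_estimators, max_depth=max_depth, random_state=0
    )
    return Pipeline([("scale", StandardScaler()), ("model", model)])


def train(path, feature_columns, target_column):
    """Fit the pipeline on the selected features and return held-out accuracy."""
    from sklearn.model_selection import train_test_split

    # Materialize as pandas; this assumes the selected columns fit in memory.
    frame = load_dataset(path)[feature_columns + [target_column]].compute()
    X_train, X_test, y_train, y_test = train_test_split(
        frame[feature_columns], frame[target_column], test_size=0.2, random_state=0
    )
    pipeline = build_pipeline()
    pipeline.fit(X_train, y_train)
    return pipeline.score(X_test, y_test)
```

A call such as `train("events.csv", ["bytes_sent", "duration"], "is_malicious")` would return the accuracy on the held-out 20% of rows; the file and column names here are placeholders, not part of the project.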
To set up GitHub Actions for pushing to Hugging Face Hub, follow these steps:
- Create a GitHub Actions workflow file: The workflow file should be located at `.github/workflows/hf-push.yml`.
- Trigger the workflow: The workflow should be triggered on a push to the `main` branch.
- Set up Python environment: Ensure the Python version is set to 3.11.
- Install dependencies: Install the necessary dependencies including `requests`, `pandas`, `numpy`, `plotly`, `scikit-learn`, `statsmodels`, `streamlit`, `nltk`, and `huggingface_hub`.
- Retrieve Hugging Face token: The Hugging Face token (`HF_TOKEN`) should be retrieved from the GitHub secrets and set as an environment variable.
- Push to Hugging Face Hub: Use the `huggingface_hub` library to push the repository contents to the Hugging Face Hub.

Make sure you have the `HF_TOKEN` secret set up in your GitHub repository settings to authenticate with Hugging Face Hub.
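The final two steps (reading the token and pushing the contents) could look roughly like the script below, which the workflow would run after installing dependencies. It is a sketch under assumptions: the function names, the default `repo_type="space"`, the ignore patterns, and the commit message are illustrative choices, and the target repository ID must be supplied by you. `HfApi.upload_folder` is the `huggingface_hub` call used for the upload.

```python
import os


def get_hf_token(env=None):
    """Read HF_TOKEN from the environment and fail early if it is missing."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it as a GitHub Actions secret.")
    return token


def push_to_hub(repo_id, folder_path=".", repo_type="space"):
    """Upload the checked-out repository to the Hub.

    repo_id and repo_type are assumptions; use the values for your own Hub repo.
    """
    # Imported lazily so the token check above has no third-party dependency.
    from huggingface_hub import HfApi

    api = HfApi(token=get_hf_token())
    return api.upload_folder(
        folder_path=folder_path,
        repo_id=repo_id,
        repo_type=repo_type,
        ignore_patterns=[".git/*", ".github/*"],  # skip VCS and CI metadata
        commit_message="Sync from GitHub Actions",
    )
```

In the workflow, the step that runs this script would map the secret to the environment variable and then call, for example, `push_to_hub("your-username/cybersec-ml-pipeline")`, where the repository ID is a placeholder.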
Please read our Contributing Guidelines for details on our code of conduct and the process for submitting pull requests.
For security concerns, please review our Security Policy.
This project is licensed under the MIT License - see the LICENSE file for details.
- Streamlit community for the amazing framework
- Scikit-learn team for the ML tools
- All contributors who help improve this project