A portfolio of applied AI and machine learning projects demonstrating techniques across time series forecasting, natural language processing, traditional machine learning, and data visualisation, currently including:
- XGBoost: Gradient boosting implementation for time series prediction
- Prophet: Facebook's forecasting tool for time series with seasonality
- Multi-horizon forecasting (e.g., 1-day, 7-day, 30-day ahead)
- Comprehensive metrics: MAE, RMSE, MAPE, SMAPE, R², MASE
- Automated train/validation/test splitting with temporal ordering
- Feature engineering (lag features, date components); these pieces are sketched below
- Model comparison with detailed visualisations
- Persistent model storage and results tracking
- Machine learning models: Logistic Regression, Naive Bayes
- Deep learning models: feed-forward neural network, CNN, RNN (LSTM), GRU, Bi-LSTM with attention
- Pre-trained language models:
  - BERT, RoBERTa, DistilBERT (base & fine-tuned)
Available at edin-vis.streamlit.app
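To make the time series items above concrete, here is a minimal, self-contained sketch of lag features, a temporal split, and the two less common metrics (SMAPE, MASE). The function and column names are illustrative, not the pipeline's actual API:

```python
import numpy as np
import pandas as pd

def make_lag_features(df: pd.DataFrame, target: str, lags=(1, 7, 30)) -> pd.DataFrame:
    """Add lag features and simple date components to a datetime-indexed frame."""
    out = df.copy()
    for lag in lags:
        out[f"{target}_lag_{lag}"] = out[target].shift(lag)
    out["dayofweek"] = out.index.dayofweek  # date components
    out["month"] = out.index.month
    return out.dropna()  # drop rows lost to the longest lag

def temporal_split(df: pd.DataFrame, val_frac: float = 0.15, test_frac: float = 0.15):
    """Chronological split: no shuffling, so validation and test lie strictly in the future."""
    n = len(df)
    val_start = int(n * (1 - val_frac - test_frac))
    test_start = int(n * (1 - test_frac))
    return df.iloc[:val_start], df.iloc[val_start:test_start], df.iloc[test_start:]

def smape(y_true, y_pred) -> float:
    """Symmetric mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100 * np.mean(2 * np.abs(y_pred - y_true) / (np.abs(y_true) + np.abs(y_pred))))

def mase(y_true, y_pred, y_train, m: int = 1) -> float:
    """MAE scaled by the in-sample error of a naive seasonal (lag-m) forecast."""
    y_true, y_pred, y_train = (np.asarray(a, float) for a in (y_true, y_pred, y_train))
    scale = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return float(np.mean(np.abs(y_true - y_pred)) / scale)
```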
Since my main goal here is learning and gaining experience, I have tried to adhere to the following principles:
- Modular Architecture: Clean separation between different ML domains with reusable pipeline components
- Production-Ready Code: Well-structured, documented, and maintainable implementations
- Comprehensive Evaluation: Multiple metrics, visualisations, and comparative analysis
- Configuration-Driven: Easy experimentation through YAML configuration files (see the loading sketch after this list)
- Extensible Design: Framework supports adding new models and project types
- Modern Libraries: Uses well-known Python packages such as wandb, mlflow, pytorch, lightning, transformers, huggingface-hub, accelerate, unsloth, dspy, langchain, and weaviate
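As a concrete illustration of the configuration-driven point, loading such a config is a one-liner with PyYAML. The path and key layout here are assumptions based on the structure shown later in this README:

```python
import yaml  # PyYAML

# All model and pipeline settings live in YAML,
# so an experiment changes by editing config, not code.
with open("TimeSeries/configs/config.yaml") as f:
    config = yaml.safe_load(f)

# Hypothetical access pattern for the XGBoost section shown below.
xgb_params = config["models"]["ml"]["xgboost"]
```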
```
AI-Portfolio/
├── TimeSeries/                  # Time series forecasting
│   ├── configs/
│   │   └── config.yaml          # Model and pipeline configuration
│   ├── data/
│   │   ├── raw/                 # Original time series data
│   │   └── processed/           # Processed datasets
│   ├── models/                  # Model implementations
│   │   ├── statistical/         # Prophet, ARIMA, ETS, Theta
│   │   ├── ml/                  # XGBoost, LightGBM, Random Forest
│   │   └── neural/              # LSTM, GRU, N-BEATS, Transformer
│   ├── src/                     # Core pipeline code
│   │   ├── data_loader.py       # Data loading and preprocessing
│   │   ├── pipeline.py          # Main pipeline orchestration
│   │   ├── evaluate.py          # Evaluation and metrics
│   │   └── utils.py             # Utility functions
│   ├── results/                 # Output storage
│   │   ├── models/              # Saved models
│   │   ├── figures/             # Visualisations
│   │   └── metrics/             # Performance metrics
│   ├── main.py                  # Entry point
│   └── requirements.txt         # Dependencies
├── NLP/                         # Natural Language Processing tasks
│   ├── configs/
│   │   └── config.yaml          # Model and pipeline configuration
│   ├── data/
│   │   ├── books/
│   │   ├── dvd/
│   │   ├── electronics/
│   │   └── kitchen_&_housewares/
│   ├── models/                  # Model implementations
│   │   ├── Deep/                # FFNN, CNN, RNN, Transformers
│   │   ├── LMs/                 # BERT, RoBERTa, DistilBERT
│   │   └── ml/                  # Logistic Regression, Naive Bayes
│   ├── src/                     # Core pipeline code
│   │   ├── data_loader.py       # Data loading and preprocessing
│   │   ├── pipeline.py          # Main pipeline orchestration
│   │   ├── evaluate.py          # Evaluation and metrics
│   │   └── utils.py             # Utility functions
│   └── results/                 # Output storage
│       ├── models/              # Saved models
│       ├── figures/             # Visualisations
│       └── metrics/             # Performance metrics
└── MachineLearning/             # Traditional ML tasks (PLANNED)
```
- Navigate to the desired directory (TimeSeries, NLP/Classification):

```bash
cd DIRECTORY
```

- Install dependencies:
```bash
uv python install 3.12
uv venv --python 3.12
source .venv/bin/activate  # optional
uv pip install -r requirements.txt
```

- Run the pipeline for each sub-project with default settings:
```bash
python main.py
```

Edit `**/configs/config.yaml` in each sub-project. E.g. for the Time Series project you can customise the following (sketched below):
- Data Settings: File paths, timestamp column, target variables
- Model Parameters: Hyperparameters for XGBoost and Prophet
- Forecast Horizons: Which time steps ahead to predict (e.g., [1, 7, 30])
- Evaluation Metrics: Which metrics to compute
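For illustration, those sections might look roughly like this; the key names are assumptions, not the pipeline's exact schema:

```yaml
# Hypothetical layout; see the actual config.yaml for the real schema.
data:
  path: data/raw/series.csv
  timestamp_column: date
  target: sales
forecast:
  horizons: [1, 7, 30]
evaluation:
  metrics: [mae, rmse, mape, smape, r2, mase]
```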
Example configuration for XGBoost:

```yaml
models:
  ml:
    xgboost:
      n_estimators: 100
      max_depth: 6
      learning_rate: 0.1
```
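Parameters like these typically map straight onto the model constructor. With the real xgboost API (the config-unpacking comment is my own sketch):

```python
from xgboost import XGBRegressor

# Equivalent to unpacking the YAML section above:
# model = XGBRegressor(**config["models"]["ml"]["xgboost"])
model = XGBRegressor(n_estimators=100, max_depth=6, learning_rate=0.1)
```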
Define experiments in `**/main.py`, e.g.:

```python
EXPERIMENTS = [
    {
        "model": "xgboost",
    },
    {
        "model": "prophet",
    },
]
```
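A minimal sketch of how main.py might consume this list (run_experiment is a hypothetical stand-in for the pipeline call, not the project's actual function):

```python
def run_experiment(spec: dict) -> None:
    """Hypothetical: train, evaluate, and persist the model named in spec."""
    print(f"Running experiment for model: {spec['model']}")

for spec in EXPERIMENTS:
    run_experiment(spec)
```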
After running the pipeline, results are saved to:

- Models: `**/results/models/`
- Metrics: `**/results/metrics/results_summary.json`
- Figures: `**/results/figures/`
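To inspect the metrics afterwards, something like the following works, assuming the summary is plain JSON as the filename suggests:

```python
import json

with open("TimeSeries/results/metrics/results_summary.json") as f:
    summary = json.load(f)

print(json.dumps(summary, indent=2))  # pretty-print all recorded metrics
```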
Below are the latest model comparison results from the time series forecasting pipeline, comparing XGBoost and Prophet across multiple forecast horizons.
Comprehensive comparison of model performance across different forecast horizons (1-day, 7-day, 30-day ahead).
Detailed performance metrics for 1-step ahead forecasting.
Visual comparison of actual vs predicted values on the test set across all forecast horizons.
Planned additions:
- Additional statistical models: ARIMA, ETS, Theta
- Machine learning models: LightGBM, Random Forest, Gradient Boosting
- Deep learning models: xLSTM, GRU, N-BEATS, Transformer, TCN
- Advanced ensemble methods
- Hyperparameter optimisation
- Multi-modal agentic VQA with open-source VLMs
- More open-source [thinking] models for classification (nemotron, mistral, qwen, etc.), covering zero-shot, few-shot, and chain-of-thought prompting