mathisdelsart/ML-Classification

ML-Classification - Machine Learning Projects

A collection of four machine learning classification projects covering medical diagnosis (Parkinson's disease), waveform classification, heart failure prediction, and gene mutation analysis.

About

This repository demonstrates fundamental and advanced machine learning classification techniques across four domains. Each project highlights a different part of the ML workflow (data exploration, feature engineering, hyperparameter tuning, model comparison, or ensemble methods), using medical and biological datasets plus one synthetic waveform dataset.


Projects

Decision Tree - Parkinson's Disease

Medical diagnosis classification using decision trees and random forests.

  • Dataset: Parkinson's disease patient measurements
  • Task: Binary classification (Parkinson's vs healthy)
  • Methods: Decision trees, Random Forest, hyperparameter optimization
  • Tools: Cross-validation, learning curves, tree complexity analysis
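
The tuning and comparison workflow listed above can be sketched roughly as follows. This is a minimal illustration, not the notebooks' actual code: it uses synthetic data from `make_classification` as a stand-in for the Parkinson's measurements, and the parameter grid, fold count, and forest size are assumptions chosen for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: the real feature set is not described here).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=6, random_state=0
)

# Hypothetical grid over two tree-complexity parameters.
param_grid = {"max_depth": [2, 4, 8, None], "min_samples_leaf": [1, 5, 10]}

# 5-fold cross-validated grid search for the single decision tree.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=5, scoring="accuracy"
)
search.fit(X, y)

# Cross-validated accuracy of an untuned random forest, for comparison.
forest_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0), X, y, cv=5
)

print("best tree params:", search.best_params_)
print("tuned tree CV accuracy: %.3f" % search.best_score_)
print("random forest CV accuracy: %.3f" % forest_scores.mean())
```

Because the grid search selects parameters on the same folds it reports, `best_score_` is an optimistic estimate; a held-out test set, such as the one each project ships, gives the fairer number.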

Linear Discriminant & SVM - Waveform Classification

Multi-class waveform pattern classification with discriminant analysis and support vector machines.

  • Dataset: Synthetic waveform patterns with noise
  • Task: Multi-class classification (3 waveform types)
  • Methods: Linear Discriminant Analysis (LDA), SVM (linear, polynomial, RBF, sigmoid kernels)
  • Tools: Kernel comparison, feature importance, model performance evaluation
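
A kernel comparison of the kind listed above might look like the sketch below. The data is a synthetic three-class problem generated with `make_classification`, not the repository's waveform files, and the sample count, feature count, and default SVM settings are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 3-class stand-in for the waveform data (assumed sizes).
X, y = make_classification(
    n_samples=450,
    n_features=20,
    n_informative=8,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# LDA plus one SVM per kernel; SVMs are scaled first because they are
# sensitive to feature magnitudes.
models = {"lda": LinearDiscriminantAnalysis()}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    models["svm-" + kernel] = make_pipeline(StandardScaler(), SVC(kernel=kernel))

# Mean 5-fold cross-validated accuracy per model.
results = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {score:.3f}")
```

Putting the scaler inside the pipeline means it is refit on each training fold, so no statistics leak from the validation fold into the model.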

Heart Failure Classification

Cardiovascular risk prediction using various classification algorithms.

  • Dataset: Heart failure clinical records
  • Task: Binary classification (survival prediction)
  • Methods: Decision trees, statistical analysis, model comparison
  • Tools: Feature analysis, class imbalance handling, performance metrics
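
One common way to handle class imbalance with a decision tree is class weighting, sketched below. This README does not say which technique the notebooks use, so `class_weight="balanced"` is an assumption, as are the synthetic 80/20 data and the tree depth.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, balanced_accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data: roughly 80% class 0, 20% class 1 (assumed ratio).
X, y = make_classification(
    n_samples=600, n_features=12, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

reports = {}
for label, class_weight in (("none", None), ("balanced", "balanced")):
    clf = DecisionTreeClassifier(
        max_depth=4, class_weight=class_weight, random_state=0
    )
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    reports[label] = {
        "accuracy": accuracy_score(y_test, pred),
        # Balanced accuracy is the mean of the per-class recalls.
        "balanced_accuracy": balanced_accuracy_score(y_test, pred),
        "macro_recall": recall_score(y_test, pred, average="macro"),
        "minority_recall": recall_score(y_test, pred, pos_label=1),
    }

for label, metrics in reports.items():
    print(label, {k: round(v, 3) for k, v in metrics.items()})
```

Plain accuracy can look good on imbalanced data even when the minority class is mostly missed, which is why the sketch reports balanced accuracy and minority-class recall alongside it.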

Gene Mutation Prediction

Advanced ensemble learning for predicting gene mutation activity status.

  • Dataset: Gene expression and mutation data
  • Task: Binary classification (active vs inactive mutations)
  • Methods: Stacking ensemble, nested cross-validation, advanced feature selection
  • Tools: PCA, feature engineering, balanced accuracy optimization
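
The combination of stacking, PCA, and nested cross-validation named above could be wired together as in this sketch. It is not the contents of `complete_pipeline.py`: the base learners, the PCA grid, the fold counts, and the synthetic data are all assumptions picked to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, mildly imbalanced stand-in for the gene data (assumed sizes).
X, y = make_classification(
    n_samples=200, n_features=30, n_informative=8, weights=[0.7, 0.3], random_state=0
)

# Hypothetical stack: two base learners, logistic regression as meta-learner.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("svc", SVC(random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=3,
)
pipe = Pipeline([("scale", StandardScaler()), ("pca", PCA()), ("stack", stack)])

# Inner loop tunes the number of principal components; outer loop estimates
# how well the whole tuning procedure generalizes.
inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
search = GridSearchCV(
    pipe, {"pca__n_components": [5, 10]}, cv=inner, scoring="balanced_accuracy"
)
outer_scores = cross_val_score(search, X, y, cv=outer, scoring="balanced_accuracy")

print("outer-fold balanced accuracy:", outer_scores.round(3))
print("mean: %.3f" % outer_scores.mean())
```

Each outer fold reruns the full grid search on its own training portion, so the reported scores never come from data that influenced the choice of `n_components`.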

Quick Start

Each project is self-contained, with its own notebooks or scripts and datasets. Run the commands for one project from the repository root:

# Decision Tree - Parkinson's Disease
cd decision-tree-parkinson
jupyter notebook data-exploration.ipynb

# Linear Discriminant & SVM - Waveform Classification
cd linear-discriminant-svm-waveform
jupyter notebook linear-discriminant-analysis.ipynb

# Heart Failure Classification
cd heart-failure-classification
jupyter notebook data-exploration.ipynb

# Gene Mutation Prediction
cd gene-mutation-prediction
python complete_pipeline.py

Technologies

| Technology         | Purpose                                |
| ------------------ | -------------------------------------- |
| Python 3.8+        | Core implementation language           |
| Jupyter Notebook   | Interactive analysis and visualization |
| Pandas             | Data manipulation and analysis         |
| NumPy              | Numerical computing                    |
| Matplotlib/Seaborn | Data visualization                     |
| Scikit-learn       | Machine learning algorithms            |
| SciPy              | Statistical computations               |

Repository Structure

ML-Classification/
├── decision-tree-parkinson/           # Parkinson's disease classification
│   ├── data-exploration.ipynb
│   ├── hyperparameter-tuning.ipynb
│   ├── random-forest-comparison.ipynb
│   ├── data/
│   │   ├── parkinson_train.csv
│   │   └── parkinson_test.csv
│   └── results/
├── linear-discriminant-svm-waveform/  # Waveform classification
│   ├── linear-discriminant-analysis.ipynb
│   ├── svm-classification.ipynb
│   └── data/
│       ├── waveform_train.csv
│       └── waveform_test.csv
├── heart-failure-classification/      # Cardiovascular risk prediction
│   ├── data-exploration.ipynb
│   ├── classification-models.ipynb
│   └── data/
│       ├── HeartFailure_train.csv
│       └── HeartFailure_test.csv
├── gene-mutation-prediction/          # Gene mutation activity prediction
│   ├── complete_pipeline.py
│   ├── data/
│   └── results/
│       ├── nested_cv_results.csv
│       └── final_test_predictions.csv
└── README.md

Each project folder contains:

  • Jupyter notebooks with the full analysis (a Python script for gene mutation prediction)
  • Datasets (real-world, except for the synthetic waveform data)
  • Results and visualizations
  • Comprehensive model evaluations

Author

Mathis DELSART

License

This project is developed for academic purposes as part of university coursework.


Built for LINFO2262 - Machine Learning: Classification and Evaluation @ UCLouvain (Université catholique de Louvain).
