A collection of machine learning algorithms implemented during my Master's degree program. This repository showcases hands-on implementations of supervised and unsupervised learning, plus regularization techniques, using Python and scikit-learn.
The algorithms are organized as Colab notebooks written throughout my graduate studies. Each notebook includes a theoretical explanation, a practical implementation, and a real-world application of the algorithm.
Program: Master's Degree
Focus Area: Artificial Intelligence & Machine Learning
Duration: 2 years of intensive study and implementation
- Linear Regression - Simple linear regression for continuous variable prediction
- Multivariate Linear Regression - Multiple feature regression analysis
- Multivariate Linear Regression (Extended) - Advanced multivariate techniques
- Logistic Regression - Binary and multiclass classification
- Decision Tree - Tree-based classification and regression
- Support Vector Machine (SVM) - Margin-based classification
- SVM with ROC-AUC Analysis - Performance evaluation using ROC curves
- SVM vs Logistic Regression - Comparative analysis
- Naive Bayes (Multinomial) - Text classification and probability-based predictions
- Complement Naive Bayes - Enhanced NB for imbalanced datasets
- K-Means Clustering - Centroid-based clustering
- Finding Optimal K Value - Elbow method and silhouette analysis
- DBSCAN Clustering - Density-based spatial clustering
- Hierarchical Clustering - Agglomerative and divisive clustering
- Regularization Techniques - Ridge, Lasso, and Elastic Net for overfitting prevention
- Python 3.x
- Jupyter Notebook / Google Colab
- Libraries:
  - NumPy - Numerical computing
  - Pandas - Data manipulation
  - Scikit-learn - Machine learning algorithms
  - Matplotlib - Data visualization
  - Seaborn - Statistical plotting
  - SciPy - Scientific computing
- Simple and multiple linear regression
- Polynomial regression
- Residual analysis
- R-squared and adjusted R-squared
- Binary and multiclass classification
- Decision boundaries
- Confusion matrix analysis
- Precision, recall, and F1-score
- ROC-AUC curves
- Centroid-based clustering (K-Means)
- Density-based clustering (DBSCAN)
- Hierarchical clustering dendrograms
- Cluster evaluation metrics
- Train-test split
- Cross-validation
- Hyperparameter tuning
- Overfitting and underfitting
- Regularization (L1, L2)
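
The evaluation concepts listed above can be combined in a few lines. The following is a minimal sketch, not code taken from the notebooks: the synthetic dataset from `make_classification`, the choice of logistic regression, and the grid of `C` values are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic binary classification data (an assumption for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Train-test split: hold out a test set that is never used during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Cross-validation: 5 folds, computed on the training data only.
baseline = LogisticRegression(max_iter=1000)
cv_scores = cross_val_score(baseline, X_train, y_train, cv=5)

# Hyperparameter tuning: smaller C means stronger regularization.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X_train, y_train)

# Final check on the held-out data using the best model found by the search.
test_accuracy = search.score(X_test, y_test)
print(f"Mean CV accuracy: {cv_scores.mean():.3f}")
print(f"Best C: {search.best_params_['C']}")
print(f"Held-out test accuracy: {test_accuracy:.3f}")
```

A large gap between the cross-validated score and the held-out score is one common sign of overfitting; similar low scores on both usually point to underfitting.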
- Install the dependencies:

      pip install numpy pandas scikit-learn matplotlib seaborn jupyter

- Clone the repository:

      git clone https://github.com/Manya123-max/Machine-Learning-Algorithms.git
      cd Machine-Learning-Algorithms

- Launch Jupyter Notebook:

      jupyter notebook

- Or open in Google Colab:
  - Upload the `.ipynb` files to Google Drive
  - Open with Google Colaboratory
Through these implementations, I have gained proficiency in:
- ✅ Understanding mathematical foundations of ML algorithms
- ✅ Implementing algorithms from scratch and using libraries
- ✅ Data preprocessing and feature engineering
- ✅ Model selection and evaluation
- ✅ Handling imbalanced datasets
- ✅ Interpreting model performance metrics
- ✅ Applying appropriate algorithms to real-world problems
- ✅ Optimizing model performance through regularization
Machine-Learning-Algorithms/
│
├── Regression/
│ ├── ML1_LinearRegiression.ipynb
│ ├── MULTIVARIATE.ipynb
│ └── multivariatelinearregression.ipynb
│
├── Classification/
│ ├── LogisticRegression.ipynb
│ ├── DecisionTree.ipynb
│ ├── SuppoortVectorMachineExample.ipynb
│ ├── SupportVectorMachineROC_AUC.ipynb
│ ├── SupportVectorMachine_Logistic.ipynb
│ ├── NaiveBayesMultinomial.ipynb
│ └── COMPLEMENT_NB.ipynb
│
├── Clustering/
│ ├── K_MeansClustering1.ipynb
│ ├── FINDING__k_value_AND_cluster2.ipynb
│ ├── DBSCAN_CLUSTERING3.ipynb
│ └── HierarchicalClustering.ipynb
│
├── Optimization/
│ └── Regularization.ipynb
│
└── README.md
The Support Vector Machine notebooks cover kernel tricks, margin optimization, and a comparative analysis with logistic regression using ROC-AUC metrics.
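
The SVM versus logistic regression comparison described above might look roughly like the sketch below. It is not the notebooks' code: the synthetic data, the RBF kernel, and the `C=1.0` setting are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data (an assumption for illustration).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# RBF-kernel SVM. decision_function returns signed distances to the
# separating boundary, which can be used directly as ROC scores.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(X_train, y_train)
svm_auc = roc_auc_score(y_test, svm.decision_function(X_test))

# Logistic regression baseline, scored by the positive-class probability.
log_reg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
log_reg.fit(X_train, y_train)
log_reg_auc = roc_auc_score(y_test, log_reg.predict_proba(X_test)[:, 1])

print(f"SVM ROC-AUC:                 {svm_auc:.3f}")
print(f"Logistic regression ROC-AUC: {log_reg_auc:.3f}")
```

Both models are wrapped in a pipeline with `StandardScaler` because SVMs are sensitive to feature scale, and scaling both keeps the comparison fair.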
The clustering notebooks follow a complete workflow, from finding the optimal K value with the elbow method to density-based and hierarchical clustering techniques.
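
As a rough illustration of that workflow, the sketch below scans several values of K, records the inertia used by the elbow method alongside the silhouette score, and then runs DBSCAN on the same data. The blob centers, `cluster_std`, `eps`, and `min_samples` values are assumptions chosen for a clean toy example, not settings from the notebooks.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Three well-separated synthetic blobs (an assumption for illustration).
centers = [[-5, -5], [0, 5], [5, -5]]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=0.8, random_state=0)

inertias = {}     # within-cluster sum of squares, plotted for the elbow method
silhouettes = {}  # mean silhouette coefficient, higher is better
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X, model.labels_)

# Pick the K with the highest silhouette score.
best_k = max(silhouettes, key=silhouettes.get)
print(f"Best K by silhouette score: {best_k}")

# DBSCAN needs no K; the label -1 marks points treated as noise.
db_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)
n_db_clusters = len(set(db_labels) - {-1})
print(f"Clusters found by DBSCAN: {n_db_clusters}")
```

Plotting `inertias` against K gives the elbow curve; the silhouette score offers a numeric cross-check when the elbow is ambiguous.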
The regularization notebook explores L1 (Lasso) and L2 (Ridge) regularization for preventing overfitting in linear models.
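
One way to see the practical difference between the two penalties is to compare the fitted coefficients. The sketch below is illustrative only: the synthetic regression problem (3 informative features out of 10) and `alpha=1.0` are assumptions, not values from the notebook.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data where only 3 of the 10 features carry signal (an assumption).
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)

# L2 penalty: shrinks all coefficients toward zero without eliminating them.
ridge = Ridge(alpha=1.0).fit(X, y)

# L1 penalty: can drive coefficients exactly to zero, acting as feature selection.
lasso = Lasso(alpha=1.0).fit(X, y)

ridge_zeros = int(np.sum(ridge.coef_ == 0))
lasso_zeros = int(np.sum(lasso.coef_ == 0))
print(f"Ridge coefficients equal to zero: {ridge_zeros}")
print(f"Lasso coefficients equal to zero: {lasso_zeros}")
```

Elastic Net, available as `sklearn.linear_model.ElasticNet`, mixes both penalties through its `l1_ratio` parameter.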
Author: Manya
This project is open source and available for educational purposes.