This repository is a comprehensive collection of implementations of essential machine learning models. Each implementation includes a brief explanation, the dataset used, and the Jupyter Notebook used to demonstrate the model. This resource is valuable for beginners and practitioners who want to understand, experiment with, and apply machine learning concepts.
The repository covers supervised, unsupervised, and deep learning models. Each model includes:
- Description: Overview of the model and its key features.
- Dataset: The dataset used for demonstration, with sources linked when applicable.
- Code: A Jupyter Notebook implementation built on popular libraries such as scikit-learn, TensorFlow, and PyTorch.
- Evaluation: Metrics used to assess the performance of each model.
The implementations are designed to be modular, making it easy to adapt them to new datasets or use cases.
- Python 3.8+
- Libraries:
numpy, pandas, scikit-learn, matplotlib, seaborn, tensorflow, torch, gym
- Clone the repository:
git clone https://github.com/Mo-Sam-Mo/ml-models-repo.git
cd ml-models-repo
- Install dependencies:
pip install -r requirements.txt
Each model's implementation is stored in its respective folder. To run a specific model:
- Navigate to the folder, e.g., Linear_Regression.
- Open the Jupyter Notebook:
jupyter notebook linear_regression.ipynb
Each notebook is well-documented, and datasets will either be loaded directly from libraries (e.g., sklearn.datasets) or downloaded automatically.
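As a minimal sketch of the "loaded directly from libraries" case, the snippet below pulls two of the bundled scikit-learn datasets. The variable names are illustrative and the notebooks may load their data differently.

```python
from sklearn.datasets import load_breast_cancer, load_iris

# return_X_y=True returns the feature matrix and the label vector directly,
# without any download step (both datasets ship with scikit-learn).
X_iris, y_iris = load_iris(return_X_y=True)
X_bc, y_bc = load_breast_cancer(return_X_y=True)

print("iris:", X_iris.shape, y_iris.shape)
print("breast cancer:", X_bc.shape, y_bc.shape)
```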
Contributions are welcome! If you'd like to add new models, improve the existing code, or fix issues:
- Fork the repository.
- Create a new branch:
git checkout -b feature-name
- Commit your changes:
git commit -m 'Add feature'
- Push to the branch:
git push origin feature-name
- Submit a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.
For queries or feedback, feel free to reach out:
- Name: Mo-Sam-Mo
- GitHub: Mo-Sam-Mo
- Description: Predicts continuous values by fitting a linear equation to the input data.
- Dataset: Randomly created dataset for linear regression (Kaggle Datasets)
- Metrics: Mean Squared Error (MSE), R-squared.
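The linear regression entry can be illustrated with a short sketch. Because the Kaggle file is not reproduced here, the data below is a synthetic stand-in (an assumption: y = 3x + 2 plus Gaussian noise), so the numbers will not match the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): y = 3x + 2 plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the line on the training split and score it on the held-out split.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"slope={model.coef_[0]:.2f} intercept={model.intercept_:.2f}")
print(f"MSE={mse:.2f} R2={r2:.3f}")
```

The fitted slope and intercept should land close to the generating values, and MSE should sit near the noise variance.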
- Description: Used for binary and multi-class classification by predicting probabilities.
- Dataset: Breast Cancer Dataset (sklearn.datasets)
- Metrics: Accuracy, Precision, Recall, F1-Score.
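A minimal sketch of this entry, assuming it refers to logistic regression on scikit-learn's bundled breast cancer data. The scaling step, split ratio, and `max_iter` value are illustrative choices, not necessarily those used in the notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Standardizing the features helps the solver converge.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

# The model outputs class probabilities; predict() thresholds them at 0.5.
proba = clf.predict_proba(X_test)[:, 1]
y_pred = clf.predict(X_test)

scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
print(scores)
```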
- Description: Non-linear model that splits data based on feature conditions.
- Dataset: Iris Dataset (sklearn.datasets)
- Metrics: Accuracy, Gini Impurity.
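A minimal sketch, assuming this entry refers to a decision tree classifier. Gini impurity is the criterion the tree minimizes when choosing splits, so the hypothetical helper `gini_impurity` below computes it by hand for comparison. The depth limit is an illustrative choice.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a label array (illustrative helper)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# criterion="gini" makes each split minimize the weighted Gini impurity.
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
tree.fit(X_train, y_train)

acc = accuracy_score(y_test, tree.predict(X_test))
root_gini = gini_impurity(y_train)  # impurity before any split
print(f"accuracy={acc:.3f} root Gini={root_gini:.3f}")
```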
- Description: An ensemble of decision trees for improved performance.
- Dataset: Titanic Dataset (Kaggle)
- Metrics: Accuracy, ROC-AUC.
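A minimal sketch, assuming this entry refers to a random forest. The Titanic CSV from Kaggle is not bundled with any library, so `make_classification` serves as a synthetic stand-in here. The tree count and split ratio are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption: stand-in for Titanic).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Each tree is trained on a bootstrap sample; predictions are averaged.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

acc = accuracy_score(y_test, forest.predict(X_test))
# ROC-AUC needs scores rather than hard labels, so use the class-1 probability.
auc = roc_auc_score(y_test, forest.predict_proba(X_test)[:, 1])
print(f"accuracy={acc:.3f} ROC-AUC={auc:.3f}")
```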
- Description: Assigns labels based on the majority class of k-nearest neighbors.
- Dataset: Wine Quality Dataset (UCI)
- Metrics: Accuracy, Confusion Matrix.
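A minimal k-nearest neighbors sketch. The UCI Wine Quality files require a download, so this example substitutes scikit-learn's bundled `load_wine` data, which is a different, smaller wine dataset (a stated assumption). `n_neighbors=5` is an illustrative value.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)  # stand-in, not the UCI Wine Quality data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# k-NN is distance-based, so features are scaled to comparable ranges first.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted
print(f"accuracy={acc:.3f}")
print(cm)
```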
- Description: Finds a hyperplane to classify data.
- Dataset: MNIST Dataset (sklearn.datasets)
- Metrics: Accuracy, Precision, Recall.
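A minimal sketch, assuming this entry refers to a support vector machine. To avoid a download, it uses the small 8x8 digits set from `load_digits` as a stand-in for full MNIST. The linear kernel and `C=1.0` are illustrative choices.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)  # 8x8 digits, stand-in for MNIST
X = X / 16.0  # pixel values range from 0 to 16

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A linear kernel fits separating hyperplanes; SVC handles the ten classes
# with a one-vs-one scheme internally.
svm = SVC(kernel="linear", C=1.0)
svm.fit(X_train, y_train)
y_pred = svm.predict(X_test)

acc = accuracy_score(y_test, y_pred)
# Macro averaging weights every digit class equally.
prec = precision_score(y_test, y_pred, average="macro")
rec = recall_score(y_test, y_pred, average="macro")
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```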
- Description: Groups data into k clusters based on feature similarity.
- Dataset: Mall Customers Dataset (Kaggle)
- Metrics: Inertia, Silhouette Score.
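A minimal sketch, assuming this entry refers to k-means. The Mall Customers CSV is not bundled with any library, so three synthetic blobs stand in for it. The centers and `cluster_std` are made-up values for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic 2-D data with three well-separated groups (assumption).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [5, 5], [0, 5]],
    cluster_std=0.5,
    random_state=0,
)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Inertia: sum of squared distances from each point to its cluster center.
inertia = kmeans.inertia_
# Silhouette: ranges from -1 to 1, higher means tighter, better-separated clusters.
sil = silhouette_score(X, kmeans.labels_)
print(f"inertia={inertia:.1f} silhouette={sil:.3f}")
```

Inertia always decreases as k grows, so it is usually read from an elbow plot over several values of k, while the silhouette score can be compared across k directly.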
- Description: Reduces dimensionality by finding orthogonal components.
- Dataset: Digit Recognition Dataset (sklearn.datasets)
- Metrics: Variance Explained Ratio.
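A minimal sketch, assuming this entry refers to principal component analysis on the `load_digits` data. Keeping 10 components is an illustrative choice.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)  # 1797 images, 64 pixel features each

pca = PCA(n_components=10).fit(X)
X_reduced = pca.transform(X)

# Fraction of total variance captured by each component, largest first.
ratios = pca.explained_variance_ratio_
print("per-component:", np.round(ratios, 3))
print("cumulative:", round(float(ratios.sum()), 3))

# The component vectors are orthonormal: their Gram matrix is the identity.
gram = pca.components_ @ pca.components_.T
print("orthonormal:", np.allclose(gram, np.eye(10)))
```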
- Description: Extracts spatial features for image classification tasks.
- Dataset: CIFAR-10 Dataset (tensorflow.keras.datasets)
- Metrics: Accuracy.
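To show what "extracts spatial features" means, here is a plain NumPy sketch of one convolution filter, assuming this entry refers to a convolutional network. It is not the notebook's TensorFlow model. The function `conv2d_valid`, the toy image, and the edge kernel are all hypothetical illustrations.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding.

    This is cross-correlation, which is what deep learning libraries
    usually call "convolution".
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Toy 5x5 image: dark left side, bright right side (a vertical edge).
image = np.zeros((5, 5))
image[:, 2:] = 1.0

# Hand-written vertical-edge detector; a trained CNN learns such kernels.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

# Convolution followed by ReLU, as in a typical convolutional layer.
feature_map = np.maximum(conv2d_valid(image, kernel), 0.0)
print(feature_map)
```

The feature map responds strongly where the window straddles the edge and is zero where the window covers a uniform region.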
- Description: Processes sequential data, with LSTM and GRU variants for improved context.
- Dataset: IMDB Reviews Dataset (tensorflow.keras.datasets)
- Metrics: Accuracy.
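A plain NumPy sketch of the recurrence behind this entry, assuming it refers to a recurrent network. This is the basic (vanilla) cell rather than the LSTM or GRU variants, and it is not the notebook's TensorFlow model. The function `rnn_forward`, the weight names, and the dimensions are hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence and return every hidden state.

    h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b_h), starting from h_0 = 0.
    """
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in inputs:
        # The previous hidden state carries context from earlier time steps.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)


rng = np.random.default_rng(0)
seq_len, input_dim, hidden_dim = 6, 3, 4

inputs = rng.normal(size=(seq_len, input_dim))  # e.g. six embedded tokens
W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

states = rnn_forward(inputs, W_xh, W_hh, b_h)
print(states.shape)  # one hidden vector per time step
```

For a task such as sentiment classification, the final hidden state `states[-1]` would typically feed a dense output layer.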
- Description: Reinforcement learning algorithm to optimize decision-making.
- Dataset: CartPole Environment (gym)
- Metrics: Cumulative Reward.
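The cumulative reward metric can be sketched without the environment itself. The algorithm for this entry is not named here, so the update rule below is an assumption: tabular Q-learning, one common choice. The names `discounted_return` and `q_learning_update` and the 2x2 table are hypothetical. A real CartPole agent would first need to discretize the continuous state or use a function approximator.

```python
import numpy as np


def discounted_return(rewards, gamma=1.0):
    """Cumulative reward of one episode; gamma=1.0 is the plain sum."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step (assumed algorithm, see lead-in)."""
    # Bootstrapped target: immediate reward plus best estimated future value.
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q


# The standard CartPole environments give +1 for every step the pole stays up,
# so an episode that lasts 10 steps has a cumulative reward of 10.
rewards = [1.0] * 10
print(discounted_return(rewards))

# Toy table with 2 states and 2 actions, initialized to zero.
Q = np.zeros((2, 2))
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=1, done=False)
print(Q)
```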