Machine Learning from Scratch

This repository was created to deepen my understanding of machine learning methods by implementing them from scratch in Python Jupyter Notebooks and comparing their results with the models provided by existing libraries.

Description

The notebooks are organized to demonstrate the step-by-step process of building and evaluating algorithms. They rely on a set of widely used Python libraries:

NumPy – for vectorized and matrix operations, linear algebra, and numerical routines.
Pandas – for structured data manipulation, preprocessing, and tabular analysis.
Matplotlib and Seaborn – for data visualization, including exploratory analysis and graphical representation of algorithm results.
Scikit-learn – for access to datasets, utility functions, and baseline models for validation.

Each notebook typically includes:

Math – key formulas and theoretical background for the algorithm or evaluation metric.
Implementation – step-by-step Python code that reproduces the method without relying on high-level machine learning functions.
Datasets – description and links to datasets used in the experiments.
Visualization – plots illustrating the behavior of the algorithm, decision boundaries, performance metrics, or error analysis.
Comparison – evaluation of the custom implementation against Scikit-learn (or other libraries) to verify correctness and performance.
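
The Implementation and Comparison steps above can be sketched as follows. This is a minimal illustration, not code taken from the notebooks: it fits a one-feature linear regression with the closed-form least-squares solution using only the standard library, then checks the result against scikit-learn's `LinearRegression` when that package is installed. The function names `fit_simple_linear_regression` and `compare_with_sklearn` and the toy data are hypothetical.

```python
"""From-scratch simple linear regression with an optional scikit-learn check."""


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form ordinary least squares for a single feature.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def compare_with_sklearn(xs, ys, slope, intercept, tol=1e-8):
    """Return True or False if scikit-learn is installed, otherwise None."""
    try:
        from sklearn.linear_model import LinearRegression
    except ImportError:
        return None
    # scikit-learn expects a 2D feature matrix, so wrap each x in a list.
    baseline = LinearRegression().fit([[x] for x in xs], ys)
    return bool(
        abs(baseline.coef_[0] - slope) < tol
        and abs(baseline.intercept_ - intercept) < tol
    )


# Toy data (made up for illustration): points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_simple_linear_regression(xs, ys)
matches_baseline = compare_with_sklearn(xs, ys, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, matches sklearn: {matches_baseline}")
```

Keeping the baseline check in its own function means the from-scratch code never depends on the library it is being validated against.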

Contents

Notebook	Description
EDA COVID-19	A small exploratory data analysis (EDA) of COVID-19 datasets, focusing on general trends and basic insights from the data.
Linear Regression	Analysis and implementation of linear regression to identify factors that are linearly related to students’ learning outcomes.
K-Nearest Neighbors (KNN)	Implementation and evaluation of the KNN algorithm for both classification and regression tasks.
Principal Component Analysis (PCA)	Application of PCA for dimensionality reduction and visualization, highlighting how the leading principal components capture most of the variance in the dataset.
Clustering Algorithms	Exploration of clustering methods including K-Means, DBSCAN, and Agglomerative Clustering to identify hidden patterns and group structures in the data.
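
To give a feel for what a from-scratch implementation of one of these methods can look like, here is a minimal sketch of K-Nearest Neighbors covering both classification (majority vote) and regression (mean of neighbor targets). It is an illustration written for this README rather than the notebook's code; the function name `knn_predict`, its parameters, and the toy points are hypothetical, and it assumes Euclidean distance and Python 3.8+ for `math.dist`.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3, task="classification"):
    """Predict a label or value for `query` from its k nearest training points."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training points by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    targets = [target for _, target in neighbors]
    if task == "regression":
        # Regression: average the neighbors' target values.
        return sum(targets) / len(targets)
    # Classification: majority vote; ties go to the class seen first (nearest first).
    return Counter(targets).most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated groups of 2D points.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

predicted_label = knn_predict(points, labels, query=(0.5, 0.5), k=3)
predicted_value = knn_predict(points, values, query=(0.2, 0.2), k=3, task="regression")
print(predicted_label, predicted_value)
```

A brute-force search like this costs one distance computation per training point for every query, which is fine for small datasets; library implementations typically offer tree-based indexes for larger ones.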

Author

Created by Denys Bondarchuk. Feel free to reach out or contribute to the project!

thejvdev/ml-from-scratch
