Machine Learning Algorithms from Scratch

This project implements several machine learning algorithms from scratch in Python, including Decision Tree Classifier and Regressor, AdaBoost Classifier, and Gradient Boost Regressor. The goal is to build these algorithms without using any explicit machine learning libraries, except for basic utilities like NumPy.

Dataset

The dataset used in this project consists of real estate features in different areas of Bangalore. The original data can be found here. The data was pre-processed for convenience and split into train, validation, and test sets.

Variables

availability: is the property available immediately (1) or in the near future (0).
total_sqft: the area of the property in square feet (1 foot = 30.54 cm).
bedrooms: the number of bedrooms in the property.
bath: the number of bathrooms in the property.
balcony: the number of balconies in the property.
rank: the ranking of the neighborhood in terms of average price (1 is the highest).
area_type: is the property type a built up area (B) or plot area (P).
price in rupees: the price of the property.

Algorithms Implemented

Decision Tree
- Classifier: Predicts the area type (B, P) using all features in the dataset.
- Regressor: Predicts the price of a property using all features in the dataset.
AdaBoost Classifier: Predicts the area type (B, P) using all features in the dataset.
Gradient Boost Regressor: Predicts the price of a property using all features in the dataset.

Evaluation

Each algorithm builds a model based on the training and validation data and predicts the label of the test data. The following evaluation metrics are reported:

Accuracy for classification tasks.
Mean Squared Error (MSE) for regression tasks.

Comparison with Sklearn

The project also includes implementations of the same algorithms using the Sklearn library. The results of the custom implementations are compared with the Sklearn models in terms of metrics and runtime.

Dependencies

Python 3.x
NumPy
Sklearn (for comparison)

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.gitattributes		.gitattributes
Adaboost_ML_1.html		Adaboost_ML_1.html
Adaboost_ML_1.ipynb		Adaboost_ML_1.ipynb
DecisionTreeClassifier_ML.html		DecisionTreeClassifier_ML.html
DecisionTreeClassifier_ML.ipynb		DecisionTreeClassifier_ML.ipynb
README.md		README.md
Regression_DecisionTree_GradientBoost.html		Regression_DecisionTree_GradientBoost.html
Regression_DecisionTree_GradientBoost.ipynb		Regression_DecisionTree_GradientBoost.ipynb
Section_C_2.txt		Section_C_2.txt
data_hw.csv		data_hw.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Machine Learning Algorithms from Scratch

Dataset

Variables

Algorithms Implemented

Evaluation

Comparison with Sklearn

Dependencies

About

Releases

Packages

Languages

mlk500/ML-Algorithms

Folders and files

Latest commit

History

Repository files navigation

Machine Learning Algorithms from Scratch

Dataset

Variables

Algorithms Implemented

Evaluation

Comparison with Sklearn

Dependencies

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages