This project aims to replicate and extend the comprehensive evaluation of supervised learning algorithms, inspired by the study conducted by Rich Caruana and Alexandru Niculescu-Mizil (CNM06). While the original study compared ten algorithms, this project focuses on three popular ones.
Before CNM06, the last extensive evaluation of supervised learning algorithms dated from the 1990s, and the machine learning landscape has evolved significantly since then. This project revisits the topic, drawing inspiration from the CNM06 study.
Replicate the results of the CNM06 study, focusing on three algorithms: k-nearest neighbors, logistic regression, and decision trees, and evaluate them using several performance metrics.
Caruana, R., and Niculescu-Mizil, A. An Empirical Comparison of Supervised Learning Algorithms Using Different Performance Metrics (Empirical Comparison).
The methodology closely follows the original Cornell study (Empirical Comparison). Three datasets from the UCI Machine Learning Repository were chosen, and each underwent the preprocessing described in the CNM06 paper. For each classifier-dataset combination, three trials were conducted, for a total of 27 trials (3 classifiers × 3 datasets × 3 trials). The hyperparameter tuning process and the specific settings for each algorithm are detailed in the CNM06 paper.
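As a purely illustrative sketch of what the search spaces might look like (the actual tuning settings follow the CNM06 paper and are not reproduced here), the grids below use placeholder values; every parameter value shown is an assumption:

```python
# Illustrative hyperparameter grids for the three classifiers, keyed for a
# scikit-learn pipeline whose estimator step is named "clf".
# NOTE: all values are placeholders; the settings actually used follow the CNM06 paper.
param_grids = {
    "knn":    {"clf__n_neighbors": [1, 5, 11, 21, 51],
               "clf__weights": ["uniform", "distance"]},
    "logreg": {"clf__C": [1e-3, 1e-2, 1e-1, 1, 10, 100]},
    "tree":   {"clf__max_depth": [3, 5, 10, None],
               "clf__min_samples_leaf": [1, 5, 10]},
}
```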
For each dataset, 5-fold cross-validation was performed on a training set of 5,000 examples. Each algorithm was wrapped in a pipeline fitted on the training set, and the best hyperparameters were selected based on cross-validation performance. The accuracy of the tuned model was computed for each trial, and overall performance was obtained by averaging the accuracies across trials. Detailed performance metrics and comparisons can be found in the provided charts.
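A minimal sketch of one classifier-dataset trial, assuming scikit-learn, a preprocessed feature matrix `X` with labels `y`, and the illustrative grids above; the helper name `run_trial` and the seed values are hypothetical and not taken from the project notebook:

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

classifiers = {
    "knn": KNeighborsClassifier(),
    "logreg": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(),
}

def run_trial(X, y, name, seed):
    """One trial: 5,000-example training set, 5-fold CV for tuning, test-set accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=5000, random_state=seed, stratify=y)
    pipe = Pipeline([("scale", StandardScaler()), ("clf", classifiers[name])])
    search = GridSearchCV(pipe, param_grids[name], cv=5, scoring="accuracy")
    search.fit(X_train, y_train)          # hyperparameters chosen by 5-fold CV
    return search.score(X_test, y_test)   # accuracy of the tuned pipeline

# Average accuracy over three trials for one classifier-dataset pair, e.g.:
# accuracies = [run_trial(X, y, "knn", seed) for seed in (0, 1, 2)]
# print(np.mean(accuracies))
```

Note that `GridSearchCV` refits the best pipeline on the full training set by default, so the reported accuracy is that of the tuned model.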
- Visit Google Colab (an internet connection and a Google account are required).
- In the "Open notebook" dialog, select the "GitHub" tab and paste the project URL.
- Download the desired dataset(s) and upload them to Google Colab using the 'Files' icon on the left sidebar.
- Click 'Runtime' and select 'Run all'.

