Execution of k-Nearest Neighbors algorithm on a UCI dataset containing the chemical composition of various types of glass using Python Pandas and Scikit-Learn.

jakemath/knn-sklearn

KNN - Using SKLearn

Problem Statement:

You are provided with a dataset from the USA Forensic Science Service that describes 6 types of glass, defined in terms of their oxide content (e.g. Na, Fe, K). Your task is to use a K-Nearest Neighbors (KNN) classifier to classify the glasses. The original dataset is available from the UCI ML repository (https://archive.ics.uci.edu/ml/datasets/glass+identification). For a detailed description of the dataset's attributes, please refer to that page.

This program performs exploratory data analysis on the dataset using Python Pandas, including dropping fields that are irrelevant to prediction and standardizing each attribute.
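The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the toy values are made up, the column names (`Id`, `RI`, `Na`, `Type`) follow the UCI attribute list, and treating `Id` as the irrelevant field is an assumption.

```python
import pandas as pd

# Toy rows standing in for the glass data; the values are invented for illustration.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],              # row identifier, assumed irrelevant to prediction
    "RI": [1.52, 1.51, 1.53, 1.52],  # refractive index
    "Na": [13.0, 14.0, 12.0, 13.0],  # sodium oxide content
    "Type": [1, 1, 2, 7],            # glass class (the label to predict)
})

# Separate the label and drop the identifier column from the features.
target = df["Type"]
features = df.drop(columns=["Id", "Type"])

# Standardize each attribute to zero mean and unit variance (z-score).
standardized = (features - features.mean()) / features.std(ddof=0)

print(standardized.round(3))
```

Standardization matters for KNN because distances are otherwise dominated by whichever attribute has the largest numeric range.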

Following data cleaning, two Scikit-Learn KNN models are created, one for each of two distance metrics: squared Euclidean and Manhattan. The two models are compared in terms of accuracy on the test data and the Scikit-Learn classification report.
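A comparison along these lines might look like the sketch below. It is only an illustration under stated assumptions: synthetic data from `make_classification` stands in for the glass dataset, and the split ratio, `n_neighbors=5`, and `algorithm="brute"` are choices made here, not values taken from the repository.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the glass data: 9 numeric attributes, 3 classes (assumption).
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

results = {}
for metric in ("sqeuclidean", "manhattan"):
    # Brute-force search is requested explicitly so both metrics use the same code path.
    model = KNeighborsClassifier(n_neighbors=5, metric=metric, algorithm="brute")
    model.fit(X_train_s, y_train)
    predictions = model.predict(X_test_s)
    results[metric] = accuracy_score(y_test, predictions)
    print(f"{metric}: accuracy = {results[metric]:.3f}")
    print(classification_report(y_test, predictions, zero_division=0))
```

Squared Euclidean distance ranks neighbors in the same order as plain Euclidean distance, so with uniform neighbor weights the two give the same predictions. Any difference in this comparison therefore comes from Euclidean versus Manhattan geometry.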
