CS207-AP/DataScience-ClassifierC-

ClassifierC

This repository holds our final project for CS 207: a Naive Bayes classifier, a KNN classifier, and a simple linear regressor written in C, applied to several data science problems.

Archit Checker | Sachin Bhatia | Shivam Agarwal

Exploring Data Science

OVERVIEW

For this project, we wrote a KNN classifier, a Naive Bayes classifier, and a simple regressor from scratch in C. We used Python to pre-process and clean various datasets, then solved classification problems on them.

GOALS

  1. Writing our own classifiers in C.
  2. Working with datasets of various sizes.
  3. Understanding data science techniques.
  4. Using Python to clean datasets.
  5. Using Python libraries to tackle more challenging data science problems.

MILESTONES

BEGINNER LEVEL:

IRIS Dataset

This dataset has 150 rows and 4 feature columns.

Problem: Predict the class of the flower based on the available attributes.
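As an illustration of how a KNN classifier like the one mentioned in the overview could approach this problem, here is a minimal sketch. It is not the code from this repository. The `Sample` struct, the `knn_predict` function, the integer label encoding, and the choice of squared Euclidean distance are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define N_FEATURES 4 /* assumption: four numeric attributes per flower */

/* Hypothetical record layout: feature vector plus an integer class label. */
typedef struct {
    double x[N_FEATURES];
    int label; /* assumed to lie in [0, n_classes) */
} Sample;

/* Squared Euclidean distance; the square root is not needed for ranking. */
static double sq_dist(const double *a, const double *b) {
    double sum = 0.0;
    for (int i = 0; i < N_FEATURES; i++) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Predict the label of `query` by majority vote among its k nearest
 * training samples. Ties go to the lowest label.
 * Returns -1 if memory allocation fails. */
int knn_predict(const Sample *train, size_t n, const double *query,
                int k, int n_classes) {
    assert(train && query && k > 0 && (size_t)k <= n && n_classes > 0);

    double *dist = malloc(n * sizeof *dist);
    char *used = calloc(n, 1);
    int *votes = calloc((size_t)n_classes, sizeof *votes);
    if (!dist || !used || !votes) {
        free(dist);
        free(used);
        free(votes);
        return -1;
    }

    for (size_t i = 0; i < n; i++)
        dist[i] = sq_dist(train[i].x, query);

    /* Pick the k smallest distances by repeated linear scan: O(n * k). */
    for (int j = 0; j < k; j++) {
        size_t best = n; /* n means "nothing chosen yet" */
        for (size_t i = 0; i < n; i++)
            if (!used[i] && (best == n || dist[i] < dist[best]))
                best = i;
        used[best] = 1;
        votes[train[best].label]++;
    }

    int winner = 0;
    for (int c = 1; c < n_classes; c++)
        if (votes[c] > votes[winner])
            winner = c;

    free(dist);
    free(used);
    free(votes);
    return winner;
}
```

A caller would load the cleaned rows into an array of `Sample`, map the three flower classes to 0, 1, and 2, and call `knn_predict(train, n, query, k, 3)`. The repeated scan keeps the sketch short; for larger datasets a partial sort or a heap would likely be a better fit.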

Pima Indian Diabetes Dataset

Problem: Predict whether the person is suffering from type-2 diabetes.

Loan Prediction Dataset

This dataset has 615 rows and 13 columns.

Problem: Predict whether a loan will be approved.

Turkiye Student Evaluation Dataset

This dataset has 5820 rows and 33 columns.

Problem: Predict final grade based on answers to all other questions.
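Survey answers are categorical, so a count-based Naive Bayes classifier is one plausible fit for a problem of this shape. The sketch below is an illustration only, not the repository's implementation. The `CategoricalNB` struct, the `nb_fit` and `nb_predict` functions, the three constants, the 0-based answer coding, and the use of add-one (Laplace) smoothing are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define NB_FEATURES 3 /* hypothetical: number of questions used as features */
#define NB_LEVELS   5 /* assumption: each answer is coded 0..4 */
#define NB_CLASSES  3 /* hypothetical: number of target classes */

/* Counts gathered from the training data. */
typedef struct {
    size_t class_count[NB_CLASSES];
    size_t value_count[NB_CLASSES][NB_FEATURES][NB_LEVELS];
    size_t total;
} CategoricalNB;

/* Count class and per-class answer frequencies.
 * `x` is row-major: sample i, feature f is x[i * NB_FEATURES + f]. */
void nb_fit(CategoricalNB *m, const int *x, const int *y, size_t n) {
    assert(m && x && y && n > 0);
    *m = (CategoricalNB){0};
    m->total = n;
    for (size_t i = 0; i < n; i++) {
        int c = y[i];
        assert(c >= 0 && c < NB_CLASSES);
        m->class_count[c]++;
        for (int f = 0; f < NB_FEATURES; f++) {
            int v = x[i * NB_FEATURES + f];
            assert(v >= 0 && v < NB_LEVELS);
            m->value_count[c][f][v]++;
        }
    }
}

/* Return the class maximising P(c) * prod_f P(x_f | c), with add-one
 * smoothing so unseen answers do not zero out the product.
 * With many features, summing log-probabilities would be safer
 * against underflow than multiplying. */
int nb_predict(const CategoricalNB *m, const int *x) {
    assert(m && x && m->total > 0);
    int best = 0;
    double best_score = -1.0;
    for (int c = 0; c < NB_CLASSES; c++) {
        double score = (double)m->class_count[c] / (double)m->total;
        for (int f = 0; f < NB_FEATURES; f++) {
            assert(x[f] >= 0 && x[f] < NB_LEVELS);
            score *= ((double)m->value_count[c][f][x[f]] + 1.0) /
                     ((double)m->class_count[c] + NB_LEVELS);
        }
        if (score > best_score) {
            best_score = score;
            best = c;
        }
    }
    return best;
}
```

Training is a single pass that only increments counters, which is why Naive Bayes scales comfortably to the larger datasets in this list. The constants would need to match the real column count and answer scale after cleaning.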

INTERMEDIATE LEVEL:

Black Friday Dataset

This dataset has 550,069 rows and 12 columns.

Problem: Predict purchase amount.
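Predicting a numeric amount is a regression task, which is where a simple linear regressor like the one named in the overview would apply. The following is a minimal least-squares sketch for a single predictor, not the repository's code. The `LinearModel` struct and the `linreg_fit` and `linreg_predict` functions are hypothetical names, and the choice of which column to use as the predictor is left open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model type for y = intercept + slope * x. */
typedef struct {
    double slope;
    double intercept;
} LinearModel;

/* Ordinary least squares for one predictor:
 *   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
 *   intercept = mean_y - slope * mean_x
 * Returns 0 on success, -1 if all x values are identical. */
int linreg_fit(const double *x, const double *y, size_t n, LinearModel *out) {
    assert(x && y && out && n >= 2);

    double mean_x = 0.0, mean_y = 0.0;
    for (size_t i = 0; i < n; i++) {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= (double)n;
    mean_y /= (double)n;

    double sxy = 0.0, sxx = 0.0;
    for (size_t i = 0; i < n; i++) {
        double dx = x[i] - mean_x;
        sxy += dx * (y[i] - mean_y);
        sxx += dx * dx;
    }
    if (sxx == 0.0)
        return -1; /* slope is undefined when x has no variance */

    out->slope = sxy / sxx;
    out->intercept = mean_y - out->slope * mean_x;
    return 0;
}

/* Evaluate the fitted line at x. */
double linreg_predict(const LinearModel *m, double x) {
    assert(m);
    return m->intercept + m->slope * x;
}
```

Both passes are linear in the number of rows and need no extra memory, so the approach stays practical at the size of this dataset. Categorical columns would have to be encoded numerically during the Python cleaning step before they could serve as the predictor.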

Trip History Dataset

This dataset has 2.2 lakh (220,000) rows.

Problem: Identify the user type.

ADVANCED LEVEL:

Digit Identifier Dataset

This dataset has pixel values for around 50,000 images, each 28 × 28 pixels.

Problem: Identify digits from pixel values.
