
Readmitted-Rate

Introduction

Predicting the patient readmission rate, which is a supervised learning classification problem.

We use a combination of models (KNN, SVM, Decision Tree, Perceptron, and Naïve Bayes) on a real-world dataset from the UCI Machine Learning Repository.

Including:

EDA
Data Cleaning
Feature Selection
Model tuning
Ensemble

Together, these steps are combined to obtain the best accuracy on the readmission rate.
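The sketch below illustrates how the ensemble step could be built with scikit-learn's VotingClassifier over the base models listed above. It is a minimal sketch, not the repository's exact script: the CSV files are the ones listed under Implementation, while the "readmitted" label column name and the model hyperparameters are assumptions.

```python
# Minimal ensemble sketch: hard voting over the base models.
# Assumptions: New_train_set.csv / New_test_set.csv are in the working
# directory and the label column is named "readmitted" (hypothetical).
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import Perceptron
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

train = pd.read_csv("New_train_set.csv")
test = pd.read_csv("New_test_set.csv")
X_train, y_train = train.drop(columns=["readmitted"]), train["readmitted"]
X_test, y_test = test.drop(columns=["readmitted"]), test["readmitted"]

# Hard-voting ensemble over the base models (illustrative hyperparameters).
ensemble = VotingClassifier(estimators=[
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("svm", SVC(kernel="rbf", C=1.0)),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("perceptron", Perceptron(random_state=0)),
    ("nb", GaussianNB()),
], voting="hard")

ensemble.fit(X_train, y_train)
print("Ensemble test accuracy:", accuracy_score(y_test, ensemble.predict(X_test)))
```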

Implementation:

Put these files in the same location:

Smote_BackwardWrapper_Pipeline.py
ForwardWrapper_Smote.py
Results for the test set.py
GridSearch Main Version Updated.py
Base_Scoring_Algorithm (Corr, Chi2, Mutual_Info, C4_5 Importance).py
New_train_set.csv
New_test_set.csv

Filter Method:

Base_Scoring_Algorithm (Corr, Chi2, Mutual_Info, C4_5 Importance).py is for feature selection:

Calculates the correlation, chi-square statistic, and mutual information between each variable and the target label.
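A minimal sketch of this scoring step, assuming a fully numeric, non-negative feature matrix (chi-square requires non-negative values) and a hypothetical "readmitted" label column; an entropy-based decision-tree importance is used here as a stand-in for the C4.5 importance:

```python
import pandas as pd
from sklearn.feature_selection import chi2, mutual_info_classif
from sklearn.tree import DecisionTreeClassifier

train = pd.read_csv("New_train_set.csv")
# The "readmitted" label column name is an assumption; adjust to the real one.
X, y = train.drop(columns=["readmitted"]), train["readmitted"]

scores = pd.DataFrame(index=X.columns)
scores["corr"] = X.corrwith(y).abs()               # Pearson correlation with the label
scores["chi2"] = chi2(X, y)[0]                     # chi-square statistic (non-negative features only)
scores["mutual_info"] = mutual_info_classif(X, y, random_state=0)
# Entropy-based decision-tree importance as a stand-in for "C4.5 importance".
tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
scores["tree_importance"] = tree.feature_importances_

print(scores.sort_values("mutual_info", ascending=False).head(10))
```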

GridSearch Main Version Updated.py:

Uses the scoring results to perform feature selection with the filter method, then tunes different models on the selected features with grid search.
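A minimal sketch of this step with scikit-learn, using SelectKBest as the filter and an SVM as the model being tuned; the feature counts and parameter grid are illustrative, and X_train/y_train are loaded as in the earlier snippet:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Filter-based selection followed by an SVM, tuned jointly with grid search.
pipe = Pipeline([
    ("select", SelectKBest(score_func=mutual_info_classif)),
    ("svm", SVC()),
])
param_grid = {
    "select__k": [10, 20, 30],        # illustrative numbers of features to keep
    "svm__C": [0.1, 1, 10],
    "svm__kernel": ["linear", "rbf"],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)          # X_train/y_train loaded as in the earlier snippet
print(search.best_params_, search.best_score_)
```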

Results for the test set.py: Computes the accuracy scores of the test set with the best tuned parameters.
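Continuing from the grid search sketch above, scoring the held-out test set is roughly:

```python
from sklearn.metrics import accuracy_score

# GridSearchCV refits the best configuration on the full training set,
# so best_estimator_ can be scored on the held-out test set directly.
best_model = search.best_estimator_
print("Test accuracy:", accuracy_score(y_test, best_model.predict(X_test)))
```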

Wrapper Method:

Forward wrapper to find the best global subset of features.

ForwardWrapper_Smote.py:

Takes the training set as input and returns the selected features as a CSV file.
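A minimal sketch of this kind of forward wrapper using scikit-learn's SequentialFeatureSelector; the repository's own script may implement the wrapper differently, and the wrapped model, target feature count, and output file name are illustrative:

```python
import pandas as pd
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

# Greedy forward selection: add one feature at a time, keeping the set
# that gives the best cross-validated accuracy for the wrapped model.
sfs = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=5),
    n_features_to_select=15,          # illustrative target size
    direction="forward",
    scoring="accuracy",
    cv=5,
)
sfs.fit(X_train, y_train)             # X_train/y_train loaded as in the earlier snippet

selected = X_train.columns[sfs.get_support()]
pd.Series(selected).to_csv("selected_features.csv", index=False)  # hypothetical output name
```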

Smote_BackwardWrapper_Pipeline.py:

Takes advantage of a pipeline to perform SMOTE, backward wrapper selection, and tuning;

Computes the accuracy scores of the test set with the best tuned parameters;

Saves the best parameters, test scores, and selected features as CSV files.
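A minimal sketch of such a pipeline with imbalanced-learn, where SMOTE is applied only inside each cross-validation fold; the estimators, parameter grid, and output file name are illustrative:

```python
import pandas as pd
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import accuracy_score

# SMOTE runs only on the training folds inside cross-validation, so the
# validation folds (and the test set) are never oversampled.
pipe = Pipeline([
    ("smote", SMOTE(random_state=0)),
    ("backward", SequentialFeatureSelector(
        SVC(kernel="linear"), direction="backward",
        n_features_to_select=15, cv=3)),
    ("svm", SVC()),
])
param_grid = {"svm__C": [0.1, 1, 10], "svm__kernel": ["linear", "rbf"]}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)          # X_train/y_train loaded as in the earlier snippet

print("Test accuracy:", accuracy_score(y_test, search.predict(X_test)))
pd.DataFrame([search.best_params_]).to_csv("best_params.csv", index=False)  # hypothetical output name
```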
