# Diabetes Classification - Models
[Jose R. Zapata](https://joserzapata.github.io)
- https://joserzapata.github.io
- https://twitter.com/joserzapata
- https://www.linkedin.com/in/jose-ricardo-zapata-gonzalez/       


## Introduction


Analyze factors related to readmission, as well as other patient outcomes, in order to classify the outcome of a patient's hospital stay.

The target has three possible outcomes:

1. No readmission.

2. Readmission in less than `30` days. This is a bad outcome, because it may mean the treatment was not appropriate.

3. Readmission in more than 30 days. This is also undesirable, like the previous case; however, the cause could be the patient's condition.


## Main Objective

> **How effective was the treatment received in hospital?**

## Principal References

### Paper

Beata Strack, Jonathan P. DeShazo, Chris Gennings, Juan L. Olmo, Sebastian Ventura, Krzysztof J. Cios, and John N. Clore, “Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of 70,000 Clinical Database Patient Records,” BioMed Research International, vol. 2014, Article ID 781670, 11 pages, 2014.

https://www.hindawi.com/journals/bmri/2014/781670/

### Dataset

https://archive.ics.uci.edu/ml/datasets/Diabetes+130-US+hospitals+for+years+1999-2008#

### Data description

https://www.hindawi.com/journals/bmri/2014/781670/tab1/

# Import libraries

In [None]:
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split

# Load Dataset

In [None]:
data = pd.read_csv("../data/processed/data_imbalanced.csv")

In [None]:
data.info()

## Encode categorical columns

In [None]:
# Columns stored as strings (object dtype) are treated as categorical
cat_cols = list(data.select_dtypes('object').columns)
# One-hot encode them; drop_first=True drops one level per column to avoid redundant dummies
data = pd.get_dummies(data, columns=cat_cols, drop_first=True)

In [None]:
data.info()
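To make the effect of the encoding step concrete, here is a minimal sketch on a toy DataFrame. The column names (`gender`, `age_group`, `num_medications`) and their values are made up for illustration and are not read from the processed dataset; the columns to encode are listed explicitly to keep the example short.

```python
import pandas as pd

# Toy frame; column names and values are illustrative assumptions
toy = pd.DataFrame({
    "gender": ["Female", "Male", "Female"],
    "age_group": ["[50-60)", "[60-70)", "[70-80)"],
    "num_medications": [12, 7, 20],
})

# One-hot encode the two categorical columns, dropping the first level of each
toy_encoded = pd.get_dummies(toy, columns=["gender", "age_group"], drop_first=True)

print(list(toy_encoded.columns))
print(toy_encoded.shape)
```

With `drop_first=True`, a column with k distinct levels becomes k - 1 dummy columns, and the dropped level is represented by all of its dummies being zero.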

## Input and output data

In [None]:
X = data.drop("readmitted", axis=1)  # input features
y = data["readmitted"]  # target variable

In [None]:
X.shape

In [None]:
y.shape

## Split dataset

Split the data with `stratify=y` because the dataset is imbalanced; stratifying keeps the class proportions the same in the training and test sets.

In [None]:

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=123, stratify=y
)
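The effect of stratification can be checked on synthetic labels. The sketch below assumes a 60/30/10 class mix, which is an illustrative choice and not the real class distribution of this dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic imbalanced labels (assumed proportions, not the real data):
# 600 samples of class 0, 300 of class 1, 100 of class 2
rng = np.random.default_rng(123)
y_demo = np.repeat([0, 1, 2], [600, 300, 100])
X_demo = rng.normal(size=(y_demo.size, 3))

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=123, stratify=y_demo
)

# Class shares in each split should match the full data closely
train_share = np.bincount(y_tr) / y_tr.size
test_share = np.bincount(y_te) / y_te.size
print("train:", train_share)
print("test: ", test_share)
```

Without `stratify`, the rare class could end up over- or under-represented in the test set purely by chance.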

# Simple Baseline

In [None]:
import lazypredict
from lazypredict.Supervised import LazyClassifier

In [None]:
# Fit many classifiers with default settings and collect their scores in one table
clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)
models
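When reading a model comparison on imbalanced data, it helps to know what a trivial predictor scores. The following sketch is not part of the original workflow: it uses scikit-learn's `DummyClassifier` on synthetic labels with an assumed 60/30/10 class mix to show that always predicting the majority class gives a deceptively high accuracy but a chance-level balanced accuracy.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Synthetic labels with an assumed 60/30/10 class mix (not the real data)
y_demo = np.repeat([0, 1, 2], [600, 300, 100])
X_demo = np.zeros((y_demo.size, 1))  # features are ignored by DummyClassifier

# Always predict the most frequent class
dummy = DummyClassifier(strategy="most_frequent").fit(X_demo, y_demo)
pred = dummy.predict(X_demo)

acc = accuracy_score(y_demo, pred)               # share of the majority class
bal_acc = balanced_accuracy_score(y_demo, pred)  # mean recall over the classes
print(f"accuracy={acc:.3f}, balanced accuracy={bal_acc:.3f}")
```

Any candidate model should beat both numbers before it is worth tuning.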

# Models

# Cross-Validation and Evaluation (k-fold Cross Validation)

Train the algorithms using the training set.
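A minimal sketch of this step, under stated assumptions: synthetic data from `make_classification` stands in for `X_train` and `y_train`, `LogisticRegression` stands in for whichever models are chosen, and balanced accuracy is an assumed scoring choice suited to imbalanced classes.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic 3-class imbalanced data standing in for the training set
X_demo, y_demo = make_classification(
    n_samples=600, n_features=10, n_informative=5, n_classes=3,
    weights=[0.6, 0.3, 0.1], random_state=123,
)

# Stratified folds keep the class mix similar in every fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=123)
model = LogisticRegression(max_iter=1000)

# One score per fold; the mean and spread summarize expected performance
scores = cross_val_score(model, X_demo, y_demo, cv=cv, scoring="balanced_accuracy")
print(f"mean={scores.mean():.3f}, std={scores.std():.3f}")
```

Only the training set is used here, so the test set stays untouched until the final evaluation.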

# Hyperparameter Optimization

Only a few models are selected for this step, because it is computationally expensive.
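One common way to keep the cost bounded is a randomized search, which samples a fixed number of parameter combinations instead of trying them all. The sketch below is an assumption about how this step could look, not the notebook's actual choice: the model (`RandomForestClassifier`), the parameter grid, and the synthetic data are all illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data standing in for the training set
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=123
)

# Candidate values are illustrative, not tuned for the diabetes dataset
param_distributions = {
    "n_estimators": [25, 50, 100],
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 10],
}

# n_iter caps the number of sampled combinations, and therefore the cost
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=123),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="balanced_accuracy",
    random_state=123,
)
search.fit(X_demo, y_demo)
print(search.best_params_, round(search.best_score_, 3))
```

The total number of model fits is `n_iter` times the number of folds, plus one final refit on all the data passed to `fit`.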

# Final Model Test with the Test Set
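A sketch of what the final check could look like, assuming a fitted model and a held-out test set. The data is synthetic and `final_model` is a hypothetical stand-in for the model chosen in the previous steps.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic 3-class imbalanced data (not the real dataset)
X_demo, y_demo = make_classification(
    n_samples=600, n_features=10, n_informative=5, n_classes=3,
    weights=[0.6, 0.3, 0.1], random_state=123,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=123, stratify=y_demo
)

# Hypothetical final model, fitted on the training portion only
final_model = RandomForestClassifier(n_estimators=50, random_state=123).fit(X_tr, y_tr)

# Evaluate once on the held-out test set
y_pred = final_model.predict(X_te)
cm = confusion_matrix(y_te, y_pred)  # rows: true class, columns: predicted class
print(cm)
print(classification_report(y_te, y_pred))
```

The per-class precision and recall in the report matter more than overall accuracy here, because the rare classes are the ones of clinical interest.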

# Save Model

In [None]:
from joblib import dump  # serialization library

# save the model to a file
#dump(Modelo_final, 'Nombre_Archivo_Modelo.joblib')
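The save step can be verified with a round trip: dump the model, load it back, and confirm the predictions match. In this sketch the model, the synthetic data, and the file name `model_demo.joblib` are all hypothetical, and a temporary directory is used so no file is left behind.

```python
import os
import tempfile

from joblib import dump, load
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Small synthetic problem and a hypothetical final model
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=123)
final_model = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model_demo.joblib")  # hypothetical file name
    dump(final_model, model_path)
    restored_model = load(model_path)

# The restored model should reproduce the original predictions exactly
same_predictions = bool(
    (restored_model.predict(X_demo) == final_model.predict(X_demo)).all()
)
print(same_predictions)
```

A joblib file should be loaded with the same library versions used to save it, and only from a trusted source, since loading can execute arbitrary code.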

# Communicating Results (Data Storytelling)

# Conclusions

# Help and References

- Correction to: Hospital Readmission of Patients with Diabetes - https://link.springer.com/article/10.1007/s11892-018-0989-1

- Centers for Disease Control and Prevention, Diabetes Atlas - https://gis.cdc.gov/grasp/diabetes/DiabetesAtlas.html

- https://medium.com/@joserzapata/paso-a-paso-en-un-proyecto-machine-learning-bcdd0939d387
- [a-complete-machine-learning-walk-through-in-python-part-one](https://towardsdatascience.com/a-complete-machine-learning-walk-through-in-python-part-one-c62152f39420)

- https://www.kaggle.com/vignesh1609/readmission-classification-model

- https://www.kaggle.com/kavyarall/predicting-effective-treatments/
