# Lab 7: Heart Attack

### The Data

In [1]:
import pandas as pd
import numpy as np
from plotnine import *
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.preprocessing import StandardScaler, OneHotEncoder, PolynomialFeatures, MinMaxScaler
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet, LogisticRegression
from sklearn.metrics import mean_squared_error, r2_score, roc_auc_score, confusion_matrix
from sklearn.neighbors import KNeighborsRegressor, KNeighborsClassifier
from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier


import warnings
from sklearn.exceptions import ConvergenceWarning

In [2]:
ha = pd.read_csv("https://www.dropbox.com/s/aohbr6yb9ifmc8w/heart_attack.csv?dl=1")

# No missing values; outliers still need to be checked
ha

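| Unnamed: 0 | age | sex | cp | trtbps | chol | restecg | thalach | output |
|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 1 | 3 | 145 | 233 | 0 | 150 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 1 | 187 | 1 |
| 2 | 56 | 1 | 1 | 120 | 236 | 1 | 178 | 1 |
| 3 | 57 | 0 | 0 | 120 | 354 | 1 | 163 | 1 |
| 4 | 57 | 1 | 0 | 140 | 192 | 1 | 148 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 268 | 59 | 1 | 0 | 164 | 176 | 0 | 90 | 0 |
| 269 | 57 | 0 | 0 | 140 | 241 | 1 | 123 | 0 |
| 270 | 45 | 1 | 3 | 110 | 264 | 1 | 132 | 0 |
| 271 | 68 | 1 | 0 | 144 | 193 | 1 | 141 | 0 |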


In [3]:
ha['output'] = pd.Categorical(ha['output'], categories=[0, 1], ordered=True)
ha['cp'] = pd.Categorical(ha['cp'], categories=[0, 1, 2, 3], ordered=True)

In [4]:
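# 'sex' is already coded numerically in the data (0 = female, 1 = male),
# so it needs no mapping; mapping from the strings 'female'/'male' would
# turn every value into NaN.

# Mapping for 'cp'
# The data codes cp as 0-3, so the keys must be 0-3 as well (keys 1-4 leave
# cp == 0 unmapped). Assumption: the labels keep their listed order, shifted
# down by one; check this against the data dictionary.
cp_mapping = {
    0: 'typical angina',
    1: 'atypical angina',
    2: 'non-anginal pain',
    3: 'asymptomatic'
}
ha['cp'] = ha['cp'].map(cp_mapping)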

# Mapping for 'restecg'
restecg_mapping = {
    0: 'normal',
    1: 'ST-T wave abnormality',
    2: 'probable/definite left ventricular hypertrophy'
}
ha['restecg'] = ha['restecg'].map(restecg_mapping)

# Mapping for 'output'
output_mapping = {
    0: 'not at risk',
    1: 'at risk'
}
ha['output'] = ha['output'].map(output_mapping)

### Outliers
I checked each of the numerical variables for outliers. Each variable had a handful, but none of the values looked like a calculation or measurement error, so I did not remove any of them.
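The text above does not say which rule was used to flag outliers. As one possible sketch, the common 1.5 × IQR rule can be applied to every numeric column at once. The helper name `count_iqr_outliers` and the toy values are hypothetical and are not taken from the lab data.

```python
import pandas as pd


def count_iqr_outliers(df: pd.DataFrame, k: float = 1.5) -> pd.Series:
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] for each numeric column."""
    numeric = df.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # Comparisons align the per-column fences with the DataFrame columns.
    outside = (numeric < q1 - k * iqr) | (numeric > q3 + k * iqr)
    return outside.sum()


# Toy frame with made-up values (assumption: stands in for the numeric columns of `ha`).
toy = pd.DataFrame({
    "chol": [200, 210, 220, 230, 240, 600],
    "thalach": [150, 155, 160, 165, 170, 175],
})
outlier_counts = count_iqr_outliers(toy)
```

On the real data the same call would be `count_iqr_outliers(ha)`. Flagged rows still need a manual look to decide whether they are plausible measurements.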

# Part One: Fitting Models

This section asks you to create a final best model for each of the model types studied this week. For each, you should:

- Find the best model based on ROC AUC for predicting the target variable.
- Report the (cross-validated!) ROC AUC metric.
- Fit the final model.
- Output a confusion matrix; that is, the counts of how many observations fell into each predicted class for each true class.
- (Where applicable) Interpret the coefficients and/or estimates produced by the model fit.

You should certainly try multiple model pipelines to find the best model. You do not need to include the output for every attempted model, but you should describe all of the models explored. You should include any hyperparameter tuning steps in your writeup as well.
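The workflow described above (tune, report cross-validated ROC AUC, fit the final model, output a confusion matrix) could look roughly like the sketch below. It uses synthetic data from `make_classification` as a stand-in for the lab data, and the grid of `n_neighbors` values is an arbitrary assumption rather than a recommended search space.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the lab data (assumption: 270 rows, 7 numeric predictors).
X_demo, y_demo = make_classification(n_samples=270, n_features=7, random_state=0)

pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])

# Tune k by cross-validated ROC AUC; the candidate values are illustrative only.
grid = GridSearchCV(pipe, {"knn__n_neighbors": [3, 5, 11, 21]}, scoring="roc_auc", cv=5)
grid.fit(X_demo, y_demo)

best_k = grid.best_params_["knn__n_neighbors"]
cv_auc = grid.best_score_  # mean ROC AUC across the 5 folds for the best k

# grid.best_estimator_ is already refit on all of the data (refit=True by default).
# For the confusion matrix, use out-of-fold predictions so that no observation
# is scored by a model that saw it during fitting.
cv_preds = cross_val_predict(grid.best_estimator_, X_demo, y_demo, cv=5)
cm = confusion_matrix(y_demo, cv_preds)  # rows = true class, columns = predicted class
```

The same pattern applies to the other model types by swapping the final pipeline step and the parameter grid.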

In [8]:
X = ha.drop(["output"], axis = 1)
y = ha["output"]

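# Fix the seed so the split is reproducible, and stratify so both sets keep the class balance
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, stratify=y, random_state=42)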

In [9]:
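ct = ColumnTransformer(
  [
    ("standardize",
    StandardScaler(),
    make_column_selector(dtype_include=np.number)),
    # One-hot encode the string/categorical columns (cp, restecg); passing them
    # through unchanged raises "could not convert string to float".
    ("dummify",
    OneHotEncoder(handle_unknown="ignore"),
    make_column_selector(dtype_include=[object, "category"]))
  ],
  remainder = "passthrough"
)

## Q1: KNN

In [10]:
# The target is a class label, so use a classifier rather than KNeighborsRegressor.
from sklearn.neighbors import KNeighborsClassifier

knn_pipeline_1 = Pipeline(
  [("preprocessing", ct),
  ("knn", KNeighborsClassifier(n_neighbors=12))]
)

In [11]:
knn_pipeline_fitted_1 = knn_pipeline_1.fit(X_train, y_train)
y_preds_1 = knn_pipeline_fitted_1.predict(X_test)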

## Q2: Logistic Regression

## Q3: Decision Tree

## Q4: Interpretation

Which predictors were most important to predicting heart attack risk?
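One way to approach this question is to rank the logistic regression coefficients by absolute size after standardizing the predictors, so that the magnitudes are comparable. The sketch below uses synthetic data, and the feature names `f0` to `f3` are hypothetical placeholders for the real predictors.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical feature names; synthetic data stands in for the lab predictors.
names = ["f0", "f1", "f2", "f3"]
X_arr, y_demo = make_classification(
    n_samples=200, n_features=4, n_informative=2, n_redundant=0, random_state=1
)
X_demo = pd.DataFrame(X_arr, columns=names)

logit = Pipeline(
    [("scale", StandardScaler()), ("model", LogisticRegression())]
).fit(X_demo, y_demo)

# Coefficients are in log-odds per one standard deviation of each predictor.
coefs = pd.Series(logit.named_steps["model"].coef_[0], index=names)

# Order the predictors from largest to smallest absolute coefficient.
ranked = coefs.reindex(coefs.abs().sort_values(ascending=False).index)
```

For a decision tree, the analogous quantity is the fitted model's `feature_importances_` attribute. Neither ranking implies a causal effect.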

## Q5: ROC Curve

Plot the ROC Curve for your three models above.
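A possible starting point is to collect the ROC points for all three models in one long data frame, which is the shape a grammar-of-graphics plot expects. The sketch below uses synthetic data, and the hyperparameter values are arbitrary assumptions rather than tuned choices.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the lab data.
X_demo, y_demo = make_classification(n_samples=270, n_features=7, random_state=0)

# Hyperparameters here are placeholders, not tuned values.
models = {
    "KNN": KNeighborsClassifier(n_neighbors=11),
    "Logistic": LogisticRegression(max_iter=1000),
    "Tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

frames = []
for name, model in models.items():
    # Out-of-fold predicted probabilities for the positive class.
    probs = cross_val_predict(model, X_demo, y_demo, cv=5, method="predict_proba")[:, 1]
    fpr, tpr, _ = roc_curve(y_demo, probs)
    frames.append(pd.DataFrame({"model": name, "fpr": fpr, "tpr": tpr}))

roc_df = pd.concat(frames, ignore_index=True)
```

With plotnine, `roc_df` can then be drawn with something like `ggplot(roc_df, aes(x="fpr", y="tpr", color="model")) + geom_line()`.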

# Part Two: Metrics

Consider the following metrics:

- True Positive Rate or Recall or Sensitivity = Of the observations that are truly Class A, how many were predicted to be Class A?
- Precision or Positive Predictive Value = Of all the observations classified as Class A, how many of them were truly from Class A?
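- True Negative Rate or Specificity = Of the observations that are truly NOT Class A, how many were predicted to be NOT Class A? (Negative Predictive Value is a different quantity: of the observations classified as NOT Class A, how many were truly NOT Class A.)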

Compute each of these metrics (cross-validated) for your three models (KNN, Logistic Regression, and Decision Tree) in Part One.
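These metrics can be cross-validated in one call with `cross_validate` and a dictionary of scorers. In the sketch below, specificity is computed as the recall of the negative class (the true negative rate). The data are synthetic with 0/1 labels, and only one of the three models is shown.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import cross_validate

# Synthetic stand-in for the lab data, with labels 0 and 1.
X_demo, y_demo = make_classification(n_samples=270, n_features=7, random_state=0)

scoring = {
    "recall": "recall",          # true positive rate for class 1
    "precision": "precision",    # positive predictive value for class 1
    # Specificity is recall computed on the negative class (label 0).
    "specificity": make_scorer(recall_score, pos_label=0),
}

cv_results = cross_validate(
    LogisticRegression(max_iter=1000), X_demo, y_demo, cv=5, scoring=scoring
)

# Average each metric over the 5 folds.
mean_scores = {name: cv_results[f"test_{name}"].mean() for name in scoring}
```

The string scorers `"recall"` and `"precision"` assume the positive class is labeled 1. With string labels such as 'at risk', each scorer would need to be built with `make_scorer(..., pos_label='at risk')` instead.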

# Part Three: Discussion

Suppose you have been hired by a hospital to create classification models for heart attack risk.

The following questions give a possible scenario for why the hospital is interested in these models. For each one, discuss:
- Which metric(s) you would use for model selection and why.
- Which of your final models (Part One Q1-3) you would recommend to the hospital, and why.
- What score you should expect for your chosen metric(s) using your chosen model to predict future observations.

## Q1

The hospital faces severe lawsuits if they deem a patient to be low risk, and that patient later experiences a heart attack.

## Q2

The hospital is overfull, and wants to only use bed space for patients most in need of monitoring due to heart attack risk.

## Q3

The hospital is studying root causes of heart attacks, and would like to understand which biological measures are associated with heart attack risk.

## Q4

The hospital is training a new batch of doctors, and they would like to compare the diagnoses of these doctors to the predictions given by the algorithm to measure the ability of new doctors to diagnose patients.

# Part Four: Validation

Before sharing the dataset with you, I set aside a random 10% of the observations to serve as a final validation set.

ha_validation = pd.read_csv("https://www.dropbox.com/s/jkwqdiyx6o6oad0/heart_attack_validation.csv?dl=1")

Using each of your final models from Part One Q1-3, predict the target variable in the validation dataset.

For each, output a confusion matrix, and report the ROC AUC, the precision, and the recall.

Compare these values to the cross-validated estimates you reported in Part One and Part Two. Did our measure of model success turn out to be approximately correct for the validation data?
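The validation step can be sketched as follows: fit on the modeling data, then score once on the held-out set. Synthetic data stand in for both `ha` and `ha_validation`, and the 10% split mimics the setup described above. The logistic regression is only an example of a final model.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 90% plays the role of `ha`, 10% the role of `ha_validation`.
X_all, y_all = make_classification(n_samples=300, n_features=7, random_state=0)
X_fit, X_val, y_fit, y_val = train_test_split(
    X_all, y_all, test_size=0.1, stratify=y_all, random_state=0
)

final_model = LogisticRegression(max_iter=1000).fit(X_fit, y_fit)

val_labels = final_model.predict(X_val)              # hard class predictions
val_probs = final_model.predict_proba(X_val)[:, 1]   # probabilities for ROC AUC

val_report = {
    "confusion_matrix": confusion_matrix(y_val, val_labels),
    "roc_auc": roc_auc_score(y_val, val_probs),
    "precision": precision_score(y_val, val_labels),
    "recall": recall_score(y_val, val_labels),
}
```

With the string labels used in this lab, `precision_score` and `recall_score` would also need `pos_label='at risk'`. A validation set this small gives noisy estimates, so modest gaps from the cross-validated values are expected.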

# Part Five: Cohen’s Kappa
Another common metric used in classification is Cohen’s Kappa.

Use online resources to research this measurement. Calculate it for the models from Part One, Q1-3, and discuss reasons or scenarios that would make us prefer to use this metric as our measure of model success. Do your conclusions from above change if you judge your models using Cohen’s Kappa instead? Does this make sense?
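As a small worked example, Cohen's Kappa compares the observed agreement p_o with the agreement p_e expected by chance from the class frequencies: kappa = (p_o - p_e) / (1 - p_e). The sketch below checks a hand computation against scikit-learn's `cohen_kappa_score`. The label vectors are made up for illustration.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Made-up labels: 4 true positives of class 1 and 6 of class 0, with two errors.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 1])

# Observed agreement: 8 of 10 predictions match, so p_o = 0.8.
p_o = np.mean(y_true == y_pred)

# Chance agreement: sum over classes of P(true = c) * P(pred = c)
# = 0.4 * 0.4 + 0.6 * 0.6 = 0.52.
p_e = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in (0, 1))

kappa_manual = (p_o - p_e) / (1 - p_e)            # 0.28 / 0.48
kappa_sklearn = cohen_kappa_score(y_true, y_pred)
```

Both values come out near 0.583, noticeably lower than the raw accuracy of 0.8, because part of that accuracy is what chance agreement alone would produce.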