
# Import Packages

In [1]:
## Load packages 

import pandas as pd
import numpy as np
import sklearn
from sklearn import tree
from sklearn import metrics
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

import matplotlib.pyplot as plt

from sklearn.model_selection import cross_validate

# Import Data

In [2]:
# read_csv has some defaults, we can just take the defaults here, but be aware they exist. 
titanic_raw = pd.read_csv("https://raw.githubusercontent.com/matthewpecsok/4482_fall_2022/main/data/titanic_cleaned.csv")
titanic = titanic_raw.copy()

# raw is the original unedited version of our data which can be useful for inspecting changes we've made 
# compared to the original unedited data

# EDA

## Gender Analysis

In [3]:
titanic_raw.Sex.value_counts(normalize=False)

male      453
female    261
Name: Sex, dtype: int64

In [4]:
titanic_raw.Sex.value_counts(normalize=True)

male      0.634454
female    0.365546
Name: Sex, dtype: float64

In [5]:
ct = pd.crosstab(titanic_raw['Sex'], titanic_raw['Survived'])
ct.div(ct.sum(axis=1), axis=0)

Survived         0         1
Sex
female    0.245211  0.754789
male      0.794702  0.205298


We can see that most passengers were male (about 63%), that most males did not survive (about 79%), and that most females did survive (about 75%).
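The row-wise survival rates above come from dividing each crosstab row by its row total. As a minimal sketch on made-up toy data (the `toy` frame is hypothetical, not the Titanic file), the same proportions can be obtained directly with the `normalize="index"` argument of `pd.crosstab`:

```python
import pandas as pd

# Toy data (hypothetical, not the Titanic file): 3 females, 2 males
toy = pd.DataFrame({
    "Sex": ["female", "female", "female", "male", "male"],
    "Survived": [1, 1, 0, 0, 1],
})

# Manual row normalization: divide each row by its row total
ct = pd.crosstab(toy["Sex"], toy["Survived"])
manual = ct.div(ct.sum(axis=1), axis=0)

# Built-in alternative: normalize="index" divides each row by its row total
builtin = pd.crosstab(toy["Sex"], toy["Survived"], normalize="index")

print(builtin)
```

Either form works; `normalize="index"` just saves the intermediate `ct` variable.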

What we would like to know: is the quality of the predictions roughly the same for males and females?

## prepare for modeling

In [6]:
y_target = titanic.pop('Survived')
titanic_encoded_X = pd.get_dummies(titanic)
# now that we have encoded our data, split it into train and test sets

X_train, X_test, y_train, y_test = train_test_split(
    titanic_encoded_X, y_target, test_size=0.3, random_state=0, stratify=y_target
)

In [7]:
tree_model_1 = tree.DecisionTreeClassifier(random_state=42,ccp_alpha=.003)
tree_model_1

DecisionTreeClassifier(ccp_alpha=0.003, random_state=42)

# Fit the model

In [8]:
tree_model_1.fit( X_train, y_train)

DecisionTreeClassifier(ccp_alpha=0.003, random_state=42)

In [9]:
tree_model_1.get_n_leaves() # how complex is our tree?

30

In [10]:
y_train_pred = tree_model_1.predict(X_train) # predict on train set
y_test_pred = tree_model_1.predict(X_test) # predict on test set

## train metrics and confusion matrix

In [11]:
print(confusion_matrix(y_true=y_train,y_pred=y_train_pred))
print(metrics.classification_report(y_train,y_train_pred))

[[280  16]
 [ 36 167]]
              precision    recall  f1-score   support

           0       0.89      0.95      0.92       296
           1       0.91      0.82      0.87       203

    accuracy                           0.90       499
   macro avg       0.90      0.88      0.89       499
weighted avg       0.90      0.90      0.89       499



## test metrics and confusion matrix

In [12]:
print(confusion_matrix(y_true=y_test,y_pred=y_test_pred))
print(metrics.classification_report(y_test,y_test_pred))

[[118  10]
 [ 27  60]]
              precision    recall  f1-score   support

           0       0.81      0.92      0.86       128
           1       0.86      0.69      0.76        87

    accuracy                           0.83       215
   macro avg       0.84      0.81      0.81       215
weighted avg       0.83      0.83      0.82       215



Now let's find out what kind of errors we are getting by gender. A simple strategy is to put the test predictions back into the test X dataframe so that we can filter by gender. You must also put the real target values back into the dataframe, because `Survived` was popped before encoding.

Remember, our goal here is to generate the same metrics we have generated in the past (i.e., accuracy, precision, and recall), but separately for males only and for females only, to see whether any of those metrics differ by gender. For example, is precision much better for men than for women?

Remember that our X data is encoded, so our filter must take that into account.
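To see what "encoded" means for filtering, here is a minimal sketch on a hypothetical toy frame. It assumes the gender column holds strings, in which case `pd.get_dummies` replaces `Sex` with one indicator column per value. The real column names may differ, so check `titanic_encoded_X.columns` before filtering.

```python
import pandas as pd

# Toy frame (hypothetical values) with one numeric and one string column
toy = pd.DataFrame({
    "Age": [22, 38, 26],
    "Sex": ["male", "female", "female"],
})

# get_dummies leaves numeric columns alone and expands string columns
encoded = pd.get_dummies(toy)
print(list(encoded.columns))

# The original 'Sex' column is gone, so filter on an indicator column instead.
# Comparing with 1 works whether the indicators are stored as 0/1 or as booleans.
females = encoded[encoded["Sex_female"] == 1].copy()
males = encoded[encoded["Sex_female"] == 0].copy()
print(len(females), len(males))
```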

# Prepare the data for model evaluation by gender

## put the predictions and the real values into X_test

For example:

```
X_test['y_true'] = y_test
X_test['y_pred'] = y_test_pred  # predictions from tree_model_1.predict(X_test)
```

In [13]:
# TODO by Student
# put the true and predicted values BACK into the X_test dataframe

## Subset our dataframe into 2 dataframes, split by gender

Create two new dataframes, one for females and one for males, by filtering the dataframe from the previous step on the encoded gender column.


In [14]:
# sample new_df1 = old_df[old_df['SomeColumn']==1].copy()
# sample new_df2 = old_df[old_df['SomeColumn']==0].copy()
# TODO by Student

Now that we have dataframes filtered by gender, generate the performance metrics again for each gender's predictions on the test set.

This is fairly straightforward now. You should have two dataframes, each containing only one gender. Each dataframe should also have two columns: the predicted value and the true value of the target variable. Simply pass those columns into the confusion matrix and classification report functions.



```
print(confusion_matrix(y_true=df.y_true,y_pred=df.y_pred))
print(metrics.classification_report(df.y_true,df.y_pred))
```
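As an illustration of why per-group metrics matter, here is a minimal sketch on made-up toy data (the values and the `Sex_female` column name are assumptions, not results from the Titanic model). It computes accuracy, precision, and recall per group and shows that two groups can share the same accuracy while differing in recall.

```python
import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Toy predictions (hypothetical): 4 females followed by 4 males
toy = pd.DataFrame({
    "Sex_female": [1, 1, 1, 1, 0, 0, 0, 0],
    "y_true":     [1, 1, 1, 0, 0, 0, 1, 1],
    "y_pred":     [1, 1, 0, 0, 0, 0, 0, 1],
})

rows = {}
for label, group in toy.groupby("Sex_female"):
    name = "female" if label == 1 else "male"
    rows[name] = {
        "n": len(group),
        "accuracy": accuracy_score(group["y_true"], group["y_pred"]),
        # zero_division=0 avoids a warning if a group has no predicted positives
        "precision": precision_score(group["y_true"], group["y_pred"], zero_division=0),
        "recall": recall_score(group["y_true"], group["y_pred"], zero_division=0),
    }

# One row per group, one column per metric
summary = pd.DataFrame(rows).T
print(summary)
```

In this toy case both groups have an accuracy of 0.75, but recall is 2/3 for females and 1/2 for males, which is exactly the kind of gap that an overall report can hide.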



Female results:


In [15]:
# generate confusion matrix and classification reports for females
# TODO by Student

Male results:


In [16]:
# generate confusion matrix and classification reports for males
# TODO by Student