# Exercise 4 - Model Evaluation
Today we will find out how good our models really are. We will tackle a classification problem (Titanic: Decision Tree, Random Forest, XGBoost and SVM) and a regression problem (house prices: Linear Regression, Random Forest Regressor and XGB Regressor) and compute metrics that show how well these models perform. We will also create baselines, so that we have a reference point for how good our models are compared with something other than ML.

## Classification Problem - Titanic
Note: Use the same code as in Exercise 2 to generate the models that predict who will survive the Titanic.
1. Load, clean and split the training set titanic.csv
2. Create and train four classification models (Decision Tree, Random Forest, XGBoost, SVM).
3. For each model, compute the metrics:
    - Accuracy
    - Precision
    - Recall
    - F1, F2, F0.5
4. Which model performs best? Does the picture differ from Exercise 2, where we only examined each model's accuracy?

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv('titanic.csv')

def dictionary_function(df, col):
    # Replace each category in `col` with its rank in sorted order (0, 1, 2, ...)
    my_dictionary = {value: rank for rank, value in enumerate(sorted(df[col].unique()))}
    df[col] = df[col].map(my_dictionary).astype(int)
    return df

# Drop the leftover index column if present and keep only labelled rows
df = df.drop(columns=['Unnamed: 0'], errors='ignore')
df = df[df['Survived'].notna()].copy()

# Fill missing values (plain assignment instead of chained inplace calls)
df['Age'] = df['Age'].fillna(round(df['Age'].mean()))
df['Embarked'] = df['Embarked'].fillna('U')
df['Cabin'] = df['Cabin'].fillna('U0')

# Split e.g. 'C85' into section 'C' and number '85'; 'dummy' holds the remainder
df[['CabinSection', 'CabinNr', 'dummy']] = df['Cabin'].str.split(r'(\d+)', n=1, expand=True)
df['CabinSection'] = df['CabinSection'].str[0]
df['CabinNr'] = df['CabinNr'].fillna(0).astype(int)
df['Sex'] = df['Sex'].eq('male').astype(int)
df['Survived'] = df['Survived'].astype(int)
df['Age'] = df['Age'].astype(int)
df = df.drop(columns=['dummy', 'Cabin', 'Ticket', 'Name'])

# Encode the remaining categorical columns as integers
for col in ['CabinSection', 'Embarked']:
    df = dictionary_function(df, col)

target_col = ['Survived']
feature_cols = [col for col in df.columns if col not in target_col]

y = df[target_col]
X = df[feature_cols]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

### Model 1: Decision Tree

In [2]:
from sklearn import tree
model = tree.DecisionTreeClassifier() # optimize! Can we do it better?
model.fit(X_train, y_train)
predictions_decision_tree = list(model.predict(X_test))

### Model 2: Random Forest

In [3]:
from sklearn.ensemble import RandomForestClassifier
model_forest = RandomForestClassifier() # optimize! Can we do it better?
# Pass the target as a 1-D array to avoid scikit-learn's DataConversionWarning
model_forest.fit(X_train, y_train.values.ravel())
predictions_random_forest = list(model_forest.predict(X_test))

### Model 3: XGBoost

In [4]:
import xgboost as xgb
model_XGB = xgb.XGBClassifier() # optimize! Can we do it better?
model_XGB.fit(X_train, y_train)
predictions_XGB = list(model_XGB.predict(X_test))

### Model 4: SVM

In [5]:
from sklearn.svm import SVC
model_SVC = SVC() # optimize! Can we do it better?
# Pass the target as a 1-D array to avoid scikit-learn's DataConversionWarning
model_SVC.fit(X_train, y_train.values.ravel())
predictions_SVC = list(model_SVC.predict(X_test))

### Model Evaluation - Classification
- Accuracy
- Precision
- Recall
- F1, F2, F0.5
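
As a minimal sketch of what these metrics measure, the block below derives them from a confusion matrix on a small hand-made label set (hypothetical values, not taken from titanic.csv) and checks a hand-written F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), against scikit-learn's `fbeta_score`. F2 weights recall more heavily than precision, F0.5 does the opposite, and F1 weights them equally.

```python
from sklearn.metrics import confusion_matrix, fbeta_score

# Hypothetical toy labels (1 = survived, 0 = died), for illustration only
y_true_toy = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred_toy = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# For binary 0/1 labels, ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true_toy, y_pred_toy).ravel()

accuracy_toy = (tp + tn) / (tp + tn + fp + fn)   # share of all predictions that are correct
precision_toy = tp / (tp + fp)                   # share of predicted survivors who survived
recall_toy = tp / (tp + fn)                      # share of actual survivors that were found

def f_beta(precision, recall, beta):
    """Weighted harmonic mean of precision and recall."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

f_scores_by_hand = {}
f_scores_sklearn = {}
for beta in (0.5, 1, 2):
    f_scores_by_hand[beta] = f_beta(precision_toy, recall_toy, beta)
    f_scores_sklearn[beta] = fbeta_score(y_true_toy, y_pred_toy, beta=beta)
    print(f'F{beta}: by hand {f_scores_by_hand[beta]:.3f}, sklearn {f_scores_sklearn[beta]:.3f}')
```

To get F2 and F0.5 for the Titanic models, the same `fbeta_score(utfall, preds, beta=...)` call could be added to the evaluation loop in the next cell.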

In [23]:
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score
our_prediction_list = [predictions_decision_tree, predictions_random_forest, predictions_XGB, predictions_SVC]
model_list = ['Decision Tree', 'Random Forest', 'XGB', 'SVC']
utfall = y_test['Survived'].to_list()

# Survived is encoded as 0/1, so the default pos_label=1 scores the "survived" class
for preds, modl in zip(our_prediction_list, model_list):
    precision = precision_score(utfall, preds)
    recall = recall_score(utfall, preds)
    accuracy = accuracy_score(utfall, preds)
    f1 = f1_score(utfall, preds)
    print(modl)
    print(f'Accuracy: {round(accuracy,3)}')
    print(f'Precision: {round(precision, 3)}')
    print(f'Recall: {round(recall, 3)}')
    print(f'F1-score: {round(f1, 3)}')
    print(' ')

Decision Tree
Accuracy: 0.776
Precision: 0.733
Recall: 0.708
F1-score: 0.72
 
Random Forest
Accuracy: 0.807
Precision: 0.812
Recall: 0.683
F1-score: 0.742
 
XGB
Accuracy: 0.8
Precision: 0.77
Recall: 0.725
F1-score: 0.747
 
SVC
Accuracy: 0.647
Precision: 0.75
Recall: 0.2
F1-score: 0.316
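
The introduction asks for a baseline to compare against. One possible sketch, assuming scikit-learn's `DummyClassifier` is an acceptable non-ML reference, always predicts the most frequent class in the training data. The toy labels below are hypothetical; with the real data you would fit on `X_train`, `y_train` and score against `utfall` in the same way.

```python
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical toy data: features are ignored by the dummy model, only labels matter
X_train_toy = [[0]] * 10
y_train_toy = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]   # majority class is 0 ("died")
X_test_toy = [[0]] * 5
y_test_toy = [0, 0, 0, 1, 1]

baseline_clf = DummyClassifier(strategy='most_frequent')
baseline_clf.fit(X_train_toy, y_train_toy)
baseline_clf_preds = baseline_clf.predict(X_test_toy)

baseline_accuracy = accuracy_score(y_test_toy, baseline_clf_preds)
# zero_division=0 silences the warning raised when no positive class is predicted
baseline_f1 = f1_score(y_test_toy, baseline_clf_preds, zero_division=0)

print(f'Baseline accuracy: {round(baseline_accuracy, 3)}')
print(f'Baseline F1-score: {round(baseline_f1, 3)}')
```

A majority-class baseline can reach a respectable accuracy on imbalanced data while its recall and F1 are zero, which is one reason to look beyond accuracy when ranking the models above.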
 


## Regression Problem - House Prices
1. Load, clean and split the training set housing.csv
2. Create and train four regression models (Linear Regression, Random Forest Regressor, XGB Regressor, SVM).
3. For each model, compute the metrics:
    - Mean Squared Error
    - Root Mean Squared Error
    - R2-score
    - Mean Absolute Error
4. Which model performs best?

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv('housing.csv')

In [2]:
# Clean and fix the data
df = df.drop(columns=['Id'])
# The columns listed here are dropped rather than one-hot encoded
one_hot_columns = ['MSZoning', 'Street', 'Alley', 'LotShape', 'LandContour', 'Utilities', 'LotConfig','LandSlope','Neighborhood','Condition1','Condition2','BldgType','HouseStyle',
                'RoofStyle', 'RoofMatl', 'Exterior1st','Exterior2nd','MasVnrType','ExterQual','ExterCond','Foundation','BsmtQual', 'BsmtCond','BsmtExposure','BsmtFinType1','BsmtFinType2',
                 'Heating', 'HeatingQC','CentralAir','Electrical','KitchenQual','Functional','FireplaceQu','GarageType','GarageFinish','GarageQual','GarageCond','PavedDrive','PoolQC','Fence',
                'MiscFeature','SaleType','SaleCondition', 'GarageYrBlt', 'MasVnrArea']
df = df.drop(columns=one_hot_columns)
df['LotFrontage'] = df['LotFrontage'].fillna(0)
# Standardize every column, including SalePrice, so the errors below are in standard-deviation units
df = (df - df.mean()) / df.std()
target_col = ['SalePrice']
feature_cols = [col for col in df.columns if col not in target_col]

y = df[target_col]
X = df[feature_cols]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)


### Model 1: Linear Regression

In [3]:
from sklearn.linear_model import LinearRegression
model_LinearRegression = LinearRegression()
# A 1-D target makes predict() return a flat array of predictions
model_LinearRegression.fit(X_train, y_train.values.ravel())
predictions_LinearRegression = list(model_LinearRegression.predict(X_test))


### Model 2: Random Forest Regressor

In [4]:
from sklearn.ensemble import RandomForestRegressor
model_RandomForestRegressor = RandomForestRegressor()
# Pass the target as a 1-D array to avoid scikit-learn's DataConversionWarning
model_RandomForestRegressor.fit(X_train, y_train.values.ravel())
predictions_RandomForestRegressor = list(model_RandomForestRegressor.predict(X_test))

### Model 3: XGB Regressor

In [5]:
from xgboost import XGBRegressor
model_XGBRegressor = XGBRegressor()
model_XGBRegressor.fit(X_train, y_train)
predictions_XGBRegressor = list(model_XGBRegressor.predict(X_test))
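
The task list also names an SVM regressor, which is not trained in the cells above. A minimal sketch of that fourth model, assuming scikit-learn's `SVR` with default settings, is shown here on synthetic data from `make_regression` so that it runs on its own; the `_demo` names are hypothetical, and on the housing data you would fit on `X_train` and `y_train.values.ravel()` instead.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic stand-in for the housing data (illustration only)
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

# Standardize the target, mirroring the preprocessing used for SalePrice
y_demo = (y_demo - y_demo.mean()) / y_demo.std()

X_train_demo, X_test_demo, y_train_demo, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42
)

model_SVR = SVR()  # default RBF kernel; sensitive to feature scale
model_SVR.fit(X_train_demo, y_train_demo)
predictions_SVR = list(model_SVR.predict(X_test_demo))
```

Because the housing features are already standardized in the cleaning cell, `SVR` can be applied to them without extra scaling.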

### Model Evaluation - Regression
- Mean Squared Error
- Root Mean Squared Error
- R2-score
- Mean Absolute Error
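
To make the four regression metrics concrete, the sketch below computes each one by hand with NumPy on a tiny hypothetical example and compares the results with scikit-learn's functions. The toy values are made up for illustration and have nothing to do with housing.csv.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical toy values, for illustration only
y_true_toy = np.array([3.0, -0.5, 2.0, 7.0])
y_pred_toy = np.array([2.5, 0.0, 2.0, 8.0])

errors = y_true_toy - y_pred_toy

mse_toy = np.mean(errors ** 2)        # average squared error
rmse_toy = np.sqrt(mse_toy)           # same units as the target
mae_toy = np.mean(np.abs(errors))     # average absolute error
# R2 = 1 - (residual sum of squares) / (total sum of squares around the mean)
r2_toy = 1 - np.sum(errors ** 2) / np.sum((y_true_toy - y_true_toy.mean()) ** 2)

print(f'MSE:  {mse_toy:.3f} (sklearn: {mean_squared_error(y_true_toy, y_pred_toy):.3f})')
print(f'RMSE: {rmse_toy:.3f}')
print(f'MAE:  {mae_toy:.3f} (sklearn: {mean_absolute_error(y_true_toy, y_pred_toy):.3f})')
print(f'R2:   {r2_toy:.3f} (sklearn: {r2_score(y_true_toy, y_pred_toy):.3f})')
```

RMSE is always the square root of MSE, which gives a quick sanity check on any printed pair of values.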

In [6]:
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

our_prediction_list = [predictions_LinearRegression, predictions_RandomForestRegressor, predictions_XGBRegressor]
model_list = ['Linear Regression', 'Random Forest Regressor', 'XGB Regressor']
utfall = y_test['SalePrice'].to_list()

for preds, modl in zip(our_prediction_list, model_list):
    # mean_squared_error returns the MSE by default; its square root is the RMSE.
    # This avoids the `squared` flag, which is deprecated in recent scikit-learn releases.
    MSE = mean_squared_error(utfall, preds)
    RMSE = MSE ** 0.5
    MAE = mean_absolute_error(utfall, preds)
    R2 = r2_score(utfall, preds)
    
    print(modl)
    print(f'Mean Squared Error: {round(MSE,3)}')
    print(f'Root Mean Squared Error: {round(RMSE, 3)}')
    print(f'Mean Absolute Error: {round(MAE, 3)}')
    print(f'R2-score: {round(R2, 3)}')
    print(' ')

Linear Regression
Mean Squared Error: 0.257
Root Mean Squared Error: 0.507
Mean Absolute Error: 0.307
R2-score: 0.779
 
Random Forest Regressor
Mean Squared Error: 0.153
Root Mean Squared Error: 0.391
Mean Absolute Error: 0.229
R2-score: 0.868
 
XGB Regressor
Mean Squared Error: 0.164
Root Mean Squared Error: 0.405
Mean Absolute Error: 0.233
R2-score: 0.859
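
For the regression baseline mentioned in the introduction, one possible sketch, assuming scikit-learn's `DummyRegressor` is an acceptable reference, always predicts the mean of the training target. The toy values are hypothetical; on the housing data you would fit on `X_train`, `y_train` and score against `utfall`.

```python
from sklearn.dummy import DummyRegressor
from sklearn.metrics import mean_absolute_error, r2_score

# Hypothetical toy data: features are ignored by the dummy model
X_train_toy = [[0]] * 4
y_train_toy = [1.0, 2.0, 3.0, 6.0]   # training mean is 3.0
X_test_toy = [[0]] * 2
y_test_toy = [2.0, 4.0]

baseline_reg = DummyRegressor(strategy='mean')
baseline_reg.fit(X_train_toy, y_train_toy)
baseline_reg_preds = baseline_reg.predict(X_test_toy)

baseline_mae = mean_absolute_error(y_test_toy, baseline_reg_preds)
baseline_r2 = r2_score(y_test_toy, baseline_reg_preds)

print(f'Baseline Mean Absolute Error: {round(baseline_mae, 3)}')
print(f'Baseline R2-score: {round(baseline_r2, 3)}')
```

Since SalePrice is standardized, a mean-predicting baseline should land at an R2-score of roughly 0 and an RMSE of roughly 1, so the scores above can be read as improvements over that reference.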
 
