## Problem Statement
Given a classification model, we want to investigate how much the performance score computed on the test set depends on the choice of train/test split proportion. E.g., how would our performance estimate change if we used a 60/40 split rather than 80/20?

Write a function that takes a scikit-learn estimator and a dataset, and computes an evaluation metric over a grid of train/test split proportions from 0 to 100%. To assess variability, for each split proportion it should resplit and recompute the metric multiple times. It should output a table of splits with multiple metric values per split.


## Solution Statement

The evaluation metrics used are accuracy_score, precision_score, recall_score and f1_score. I will use train/test splits ranging from <b>95% train and 5% test</b> to <b>5% train and 95% test</b> and collect the results in a dataframe.

The model used is random forest.
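As a point of comparison, here is a minimal sketch of the function the problem statement asks for: one estimator, a grid of test fractions, and several resplits per fraction. The name `split_sensitivity`, its parameters and the synthetic demo data are assumptions made for illustration and are not part of this notebook. The grid stops short of 0% and 100% because a split needs data on both sides.

```python
import pandas as pd
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def split_sensitivity(estimator, X, y, test_sizes, n_repeats=5,
                      metric=accuracy_score, random_state=0):
    """Return one row per (test_size, repeat) with the metric on the test set."""
    rows = []
    for test_size in test_sizes:
        for repeat in range(n_repeats):
            # A different seed per repeat gives a different random split.
            seed = random_state + repeat
            X_train, X_test, y_train, y_test = train_test_split(
                X, y, test_size=test_size, random_state=seed)
            # clone() returns an unfitted copy, so repeats share no state.
            model = clone(estimator).fit(X_train, y_train)
            rows.append({
                "test_size": test_size,
                "repeat": repeat,
                "score": metric(y_test, model.predict(X_test)),
            })
    return pd.DataFrame(rows)


# Demo on synthetic data (hypothetical, not the wine dataset).
X, y = make_classification(n_samples=300, random_state=0)
forest = RandomForestClassifier(n_estimators=20, random_state=0)
table = split_sensitivity(forest, X, y, test_sizes=[0.2, 0.4], n_repeats=3)

# Mean and spread of the score for each split proportion.
print(table.groupby("test_size")["score"].agg(["mean", "std"]))
```

Keeping one row per repeat, rather than only the mean, leaves the variability visible and lets it be summarised afterwards in whatever way is needed.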

In [1]:
import pandas as pd
import os

Import the helper module that holds the metric functions:

In [2]:
import traversal_function as tf

In [3]:
current_dir = os.path.abspath(os.getcwd())

dataset_path = os.path.join(current_dir, '..', '..', '..', 'datasets', 'winequality.csv')

In [4]:
wine = pd.read_csv(dataset_path)

In [5]:
wine.head()

| Unnamed: 0 | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | recommend |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7.0 | 0.27 | 0.36 | 20.7 | 0.045 | 45.0 | 170.0 | 1.001 | 3.0 | 0.45 | 8.8 | 6 | False |
| 1 | 6.3 | 0.3 | 0.34 | 1.6 | 0.049 | 14.0 | 132.0 | 0.994 | 3.3 | 0.49 | 9.5 | 6 | False |
| 2 | 8.1 | 0.28 | 0.4 | 6.9 | 0.05 | 30.0 | 97.0 | 0.9951 | 3.26 | 0.44 | 10.1 | 6 | False |
| 3 | 7.2 | 0.23 | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | 0.4 | 9.9 | 6 | False |
| 4 | 7.2 | 0.23 | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | 0.4 | 9.9 | 6 | False |


In [6]:
wine_m = wine.drop(['recommend'], axis=1)

Each helper is called with the dataframe, the name of the target column and the test-set fraction:

- tf.accuracy returns the accuracy score of the model
- tf.precision_ returns the precision score
- tf.recall_ returns the recall score
- tf.f1_ returns the f1_score
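The `traversal_function` module itself is not shown in this notebook. If each helper splits and fits on its own, the four numbers in a row come from four different splits. The sketch below is a hypothetical alternative with the same `(dataframe, target, test_size)` call shape that computes all four metrics from a single split and fit. The name `all_metrics`, the `average="macro"` default and the demo data are assumptions. `quality` is multiclass, so precision, recall and F1 need some averaging rule, and the one used by `tf` is unknown.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split


def all_metrics(df, target, test_size, average="macro", random_state=None):
    """Fit a random forest once and return four metrics from the same split."""
    X = df.drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state)
    model = RandomForestClassifier(random_state=random_state)
    y_pred = model.fit(X_train, y_train).predict(X_test)
    # With an averaging rule set, the fourth return value (support) is None.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average=average, zero_division=0)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }


# Demo on a small synthetic dataframe (hypothetical data).
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(120, 3)), columns=["a", "b", "c"])
demo["label"] = (demo["a"] > 0).astype(int)
metrics = all_metrics(demo, "label", test_size=0.25, random_state=0)
print(metrics)
```

Sharing one split per row makes the four columns directly comparable, at the cost of no longer matching the one-function-per-metric layout used here.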

In [7]:
col = ['split', 'accuracy', 'precision', 'recall', 'f1_score']
metric_funcs = [tf.accuracy, tf.precision_, tf.recall_, tf.f1_]

# Test-set fractions from 0.05 to 0.95 in steps of 0.05
test_sizes = [round(0.05 * i, 2) for i in range(1, 20)]


def split_row(test_size):
    """Build one table row: the split label followed by the four metrics."""
    test_pct = round(test_size * 100)
    label = f'{100 - test_pct}_{test_pct}_split'
    return [label] + [metric(wine_m, 'quality', test_size) for metric in metric_funcs]

In [8]:
# One row per split, from 95_5_split down to 5_95_split
data = [split_row(test_size) for test_size in test_sizes]

In [9]:
df = pd.DataFrame(data, columns=col)
df

| | split | accuracy | precision | recall | f1_score |
|---|---|---|---|---|---|
| 0 | 95_5_split | 69.387755 | 0.689796 | 0.673469 | 0.693878 |
| 1 | 90_10_split | 66.530612 | 0.691837 | 0.689796 | 0.689796 |
| 2 | 85_15_split | 64.761905 | 0.684354 | 0.653061 | 0.665306 |
| 3 | 80_20_split | 62.755102 | 0.614286 | 0.657143 | 0.658163 |
| 4 | 75_25_split | 64.408163 | 0.643265 | 0.626122 | 0.636735 |
| 5 | 70_30_split | 63.809524 | 0.640816 | 0.647619 | 0.659184 |
| 6 | 65_35_split | 63.848397 | 0.640816 | 0.630904 | 0.635569 |
| 7 | 60_40_split | 62.295918 | 0.615816 | 0.62398 | 0.620918 |
| 8 | 55_45_split | 62.312925 | 0.619048 | 0.628571 | 0.627211 |
| 9 | 50_50_split | 61.412822 | 0.619028 | 0.612495 | 0.610862 |


## Observation

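In the rows shown, accuracy and f1_score are highest for the 95% train and 5% test split, while precision and recall peak at the 90/10 split. Note that accuracy is reported as a percentage and the other three metrics as fractions. Scores generally fall as the training share shrinks, which is expected because the model has less data to learn from. The trend is not strictly monotonic, though, and each value comes from a single run, so small differences between neighbouring splits may be noise. The smallest test sets also give the least reliable estimates, because they are computed on the fewest samples.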
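To get a rough sense of how much of the difference between splits could be sampling noise, the binomial standard error of an accuracy estimate can be computed from the test-set size. This is a back-of-the-envelope sketch: the dataset size of 5,000 rows and the accuracy of 0.65 are illustrative assumptions, not values taken from the notebook, and the formula ignores variation from refitting the model.

```python
import math


def accuracy_standard_error(accuracy, n_test):
    """Binomial standard error of an accuracy estimated on n_test samples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)


n_rows = 5000           # hypothetical dataset size
assumed_accuracy = 0.65  # illustrative value, as a fraction

for test_fraction in (0.05, 0.20, 0.50):
    n_test = round(n_rows * test_fraction)
    se = accuracy_standard_error(assumed_accuracy, n_test)
    print(f"test fraction {test_fraction:.2f}: "
          f"n_test={n_test}, standard error ~ {se:.3f}")
```

Under these assumptions the standard error at a 5% test set is about three percentage points, roughly three times that of a 50% test set, so a top score at 95/5 is weaker evidence than it first appears.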