# Part 3 - Performance estimation

## Performance estimation

In this part, the standardized features gathered in the previous parts are used. <br>
The performance of each model is estimated with nested cross-validation: 10-fold cross-validation in the outer loop and <br>
repeated 5-fold cross-validation with 3 repetitions in the inner loop. <br>
The best model is selected in the inner loop using the hyperparameter combinations and ranges defined in Part 2. <br>
For each model, I calculate the accuracy and the confusion matrix. <br>
Finally, I report which hyperparameter or hyperparameter combination is most often chosen as the best one for each classifier.

In [15]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import RepeatedKFold, GridSearchCV, KFold, train_test_split, cross_validate
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn import metrics
import matplotlib.pyplot as plt 
import seaborn as sb
from statistics import mode
import numpy as np

In [3]:
# This cell and the next one are taken from my own report for Exercise 2
# I am using the output of the previous part directly

data_path = '../training_data/rice_feature_data.parquet' # path of data
data = pd.read_parquet(data_path)               # reading parquet file
data.head()                                     # printing a few lines to see if everything is ok

Unnamed: 0,mean_b,var_b,skew_b,kurt_b,entr_b,mean_g,var_g,skew_g,kurt_g,entr_g,...,kurt_r,entr_r,major_axis_length,minor_axis_length,area,perimeter,roundness,aspect_ratio,class,class_int
0,237.701602,235.550005,-0.916309,0.337702,5.498779,231.293858,301.677048,-0.746753,-0.196393,5.868242,...,-0.155548,5.841225,146.282043,72.265778,8084.5,371.404109,0.736495,2.024223,Arb,0
1,236.918079,157.647628,-0.768407,0.436432,5.485904,232.733608,191.903693,-0.752232,0.325161,5.688247,...,0.349649,5.657413,137.059006,65.893982,6933.5,354.232535,0.694361,2.079993,Arb,0
2,216.536679,259.919503,-0.839951,-0.140669,5.501954,210.975351,283.247363,-0.772167,-0.337009,5.604155,...,-0.316691,5.572969,132.384201,67.964844,6929.5,351.178713,0.706082,1.947834,Arb,0
3,212.078894,167.887738,-0.647664,0.06817,5.637232,207.893328,195.452218,-0.596941,-0.158326,5.740592,...,-0.192698,5.72903,149.286682,71.326492,8045.5,384.232536,0.684817,2.093005,Arb,0
4,233.318557,211.242371,-0.424428,-0.734968,5.665341,221.524494,225.958931,-0.303483,-0.890377,5.749867,...,-0.871835,5.735764,140.248886,71.801796,7663.5,365.060963,0.722614,1.953278,Arb,0


In [4]:
# creating the list of feature columns to standardize
columns = data.columns.tolist()
columns.remove('class')      # deleting class and class_int from features' columns
columns.remove('class_int')

# I am using z-score standardization: z = (x - mean) / standard deviation, computed per feature.
# The score tells how many standard deviations a sample lies from the feature mean.
# The values are not bounded, but most of them typically fall roughly between -3 and +3.

scale= StandardScaler()                            # creating an instance of standardizer
data[columns] = scale.fit_transform(data[columns]) # applying the transformation 
data.head()                                        # printing the data to check it

Unnamed: 0,mean_b,var_b,skew_b,kurt_b,entr_b,mean_g,var_g,skew_g,kurt_g,entr_g,...,kurt_r,entr_r,major_axis_length,minor_axis_length,area,perimeter,roundness,aspect_ratio,class,class_int
0,0.702268,1.296104,-0.518657,-0.603753,0.737087,1.381771,1.453728,-0.918784,-0.539606,1.00259,...,-0.548778,0.96707,-0.662878,1.46842,0.858662,-0.382661,1.466364,-1.133834,Arb,0
1,0.637647,0.349329,-0.344452,-0.57653,0.711366,1.527036,0.185969,-0.925793,-0.351919,0.566264,...,-0.335125,0.485281,-0.936849,0.908895,-0.000719,-0.70284,1.079154,-1.080698,Arb,0
2,-1.0433,1.592275,-0.428719,-0.735656,0.743432,-0.668283,1.240886,-0.951299,-0.590208,0.362418,...,-0.616928,0.263944,-1.075714,1.090743,-0.003706,-0.759781,1.186873,-1.206614,Arb,0
3,-1.410954,0.473781,-0.202237,-0.678072,1.0137,-0.979246,0.22695,-0.727107,-0.525907,0.693155,...,-0.56449,0.672997,-0.573624,1.385939,0.829544,-0.143464,0.991443,-1.068301,Arb,0
4,0.340778,1.000685,0.060698,-0.899523,1.069858,0.396082,0.579269,-0.35164,-0.789344,0.715637,...,-0.851704,0.690646,-0.842093,1.427676,0.544327,-0.500935,1.338797,-1.201427,Arb,0
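
The z-score used above can be checked by hand. The following is a minimal sketch on a made-up toy column (not the rice features): it compares the manual formula with `StandardScaler` and shows that a strong outlier can land outside the usual -3 to +3 range.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature column, invented for illustration: 19 zeros and one outlier.
x = np.array([0.0] * 19 + [100.0]).reshape(-1, 1)

# Manual z-score: subtract the column mean, divide by the population
# standard deviation (ddof=0, which is what StandardScaler uses).
z_manual = (x - x.mean(axis=0)) / x.std(axis=0)

# StandardScaler applies the same formula column by column.
z_scaler = StandardScaler().fit_transform(x)

print(np.allclose(z_manual, z_scaler))  # do both approaches agree?
print(z_scaler.max())                   # z-score of the outlier
```

After standardization each column has mean 0 and unit variance, but individual values are only as bounded as the data allows.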


### KNN Evaluation

In [5]:
# I used Exercise 2 solution as a reference here
# Also this website 
# https://scikit-learn.org/stable/auto_examples/model_selection/plot_nested_cross_validation_iris.html

y = data['class'].values       # creating labels: the class of each sample
X_train = data[columns].values # separating the features from the class columns for training

k_range=range(1,30)            # creating a range for K to be tested
knn = KNeighborsClassifier()   # creating the knn model
parameters={'n_neighbors': k_range}  # assigning k_range to be the parameter for knn

outer_cv = KFold(n_splits=10)  # creating the outer cross validation with 10 folds
inner_cv = RepeatedKFold(n_splits=5, n_repeats=3) # creating the inner cross validation with 5 folds
                                                  # and 3 repetitions

clf = GridSearchCV(knn, parameters, cv=inner_cv)  # creating a cross validation with knn estimator
# evaluating our model with cross validation and saving the scores and estimator
cv_final = cross_validate(clf, X=X_train, y=y, cv=outer_cv, return_estimator=True)

cross_val_score = cv_final['test_score']          # saving the scores separately
cross_val_estimators = cv_final['estimator']      # saving the estimator

In [6]:
cross_val_score  # printing the scores

array([1.        , 1.        , 1.        , 1.        , 0.96666667,
       0.93333333, 0.96666667, 1.        , 0.93333333, 0.93333333])
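
The cell above standardizes the whole data set before cross-validation and uses an unshuffled `KFold`. A possible variant, sketched here under assumptions, puts the scaler inside a `Pipeline` so that it is refit on each training fold, and uses shuffled stratified folds so that every test fold contains all classes. The iris data is only a stand-in for the rice features, and the names `pipe`, `search` and `demo_scores` are illustrative, not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     StratifiedKFold, cross_validate)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): NOT the rice feature table.
X_demo, y_demo = load_iris(return_X_y=True)

# The scaler is a pipeline step, so it only ever sees training folds.
pipe = Pipeline([("scale", StandardScaler()),
                 ("knn", KNeighborsClassifier())])
param_grid = {"knn__n_neighbors": list(range(1, 10))}

# Stratified, shuffled folds keep the class proportions in every split.
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
inner_cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)

search = GridSearchCV(pipe, param_grid, cv=inner_cv)          # inner loop
demo_result = cross_validate(search, X_demo, y_demo,
                             cv=outer_cv, return_estimator=True)  # outer loop
demo_scores = demo_result["test_score"]
print(demo_scores.mean())
```

Whether this changes the rice results noticeably would have to be checked by rerunning the notebook; the sketch only shows the mechanics.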

In [7]:
# printing the best params
for es in cross_val_estimators:
    print(es.best_params_)

{'n_neighbors': 1}
{'n_neighbors': 1}
{'n_neighbors': 3}
{'n_neighbors': 1}
{'n_neighbors': 1}
{'n_neighbors': 3}
{'n_neighbors': 2}
{'n_neighbors': 1}
{'n_neighbors': 1}
{'n_neighbors': 3}


#### result

Most often chosen hyperparameter: n_neighbors = 1 (6 of the 10 outer folds)

In [16]:
# I am using the code from my own solution for part 2

# splitting the features and classes into X_train, X_test, y_train, y_test
X_train, X_test, y_train, y_test = train_test_split(data[columns], data['class_int'], test_size=0.2)

classifier = KNeighborsClassifier(n_neighbors=1) # best K
classifier.fit(X_train, y_train)                 # training the classifier
y_pred = classifier.predict(X_test)              # make predictions by the model
print(metrics.accuracy_score(y_test,y_pred))     # printing the accuracy
result = metrics.confusion_matrix(y_test, y_pred)# getting the confusion matrix

# printing the result in an understandable format
print("Confusion Matrix:")
print(result)
result1 = metrics.classification_report(y_test, y_pred)
print("Classification Report:",)
print (result1)

0.9833333333333333
Confusion Matrix:
[[19  0  0]
 [ 0 25  0]
 [ 1  0 15]]
Classification Report:
              precision    recall  f1-score   support

           0       0.95      1.00      0.97        19
           1       1.00      1.00      1.00        25
           2       1.00      0.94      0.97        16

    accuracy                           0.98        60
   macro avg       0.98      0.98      0.98        60
weighted avg       0.98      0.98      0.98        60
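
The accuracy and confusion matrix above come from one separate 80/20 split. An alternative, sketched here as an assumption rather than as the method used in this notebook, is to collect the out-of-fold predictions of the nested cross-validation with `cross_val_predict` and build a single pooled confusion matrix from them. The iris data again stands in for the rice features, and the small `n_neighbors` grid is chosen only to keep the example fast.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (assumption): NOT the rice feature table.
X_demo, y_demo = load_iris(return_X_y=True)

inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)

# Inner loop: hyperparameter search on each outer training fold.
search = GridSearchCV(KNeighborsClassifier(),
                      {"n_neighbors": [1, 3, 5, 7]}, cv=inner_cv)

# Outer loop: every sample is predicted exactly once, by a model
# that did not see it during training or tuning.
y_oof = cross_val_predict(search, X_demo, y_demo, cv=outer_cv)

cm = confusion_matrix(y_demo, y_oof)
acc = accuracy_score(y_demo, y_oof)
print(cm)
print(acc)
```

Because every sample appears once in the pooled matrix, its entries sum to the size of the data set, and the accuracy equals the trace divided by that sum.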



### Random Forest Evaluation

In [8]:
# I used Exercise 2 solution as a reference here

y = data['class'].values       # creating labels: the class of each sample
X_train = data[columns].values # separating the features from the class columns for training

max_depths = [2, 4, 6, 8, 10, 12]    # Random Forest Params
max_features = [2, 3, 4, 5, 6, 7, 8]
parameters={'max_depth': max_depths,
           'max_features': max_features}

r_f_c = RandomForestClassifier(random_state=20)
outer_cv = KFold(n_splits=10)                    # creating outer folds of cross validation
inner_cv = RepeatedKFold(n_splits=5, n_repeats=3)# creating inner folds of cross validation

clf = GridSearchCV(r_f_c, parameters, cv=inner_cv) # creating the inner cross validation (grid search)
# running the outer cross validation
cv_final = cross_validate(clf, X=X_train, y=y, cv=outer_cv, return_estimator=True)

cross_val_score = cv_final['test_score']
cross_val_estimators = cv_final['estimator']

In [9]:
cross_val_score  # printing the scores

array([1.        , 1.        , 0.96666667, 0.96666667, 1.        ,
       0.96666667, 1.        , 1.        , 1.        , 0.86666667])

In [10]:
# printing the best params
for es in cross_val_estimators:
    print(es.best_params_)

{'max_depth': 4, 'max_features': 8}
{'max_depth': 4, 'max_features': 7}
{'max_depth': 4, 'max_features': 5}
{'max_depth': 6, 'max_features': 5}
{'max_depth': 6, 'max_features': 6}
{'max_depth': 4, 'max_features': 4}
{'max_depth': 4, 'max_features': 5}
{'max_depth': 4, 'max_features': 5}
{'max_depth': 8, 'max_features': 3}
{'max_depth': 2, 'max_features': 6}


#### result

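Most often chosen hyperparameters: max_depth = 4, max_features = 5 (3 of the 10 outer folds); max_depth = 4 on its own is chosen in 6 folds. The hold-out run below uses max_depth = 4, max_features = 7, a combination that was selected in one fold.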
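
Counting the winning combinations by eye is error-prone, so the tally can also be done in code. The sketch below copies the ten `best_params_` dictionaries printed above into a list and counts them with `collections.Counter`; the helper name `tally` is made up for this example. In the notebook itself the list could be built as `[es.best_params_ for es in cross_val_estimators]`.

```python
from collections import Counter

# Best parameters per outer fold, copied from the printed output above.
best_params_per_fold = [
    {"max_depth": 4, "max_features": 8},
    {"max_depth": 4, "max_features": 7},
    {"max_depth": 4, "max_features": 5},
    {"max_depth": 6, "max_features": 5},
    {"max_depth": 6, "max_features": 6},
    {"max_depth": 4, "max_features": 4},
    {"max_depth": 4, "max_features": 5},
    {"max_depth": 4, "max_features": 5},
    {"max_depth": 8, "max_features": 3},
    {"max_depth": 2, "max_features": 6},
]

def tally(params_list):
    """Count identical parameter combinations.

    Dictionaries are not hashable, so each one is converted to a
    sorted tuple of (name, value) pairs before counting.
    """
    return Counter(tuple(sorted(p.items())) for p in params_list)

combo_counts = tally(best_params_per_fold)
depth_counts = Counter(p["max_depth"] for p in best_params_per_fold)

top_combo, top_n = combo_counts.most_common(1)[0]
print(dict(top_combo), top_n)
print(depth_counts.most_common())
```

The same helper works unchanged for the KNN and MLP results, since it does not depend on the parameter names.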

In [18]:
# I used Exercise 2 solution as a reference here

X_train, X_test, y_train, y_test = train_test_split(data[columns], data['class_int'], test_size=0.2)
clf = RandomForestClassifier(max_depth=4, max_features=7) # parameters used for this run
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(metrics.accuracy_score(y_test,y_pred))     # printing the accuracy
result = metrics.confusion_matrix(y_test, y_pred)# getting the confusion matrix

# printing the result in an understandable format
print("Confusion Matrix:")
print(result)
result1 = metrics.classification_report(y_test, y_pred)
print("Classification Report:",)
print (result1)

0.9666666666666667
Confusion Matrix:
[[21  0  0]
 [ 0 16  0]
 [ 0  2 21]]
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        21
           1       0.89      1.00      0.94        16
           2       1.00      0.91      0.95        23

    accuracy                           0.97        60
   macro avg       0.96      0.97      0.97        60
weighted avg       0.97      0.97      0.97        60



### MLP Evaluation

In [11]:
# I used Exercise 2 solution as a reference here

y = data['class'].values
X_train = data[columns].values

# creating MLP classifier
mlp = MLPClassifier(max_iter=500, early_stopping=True, random_state=20)
outer_cv = KFold(n_splits=10)                    # creating outer folds of cross validation
inner_cv = RepeatedKFold(n_splits=5, n_repeats=3)# creating inner folds of cross validation

# creating parameters for MLP classifier
hidden_layer_sizes = [(i,) for i in range(3,22)]
activation = ['logistic', 'relu']
solver = ['sgd', 'adam']
val_fr = [0.1, 0.5]

# creating parameters for MLP classifier
parameter_space = {
    'hidden_layer_sizes': hidden_layer_sizes,
    'activation': activation,
    'solver': solver,
    'validation_fraction': val_fr
}

clf = GridSearchCV(mlp, parameter_space, cv=inner_cv) # creating the inner cross validation (grid search)
# running the outer cross validation
cv_final = cross_validate(clf, X=X_train, y=y, cv=outer_cv, return_estimator=True)
cross_val_score = cv_final['test_score']
cross_val_estimators = cv_final['estimator']

In [12]:
cross_val_score  # printing the scores

array([0.93333333, 1.        , 0.96666667, 1.        , 1.        ,
       0.9       , 0.9       , 1.        , 0.93333333, 0.83333333])

In [13]:
# printing the best params
for es in cross_val_estimators:
    print(es.best_params_)

{'activation': 'relu', 'hidden_layer_sizes': (16,), 'solver': 'sgd', 'validation_fraction': 0.5}
{'activation': 'relu', 'hidden_layer_sizes': (20,), 'solver': 'adam', 'validation_fraction': 0.5}
{'activation': 'relu', 'hidden_layer_sizes': (16,), 'solver': 'adam', 'validation_fraction': 0.5}
{'activation': 'relu', 'hidden_layer_sizes': (16,), 'solver': 'adam', 'validation_fraction': 0.5}
{'activation': 'relu', 'hidden_layer_sizes': (16,), 'solver': 'adam', 'validation_fraction': 0.5}
{'activation': 'relu', 'hidden_layer_sizes': (16,), 'solver': 'adam', 'validation_fraction': 0.5}
{'activation': 'relu', 'hidden_layer_sizes': (20,), 'solver': 'adam', 'validation_fraction': 0.5}
{'activation': 'relu', 'hidden_layer_sizes': (16,), 'solver': 'adam', 'validation_fraction': 0.5}
{'activation': 'relu', 'hidden_layer_sizes': (16,), 'solver': 'adam', 'validation_fraction': 0.5}
{'activation': 'relu', 'hidden_layer_sizes': (16,), 'solver': 'adam', 'validation_fraction': 0.5}


#### result

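Most often chosen hyperparameters: 'activation': 'relu', 'hidden_layer_sizes': (16,), 'solver': 'adam', 'validation_fraction': 0.5 (7 of the 10 outer folds). The hold-out run below uses hidden_layer_sizes = 20, which was chosen in 2 folds.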

In [20]:
# I used Exercise 2 solution as a reference here

X_train, X_test, y_train, y_test = train_test_split(data[columns], data['class_int'], test_size=0.2)
clf = MLPClassifier(hidden_layer_sizes=20, activation='relu', solver='adam',
                    validation_fraction=0.5, early_stopping=True).fit(X_train.values, y_train)
y_pred = clf.predict(X_test.values)              # predicting on the same array format used for fitting
print(metrics.accuracy_score(y_test,y_pred))     # printing the accuracy
result = metrics.confusion_matrix(y_test, y_pred)# getting the confusion matrix

# printing the result in an understandable format
print("Confusion Matrix:")
print(result)
result1 = metrics.classification_report(y_test, y_pred)
print("Classification Report:",)
print (result1)

0.9166666666666666
Confusion Matrix:
[[18  0  0]
 [ 0 21  2]
 [ 1  2 16]]
Classification Report:
              precision    recall  f1-score   support

           0       0.95      1.00      0.97        18
           1       0.91      0.91      0.91        23
           2       0.89      0.84      0.86        19

    accuracy                           0.92        60
   macro avg       0.92      0.92      0.92        60
weighted avg       0.92      0.92      0.92        60





## Discussion

Discussing my results

- Which model performs the best? Why?
- Ponder the limitations and generalization of the models. How well will the classifiers perform for data outside this data set?
- Compare your results with the original article. Are they comparable?
- Ponder applications for this type of model (classifying rice or other plant species). Who could benefit from them? Ponder also what would be interesting to study further in this area.
- What did you learn? What was difficult? Could you improve your own working process in some way?

### My answers

- Based on my results, Random Forest performs the best. The accuracy of KNN on the hold-out split is higher, but the precision and recall of the Random Forest match better, so I consider the results of this model more trustworthy.
  
- For the same species and a data set with similar features, I expect the models to perform with an accuracy close to what we obtained, since we evaluated and (ultimately) tested them on unseen data. For other species or a different kind of data set, however, the results cannot be relied on.


- The highest accuracy reported in the paper is 99.91%, which is much higher than ours! However, we skipped some parts of the data set because of limited computation capacity, which might affect the results too.

- Rice is not dangerous, but this kind of classification could be used to separate poisonous from non-poisonous species (for example in mushrooms). It is interesting to me that I can command machines to do something that I have no idea how to do myself :). I am sure I could not guess the species myself.

- I have never worked on image data as an analytics task before; all my experience was in computer vision. It gave me a different perspective and was very enjoyable, as the template and hints were really helpful.
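
The question of which model performs best can also be looked at through the outer-fold scores of the nested cross-validation, which were printed for each classifier above. The sketch below copies those three score arrays and summarizes them by mean and sample standard deviation; the dictionary names are illustrative. It is meant as a complement to the single hold-out splits, not as a formal statistical test.

```python
import numpy as np

# Outer-fold accuracies copied from the printed cross_val_score arrays.
outer_scores = {
    "KNN": [1.0, 1.0, 1.0, 1.0, 0.96666667,
            0.93333333, 0.96666667, 1.0, 0.93333333, 0.93333333],
    "Random Forest": [1.0, 1.0, 0.96666667, 0.96666667, 1.0,
                      0.96666667, 1.0, 1.0, 1.0, 0.86666667],
    "MLP": [0.93333333, 1.0, 0.96666667, 1.0, 1.0,
            0.9, 0.9, 1.0, 0.93333333, 0.83333333],
}

# Mean and sample standard deviation (ddof=1) over the 10 outer folds.
summary = {name: (np.mean(s), np.std(s, ddof=1))
           for name, s in outer_scores.items()}

for name, (mean_acc, std_acc) in summary.items():
    print(f"{name:14s} mean={mean_acc:.3f}  std={std_acc:.3f}")
```

With only ten folds, small differences between the means should be read cautiously.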