<a id=top></a> 
**Table of Contents**

- [Trying the models on an independent dataset](#indipendentdataset)
  - [Creating our data arrays](#arrays)
  - [Classification Report of the Best Model](#report)
  - [All Results: a summarizing table](#results)
    
- [Try it out with your video](#tryitout)  

- [APPENDIX](#appendix)  
    - [All other classification reports](#results2)
      - [No normalization](#nonorm)
      - [Normalization](#norm)

<a id=indipendentdataset></a> 
# Trying the models on an independent dataset

As mentioned earlier, the main limitation of the dataset we have worked with so far is that it was recorded with a single person, so the models have been trained on one individual only. Here, we challenge the model on a **new independent dataset** of different individuals to check whether it actually generalizes.\
This independent dataset is composed of videos recorded by us (55) and videos scraped from Instagram (10). With this analysis we want to understand whether the model generalizes to new people and, therefore, whether it is really that accurate at predicting squat mistakes. 

The dataset contains 65 videos and is reasonably balanced across classes (8 to 11 videos per class). 


<a id=arrays></a> 
## Creating the input data

To build the input array from the new dataset, we repeat the keypoint-extraction procedure described in Section 2.2.6 of the notebook `1_Pose_Estimation.ipynb`, and we apply the post-processing steps contained in the notebook `2_OpenPose_Postprocessing.ipynb`.

So, first, we extracted the poses with OpenPose; then we filled the NaN values, and smoothed and swapped the coordinates. Finally, we built the Euclidean distance matrices. 
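
The filling and distance steps can be illustrated with a minimal NumPy sketch. It is not the code in `transform_load_data.py`: the function names `fill_missing` and `distance_matrix` are hypothetical, linear interpolation is only one possible way to fill undetected keypoints, and we assume 15 two-dimensional keypoints per frame, which would give 15·14/2 = 105 keypoint pairs, consistent with the second dimension of the arrays built below.

```python
from itertools import combinations

import numpy as np


def fill_missing(track):
    """Fill NaN entries of a 1-D coordinate track by linear interpolation over time."""
    track = np.asarray(track, dtype=float).copy()
    missing = np.isnan(track)
    if missing.all():
        # Nothing to interpolate from: fall back to zeros (an arbitrary choice).
        return np.zeros_like(track)
    frames = np.arange(track.size)
    track[missing] = np.interp(frames[missing], frames[~missing], track[~missing])
    return track


def distance_matrix(coords):
    """Pairwise Euclidean distances between keypoints, for every frame.

    coords: array of shape (n_frames, n_keypoints, 2)
    returns: array of shape (n_pairs, n_frames)
    """
    coords = np.asarray(coords, dtype=float)
    first, second = np.array(list(combinations(range(coords.shape[1]), 2))).T
    diffs = coords[:, first, :] - coords[:, second, :]  # (n_frames, n_pairs, 2)
    return np.linalg.norm(diffs, axis=-1).T


# Toy input: 150 frames, 15 keypoints (assumed), random coordinates.
rng = np.random.default_rng(0)
toy_coords = rng.random((150, 15, 2))
toy_distances = distance_matrix(toy_coords)
print(toy_distances.shape)  # (105, 150)
```

Each column of the result describes one frame, so stacking the columns over time gives one image-like matrix per video.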

In this notebook we call the function `load_Xy` from `../module/transform_load_data.py`, which gathers all the functions needed for these post-processing steps. The *ResNet* architecture, instead, is built with the function `create_res_net` imported from `../module/utils_prediction.py`.


In [None]:
import os
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
from sklearn.metrics import classification_report
import sys
sys.path.append('../module/')
from utils_prediction import create_res_net
from transform_load_data import load_Xy

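In [None]:
"""Retrieving the body coordinates of the new videos.
The body coordinates of each frame are stored in CSV files in the folder
'../our_pose', in subfolders named after the squat class."""

# The class names are the subfolder names of '../our_videos/'.
classes = sorted(os.listdir('../our_videos/'))

# NOTE: PoseRetrivalAll is not imported in this notebook; it has to be imported
# from the project module that defines it before running this cell.
pose_insta = PoseRetrivalAll(input_dir = '../our_videos/',
                             list_folder = classes,
                             output_folder = 'our_pose')
pose_insta.model()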


In [None]:
"""The CSV files contained in '../our_pose/' are read, and the post-processing
steps are applied: filling of undetected keypoints, swapping of wrong
coordinates, and smoothing of the movement. Lastly, the Euclidean distances are computed
from the post-processed coordinates and stored in the following arrays."""

our_poseX, our_poseY, names_files = load_Xy('../our_pose/')
our_poseX.shape, our_poseY.shape

Building X


((65, 105, 150, 1), (65,))

In [None]:
np.save('../arrays/our_poseX', our_poseX)
np.save('../arrays/our_poseY', our_poseY)

<a id=report></a> 
## Best Classifier

These new distance matrices, along with their labels, are used to test the different model and normalization configurations presented in the notebook `3_Models.ipynb`. We import the weights of those models (stored in the folder `../weights`) and predict the class of each video in the new independent dataset. \
The predictions of all the models, with the corresponding classification reports, are given in the [Appendix](#appendix). \
Here, we only show the configuration that gives the best accuracy: *ResNet* applied to normalized input (torso normalization), with the SGD optimizer and a batch size of 32.
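
The idea behind a body-part normalization can be sketched as follows. This is not the implementation of `normalize` in `utils_normalization.py`: we assume here that every distance matrix is divided by the length of one reference segment (for instance left shoulder to left hip) read at one reference frame, and the names `normalize_by_segment`, `segment_row`, and `frame` are hypothetical.

```python
import numpy as np


def normalize_by_segment(X, segment_row, frame):
    """Scale each sample by the length of a reference segment.

    X: array of shape (n_samples, n_pairs, n_frames, 1) of distance matrices
    segment_row: row index of the reference keypoint pair (hypothetical)
    frame: frame at which the reference length is read (hypothetical)
    """
    X = np.asarray(X, dtype=float)
    reference = X[:, segment_row, frame, 0]  # one length per sample
    if np.any(reference <= 0):
        raise ValueError("reference segment length must be positive")
    return X / reference[:, None, None, None]


# Two toy "athletes": the second is the first one scaled by a factor of 2.
rng = np.random.default_rng(0)
small = rng.random((1, 105, 150, 1)) + 0.1
X_toy = np.concatenate([small, 2.0 * small])
X_norm = normalize_by_segment(X_toy, segment_row=0, frame=15)
print(np.allclose(X_norm[0], X_norm[1]))  # True: the body-size difference is removed
```

Under this assumption, two people who perform the same movement at different body sizes end up with the same normalized matrix, which is the property we hope helps on unseen athletes.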

In [None]:
from utils_normalization import normalize

# Torso normalization (left shoulder to left hip), as used for the best model
our_poseX1 = normalize(('LShoulder', 'LHip'), our_poseX, 15)

# ResNet, SGD optimizer, batch size 32
model = create_res_net('SGD')
model.load_weights('../weights/training32batch_standSGD_norm1.hdf5')

# Class probabilities -> predicted class index
y_pred_our = model.predict(our_poseX1)
y_pred_our = np.argmax(y_pred_our, axis = 1)
print(classification_report(our_poseY, y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.33      0.82      0.47        11
   bad_back_warp       0.69      0.82      0.75        11
        bad_head       0.50      0.12      0.20         8
bad_innner_thigh       1.00      0.25      0.40         8
     bad_shallow       0.67      0.50      0.57         8
         bad_toe       0.50      0.40      0.44        10
            good       1.00      0.78      0.88         9

        accuracy                           0.55        65
       macro avg       0.67      0.53      0.53        65
    weighted avg       0.66      0.55      0.54        65



***comments***

    The classification report shows how our model performs on the new dataset. 
    Overall, the results are fair, but it is worth noting that the model has less predictive power on this 
    new dataset. The main reason is that the training dataset contains a single individual: apart from the 
    different body structures, which we tried to control for with the different normalizations, each individual may have 
    a unique way of performing the exercise, and of making the mistakes as well. Enlarging the training dataset 
    with different athletes would be a good starting point for further improvements. 
    This is reflected in the low scores for the class bad_toe. We believe this is because, in the 
    training set, the individual strongly emphasizes this mistake, which makes it difficult to reproduce.
    Additionally, the difficulty in predicting the bad_head class is also present in this 
    dataset, as it was in the original dataset. 
    In general, it seems that on this dataset the algorithm is less effective at identifying exactly which squat
    error was made, while it performs well at recognizing whether the exercise is correct. Indeed, both precision and 
    recall for the good class are high (1.00 and 0.78). 
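
The distinction between naming the exact mistake and judging correctness can be checked by collapsing the seven classes into good versus not good. The sketch below does this with NumPy on made-up labels (the arrays `y_true_toy` and `y_pred_toy` are illustrative, not our predictions, and `binary_scores` is a hypothetical helper). In the notebook the same function could be applied to `our_poseY` and `y_pred_our`, assuming `good` is the last class in the sorted `classes` list.

```python
import numpy as np


def binary_scores(y_true, y_pred, good_label):
    """Precision, recall and accuracy for 'good' after merging all mistakes."""
    true_good = np.asarray(y_true) == good_label
    pred_good = np.asarray(y_pred) == good_label
    true_positive = np.sum(true_good & pred_good)
    precision = true_positive / max(np.sum(pred_good), 1)
    recall = true_positive / max(np.sum(true_good), 1)
    accuracy = np.mean(true_good == pred_good)
    return {
        "precision": float(precision),
        "recall": float(recall),
        "accuracy": float(accuracy),
    }


# Illustrative labels only: class 6 stands for 'good', 0-5 for the mistakes.
y_true_toy = np.array([6, 6, 6, 6, 0, 1, 2, 5])
y_pred_toy = np.array([6, 6, 6, 0, 1, 1, 0, 5])
scores = binary_scores(y_true_toy, y_pred_toy, good_label=6)
print(scores)
```

In the toy example, confusing one mistake with another (class 0 predicted as class 1) does not lower the binary scores, which is exactly the effect described in the comments above.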

<a id=results></a> 
## Results

In this section, we report a table summarizing the results of all the models, along with their configuration in terms of *optimizer*, *batch size*, and *normalization*. The last two columns show the accuracy on the test set of the original dataset and on the independent video dataset introduced above, used as a test set. \
Note that the weights come from the epoch that maximizes the accuracy on the validation set (in general, we have always looked at the validation set to avoid overfitting on the test set).
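
Selecting the checkpoint by validation accuracy amounts to the following rule. The sketch uses a made-up training history (`val_accuracy_toy` is illustrative, not taken from our logs) and a hypothetical helper `best_epoch`; the test set plays no role in the choice.

```python
import numpy as np


def best_epoch(val_accuracy):
    """Return the 1-based epoch with the highest validation accuracy.

    Ties are resolved in favour of the earliest epoch.
    """
    val_accuracy = np.asarray(val_accuracy, dtype=float)
    return int(np.argmax(val_accuracy)) + 1


# Illustrative validation accuracies for five epochs.
val_accuracy_toy = [0.52, 0.61, 0.74, 0.71, 0.74]
print(best_epoch(val_accuracy_toy))  # 3
```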

In [None]:
# Summary table with results
summary_df = pd.read_csv('../log/Final Results All Models.csv')

cm = sns.light_palette((240, 85, 70), input="husl", as_cmap=True)
subset = ['Accuracy Test Set', 'Accuracy Indipendent Test Set']
summary_df.style.background_gradient(cmap = cm, axis = 0, subset = subset,
                                     low = 0, high = 1.2)\
                                     .format("{:.2f}", subset = subset)

| | Model | Optimizer | Batch Size | Normalization Type | Accuracy Test Set | Accuracy Indipendent Test Set |
|---|---|---|---|---|---|---|
| 0 | ResNet30 | Adam | 32 | | 0.80 | 0.49 |
| 1 | ResNet30 | Adam | 16 | | 0.81 | 0.40 |
| 2 | ResNet30 | SGD | 32 | | 0.85 | 0.48 |
| 3 | ResNet30 | SGD | 16 | | 0.84 | 0.42 |
| 4 | ResNet30 | AdaDelta | 32 | | 0.62 | 0.31 |
| 5 | ResNet30 | SGD | 32 | Normalization Standing | 0.82 | 0.55 |
| 6 | ResNet30 | SGD | 32 | Normalization Max Min | 0.81 | 0.35 |
| 7 | ResNet30 | SGD | 32 | Normalization Arm | 0.79 | 0.51 |
| 8 | AlexNet | Adam | 16 | | 0.82 | 0.43 |
| 9 | AlexNet | SGD | 16 | | 0.82 | 0.48 |


***comments***

    Overall, the majority of the models show good predictive performance on the original test set. 
    Using the distance matrices as network input is more competitive (and much faster) than the CONV-LSTM, in which the entire video is used as input. 
    Additionally, as we expected, the models perform worse on the independent dataset. However, it is worth noting that normalization can help in classifying squats performed by people with different body shapes. 
    The best results are obtained with the torso normalization (accuracy of 0.55) and the arm normalization (accuracy of 0.51), whereas the max-min normalization (0.35) does worse than no normalization (0.48).
    However, to use our squat classifier in production, it would be necessary to increase the sample size by training the model on more people.
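
The drop from the original test set to the independent dataset can be read directly from the table. The short sketch below recomputes it for the ResNet30 models trained with SGD and batch size 32; the accuracies are copied from the table above, while the dictionary `results` and its keys are our own labels.

```python
# (accuracy on original test set, accuracy on independent dataset)
results = {
    "no normalization": (0.85, 0.48),
    "standing (torso)": (0.82, 0.55),
    "max-min": (0.81, 0.35),
    "arm": (0.79, 0.51),
}

# Accuracy lost when moving from the original test set to the new videos.
gaps = {
    name: round(test - independent, 2)
    for name, (test, independent) in results.items()
}
smallest_gap = min(gaps, key=gaps.get)

for name, gap in gaps.items():
    print(f"{name:>18}: accuracy drop = {gap:.2f}")
print("smallest drop:", smallest_gap)
```

With these numbers the standing (torso) normalization shows the smallest drop (0.27) and the max-min normalization the largest (0.46).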

<a id=tryitout></a> 
# Try it out with your video!
In this section, we build a **prototype application**. Given a new video as input, it extracts the skeleton coordinates and reports whether the exercise is correct or which mistake was made.
This prototype can be thought of as a **virtual coach**: short messages accompany the prediction to help the user understand the potential mistake.\
The prototype is saved in `../module/prototype.py` and can be called directly from the terminal:
```
python prototype.py --v PATH_FILE
```

It returns the prediction of our best model.

*Note* that *PATH_FILE* can be either the path of an already retrieved pose (i.e. a CSV file) or the path of a new input video (in the latter case it takes longer, because *OpenPose* must run before a prediction can be made). \
As an optional argument, **--p** (plot: bool) shows the extracted pose as a skeleton drawn on the input video.
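
A command-line interface of this kind could be wired up as in the sketch below. It is not the content of `prototype.py`: only the flags `--v` and `--p` come from the description above, while `parse_arguments`, `needs_pose_extraction`, and treating `--p` as an on/off switch are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def parse_arguments(argv=None):
    """Parse the two options described above."""
    parser = argparse.ArgumentParser(description="Squat classifier prototype (sketch)")
    parser.add_argument("--v", required=True, metavar="PATH_FILE",
                        help="path to a video or to a CSV file of extracted poses")
    parser.add_argument("--p", action="store_true",
                        help="plot the extracted skeleton on the input video")
    return parser.parse_args(argv)


def needs_pose_extraction(path):
    """A CSV file already holds the keypoints; anything else is treated as a video."""
    return Path(path).suffix.lower() != ".csv"


args = parse_arguments(["--v", "../our_pose/good/Insta0.csv"])
print(args.v, args.p, needs_pose_extraction(args.v))
```

Checking the file extension first lets the slow pose-extraction step be skipped whenever the keypoints are already available.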

In [None]:
"""Moving to the module folder, where prototype.py is stored"""
%cd ../module

In [None]:
"""Here, we tried out our prototype by retrieving the CSV file in which coordinates
were already stored. Remember that it can be used also with a totally new video!"""
!python prototype.py --v ../our_pose/good/Insta0.csv

You have done an excellent squat!


<a id=appendix></a> 
# APPENDIX
In this part, we report the classification reports obtained on the independent dataset for all the model configurations. 
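
Every cell in this appendix follows the same pattern: load the weights, predict, take the arg max, and print the report. As a sketch, the prediction part of the pattern can be wrapped in one helper; `evaluate` and `DummyModel` are hypothetical names, and the dummy model only stands in for a Keras model so that the example runs without the weight files.

```python
import numpy as np


def evaluate(model, X, y_true):
    """Predict class probabilities, take the arg max, and return the accuracy."""
    probabilities = model.predict(X)
    y_pred = np.argmax(probabilities, axis=1)
    return y_pred, float(np.mean(y_pred == np.asarray(y_true)))


class DummyModel:
    """Stand-in with a Keras-like predict(); it always favours class 0."""

    def __init__(self, n_classes):
        self.n_classes = n_classes

    def predict(self, X):
        probabilities = np.full((len(X), self.n_classes), 0.1)
        probabilities[:, 0] = 0.4
        return probabilities


X_toy = np.zeros((4, 105, 150, 1))
y_toy = np.array([0, 0, 3, 6])
y_pred, accuracy = evaluate(DummyModel(n_classes=7), X_toy, y_toy)
print(y_pred, accuracy)
```

In the notebook, a call such as `print(classification_report(our_poseY, y_pred, target_names = classes))` could follow each evaluation to obtain the per-class reports shown below.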

<a id=results2></a> 
### All other classification reports 

<a id=nonorm></a> 
#### Results for models without Normalization

In [None]:
our_poseX = np.load('../arrays/our_poseX.npy')
our_poseY = np.load('../arrays/our_poseY.npy')
our_poseX.shape, our_poseY.shape

((65, 105, 150, 1), (65,))

In [None]:
# Adam 32 Batch
model = create_res_net('adam')
model.load_weights('../weights/adam32_weights.87-0.97.hdf5')

y_pred_our= model.predict(our_poseX)
y_pred_our = np.argmax(y_pred_our, axis = 1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.45      0.82      0.58        11
   bad_back_warp       0.64      0.64      0.64        11
        bad_head       0.67      0.25      0.36         8
bad_innner_thigh       0.60      0.38      0.46         8
     bad_shallow       0.43      0.38      0.40         8
         bad_toe       0.60      0.30      0.40        10
            good       0.36      0.56      0.43         9

        accuracy                           0.49        65
       macro avg       0.53      0.47      0.47        65
    weighted avg       0.53      0.49      0.48        65



In [None]:
# Adam 16 Batch
model = create_res_net('adam')
model.load_weights('../weights/adam16_weights.100-0.81.hdf5')

y_pred_our= model.predict(our_poseX)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes)) 

                  precision    recall  f1-score   support

  bad_back_round       0.46      0.55      0.50        11
   bad_back_warp       0.44      0.64      0.52        11
        bad_head       0.14      0.12      0.13         8
bad_innner_thigh       0.50      0.25      0.33         8
     bad_shallow       0.80      0.50      0.62         8
         bad_toe       0.33      0.30      0.32        10
            good       0.27      0.33      0.30         9

        accuracy                           0.40        65
       macro avg       0.42      0.38      0.39        65
    weighted avg       0.42      0.40      0.40        65



In [None]:
# SGD 32 Batch
model = create_res_net('sgd')
model.load_weights('../weights/sgd32_weights.26-0.61.hdf5')

y_pred_our= model.predict(our_poseX)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.36      0.82      0.50        11
   bad_back_warp       0.70      0.64      0.67        11
        bad_head       0.38      0.38      0.38         8
bad_innner_thigh       1.00      0.25      0.40         8
     bad_shallow       0.50      0.38      0.43         8
         bad_toe       0.43      0.30      0.35        10
            good       0.57      0.44      0.50         9

        accuracy                           0.48        65
       macro avg       0.56      0.46      0.46        65
    weighted avg       0.56      0.48      0.47        65



In [None]:
# SGD 16 Batch
model = create_res_net('sgd')
model.load_weights('../weights/sgd16_weights.95-0.67.hdf5')

y_pred_our= model.predict(our_poseX)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.33      0.73      0.46        11
   bad_back_warp       0.53      0.73      0.62        11
        bad_head       0.40      0.25      0.31         8
bad_innner_thigh       0.67      0.25      0.36         8
     bad_shallow       0.60      0.38      0.46         8
         bad_toe       0.43      0.30      0.35        10
            good       0.17      0.11      0.13         9

        accuracy                           0.42        65
       macro avg       0.45      0.39      0.38        65
    weighted avg       0.44      0.42      0.39        65



In [None]:
# AdaDelta 
model = create_res_net('adadelta')
model.load_weights('../weights/AdaDelta32_weights.hdf5')

y_pred_our= model.predict(our_poseX)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.30      0.82      0.44        11
   bad_back_warp       0.57      0.36      0.44        11
        bad_head       0.10      0.12      0.11         8
bad_innner_thigh       0.00      0.00      0.00         8
     bad_shallow       0.50      0.38      0.43         8
         bad_toe       0.29      0.20      0.24        10
            good       0.25      0.11      0.15         9

        accuracy                           0.31        65
       macro avg       0.29      0.28      0.26        65
    weighted avg       0.30      0.31      0.27        65



In [None]:
# AlexNet Adam 16
model = tf.keras.models.load_model('../weights/OtherModels/AlexNet_adam_weights_16batch.hdf5')

y_pred_our= model.predict(our_poseX)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.40      0.55      0.46        11
   bad_back_warp       0.70      0.64      0.67        11
        bad_head       0.17      0.12      0.14         8
bad_innner_thigh       0.67      0.25      0.36         8
     bad_shallow       0.40      0.50      0.44         8
         bad_toe       0.31      0.40      0.35        10
            good       0.50      0.44      0.47         9

        accuracy                           0.43        65
       macro avg       0.45      0.41      0.41        65
    weighted avg       0.45      0.43      0.43        65



In [None]:
# AlexNet SGD 16
model = tf.keras.models.load_model('../weights/OtherModels/AlexNet_SGD_weights_16batch.hdf5')

y_pred_our= model.predict(our_poseX)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.47      0.64      0.54        11
   bad_back_warp       0.57      0.73      0.64        11
        bad_head       0.20      0.12      0.15         8
bad_innner_thigh       0.50      0.38      0.43         8
     bad_shallow       0.62      0.62      0.62         8
         bad_toe       0.33      0.30      0.32        10
            good       0.50      0.44      0.47         9

        accuracy                           0.48        65
       macro avg       0.46      0.46      0.45        65
    weighted avg       0.46      0.48      0.46        65



In [None]:
# VGG16 ADAM
model = tf.keras.models.load_model('../weights/OtherModels/VGG16_adam_weights_16batch.hdf5')

y_pred_our= model.predict(our_poseX)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.38      0.55      0.44        11
   bad_back_warp       0.55      0.55      0.55        11
        bad_head       0.62      0.62      0.62         8
bad_innner_thigh       0.60      0.38      0.46         8
     bad_shallow       0.75      0.38      0.50         8
         bad_toe       0.43      0.30      0.35        10
            good       0.43      0.67      0.52         9

        accuracy                           0.49        65
       macro avg       0.54      0.49      0.49        65
    weighted avg       0.52      0.49      0.49        65



In [None]:
# VGG16 SGD
model = tf.keras.models.load_model('../weights/OtherModels/VGG16_SGD_weights_16batch.hdf5')

y_pred_our= model.predict(our_poseX)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.41      0.64      0.50        11
   bad_back_warp       0.62      0.73      0.67        11
        bad_head       0.20      0.12      0.15         8
bad_innner_thigh       0.75      0.38      0.50         8
     bad_shallow       0.67      0.50      0.57         8
         bad_toe       0.50      0.40      0.44        10
            good       0.25      0.33      0.29         9

        accuracy                           0.46        65
       macro avg       0.48      0.44      0.45        65
    weighted avg       0.48      0.46      0.46        65



In [None]:
# CONV-LSTM
model = tf.keras.models.load_model('../weights/OtherModels/CNNLSTM_SGD_weights_16batch.hdf5')

our_poseX_lstm = np.expand_dims(our_poseX, axis = 1)  # add a time axis: (65, 1, 105, 150, 1)
y_pred_our= model.predict(our_poseX_lstm)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.30      0.55      0.39        11
   bad_back_warp       0.73      0.73      0.73        11
        bad_head       0.20      0.25      0.22         8
bad_innner_thigh       0.67      0.25      0.36         8
     bad_shallow       0.38      0.38      0.38         8
         bad_toe       0.43      0.30      0.35        10
            good       0.67      0.44      0.53         9

        accuracy                           0.43        65
       macro avg       0.48      0.41      0.42        65
    weighted avg       0.48      0.43      0.43        65



<a id=norm></a> 
#### Results with Normalization

##### Normalization 1 (torso)

In [None]:
from utils_normalization import normalize

our_poseX1 = normalize(('LShoulder', 'LHip'), our_poseX, 15)

In [None]:
# SGD 32 batch
model = create_res_net('SGD')
model.load_weights('../weights/training32batch_standSGD_norm1.hdf5')

y_pred_our= model.predict(our_poseX1)
y_pred_our = np.argmax(y_pred_our, axis = 1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.33      0.82      0.47        11
   bad_back_warp       0.69      0.82      0.75        11
        bad_head       0.50      0.12      0.20         8
bad_innner_thigh       1.00      0.25      0.40         8
     bad_shallow       0.67      0.50      0.57         8
         bad_toe       0.50      0.40      0.44        10
            good       1.00      0.78      0.88         9

        accuracy                           0.55        65
       macro avg       0.67      0.53      0.53        65
    weighted avg       0.66      0.55      0.54        65



In [None]:
# Adam 32 batch
model = create_res_net('adam')
model.load_weights('../weights/OtherModels/stand_norm_adam_32batchweights.85-1.03.hdf5')

y_pred_our= model.predict(our_poseX1)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.39      0.82      0.53        11
   bad_back_warp       0.44      0.73      0.55        11
        bad_head       0.14      0.12      0.13         8
bad_innner_thigh       1.00      0.38      0.55         8
     bad_shallow       0.50      0.38      0.43         8
         bad_toe       0.75      0.30      0.43        10
            good       0.25      0.11      0.15         9

        accuracy                           0.43        65
       macro avg       0.50      0.40      0.40        65
    weighted avg       0.49      0.43      0.41        65



In [None]:
# Adam 16 batch
model = create_res_net('adam')
model.load_weights('../weights/OtherModels/stand_norm_adam_16batch.98-0.92.hdf5')

y_pred_our= model.predict(our_poseX1)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.38      0.82      0.51        11
   bad_back_warp       0.64      0.64      0.64        11
        bad_head       0.25      0.25      0.25         8
bad_innner_thigh       0.60      0.38      0.46         8
     bad_shallow       0.50      0.25      0.33         8
         bad_toe       0.75      0.30      0.43        10
            good       0.56      0.56      0.56         9

        accuracy                           0.48        65
       macro avg       0.52      0.46      0.45        65
    weighted avg       0.53      0.48      0.47        65



##### Normalization 2 (min-max)

In [None]:
from utils_normalization import normalization_minmax

our_poseX2 = normalization_minmax(our_poseX)

In [None]:
# SGD 32 batch size
model = create_res_net('SGD')
model.load_weights('../weights/norm2_32_weightsSGD.86-0.65.hdf5')

y_pred_our= model.predict(our_poseX2)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.32      0.73      0.44        11
   bad_back_warp       0.67      0.36      0.47        11
        bad_head       0.00      0.00      0.00         8
bad_innner_thigh       0.38      0.38      0.38         8
     bad_shallow       0.00      0.00      0.00         8
         bad_toe       0.36      0.40      0.38        10
            good       0.29      0.44      0.35         9

        accuracy                           0.35        65
       macro avg       0.29      0.33      0.29        65
    weighted avg       0.31      0.35      0.31        65



In [None]:
# Adam 32 Batch Size
model = create_res_net('adam')
model.load_weights('../weights/OtherModels/norm2_32_weightADAM.hdf5')

y_pred_our= model.predict(our_poseX2)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.36      0.73      0.48        11
   bad_back_warp       0.54      0.64      0.58        11
        bad_head       0.29      0.25      0.27         8
bad_innner_thigh       0.60      0.38      0.46         8
     bad_shallow       0.60      0.38      0.46         8
         bad_toe       0.40      0.20      0.27        10
            good       0.25      0.22      0.24         9

        accuracy                           0.42        65
       macro avg       0.43      0.40      0.39        65
    weighted avg       0.43      0.42      0.40        65



In [None]:
# Adam 16 Batch Size
model = create_res_net('adam')
model.load_weights('../weights/OtherModels/norm2_16_weightADAM.hdf5')

y_pred_our= model.predict(our_poseX2)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.38      0.73      0.50        11
   bad_back_warp       0.50      0.55      0.52        11
        bad_head       0.33      0.12      0.18         8
bad_innner_thigh       0.43      0.38      0.40         8
     bad_shallow       0.60      0.38      0.46         8
         bad_toe       0.43      0.30      0.35        10
            good       0.30      0.33      0.32         9

        accuracy                           0.42        65
       macro avg       0.42      0.40      0.39        65
    weighted avg       0.42      0.42      0.40        65



##### Normalization 3 (arm)

In [None]:
our_poseX3 = normalize(('RElbow', 'RWrist'), our_poseX, 0)

In [None]:
# SGD 32 Batch Size
model = create_res_net('SGD')
model.load_weights('../weights/normWrist_SGD32.hdf5')

y_pred_our= model.predict(our_poseX3)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.35      0.73      0.47        11
   bad_back_warp       0.69      0.82      0.75        11
        bad_head       0.33      0.25      0.29         8
bad_innner_thigh       0.75      0.38      0.50         8
     bad_shallow       0.75      0.38      0.50         8
         bad_toe       0.60      0.30      0.40        10
            good       0.50      0.56      0.53         9

        accuracy                           0.51        65
       macro avg       0.57      0.49      0.49        65
    weighted avg       0.56      0.51      0.50        65



In [None]:
# SGD 16 Batch Size
model = create_res_net('SGD')
model.load_weights('../weights/OtherModels/normWrist_SGD16.hdf5')

y_pred_our= model.predict(our_poseX3)
y_pred_our = np.argmax(y_pred_our, axis =1)
print(classification_report(our_poseY,y_pred_our, target_names = classes))

                  precision    recall  f1-score   support

  bad_back_round       0.32      0.73      0.44        11
   bad_back_warp       0.60      0.55      0.57        11
        bad_head       0.40      0.25      0.31         8
bad_innner_thigh       1.00      0.38      0.55         8
     bad_shallow       0.67      0.25      0.36         8
         bad_toe       0.33      0.30      0.32        10
            good       0.40      0.44      0.42         9

        accuracy                           0.43        65
       macro avg       0.53      0.41      0.42        65
    weighted avg       0.52      0.43      0.43        65

