## Face extraction using visual words

---

In this notebook I use the bag of visual words generated in the [last notebook](https://colab.research.google.com/drive/1J5B1rTAGaAfFelf8P9d4lXzjjH1j_WBr?usp=sharing) and train several regression models to find the one that best predicts the face bounding box.

---

References:
  - scikit-learn multi-output regression: [models](https://scikit-learn.org/stable/modules/multiclass.html)
  - saving and loading models: [post](https://machinelearningmastery.com/save-load-machine-learning-models-python-scikit-learn/)

In [1]:
!gdown 1-4MCqrUz_oOBJaj66yCXjQtlS9_IAK2Z

Downloading...
From: https://drive.google.com/uc?id=1-4MCqrUz_oOBJaj66yCXjQtlS9_IAK2Z
To: /content/celeb_with_visual_words1024.csv
100% 90.9M/90.9M [00:01<00:00, 79.3MB/s]


In [2]:
!mkdir models

In [3]:
# plot_confusion_matrix was removed in scikit-learn 1.2 and is not used below
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor
from sklearn.neighbors import KNeighborsClassifier
from matplotlib import pyplot as plt
from sklearn.svm import LinearSVC 
from sklearn import metrics
import plotly.express as px
import seaborn as sns
from tqdm import tqdm
import pandas as pd
import numpy as np
import pickle
import ast


tqdm.pandas()
sns.set_theme()
MODEL_DIR_NAME = 'models/'

In [4]:
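def validation_metrics(y_pred, y_true, show_img=True):
  # Per-example RMSE over the four box coordinates, rescaled from [0, 1] to pixels
  rmse = np.sqrt(np.square(y_pred - y_true).mean(axis=-1)) * 480
  if show_img:
    # Histogram of the per-example errors
    px.histogram(rmse).show()
  return np.mean(rmse)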

In [5]:
df = pd.read_csv('celeb_with_visual_words1024.csv')
df['visual_words_histogram'] = [np.array(ast.literal_eval(cur_vword)) for cur_vword in df['visual_words_histogram']]
df[['x0', 'y0', 'x1', 'y1']] = df[['x0', 'y0', 'x1', 'y1']] / 480 
df['visual_words_histogram'] = df['visual_words_histogram'].apply(lambda x: x/sum(x))
df.head(3)

Unnamed: 0,img_location,x0,y0,x1,y1,visual_words_histogram
0,celeb_data_resized/000001.jpg,0.232274,0.103348,0.784841,0.558952,"[0.002145922746781116, 0.0, 0.0, 0.00071530758..."
1,celeb_data_resized/000002.jpg,0.170213,0.158249,0.692671,0.673401,"[0.001098901098901099, 0.0032967032967032967, ..."
2,celeb_data_resized/000003.jpg,0.432,0.209964,0.614,0.658363,"[0.003937007874015748, 0.0, 0.0, 0.0, 0.0, 0.0..."
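
The preprocessing in cell 5 can be sketched on a toy frame. The rows, the four-bin histograms, and the name `IMG_SIZE` are illustrative assumptions; only the divisor 480 and the column names come from the notebook.

```python
import ast

import numpy as np
import pandas as pd

IMG_SIZE = 480  # assumed image side in pixels, matching the divisor used above

# Toy stand-in for the CSV: histograms stored as strings, boxes in pixels.
toy = pd.DataFrame({
    "x0": [96.0, 48.0], "y0": [48.0, 96.0],
    "x1": [384.0, 240.0], "y1": [240.0, 384.0],
    "visual_words_histogram": ["[3, 0, 1, 4]", "[1, 1, 1, 1]"],
})

# Parse each string into a numeric array.
toy["visual_words_histogram"] = [
    np.array(ast.literal_eval(s), dtype=float)
    for s in toy["visual_words_histogram"]
]
# L1-normalise so the bins of every histogram sum to 1.
toy["visual_words_histogram"] = toy["visual_words_histogram"].apply(
    lambda h: h / h.sum()
)

# Scale the box corners into [0, 1].
box_cols = ["x0", "y0", "x1", "y1"]
toy[box_cols] = toy[box_cols] / IMG_SIZE

hist = toy["visual_words_histogram"].iloc[0]
box = toy[box_cols].iloc[0].to_numpy()
print(hist, box)
```

Normalising the histogram makes images with different numbers of detected keypoints comparable, and scaling the box keeps all four targets in the same range.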


In [6]:
df_train, df_test = train_test_split(df, test_size=0.3, random_state=42)

In [7]:
len(df_train), len(df_test)

(7028, 3012)

In [8]:
x_train = np.array(list(df_train['visual_words_histogram'].values))
y_train = np.array(list(df_train[['x0', 'y0', 'x1', 'y1']].values))

x_test = np.array(list(df_test['visual_words_histogram'].values))
y_test = np.array(list(df_test[['x0', 'y0', 'x1', 'y1']].values))

In [9]:
def train_model_and_test_sk_model(x_train, y_train, x_test, y_test, sk_model, filename):
  # Fit the model, predict on the held-out set, and persist the fitted model
  sk_model.fit(x_train, y_train)
  y_pred = sk_model.predict(x_test)
  with open(MODEL_DIR_NAME + filename, 'wb') as model_file:
    pickle.dump(sk_model, model_file)

  print("rsme : ", validation_metrics(y_pred, y_test))
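
A saved model can later be restored with `pickle.load`. The following is a minimal sketch of that round trip, assuming synthetic data and a hypothetical file name (`demo_model.sav`) written to a temporary directory. It does not use the notebook's real models.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
x_demo = rng.random((20, 8))  # synthetic histograms (illustrative only)
y_demo = rng.random((20, 4))  # synthetic normalised boxes (illustrative only)

model = MultiOutputRegressor(KNeighborsRegressor(n_neighbors=3)).fit(x_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "demo_model.sav"  # hypothetical file name
    # Save the fitted model ...
    with open(model_path, "wb") as fh:
        pickle.dump(model, fh)
    # ... and load it back.
    with open(model_path, "rb") as fh:
        restored = pickle.load(fh)

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(x_demo), restored.predict(x_demo))
print(same_predictions)
```

Pickled scikit-learn models are generally only safe to load with the same library version that wrote them, and only from a trusted source.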


## Defining the models for bounding box generation

---

Now we define a model that predicts the bounding box of the face in an image from its bag of visual words.

We start with some baseline models:

 - GradientBoostingRegressor
 - SGDRegressor
 - KNeighborsRegressor (KNN)
 - simple NN (MLPRegressor)


To check model performance, we compute the RMSE of each test example, plot the histogram of these values, and then report their mean. The RMSE is rescaled by 480, so it is expressed in pixels. A perfect model would have a mean RMSE of zero and a histogram concentrated at zero.
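
As a small worked example of this metric, the sketch below computes the per-example RMSE for two hand-made boxes. The helper name `per_example_rmse` and the sample values are illustrative assumptions. The formula mirrors `validation_metrics` without the plot.

```python
import numpy as np

IMG_SIZE = 480  # assumed image side in pixels, as in validation_metrics


def per_example_rmse(y_pred, y_true, img_size=IMG_SIZE):
    """RMSE over the four box coordinates of each example, in pixels."""
    return np.sqrt(np.square(y_pred - y_true).mean(axis=-1)) * img_size


y_true = np.array([[0.2, 0.1, 0.8, 0.5],
                   [0.3, 0.2, 0.7, 0.6]])
y_pred = np.array([[0.2, 0.1, 0.8, 0.5],   # perfect prediction
                   [0.4, 0.3, 0.8, 0.7]])  # every corner off by 0.1

rmse = per_example_rmse(y_pred, y_true)
mean_rmse = rmse.mean()
print(rmse, mean_rmse)
```

The first example contributes an error of 0 and the second an error of about 48 pixels (0.1 × 480), so the mean RMSE is about 24 pixels.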

### scikit-learn gradient boosting

---

In [10]:
sk_model = MultiOutputRegressor(GradientBoostingRegressor(random_state=42, verbose=1))
train_model_and_test_sk_model(x_train, y_train, x_test, y_test,
                              sk_model, filename='gradient_model.sav')

      Iter       Train Loss   Remaining Time 
         1           0.0141            2.27m
         2           0.0140            2.22m
         3           0.0139            2.19m
         4           0.0138            2.16m
         5           0.0138            2.13m
         6           0.0137            2.11m
         7           0.0136            2.08m
         8           0.0136            2.06m
         9           0.0135            2.04m
        10           0.0135            2.02m
        20           0.0129            1.79m
        30           0.0125            1.56m
        40           0.0120            1.34m
        50           0.0117            1.12m


KeyboardInterrupt: ignored

### SGD for regression

---

In [11]:
from sklearn.linear_model import SGDRegressor

In [12]:
sk_model = MultiOutputRegressor(SGDRegressor(random_state=42, verbose=1))
train_model_and_test_sk_model(x_train, y_train, x_test, y_test, sk_model, 'SGDRegressor.sav')

-- Epoch 1
Norm: 0.01, NNZs: 1024, Bias: 0.284975, T: 7028, Avg. loss: 0.007946
Total training time: 0.03 seconds.
-- Epoch 2
Norm: 0.01, NNZs: 1024, Bias: 0.290016, T: 14056, Avg. loss: 0.007075
Total training time: 0.06 seconds.
-- Epoch 3
Norm: 0.01, NNZs: 1024, Bias: 0.284122, T: 21084, Avg. loss: 0.007079
Total training time: 0.08 seconds.
-- Epoch 4
Norm: 0.01, NNZs: 1024, Bias: 0.283599, T: 28112, Avg. loss: 0.007077
Total training time: 0.11 seconds.
-- Epoch 5
Norm: 0.01, NNZs: 1024, Bias: 0.283103, T: 35140, Avg. loss: 0.007077
Total training time: 0.14 seconds.
-- Epoch 6
Norm: 0.01, NNZs: 1024, Bias: 0.286996, T: 42168, Avg. loss: 0.007076
Total training time: 0.16 seconds.
Convergence after 6 epochs took 0.16 seconds
-- Epoch 1
Norm: 0.00, NNZs: 1024, Bias: 0.146645, T: 7028, Avg. loss: 0.002483
Total training time: 0.03 seconds.
-- Epoch 2
Norm: 0.00, NNZs: 1024, Bias: 0.144433, T: 14056, Avg. loss: 0.002256
Total training time: 0.05 seconds.
-- Epoch 3
Norm: 0.01, NNZs: 

rsme :  59.44734899964705


### KNN for regression

--- 

In [13]:
from sklearn.neighbors import KNeighborsRegressor

In [18]:
sk_model = MultiOutputRegressor(KNeighborsRegressor(n_neighbors=5))
train_model_and_test_sk_model(x_train, y_train, x_test, y_test, sk_model, 'knnRegression.sav')

rsme :  68.20681808995815
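
As a side note, `KNeighborsRegressor` accepts two-dimensional targets directly, so the `MultiOutputRegressor` wrapper is optional for this model. The sketch below compares the two variants on synthetic data. The array shapes and variable names are illustrative assumptions, not the notebook's data.

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(42)
x_demo = rng.random((50, 16))   # synthetic histograms (illustrative only)
y_demo = rng.random((50, 4))    # synthetic normalised boxes (illustrative only)
x_query = rng.random((5, 16))

# One KNN model per output coordinate, as in the cell above.
wrapped = MultiOutputRegressor(KNeighborsRegressor(n_neighbors=5)).fit(x_demo, y_demo)
# A single KNN model that predicts all four coordinates at once.
native = KNeighborsRegressor(n_neighbors=5).fit(x_demo, y_demo)

pred_wrapped = wrapped.predict(x_query)
pred_native = native.predict(x_query)
print(np.allclose(pred_wrapped, pred_native))
```

Both variants average the targets of the same five neighbours, so the predictions should agree. The native form stores the training data once instead of four times, which is likely to give a smaller saved model.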


### Simple NN for regression

--- 

In [15]:
from sklearn.neural_network import MLPRegressor

In [16]:
sk_model = MultiOutputRegressor(MLPRegressor(random_state=42, verbose=1))
train_model_and_test_sk_model(x_train, y_train, x_test, y_test, sk_model, 'simpleNN.sav')

Iteration 1, loss = 0.01353166
Iteration 2, loss = 0.00717641
Iteration 3, loss = 0.00704849
Iteration 4, loss = 0.00696858
Iteration 5, loss = 0.00691018
Iteration 6, loss = 0.00683548
Iteration 7, loss = 0.00676286
Iteration 8, loss = 0.00670667
Iteration 9, loss = 0.00661833
Iteration 10, loss = 0.00654637
Iteration 11, loss = 0.00646533
Iteration 12, loss = 0.00638021
Iteration 13, loss = 0.00631342
Iteration 14, loss = 0.00626319
Training loss did not improve more than tol=0.000100 for 10 consecutive epochs. Stopping.
Iteration 1, loss = 0.00320835
Iteration 2, loss = 0.00227378
Iteration 3, loss = 0.00223627
Iteration 4, loss = 0.00220337
Iteration 5, loss = 0.00217807
Iteration 6, loss = 0.00214907
Iteration 7, loss = 0.00212780
Iteration 8, loss = 0.00210713
Iteration 9, loss = 0.00208650
Iteration 10, loss = 0.00206226
Iteration 11, loss = 0.00205867
Iteration 12, loss = 0.00203204
Iteration 13, loss = 0.00203386
Training loss did not improve more than tol=0.000100 for 10 cons

rsme :  55.866226855564605


In [17]:
!zip -r models_for_face_extraction_celeb.zip models

  adding: models/ (stored 0%)
  adding: models/SGDRegressor.sav (deflated 7%)
  adding: models/simpleNN.sav (deflated 3%)
  adding: models/knnRegression.sav (deflated 93%)
