## <center>Feature Extractor : VGG16 + ML Algorithm [Inference Notebook]</center>

### <center>Welcome Curious Reader!</center>

- This inference notebook was created for the competition: [PetFinder.my - Pawpularity](https://www.kaggle.com/c/petfinder-pawpularity-score)
    
- The competition poses the challenge of predicting a continuous value from image data and categorical data.

- **Aim** : To understand the use of pickle files to load trained ML models and predict output values for submission purposes.

## <center>Import The Necessary Libraries & Define Data Access Variables </center>

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import cv2
import os
from tqdm import tqdm
import tensorflow as tf
import pickle

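# Competition metadata and image locations
test = pd.read_csv('../input/petfinder-pawpularity-score/test.csv')
test_images_path = '../input/petfinder-pawpularity-score/test'
# Build the file list from test['Id'] (images are stored as <Id>.jpg) so that the
# extracted features keep the same row order as the Id column used in the submission;
# os.listdir() returns files in arbitrary order and could misalign predictions with Ids.
test_images_list = [f'{image_id}.jpg' for image_id in test['Id']]
sample_submission = pd.read_csv('../input/petfinder-pawpularity-score/sample_submission.csv')
print('Total Number of Testing Images : ',len(test_images_list))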

## <center>Create Test Image Batches</center>

In [None]:
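test_images = []
for filename in tqdm(test_images_list):
    path = os.path.join(test_images_path, filename)
    image = cv2.imread(path)               # BGR uint8 array; None if the file cannot be read
    image = image / 255                    # scale pixel values to [0, 1]
    image = cv2.resize(image, (128, 128))  # match the VGG16 input_shape used below
    test_images.append(image)
test_images = np.array(test_images)        # shape: (num_images, 128, 128, 3)
print(len(test_images))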

### <center>VGG16 : Feature Extractor</center>

In [None]:
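from tensorflow.keras.applications.vgg16 import VGG16
# Load VGG16 without its classification head, using locally stored weights
model = VGG16(weights = '../input/vgg16-no-top-weights/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',include_top=False,input_shape=(128,128,3))
# Freezing has no effect on predict(); the network is used purely as a fixed feature extractor
for layer in model.layers:
    layer.trainable = False

# Extract convolutional feature maps, then flatten them to one vector per image
test_feature_extractor = model.predict(test_images)
test_features = test_feature_extractor.reshape(test_feature_extractor.shape[0],-1)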

In [None]:
print('Input to Feature Extractor Shape : ',test_images.shape)
print('Output of Feature Extractor Shape : ',test_feature_extractor.shape)
print('Input to Machine Learning Algorithm Shape : ',test_features.shape)
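
The shapes printed by this cell follow from the VGG16 architecture: the convolutional base contains five 2x2 max-pooling stages, each halving the height and width, and its final block outputs 512 channels. The sketch below reproduces that arithmetic in plain Python. The helper name `vgg16_notop_output_shape` is hypothetical and not part of Keras.

```python
def vgg16_notop_output_shape(height, width, n_pools=5, channels=512):
    """Spatial output shape of VGG16 without its top for a given input size.

    Each of the five max-pooling stages halves height and width (floor division).
    """
    for _ in range(n_pools):
        height //= 2
        width //= 2
    return height, width, channels

h, w, c = vgg16_notop_output_shape(128, 128)
flat_features = h * w * c  # length of each flattened feature vector
print((h, w, c), flat_features)  # (4, 4, 512) 8192
```

So each 128x128 image should reach the machine learning models as a vector of 8192 features.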

### <font color = 'red'>Note</font> : The cells below were executed at different times. They were commented out and the notebook was re-run before publishing, so their outputs are not shown.

## <center>Prediction</center>
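
Each model in this section is restored from a pickle file written by the training notebook. As a minimal sketch of that save-and-load round trip, the example below uses a plain dict of made-up parameters as a stand-in for a trained model and a temporary directory instead of the Kaggle input paths. Both are assumptions for illustration only.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: a plain dict of hypothetical fitted parameters
trained_model = {"coef": [0.5, -1.25], "intercept": 0.38}

with tempfile.TemporaryDirectory() as tmp_dir:
    pickle_filename = os.path.join(tmp_dir, "model_pickle.pkl")

    # Training side: serialize the object to disk in binary mode
    with open(pickle_filename, "wb") as file:
        pickle.dump(trained_model, file)

    # Inference side: restore an equal copy of the saved object
    with open(pickle_filename, "rb") as file:
        restored_model = pickle.load(file)

print(restored_model == trained_model)  # True
```

Unpickling a real estimator requires the library that defines it to be installed, ideally in the same version used for saving. Pickle files should only be loaded from trusted sources, because loading can execute arbitrary code.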

## <center>Linear Regression</center>

In [None]:
pickle_filename = '../input/k/tanmay111999/training-notebook/lr_pickle.pkl'
with open(pickle_filename, 'rb') as file:  
    lr = pickle.load(file)

In [None]:
prediction = lr.predict(test_features)
submission = pd.DataFrame()
submission['Id'] = test['Id']
submission['Pawpularity'] = prediction * 100
submission.to_csv('submission.csv',index = False)

- Some predicted values were negative
- Submission Score : 104.44
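
Pawpularity scores in this competition lie between 1 and 100, so negative predictions fall outside the valid range. One common remedy, which this notebook does not apply, is to clip predictions before writing the submission. A minimal sketch with hypothetical prediction values:

```python
import numpy as np

# Hypothetical model outputs, on the scale the notebook multiplies by 100
prediction = np.array([-0.42, 0.37, 1.85, 0.05])

# Rescale to the Pawpularity scale, then clip into the valid [1, 100] range
pawpularity = np.clip(prediction * 100, 1, 100)
print(pawpularity)
```

Clipping only repairs out-of-range values; it does not correct a model whose in-range predictions are poor.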

## <center>Support Vector Regressor</center>

In [None]:
pickle_filename = '../input/k/tanmay111999/training-notebook/svr_pickle.pkl'
with open(pickle_filename, 'rb') as file:  
    svr = pickle.load(file)

In [None]:
prediction = svr.predict(test_features)
submission = pd.DataFrame()
submission['Id'] = test['Id']
submission['Pawpularity'] = prediction * 100
submission.to_csv('submission.csv',index = False)

- Predicted values ranged from 32 to 34
- Submission Score : 20.72

## <center>XGBoost Regressor</center>

In [None]:
pickle_filename = '../input/k/tanmay111999/training-notebook/xgb_pickle.pkl'
with open(pickle_filename, 'rb') as file:  
    xgb = pickle.load(file)

In [None]:
prediction = xgb.predict(test_features)
submission = pd.DataFrame()
submission['Id'] = test['Id']
submission['Pawpularity'] = prediction * 100
submission.to_csv('submission.csv',index = False)

- Predicted values ranged from 42 to 43
- Submission Score : 20.95

## <center>LGBM Regressor</center>

In [None]:
pickle_filename = '../input/k/tanmay111999/training-notebook/lgbm_pickle.pkl'
with open(pickle_filename, 'rb') as file:  
    lgbm = pickle.load(file)

In [None]:
prediction = lgbm.predict(test_features)
submission = pd.DataFrame()
submission['Id'] = test['Id']
submission['Pawpularity'] = prediction * 100
submission.to_csv('submission.csv',index = False)

- Predicted values ranged from 37 to 38
- Submission Score : 20.48

## <center>Stack : Linear Regression, Support Vector Regressor, XGBoost Regressor, LGBM Regressor</center>

In [None]:
pickle_filename = '../input/k/tanmay111999/training-notebook/stack_pickle.pkl'
with open(pickle_filename, 'rb') as file:  
    stack = pickle.load(file)

In [None]:
prediction = stack.predict(test_features)
submission = pd.DataFrame()
submission['Id'] = test['Id']
submission['Pawpularity'] = (prediction * 100)
submission.to_csv('submission.csv',index = False)

- Predicted values ranged from 37 to 38
- Submission Score : 20.49

## <center>Conclusion</center>

1. Pickle files make it easy to reuse trained models: each model is loaded from disk and used for prediction without retraining.
2. Model Performance Table :

| Model | Submission Score (RMSE) | Predicted Value Range |
| :--- | :---: | ---: |
| LR | 104.44 | Negative values |
| SVR | 20.72 | 32 - 34 |
| XGB | 20.95 | 42 - 43 |
| LGBM | 20.48 | 37 - 38 |
| STACK | 20.49 | 37 - 38 |

3. LGBM recorded the lowest RMSE score (20.48), narrowly ahead of the stack (20.49).
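
The submission score is the root mean squared error (RMSE) between predicted and true Pawpularity values. Every model except LR predicted within a band of one or two points, so its output was close to a constant. For a constant prediction equal to the target mean, the RMSE equals the standard deviation of the targets, which may be why the four scores are so close. A sketch with hypothetical target values (not competition data):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equally shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical targets, for illustration only
y_true = np.array([10.0, 30.0, 50.0, 70.0])

# A constant prediction at the target mean scores exactly the target standard deviation
constant_pred = np.full_like(y_true, y_true.mean())
print(rmse(y_true, constant_pred), np.std(y_true))

# Moving the constant away from the mean can only increase the error
shifted_pred = np.full_like(y_true, y_true.mean() + 5.0)
print(rmse(y_true, shifted_pred))
```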

Links :
1. [Training Notebook](https://www.kaggle.com/tanmay111999/feature-extractor-ml-algo-infer)
2. [Discussion Post](https://www.kaggle.com/c/petfinder-pawpularity-score/discussion/279212)

## <center>If you like the content of the notebook, please do upvote!</center>
### <center>Feedback is appreciated!</center>
### <center>Stay Safe!</center>