# Training a Classification model on the extracted features

Instead of using the full dataset, we use only $10\%$ of it, 2500 images in total: 1875 for training and 625 held out for testing. We classify the extracted features with a Logistic Regression model. Transfer learning does not always require much data: in this example we obtain an accuracy of $0.9856$ on the held-out images.

## Importing Libraries


In [1]:
from config import dogs_vs_cats_config as config
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score
import h5py
import pickle
import numpy as np

## Importing the features

In [2]:
features = h5py.File(config.FEATURES, "r")

## Training the model

In [3]:
i = int(features["labels"].shape[0] * 0.1)  # size of the 10% subset
print(i)
ii = int(i*0.75)  # first 75% of the subset is used for training, the rest for testing
print(ii)

2500
1875
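
The two indices printed above define the split: rows `[:ii]` are the training set and rows `[ii:i]` are the test set. The sketch below reproduces that arithmetic on a stand-in label array. The total of 25000 rows is an assumption inferred from the 2500 printed above, and `labels` is a hypothetical random array, not the real HDF5 dataset. Slicing the head of the file this way only gives balanced classes if the rows were shuffled when the features were written, which the source does not state; `np.bincount` is a quick way to check.

```python
import numpy as np

n_total = 25000                 # assumed full dataset size (10% of it is 2500)
i = int(n_total * 0.1)          # size of the subset
ii = int(i * 0.75)              # end of the training slice

# Hypothetical stand-in for features["labels"]: random 0/1 labels
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=n_total)

train_labels = labels[:ii]      # rows 0 .. ii-1
test_labels = labels[ii:i]      # rows ii .. i-1

print(len(train_labels), len(test_labels))
# Class counts per split; strongly unequal counts would hint at unshuffled rows
print(np.bincount(train_labels), np.bincount(test_labels))
```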


In [4]:
# Hyperparameter grid explored by the grid search
params = {"C": [0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0], "solver" : ["newton-cg", "lbfgs"]}

In [5]:
model = GridSearchCV(LogisticRegression(max_iter=1000), params, cv = 5, n_jobs=1) # 5-fold cross-validation

In [6]:
# Fitting the model
model.fit(features["features"][:ii], features["labels"][:ii]) # [:ii] the training set runs from index 0 up to (but not including) ii

GridSearchCV(cv=5, estimator=LogisticRegression(max_iter=1000), n_jobs=1,
             param_grid={'C': [0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0],
                         'solver': ['newton-cg', 'lbfgs']})

In [7]:
print("Best parameters {}".format(model.best_params_))

Best parameters {'C': 0.1, 'solver': 'newton-cg'}
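
`best_params_` reports only the winning combination. To see how the other candidates compared, the fitted search also exposes `cv_results_`. The following is a minimal sketch on synthetic data from `make_classification`, with a smaller grid than the notebook's to keep it fast; the data and the name `search` are illustrative and not part of the notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the extracted features (assumption, not the real data)
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

params = {"C": [0.1, 1.0, 10.0], "solver": ["newton-cg", "lbfgs"]}
search = GridSearchCV(LogisticRegression(max_iter=1000), params, cv=5, n_jobs=1)
search.fit(X, y)

# List every candidate with its mean cross-validated accuracy, best rank first
results = search.cv_results_
order = np.argsort(results["rank_test_score"])
for idx in order:
    print(results["params"][idx], round(results["mean_test_score"][idx], 4))
```

If several candidates share the same mean score, the ranking alone does not say much about which value of `C` is preferable, so looking at the whole table can be more informative than the single best entry.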


In [8]:
predictions = model.predict(features["features"][ii:i]) # [ii:i] the test set runs from index ii up to (but not including) i

In [9]:
target_names = ['cats', 'dogs']

In [10]:
cr = classification_report(features["labels"][ii:i], predictions)

In [11]:
print(cr)

              precision    recall  f1-score   support

           0       0.98      0.99      0.99       319
           1       0.99      0.98      0.99       306

    accuracy                           0.99       625
   macro avg       0.99      0.99      0.99       625
weighted avg       0.99      0.99      0.99       625
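
The report above labels its rows `0` and `1` because `target_names` is defined but never passed to `classification_report`. A minimal sketch of passing it, using toy labels and assuming that `0` encodes cats and `1` encodes dogs (the order of the list suggests this, but the source does not confirm the encoding):

```python
import numpy as np
from sklearn.metrics import classification_report

target_names = ["cats", "dogs"]          # assumed to match labels 0 and 1

# Toy labels for illustration only, not the notebook's predictions
y_true = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

# target_names replaces the numeric row labels in the printed report
report = classification_report(y_true, y_pred, target_names=target_names)
print(report)
```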



In [12]:
acc = accuracy_score(features["labels"][ii:i], predictions)
print("[INFO] score: {}".format(acc))

[INFO] score: 0.9856
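
Accuracy is the fraction of test images classified correctly, so a score of 0.9856 on 625 test images corresponds to 616 correct predictions and 9 errors. The sketch below checks that arithmetic. The class sizes 319 and 306 come from the report above, but which samples are misclassified is made up for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score

n_test = 625
reported_acc = 0.9856
n_correct = round(reported_acc * n_test)
print(n_correct, n_test - n_correct)

# Toy labels with the same class sizes as the report (319 and 306)
y_true = np.array([0] * 319 + [1] * 306)
y_pred = y_true.copy()
y_pred[:9] = 1 - y_pred[:9]   # flip 9 predictions (hypothetical errors)

acc = accuracy_score(y_true, y_pred)
print("[INFO] score: {}".format(acc))
```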
