In this assignment we will look at a typical image based machine learning task.

## Image classification 

For this task the whole image is used to classify what's happening.

For this specific task, we will try to classify COVID-19 using a model trained on pneumonia x-rays. Please note that the literature has mostly suggested that CT scans are not an effective way of determining which disease a patient has. This exercise is for academic purposes _only_.

Steps:


1. Download the pneumonia data.  

You can find it here:

https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

Move the folder to this directory and unzip it. Please don't change any folder names, or the script below will not work. Also make sure the folder is in the same directory as this notebook!

2. Load the pneumonia image paths and labels into lists:

In [1]:
import glob

def load_training_data():
    paths = [
        "chest_xray/train/NORMAL/*",
        "chest_xray/train/PNEUMONIA/*"
    ]
    labels = []
    image_paths = []
    for path in paths:
        for im_path in glob.glob(path):
            if path == "chest_xray/train/NORMAL/*":
                labels.append("NORMAL")
            if path == "chest_xray/train/PNEUMONIA/*":
                labels.append("PNEUMONIA")
            image_paths.append(im_path)
    return image_paths, labels

def load_testing_data():
    paths = [
        "chest_xray/test/NORMAL/*",
        "chest_xray/test/PNEUMONIA/*"
    ]
    labels = []
    image_paths = []
    for path in paths:
        for im_path in glob.glob(path):
            if path == "chest_xray/test/NORMAL/*":
                labels.append("NORMAL")
            if path == "chest_xray/test/PNEUMONIA/*":
                labels.append("PNEUMONIA")
            image_paths.append(im_path)
    return image_paths, labels

train_paths, train_labels = load_training_data()
test_paths, test_labels = load_testing_data()
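The two loaders above differ only in the split name, so they could be folded into one function that derives each label from its folder name. The following is a minimal sketch, not part of the original assignment code; `load_split` is a hypothetical helper name, and it assumes the `chest_xray/<split>/<class>/` layout described in step 1.

```python
from pathlib import Path

def load_split(root, split, classes=("NORMAL", "PNEUMONIA")):
    """Return (image_paths, labels) for one split, e.g. 'train' or 'test'."""
    image_paths, labels = [], []
    for class_name in classes:
        class_dir = Path(root) / split / class_name
        # sorted() makes the order reproducible; glob order depends on the OS
        for im_path in sorted(class_dir.glob("*")):
            if im_path.is_file():
                image_paths.append(str(im_path))
                labels.append(class_name)
    return image_paths, labels
```

Under those assumptions, `load_split("chest_xray", "train")` would return the same kind of `(paths, labels)` pair as `load_training_data()`.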

3. Read the data into memory. I recommend OpenCV for this:

`python -m pip install opencv-python` 

if you don't already have it!

In [2]:
import cv2
def load_images(image_paths):
    images = []
    for image_path in image_paths:
        img = cv2.imread(image_path, cv2.IMREAD_COLOR)
        if img is not None:
            images.append(img)
    return images

train_images = load_images(train_paths)
test_images = load_images(test_paths)

4. Resize the images to a standard size.

Note: the target size ought to be square, so the width and height should be the same.

In [3]:
def resize_images(images):
    imgs = []
    for img in images:
        img = cv2.resize(img,(64,64))
        imgs.append(img)
    return imgs

train_images = resize_images(train_images)
test_images = resize_images(test_images)
train_images[0].shape

(64, 64, 3)
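Resizing a non-square x-ray straight to 64x64 stretches it, which changes the aspect ratio. One alternative is to pad the image to a square first and resize afterwards. The sketch below uses NumPy only; `pad_to_square` is a hypothetical helper that is not part of the notebook, and zero padding is just one possible choice.

```python
import numpy as np

def pad_to_square(img):
    """Pad an H x W x C image with zeros so that height equals width."""
    height, width = img.shape[:2]
    size = max(height, width)
    pad_h, pad_w = size - height, size - width
    # Split the padding as evenly as possible between the two sides of each axis
    padding = ((pad_h // 2, pad_h - pad_h // 2),
               (pad_w // 2, pad_w - pad_w // 2),
               (0, 0))
    return np.pad(img, padding, mode="constant", constant_values=0)

demo = np.ones((100, 160, 3), dtype=np.uint8)
print(pad_to_square(demo).shape)
```

The padded result could then be passed to `cv2.resize` as in `resize_images`. Whether this helps the classifier here has not been tested.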

5. Greyscale the images

In [4]:
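def greyscale_images(images):
    gray_imgs = []
    for img in images:
        # The conversion is disabled, so images pass through unchanged:
        # the VGG16 model below is built with a 3-channel input_shape.
        # gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        gray = img
        gray_imgs.append(gray)
    return gray_imgs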

train_images = greyscale_images(train_images)
test_images = greyscale_images(test_images)

6. Prepare the data for training the model.

For this you'll need to transform the train and test image lists into numpy arrays.

In [5]:
import numpy as np

def features_to_np_array(images):
    return np.array(images)

train_images = features_to_np_array(train_images)
test_images = features_to_np_array(test_images)

Next you'll need to do the same for the labels:

Note: You'll need to apply the `to_categorical` function after transforming to a numpy array

In [6]:
from tensorflow.keras.utils import to_categorical

LABELS = {'NORMAL': 0, 'PNEUMONIA': 1}

def labels_to_np_array(labels):
    Y = np.array([LABELS[l] for l in labels])
    return to_categorical(Y, len(LABELS))

train_labels = labels_to_np_array(train_labels)
test_labels = labels_to_np_array(test_labels)
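To make the encoding concrete, here is a NumPy-only sketch of what the cell above produces: each label is mapped to an integer and then to a one-hot row. `one_hot` is a hypothetical stand-in for illustration, not a replacement for `to_categorical`.

```python
import numpy as np

LABELS = {"NORMAL": 0, "PNEUMONIA": 1}

def one_hot(labels, mapping=LABELS):
    """Map string labels to integers, then to one-hot rows."""
    indices = np.array([mapping[label] for label in labels])
    return np.eye(len(mapping), dtype="float32")[indices]

encoded = one_hot(["NORMAL", "PNEUMONIA", "PNEUMONIA"])
print(encoded)
# argmax recovers the integer classes, as the evaluation cells do later
print(encoded.argmax(axis=1))
```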



7. Separate the training data into training and validation sets with `train_test_split` from scikit-learn

In [7]:
# train test split code goes here
from sklearn.model_selection import train_test_split

X_train, X_val, y_train, y_val=train_test_split(train_images, train_labels, test_size=0.23, random_state=42)

8. Make the last four layers of VGG16 with imagenet weights trainable and then retrain the model.

To understand how to do this, please see the following tutorial:

https://www.learnopencv.com/keras-tutorial-fine-tuning-using-pre-trained-models/
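One thing worth checking before fine-tuning is how the inputs are scaled. The ImageNet weights of VGG16 were trained on mean-centered BGR images, whereas the arrays built above hold raw 0 to 255 values. The sketch below subtracts the per-channel ImageNet means with NumPy. It assumes the images are still in the BGR order returned by `cv2.imread`; `center_bgr_images` is a hypothetical helper, and whether it changes the results in this notebook has not been tested.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by the Caffe-style
# preprocessing associated with the original VGG16 weights.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype="float32")

def center_bgr_images(images):
    """Subtract the ImageNet channel means from a batch of BGR images."""
    images = np.asarray(images, dtype="float32")
    # Broadcasting applies the three means along the last (channel) axis
    return images - IMAGENET_BGR_MEANS

batch = np.full((2, 64, 64, 3), 128, dtype=np.uint8)
centered = center_bgr_images(batch)
print(centered.shape, centered.dtype)
```

The same transform would have to be applied to the training, validation and test arrays alike.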

In [8]:
from keras.applications import VGG16
from keras import models
from keras import layers
from keras import optimizers
EPOCHS = 5
#Load the VGG model
vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))

# Freeze the layers except the last 4 layers
for layer in vgg_conv.layers[:-4]:
    layer.trainable = False

# Create the model
model = models.Sequential()

# Add the vgg convolutional base model
model.add(vgg_conv)

# Add new layers
model.add(layers.Flatten())
model.add(layers.Dense(1024, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(2, activation='softmax'))

# Show a summary of the model. Check the number of trainable parameters
#model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=EPOCHS, validation_data=(X_val, y_val), verbose=1)
    

Train on 4016 samples, validate on 1200 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.callbacks.History at 0x45448ab00>

8. Check your score with classification_report from scikit-learn

Now that you've trained your model, call `model.predict` to get the predicted values for classification.  
Then compare your predicted values with the true test labels.

In [23]:
loss , accuracy = model.evaluate(test_images , test_labels , batch_size = 32)
print('Test accuracy: {:2.2f}%'.format(accuracy*100))

Test accuracy: 62.50%


In [24]:
from sklearn.metrics import classification_report

# classification report code goes here.
y_pred = model.predict(test_images)
y_pred = np.argmax(y_pred, axis=1)
print(classification_report(test_labels.argmax(axis=1), y_pred, target_names=LABELS.keys()))


              precision    recall  f1-score   support

      NORMAL       0.00      0.00      0.00       234
   PNEUMONIA       0.62      1.00      0.77       390

    accuracy                           0.62       624
   macro avg       0.31      0.50      0.38       624
weighted avg       0.39      0.62      0.48       624
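The numbers in this report are what you get when a model predicts PNEUMONIA for every test image. The sketch below recomputes the PNEUMONIA row and the accuracy from the two support counts alone, as a sanity check on how to read the table.

```python
# Support counts taken from the report: 234 NORMAL and 390 PNEUMONIA images.
# Assumption being checked: every image is predicted as PNEUMONIA.
n_normal, n_pneumonia = 234, 390

true_positives = n_pneumonia   # pneumonia images predicted PNEUMONIA
false_positives = n_normal     # normal images predicted PNEUMONIA
false_negatives = 0            # pneumonia images predicted NORMAL

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_positives / (n_normal + n_pneumonia)

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} accuracy={accuracy:.3f}")
```

The NORMAL row is all zeros for the same reason: no image is ever assigned to that class.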





9. Data augmentation

Now that you have a classifier, let's see if data augmentation improves things!  

You can use the `ImageDataGenerator` that comes with keras.  Here's how to import it:

`from tensorflow.keras.preprocessing.image import ImageDataGenerator`

Here's the documentation: https://keras.io/preprocessing/image/

Here's an example of it getting used in the wild, in case you get stuck:

https://www.pyimagesearch.com/2020/03/16/detecting-covid-19-in-x-ray-images-with-keras-tensorflow-and-deep-learning/

In [25]:
# augment your data here
from tensorflow.keras.preprocessing.image import ImageDataGenerator
image_data_generator = ImageDataGenerator(rotation_range=20, fill_mode="nearest")
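Conceptually, an augmenting generator yields shuffled batches and applies a random transform to each image on the fly, so the model rarely sees exactly the same input twice. The NumPy sketch below illustrates that idea with random horizontal flips rather than the rotations configured above. It is a simplified stand-in, not how `ImageDataGenerator` is implemented, and `augmented_batches` is a hypothetical name.

```python
import numpy as np

def augmented_batches(images, labels, batch_size, rng):
    """Yield shuffled (images, labels) batches forever, flipping images at random."""
    n = len(images)
    while True:
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch = images[idx].copy()
            # Flip roughly half of the batch left to right (axis 2 is width)
            flip = rng.random(len(idx)) < 0.5
            batch[flip] = batch[flip][:, :, ::-1]
            yield batch, labels[idx]

rng = np.random.default_rng(0)
X = np.arange(10 * 4 * 4 * 3, dtype="float32").reshape(10, 4, 4, 3)
y = np.eye(2, dtype="float32")[np.arange(10) % 2]
batch_X, batch_y = next(augmented_batches(X, y, batch_size=8, rng=rng))
print(batch_X.shape, batch_y.shape)
```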

10. retrain your classifier

Now that you have augmented training data, please retrain your classifier.  The code should basically be the same.

In [26]:
INIT_LR = 1e-3
BS = 8
# new training code goes here
H = model.fit_generator(image_data_generator.flow(X_train, y_train, batch_size=BS),
                        steps_per_epoch=len(X_train) // BS,
                        validation_data=(X_val, y_val),
                        validation_steps=len(X_val) // BS,
                        epochs=EPOCHS)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


11. re-evaluate your classifier

Now that you've augmented the data, please re-evaluate your classifier.  Use classification report like before.

In [28]:
loss , accuracy = model.evaluate(test_images , test_labels , batch_size = 32)
print('Test accuracy: {:2.2f}%'.format(accuracy*100))

Test accuracy: 62.50%


In [27]:
# classification report goes here
y_pred = model.predict(test_images)
y_pred = np.argmax(y_pred, axis=1)
print(classification_report(test_labels.argmax(axis=1), y_pred, target_names=LABELS.keys()))


              precision    recall  f1-score   support

      NORMAL       0.00      0.00      0.00       234
   PNEUMONIA       0.62      1.00      0.77       390

    accuracy                           0.62       624
   macro avg       0.31      0.50      0.38       624
weighted avg       0.39      0.62      0.48       624





12. Evaluate the difference with data augmentation and without:

Did things improve?  Did they stay the same?  Did they get worse?  Please try to come up with an explanation of why you got the results you did.

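No improvement: the results are identical. In both runs the model predicts PNEUMONIA for every test image (recall for NORMAL is 0.00), so augmentation did not change its behaviour. This may be because of overfitting.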

### Explanation of results go here

13. Getting COVID19 data

Now that you have a trained classifier with pneumonia, we are going to use this with COVID data.  

Clone this repo:

https://github.com/ieee8023/covid-chestxray-dataset

Use the clone command `git clone [REPO]` to get the data locally.

Make sure to run this command in the same folder as this jupyter notebook.

14. Read the data into memory

The set up for this data repository is a little different.  Please use the following code to read the data into memory:

In [37]:
import pandas as pd

def get_covid19():
    base = "covid-chestxray-dataset/"
    metadata = pd.read_csv(base+"metadata.csv")
    labels = []
    image_paths = []
    for index, row in metadata.iterrows():
        labels.append(row["finding"])
        # The image files live in the repo's images/ subfolder
        image_paths.append(base+"images/"+row["filename"])
    return labels, image_paths

labels, covid_image_paths = get_covid19()

15. Preprocess images

You'll need to run the following functions on this data:

1. load_images
2. resize_images
3. greyscale_images
4. features_to_np_array
5. labels_to_np_array

Make sure to run each of those functions in order!

In [38]:
# add your function calls to covid_image_paths here
covid_images = load_images(covid_image_paths)
covid_images = resize_images(covid_images)
covid_images = greyscale_images(covid_images)

16. Strip out labels other than 'No Finding' and 'COVID-19' from the dataset

There are two straightforward ways to do this:

1) use a for-loop and keep track of indices

2) read labels and features into a dataframe and then filter to those two label types.  Your choice!

In [42]:
# label reduction code goes here
filtered_labels = []
for l in labels:
    if l == 'COVID-19' or l == 'No Finding':
        filtered_labels.append(l)

labels = filtered_labels
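Note that the loop above filters only the labels, so they no longer line up one-to-one with the images. A minimal sketch of the first option, filtering labels and paths together, is shown below. `filter_by_label` is a hypothetical helper and the demo values are made up. It would be applied to `labels` and `covid_image_paths` before loading the images.

```python
def filter_by_label(labels, image_paths, keep=("COVID-19", "No Finding")):
    """Keep only the (label, path) pairs whose label is in `keep`."""
    pairs = [(label, path) for label, path in zip(labels, image_paths)
             if label in keep]
    kept_labels = [label for label, _ in pairs]
    kept_paths = [path for _, path in pairs]
    return kept_labels, kept_paths

# Made-up example values, for illustration only
demo_labels = ["COVID-19", "SARS", "No Finding", "Pneumocystis"]
demo_paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
print(filter_by_label(demo_labels, demo_paths))
```

Keep in mind that `load_images` silently skips files it cannot read, which can also break the alignment between images and labels.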

17. Predict on the new images

Here you'll use the classifier you trained on just pneumonia/not pneumonia to try and classify COVID-19 and no finding.  You'll use the pneumonia/not pneumonia classifier as a featurizer to do this.

Much of the code has been written, you'll just need to supply your trained classifier as input.

Please predict the labels from the classifier.  Then run `classification_report` to see how well your classifier did.

In [70]:
#prediction code goes here
import cv2
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
import glob
import code

def extract_features_covid(model, width, height):
    base = "covid-chestxray-dataset/"
    metadata = pd.read_csv(base+"metadata.csv")
    labels = []
    feature_list = []
    for index, row in metadata.iterrows():
        if row["finding"] == "COVID-19":
            im_path = base+"images/"+row["filename"]
            im = cv2.imread(im_path, cv2.IMREAD_COLOR)
            if im is None:
                # Skip unreadable files before recording a label, so that
                # labels and features stay aligned
                continue
            labels.append("COVID")
            # cv2.resize takes (width, height); the result has shape (height, width, 3)
            im = cv2.resize(im, (width, height))
            features = model.predict(im.reshape(1,height,width,3))
            features_np = np.array(features)
            feature_list.append(features_np.flatten())

    return np.array(feature_list), labels

def extract_features_not_covid(model, width, height):
    feature_list = []
    labels = []
    # chest_xray/test/PNEUMONIA/* is deliberately left out: it has no label
    # below, and extracting features for it without a label would shift
    # every later feature row relative to its label.
    labels_by_path = {
        "chest_xray/test/NORMAL/*": "CLEAR TEST",
        "chest_xray/train/NORMAL/*": "CLEAR TRAIN",
        "chest_xray/train/PNEUMONIA/*": "PNEUMONIA",
    }
    for path, label in labels_by_path.items():
        for im_path in glob.glob(path):
            im = cv2.imread(im_path, cv2.IMREAD_COLOR)
            if im is None:
                # Skip unreadable files so labels and features stay aligned
                continue
            labels.append(label)
            im = cv2.resize(im, (width, height))
            features = model.predict(im.reshape(1,height,width,3))
            features_np = np.array(features)
            feature_list.append(features_np.flatten())

    return np.array(feature_list), labels

# please make a copy of your tuned model and save it to variable:
untuned_model = model

# please specify the width and height you used for the image preprocessing
width =64
height = 64

covid_features, covid_labels = extract_features_covid(untuned_model, width, height)
non_covid_features, non_covid_labels = extract_features_not_covid(untuned_model, width, height)
features =  np.concatenate((covid_features, non_covid_features), axis=0)
labels =  np.concatenate((covid_labels, non_covid_labels), axis=0)
X_train = []
y_train = []
X_test = []
y_test = []
for index, label in enumerate(labels):
    if label == "CLEAR TRAIN":
        X_train.append(features[index])
        y_train.append(0)
    if label == "PNEUMONIA":
        X_train.append(features[index])
        y_train.append(1)
    if label == "COVID":
        X_test.append(features[index])
        y_test.append(1)
    if label == "CLEAR TEST":
        X_test.append(features[index])
        y_test.append(0)

logit_clf = LogisticRegression()
logit_clf.fit(X_train, y_train)
y_pred = logit_clf.predict(X_test)
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.00      0.00      0.00       234
           1       0.53      1.00      0.69       262

    accuracy                           0.53       496
   macro avg       0.26      0.50      0.35       496
weighted avg       0.28      0.53      0.37       496





18. Compare and contrast how the classifier did on Pneumonia versus COVID-19

Did it do as well?  Worse?  About the same?  What conclusions can you draw?


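Answer: It gets worse here. Accuracy falls from 0.62 on the pneumonia test set to 0.53 on the COVID-19 task, and the classifier again predicts the positive class for every image.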

### Add your answers here!

Now that we've looked at a bunch of base cases, let's see if we can improve things by changing the model architecture. We'll do this in a series of discrete steps.

1. Change the number of trainable layers

Here you will make more of the layers trainable. For this we are going to use cross validation to figure out the optimal number of trainable layers. Please vary it from the last 6 layers down to the last layer only. So your range should be:

```
trainable_range = [-6, -5, -4, -3, -2, -1]
```
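To see what each value in this range actually unfreezes, the sketch below slices a plain list of the VGG16 convolutional-base layer names the same way the training code slices `vgg_conv.layers`. The list is written out by hand for illustration, and the name of the input layer in particular may differ between Keras versions.

```python
# Layer names of VGG16 without the top classifier (include_top=False).
# "input" is a placeholder; the real input layer name varies by Keras version.
vgg16_layers = (
    ["input"]
    + [f"block1_conv{i}" for i in (1, 2)] + ["block1_pool"]
    + [f"block2_conv{i}" for i in (1, 2)] + ["block2_pool"]
    + [f"block3_conv{i}" for i in (1, 2, 3)] + ["block3_pool"]
    + [f"block4_conv{i}" for i in (1, 2, 3)] + ["block4_pool"]
    + [f"block5_conv{i}" for i in (1, 2, 3)] + ["block5_pool"]
)

trainable_range = [-6, -5, -4, -3, -2, -1]
for k in trainable_range:
    frozen = vgg16_layers[:k]      # these would get layer.trainable = False
    trainable = vgg16_layers[k:]   # these would stay trainable
    print(k, len(frozen), trainable)
```

Pooling layers have no weights, so with `-1` the convolutional base is effectively fully frozen, and `-4` and `-5` train the same set of weights.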

Also, your X and y data should be the pneumonia data only, since that's what we trained on. We should not assume we have access to the COVID data, except for testing, which we will do later on.

Here's a blog post detailing how to set this up: https://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/

Note that you'll need to set the number of trainable layers inside of `create_model` in order to make this tunable.

Please report mean and standard deviation for accuracy.
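`GridSearchCV` exposes these two statistics as `mean_test_score` and `std_test_score` in `cv_results_`. The sketch below shows what they summarize, using made-up per-fold accuracies for a single hyperparameter setting.

```python
import numpy as np

# Hypothetical per-fold accuracies for one hyperparameter setting (cv=3)
fold_scores = np.array([0.74, 0.26, 0.23])

mean_score = fold_scores.mean()
# Population standard deviation (ddof=0), which is what np.std computes by default
std_score = fold_scores.std()
print(f"{mean_score:.3f} ({std_score:.3f})")
```

A standard deviation that is large relative to the mean, as in this made-up example, suggests the folds disagree strongly and that the mean alone should be read with caution.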

In [72]:
# Use scikit-learn to grid search the number of trainable layers
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier

# Cross validation code goes here
def create_model(trainable_layer_range=-6):
    vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))

    # Freeze all layers except the last |trainable_layer_range| layers
    for layer in vgg_conv.layers[:trainable_layer_range]:
        layer.trainable = False

    # Create the model
    model = models.Sequential()

    # Add the vgg convolutional base model
    model.add(vgg_conv)

    # Add new layers
    model.add(layers.Flatten())
    model.add(layers.Dense(1024, activation='relu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(2, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model


kerasCl_model = KerasClassifier(build_fn=create_model, verbose=1)
trainable_range = [-6, -5, -4, -3, -2, -1]

param_grid = dict(trainable_layer_range=trainable_range)
grid = GridSearchCV(estimator=kerasCl_model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(train_images, train_labels)


# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
    



Epoch 1/1
Best: 0.685445 using {'trainable_layer_range': -1}
0.076289 (0.107889) with: {'trainable_layer_range': -6}
0.409622 (0.427788) with: {'trainable_layer_range': -5}
0.409622 (0.427788) with: {'trainable_layer_range': -4}
0.409622 (0.427788) with: {'trainable_layer_range': -3}
0.409622 (0.427788) with: {'trainable_layer_range': -2}
0.685445 (0.325541) with: {'trainable_layer_range': -1}


2. Analyze your results

Do you think that changing the number of tunable layers matters?  Does it improve classification accuracy enough to warrant changing the number of tunable layers?


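Analysis: Yes, slightly. Making only the last layer trainable (-1) gave the best mean accuracy (0.685), but the standard deviations are large (0.11 to 0.43), so the improvement is not clear-cut.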

### Analysis and explanation go here

2. Tune over a layer activation function

Please set the number of tunable layers to 4 again.

Now we are going to make the layer activation tunable.  

To do this, please change the `create_model` function so that each layer has its own tunable activation function.  Then run your new cross validation code.

In [73]:
# Cross validation code goes here
def create_model(activation='relu'):
    vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))

    # Freeze the layers except the last 4 layers
    for layer in vgg_conv.layers[:-4]:
        layer.trainable = False

    # Create the model
    model = models.Sequential()

    # Add the vgg convolutional base model
    model.add(vgg_conv)

    # Add new layers
    model.add(layers.Flatten())
    model.add(layers.Dense(1024, activation=activation))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(2, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model


activation = ['softmax', 'relu', 'tanh', 'sigmoid', 'linear']
param_grid = dict(activation=activation)

kerasCl_model = KerasClassifier(build_fn=create_model, verbose=1)
grid = GridSearchCV(estimator=kerasCl_model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(train_images, train_labels)


# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))



Epoch 1/1
Best: 0.742956 using {'activation': 'softmax'}
0.742956 (0.363516) with: {'activation': 'softmax'}
0.742956 (0.363516) with: {'activation': 'relu'}
0.409622 (0.427788) with: {'activation': 'tanh'}
0.742956 (0.363516) with: {'activation': 'sigmoid'}
0.742956 (0.363516) with: {'activation': 'linear'}


3. Analyze your results

Does your choice of activation function matter?  When does the activation function perform best?  

Things to consider:

* Specifically does choosing the same activation function for all of the layers do best? 
* Does choosing different activation functions for each of the layers do best?
* Are they all within the same approximate accuracy range?
* do things vary wildly?

Analysis: 'softmax', 'relu', 'sigmoid' and 'linear' all give the same mean accuracy here (0.743). Only 'tanh' does worse (0.410).

### Analysis and explanation go here

4. Tune over more hyperparameters

Now that we've tuned the activation functions, let's try tuning more parameters.  This time add tuning for the following parameters:

* number of neurons per layer
* weight initialization
* optimizer
* weight constraint
* activation function
* learning rate

Here is a great post on the range of values you should consider: https://www.wandb.com/articles/fundamentals-of-neural-networks

Here is some code that is also useful: https://www.kaggle.com/lavanyashukla01/training-a-neural-network-start-here

for understanding this practically.
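The cost of a grid search grows multiplicatively with every parameter added, so it helps to count the fits before launching one. The sketch below enumerates a small grid with `itertools.product`. The activation, optimizer and neuron values mirror the ones used in the next cell, while the dropout values are an assumption made for illustration.

```python
from itertools import product

param_grid = {
    "activation": ["relu", "sigmoid"],
    "optimizer": ["RMSprop", "Adam"],
    "neurons": [512, 1024],
    "dropout": [0.2, 0.3],  # assumed values, for illustration only
}
cv_folds = 3

# Every combination of one value per parameter
combinations = [dict(zip(param_grid, values))
                for values in product(*param_grid.values())]
total_fits = len(combinations) * cv_folds
print(len(combinations), "combinations,", total_fits, "model fits")
```

Each fit here means training a VGG16-based model from its initial weights, so adding even one more parameter with three values triples the run time.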

In [76]:
# cross validation code goes here
def create_model(activation='relu', dropout=0.5, optimizer='adam', neurons=1):
    vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))

    # Freeze the layers except the last 4 layers
    for layer in vgg_conv.layers[:-4]:
        layer.trainable = False

    # Create the model
    model = models.Sequential()

    # Add the vgg convolutional base model
    model.add(vgg_conv)

    # Add new layers
    model.add(layers.Flatten())
    model.add(layers.Dense(neurons, activation=activation))
    model.add(layers.Dropout(dropout))
    model.add(layers.Dense(2, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model


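activation = ['relu',  'sigmoid']
optimizer = ['RMSprop', 'Adam']
neurons = [512, 1024]
# `dropout` was not defined in this cell. 0.2 and 0.3 are the values visible in
# the recorded output below, which is truncated, so the original list may have been longer.
dropout = [0.2, 0.3]
param_grid = dict(activation=activation, optimizer=optimizer, dropout=dropout, neurons=neurons)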

kerasCl_model = KerasClassifier(build_fn=create_model, verbose=1)

grid = GridSearchCV(estimator=kerasCl_model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(train_images, train_labels)


# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))



Epoch 1/1
Best: 0.742956 using {'activation': 'relu', 'dropout': 0.2, 'neurons': 512, 'optimizer': 'Adam'}
0.076289 (0.107889) with: {'activation': 'relu', 'dropout': 0.2, 'neurons': 512, 'optimizer': 'RMSprop'}
0.742956 (0.363516) with: {'activation': 'relu', 'dropout': 0.2, 'neurons': 512, 'optimizer': 'Adam'}
0.742956 (0.363516) with: {'activation': 'relu', 'dropout': 0.2, 'neurons': 1024, 'optimizer': 'RMSprop'}
0.409622 (0.427788) with: {'activation': 'relu', 'dropout': 0.2, 'neurons': 1024, 'optimizer': 'Adam'}
0.742956 (0.363516) with: {'activation': 'relu', 'dropout': 0.3, 'neurons': 512, 'optimizer': 'RMSprop'}
0.409622 (0.427788) with: {'activation': 'relu', 'dropout': 0.3, 'neurons': 512, 'optimizer': 'Adam'}
0.742956 (0.363516) with: {'activation': 'relu', 'dropout': 0.3, 'neurons': 1024, 'optimizer': 'RMSprop'}
0.742956 (0.363516) with: {'activation': 'relu', 'dropout': 0.3, 'neurons': 1024, 'optimizer': 'Adam'}
0.742956 (0.363516) with: {'activation': 'relu', 'dropout': 0

5. Tune over data augmentation

Here you'll take the best hyperparameters from your neural network, with 4 trainable layers, and then add them to a pipeline.  We will then tune over data augmentation parameters.  Report out your mean and standard deviation of accuracy.

Here we will create a scikit-learn pipeline:

https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html

If you need an example with gridsearch and pipeline:

https://scikit-learn.org/stable/tutorial/statistical_inference/putting_together.html

As a reminder, here is the documentation for data augmentation:

https://keras.io/preprocessing/image/

In [None]:
#cross validation code goes here

6. Analyze your results

Now that you've tuned over model parameters and preprocessing, what has a bigger impact?  Why do you think that might be the case?

### Analysis and explanation goes here

7. Using your best model and preprocessing to train a new model

Now you should select the best hyperparameters for the neural network and the best hyperparameters for the preprocessor and then combine them into a scikit-learn pipeline.  Next train a classifier with these new tuned hyperparameters.

In [86]:
optimized_model = grid_result.best_estimator_.model
optimized_model.summary()

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
vgg16 (Model)                (None, 2, 2, 512)         14714688  
_________________________________________________________________
flatten_6 (Flatten)          (None, 2048)              0         
_________________________________________________________________
dense_11 (Dense)             (None, 512)               1049088   
_________________________________________________________________
dropout_6 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_12 (Dense)             (None, 2)                 1026      
Total params: 15,764,802
Trainable params: 8,129,538
Non-trainable params: 7,635,264
_________________________________________________________________


In [87]:
# classifier generation code goes here
X_train, X_val, y_train, y_val=train_test_split(train_images, train_labels, test_size=0.23, random_state=42)
optimized_model.fit(X_train, y_train, epochs=EPOCHS, validation_data=(X_val, y_val), verbose=1)

Train on 4016 samples, validate on 1200 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.callbacks.History at 0x4857f6f28>

8. Let's see if things improved - time for `classification_report`

Now that you've tuned your model, let's see how well it does on our test set!  First call predict on the test data to get a prediction.  Then use `classification_report` to see how well the model does.

In [89]:
# prediction code goes here
y_pred = optimized_model.predict(test_images)
y_pred = np.argmax(y_pred, axis=1)
print(classification_report(test_labels.argmax(axis=1), y_pred, target_names=LABELS.keys()))

              precision    recall  f1-score   support

      NORMAL       0.00      0.00      0.00       234
   PNEUMONIA       0.62      1.00      0.77       390

    accuracy                           0.62       624
   macro avg       0.31      0.50      0.38       624
weighted avg       0.39      0.62      0.48       624





9. Analysis and comparison

Now that you've seen how well your classifier does when it's been tuned, compare this with your previous model, that was untuned.  Are the precision, recall and f1-scores substantially different?  Why or why not?

### Analysis and explanation goes here

10. Prediction on COVID binary classification task with tuned model

Now you'll use your tuned classifier to try and predict on the binary COVID19 case.  Please change the model to your tuned model!

In [90]:
#prediction code goes here

# please make a copy of your tuned model and save it to variable:
tuned_model = optimized_model

# please specify the width and height you used for the image preprocessing
width =64
height = 64

covid_features, covid_labels = extract_features_covid(tuned_model, width, height)
non_covid_features, non_covid_labels = extract_features_not_covid(tuned_model, width, height)
features =  np.concatenate((covid_features, non_covid_features), axis=0)
labels =  np.concatenate((covid_labels, non_covid_labels), axis=0)
X_train = []
y_train = []
X_test = []
y_test = []
for index, label in enumerate(labels):
    if label == "CLEAR TRAIN":
        X_train.append(features[index])
        y_train.append(0)
    if label == "PNEUMONIA":
        X_train.append(features[index])
        y_train.append(1)
    if label == "COVID":
        X_test.append(features[index])
        y_test.append(1)
    if label == "CLEAR TEST":
        X_test.append(features[index])
        y_test.append(0)

logit_clf = LogisticRegression()
logit_clf.fit(X_train, y_train)
y_pred = logit_clf.predict(X_test)
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.00      0.00      0.00       234
           1       0.53      1.00      0.69       262

    accuracy                           0.53       496
   macro avg       0.26      0.50      0.35       496
weighted avg       0.28      0.53      0.37       496





11. Analyze your results

Now that you've seen the results of your tuned model, compare those with the results of the untuned model.  Did things get better? Worse?  Why do you think this may or may not be the case?

### Analysis of your results goes here

There is no improvement over the untuned model: the two reports are identical. It looks like both models are suffering from overfitting.