<a href="https://colab.research.google.com/github/Applied-Machine-Learning-2022/project-5-jcar0-uark/blob/ahmed/colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#### Copyright 2020 Google LLC.

In [None]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Image Classification Project

In this project we will build an image classification model and use the model to identify if the lungs pictured indicate that the patient has pneumonia. The outcome of the model will be true or false for each image.

The [data is hosted on Kaggle](https://www.kaggle.com/rob717/pneumonia-dataset) and consists of 5,863 x-ray images. Each image is classified as 'pneumonia' or 'normal'.

## Ethical Considerations

We will frame the problem as:

> *A hospital is having issues correctly diagnosing patients with pneumonia. Their current solution is to have two trained technicians examine every patient scan. Unfortunately, there are many times when two technicians are not available, and the scans have to wait for multiple days to be interpreted.*
>
> *They hope to fix this issue by creating a model that can identify if a patient has pneumonia. They will have one technician and the model both examine the scans and make a prediction. If the two agree, then the diagnosis is accepted. If the two disagree, then a second technician is brought in to provide their analysis and break the tie.*

Discuss some of the ethical considerations of building and using this model. 

* Consider potential bias in the data that we have been provided. 
* Should this model err toward precision or accuracy?
* What are the implications of massively over-classifying patients as having pneumonia?
* What are the implications of massively under-classifying patients as having pneumonia?
* Are there any concerns with having only one technician make the initial call?

The questions above are prompts. Feel free to bring in other considerations you might have.

### **Student Solution**

Consider potential bias in the data that we have been provided.

> Although the dataset contains 5,863 x-ray samples, it may still be biased. Many features of the patients and their demographics are withheld, so we cannot confirm that the data is varied in race, gender, or age, and a model trained on unvaried data could carry that bias into its predictions. Other potential sources of bias are the quality of the images and the accuracy of the labels used for training.


Should this model err toward precision or accuracy?

> Precision and accuracy are both important when building and training a model. However, accuracy is vital to the goal of this model. Since the objective in this scenario is to reduce the number of trained technicians needed for a diagnosis from two to one, the model needs to pass the validation test. A diagnosis is complete only when the model agrees with the one trained technician. A model could be precise and still be incorrect more often than it is correct.
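
To make the difference between these scores concrete, here is a minimal sketch that computes accuracy, precision, and recall from confusion-matrix counts. The function name and the counts are hypothetical, chosen only to illustrate an imbalanced set of scans; they are not results from this project.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts: 100 scans, 10 of which truly show pneumonia.
# The model flags only 2 scans, both correctly, and misses the other 8.
accuracy, precision, recall = classification_metrics(tp=2, fp=0, tn=90, fn=8)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case the model scores well on accuracy and precision while missing most of the sick patients, which is why recall is worth reporting alongside the other two.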

What are the implications of massively over-classifying patients as having pneumonia?

> Over-classifying or under-classifying patients can always become a problem. In this scenario, over-classifying could lead to the model disagreeing with a professional's opinion, therefore forcing the patient to go through additional examinations because of the inconsistent diagnosis. Over-classification would also increase patient congestion, making it harder for patients who are actually sick to schedule examinations.

What are the implications of massively under-classifying patients as having pneumonia?

> Under-classifying patients would produce false negatives: sick patients would be classified as healthy, which could be fatal if they are left untreated or prescribed the wrong medicine. It would also mean that doctors would have to personally determine whether each patient is positive or negative for pneumonia, which would waste time. This would lead to overcrowded hospitals and an increase in the mortality rate.

Are there any concerns with having only one technician make the initial call?

> The problem with having only one technician make the initial call is that the call could be made for the wrong reasons if the technician does not consult with others. This could lead to resources being sent where they are not needed when they could have been used elsewhere. That is why it is important for technicians to review each other's calls.

---

## Modeling

In this section of the lab, you will build, train, test, and validate a model or models. The data is the ["Detecting Pneumonia" dataset](https://www.kaggle.com/rob717/pneumonia-dataset). You will build a binary classifier that determines if an x-ray image has pneumonia or not.

You'll need to:

* Download the dataset
* Perform EDA on the dataset
* Build a model that can classify the data
* Train the model using the training portion of the dataset. (It is already split out.)
* Test at least three different models or model configurations using the testing portion of the dataset. This step can include changing model types, adding and removing layers or nodes from a neural network, or any other parameter tuning that you find potentially useful. Score the model (using accuracy, precision, recall, F1, or some other relevant score(s)) for each configuration.
* After finding the "best" model and parameters, use the validation portion of the dataset to perform one final sanity check by scoring the model once more with the hold-out data.
* If you train a neural network (or other model that you can get epoch-per-epoch performance), graph that performance over each epoch.

Explain your work!

> *Note: You'll likely want to [enable GPU in this lab](https://colab.research.google.com/notebooks/gpu.ipynb) if it is not already enabled.*

If you get to a working solution you're happy with and want another challenge, you'll find pre-trained models on the [landing page of the dataset](https://www.kaggle.com/paultimothymooney/detecting-pneumonia-in-x-ray-images). Try to load one of those and see how it compares to your best model.

Use as many text and code cells as you need to for your solution.

### **Student Solution**

#### Download the dataset


In [None]:
# Imports

import matplotlib.pyplot as plt
import tensorflow as tf
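# Import from tensorflow.keras so every layer and utility comes from the same Keras build
from tensorflow.keras.preprocessing.image import ImageDataGenerator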
import numpy as np
from sklearn.linear_model import LogisticRegression
import os

In [None]:
# Download the dataset
! chmod 600 kaggle.json && (ls ~/.kaggle 2>/dev/null || mkdir ~/.kaggle) && cp kaggle.json ~/.kaggle/ && echo 'Done'
! kaggle datasets download paultimothymooney/chest-xray-pneumonia
! unzip chest-xray-pneumonia.zip

In [None]:
# Explore some of the images in both classes

train_images = tf.keras.preprocessing.image_dataset_from_directory(
    'chest_xray/train/',
    batch_size = 32
)

classes = train_images.class_names
plt.figure(figsize = (10, 10))
for images, labels in train_images.take(1):
    for i in range(12):
        plt.subplot(3, 4, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(classes[labels[i]])
        plt.axis(False)
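
Looking at a handful of images does not reveal how balanced the two classes are. A minimal sketch for counting the images per class, assuming the `chest_xray/train/` layout used above with one subdirectory per class; the helper name `count_images_per_class` is hypothetical, not part of any library:

```python
import os


def count_images_per_class(split_dir):
    """Count image files in each class subdirectory of a dataset split."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        counts[class_name] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(('.jpeg', '.jpg', '.png'))
        )
    return counts


train_dir = 'chest_xray/train/'
if os.path.isdir(train_dir):  # only runs once the dataset has been unzipped
    print(count_images_per_class(train_dir))
```

If one class turns out to be much larger than the other, accuracy alone can be misleading when the models are scored.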

#### Train and Test our Models

In [None]:
train_datagen = ImageDataGenerator(rescale = 1./255, shear_range = 0.2, zoom_range = 0.2, horizontal_flip = True)

test_val_datagen = ImageDataGenerator(rescale = 1./255)  #Image normalization.

# load and iterate Training dataset
train_it = train_datagen.flow_from_directory('chest_xray/train/', class_mode='binary', batch_size=32, shuffle = True, target_size=(64, 64))

# load and iterate Testing dataset
test_it = test_val_datagen.flow_from_directory('chest_xray/test/', class_mode='binary', batch_size=32, target_size=(64, 64))

# load and iterate Validation dataset
val_it = test_val_datagen.flow_from_directory('chest_xray/val/', class_mode='binary', batch_size=32, target_size=(64, 64))

##### Model 1

In [None]:
CNN_model1 = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, padding='same', activation='sigmoid',
                           input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, padding='same', activation='sigmoid'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='sigmoid'),
    # A single output unit needs sigmoid; softmax over one unit always returns 1.0
    tf.keras.layers.Dense(1, activation='sigmoid')
])

CNN_model1.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Training
history1 = CNN_model1.fit(
    train_it,
    steps_per_epoch = 163,
    epochs = 5,
)

# Scoring
print("Model 1 Testing:")
CNN_model1.evaluate(test_it)
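
One detail of the output layer is worth illustrating: for a single-unit binary output, softmax and sigmoid behave very differently. Softmax normalizes across the units of a layer, so with only one unit it returns 1.0 for every input, while sigmoid maps the logit to a probability between 0 and 1. A minimal pure-Python sketch (the helper functions are illustrative, not Keras APIs):

```python
import math


def softmax(logits):
    """Softmax over a list of logits, shifted by the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic sigmoid of a single logit."""
    return 1.0 / (1.0 + math.exp(-z))


# With one output unit, softmax ignores the logit entirely
for logit in (-3.0, 0.0, 3.0):
    print(logit, softmax([logit])[0], round(sigmoid(logit), 3))
```

A model whose final layer always outputs 1.0 predicts the positive class for every image, so its test accuracy simply reflects the share of positive images in the test set.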

In [None]:
# Plot Accuracy & Loss across Epochs

plt.plot(history1.history['accuracy'])
plt.plot(history1.history['loss'])
plt.title('Training Accuracy & Loss')
plt.xlabel('Epoch')
plt.legend(['Accuracy', 'Loss'], loc='upper left')
plt.show()

##### Model 2

In [None]:
CNN_model2 =  tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(pool_size = (2, 2)),
    tf.keras.layers.MaxPooling2D(pool_size = (2, 2)),
    tf.keras.layers.Conv2D(32, (3, 3), activation="sigmoid"),
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size = (2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(activation = 'relu', units = 128),
    tf.keras.layers.Dense(activation = 'sigmoid', units = 1)
])

CNN_model2.compile(optimizer = 'adam', 
               loss = 'binary_crossentropy', 
               metrics = ['accuracy'])

# Training
history2 = CNN_model2.fit(
    train_it,
    steps_per_epoch = 163, 
    epochs = 10, 
)

# Scoring 
print("Model 2 Testing:")
CNN_model2.evaluate(test_it)

In [None]:
# Plot Accuracy & Loss across Epochs

plt.plot(history2.history['accuracy'])
plt.plot(history2.history['loss'])
plt.title('Training Accuracy & Loss')
plt.xlabel('Epoch')
plt.legend(['Accuracy', 'Loss'], loc='upper left')
plt.show()

##### Model 3

In [None]:
# Logistic Regression Model

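# Fetch one batch from each iterator exactly once. Indexing train_it[0]
# repeatedly re-applies the random augmentation on every access.
# Note: this trains and scores on a single batch of 32 images only.
train_batch_images, y_train = train_it[0]
test_batch_images, y_test = test_it[0]

# Flatten each 64x64x3 image into one feature vector per row
X_train = train_batch_images.reshape(len(train_batch_images), -1)
X_test = test_batch_images.reshape(len(test_batch_images), -1)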

# Training
model3 = LogisticRegression(random_state=0, max_iter=1000).fit(X_train, y_train)

# Scoring
model3.score(X_test,y_test)
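
The scores above are accuracies only. The sketch below shows one way to derive precision, recall, and F1 from true labels and predicted probabilities, for example the flattened output of a model's `predict` call. The helper `precision_recall_f1` and the sample values are hypothetical and are not results from these models.

```python
def precision_recall_f1(y_true, y_prob, threshold=0.5):
    """Threshold probabilities into 0/1 predictions, then score them."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up labels (1 = pneumonia) and predicted probabilities
sample_true = [1, 1, 1, 0, 0, 1]
sample_prob = [0.9, 0.4, 0.8, 0.7, 0.2, 0.6]

print(precision_recall_f1(sample_true, sample_prob))
print(precision_recall_f1(sample_true, sample_prob, threshold=0.3))
```

Lowering the threshold trades precision for recall, which ties back to the over- and under-classification questions in the ethics section.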

#### Validate our best Model

In [None]:
# Use our validation data to check our model

CNN_model2.evaluate(val_it)

---