# Image Preprocessing and Binary Classification with Keras

## Objective
In this week's exercise, you will:
1. Learn how to preprocess images in Keras.
2. Build and train a multilayer neural network for binary classification on a real-world dataset of cats and dogs.

---

## Step 1: Import Libraries
Let's start by importing the necessary libraries.


In [4]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
import os
import random

---

## Step 2: Load and Preprocess the Data
We will use the Keras `ImageDataGenerator` for image augmentation and preprocessing.
First, unzip the uploaded dataset.


In [2]:
!unzip -q /content/kagglecatsanddogs_5340.zip

## Step 3: Learn about undersampling and implement it
Research online what undersampling and random undersampling are. Undersampling is a widely used technique in machine learning. Find out when it is used, then undersample your dataset using random undersampling.
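As a starting point, here is a minimal sketch of random undersampling on a toy list of file names. The helper name `random_undersample`, the toy file names, and the fixed seed are illustrative assumptions, not part of the exercise. The idea is to randomly discard samples from the larger classes until every class has as many samples as the smallest one.

```python
import random
from collections import Counter

def random_undersample(samples, labels, seed=0):
    """Randomly keep as many samples per class as the smallest class has."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility

    # Group the samples by their class label
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # The smallest class determines how many samples every class keeps
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):  # sampling without replacement
            balanced.append((sample, label))

    rng.shuffle(balanced)  # avoid having all samples of one class in a row
    return balanced

# Toy data (hypothetical file names): 8 cats and 3 dogs
files = [f"cat_{i}.jpg" for i in range(8)] + [f"dog_{i}.jpg" for i in range(3)]
labels = ["Cat"] * 8 + ["Dog"] * 3

balanced = random_undersample(files, labels)
counts = Counter(label for _, label in balanced)
print(counts)
```

After undersampling, both classes contain three samples each, at the cost of discarding five cat images.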

In [5]:
# undersample your dataset here
def undersample_data(directory, class_name, target_size):
  class_dir = os.path.join(directory, class_name)  # path to the class folder
  images = os.listdir(class_dir)  # list all files in the class directory

  # number of images in the class
  print(f"Total images in {class_name}:{len(images)}")

  # if the class has fewer images than the target size, skip undersampling
  if len(images) < target_size:
    print(f"!! The class {class_name} has fewer images than {target_size}")
    return images  # no undersampling done; return all the images

  # select a random subset of images (sampling without replacement)
  undersampled_images = random.sample(images, target_size)

  # print the list of undersampled images
  print(f"Undersampled images from {class_name}")
  print(undersampled_images)

  # return the subset so the caller can use it
  return undersampled_images

#set the path to unzipped dataset
train_dir = "/content/PetImages"

undersampled_dogs = undersample_data(train_dir, "Dog", 100)
undersampled_cats = undersample_data(train_dir, "Cat", 100)

Total images in Dog:12501
Undersampled images from Dog
['9965.jpg', '4424.jpg', '10841.jpg', '2203.jpg', '2024.jpg', '5273.jpg', '7447.jpg', '8251.jpg', '4445.jpg', '7351.jpg', '5638.jpg', '11221.jpg', '278.jpg', '7001.jpg', '1554.jpg', '12345.jpg', '3448.jpg', '777.jpg', '10248.jpg', '155.jpg', '2930.jpg', '3898.jpg', '8721.jpg', '12250.jpg', '594.jpg', '4628.jpg', '11733.jpg', '11927.jpg', '8577.jpg', '6308.jpg', '7145.jpg', '11981.jpg', '1967.jpg', '4557.jpg', '7759.jpg', '9775.jpg', '5194.jpg', '6489.jpg', '2074.jpg', '4134.jpg', '7474.jpg', '11936.jpg', '10852.jpg', '1331.jpg', '3177.jpg', '3225.jpg', '11643.jpg', '8830.jpg', '7740.jpg', '10976.jpg', '6317.jpg', '8528.jpg', '10649.jpg', '5313.jpg', '3165.jpg', '3820.jpg', '8031.jpg', '4884.jpg', '6663.jpg', '4193.jpg', '8569.jpg', '8541.jpg', '6414.jpg', '9522.jpg', '6026.jpg', '7477.jpg', '3496.jpg', '9441.jpg', '4306.jpg', '9395.jpg', '7676.jpg', '6233.jpg', '4733.jpg', '2644.jpg', '888.jpg', '5321.jpg', '7320.jpg', '2630.jpg', 

---

## Step 4: Set Up ImageDataGenerator (or, more specifically, its replacement)
We're sorry: the videos from the Coursera course are sometimes not fully up to date. In this case, the `ImageDataGenerator` class is deprecated (see https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator) and is not recommended for new code. The concept behind the recommended replacement is very similar, though.
The new recommendation is to load images with `tf.keras.utils.image_dataset_from_directory` and to transform the resulting `tf.data.Dataset` with preprocessing layers.

You may use ChatGPT for this task, and you can also check the following tutorials: <br>
https://www.tensorflow.org/tutorials/load_data/images <br>
https://www.tensorflow.org/guide/keras/preprocessing_layers <br>
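
One possible shape for this step is sketched below. It is a sketch under assumptions rather than the reference solution: the helper name `make_datasets`, the 80/20 validation split, the batch size, the seed, and the choice of a horizontal flip as the only augmentation are all illustrative. The image size of 150x150 is chosen to match the `target_size` used in Step 7. With `label_mode="binary"`, class folders are indexed alphabetically, so `Cat` maps to 0 and `Dog` to 1.

```python
import os
import tensorflow as tf

DATA_DIR = "/content/PetImages"  # path used earlier in this notebook
IMG_SIZE = (150, 150)            # assumption: matches target_size in Step 7
BATCH_SIZE = 32                  # assumption

def make_datasets(data_dir, image_size=IMG_SIZE, batch_size=BATCH_SIZE, seed=123):
    """Return (train_ds, val_ds) with pixel values rescaled to [0, 1]."""

    def load(subset):
        # The same seed and split must be used for both subsets
        return tf.keras.utils.image_dataset_from_directory(
            data_dir,
            label_mode="binary",   # labels are float32 with shape (batch, 1)
            validation_split=0.2,  # assumption: hold out 20% for validation
            subset=subset,
            seed=seed,
            image_size=image_size,
            batch_size=batch_size,
        )

    train_ds = load("training")
    val_ds = load("validation")

    # Preprocessing layers replace the rescale/augmentation options
    # that ImageDataGenerator used to provide
    rescale = tf.keras.layers.Rescaling(1.0 / 255)
    flip = tf.keras.layers.RandomFlip("horizontal")

    # Augment only the training data; validation data is only rescaled
    train_ds = train_ds.map(lambda x, y: (flip(rescale(x), training=True), y))
    val_ds = val_ds.map(lambda x, y: (rescale(x), y))

    return train_ds.prefetch(tf.data.AUTOTUNE), val_ds.prefetch(tf.data.AUTOTUNE)

# Only build the datasets if the unzipped folder is actually present
if os.path.isdir(DATA_DIR):
    train_ds, val_ds = make_datasets(DATA_DIR)
```

If iterating over the dataset fails with an image decoding error, the folder may contain files that are not readable images; filtering those out before building the dataset is one way to handle it.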

In [None]:
# TODO create a dataset using the recommended methods

---

## Step 5: Build a Multilayer Neural Network
Now, let's build a multilayer neural network for binary classification.
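
For orientation, here is one way such a network could look, using the layer types imported in Step 1. The function name `build_model`, the layer sizes, the dropout rate, and the optimizer are illustrative assumptions, not prescribed values. The input shape assumes 150x150 RGB images, matching the `target_size` in Step 7. The single sigmoid output unit yields a probability that Step 7 thresholds at 0.5.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_shape=(150, 150, 3)):
    """Small convolutional network with one sigmoid output for two classes."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        # Two convolution + pooling stages extract image features
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        # Dense layers turn the features into a single probability
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),  # assumption: dropout to reduce overfitting
        layers.Dense(1, activation="sigmoid"),
    ])
    # Binary cross-entropy is the usual loss for a single sigmoid output
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_model()
model.summary()
```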


In [None]:
# TODO build a model

# TODO compile the model


---

## Step 6: Train the Model
Train the model using the dataset you created.
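
A minimal training sketch follows. The helper name `train_model`, the number of epochs, and the early-stopping settings are assumptions for illustration. The names `model`, `train_ds`, and `val_ds` in the usage comment are placeholders for the compiled model and the training and validation datasets from the previous steps.

```python
import tensorflow as tf

def train_model(model, train_ds, val_ds, epochs=10):
    """Fit a compiled Keras model and return its training history."""
    # Stop once the validation loss has not improved for 3 epochs (assumption)
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss",
        patience=3,
        restore_best_weights=True,
    )
    history = model.fit(
        train_ds,
        validation_data=val_ds,
        epochs=epochs,
        callbacks=[early_stop],
    )
    return history

# Usage, with placeholder names for your own objects:
# history = train_model(model, train_ds, val_ds)
```

The returned `history.history` dictionary holds the per-epoch loss and metric values, which can be plotted with matplotlib to check for overfitting.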


---

## Step 7: Evaluate the Model
After training, you may upload some test images to evaluate your model.


In [None]:
from tensorflow.keras.preprocessing import image
import numpy as np
from google.colab import files

def load_and_predict(model):
    uploaded_files = files.upload()

    for fn in uploaded_files.keys():
        path = '/content/' + fn
        img = image.load_img(path, target_size=(150, 150))

        x = image.img_to_array(img)
        x = np.expand_dims(x, axis=0) / 255.0

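        # model.predict returns an array of shape (1, 1) for a single image;
        # index the scalar explicitly before comparing it to the threshold
        prediction = model.predict(x)
        result = "a dog" if prediction[0][0] > 0.5 else "a cat"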

        print(f'The model predicts that the image is of {result}')

# Call the function to upload images and get predictions
load_and_predict(model)