**What is a computer vision problem?**
* Binary classification problem
* Multi-class classification
* Object detection


# Introduction to convolutional neural networks and computer vision with TensorFlow

Computer vision is the practice of writing algorithms that can discover patterns in visual data, such as the camera of a self-driving car recognizing the car in front.

## Get the data 

The images we're working with are from the Food-101 dataset (101 different classes of food): https://www.kaggle.com/datasets/dansbecker/food-101

However, we've modified it to use only 2 classes (pizza and steak) using the image data modification notebook (check Daniel's GitHub).

>**Note:** we start with a smaller dataset so we can experiment quickly and figure out what works (or better yet, what doesn't work) before scaling up.

In [1]:
import zipfile

#download the zip file (run in Google Colab)
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip

#unzip downloaded file

zip_ref = zipfile.ZipFile("pizza_steak.zip")
zip_ref.extractall()
zip_ref.close()

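`wget` is a shell command and is not available on every system (a default Windows command prompt, for example). As a hedged alternative, the sketch below downloads and unzips an archive using only the Python standard library. The helper name `download_and_extract` and its `extract_to` parameter are made up for this illustration and are not part of the original notebook.

```python
import tempfile
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url, extract_to="."):
    """Download a zip archive from `url` and unpack it into `extract_to`."""
    extract_to = Path(extract_to)
    extract_to.mkdir(parents=True, exist_ok=True)
    # Download into a temporary directory so the zip file is cleaned up afterwards
    with tempfile.TemporaryDirectory() as tmp_dir:
        zip_path = Path(tmp_dir) / "download.zip"
        urllib.request.urlretrieve(url, zip_path)
        with zipfile.ZipFile(zip_path) as zip_ref:
            zip_ref.extractall(extract_to)
    return extract_to


# Usage with the dataset URL from the cell above (needs internet access):
# download_and_extract("https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip")
```

The usage line is left commented out so the block can be pasted anywhere without triggering a download.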

## Inspect the data (become one with it)

A very crucial step at the beginning of any machine learning project is becoming one with the data.

And for a computer vision project, this usually means visualizing many samples of your data.


In [None]:
!ls pizza_steak


In [None]:
!ls pizza_steak/train/

In [None]:
!ls pizza_steak/train/steak

In [None]:
import os

#walkthrough the pizza_steak directory and list number of files

for dirpath, dirnames, filenames in os.walk("pizza_steak"):
  print(f"there are {len(dirnames)} directories and {len(filenames)} images in '{dirpath}'.")

In [None]:
# Another way to find out how many images are in a directory

num_steak_images_train = len(os.listdir("pizza_steak/train/steak"))

num_steak_images_train
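To get the count for every class in one go, the idea above can be wrapped in a small helper. This is a minimal sketch: `count_images_per_class` is a hypothetical name, and the demo runs on a throwaway directory tree with made-up file names rather than the real dataset.

```python
import tempfile
from pathlib import Path


def count_images_per_class(split_dir):
    """Return {class_name: number_of_files} for each subdirectory of `split_dir`."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():  # skips stray files such as .DS_Store
            counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts


# Demo on a tiny throwaway directory tree (made-up files, not the real dataset)
with tempfile.TemporaryDirectory() as tmp:
    for name, n_files in [("pizza", 3), ("steak", 2)]:
        class_dir = Path(tmp) / "train" / name
        class_dir.mkdir(parents=True)
        for i in range(n_files):
            (class_dir / f"{i}.jpg").touch()
    (Path(tmp) / "train" / ".DS_Store").touch()  # stray file that should be ignored
    demo_counts = count_images_per_class(Path(tmp) / "train")

print(demo_counts)
```

With the real data extracted, `count_images_per_class("pizza_steak/train")` would be the equivalent call.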

To visualize our images, first let's get the class names programmatically

In [None]:
#get the classnames programmatically

import pathlib
import numpy as np

data_dir = pathlib.Path("pizza_steak/train")
# Create a list of class names from the subdirectories
# (keeping only directories filters out stray files such as .DS_Store)
class_names = np.array(sorted([item.name for item in data_dir.glob("*") if item.is_dir()]))
print(class_names)

In [None]:
#let's visualize our images
import matplotlib.pyplot as plt
import matplotlib.image as mping
import random

def view_random_image(target_dir, target_class):
  #setup the target directory (we'll view images from here)
  target_folder = target_dir +"/"+ target_class

  #get a random image path
  random_image = random.sample(os.listdir(target_folder),1)
  print(random_image)
  #read in the image and plot it using matplotlib

  img = mping.imread(target_folder + "/" + random_image[0])
  plt.imshow(img)
  plt.title(target_class)
  plt.axis("off")

  print(f"image shape: {img.shape}") #show the shape of the image
  return img

In [None]:
#view random image from the training dataset
img = view_random_image(target_dir = "pizza_steak/train",
                        target_class = "pizza")

In [None]:
import tensorflow as tf
tf.constant(img)

In [None]:
#view the image shape
img.shape # returns (height, width, colour channels)

In [None]:
#get all the pixel values between 0 and 1 (8-bit pixel values range from 0 to 255)
img/255.
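The divisor matters: 8-bit images store each colour value as an integer from 0 to 255, so dividing by 255 maps the range onto 0 to 1. A minimal plain-Python sketch of the same scaling (the pixel values are made up, and `scale_pixels` is a hypothetical helper):

```python
def scale_pixels(pixels, max_value=255):
    """Scale integer pixel values in [0, max_value] to floats in [0, 1]."""
    return [p / max_value for p in pixels]


sample = [0, 51, 255]  # made-up pixel values: darkest, in-between, brightest
scaled = scale_pixels(sample)
print(scaled)
```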

## An end-to-end example

Let's build a convolutional neural network to find patterns in our images. More specifically, we need a way to:
* Load our images
* Preprocess our images
* Build a CNN to find patterns in our images
* Compile our CNN
* Fit our CNN to our training data

In [None]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

#set the seed
tf.random.set_seed(42)

#preprocess data (get all of the pixel values between 0 and 1, also called scaling/normalization)
train_datagen = ImageDataGenerator(rescale = 1./255)
valid_datagen = ImageDataGenerator(rescale = 1./255)

#set up paths to our data directories
train_dir = "pizza_steak/train"
test_dir = "pizza_steak/test"

#import data from directories and turn it into batches
train_data = train_datagen.flow_from_directory(directory = train_dir,
                                               batch_size=32,
                                               target_size = (224,224),
                                               class_mode = "binary",
                                               seed=42)
test_data = valid_datagen.flow_from_directory(directory = test_dir,
                                              batch_size = 32,
                                              target_size = (224,224),
                                              class_mode = "binary",
                                              seed = 42)

print(test_data)
#Build a CNN model(same as tiny VGG on the CNN explainer website)
model_1 = tf.keras.Sequential([
    tf.keras.layers.Conv2D(filters = 10,
                           kernel_size = 3,
                           activation = "relu",
                           input_shape=(224,224,3)),
    tf.keras.layers.Conv2D(10,3,activation="relu"),
    tf.keras.layers.MaxPool2D(pool_size=2,
                             padding ="valid"),
    tf.keras.layers.Conv2D(10, 3, activation="relu"),
    tf.keras.layers.Conv2D(10, 3, activation = "relu"),
    tf.keras.layers.MaxPool2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid")

])


#Compile our model

model_1.compile(loss = "binary_crossentropy",
                optimizer = tf.keras.optimizers.legacy.Adam(),
                metrics = ["accuracy"])

#fit the model
history_1 = model_1.fit(train_data,
                        epochs = 5,
                        steps_per_epoch=len(train_data),
                        validation_data=test_data,
                        validation_steps = len(test_data))

>**Note:** if the above cell is taking longer than ~10 seconds per epoch, make sure you're using a GPU by going to Runtime -> Change runtime type -> Hardware accelerator -> GPU (you may have to rerun some cells above after doing this).

In [None]:
#Get a model summary
model_1.summary()

## Using the same model as before

Let's replicate the model we built in the previous section to see if it works with our image data.

In [None]:
#set random seed
tf.random.set_seed(42)

#Create a model to replicate the TensorFlow Playground model

model_2 = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(224,224,3)),
    tf.keras.layers.Dense(4, activation ="relu"),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid")
])


#compile the model
model_2.compile(loss=tf.keras.losses.BinaryCrossentropy(),
                optimizer = tf.keras.optimizers.Adam(),
                metrics=["accuracy"])

#fit the model
history_2 = model_2.fit(train_data,
                        epochs = 5,
                        steps_per_epoch = len(train_data),
                        validation_data = test_data,
                        validation_steps = len(test_data))

In [None]:
#get a summary of model_2
model_2.summary()

Despite having about 20x more parameters than our CNN (model_1), model_2 performs terribly. Let's try to improve it.

In [None]:
#update the model above

#set random seed
tf.random.set_seed(42)

#create the model
model_3=tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(224,224,3)),
    tf.keras.layers.Dense(100, activation = "relu"),
    tf.keras.layers.Dense(100, activation = "relu"),
    tf.keras.layers.Dense(100, activation = "relu"),
    tf.keras.layers.Dense(1, activation = "sigmoid")
])


#compile the model
model_3.compile(loss = tf.keras.losses.BinaryCrossentropy(),
                optimizer = tf.keras.optimizers.legacy.Adam(),
                metrics=["accuracy"])


#fit the model
model_3.fit(train_data,
            epochs = 5,
            steps_per_epoch = len(train_data),
            validation_data = test_data,
            validation_steps = len(test_data))

In [None]:
#get a summary of model_3

model_3.summary()

**Note:** you can think of trainable parameters as **patterns a model can learn from data**. Intuitively, you might think more is better, and in lots of cases it is. But in this case, the difference is the 2 different styles of model we're using. A series of dense layers has many learnable parameters connected to each other and hence a higher number of possible learnable patterns, whereas **a convolutional neural network seeks to sort out and learn the most important patterns in an image**. So even though there are fewer learnable parameters in our convolutional neural network, they are often more helpful in distinguishing between different **features** in an image.
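The parameter counts reported by `model.summary()` can be reproduced by hand, which makes the comparison concrete. The sketch below uses the standard counting rules (weights plus one bias per unit or filter) and the layer definitions of `model_1`, `model_2` and `model_3` above; the helper names are made up for this illustration.

```python
def conv2d_params(kernel_size, in_channels, filters):
    # each filter has kernel_size * kernel_size * in_channels weights plus 1 bias
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(in_features, units):
    # each unit has one weight per input feature plus 1 bias
    return (in_features + 1) * units


# model_1: Conv, Conv, MaxPool, Conv, Conv, MaxPool, Flatten, Dense(1)
size, channels, cnn_params = 224, 3, 0
for layer in ["conv", "conv", "pool", "conv", "conv", "pool"]:
    if layer == "conv":
        cnn_params += conv2d_params(3, channels, 10)
        size, channels = size - 2, 10  # a 3x3 "valid" convolution trims 2 pixels per side length
    else:
        size //= 2  # 2x2 max pooling halves each side and has no parameters
cnn_params += dense_params(size * size * channels, 1)

flat = 224 * 224 * 3  # number of values after Flatten on the raw image
model_2_params = dense_params(flat, 4) + dense_params(4, 4) + dense_params(4, 1)
model_3_params = (dense_params(flat, 100) + dense_params(100, 100)
                  + dense_params(100, 100) + dense_params(100, 1))

print(cnn_params, model_2_params, model_3_params)
print(round(model_2_params / cnn_params, 1), round(model_3_params / cnn_params, 1))
```

Almost all of the dense models' parameters sit in the first layer, because every one of the 150,528 input values is connected to every unit.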

## Binary classification: let's break it down

1. Become one with the data (visualize, visualize, visualize)
2. Preprocess the data (prepare it for our model; the main steps here are scaling/normalizing and turning our data into batches)
3. Create a model (start with a baseline)
4. Fit the model
5. Evaluate the model
6. Adjust different parameters and improve the model (try to beat our baseline)
7. Repeat until satisfied (experiment, experiment, experiment)


### 1. become one with the data

In [None]:
#visualize the data
plt.figure()
plt.subplot(1,2,1)
steak_img = view_random_image("pizza_steak/train", "steak")
plt.subplot(1,2,2)
pizza_img = view_random_image("pizza_steak/train", "pizza")

### 2. preprocess the data (prepare it for the model)

In [None]:
#Define directory data paths
train_dir = "pizza_steak/train/"
test_dir = "pizza_steak/test/"



Next step is to turn our data into **batches**.


A batch is a small subset of our data. Rather than looking at all the data at one time, a model might only look at 32 images at a time.

It does this for a couple of reasons:
1. 10,000 images (or more) might not fit into the memory of your processor (GPU).
2. Trying to learn the patterns in 10,000 images in one hit could result in the model not being able to learn very well.

Why 32?

because 32 is good for your health...
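To make the idea of batching concrete, here is a minimal sketch in plain Python. It only illustrates the slicing and is not how `ImageDataGenerator` is implemented; the helper names and the stand-in file names are made up.

```python
import math


def num_batches(num_samples, batch_size=32):
    """Number of batches needed to cover every sample once (last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


def iter_batches(items, batch_size=32):
    """Yield consecutive slices of `items` with at most `batch_size` elements each."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


filenames = [f"image_{i}.jpg" for i in range(100)]  # stand-in file names
batches = list(iter_batches(filenames))
batch_sizes = [len(batch) for batch in batches]
print(num_batches(len(filenames)), batch_sizes)
```

The last batch is smaller than 32 whenever the dataset size is not a multiple of the batch size, which is why the number of batches is rounded up.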

In [None]:
# Create train and test data generator and rescale the data
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1/255.)
test_datagen = ImageDataGenerator(rescale = 1/255.)


In [None]:

#path to our data
train_dir = "pizza_steak/train"
test_dir = "pizza_steak/test"

#generate the data
train_data = train_datagen.flow_from_directory(directory = train_dir, #target directory of image
                                               target_size = (224,224), #target size of image (height, width)
                                               class_mode = "binary", #type of data you're working with
                                               batch_size = 32, # size of mini batches to load data into
                                               seed = 42
                                               )
test_data = test_datagen.flow_from_directory(directory = test_dir,
                                             target_size = (224,224),
                                             class_mode = "binary",
                                             batch_size = 32,
                                             seed = 42)

In [None]:
#get a sample of a training data batch
images, labels = next(train_data) # get the "next" batch of images/labels in the train data
len(images), len(labels)

In [None]:
#how many batches are there
len(train_data)

In [None]:
# get the first 2 images
images[:2], images[0].shape

In [None]:
images[7].shape

In [None]:
# view the first batch of labels
labels

### 3. Create a CNN model (start with a baseline)

A baseline is a relatively simple model or existing result that you set up when beginning a machine learning experiment. Then, as you keep experimenting, you try to beat the baseline.

> **Note:** In deep learning, there is an almost infinite number of architectures you could create. So one of the best ways to get started is to start with something simple and see if it works on your data, then introduce complexity as required (e.g. look at which current model is performing best in the field of your problem).

In [None]:
#make the creating of our model a little easier
from tensorflow.keras.optimizers.legacy import Adam
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D, Activation
from tensorflow.keras import Sequential

In [None]:
#create the model
model_4 = Sequential([
    Conv2D(filters = 10, #the number of sliding windows (filters) going across the input; more filters => more complex model
           kernel_size = (3,3), #the size of the sliding window going across an input
           strides =(1,1), #the size of the step the sliding window takes across an input
           padding = "valid", #if "same" => output shape is the same as the input shape, if "valid" => output shape gets compressed
           activation = "relu",
           input_shape = (224,224,3)), #input layer (specify input shape)
    Conv2D(10,3,activation = "relu"),
    Conv2D(10,3, activation = "relu"),
    Flatten(),
    Dense(1 , activation = "sigmoid") # output layer (working with binary classification so only 1 output neuron)

    
])
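The effect of `kernel_size`, `strides` and `padding` on the output shape can be worked out with a little arithmetic. The sketch below uses the standard formulas for `"valid"` and `"same"` padding; `conv_output_size` is a hypothetical helper, not a Keras function.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial dimension of a convolution."""
    if padding == "same":
        # input is padded so that only the stride shrinks the output
        return math.ceil(input_size / stride)
    if padding == "valid":
        # no padding: the window must fit entirely inside the input
        return (input_size - kernel_size) // stride + 1
    raise ValueError(f"unknown padding: {padding!r}")


# model_4 stacks three 3x3 "valid" convolutions with stride 1 on a 224x224 input
size = 224
for _ in range(3):
    size = conv_output_size(size, kernel_size=3, stride=1, padding="valid")

flattened = size * size * 10  # 10 filters in the last Conv2D layer
print(size, flattened)
```

Each 3x3 `"valid"` convolution trims 2 pixels from the height and width, so the `Flatten` layer in `model_4` receives a very large vector.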

### 3. Compile the model

In [None]:
#compile the model
model_4.compile(loss = "binary_crossentropy",
                optimizer = Adam(),
                metrics=["accuracy"])

In [None]:
#get a summary of our model
model_4.summary()

### 4. Fit the model

In [None]:
#check length of training and testing data generator
len(train_data), len(test_data)

In [None]:
#fit the model
history_4 = model_4.fit(train_data, #this is a combination of labels and sample data
                        epochs = 5,
                        steps_per_epoch = len(train_data),
                        validation_data = test_data,
                        validation_steps = len(test_data))

In [None]:
model_1.evaluate(test_data)

In [None]:
model_4.evaluate(test_data)

### 5. Evaluating our model

It looks like our model is learning something, so let's evaluate it.

In [None]:
#let's plot the loss curves
import pandas as pd
pd.DataFrame(history_4.history).plot(figsize = (10,7))

In [None]:
#plot the validation and training curves separately
def plot_loss_curves(history):
  """
  Plots separate loss and accuracy curves for training and validation metrics.
  """
  
  loss = history.history["loss"]
  val_loss = history.history["val_loss"]

  accuracy = history.history["accuracy"]
  val_accuracy = history.history["val_accuracy"]

  epochs = range(len(history.history["loss"])) #how many epochs did we run for

  #plot loss
  plt.plot(epochs, loss, label="training_loss")
  plt.plot(epochs, val_loss, label="val_loss")
  plt.title("loss")
  plt.xlabel("epochs")
  plt.legend()


  plt.figure()
  #plot accuracy
  plt.plot(epochs, accuracy, label="training_accuracy")
  plt.plot(epochs, val_accuracy, label="val_accuracy")
  plt.title("accuracy")
  plt.xlabel("epochs")
  plt.legend()

> **Note:** when a model's **validation loss starts to increase**, it's likely that the model is **overfitting** the training dataset. This means, it's learning the patterns in the training dataset *too well* and thus the model's ability to generalize to unseen data will be diminished
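The same check can be made numerically instead of by eye. This is a rough sketch that assumes a list of per-epoch validation losses; the numbers below are invented for illustration and are not results from this notebook, and the helper names are hypothetical.

```python
def best_epoch(val_losses):
    """Index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__)


def epochs_since_best(val_losses):
    """How many epochs have passed without improving on the best validation loss."""
    return len(val_losses) - 1 - best_epoch(val_losses)


# Invented values: validation loss falls, then starts to rise again
example_val_loss = [0.60, 0.45, 0.40, 0.43, 0.50]

print(best_epoch(example_val_loss), epochs_since_best(example_val_loss))
```

With a real training run, `history_4.history["val_loss"]` would be passed in instead of the invented list. Several epochs without improvement is a hint, not proof, that the model has started to overfit.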

In [None]:
#check out the loss and accuracy
plot_loss_curves(history_4)

### 6. Adjust the model parameters

Fitting a machine learning model comes in 3 steps:

0. Create a baseline
1. Beat the baseline by overfitting a larger model
2. Reduce overfitting


Ways to induce overfitting:
* Increase the number of conv layers
* Increase the number of conv filters
* Add another dense layer to the output of our flattened layer

Ways to reduce overfitting:
* Add data augmentation
* Add regularization layers (such as `MaxPool2D`)
* Add more data...

>**note:** reducing overfitting is also known as **regularization**
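To see what a layer like `MaxPool2D` does, here is a minimal plain-Python sketch of 2x2 max pooling on a small grid. It is an illustration of the operation only, with a made-up feature map and a hypothetical helper name.

```python
def max_pool_2d(grid, pool_size=2):
    """Non-overlapping max pooling on a 2D list (stride equal to pool_size)."""
    rows = len(grid) // pool_size
    cols = len(grid[0]) // pool_size
    return [
        [
            max(
                grid[r * pool_size + i][c * pool_size + j]
                for i in range(pool_size)
                for j in range(pool_size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


feature_map = [  # made-up 4x4 feature map
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool_2d(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

Only the largest value in each 2x2 window survives, so each side of the feature map is halved and later layers see a quarter of the values.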

In [None]:
#Create the model (this is going to be our new baseline)

model_5 = Sequential([
    Conv2D(10,3,activation = "relu", input_shape=(224,224,3)),
    MaxPool2D(pool_size=2),
    Conv2D(10,3,activation="relu"),
    MaxPool2D(),
    Conv2D(10,3,activation="relu"),
    MaxPool2D(),
    Flatten(),
    Dense(1, activation="sigmoid")
])

In [None]:
#Compile the model
model_5.compile(loss="binary_crossentropy",
                optimizer = Adam(),
                metrics=["accuracy"])

history_5 = model_5.fit(train_data,
                        epochs = 5,
                        steps_per_epoch=len(train_data),
                        validation_data = test_data,
                        validation_steps = len(test_data))

In [None]:
#get a summary of our model with max pooling
model_5.summary()

In [None]:
#plot loss curves
plot_loss_curves(history_5)

### opening our bag of tricks and finding data augmentation

In [None]:
#Create ImageDataGenerator training instance with data augmentation

train_datagen_augmented = ImageDataGenerator(rescale=1/255.,
                                             rotation_range=20, #rotate an image by up to 20 degrees (this value is in degrees, not a fraction)
                                             shear_range=0.2, # how much do you want to shear an image
                                             zoom_range=0.2, #zoom in randomly an image
                                             width_shift_range=0.2, #move random in x-axis
                                             height_shift_range = 0.2, #move random in y-axis
                                             horizontal_flip=True) #do you want to flip an image
#Create ImageDataGenerator without data augmentation
train_datagen = ImageDataGenerator(rescale = 1/255.)

#create ImageDataGenerator for test data
test_datagen = ImageDataGenerator(rescale=1/255.)


>**Question:** what is data augmentation?

Data augmentation is the process of altering our training data so that it has more diversity, which in turn allows our models to learn (hopefully) more generalizable patterns. Altering might mean adjusting the rotation of an image, flipping it, cropping it or something similar.
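As a toy illustration of two such alterations, the sketch below flips and shifts a tiny "image" stored as a nested list. It is not how `ImageDataGenerator` works internally; the helper names and pixel values are made up.

```python
def horizontal_flip(image):
    """Mirror each row of a 2D image left to right."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Shift a 2D image to the right, filling the vacated columns with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


tiny_image = [  # made-up 2x3 "image"
    [1, 2, 3],
    [4, 5, 6],
]
flipped = horizontal_flip(tiny_image)
shifted = shift_right(tiny_image, 1)
print(flipped, shifted)
```

The label of the image does not change under either operation, which is what lets augmented copies act as extra training examples.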

Let's write some code to visualize data augmentation.

In [None]:
IMG_SIZE = (224,224)


#import data and augment it from training directory
print("augmented training data")

train_data_augmented = train_datagen_augmented.flow_from_directory(train_dir,
                                                                   target_size= IMG_SIZE,
                                                                   batch_size = 32,
                                                                   class_mode="binary",
                                                                   shuffle=False) #for demonstration purposes only

#create non-augmented train data batches
print("non-augmented train data")
train_data = train_datagen.flow_from_directory(train_dir,
                                               target_size=IMG_SIZE,
                                               batch_size=32,
                                               class_mode="binary",
                                               shuffle=False)

#create non-augmented test data batches
print("non-augmented test data")
test_data = test_datagen.flow_from_directory(test_dir,
                                             target_size=IMG_SIZE,
                                             batch_size=32,
                                             class_mode="binary"
                                             )



**Note:** Data augmentation is usually only performed on the training data. Using the `ImageDataGenerator` built-in data augmentation parameters, our images are left as they are in the directories but are modified as they're loaded into the model.


Finally... let's visualize some augmented data!!!

In [None]:
#Get sample augmented data batches
images, labels = next(train_data)
augmented_images, augmented_labels = next(train_data_augmented) #note: labels aren't augmented, only the images

In [None]:
#show the original image and the augmented image
import random
random_number = random.randint(0, len(images) - 1) #randint includes both endpoints, so stop at the last valid index
print(f"showing image number:{random_number}")
plt.imshow(images[random_number])

plt.title(f"original image")
plt.axis(False)
plt.figure()

plt.imshow(augmented_images[random_number])
plt.title(f"augmented image")
plt.axis(False)


Now we've seen what augmented training data looks like, let's build a model and see how it learns on augmented data

In [None]:
# Create a model (same as model 5)

model_6 = Sequential ([
    Conv2D(10,3,activation = "relu"),
    MaxPool2D(pool_size=2),
    Conv2D(10,3,activation = "relu"),
    MaxPool2D(),
    Conv2D(10,3,activation = "relu"),
    MaxPool2D(),
    Flatten(),
    Dense(1, activation="sigmoid")
])

#Compile the model
model_6.compile(loss = "binary_crossentropy",
                optimizer = tf.keras.optimizers.Adam(),
                metrics=["accuracy"])

#Fit the model
history_6=model_6.fit(train_data_augmented, #fitting model_6 on augmented training data
                      epochs = 5,
                      steps_per_epoch = len(train_data_augmented),
                      validation_data = test_data,
                      validation_steps = len(test_data))

In [None]:
#Check our model training curves
plot_loss_curves(history_6)

Let's shuffle our augmented training data and train another model (the same as before) on it and see what happens

In [None]:
#re-import the training data, augment it and shuffle it
train_data_augmented_shuffled = train_datagen_augmented.flow_from_directory(train_dir,
                                                                            target_size = IMG_SIZE,
                                                                            batch_size = 32,
                                                                            class_mode = "binary",
                                                                            seed=42)

In [None]:
#set random seed
tf.random.set_seed(42)

#Create a model
model_7 = Sequential([
    Conv2D(10,3,activation="relu",input_shape = (224,224,3)),
    MaxPool2D(pool_size = 2),
    Conv2D(10,3,activation="relu"),
    MaxPool2D(),
    Conv2D(10,3,activation="relu"),
    MaxPool2D(),
    Flatten(),
    Dense(1, activation = "sigmoid")

])

#compile the model
model_7.compile(loss = tf.keras.losses.BinaryCrossentropy(),
                optimizer = tf.keras.optimizers.legacy.Adam(),
                metrics=["accuracy"])


#fit the model
history_7 = model_7.fit(train_data_augmented_shuffled, #we're fitting on augmented data and shuffle
            epochs= 5,
            steps_per_epoch=len(train_data_augmented_shuffled),
            validation_data = test_data,
            validation_steps = len(test_data)
            )

In [None]:
#summary of model 7
model_7.summary()

In [None]:
plot_loss_curves(history_7)

### 7. repeat until satisfied

Since we've already beaten our baseline, there are a few things we could try to continue improving our model:
* Increase the number of model layers (e.g. add more `Conv2D`/`MaxPool2D` layers)
* Increase the number of filters in each convolutional layer (e.g. from 10 to 32 or even 64)
* Train for longer(more epochs)
* Find an ideal learning rate
* Get more data (give the model more opportunities to learn)
* Use **transfer learning** to leverage what another image model has learnt and adjust it for our own use case

> **practice:** Recreate the model on CNN explainer website (same as model_1) and see how it performs on the augmented shuffled training data

In [None]:
#for practice
#create the model
model_p = Sequential ([
    Conv2D(10,3,activation = "relu", input_shape = (224,224,3)),
    Conv2D(10,3,activation= "relu"),
    MaxPool2D(),
    Conv2D(10,3,activation = "relu"),
    Conv2D(10,3,activation = "relu"),
    MaxPool2D(),
    Flatten(),
    Dense(1, activation="sigmoid")
])

#compile the model
model_p.compile(loss = "binary_crossentropy",
                optimizer = Adam(),
                metrics = ["accuracy"])


#fit the model
history_p = model_p.fit(train_data_augmented_shuffled,
                        epochs = 5,
                        steps_per_epoch = len(train_data_augmented_shuffled),
                        validation_data = test_data,
                        validation_steps = len(test_data))

## Making a prediction with our trained model on our own custom data

In [None]:
#Classes  we're working with 
print(class_names)

In [None]:
#view our example 
import matplotlib.image as mping
import matplotlib.pyplot as plt

!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/03-steak.jpeg
steak = mping.imread("03-steak.jpeg")

In [None]:
#plot the image
plt.imshow(steak)
plt.axis(False)

In [None]:
#Check the shape of our image
steak.shape

**Note:** when you train a neural network and you want to make a prediction with it on your own custom data, it's important that your custom data (or new data) is preprocessed into the same format as the data your model was trained on.

In [None]:
#Create a function to import an image and resize it to be able to add into our model

def load_and_prep_image(filename, img_shape=224):
  """
  Reads an image from filename, turns it into a tensor and resizes it to
  (img_shape, img_shape, colour_channels), then adds a batch dimension.
  """

  #read in the image
  img = tf.io.read_file(filename)
  #decode the read file into a tensor (channels=3 forces RGB, even for images with an alpha channel)
  img = tf.image.decode_image(img, channels=3)
  #resize the image
  img = tf.image.resize(img, size = [img_shape, img_shape])
  #rescale the image and get all values between 0 and 1
  img = img/255.
  #expand the dimensions of the image to include the batch dimension
  img = tf.expand_dims(img, axis = 0)
  return img

In [None]:
#load in and preprocess our custom image
steak = load_and_prep_image("03-steak.jpeg")
steak.shape

In [None]:
pred = model_7.predict(steak)

Looks like our custom image is being put through our model. However, it currently outputs a prediction probability. Wouldn't it be nice if we could visualize the image as well as the model's prediction?

In [None]:
#remind ourselves of our class name
class_names



In [None]:
#we can index the predicted class by rounding the prediction probability and indexing it on the class_names
pred_class = class_names[int(tf.round(pred)[0][0])] #pred has shape (1, 1), so pull out the single value before converting to int
pred_class
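Rounding a sigmoid output amounts to thresholding it at 0.5: values above the threshold map to index 1, everything else to index 0. A minimal plain-Python sketch of that lookup, assuming the class names are in sorted order as created earlier (the probabilities below are made up, and `predicted_class` is a hypothetical helper):

```python
def predicted_class(probability, class_names, threshold=0.5):
    """Map a sigmoid output to a class name by thresholding it."""
    return class_names[int(probability > threshold)]


names = ["pizza", "steak"]  # sorted order, so index 0 is pizza and index 1 is steak

print(predicted_class(0.92, names))  # steak
print(predicted_class(0.08, names))  # pizza
```

An output close to 0.5 means the model is unsure, so the probability itself is worth inspecting alongside the class name.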

In [None]:
def pred_and_plot(model, filename, class_names = class_names):
  """
  Imports an image located at filename, makes a prediction with model and plots the image with the 
  predicted class as the title.
  """

  #import the target image and preprocessed it
  img = load_and_prep_image(filename)

  #make a prediction
  pred = model.predict(img)

  #get the predicted class
  pred_class = class_names[int(tf.round(pred)[0][0])]

  #plot the image and predicted class
  plt.imshow(tf.squeeze(img))
  plt.title(f"prediction:{pred_class}")
  plt.axis(False)

In [None]:
#test our model on a custom image
pred_and_plot(model_7, "03-steak.jpeg")

In [None]:
#get a custom pizza image
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/03-pizza-dad.jpeg



In [None]:
pred_and_plot(model_7, "03-pizza-dad.jpeg")

## Multi-class image classification

We've been through a bunch of the following steps with a binary classification problem (pizza vs. steak); now we're going to step things up a notch with 10 classes of food (multi-class classification).

1. Become one with the data
2. Preprocess the data (get it ready for a model)
3. Create a model (start with a baseline)
4. Fit the model (overfitting it to make sure it works)
5. Evaluate the model
6. Adjust different hyperparameters and improve the model (try to beat baseline/reduce overfitting)
7. Repeat until satisfied

### 1. Import and become one with the data

In [None]:
import zipfile

!wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_all_data.zip

#unzip our data
zip_ref = zipfile.ZipFile("10_food_classes_all_data.zip","r")
zip_ref.extractall()
zip_ref.close()

In [None]:
import os
#walk through 10 classes of food image data
for dirpath, dirnames, filenames in os.walk("10_food_classes_all_data"):
  print(f"there are {len(dirnames)} directories and {len(filenames)} images in '{dirpath}'.")

In [None]:
!ls -la 10_food_classes_all_data/

In [None]:
#setup the train and test directories
train_dir = "10_food_classes_all_data/train/"
test_dir = "10_food_classes_all_data/test/"

In [None]:
#let's get the class names
import pathlib
import numpy as np
data_dir = pathlib.Path(train_dir)
class_names = np.array(sorted([item.name for item in data_dir.glob("*")]))
print(class_names)

In [None]:
#Visualize , visualize, visualize
import random 
img = view_random_image(target_dir = train_dir,
                        target_class = random.choice(class_names))

In [None]:
random.choice(class_names)

### 2. Preprocess the data (prepare it for a model)