This notebook walks through the general workflow for an image classification task.

# Steps involved in an image classification task
1. Data acquisition
2. Data processing
3. Model building
4. Model compilation
5. Model training
6. Model evaluation
7. Model predictions
8. Deployment of the model (as an app, web service, IoT device, etc.)

We will use TensorFlow and Keras for image classification. These are popular deep learning frameworks that are user-friendly and well suited to beginners. They contain functions to prepare the data, to build, compile, train and evaluate models, and to use them for prediction and deployment.

In [None]:
import numpy as np  # numpy- NUMerical PYthon, used for array based operations
import tensorflow as tf
from tensorflow import keras 

# Data acquisition 

We are dealing with images, so our data is an image dataset. If you are working in Google Colab, you can keep the dataset in Google Drive; Colab has an option to mount the drive.

If you are using Jupyter Notebook, keep the dataset on the local machine (computer) and give its location in the notebook.

In [None]:
## if using colab
from google.colab import drive
drive.mount('/content/drive')
train_path='/content/drive/MyDrive/Dataset/train'
test_path='/content/drive/MyDrive/Dataset/test'

In [None]:
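## if using jupyter
# raw strings (r'...') stop Python from treating the backslashes in Windows paths as escape sequences
train_path = r'C:\Users\sairam\Desktop\data science\datasets\train'
test_path = r'C:\Users\sairam\Desktop\data science\datasets\test'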

This is the general format of an image dataset. The images in the train folder are used to train the model, and the images in the test folder are used to validate/test the trained model. A common choice is to keep about 80% of the data in train and 20% in test.

DIRECTORY
    TRAIN
        class 1
            Image 1
            Image 2
            Image 3
            .
            .
            Image n
        class 2
            Image 1
            Image 2
            Image 3
            .
            .
            Image n
    TEST
        class 1
            Image 1
            Image 2
            Image 3
            .
            .
            Image n
        class 2
            Image 1
            Image 2
            Image 3
            .
            .
            Image n
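The 80/20 split mentioned above can be sketched as follows. This is only an illustration using the standard library: the function name `split_filenames`, the seed and the file names are hypothetical, and in practice you would apply the same idea to the image files of each class before copying them into the train and test folders.

```python
import random

def split_filenames(filenames, train_fraction=0.8, seed=42):
    """Shuffle a list of file names and split it into train and test parts."""
    shuffled = sorted(filenames)           # sort first so the result is reproducible
    random.Random(seed).shuffle(shuffled)  # shuffle with a fixed seed
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names for a single class
files = [f"cat_{i:03d}.jpg" for i in range(10)]
train_files, test_files = split_filenames(files)
print(len(train_files), len(test_files))
```

Splitting each class separately keeps the class proportions roughly the same in both folders.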

# Data processing

The main work in data processing for image classification is to prepare the data. We also perform an operation called augmentation, which increases the effective size of the dataset by creating modified copies of the images (zoomed, shifted, flipped and so on).

The Keras class used for augmentation is **ImageDataGenerator**. It applies the transformations we choose to the images and yields them as batches of image data.
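To make the idea of augmentation concrete, here is a minimal NumPy sketch. It is not what ImageDataGenerator does internally; it assumes a tiny 4x4 array standing in for an image and shows how a flip and a shift produce new variants of the same picture. Note that `np.roll` wraps pixels around the edge, whereas ImageDataGenerator fills the gap according to its `fill_mode` setting.

```python
import numpy as np

# A tiny 4x4 single-channel "image" with distinct pixel values (illustrative only)
img = np.arange(16, dtype=np.float32).reshape(4, 4)

flipped = np.flipud(img)                 # vertical flip: the row order is reversed
shifted = np.roll(img, shift=1, axis=1)  # shift one pixel to the right (wraps around)

augmented = [img, flipped, shifted]      # one original plus two variants
print(len(augmented))
print(flipped[0], shifted[0])
```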

In [None]:
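from tensorflow.keras.preprocessing import image  # provides ImageDataGenerator

# augmentations are applied only to the training images
train_datagen = image.ImageDataGenerator(rescale=1./255, zoom_range=0.2, width_shift_range=0.2, height_shift_range=0.2, vertical_flip=True)

# no transformations are applied to the testing dataset (only rescaling) to maintain its originality
test_dataset = image.ImageDataGenerator(rescale=1./255)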

Rescaling means converting the scale (range) of the data. In digital images, pixel intensities range from 0 to 255, where 0 is black and 255 is white (for colour images this holds for each channel). Dividing every pixel by the maximum value (255) converts the intensities to decimal values between 0 and 1.
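The effect of rescaling can be checked on a few sample values. This is a small NumPy sketch with made-up pixel values; ImageDataGenerator multiplies by the `rescale` factor, which gives the same result as the division used here up to floating-point rounding.

```python
import numpy as np

# Four illustrative pixel intensities in the usual 0-255 range
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Convert to float first, then divide by the maximum value
scaled = pixels.astype(np.float32) / 255.0

print(scaled.min(), scaled.max())
```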

## Data preparation

The next step in data processing is to prepare the data for the model. Keras offers two methods for this, and which one to use depends on how the dataset is organised.

The first method is flow_from_directory. It applies when the images are stored in folders named after their classes. The second method is flow_from_dataframe. It applies when the images are in a folder and the class of each image is listed in a table, such as an Excel sheet loaded as a dataframe.

In [None]:
train_generator = train_datagen.flow_from_directory(
    '/content/drive/MyDrive/Dataset/train',
    target_size = (224,224),
    batch_size = 32,
    class_mode = 'binary') ## or 'categorical'
validation_generator = test_dataset.flow_from_directory(
    '/content/drive/MyDrive/Dataset/val',
    target_size = (224,224),
    batch_size = 32,
    shuffle=True,
    class_mode = 'binary') ## or 'categorical'

Here we specify the location of the images, the target size (every image is resized to it, because the model only accepts inputs of one size), the batch size (the number of images processed at a time) and the class mode (either binary or categorical). Larger batches generally mean fewer steps per epoch but need more memory, and their effect on accuracy depends on the problem, so the batch size is worth tuning. Note that the transformations defined earlier are applied to the input images at this step, as each batch is generated.

# Model creation

The next step is to create the deep learning model. The model most commonly used for image classification in deep learning is the CNN (Convolutional Neural Network). A basic CNN is built from four types of layers: convolution, pooling, flatten and dense.

However, a basic CNN trained from scratch takes a long time to train before it produces good results. To get around this we use a concept called transfer learning. As the name suggests, we take a CNN that has already been trained on one task, modify it and use it for our task. We can choose whether to train its layers further or keep them fixed (freezing). This approach saves a lot of time, and these models are known to produce good results on images in general.

The pre-trained CNN models are listed at this link:
https://keras.io/api/applications/

Pre-trained versions of different models are available there; we can download the weights of these CNNs and use them for our task.

In [None]:
base_for_model = tf.keras.applications.MobileNetV2(weights='imagenet', input_shape=(224,224,3), include_top=False)

All of these models are trained on ImageNet, a dataset with 1000 classes. Their last layer therefore has 1000 neurons, one per class, so we leave it out (include_top=False) and add our own output layer instead. input_shape=(224,224,3) gives the input dimensions of the model: 224 x 224 is the target size set in the data processing part, and the 3 refers to the three colour channels R, G and B.

In [None]:
for layer in base_for_model.layers:
    layer.trainable = False

Here we freeze the layers of the pre-trained model. Frozen layers are not updated during training, which saves a great deal of time. If trainable is set to True instead, those layers are trained as well; this is called fine-tuning.

In [None]:
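## for binary
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

model = Sequential()
model.add(base_for_model)                  # frozen pre-trained base
model.add(Flatten())                       # feature maps -> 1-D vector
model.add(Dense(1, activation='sigmoid'))  # one output neuron for two classes
model.summary()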

In [None]:
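## for multiclass
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

n_classes = len(train_generator.class_indices)  # number of class folders found by flow_from_directory
model = Sequential()
model.add(base_for_model)                          # frozen pre-trained base
model.add(Flatten())                               # feature maps -> 1-D vector
model.add(Dense(n_classes, activation='softmax'))  # one output neuron per class
model.summary()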

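Sequential from Keras creates an empty container onto which we stack our layers. After the pre-trained base we add Flatten, which converts the multi-dimensional feature maps produced by the base into the 1-D vector that dense layers expect. Then we add the output layer, which differs between the two cases: one neuron with sigmoid activation for binary classification, and n_classes neurons with softmax activation for multiclass.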
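What Flatten does to the shape of the data can be illustrated with NumPy alone. The sketch below assumes the base outputs feature maps of shape (7, 7, 1280) for one image, which is what model.summary() should report for MobileNetV2 with a 224 x 224 input and include_top=False; the zeros are just placeholder values.

```python
import numpy as np

# Placeholder feature maps for a batch of one image: (batch, height, width, channels)
features = np.zeros((1, 7, 7, 1280), dtype=np.float32)

# Flatten keeps the batch dimension and merges all the others into one
flat = features.reshape(features.shape[0], -1)

print(features.shape, "->", flat.shape)
```

The dense output layer then receives one vector of 7 * 7 * 1280 values per image.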

# Model compilation

Before training, the model must be compiled with some parameters. Three parameters are used to compile a model.

1. Loss function: the function used to compare the model's predictions with the ground truth
https://keras.io/api/losses/

2. Optimizer: the algorithm that uses the loss to update the weights and biases of the CNN
https://keras.io/api/optimizers/

3. Metrics: the measures used to monitor the performance of the model
https://keras.io/api/metrics/
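To show what a loss function compares, here is a minimal NumPy sketch of the two cross-entropy losses used in this notebook. It is a simplified illustration, not the Keras implementation; the function names mirror the Keras loss names, and the clipping constant `eps` and the example values are assumptions chosen for the demonstration.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and sigmoid outputs."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean categorical cross-entropy between one-hot labels and softmax outputs."""
    y_pred = np.clip(y_pred, eps, 1.0)      # avoid log(0)
    losses = -np.sum(y_true * np.log(y_pred), axis=1)
    return float(np.mean(losses))

# Confident, correct predictions give a small loss
bce = binary_crossentropy(np.array([1.0, 0.0]), np.array([0.9, 0.1]))
cce = categorical_crossentropy(np.array([[0.0, 1.0, 0.0]]),
                               np.array([[0.2, 0.7, 0.1]]))
print(round(bce, 4), round(cce, 4))
```

The loss shrinks towards 0 as the predicted probability of the true class approaches 1, and this is the quantity the optimizer tries to reduce.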

In [None]:
#binary
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['binary_accuracy'])  # metrics is passed as a list

In [None]:
#multiclass
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])  # metrics is passed as a list

# Model training

Once we have compiled, the model is good to go for training. We use the training images to train the model. The function used for this is model.fit()

epochs is the number of complete passes over the training data; the value is the user's choice. Each epoch consists of a number of steps (batches), set by steps_per_epoch. It is calculated as steps_per_epoch = number of training images / batch size, rounded up when the division is not exact. You can choose any value up to this number; going beyond it can cause training to stop with an out-of-data error.
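The formula above can be written as a small helper. This is a sketch: the function name mirrors the Keras argument, the dataset size of 600 images is a hypothetical figure, and rounding up is an assumption that lets the last, smaller batch count as a step.

```python
import math

def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every training image once."""
    return math.ceil(num_images / batch_size)

# Hypothetical dataset: 600 training images, batch size 32 as used in this notebook
steps = steps_per_epoch(600, 32)
print(steps)
```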

Training a model on images takes a lot of time, because the optimizer has to update millions of weights and biases at every step. A GPU boosts the computation power and reduces the time taken. Google Colab and Kaggle provide free GPUs in the cloud, and NVIDIA graphics cards can be used to train CNNs on a local machine.

In [None]:
history = model.fit(
    train_generator,
    steps_per_epoch=19,  # at most (number of training images / batch size), rounded up
    epochs = 15,  # own choice
    validation_data = validation_generator,
)

**Note: if the accuracy increases and the loss decreases from epoch to epoch, the model is learning. If the loss value is very high or negative, the loss function probably does not match the task (a binary loss used for a categorical task, or vice versa).**
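The per-epoch values can also be inspected after training through `history.history`, a dictionary that maps each metric name to a list with one value per epoch. The sketch below uses a hand-written dictionary in place of a real training run, and the helper `best_epoch` is a hypothetical name, so the numbers are purely illustrative.

```python
def best_epoch(history_dict, key="val_loss"):
    """Return the 1-based epoch with the lowest value of `key`, and that value."""
    values = history_dict[key]
    best_index = min(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Stand-in for history.history after four epochs (made-up numbers)
example_history = {
    "loss": [0.9, 0.6, 0.4, 0.3],
    "val_loss": [0.8, 0.6, 0.5, 0.55],
}

epoch, value = best_epoch(example_history)
print(epoch, value)
```

In this made-up run the training loss keeps falling while the validation loss starts rising after the third epoch, which is a common sign of overfitting.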

# Model evaluation

Here we evaluate the performance of the trained model. The values shown during training are those measured in each epoch; the values shown here are those of the fully trained model.

In [None]:
model.evaluate(train_generator)
model.evaluate(validation_generator)

The results on the validation data are more important than those on the training data, because the model has not seen the validation images during training, so predictions on them are the closest equivalent to real-time predictions.
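As a reminder of what the accuracy figure means, here is a minimal NumPy sketch with made-up labels. It only illustrates the definition (the fraction of correct predictions) and is not how model.evaluate is implemented.

```python
import numpy as np

# Hypothetical true labels and model predictions for five validation images
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])

# Accuracy is the fraction of positions where prediction and truth agree
accuracy = float(np.mean(y_true == y_pred))
print(accuracy)
```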

# Model prediction

Now we can use the trained model to predict on new images. The input image must be processed in the same way as the training data before it is fed to the model: resize it to the target size, convert it to an array, rescale the pixel values by 1/255, and add a dimension at the front (the batch dimension, because the model expects a batch of images). The model returns an array from which we infer the output; how we read it differs between the binary and the categorical case.


**Binary:**
The model returns an array with one element. If that element is less than 0.5, we assign the image to class 0; otherwise we assign it to class 1. We do this because we have used the sigmoid activation function, whose output lies between 0 and 1.


**Multiclass:**
The model returns an array with as many elements as there are classes. The position of the largest element is the class the model predicts for the image, so we need to find the class that corresponds to that position. This is done with the argmax function in numpy. We do this because we have used the softmax activation function, which gives one probability per class.
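Both decoding rules can be tried on hand-written arrays before running a real model. The values below are made up and simply mimic the shape of what model.predict returns for a single image: (1, 1) for the binary model and (1, number of classes) for the multiclass model.

```python
import numpy as np

# Binary: one sigmoid output for one image, shape (1, 1)
binary_pred = np.array([[0.83]])
binary_label = int(binary_pred[0, 0] > 0.5)  # 1 if above the threshold, else 0

# Multiclass: one softmax row for one image, shape (1, 3)
multi_pred = np.array([[0.1, 0.7, 0.2]])
multi_label = int(np.argmax(multi_pred, axis=1)[0])  # argmax along the class axis

print(binary_label, multi_label)
```

Note that argmax is taken along axis 1, the class axis; axis 0 is the batch axis and has only one entry here. Which folder corresponds to which index can be read from train_generator.class_indices.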

In [None]:
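#binary
from tensorflow.keras.utils import load_img, img_to_array
img = load_img('abc.jpeg', target_size=(224,224))  # resize to the model's input size
img = img_to_array(img) / 255.0    # same rescaling as used during training
img = np.expand_dims(img, axis=0)  # add the batch dimension: (1, 224, 224, 3)
ypred = model.predict(img)         # array of shape (1, 1)
ypred = ypred[0][0]                # the single sigmoid output
if ypred > 0.5:
    ypred = "class_1"
else:
    ypred = "class_0"
print(ypred)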

In [None]:
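#multiclass
from tensorflow.keras.utils import load_img, img_to_array
img = load_img('abc.jpeg', target_size=(224,224))  # resize to the model's input size
img = img_to_array(img) / 255.0    # same rescaling as used during training
img = np.expand_dims(img, axis=0)  # add the batch dimension: (1, 224, 224, 3)
ypred = model.predict(img)         # array of shape (1, n_classes)
class_index = int(np.argmax(ypred, axis=1)[0])  # position of the largest probability
# map the index back to the class (folder) name found by flow_from_directory
index_to_class = {index: name for name, index in train_generator.class_indices.items()}
print(index_to_class[class_index])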