
#### 1. Install TensorFlow Object Detection API
- For the latest install instructions, check the [GitHub](https://github.com/tensorflow/models/tree/master/research/object_detection) page or the [Read the Docs](https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html) site for this API.

In [None]:
#Clone github repository
!git clone https://github.com/tensorflow/models

Install TensorFlow Object Detection API

In [None]:
%%bash
cd models/research
# Compile protos.
protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
cp object_detection/packages/tf2/setup.py .
python -m pip install --use-feature=2020-resolver .

In [None]:
#Check if Object Detection API is installed
!pip list | grep object-detection

#### 2. Download dataset

Create a directory for the project (e.g. detection) and move into it.

In [None]:
#Come to the home directory - change if not running on  Colab
%cd /content

In [None]:
#Create a folder for your project e.g in this case, detection. You can choose any name for the folder 
!mkdir detection

In [None]:
#Go to the project folder
%cd detection

In [None]:
!ls -l

Download the dataset: information on the dataset is available [here](http://host.robots.ox.ac.uk/pascal/VOC/). We are downloading the Pascal VOC 2007 dataset here (VOC 2012 is also available).

In [None]:
#Get PASCAL VOC dataset
!wget http://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar --quiet

In [None]:
#Check the project folder
!ls -l

Extract files from the tar file

In [None]:
!tar -xf VOCtrainval_06-Nov-2007.tar

In [None]:
#Check the current folder for extracted data
!ls -l

In [None]:
!ls -l VOCdevkit

In [None]:
!ls -l VOCdevkit/VOC2007

JPEGImages folder has actual images and 'Annotations' folder has Class labels and bounding box information for each image.

In [None]:
!ls -l VOCdevkit/VOC2007/JPEGImages

In [None]:
#Number of images (plain ls is used, so the 'total' line printed by ls -l is not counted)
!ls VOCdevkit/VOC2007/JPEGImages | wc -l

Review image annotation in XML

In [None]:
!ls -l VOCdevkit/VOC2007/Annotations

In [None]:
#Number of XML files - there should be one for each image
!ls VOCdevkit/VOC2007/Annotations | wc -l

In [None]:
#Let's look at one annotation file
!cat VOCdevkit/VOC2007/Annotations/008422.xml

Set images folder and XMLs folder

In [None]:
#Change it for your own dataset
img_path = 'VOCdevkit/VOC2007/JPEGImages'
xml_path = 'VOCdevkit/VOC2007/Annotations'

#### 3. Data Pre-processing : Convert XML to CSV

Use the provided xml_to_csv.py script. It reads all XML files and saves the information in a CSV file.
 Here we are copying the script file from Google Drive, so it must be in the project folder before the cells below are run.
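
The xml_to_csv.py script itself is not shown in this notebook. As a rough sketch of what such a conversion typically involves, the snippet below parses one Pascal VOC style annotation with the standard library and writes one CSV row per object. The sample XML values, the `annotation_to_rows` helper and the `width`/`height` columns are assumptions for illustration, not the contents of the actual script.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout (values are made up).
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>"""

# Assumed column layout; the real script may use a different one.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def annotation_to_rows(xml_text):
    """Return one dict per <object> element found in a VOC annotation."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append({
            "filename": filename,
            "width": width,
            "height": height,
            "class": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return rows


rows = annotation_to_rows(SAMPLE_XML)

# Write the rows to an in-memory CSV so the example leaves no files behind.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Each object gets its own row, so an image with several objects appears several times in the CSV. This is why the later train/test split is done per image rather than per row.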

In [None]:
#We should have 'xml_to_csv.py' script in our current folder
!ls -l

Generate a CSV file from all XML files using the copied script. The script requires two parameters:
1. -i <xml_files_folder_name> : the folder where the XML files are stored
2. -o <output_file_path> : the name and location of the CSV output file

In [None]:
#Running the script with options
!python xml_to_csv.py -i {xml_path} -o detection_data.csv

In [None]:
#We should have CSV file in current folder
!ls -l

Load csv file as dataframe

In [None]:
import pandas as pd
df = pd.read_csv('detection_data.csv')
df.head(5)

In [None]:
#How many objects across all images
df.shape

In [None]:
#List of Classes
df['class'].unique()

In [None]:
#Number of labels
len(df['class'].unique())

We will need to label encode the classes, i.e. assign a unique index number to each class.

In [None]:
#Use Label encoder available in Scikit Learn
from sklearn import preprocessing

In [None]:
#Label Encode class and add a 'label' column to the dataframe
le = preprocessing.LabelEncoder()
df['label'] = le.fit_transform(df['class'])
df.head()

In [None]:
#unique values in the label
df.label.unique()

In [None]:
#Object detection API expects index to start from 1 (and not 0)
df['label'] = df['label'] + 1
df.label.unique()

In [None]:
#Dataframe should have label column now
df.head()

In [None]:
#Create a dictionary of Label and Class. This will be useful for building our second input to Model training
label_class_dict = dict(zip(df['label'], df['class']))
print(label_class_dict)
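
For intuition, the mapping built above can be reproduced without scikit-learn. `LabelEncoder` assigns indices to the sorted class names, so sorting the distinct names and counting from 1 should give the same ids. This is only a sketch; `build_label_map` and the toy class list are hypothetical.

```python
def build_label_map(class_names):
    """Map each distinct class name to a 1-based id, in sorted order."""
    return {name: idx for idx, name in enumerate(sorted(set(class_names)), start=1)}


# Toy stand-in for df['class']; repeated names mimic several objects per class.
classes = ["dog", "person", "car", "dog", "person"]

class_to_label = build_label_map(classes)
# Inverse mapping, in the same shape as label_class_dict (label -> class name).
label_to_class = {label: name for name, label in class_to_label.items()}

print(class_to_label)
print(label_to_class)
```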

**Split data between training and test**

First we have to split images between training and test. Then we can use that information to split dataframe between training and test. This will make sure objects from same image are not split between training and test.

In [None]:
#Get information on all images
all_files = df['filename'].unique()
all_files.shape

In [None]:
#Split images between training and test
import numpy as np

#80% of the data will be used for training
mask = np.random.rand(all_files.shape[0]) < 0.8

#Get Training and Test images
train_images = all_files[mask]
test_images = all_files[~mask] 

In [None]:
#Check number of images in training and test
train_images.shape, test_images.shape

In [None]:
train_images[:10]

In [None]:
#Split dataframe between training and test
train_df = df[df['filename'].isin(train_images)]
test_df = df[df['filename'].isin(test_images)]

In [None]:
train_df.shape, test_df.shape
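
The random mask used for the split changes on every run, and the training share is only approximately 80%. If a repeatable split is preferred, one option is to shuffle the unique file names with a fixed seed and cut the list at the desired fraction. The sketch below uses only the standard library; `split_by_image`, the seed value and the toy rows are assumptions for illustration.

```python
import random


def split_by_image(filenames, train_fraction=0.8, seed=42):
    """Split unique file names into train/test sets, reproducibly."""
    unique_files = sorted(set(filenames))  # sort first so the shuffle is deterministic
    rng = random.Random(seed)
    rng.shuffle(unique_files)
    cut = int(len(unique_files) * train_fraction)
    return set(unique_files[:cut]), set(unique_files[cut:])


# Toy (filename, class) rows; 'a.jpg' holds two objects.
object_rows = [
    ("a.jpg", "dog"), ("a.jpg", "person"), ("b.jpg", "car"),
    ("c.jpg", "cat"), ("d.jpg", "bus"), ("e.jpg", "dog"),
]

train_files, test_files = split_by_image([name for name, _ in object_rows])

# Rows follow their image, so objects of one image never straddle the split.
train_rows = [row for row in object_rows if row[0] in train_files]
test_rows = [row for row in object_rows if row[0] in test_files]

print(len(train_files), len(test_files))
```

With the dataframe from this notebook, the two sets could be passed to `df['filename'].isin(...)` in the same way as `train_images` and `test_images`.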

**Visualizing the data**

In [None]:
#We will use opencv and matplotlib
from matplotlib import pyplot as plt
import cv2

In [None]:
#Pickup a random image number
img_num = np.random.randint(0, df.shape[0])

#Read the image
img_file = df.loc[img_num,'filename']
img = cv2.imread(img_path + '/' + img_file)

#Find all rows which have same file name
rows_with_file = df[df['filename'] == img_file].index.tolist()

#Draw rectangle(s) as per bounding box information
print('Number of objects', len(rows_with_file))
for i in rows_with_file:

    #Get bounding box
    xmin, ymin, xmax, ymax = df.loc[i, ['xmin', 'ymin', 'xmax', 'ymax']]
    #Get Label
    label = df.loc[i, 'class']
    #Add bounding box
    cv2.rectangle(img, (xmin,ymin), (xmax, ymax), (0,255,0), 2)
    #Add text
    cv2.putText(img,label,(xmin, ymin+10),cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)

#Convert BGR format (used by OpenCV) to RGB format (used by matplotlib)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

#cv2.imshow('img', img)

#Draw image using matplotlib
plt.figure(figsize=(10,7))
plt.imshow(img)
plt.show()

In [None]:
df.groupby(['class']).count()

Save training and test data as csv

In [None]:
train_df.to_csv('train.csv', index=False)
test_df.to_csv('test.csv', index=False)

In [None]:
#We should have training and test csv files in current directory
!ls -l

#### 4. Generate tfrecord from CSV
Tensorflow object detection API requires data in tfrecord format. This can be done using generate_tfrecord.py file.

In [None]:
#Make sure the script file is now available
!ls -l

The script file requires 3 inputs

1. --csv_input=<csv_file_path> : where is csv file located which was prepared in previous step
2. --img_path=<images_folder> : where are the actual images stored
3. --output_path=<output_file_path> : where the script can save the generated tfrecord file and what should be file name.

We will run script for training and test csv separately to create two tfrecord files.
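
The generate_tfrecord.py script is not reproduced here. Scripts of this kind usually collect all CSV rows belonging to one image and scale the box coordinates to the 0 to 1 range before writing one record per image. The sketch below shows only that grouping and normalization step in plain Python. The `group_rows_by_image` helper, the toy rows and the presence of `width`/`height` columns are assumptions.

```python
from collections import defaultdict


def group_rows_by_image(rows):
    """Group per-object rows by image and normalize box coordinates to [0, 1]."""
    grouped = defaultdict(lambda: {
        "xmins": [], "ymins": [], "xmaxs": [], "ymaxs": [],
        "classes_text": [], "classes": [],
    })
    for row in rows:
        entry = grouped[row["filename"]]
        width, height = float(row["width"]), float(row["height"])
        # x values are divided by image width, y values by image height.
        entry["xmins"].append(row["xmin"] / width)
        entry["xmaxs"].append(row["xmax"] / width)
        entry["ymins"].append(row["ymin"] / height)
        entry["ymaxs"].append(row["ymax"] / height)
        entry["classes_text"].append(row["class"])
        entry["classes"].append(row["label"])
    return dict(grouped)


# Toy rows in the assumed CSV layout.
rows = [
    {"filename": "a.jpg", "width": 200, "height": 100, "class": "dog", "label": 2,
     "xmin": 50, "ymin": 25, "xmax": 150, "ymax": 75},
    {"filename": "a.jpg", "width": 200, "height": 100, "class": "person", "label": 3,
     "xmin": 0, "ymin": 0, "xmax": 200, "ymax": 100},
    {"filename": "b.jpg", "width": 400, "height": 400, "class": "car", "label": 1,
     "xmin": 100, "ymin": 200, "xmax": 300, "ymax": 400},
]

examples = group_rows_by_image(rows)
print(examples["a.jpg"])
```

In a real script, each grouped entry would then be serialized together with the encoded image bytes into one record of the tfrecord file.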

In [None]:
#generate tfrecord for training data
!python generate_tfrecord.py --csv_input=train.csv  --img_path={img_path} --output_path=train.tfrecord

In [None]:
#generate tfrecord for test data
!python generate_tfrecord.py --csv_input=test.csv  --img_path={img_path} --output_path=test.tfrecord

In [None]:
#train.tfrecord and test.tfrecord files should be available now
!ls -l

#### 5. Create Label Mapping File

In [None]:
#Dict which was created earlier will be used for building Label Mapping file
label_class_dict

In [None]:
#Build a pbtxt label file using label and class name
#This is required by Object detection API
#You can prepare it manually as well.

pbtxt_file_txt = ''
for label in sorted(label_class_dict.keys()):
    
    pbtxt_file_txt += "item {\n  id: " + str(label) + "\n  name: '" +  label_class_dict[label] + "'\n}\n\n"

with open('label_map.txt','w') as pbfile:
    pbfile.write(pbtxt_file_txt)

In [None]:
!ls -l

In [None]:
#Review the file content
!cat label_map.txt

#### 6. Download a pre-trained model

A list of pre-trained models is available at the [TensorFlow 2 model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md). We will use the 'SSD MobileNet v2 320x320' model (ssd_mobilenet_v2_320x320_coco17_tpu-8) for transfer learning.

In [None]:
#Download the model from zoo
!wget -q http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz

In [None]:
!ls -l

In [None]:
#Extract tar file content
!tar -xf ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz

In [None]:
!ls -l

In [None]:
#Check the extracted folder
!ls -l ssd_mobilenet_v2_320x320_coco17_tpu-8

In [None]:
#Check the extracted folder
!ls -l ssd_mobilenet_v2_320x320_coco17_tpu-8/checkpoint

In [None]:
!ls -l ssd_mobilenet_v2_320x320_coco17_tpu-8/saved_model

#### 7. Prepare Training configuration file

You can copy a sample configuration for the chosen pre-trained model (SSD MobileNet v2 320x320 in this case) from the [Configs](https://github.com/tensorflow/models/tree/master/research/object_detection/configs/tf2) folder. At a minimum, the following need to be changed:

1. Change the num_classes parameter to 20 (as we have 20 categories in the Pascal VOC dataset).
2. For 'train_input_reader', change 'input_path' to the file path of the train.tfrecord file.
3. For 'train_input_reader', change 'label_map_path' to the file path of the label_map.txt file.
4. Repeat the above two steps for 'eval_input_reader', using test.tfrecord as 'input_path'.
5. Change fine_tune_checkpoint to the path of the pre-trained model checkpoint, e.g. ssd_mobilenet_v2_320x320_coco17_tpu-8/checkpoint/ckpt-0
6. Change 'batch_size' according to available memory.
7. Change 'num_steps' to indicate how long the training will run, e.g. 200000. For demo purposes, we keep it at 20 so that training finishes quickly.
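
The changes described in this section can be made by hand in a text editor. As an alternative, the sketch below applies them with regular expressions to a small, made-up config fragment. `SAMPLE_CONFIG`, `set_field`, `customize` and the batch size of 8 are assumptions for illustration. Setting `fine_tune_checkpoint_type` to "detection" is also an assumption, based on the checkpoint coming from a full detection model; check it against the sample config you copied.

```python
import re

# Made-up fragment that mimics the layout of a pipeline config file.
SAMPLE_CONFIG = """model {
  ssd {
    num_classes: 90
  }
}
train_config {
  batch_size: 512
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
  fine_tune_checkpoint_type: "classification"
  num_steps: 50000
}
train_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED"
  }
}
eval_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED"
  }
}
"""


def set_field(text, key, value, count=0):
    """Replace the value of every 'key: value' line (or the first `count` lines)."""
    pattern = rf"(^\s*{re.escape(key)}:\s*).*$"
    return re.sub(pattern, lambda m: m.group(1) + value, text,
                  count=count, flags=re.MULTILINE)


def customize(config_text):
    text = set_field(config_text, "num_classes", "20")
    text = set_field(text, "batch_size", "8", count=1)   # assumed value, tune to memory
    text = set_field(text, "num_steps", "20", count=1)   # demo value from the text
    text = set_field(text, "fine_tune_checkpoint",
                     '"ssd_mobilenet_v2_320x320_coco17_tpu-8/checkpoint/ckpt-0"')
    text = set_field(text, "fine_tune_checkpoint_type", '"detection"')  # assumption
    text = set_field(text, "label_map_path", '"label_map.txt"')
    # Train and eval readers need different record files, so handle them separately.
    train_part, eval_part = text.split("eval_input_reader", 1)
    train_part = set_field(train_part, "input_path", '"train.tfrecord"')
    eval_part = set_field(eval_part, "input_path", '"test.tfrecord"')
    return train_part + "eval_input_reader" + eval_part


new_config = customize(SAMPLE_CONFIG)
print(new_config)
```

A real config contains more fields than this fragment (for example an eval batch size and learning-rate schedule steps), so review the result before training.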

Set Config file name

In [None]:
config_file = 'ssd_mobilenet_v2_modified.config'

#### 8. Training the model

In [None]:
#Copy training file from 'models/research/object_detection' folder to current folder
!cp /content/models/research/object_detection/model_main_tf2.py .

In [None]:
!ls -l

Start training 

- Please note that object detection models take a long time to train. Training may take a few days on a single-GPU machine (depending on the number of steps). Try to keep training the model until the loss comes close to 1 (or goes below 1). The script takes two main inputs:

1. --model_dir=<folder_name> : where the model will be saved periodically as training progresses
2. --pipeline_config_path=<config_file_path> : where the model training configuration file is located

In [None]:
#Create a training folder to store model checkpoints/snapshots as training progresses
!mkdir training

In [None]:
#Check training folder
!ls -l training

Start Tensorboard

In [None]:
%load_ext tensorboard

In [None]:
%tensorboard --logdir training

In [None]:
#Fixing the CuDNN version issue - TEMPORARY
# !apt install --allow-change-held-packages libcudnn8=8.1.0.77-1+cuda11.2

In [None]:
#start training
!python model_main_tf2.py --model_dir=training/ --pipeline_config_path={config_file} --checkpoint_every_n=100 --alsologtostderr

In [None]:
!ls -l training

In [None]:
!ls -l training/train

Model Evaluation

#### Training and Evaluation in Background

If we want to evaluate our model on test data regularly, we have to run the training and evaluation scripts in parallel. Evaluation on test data then runs every time a model checkpoint is saved during training.
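
One way to run both jobs side by side is to start them as background processes and send their output to log files. The sketch below only builds the two command lines and defines a launcher; it does not start anything by itself. It assumes that passing `--checkpoint_dir` to model_main_tf2.py switches the script to evaluation mode, and that train.txt and eval.txt are the intended log files. `build_commands` and `launch_in_background` are hypothetical helpers.

```python
import subprocess


def build_commands(config_file, model_dir="training/"):
    """Return (train_cmd, eval_cmd) argument lists for model_main_tf2.py."""
    common = [
        "python", "model_main_tf2.py",
        f"--model_dir={model_dir}",
        f"--pipeline_config_path={config_file}",
    ]
    train_cmd = common + ["--checkpoint_every_n=100", "--alsologtostderr"]
    # Assumption: --checkpoint_dir makes the script watch for new checkpoints and evaluate them.
    eval_cmd = common + [f"--checkpoint_dir={model_dir}", "--alsologtostderr"]
    return train_cmd, eval_cmd


def launch_in_background(cmd, log_path):
    """Start cmd without blocking and send stdout/stderr to log_path."""
    log_file = open(log_path, "w")
    return subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)


train_cmd, eval_cmd = build_commands("ssd_mobilenet_v2_modified.config")
print(" ".join(train_cmd))
print(" ".join(eval_cmd))

# To actually start the jobs (not executed here):
# train_proc = launch_in_background(train_cmd, "train.txt")
# eval_proc = launch_in_background(eval_cmd, "eval.txt")
```

Once the processes are running, the log files can be inspected with the `!cat` cells in this section.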

In [None]:
#Check training logs - uncomment line below
#!cat train.txt

In [None]:
#Check evaluation logs - uncomment line below
#!cat eval.txt

#### 9. Export trained model

From the saved model checkpoints, we will export the trained model for inference. The exported model keeps only what is needed for prediction and drops nodes that are no longer required, which reduces model size.

In [None]:
#Copy exporter_main_v2.py file from models/research/object_detection to current directory
!cp /content/models/research/object_detection/exporter_main_v2.py .

In [None]:
!ls -l

The exporter_main_v2.py script requires the following inputs:

1. --input_type <input_type> : the kind of input the exported model accepts (image_tensor here); this determines how inputs are passed during prediction
2. --pipeline_config_path <model_training_config_file_path> : where the model training config file is located
3. --trained_checkpoint_dir <checkpoint_directory> : the folder holding the training checkpoints used to create the final model
4. --output_directory <exported_model_directory> : where the exported model should be stored

In [None]:
#Provide input name, config file location, training folder
!python exporter_main_v2.py --input_type "image_tensor" --pipeline_config_path {config_file} --trained_checkpoint_dir training/ --output_directory detection_model

In [None]:
label_class_dict

#### Model Prediction

In [None]:
import tensorflow as tf

Load Saved model

In [None]:
saved_model_path = 'detection_model/saved_model' #Folder written by the export step above
model = tf.saved_model.load(saved_model_path)

Function to get model prediction

In [None]:
#Function to get predictions from a Detection model
def detector_prediction(image_file, img_array=None, confidence_threshold=0.5):

    """
    image_file: File path of the image for which prediction needs to be done
    img_array: only considered if image_file is not specified
    confidence_threshold: Minimum confidence/probability for prediction to be considered
    """
    #Load image
    if(image_file):
        img = tf.keras.preprocessing.image.load_img(image_file)
        img_array = tf.keras.preprocessing.image.img_to_array(img).astype('uint8')
    
    #Make it a batch of one example
    img_array = tf.expand_dims(img_array, axis=0)

    #Prediction
    output = model(img_array) #returns a dict of output tensors
    
    #print(output)
    detection_classes = output['detection_classes'].numpy()[0]
    detection_scores = output['detection_scores'].numpy()[0] #get detection scores
    detection_boxes = output['detection_boxes'].numpy()[0]

    #Select predictions for which probability is higher than confidence_threshold
    selected_predictions = detection_scores >= confidence_threshold

    selected_prediction_scores = detection_scores[selected_predictions]
    selected_prediction_classes = detection_classes[selected_predictions]
    selected_prediction_boxes = detection_boxes[selected_predictions]

    #De-normalize box co-ordinates (multiply x-coordinates by image width and y-coords by image height)
    img_h, img_w = img_array.shape[1:3]

    for i in range(selected_prediction_boxes.shape[0]):
        
        selected_prediction_boxes[i,0] *= img_h #ymin * img_h
        selected_prediction_boxes[i,1] *= img_w #xmin * img_w
        selected_prediction_boxes[i,2] *= img_h #ymax * img_h
        selected_prediction_boxes[i,3] *= img_w #xmax * img_w

    #Make all co-ordinates as integer
    selected_prediction_boxes= selected_prediction_boxes.astype(int)

    #Convert class indexes to actual class labels
    predicted_classes = []
    for i in range(selected_prediction_classes.shape[0]):
        predicted_classes.append(label_class_dict[int(selected_prediction_classes[i])])

    #Number of predictions
    selected_num_predictions = selected_prediction_boxes.shape[0]

    return {'Total Predictions': selected_num_predictions,
            'Classes': predicted_classes, 
            'Scores': selected_prediction_scores, 
            'Box coordinates': selected_prediction_boxes}
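

The thresholding and de-normalization steps in the function above can be checked in isolation, without loading a model. The sketch below assumes boxes arrive as normalized (ymin, xmin, ymax, xmax) tuples, as in the function, and uses made-up scores and image size. `denormalize_box` and `filter_detections` are hypothetical helpers.

```python
def denormalize_box(box, img_h, img_w):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to integer pixel coordinates."""
    ymin, xmin, ymax, xmax = box
    # y values scale with image height, x values with image width.
    return (int(ymin * img_h), int(xmin * img_w), int(ymax * img_h), int(xmax * img_w))


def filter_detections(boxes, scores, threshold, img_h, img_w):
    """Keep detections scoring at least `threshold`, with boxes in pixels."""
    return [
        (denormalize_box(box, img_h, img_w), score)
        for box, score in zip(boxes, scores)
        if score >= threshold
    ]


# Made-up detections for a 400 (height) x 600 (width) image.
boxes = [(0.25, 0.5, 0.75, 1.0), (0.0, 0.0, 0.5, 0.5)]
scores = [0.9, 0.3]

kept = filter_detections(boxes, scores, threshold=0.5, img_h=400, img_w=600)
print(kept)
```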

In [None]:
tf.keras.preprocessing.image.load_img('person_with_bike.jpg')

In [None]:
#Model output
detector_prediction('person_with_bike.jpg', confidence_threshold=0.6)

Visualize model output

In [None]:
import cv2
from matplotlib import pyplot as plt

In [None]:
def visualize_output(image_file, confidence_threshold=0.5):

    #Call model prediction function above
    output = detector_prediction(image_file, confidence_threshold=confidence_threshold)

    #Read image
    img = cv2.imread(image_file)

    #Draw rectangle for predicted boxes, also add predicted classes
    for i in range(output['Box coordinates'].shape[0]):

        box = output['Box coordinates'][i]
        
        #Draw rectangle - (ymin, xmin, ymax, xmax)
        img = cv2.rectangle(img, (box[1], box[0]), (box[3], box[2]), (0,255,0), 2)
        
        #Add Label - Class name and confidence level
        label = output['Classes'][i] + ': ' + str(round(output['Scores'][i],2))
        img = cv2.putText(img, label, (box[1], box[0]-10), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
    
    #Convert BGR image to RGB to use with Matplotlib
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    #Display image
    plt.figure(figsize=(10,6))
    plt.imshow(img)
    plt.show()

In [None]:
#Visualize on image
visualize_output('person_with_bike.jpg', confidence_threshold=0.6)