<a href="https://colab.research.google.com/github/emilyrlong/oddy-test/blob/main/Dissertation_1_7_Evaluating_the_Full_Configs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Dissertation 1.7: Evaluating the Full Configs

This notebook runs in parallel with Dissertation 1.6: while that notebook trains the model, this one evaluates it on the validation set. The code follows the [tutorial](https://neptune.ai/blog/how-to-train-your-own-object-detector-using-tensorflow-object-detection-api) by Anton Morgunov at Neptune.ai. Earlier notebooks used only part of the model configs; here we want to use much more of the built-in training, validation, and testing features.

Make sure the runtime type is set to GPU with standard RAM.

In [None]:
# Connect colab to Google Drive
from google.colab import drive
drive.mount('/content/drive')

# **Part 1:** Train the Model

## **Step 1**: Installation


In [None]:
# !pip install tensorflow
import tensorflow as tf
print(tf.__version__)

In [None]:
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Select the Runtime > "Change runtime type" menu to enable a GPU accelerator, ')
  print('and then re-execute this cell.')
else:
  print(gpu_info)

Install the TensorFlow 2 [Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection):

In [None]:
# delete any existing models directory (comment out the next line to keep it)
!rm -rf ./models/

# clone the TensorFlow Model Garden
!git clone --depth 1 https://github.com/tensorflow/models/

In [None]:
# install the Object Detection API
!cd models/research/ && protoc object_detection/protos/*.proto --python_out=. && cp object_detection/packages/tf2/setup.py . && python -m pip install .

In [None]:
# Testing the installation of the object detection API
# !python models/research/object_detection/builders/model_builder_tf2_test.py

In [None]:
# Installing the COCO API:
# !pip install cython

In [None]:
# Cloning COCO API
!git clone https://github.com/cocodataset/cocoapi.git

In [None]:
# Copying the python tools into the research folder
%cp -r cocoapi/PythonAPI/pycocotools ./models/research/

## **Step 2**: Import Packages

Import the packages used in the rest of this notebook.

In [None]:
# !pip install dfply

In [None]:
import matplotlib
import matplotlib.pyplot as plt

import os
import random
import zipfile
import io
import scipy.misc
import numpy as np
import pandas as pd

import glob
import imageio
# from dfply import *
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage

try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass

# import tensorflow as tf
tf.get_logger().setLevel('ERROR')

### **Step 2.1**: Import Object Detection API packages

In [None]:
# import the label map utility module
from object_detection.utils import label_map_util

# import module for reading and updating configuration files.
from object_detection.utils import config_util

# import module for visualization. use the alias `viz_utils`
from object_detection.utils import visualization_utils as viz_utils

# import module for building the detection model
from object_detection.builders import model_builder

# import module for utilities in Colab
from object_detection.utils import colab_utils

## **Step 3**: Setting Up Workspace and Adding Data

In [None]:
!ls

In [None]:
# Making workspace and data directories
%mkdir workspace
%mkdir workspace/data

In [None]:
# Copying data TFRecords from Google Drive
%cp /content/drive/MyDrive/Dissertation/TFRecords/test.record workspace/data
%cp /content/drive/MyDrive/Dissertation/TFRecords/val.record workspace/data
%cp /content/drive/MyDrive/Dissertation/TFRecords/train.record workspace/data

In [None]:
# Copy the label map to the data folder
%cp /content/drive/MyDrive/Dissertation/labels/label_map.pbtxt workspace/data

## **Step 4**: Downloading the Pre-Trained Models
Copy the link for the model you want from the TensorFlow 2 Detection [Model Zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md).

In [None]:
# Making a directory for the pre-trained models
%mkdir workspace/pre_trained_models

In [None]:
# Change to the pre_trained_models directory
%cd workspace/pre_trained_models

In [None]:
# Paste the link to the desired model here: ex. RetinaNet (SSD + ResNet50)
!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz
    
# untar (decompress) the tar file
!tar -xf ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz

# copy the checkpoint to the test_data folder models/research/object_detection/test_data/
# !mv ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint models/research/object_detection/test_data/

In [None]:
# Download the Faster R-CNN V1 Resnet 50, 640x640 checkpoint
!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.tar.gz
    
# untar (decompress) the tar file
!tar -xf faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.tar.gz

In [None]:
# Download the CenterNet HourGlass104 512 x 512 model - highest mAP of the small image models 
!wget http://download.tensorflow.org/models/object_detection/tf2/20200713/centernet_hg104_512x512_coco17_tpu-8.tar.gz
    
# untar (decompress) the tar file
!tar -xf centernet_hg104_512x512_coco17_tpu-8.tar.gz

In [None]:
# Download the EfficientDet D1 640x640 model - higher mAP than RetinaNet, faster than CenterNet 
!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d1_coco17_tpu-32.tar.gz
    
# untar (decompress) the tar file
!tar -xf efficientdet_d1_coco17_tpu-32.tar.gz

## **Step 5**: Creating Directories for Customised Models

Instead of setting up a folder here, I've made new folders in my Google Drive and added the pipeline config that I worked on before at ```/content/drive/MyDrive/Dissertation/models_workspace/eff_det/v1/eff_det_d1_pipeline.config```. 

Update the config:
1. **num_classes**: 9
2. **batch_size** (train_config): 10
3. **batch_size** (eval_config): 1
4. **fine_tune_checkpoint**: path to downloaded checkpoint
5. **fine_tune_checkpoint_type**: "detection"
6. **use_bfloat16**: false
7. **label_map_path**: path to label_map.pbtxt (both in train_input_reader and eval_input_reader)
8. (train_input_reader) **input_path**: path to train.record 
9. (eval_input_reader) **input_path**: path to val.record
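
Some of the edits listed above can also be applied as plain text substitutions. The block below is a minimal sketch under stated assumptions: `config_text` is a toy fragment rather than the real `eff_det_d1_pipeline.config`, and `set_config_field` is a hypothetical helper, not part of the Object Detection API.

```python
import re

# Toy fragment standing in for a real pipeline config (assumption).
config_text = """model {
  ssd {
    num_classes: 90
  }
}
train_config {
  batch_size: 128
  fine_tune_checkpoint_type: "classification"
  use_bfloat16: true
}
"""

def set_config_field(text, field, value):
    """Replace the value on every `field: ...` line of a config string."""
    pattern = re.compile(r"^([ \t]*" + re.escape(field) + r":[ \t]*).*$", re.MULTILINE)
    # A function replacement avoids backslash escapes in `value` being interpreted.
    return pattern.sub(lambda m: m.group(1) + value, text)

updated = set_config_field(config_text, "num_classes", "9")
updated = set_config_field(updated, "fine_tune_checkpoint_type", '"detection"')
updated = set_config_field(updated, "use_bfloat16", "false")
print(updated)
```

Because the helper rewrites every matching line, it is not suitable for fields such as `batch_size`, `label_map_path`, or `input_path`, which appear in both the training and evaluation blocks with different values; those are safer to edit by hand.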

## **Step 8**: Evaluate the Model on Validation Data

In [None]:
%cd /content/

In [None]:
# Evaluate the model on validation data
!python /content/models/research/object_detection/model_main_tf2.py \
  --pipeline_config_path=/content/drive/MyDrive/Dissertation/models_workspace/eff_det/v3/eff_det_d1_pipeline_5.config \
  --model_dir=/content/drive/MyDrive/Dissertation/models_workspace/eff_det/v3 \
  --checkpoint_dir=/content/drive/MyDrive/Dissertation/models_workspace/eff_det/v3

In [None]:
%load_ext tensorboard

In [None]:
%tensorboard --logdir=/content/drive/MyDrive/Dissertation/models_workspace/eff_det

In [None]:
!tensorboard dev upload \
  --logdir /content/drive/MyDrive/Dissertation/models_workspace/eff_det/v3 \
  --name "EfficientDet D1 - Config 4 - 100 Epochs" \
  --description "Training and Validation Data for Oddy Tests" \
  --one_shot

In [None]:
# Test data evaluation
!python /content/models/research/object_detection/model_main_tf2.py \
  --pipeline_config_path=/content/drive/MyDrive/Dissertation/models_workspace/eff_det/v1/eff_det_d1_pipeline_v1_test.config \
  --model_dir=/content/drive/MyDrive/Dissertation/models_workspace/eff_det/v1 \
  --checkpoint_dir=/content/drive/MyDrive/Dissertation/models_workspace/eff_det/v1

In [None]:
%tensorboard --logdir=/content/drive/MyDrive/Dissertation/models_workspace/eff_det/v1/eval

In [None]:
# Validation check on CenterNet
!python /content/models/research/object_detection/model_main_tf2.py \
  --pipeline_config_path=/content/drive/MyDrive/Dissertation/saved_models/CenterNet-20E-Aug11/centernet/pipeline_test.config \
  --model_dir=/content/drive/MyDrive/Dissertation/saved_models/CenterNet-20E-Aug11/centernet \
  --checkpoint_dir=/content/drive/MyDrive/Dissertation/saved_models/CenterNet-20E-Aug11/centernet/checkpoint

In [None]:
# Validation check on RetinaNet
!python /content/models/research/object_detection/model_main_tf2.py \
  --pipeline_config_path=/content/drive/MyDrive/Dissertation/saved_models/RetinaNet-50E-Aug11/ssd/pipeline.config \
  --model_dir=/content/drive/MyDrive/Dissertation/saved_models/RetinaNet-50E-Aug11/ssd \
  --checkpoint_dir=/content/drive/MyDrive/Dissertation/saved_models/RetinaNet-50E-Aug11/ssd/checkpoint

In [None]:
# Validation data for final checkpoint of EfficientDet
!python /content/models/research/object_detection/model_main_tf2.py \
  --pipeline_config_path=/content/drive/MyDrive/Dissertation/saved_models/EfficientDet-D1-V3-Aug15/pipeline.config \
  --model_dir=/content/drive/MyDrive/Dissertation/saved_models/EfficientDet-D1-V3-Aug15 \
  --checkpoint_dir=/content/drive/MyDrive/Dissertation/saved_models/EfficientDet-D1-V3-Aug15/checkpoint

## **Step 7**: Tensorboard to Visualise the Results

In [None]:
# If you need to move any files
# !mv /content/workspace/models/faster_rcnn/v1/ckpt* /content/workspace/models/ssd/v1

In [None]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

In [None]:
# %tensorboard --logdir=/content/workspace/models/centernet/v1
%tensorboard --logdir=/content/workspace/models/eff_det/v1

## **Step 9**: Export Model

In [None]:
# Copying the config only
%cp -r /content/workspace/models/eff_det/v1/pipeline.config /content/drive/MyDrive/Dissertation/configs

In [None]:
# Copy the exporter script into the workspace
%cp -r /content/models/research/object_detection/exporter_main_v2.py /content/workspace

In [None]:
# Make new directories for the trained models
%mkdir /content/workspace/exported_models
%mkdir /content/workspace/exported_models/centernet

In [None]:
%cd workspace

In [None]:
# Run the Python exporter script
!python exporter_main_v2.py \
  --pipeline_config_path=/content/workspace/models/centernet/v1/pipeline.config \
  --trained_checkpoint_dir=/content/workspace/models/centernet/v1 \
  --output_directory=/content/workspace/exported_models/centernet/ \
  --input_type=image_tensor

In [None]:
# Save exported model in Google Drive
%cp -r /content/workspace/exported_models/centernet /content/drive/MyDrive/Dissertation/saved_models/CenterNet-20E-Aug11

# **Part 2**: Evaluate the Model

## **Step 2.1**: Import and Clean Label Data
The labeller MakeSense.ai outputs boxes as (xmin, ymin, xdiff, ydiff), where xdiff and ydiff are the differences between the maximum and minimum coordinates (the box width and height), so the xmax and ymax columns have to be derived.
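
As an illustration of that conversion, here is a small sketch in plain Python so it runs without the label CSV. The rows and the helper name `add_max_corners` are hypothetical; only the (xmin, ymin, xdiff, ydiff) layout comes from the description above.

```python
# Hypothetical rows in the (xmin, ymin, xdiff, ydiff) layout described above.
rows = [
    {"filename": "a.jpg", "xmin": 10, "ymin": 20, "xdiff": 30, "ydiff": 40},
    {"filename": "b.jpg", "xmin": 0, "ymin": 5, "xdiff": 100, "ydiff": 50},
]

def add_max_corners(row):
    """Return a copy of `row` with xmax and ymax added."""
    out = dict(row)
    out["xmax"] = row["xmin"] + row["xdiff"]  # right edge = left edge + width
    out["ymax"] = row["ymin"] + row["ydiff"]  # bottom edge = top edge + height
    return out

converted = [add_max_corners(r) for r in rows]
for r in converted:
    print(r["filename"], r["xmin"], r["ymin"], r["xmax"], r["ymax"])
```

The same arithmetic applies column-wise on a pandas DataFrame, provided the CSV uses these column names.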

In [None]:
# Load in the csv from the labels folder in drive
label_df = pd.read_csv('/content/drive/MyDrive/Dissertation/labels/Fulldata_Aug12.csv')
# For the numpys, we need the un-resized data
# label_df = pd.read_csv('/content/drive/MyDrive/Dissertation/labels/Fulldata_Aug2.csv')

In [None]:
label_df

### **Step 2.1.1**: Getting Integer Class Values
We need to make a column with the mapped integer values for the classes.

In [None]:
# Load label map from file
# Function found here: https://github.com/tensorflow/models/blob/master/research/object_detection/utils/label_map_util.py
label_map = label_map_util.load_labelmap('/content/drive/MyDrive/Dissertation/labels/StringIntLabelMap.pbtxt')

In [None]:
# Convert to dictionary
label_dict = label_map_util.get_label_map_dict(label_map,use_display_name=True)
label_dict

In [None]:
# Map the label dictionary to a column to populate the corresponding class integer values
# https://kanoki.org/2019/04/06/pandas-map-dictionary-values-with-dataframe-columns/
label_df['classInt'] = label_df['class'].map(label_dict)

### **Step 2.1.2**: Define the category index dictionary + NumClasses


In [None]:
# define a dictionary describing the corrosion classes
category_index = {
    1 : {
        'id'  : 1, 
        'name': 'Ag-P'
    },
    2 : {
        'id'  : 2,
        'name': 'Ag-T'
    },
    3 : {
        'id'  : 3,
        'name': 'Ag-U'
    },
    4 : {
        'id'  : 4,
        'name': 'Cu-P'
    },
    5 : {
        'id'  : 5,
        'name': 'Cu-T'
    },
    6 : {
        'id'  : 6,
        'name': 'Cu-U'
    },
    7 : {
        'id'  : 7,
        'name': 'Pb-P'
    },
    8 : {
        'id'  : 8,
        'name': 'Pb-T'
    },
    9 : {
        'id'  : 9,
        'name': 'Pb-U'
    }
}

In [None]:
# Specify the number of classes that the model will predict
num_classes = 9

## **Step 2.2**: Configure the model and load checkpoint

### **Step 2.2.1**: Read in the configuration file and build model


In [None]:
# Clears old models
# tf.keras.backend.clear_session()

# define the path to the .config file
pipeline_config = '/content/workspace/models/eff_det/v1/pipeline.config'
# Load the configuration file into a dictionary
configs = config_util.get_configs_from_pipeline_file(pipeline_config)

In [None]:
# Read in the object stored at the key 'model' of the configs dictionary
model_config = configs['model']

In [None]:
# Use the model_builder build function from the config above
detection_model = model_builder.build(model_config = model_config, is_training = False)

In [None]:
print(type(detection_model))
# Expected: <class 'object_detection.meta_architectures.ssd_meta_arch.SSDMetaArch'>

In [None]:
# Run this to check the type of detection_model
# detection_model

In [None]:
# check the class variables that are in detection_model
# vars(detection_model)

### **Step 2.2.2**: Restore the checkpoint

- checkpoint_path: the checkpoint prefix, here `/content/workspace/models/eff_det/v1/ckpt-4`. **IMPORTANT**: Do not include the `.index` extension of the checkpoint file in the path, as this will cause errors later.
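
A TensorFlow checkpoint is stored as several files that share one prefix (for example `ckpt-4.index` and `ckpt-4.data-00000-of-00001`), and the restore call expects the prefix. The helper below is a hypothetical convenience, not part of TensorFlow, that strips either suffix from a path copied out of a file browser.

```python
def checkpoint_prefix(path):
    """Return the checkpoint prefix for a path that may end in a checkpoint file suffix."""
    if path.endswith(".index"):
        return path[: -len(".index")]
    # Data shards look like 'ckpt-4.data-00000-of-00001'.
    head, sep, _ = path.rpartition(".data-")
    return head if sep else path

print(checkpoint_prefix("/content/workspace/models/eff_det/v1/ckpt-4.index"))
```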

In [None]:
checkpoint_path = '/content/workspace/models/eff_det/v1/ckpt-4'
# checkpoint_path = '/content/workspace/models/ssd/v1/ckpt-10'

# Define a checkpoint
checkpoint = tf.compat.v2.train.Checkpoint(model=detection_model)

# Restore the checkpoint to the checkpoint path
checkpoint.restore(checkpoint_path).expect_partial()

### **Step 2.2.3**: Run a dummy image to generate the model variables

Run a dummy image through the model so that variables are created.

In [None]:
# use the detection model's `preprocess()` method and pass a dummy image
tmp_image, tmp_shapes = detection_model.preprocess(tf.zeros([1, 640, 640, 3]))

# run a prediction with the preprocessed image and shapes
tmp_prediction_dict = detection_model.predict(tmp_image, tmp_shapes)

# postprocess the predictions into final detections
tmp_detections = detection_model.postprocess(tmp_prediction_dict, tmp_shapes)

print('Weights restored!')

In [None]:
# Test Code:
assert len(detection_model.trainable_variables) > 0, "Please pass in a dummy image to create the trainable variables."

print(detection_model.weights[0].shape)
print(detection_model.weights[231].shape)
print(detection_model.weights[462].shape)

## **Step 2.3**: Defining Functions

This section defines utility functions for loading images, preparing labels, and plotting detections.

### **Function 1**: `plot_detections`

In [None]:
def plot_detections(image_np,
                    boxes,
                    classes,
                    scores,
                    category_index,
                    figsize=(12, 16),
                    image_name=None):
    """Wrapper function to visualize detections.

    Args:
    image_np: uint8 numpy array with shape (img_height, img_width, 3)
    boxes: a numpy array of shape [N, 4]
    classes: a numpy array of shape [N]. Note that class indices are 1-based,
          and match the keys in the label map.
    scores: a numpy array of shape [N] or None.  If scores=None, then
          this function assumes that the boxes to be plotted are groundtruth
          boxes and plot all boxes as black with no classes or scores.
    category_index: a dict containing category dictionaries (each holding
          category index `id` and category name `name`) keyed by category indices.
    figsize: size for the figure.
    image_name: a name for the image file.
    """
    
    image_np_with_annotations = image_np.copy()
    
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_annotations,
        boxes,
        classes,
        scores,
        category_index,
        use_normalized_coordinates=True,
        max_boxes_to_draw=7,
        min_score_thresh=0,
        line_thickness = 10)
    
    if image_name:
        plt.imsave(image_name, image_np_with_annotations)
    
    else:
        plt.imshow(image_np_with_annotations)

### **Function 2**: `load_npy_set`

Load the image arrays (saved as `.npy` files) and their file names from a Google Drive folder. The images are quite large, so loading them takes a while.


In [None]:
# A FUNCTION FOR LOADING IMAGES
def load_npy_set(npy_dir):
    """Load a folder of numpy arrays corresponding to images.
    Args: npy_dir - a path to a folder of training, validation, or test arrays.
    Returns: images_np - a list of the images as numpy arrays
             files - the list of file names, in the same order as images_np
    """
    # Getting list of npy files
    files = os.listdir(npy_dir)
    # Starting an empty list for the npy arrays
    images_np = []
    # For loop to add each file (npy array) to the image list
    for idx, file in enumerate(files):
      npy_path = os.path.join(npy_dir,file)
      test_img = np.load(npy_path)
      images_np.append(test_img)
      if idx % 10 == 0:
        print('Loading',str(idx),':',file)
    # When finished, print message and return 
    print('Done Loading!')
    return images_np, files

### **Function 3**: `box_lister`
Converting the box coordinates and class labels into a list of numpy arrays. These can be visualised on top of the images and further converted into tensors. 
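
The normalisation step inside this function can be sketched on its own. This is a plain-Python illustration with made-up numbers; `normalize_boxes` is a hypothetical name and not the function defined in this notebook.

```python
def normalize_boxes(boxes_px, height, width):
    """Scale pixel boxes in (ymin, xmin, ymax, xmax) order to the [0, 1] range."""
    return [
        [ymin / height, xmin / width, ymax / height, xmax / width]
        for ymin, xmin, ymax, xmax in boxes_px
    ]

# One box on a hypothetical 600 x 400 (width x height) image.
boxes = normalize_boxes([[100, 300, 200, 600]], height=400, width=600)
print(boxes)
```

Note that y values are divided by the image height and x values by the width, which matches the (ymin, xmin, ymax, xmax) order the visualisation utilities expect when `use_normalized_coordinates=True`.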

In [None]:
def box_lister(files):
    # Define a list of ground truth boxes
    gt_boxes = []
    # Define a list of class integers
    classes = []
    # For loop to iterate over the file names
    for file in files:
      # Need to change 'npy' extension to 'jpg'
      file = file.replace('npy','jpg')
      # A smaller dataframe to hold the labels for that particular image 
      image_labels = label_df[label_df['filename']==file]
      # Adding error message for if an image doesn't have any labels
      if len(image_labels) == 0:
        print('Error: file ' + file + ' has no corresponding labels')
        continue
      # Image height
      height = np.unique(image_labels['height'].to_numpy())[0] 
      # Image width
      width = np.unique(image_labels['width'].to_numpy())[0] 
      # Box array: (ymin, xmin, ymax, xmax)
      box_arr = image_labels[['ymin','xmin','ymax','xmax']].to_numpy()
      # Normalizing boxes by width and height
      box_arr = np.divide(box_arr, [height,width,height,width])
      # Appending new array to box list
      gt_boxes.append(box_arr)
      # Getting the class integers as an array and adding to list
      classes.append(image_labels['classInt'].to_numpy())
    return gt_boxes, classes

### **Function 4:** `data_preprocess`
Need some data preprocessing so it is formatted properly for the model:
- Convert the class labels to one-hot representations
- Convert everything (i.e. train images, gt boxes and class labels) to tensors.
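
To make the one-hot step concrete, here is a plain-Python sketch of the same idea, assuming 1-based class ids as in the label map. The function name `one_hot` is illustrative; the notebook itself uses `tf.one_hot`.

```python
def one_hot(class_ids, num_classes, label_id_offset=1):
    """One-hot encode 1-based class ids after shifting them to start at zero."""
    encoded = []
    for class_id in class_ids:
        row = [0.0] * num_classes
        row[class_id - label_id_offset] = 1.0
        encoded.append(row)
    return encoded

# Class 1 maps to index 0 and class 9 maps to index 8.
for row in one_hot([1, 9], num_classes=9):
    print(row)
```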

In [None]:
def data_preprocess(train_images_np, gt_boxes, classes):
    # The label_id_offset to shift classes to the zeroth index.
    label_id_offset = 1
    # List for image tensors
    train_image_tensors = []
    # lists containing the one-hot encoded classes and ground truth boxes
    gt_classes_one_hot_tensors = []
    gt_box_tensors = []
    # Loop to convert the image numpy arrays, box coordinates, and classes
    for (train_image_np, gt_box_np, class_np) in zip(train_images_np, gt_boxes, classes):
        # convert training image to tensor, add batch dimension, and add to list
        train_image_tensors.append(tf.expand_dims(tf.convert_to_tensor(train_image_np, dtype=tf.float32), axis=0))
        # convert numpy array to tensor, then add to list
        gt_box_tensors.append(tf.convert_to_tensor(gt_box_np, dtype=tf.float32))
        # apply offset to have zero-indexed ground truth classes
        zero_indexed_groundtruth_classes = tf.convert_to_tensor(class_np - label_id_offset)
        # do one-hot encoding to ground truth classes
        gt_classes_one_hot_tensors.append(tf.one_hot(zero_indexed_groundtruth_classes, num_classes))
    print('Done prepping data.')
    return train_image_tensors, gt_box_tensors, gt_classes_one_hot_tensors

### **Function 5**: `whole_image_prep`
This function wraps functions 2 - 4 in a single call. It loads the images, boxes, and classes and returns them in several formats (numpy arrays and tensors).

In [None]:
def whole_image_prep(npy_dir):
  print('Starting Image Loading:')
  # Loading the image numpy arrays and their file names into lists
  images_np, files = load_npy_set(npy_dir)
  print('Starting box coordinate and class lists:')
  # Converting csv box coordinates and classes into numpy arrays and lists
  gt_boxes, classes = box_lister(files)
  print('Converting images, boxes, and classes to tensors:')
  # Preprocessing images, boxes, and classes into (one hot) tensors 
  image_T, gt_box_T, gt_classes_OHT = data_preprocess(images_np, gt_boxes, classes)
  return images_np, files, gt_boxes, classes, image_T, gt_box_T, gt_classes_OHT

### **Function 6**: `plot_image_sample`

In [None]:
def plot_image_sample(images_np,gt_boxes,classes):
    ''' Function to plot eight images to double-check box placements, etc. 
    '''
    %matplotlib inline
    # define the figure size
    plt.figure(figsize=(15, 7))
    # using the plot_detections function to draw the ground truth boxes
    for idx in range(8):
        plt.subplot(2, 4, idx+1)
        plot_detections(
          images_np[idx],
          gt_boxes[idx],
          classes[idx],
          np.ones(classes[idx].shape), # scores set to 1
          category_index = category_index,
        )
    plt.show()

## **Step 2.4:** Loading and Testing Data

In [None]:
# LOADING VALIDATION SET
val_dir = '/content/drive/MyDrive/Dissertation/new_val_npy'
val_images_np, val_files, val_gt_boxes, val_classes, val_image_T, val_gt_box_T, val_gt_classes_OHT = whole_image_prep(val_dir)
# 191 images in 3m 7s, then 4m 49s, then 5m 6s
# 220 images at 960 x 640 in 2m 12s

In [None]:
'''
# LOADING TEST DATA SET
test_npy_dir = '/content/drive/MyDrive/Dissertation/move_to_test'
# Use the function whole_image_prep to load the test set as numpy arrays and tensors
test_images_np, test_files, test_gt_boxes, test_classes, test_image_T, test_gt_box_T, test_gt_classes_OHT = whole_image_prep(test_npy_dir)
# 190 test data only took 2m 55s to load!
# 960 x 640: 380 images took 3m 41s to load and process!
'''

In [None]:
#plot_image_sample(test_images_np,test_gt_boxes,test_classes)

## **Step 2.5**: Process a test image

Define a function that returns the detection boxes, classes, and scores.

In [None]:
# Comment out this decorator if you want to run inference eagerly
@tf.function
def detect(input_tensor):
    """Run detection on an input image.

    Args:
    input_tensor: A [1, height, width, 3] Tensor of type tf.float32.
      Note that height and width can be anything since the image will be
      immediately resized according to the needs of the model within this
      function.

    Returns:
    A dict containing 3 Tensors (`detection_boxes`, `detection_classes`,
      and `detection_scores`).
    """
    preprocessed_image, shapes = detection_model.preprocess(input_tensor)
    prediction_dict = detection_model.predict(preprocessed_image, shapes)
    # use the detection model's postprocess() method to get the final detections
    detections = detection_model.postprocess(prediction_dict, shapes)
    
    return detections

You can now loop through the validation images and get the detection scores and bounding boxes to overlay on the original image. The first box and score for each image are saved in a `results` dictionary.

In [None]:
%matplotlib inline

label_id_offset = 1
results = {'boxes': [], 'scores': []}

# Need to adjust this loop to get better results dictionaries??

for i in range(8): # len(val_images_np)
    input_tensor = val_image_T[i]
    detections = detect(input_tensor)
    plt.subplot(2, 4, i+1)
    plot_detections(
      val_images_np[i],
      detections['detection_boxes'][0].numpy()[0:6],
      detections['detection_classes'][0].numpy()[0:6].astype(np.uint32) + label_id_offset,
      detections['detection_scores'][0].numpy()[0:6],
      category_index, 
      figsize=(30, 40)
      )
    results['boxes'].append(detections['detection_boxes'][0][0].numpy())
    results['scores'].append(detections['detection_scores'][0][0].numpy())

In [None]:
detections['detection_scores']

## **Step 2.6**: Turning predictions into TXT files for mAP calculations
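
Each prediction file written in this step holds one line per box in the order `class score xmin ymin xmax ymax`, with coordinates scaled back to pixels. The sketch below shows that formatting in plain Python; `detection_lines` and the sample values are hypothetical and only illustrate the layout.

```python
def detection_lines(boxes_norm, class_names, scores, height, width):
    """Build '<class> <score> <xmin> <ymin> <xmax> <ymax>' lines in pixel units."""
    lines = []
    # Model boxes are normalised and ordered (ymin, xmin, ymax, xmax).
    for (ymin, xmin, ymax, xmax), name, score in zip(boxes_norm, class_names, scores):
        lines.append(
            f"{name} {score} {xmin * width} {ymin * height} {xmax * width} {ymax * height}"
        )
    return lines

# One made-up detection on a 6000 x 4000 (width x height) image.
lines = detection_lines([[0.25, 0.5, 0.5, 1.0]], ["Cu-T"], [0.9], height=4000, width=6000)
print("\n".join(lines))
```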

In [None]:
def pred_txt_lister(npy_dir):
    # Getting list of npy files
    files = os.listdir(npy_dir)

    # For loop to iterate over the file names
    for i, file in enumerate(files):
      # Run the model on that image
      detection_test = detect(val_image_T[i])
      # Find the non-zero predicted boxes for this image
      pred_boxes = detection_test['detection_boxes'][0].numpy() 
      pred_boxes = pred_boxes[~np.all(pred_boxes == 0, axis=1)]

      # Need to convert boxes from [0,1] to the proper image scale, i.e. [6000,4000]
      # Getting height and width for the file:
      file = file.replace('npy','jpg')
      image_labels = label_df[label_df['filename']==file]
      height = np.unique(image_labels['height'].to_numpy())[0] 
      width = np.unique(image_labels['width'].to_numpy())[0] 
      # Multiplying boxes by width and height
      pred_boxes = np.multiply(pred_boxes, [height,width,height,width])

      # Finding the number of non-zero boxes
      num_boxes = pred_boxes.shape[0]
      # Getting the predicted classes
      class_array = detection_test['detection_classes'][0].numpy().astype('int')
      # Adding one to the class integers so they start at 1
      class_array = class_array[0:num_boxes] + 1
      # Getting the scores for this image
      scores_array = detection_test['detection_scores'][0].numpy()[0:num_boxes]

      # Adding all the elements to a dataframe and rearranging columns
      box_df = pd.DataFrame(pred_boxes, columns = ['ymin','xmin','ymax','xmax'])
      box_df['score'] = scores_array
      box_df['class'] = class_array
      box_df = box_df[['class','score','xmin','ymin','xmax','ymax']]

      # Creating a dictionary to map class integers to strings
      reverse_dict = {1:'Ag-P', 2:'Ag-T', 3: 'Ag-U', 4: 'Cu-P', 5: 'Cu-T', 
                      6: 'Cu-U', 7: 'Pb-P', 8: 'Pb-T', 9: 'Pb-U'}
      box_df['class'] = box_df['class'].map(reverse_dict)

      # Getting new path for txt file
      txt_dir = "/content/drive/MyDrive/Dissertation/input/detection-results-final-ckpt"
      file = file.replace('jpg','txt')
      txt_path = os.path.join(txt_dir,file)
      # Saving labels as a txt file
      np.savetxt(txt_path, box_df, fmt = "%s")
      print('Saved file: '+file)

In [None]:
%cd /content

In [None]:
npy_dir = '/content/drive/MyDrive/Dissertation/new_val_npy'
pred_txt_lister(npy_dir)

## **Step 2.7**: Saving ground truth TXT files for the images

In [None]:
def txt_lister(npy_dir):
    # Getting list of npy files
    files = os.listdir(npy_dir)
    # For loop to iterate over the file names
    for file in files:
      # Need to change 'npy' extension to 'jpg'
      file = file.replace('npy','jpg')
      # A smaller dataframe to hold the labels for that particular image 
      image_labels = label_df[label_df['filename']==file][['class','xmin','ymin','xmax','ymax']]
      image_labels = image_labels.to_numpy()
      # Adding error message if an image doesn't have any labels
      if len(image_labels) == 0:
        print('Error: file ' + file + ' has no corresponding labels')
        continue
      # Getting new path for txt file
      txt_dir = "/content/drive/MyDrive/Dissertation/input/ground-truth"
      file = file.replace('jpg','txt')
      txt_path = os.path.join(txt_dir,file)
      # Saving labels as a txt file
      np.savetxt(txt_path, image_labels, fmt = "%s")
      print('Saved file: '+file)

In [None]:
npy_dir = '/content/drive/MyDrive/Dissertation/new_val_npy'
# files = os.listdir(npy_dir)
txt_lister(npy_dir)

## **Step 2.8**: Calculating mAP

The code below, along with background on mAP, comes from this Roboflow [blog](https://blog.roboflow.com/mean-average-precision/) and [Colab notebook](https://colab.research.google.com/drive/1pLvZpz0_Ob0yOQ7hxPhVRT04Cb3FGARb#scrollTo=-78frQ4211c8). The mAP Python script is from this GitHub [repo](https://github.com/Cartucho/mAP).
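
mAP scoring matches each predicted box to a ground-truth box using intersection over union (IoU), and a prediction counts as correct only when the IoU reaches a chosen threshold (commonly 0.5). The function below is a minimal sketch of IoU for boxes in the `xmin ymin xmax ymax` order used in the TXT files. It is an illustration, not the code from the mAP repo, which may differ in details such as pixel rounding.

```python
def iou(box_a, box_b):
    """Intersection over union for two boxes in (xmin, ymin, xmax, ymax) order."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give no intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

In the usage line, the two boxes overlap over half of each box's area, so the intersection is 50 and the union is 150.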

In [None]:
%cd /content/
# Cloning from github repo with code
!git clone https://github.com/Cartucho/mAP

In [None]:
# Need to replace the class_list.txt list
%cd /content/
%cp /content/drive/MyDrive/Dissertation/labels/class_list.txt mAP/scripts/extra/class_list.txt

In [None]:
# The folder comes with subfolders for ground-truth & detection-results with txt files
# so we need to remove them
%rm -rf mAP/input/ground-truth/
%mkdir mAP/input/ground-truth/
%rm -rf mAP/input/detection-results/
%mkdir mAP/input/detection-results/
# Also removing the optional images folder
%rm -rf mAP/input/images-optional/

In [None]:
# Copying our own txt files into the new folder
%cp /content/drive/MyDrive/Dissertation/input/ground-truth/*txt mAP/input/ground-truth/
%cp /content/drive/MyDrive/Dissertation/input/detection-results-final-ckpt/*txt mAP/input/detection-results/

In [None]:
# If you need to copy over new detection-results txt files
%cd /content/
%rm -rf mAP/input/detection-results/
%mkdir mAP/input/detection-results/
%cp /content/drive/MyDrive/Dissertation/input/detection-results-ckpt-test/*txt mAP/input/detection-results/

In [None]:
# Go into mAP directory
%cd mAP/

In [None]:
# Run the main python script to get the mAP values
!python main.py -na