<a href="https://colab.research.google.com/github/sohiniroych/AI_with_Sohini_Notebooks/blob/main/IJCNN_Unsupervised_classification_frames.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### More models
[This](https://tfhub.dev/tensorflow/collections/object_detection/1) collection contains TF 2 object detection models that have been trained on the COCO 2017 dataset. [Here](https://tfhub.dev/s?module-type=image-object-detection) you can find all object detection models that are currently hosted on [tfhub.dev](https://tfhub.dev/).

## Imports and Setup

Let's start with the base imports.

In [None]:
# This Colab requires TF 2.5 or later.
# The quotes stop the shell from treating ">=2.5" as an output redirection.
!pip install -U "tensorflow>=2.5"

In [None]:
import io
import os
import pathlib
import time
from io import BytesIO
from urllib.request import urlopen

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image, ImageDraw, ImageFont

import tensorflow as tf
import tensorflow_hub as hub

tf.get_logger().setLevel('ERROR')

## Utilities

Run the following cell to define the utilities that will be needed later:

- Helper method to load an image
- List of tuples with Human Keypoints for the COCO 2017 dataset. This is needed for models with keypoints.

In [None]:
# @title Run this!!

def load_image_into_numpy_array(path):
  """Load an image from a file or URL into a numpy array.

  Puts the image into a numpy array to feed into the TensorFlow graph.
  By convention the array has shape (1, height, width, channels): a leading
  batch dimension of 1, and channels=3 for RGB.

  Args:
    path: the file path or http(s) URL of the image

  Returns:
    uint8 numpy array with shape (1, img_height, img_width, 3)
  """
  if path.startswith('http'):
    image_data = urlopen(path).read()
  else:
    with tf.io.gfile.GFile(path, 'rb') as f:
      image_data = f.read()
  # Convert to RGB so grayscale or RGBA images still yield 3 channels.
  image = Image.open(BytesIO(image_data)).convert('RGB')

  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (1, im_height, im_width, 3)).astype(np.uint8)

COCO17_HUMAN_POSE_KEYPOINTS = [(0, 1),
 (0, 2),
 (1, 3),
 (2, 4),
 (0, 5),
 (0, 6),
 (5, 7),
 (7, 9),
 (6, 8),
 (8, 10),
 (5, 6),
 (5, 11),
 (6, 12),
 (11, 12),
 (11, 13),
 (13, 15),
 (12, 14),
 (14, 16)]


## Visualization tools

To visualize the images with the proper detected boxes, keypoints and segmentation, we will use the TensorFlow Object Detection API. To install it we will clone the repo.

In [None]:
# Clone the tensorflow models repository
!git clone --depth 1 https://github.com/tensorflow/models/

Cloning into 'models'...
remote: Enumerating objects: 3157, done.
remote: Counting objects: 100% (3157/3157), done.
remote: Compressing objects: 100% (2490/2490), done.
remote: Total 3157 (delta 836), reused 1498 (delta 623), pack-reused 0
Receiving objects: 100% (3157/3157), 33.37 MiB | 13.56 MiB/s, done.
Resolving deltas: 100% (836/836), done.


Installing the Object Detection API

In [None]:
%%bash
sudo apt install -y protobuf-compiler
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .



Now we can import the dependencies we will need later.

In [None]:
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.utils import ops as utils_ops

%matplotlib inline

### Load label map data (for plotting).

Label maps map index numbers to category names, so that when our convolutional network predicts `5`, we know that this corresponds to `airplane`. Here we use internal utility functions, but anything that returns a dictionary mapping integers to appropriate string labels would be fine.

For simplicity, we load the label map from the same repository that we cloned for the Object Detection API code.
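
As a minimal sketch of the dictionary shape described above, a hand-written index for the first eight COCO IDs could look like the following. The names `COCO_NAMES`, `manual_category_index` and `class_name` are hypothetical and exist only for this illustration; the entries follow the `{'id': ..., 'name': ...}` layout of a category index.

```python
# Hypothetical hand-built stand-in for the label map (COCO IDs 1-8 only).
COCO_NAMES = {1: 'person', 2: 'bicycle', 3: 'car', 4: 'motorcycle',
              5: 'airplane', 6: 'bus', 7: 'train', 8: 'truck'}

# Same layout as a category index: class ID -> {'id': ..., 'name': ...}.
manual_category_index = {
    class_id: {'id': class_id, 'name': name}
    for class_id, name in COCO_NAMES.items()
}

def class_name(class_id, index=manual_category_index):
    """Return the display name for a class ID, or 'unknown' if it is absent."""
    return index.get(class_id, {}).get('name', 'unknown')

print(class_name(5))
```

Any mapping of this shape should work for plotting, as long as its keys match the class IDs the model emits.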

In [None]:
PATH_TO_LABELS = './models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

## Build a detection model and load pre-trained model weights

Here we choose which object detection model to use.
The model is identified by its TensorFlow Hub handle.
To try another architecture later, change the handle in the next code cell and re-run the cells that follow it.

## Loading the selected model from TensorFlow Hub

Here we just need the selected model handle; the TensorFlow Hub library loads the model into memory.


In [None]:
# Start with the Faster R-CNN model
model_handle="https://tfhub.dev/tensorflow/faster_rcnn/resnet152_v1_1024x1024/1"
print('loading model...')
hub_model = hub.load(model_handle)
print('model loaded!')

loading model...
model loaded!


## Loading an image

Let's try the model on some simple images. The test images used here are JAAD frames read from Google Drive.

Here are some simple things to try out if you are curious:
* Try running inference on your own images: upload them to Colab and load them the same way as in the cells below.
* Modify some of the input images and see if detection still works. Simple things to try include flipping the image horizontally or converting it to grayscale (note that the input image is still expected to have 3 channels).

**Be careful:** the model expects 3-channel images, so an alpha channel in the input counts as a 4th channel.
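
One way to guard against the channel mismatch is to normalize every input to three channels before adding the batch dimension. The sketch below assumes the image is already a numpy array; the helper name `to_three_channels` is hypothetical and not part of any library.

```python
import numpy as np

def to_three_channels(image):
    """Return an (H, W, 3) uint8 array from a grayscale, RGB, or RGBA array."""
    image = np.asarray(image, dtype=np.uint8)
    if image.ndim == 2:
        # Grayscale: repeat the single channel three times.
        return np.stack([image] * 3, axis=-1)
    if image.ndim == 3 and image.shape[-1] == 4:
        # RGBA: drop the alpha channel.
        return image[..., :3]
    if image.ndim == 3 and image.shape[-1] == 3:
        return image
    raise ValueError(f'Unsupported image shape: {image.shape}')

rgba = np.zeros((4, 6, 4), dtype=np.uint8)   # dummy 4x6 image with alpha
gray = np.zeros((4, 6), dtype=np.uint8)      # dummy 4x6 grayscale image
batch = to_three_channels(rgba)[np.newaxis, ...]  # add the batch dimension
print(batch.shape, to_three_channels(gray).shape)
```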



In [None]:
# First, mount the Google Drive that contains the data
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import pandas as pd
df=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/JAAD_test_frames.csv')
df.head()

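| Unnamed: 0.1 | Unnamed: 0 | seq_no | frame_no | Day | Night | Shadows | Category |
|---|---|---|---|---|---|---|---|
| 0 | 0 | video_0024 | 00000.png | 0 | 1 | 0 | City |
| 1 | 1 | video_0024 | 00001.png | 0 | 1 | 0 | City |
| 2 | 2 | video_0024 | 00002.png | 0 | 1 | 0 | City |
| 3 | 3 | video_0024 | 00003.png | 0 | 1 | 0 | City |
| 4 | 4 | video_0024 | 00004.png | 0 | 1 | 0 | City |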


In [None]:
img_path='/content/drive/MyDrive/Colab Notebooks/JAAD/DataX/JAAD/images/'

In [None]:
import os
os.chdir('/content/drive/MyDrive/Colab Notebooks/JAAD/')
!ls

DataX			  losses.py			  Uncertainty_2k.csv
helpers.py		  __pycache__
JAAD_data_analysis.ipynb  Train_SimCLR_JAAD_Subset.ipynb


Build the ordered list of test frames and their labels from `JAAD_test_frames.csv`.

In [None]:
test_images=[]
test_labels=[]

seq_t=df["seq_no"]
frame_t=df["frame_no"]
# Composite label: Night flag + Shadows flag + Category, e.g. '10City'
lab_t=df["Night"].astype(str)+df["Shadows"].astype(str)+df["Category"]
nt=len(seq_t)

for vi in range(nt):
    # Full path of each frame: <img_path><sequence>/<frame file>
    test_images.append(img_path+seq_t[vi]+'/'+frame_t[vi])
    test_labels.append(lab_t[vi])
test_labels=np.array(test_labels)

In [None]:
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y_test_enc=le.fit_transform(test_labels)
le.inverse_transform([0, 1,2,3,4])

array(['00City', '00Pedestrians', '01City', '01Pedestrians', '10City'],
      dtype='<U13')
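
The integer codes above follow from how the composite labels are built and then sorted. As a rough plain-Python sketch of the same idea (the `rows` below are made-up examples, not rows from the dataset), sorted unique labels receive consecutive codes, which is also how `LabelEncoder` orders its classes:

```python
# Hypothetical rows: (Night, Shadows, Category), mirroring the CSV columns.
rows = [(1, 0, 'City'), (0, 0, 'Pedestrians'), (0, 1, 'City'), (0, 0, 'City')]

# Concatenate the two flags and the category, e.g. (1, 0, 'City') -> '10City'.
labels = [f'{night}{shadows}{category}' for night, shadows, category in rows]

# Sorted unique labels receive consecutive integer codes.
classes = sorted(set(labels))
code_of = {label: code for code, label in enumerate(classes)}
encoded = [code_of[label] for label in labels]

print(classes)
print(encoded)
```

Because the codes depend on which labels are present, a dataset with extra categories would shift the numbering.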

In [None]:
# running inference
def model_inference(image_np, hub_model):
  #start_time = time.time()
  results = hub_model(image_np)
  result = {key:value.numpy() for key,value in results.items()}
  #end_time = time.time()
  #print("Inference time:",end_time-start_time)
  return result

In [None]:
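# Function to compute the 10-point moving average of U around index n
def compute_10_pt_MA(U,n):
  U=np.asarray(U,dtype=float)
  if(len(U)==0):
    return 0.0
  if(len(U)<10):
    # Fewer than 10 samples: average everything available
    return float(np.mean(U))
  # Window U[n-5:n+5], clipped at the start of U; averaged over its actual length
  window=U[max(n-5,0):n+5]
  return float(np.mean(window)) if len(window)>0 else 0.0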
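
To make the windowing concrete, here is a small plain-Python sketch of a moving average over the slice `[index-5, index+5)`. Clipping the window at both ends of the series and dividing by the actual window length are assumptions of this sketch; the name `centered_moving_average` is hypothetical.

```python
def centered_moving_average(values, index, half_window=5):
    """Average values[index-half_window : index+half_window], clipped to the series."""
    start = max(index - half_window, 0)
    stop = min(index + half_window, len(values))
    window = values[start:stop]
    # Divide by the real window length so edge frames are not biased towards 0.
    return sum(window) / len(window) if window else 0.0

series = list(range(20))
print(centered_moving_average(series, 10))  # window covers values 5..14
print(centered_moving_average(series, 0))   # window covers values 0..4
print(centered_moving_average(series, 19))  # window covers values 14..19
```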

## Generate and Visualize Results

Here is where we will need the TensorFlow Object Detection API to draw the bounding boxes from the inference step (and the keypoints when available).

For example, you can set `min_score_thresh` to other values (between 0 and 1) to let more detections through or to filter more of them out.

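# Next we analyze the metrics to track the following:
1. Retain only the COCO classes of interest: person, bicycle, car, motorcycle, bus, train, truck.
2. Count #P (person, bicycle) [1,2] and #C (car, motorcycle, bus, train, truck) [3,4,6,7,8]. Note that the code filters on class IDs 1 to 8, a range that also includes 5 (airplane).
3. Find the median probability of the objects of interest (#M).
4. Send #P, #C, #M to wandb per image.

Working with a DataFrame makes this easy!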
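
The steps above can be sketched without pandas as follows. This is an illustrative version only: the helper name `frame_metrics` is hypothetical, the 0.15 score cut-off is taken from the inference loop later in the notebook, and the spread measure (population standard deviation divided by the median) mirrors the uncertainty value computed there.

```python
from statistics import median, pstdev

PERSON_IDS = {1, 2}            # person, bicycle
VEHICLE_IDS = {3, 4, 6, 7, 8}  # car, motorcycle, bus, train, truck

def frame_metrics(classes, scores, min_score=0.15):
    """Return (#P, #C, median score, std/median) for one frame's detections."""
    wanted = PERSON_IDS | VEHICLE_IDS
    kept = [(c, s) for c, s in zip(classes, scores)
            if s > min_score and c in wanted]
    if not kept:
        return 0, 0, 0.0, 0.0
    persons = sum(1 for c, _ in kept if c in PERSON_IDS)
    vehicles = sum(1 for c, _ in kept if c in VEHICLE_IDS)
    kept_scores = [s for _, s in kept]
    med = median(kept_scores)
    return persons, vehicles, med, pstdev(kept_scores) / med

# Made-up detections: class 17 is outside the classes of interest and the
# 0.1 score falls below the threshold, so both are dropped.
persons, vehicles, med, spread = frame_metrics(
    [1, 3, 3, 17, 2], [0.9, 0.8, 0.1, 0.95, 0.5])
print(persons, vehicles, med, round(spread, 3))
```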

# Label encoding [Night, Shadow, Category]
* 0 -> '00City'
* 1 -> '00Pedestrians'
* 2 -> '01City'
* 3 -> '01Pedestrians'
* 4 -> '10City'
* 5 -> '10Pedestrians'
* 6 -> '00Freeway'
* 7 -> '10Freeway'
* 8 -> Unknown
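
For reference, the encoding above can be held in a lookup table and decoded back into its three parts. The names `SCENE_LABELS` and `describe_label` are hypothetical helpers for this sketch, not part of the notebook.

```python
# Label codes from the list above; the first character is the Night flag,
# the second is the Shadows flag, and the rest is the scene category.
SCENE_LABELS = {
    0: '00City', 1: '00Pedestrians', 2: '01City', 3: '01Pedestrians',
    4: '10City', 5: '10Pedestrians', 6: '00Freeway', 7: '10Freeway',
}

def describe_label(code):
    """Split a label code into night/shadows flags and a category name."""
    name = SCENE_LABELS.get(code)
    if name is None:
        # Code 8 (or anything else) means the scene could not be classified.
        return {'night': None, 'shadows': None, 'category': 'Unknown'}
    return {'night': name[0] == '1', 'shadows': name[1] == '1',
            'category': name[2:]}

print(describe_label(4))
print(describe_label(8))
```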

In [None]:
def return_scene_category_per_frame(Pe,Ca,Un, BG):
  # Pe: person count, Ca: vehicle count, BG: 1 for night and 0 for day
  # Un here is the 10 frame moving average (5 frames before and 5 frames after)
  # Pe>0 (not Pe>=0) keeps the freeway and unknown branches reachable
  if((Pe>0) & (Ca >0)):
    # It is city
    if (BG==1):
      # It is night
      label=4
    else: # It is day, city
      if (Un>0.5):
        label=2
      else:
        label=0
  elif((Pe>0) & (Ca==0)):
    # It is Pedestrians only
    if(BG==1): #It is night
      label=5
    else:
      if(Un>0.5):
        label=3
      else:
        label=1
  elif((Pe==0) & (Ca>0)):
    #It is highway
    if(BG==1): #It is night
      label=7
    else: #It is day ('00Freeway' in the label encoding)
      label=6
  else:
    # Nothing of interest detected
    label=8

  return(label)


# Threshold of U needs to be changed

# Look for 289, 290, 291
# Shadow might mean poor lighting



In [None]:
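def return_filtered_frame(Pe_r,Ca_r,Un_r, Un_a_r, BG_r):
  # So the running plots are fed into this function and it spits out frame numbers of the images that need further tuning
  # Findpeaks, then Un[peaks]>1h
  # NOTE: the peak-finding step described above is not implemented yet. For now
  # the body applies the per-frame rules to the function's own arguments, using
  # the averaged uncertainty Un_a_r (Un_r is kept for the planned peak search).
  if((Pe_r>0) & (Ca_r >0)):
    # It is city
    if (BG_r==1):
      # It is night
      label=4
    else: # It is day, city
      if (Un_a_r>0.5):
        label=2
      else:
        label=0
  elif((Pe_r>0) & (Ca_r==0)):
    # It is Pedestrians only
    if(BG_r==1): #It is night
      label=5
    else:
      if(Un_a_r>0.5):
        label=3
      else:
        label=1
  elif((Pe_r==0) & (Ca_r>0)):
    #It is highway
    if(BG_r==1): #It is night
      label=7
    else: #It is day ('00Freeway' in the label encoding)
      label=6
  else:
    label=8

  return(label)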

In [None]:
# Compute for all test frames
import pandas as pd

Pe=[]
Ca=[]
BGe=[]
Un=[]
Una=[]
outcome=pd.DataFrame(columns=['Persons','Cars','BG','Uncertainty'])
#ypred=np.zeros((nt))
for i in range(2000):
  if(i%100==0):
    print('Completed=',i)
  if (seq_t[i]=='video_0024'):
    BG=1
  else:
    BG=0
  image_np=load_image_into_numpy_array(test_images[i])
  result=model_inference(image_np, hub_model)
  # Collect per-detection scores and COCO class IDs in a small DataFrame.
  # Named `det` so it does not overwrite the frame table `df` loaded earlier.
  det=pd.DataFrame({
      'scores': result['detection_scores'][0].astype(float),
      'class': result['detection_classes'][0].astype(int)})
  df1=det.loc[det['scores']>0.15]  # keep detections scoring above 0.15
  U=df1.loc[(df1['class']>0) & (df1['class']<9)]  # COCO class IDs 1 to 8
  if(len(U)>0):
    P=U.loc[(U['class']>0) & (U['class']<3)]
    C=U.loc[(U['class']>2) & (U['class']<9)]
    Pe.append(len(P))
    Ca.append(len(C))
    Un.append(np.std(U['scores'])/np.median(U['scores']))
  else:
    Pe.append(0)
    Ca.append(0)
    Un.append(0)
  BGe.append(BG)


outcome['Persons']=np.array(Pe)
outcome['Cars']=np.array(Ca)
outcome['BG']=np.array(BGe)
outcome['Uncertainty']=np.array(Un)

outcome.to_csv('Uncertainty_2k.csv')

Completed= 0
Completed= 100
Completed= 200
Completed= 300
Completed= 400
Completed= 500
Completed= 600
Completed= 700
Completed= 800
Completed= 900
Completed= 1000
Completed= 1100
Completed= 1200
Completed= 1300
Completed= 1400
Completed= 1500
Completed= 1600
Completed= 1700
Completed= 1800
Completed= 1900


In [None]:
Pe=[]
Ca=[]
BGe=[]
Un=[]
Una=[]
outcome=pd.DataFrame(columns=['Persons','Cars','BG','Uncertainty'])
#ypred=np.zeros((nt))
for i in range(2000, nt):
  if(i%100==0):
    print('Completed=',i)
  if (seq_t[i]=='video_0024'):
    BG=1
  else:
    BG=0
  image_np=load_image_into_numpy_array(test_images[i])
  result=model_inference(image_np, hub_model)
  # Collect per-detection scores and COCO class IDs in a small DataFrame.
  # Named `det` so it does not overwrite the frame table `df` loaded earlier.
  det=pd.DataFrame({
      'scores': result['detection_scores'][0].astype(float),
      'class': result['detection_classes'][0].astype(int)})
  df1=det.loc[det['scores']>0.15]  # keep detections scoring above 0.15
  U=df1.loc[(df1['class']>0) & (df1['class']<9)]  # COCO class IDs 1 to 8
  if(len(U)>0):
    P=U.loc[(U['class']>0) & (U['class']<3)]
    C=U.loc[(U['class']>2) & (U['class']<9)]
    Pe.append(len(P))
    Ca.append(len(C))
    Un.append(np.std(U['scores'])/np.median(U['scores']))
  else:
    Pe.append(0)
    Ca.append(0)
    Un.append(0)
  BGe.append(BG)


outcome['Persons']=np.array(Pe)
outcome['Cars']=np.array(Ca)
outcome['BG']=np.array(BGe)
outcome['Uncertainty']=np.array(Un)

outcome.to_csv('Uncertainty_4k.csv')

Completed= 2000
Completed= 2100
Completed= 2200
Completed= 2300
Completed= 2400
Completed= 2500
Completed= 2600
Completed= 2700
Completed= 2800
Completed= 2900
Completed= 3000
Completed= 3100
Completed= 3200
Completed= 3300
Completed= 3400
Completed= 3500
Completed= 3600
Completed= 3700
Completed= 3800
Completed= 3900
Completed= 4000
Completed= 4100
Completed= 4200
Completed= 4300
Completed= 4400
Completed= 4500
Completed= 4600


In [None]:
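# Reload both saved batches so the predictions line up with y_test_enc
outcome=pd.concat([pd.read_csv('Uncertainty_2k.csv',index_col=0),
                   pd.read_csv('Uncertainty_4k.csv',index_col=0)],ignore_index=True)
Un_all=outcome['Uncertainty'].to_numpy()
Una=np.zeros(len(outcome))
ypred=np.zeros(len(outcome),dtype=int)
for i in range(len(outcome)):
  # Smooth the uncertainty around frame i, then classify the frame
  Una[i]=compute_10_pt_MA(Un_all,i)
  ypred[i]=return_scene_category_per_frame(outcome['Persons'][i],outcome['Cars'][i],Una[i],outcome['BG'][i])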

In [None]:

plt.plot(ypred[0:100], label='predicted')
plt.plot(y_test_enc[0:100], label='ground truth')
plt.legend()

In [None]:
from sklearn.metrics import accuracy_score
print("Accuracy=", accuracy_score(y_test_enc,ypred))
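
The accuracy reported above is the fraction of frames whose predicted code equals the true code. As a rough sketch with made-up label lists (the helper names `accuracy` and `per_class_accuracy` are hypothetical), a per-class breakdown can show which scene types are confused most often:

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError('y_true and y_pred must have the same length')
    if not y_true:
        return 0.0
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_accuracy(y_true, y_pred):
    """Accuracy computed separately for every true label."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Made-up example codes, not results from the notebook.
y_true_demo = [0, 0, 2, 4, 4]
y_pred_demo = [0, 2, 2, 4, 8]
print(accuracy(y_true_demo, y_pred_demo))
print(per_class_accuracy(y_true_demo, y_pred_demo))
```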