# **CIANNA COCO example script**

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Deyht/CIANNA/blob/CIANNA/examples/COCO/coco_pred_notebook.ipynb)

---


**Link to the CIANNA GitHub repository**
https://github.com/Deyht/CIANNA

### **CIANNA installation**

#### Query GPU allocation and properties

If `nvidia-smi` fails, you probably launched the Colab session without a GPU reservation.  
To change the reservation type, go to "Runtime" -> "Change runtime type" and select "GPU" as your hardware accelerator.

In [None]:
%%shell

nvidia-smi

cd /content/

git clone https://github.com/NVIDIA/cuda-samples/

cd /content/cuda-samples/Samples/1_Utilities/deviceQuery/

make SMS="50 60 70 80"

./deviceQuery | grep Capability | cut -c50- > ~/cuda_infos.txt
./deviceQuery | grep "CUDA Driver Version / Runtime Version" | cut -c57- >> ~/cuda_infos.txt

cd ~/

If you are granted a GPU that does not support FP16 computation, change the mixed precision method to FP32C_FP32A in the corresponding cells.
See the detailed description of mixed precision support in CIANNA on the [System Requirements](https://github.com/Deyht/CIANNA/wiki/1\)-System-Requirements) wiki page.
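
As a rough illustration of that choice, the sketch below picks one of the two `mixed_precision` strings used in this notebook from the compute capability written to `~/cuda_infos.txt`. The helper name `pick_mixed_precision` is hypothetical, and the `sm_70` threshold is an assumption that simply mirrors the `GEN_VOLTA` test in the compile cell; the wiki page remains the reference for what each GPU generation supports.

```python
def pick_mixed_precision(comp_cap: str) -> str:
    """Return a mixed_precision string for a compute capability such as '7.5'.

    Hypothetical helper: the sm_70 threshold is an assumption, not a CIANNA rule.
    """
    major, _, minor = comp_cap.strip().partition(".")
    sm_val = int(major) * 10 + int(minor or 0)
    # Assumed rule: use FP16 compute from Volta (sm_70) onward, FP32 otherwise.
    return "FP16C_FP32A" if sm_val >= 70 else "FP32C_FP32A"


print(pick_mixed_precision("7.5"))
print(pick_mixed_precision("6.1"))
```

The returned string would then replace the `mixed_precision` argument of `cnn.init` in the prediction cells.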

#### Clone CIANNA git repository

In [None]:
%%shell

cd /content/

git clone https://github.com/Deyht/CIANNA

cd CIANNA

#### Compiling CIANNA for the allocated GPU generation

There is no guaranteed forward or backward compatibility between Nvidia GPU generations, and some capabilities are generation-specific. For these reasons, CIANNA must be given the GPU generation of the platform at compile time.
The following cell automatically updates all the necessary files based on the detected GPU and compiles CIANNA.
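
For readers less familiar with `awk`, here is a Python sketch of how the cell below derives its compiler flags from the two values stored in `~/cuda_infos.txt`. The function name `cuda_compile_flags` is hypothetical and only restates the shell logic; it is not part of CIANNA.

```python
def cuda_compile_flags(comp_cap: str, cuda_vers: str, lim: str = "11.1"):
    """Mirror the awk expressions of the compile cell (illustrative only)."""
    # Compute capability "7.5" becomes the architecture number 75.
    sm_val = round(float(comp_cap) * 10)
    # CUDA versions older than the limit get the CUDA_OLD define.
    old_arg = "-D CUDA_OLD" if float(cuda_vers) < float(lim) else ""
    # Generation-specific define, empty for architectures before Volta.
    if sm_val >= 80:
        gen_val = "-D GEN_AMPERE"
    elif sm_val >= 70:
        gen_val = "-D GEN_VOLTA"
    else:
        gen_val = ""
    return f"-arch=sm_{sm_val}", old_arg, gen_val


print(cuda_compile_flags("7.5", "12.2"))
```

These three strings are what the `sed` commands splice into `compile.cp`.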

In [None]:
%%shell

cd /content/CIANNA

mult="10"
cat ~/cuda_infos.txt
comp_cap="$(sed '1!d' ~/cuda_infos.txt)"
cuda_vers="$(sed '2!d' ~/cuda_infos.txt)"

lim="11.1"
old_arg=$(awk '{if ($1 < $2) print "-D CUDA_OLD";}' <<<"${cuda_vers} ${lim}")

sm_val=$(awk '{print $1*$2}' <<<"${mult} ${comp_cap}")

gen_val=$(awk '{if ($1 >= 80) print "-D GEN_AMPERE"; else if($1 >= 70) print "-D GEN_VOLTA";}' <<<"${sm_val}")

sed -i "s/.*arch=sm.*/\\t\tcuda_arg=\"\$cuda_arg -D CUDA -D comp_CUDA -lcublas -lcudart -arch=sm_$sm_val $old_arg $gen_val\"/g" compile.cp
sed -i "s/\/cuda-[0-9][0-9].[0-9]/\/cuda-$cuda_vers/g" compile.cp
sed -i "s/\/cuda-[0-9][0-9].[0-9]/\/cuda-$cuda_vers/g" src/python_module_setup.py

./compile.cp CUDA PY_INTERF

mv src/build/lib.linux-x86_64-* src/build/lib.linux-x86_64

#### CIANNA notebook guideline

**IMPORTANT NOTE**   
CIANNA is mainly used through scripts and was not designed to run in notebooks. Every code cell that directly invokes CIANNA functions must be run as a script to avoid possible errors.  
To do so, the cell must have the following structure.

```
%%shell

cd /content/CIANNA

python3 - <<EOF

[... your python code ...]

EOF
```

This syntax makes it easy to edit Python code in the notebook while running the cell as a script. Note that notebook variables cannot be accessed from the cell in this context.
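
The isolation can be reproduced outside Colab with a small sketch. It assumes only the standard library and feeds a script to a fresh interpreter through standard input, which is what `python3 - <<EOF` does; the variable name `notebook_var` is purely illustrative.

```python
import subprocess
import sys

notebook_var = 42  # exists only in the parent (notebook) process

# The child interpreter reads its program from stdin, like "python3 - <<EOF".
script = "print('notebook_var' in globals())"
result = subprocess.run([sys.executable, "-"], input=script,
                        capture_output=True, text=True, check=True)

child_sees_var = result.stdout.strip()
print(child_sees_var)  # prints "False": the child has its own namespace
```

This is why the cells in this notebook exchange data through files such as `targets_val.npy` and `val_size.npy` rather than through variables.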


## COCO prediction network

The present notebook uses a network trained on the COCO 2017 training dataset. The training dataset comprises almost 118000 images, each associated with target bounding boxes with 80 possible classes. Python training scripts are provided in the corresponding example directory of CIANNA. The network architecture is similar to a darknet-19 with a few adjustments to account for the current CIANNA capabilities. The network was first pre-trained on ImageNet for classification on 1000 classes at a 224x224 resolution and then further pre-trained at a 448x448 resolution. Finally, the network was trained on the COCO dataset for detection at a 416x416 resolution.

In this notebook, we apply a trained network to the 5000 images in the COCO 2017 validation dataset, and also provide a simplified script to use this network to perform a detection on an external image.
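
The 80 classes are indexed contiguously by the network, while COCO annotation files use category ids between 1 and 90 with gaps. The cells below handle this with the `class_id_conv` array. The sketch here rebuilds the same lookup in plain Python; the set of unused ids is derived from the indices deleted in `class_id_conv`, and the two helper names are hypothetical.

```python
# COCO category ids that are never used (values removed from class_id_conv).
MISSING_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}

# Position in this list = network class index, value = COCO category id.
class_id_conv = [cid for cid in range(1, 91) if cid not in MISSING_IDS]


def coco_to_network(category_id: int) -> int:
    """COCO category id -> contiguous class index in [0, 79]."""
    return class_id_conv.index(category_id)


def network_to_coco(class_index: int) -> int:
    """Contiguous class index -> COCO category id."""
    return class_id_conv[class_index]


print(len(class_id_conv), coco_to_network(13), network_to_coco(79))
```

The forward direction is used when building targets, and the reverse direction when writing predictions in the COCO result format.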



### Downloading and preparing COCO data

In [None]:
%%shell

cd /content/CIANNA/examples/COCO/

python3 - <<EOF

import numpy as np
from PIL import Image
from tqdm import tqdm
import json
import os


data_path = "./"

#Download the annotations and the validation dataset if not already present
#data_path is defined above

if(not os.path.isdir(data_path+"annotations")):
	os.system("wget -P %s http://images.cocodataset.org/annotations/annotations_trainval2017.zip"%(data_path))
	os.system("unzip %sannotations_trainval2017.zip"%(data_path))
	os.system("rm %sannotations_trainval2017.zip"%(data_path))

if(not os.path.isdir(data_path+"val2017")):
	os.system("wget -P %s http://images.cocodataset.org/zips/val2017.zip"%(data_path))
	os.system("unzip %sval2017.zip"%(data_path))
	os.system("rm %sval2017.zip"%(data_path))


data_prefix_list = ["val2017"]

for data_prefix in data_prefix_list:

	with open(data_path+"annotations/instances_%s.json"%(data_prefix), "r") as f:
		data = json.load(f)
	print (data.keys())

	data_list = {}

	for item in data["images"]:
		data_list[item["id"]] = item["file_name"]

	key_id = list(data_list.keys())
	f_file = list(data_list.values())


	for i in tqdm(range(0, len(key_id))):

		im = Image.open(data_path+data_prefix+"/"+f_file[i], mode='r')
		if(im.mode != "RGB"): #im.format is the file format (e.g. "JPEG"), the pixel layout is im.mode
			im = im.convert('RGB')

		patch = np.asarray(im)

		np.save(data_path+data_prefix+"/"+f_file[i][:-4], patch, allow_pickle=False)

	data_list = {}
	data_list2 = {}
	data_list3 = {}

	for item in data["annotations"]:
		data_list[item["id"]] = item["bbox"]
		data_list2[item["id"]] = item["image_id"]
		data_list3[item["id"]] = item["category_id"]

	key_id = list(data_list.keys())
	bbox_list = list(data_list.values())
	im_id_list = list(data_list2.values())
	class_list = list(data_list3.values())

	for i in tqdm(range(0, len(key_id))):
		f = open(data_path+data_prefix+"/bbox_%d.txt"%(im_id_list[i]), "a")  # append mode: remove existing bbox_*.txt files before re-running this cell, otherwise boxes are duplicated
		f.write("%f %f %f %f %d\n"%(bbox_list[i][0], bbox_list[i][1], max(1.0,bbox_list[i][2]), max(1.0, bbox_list[i][3]), class_list[i]))
		f.close()

EOF


### Performing network prediction
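
The cell below processes the 5000 validation images in blocks of 2000 to limit memory use, saving one prediction file per block. As a minimal sketch of that partition (the helper `block_ranges` is hypothetical and not part of the notebook):

```python
def block_ranges(nb_val: int, block_size: int):
    """Return the (start, end) image index range covered by each block."""
    nb_blocks = (nb_val + block_size - 1) // block_size  # ceiling division
    return [(b * block_size, min((b + 1) * block_size, nb_val))
            for b in range(nb_blocks)]


for block_id, (start, end) in enumerate(block_ranges(5000, 2000)):
    # File naming follows the "mv" command used in the prediction cell.
    print("fwd_res/net0_%04d_b%d.dat" % (0, block_id), end - start, "images")
```

The last block is shorter than the others, which is why the cells compute `min(block_size, nb_val - block_id*block_size)` before reshaping each file.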

In [None]:
%%shell

cd /content/CIANNA/examples/COCO/

python3 - <<EOF

import numpy as np
import xml.etree.ElementTree as ET
import albumentations as A
import cv2
import json

#Comment to access system wide install
import sys, glob, os
sys.path.insert(0,glob.glob('../../src/build/lib.*/')[-1])
import CIANNA as cnn

def i_ar(int_list):
	return np.array(int_list, dtype="int")

def f_ar(float_list):
	return np.array(float_list, dtype="float32")

def prep_data(block_id):
  print("Preparing data for block %d ..."%(block_id))

  l_b_size = min(block_size, nb_val - block_id*block_size)
  input_val = np.zeros((l_b_size,flat_image_slice*3), dtype="float32")
  targets_val = np.zeros((l_b_size,1+max_nb_obj_per_image*(7+1)), dtype="float32")

  for i in range(0, min(block_size, nb_val - block_id*block_size)):
    i_d = i + block_id*block_size
    no_box = 0
    patch = np.load(data_path+"val2017/%s.npy"%(val_im_path_2017[i_d][:-4]), allow_pickle=False)
    val_size[i_d,:] = (np.shape(patch)[:2])

    if(os.path.exists(data_path+"val2017/bbox_%s.txt"%(val_im_id_2017[i_d]))):
      bbox_list = np.loadtxt(data_path+"val2017/bbox_%s.txt"%(val_im_id_2017[i_d]))
    else:
      no_box = 1

    if(no_box == 0):
      if(bbox_list.ndim == 1):
        bbox_list = np.reshape(bbox_list, (1,5))
      transformed = transform_val(image=patch, bboxes=bbox_list)
      patch_aug = transformed['image']
      bbs_aug = np.asarray(transformed['bboxes'])
    else:
      transformed = transform_val(image=patch, bboxes=[])

      patch_aug = transformed['image']
      bbs_aug = np.array([])

    for depth in range(0,3):
      input_val[i,depth*image_size*image_size:(depth+1)*image_size*image_size] = (patch_aug[:,:,depth].flatten("C")-100.0)/155.0

    targets_val[i,:] = 0.0
    targets_val[i,0] = np.shape(bbs_aug)[0]
    if(targets_val[i,0] > max_nb_obj_per_image):
      print ("Max_obj_per_image limit reached: ", int(targets_val[i,0]))
      targets_val[i,0] = max_nb_obj_per_image
    for k in range(0, int(targets_val[i,0])):

      xmin = bbs_aug[k,0]
      ymin = bbs_aug[k,1]
      xmax = bbs_aug[k,0] + bbs_aug[k,2]
      ymax = bbs_aug[k,1] + bbs_aug[k,3]

      targets_val[i,1+k*8:1+(k+1)*8] = np.array([np.where(class_id_conv[:] == bbs_aug[k,4])[0][0] + 1,xmin,ymin,0.0,xmax,ymax,1.0,0])

  return input_val, targets_val


data_path = "./"

with open("classnames.txt") as f:
  class_list = [line.rstrip('\n') for line in f]

class_id_conv = np.arange(1,92)
class_id_conv = np.delete(class_id_conv, [11,25,28,29,44,65,67,68,70,82,90])

class_list_short = class_list
color_offset = 0

val_list_2017 = {}
with open(data_path+"annotations/instances_val2017.json", "r") as f:
  val2017_instances = json.load(f)
for item in val2017_instances["images"]:
  val_list_2017[item["id"]] = item["file_name"]

val_im_path_2017 = list(val_list_2017.values())
val_im_id_2017 = list(val_list_2017.keys())


image_size = 416
flat_image_slice = image_size*image_size
nb_class = 80
max_nb_obj_per_image = 70
nb_box = 5
yolo_reg_size = 32
yolo_nb_reg = int(image_size/yolo_reg_size)

nb_val = 5000
block_size = 2000

targets_val = np.zeros((nb_val,1+max_nb_obj_per_image*(7+1)), dtype="float32")
val_size = np.zeros((nb_val,2), dtype="int")

transform_val = A.Compose([
    A.LongestMaxSize(max_size=image_size, interpolation=1, p=1.0),
    A.PadIfNeeded(min_width=image_size, min_height=image_size, border_mode=cv2.BORDER_CONSTANT, p=1.0),
  ], bbox_params=A.BboxParams(format='coco'))


load_epoch = 0
if (len(sys.argv) > 1):
  load_epoch = int(sys.argv[1])

cnn.init(in_dim=i_ar([image_size,image_size]), in_nb_ch=3, out_dim=1, b_size=32,
  comp_meth='C_CUDA', dynamic_load=1, mixed_precision="FP16C_FP32A", inference_only=1)

cnn.set_yolo_params()

if(load_epoch > 0):
  cnn.load("net_save/net0_s%04d.dat"%load_epoch,load_epoch, bin=1)
else:
  if(not os.path.isfile("CIANNA_net_model_coco_v1.0_darknet19custom_res416_map50_39.9_fp16.dat")):
    os.system("wget https://zenodo.org/records/12801421/files/CIANNA_net_model_coco_v1.0_darknet19custom_res416_map50_39.9_fp16.dat")
  cnn.load("CIANNA_net_model_coco_v1.0_darknet19custom_res416_map50_39.9_fp16.dat", 0, bin=1)

#cnn.print_arch_tex("./", "arch", activation=1, dropout=0)

for block_id in range(0, (nb_val + block_size - 1)//block_size):

  b_input_val, b_targets_val = prep_data(block_id)
  targets_val[block_id*block_size:(block_id+1)*block_size,:] = b_targets_val[:,:]
  cnn.create_dataset("TEST", min(block_size, nb_val - block_id*block_size), b_input_val, b_targets_val)
  del(b_input_val, b_targets_val)

  cnn.forward(repeat=1, no_error=1, saving=2, drop_mode="AVG_MODEL")
  os.system("mv fwd_res/net0_%04d.dat fwd_res/net0_%04d_b%d.dat"%(load_epoch, load_epoch, block_id))
  cnn.delete_dataset("TEST")

np.save("targets_val",targets_val)
np.save("val_size",val_size)

EOF


### Convert prediction into the COCO format
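
Predicted boxes are expressed in the 416x416 letterboxed frame seen by the network, so they must be mapped back to the original image before evaluation. The sketch below shows that inverse mapping for a single box under the same assumptions as the cell that follows (longest side resized to `image_size`, padding split evenly on the shorter side). The function name `to_coco_bbox` and the sample values are illustrative only.

```python
def to_coco_bbox(box, height, width, image_size=416):
    """Map (x_min, y_min, x_max, y_max) from network to original image pixels.

    Returns a COCO-style [x, y, width, height] list.
    """
    ratio = image_size / max(height, width)
    # Padding added on each side of the shorter dimension.
    pad_x = max(0.0, image_size - width * ratio) / 2.0
    pad_y = max(0.0, image_size - height * ratio) / 2.0
    x_min = (box[0] - pad_x) / ratio
    y_min = (box[1] - pad_y) / ratio
    x_max = (box[2] - pad_x) / ratio
    y_max = (box[3] - pad_y) / ratio
    return [round(x_min, 2), round(y_min, 2),
            round(x_max - x_min, 2), round(y_max - y_min, 2)]


# A box covering the whole non-padded area of a 640x480 (width x height) image.
bbox = to_coco_bbox((0.0, 52.0, 416.0, 364.0), height=480, width=640)
entry = {"image_id": 1, "category_id": 1, "bbox": bbox, "score": 0.9}
print(entry)
```

Each prediction ends up as one such dictionary, and the full list is dumped to a JSON file that `pycocotools` can load.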


In [None]:
%%shell

cd /content/CIANNA/examples/COCO/

python3 - <<EOF

from aux_fct import *
#Use auxiliary functions from aux_fct.py

data_path = "./"

with open("classnames.txt") as f:
  class_list = [line.rstrip('\n') for line in f]

class_id_conv = np.arange(1,92)
class_id_conv = np.delete(class_id_conv, [11,25,28,29,44,65,67,68,70,82,90])

class_list_short = class_list
color_offset = 0

val_list_2017 = {}
with open(data_path+"annotations/instances_val2017.json", "r") as f:
  val2017_instances = json.load(f)
for item in val2017_instances["images"]:
  val_list_2017[item["id"]] = item["file_name"]

val_im_path_2017 = list(val_list_2017.values())
val_im_id_2017 = list(val_list_2017.keys())


image_size = 416
flat_image_slice = image_size*image_size
nb_class = 80
max_nb_obj_per_image = 70
nb_box = 5
yolo_reg_size = 32
yolo_nb_reg = int(image_size/yolo_reg_size)

nb_val = 5000
block_size = 2000

targets_val = np.zeros((nb_val,1+max_nb_obj_per_image*(7+1)), dtype="float32")

load_epoch = 0
obj_threshold = 0.03
class_soft_limit = 0.25
nms_threshold_same = 0.4
nms_threshold_diff = 0.95

targets_val = np.load("targets_val.npy")
val_size = np.load("val_size.npy")

#####################################################
# Filter network predictions (objectness, NMS, etc)
#####################################################

c_tile = np.zeros((yolo_nb_reg*yolo_nb_reg*nb_box,(6+1+nb_class)),dtype="float32")
c_tile_kept = np.zeros((yolo_nb_reg*yolo_nb_reg*nb_box,(6+1+nb_class)),dtype="float32")
c_box = np.zeros((6+1+nb_class),dtype="float32")

box_list = []

for block_id in range(0, (nb_val + block_size - 1)//block_size):

  l_b_size = min(block_size, nb_val - block_id*block_size)
  pred_raw = np.fromfile("fwd_res/net0_%04d_b%d.dat"%(load_epoch, block_id), dtype="float32")
  predict = np.reshape(pred_raw, (l_b_size,nb_box*(8+nb_class),yolo_nb_reg,yolo_nb_reg))

  for l in tqdm(range(0, l_b_size)):
    i_d = l + block_id*block_size
    im_id = val_im_id_2017[i_d]

    dim_long = np.argmax(val_size[i_d,:])
    ratio = image_size/val_size[i_d,dim_long]

    other_dim = int(np.mod(dim_long+1,2))
    offset = np.zeros((2))
    offset[dim_long] = 0.0
    offset[other_dim] = max(0.0,image_size - val_size[i_d,other_dim]*ratio)/2.0

    c_tile[:,:] = 0.0
    c_tile_kept[:,:] = 0.0

    c_pred = predict[l,:,:,:]
    c_nb_box = box_extraction(c_pred, c_box, c_tile, obj_threshold, class_soft_limit)

    c_nb_box_final = c_nb_box
    amax_array = np.amax(c_tile[:,7:], axis=1)
    c_nb_box_final = apply_NMS(c_tile, c_tile_kept, c_box, c_nb_box, amax_array, nms_threshold_same, nms_threshold_diff)

    for k in range(0, c_nb_box_final):

      x_min  = float(round((c_tile_kept[k,0]-offset[1])/ratio,2))
      y_min  = float(round((c_tile_kept[k,1]-offset[0])/ratio,2))
      width  = float(round((c_tile_kept[k,2]-offset[1])/ratio - (c_tile_kept[k,0]-offset[1])/ratio,2))
      height = float(round((c_tile_kept[k,3]-offset[0])/ratio - (c_tile_kept[k,1]-offset[0])/ratio,2))
      cat_id = int(class_id_conv[np.argmax(c_tile_kept[k,7:])])
      score  = float(round(c_tile_kept[k,5],4))

      box_list.append({"image_id": int(im_id), "category_id": cat_id,"bbox": [x_min,y_min, width,height],"score": score})

  del (pred_raw, predict)

with open("fwd_res/pred_%04d.json"%(load_epoch), "w") as f:
  json.dump(list(box_list), f)

EOF


### Compute COCO validation set mAP

In [None]:
%%shell

cd /content/CIANNA/examples/COCO/

git clone https://github.com/cocodataset/cocoapi

cd cocoapi/PythonAPI/

make

In [None]:
%cd /content/CIANNA/examples/COCO/

import numpy as np
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
import sys

load_epoch = 0

annType = "bbox"
prefix = "instances"

dataDir = "./"
dataType = "val2017"
annFile = "%s/annotations/%s_%s.json"%(dataDir,prefix,dataType)
cocoGt = COCO(annFile)

resFile="fwd_res/pred_%04d.json"%(load_epoch)
cocoDt=cocoGt.loadRes(resFile)

imgIds = sorted(cocoGt.getImgIds())
imgIds = imgIds[:]

cocoEval = COCOeval(cocoGt,cocoDt,annType)
#cocoEval.params.catIds = [1] #To select class
cocoEval.params.imgIds = imgIds
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()

### Visualize network predictions

In [None]:
%%shell

cd /content/CIANNA/examples/COCO/

python3 - <<EOF

from aux_fct import *
#Use auxiliary functions from aux_fct.py

data_path = "./"

with open("classnames.txt") as f:
  class_list = [line.rstrip('\n') for line in f]

class_id_conv = np.arange(1,92)
class_id_conv = np.delete(class_id_conv, [11,25,28,29,44,65,67,68,70,82,90])

class_list_short = class_list
color_offset = 0

val_list_2017 = {}
with open(data_path+"annotations/instances_val2017.json", "r") as f:
  val2017_instances = json.load(f)
for item in val2017_instances["images"]:
  val_list_2017[item["id"]] = item["file_name"]

val_im_path_2017 = list(val_list_2017.values())
val_im_id_2017 = list(val_list_2017.keys())


image_size = 416
flat_image_slice = image_size*image_size
nb_class = 80
max_nb_obj_per_image = 70
nb_box = 5
yolo_reg_size = 32
yolo_nb_reg = int(image_size/yolo_reg_size)

nb_val = 5000
block_size = 2000

targets_val = np.zeros((nb_val,1+max_nb_obj_per_image*(7+1)), dtype="float32")

transform_val = A.Compose([
    A.LongestMaxSize(max_size=image_size, interpolation=1, p=1.0),
    A.PadIfNeeded(min_width=image_size, min_height=image_size, border_mode=cv2.BORDER_CONSTANT, p=1.0),
  ], bbox_params=A.BboxParams(format='coco'))

load_epoch = 0
obj_threshold=0.3 #Remove low objectness boxes for display
class_soft_limit=0.3
nms_threshold_same=0.4
nms_threshold_diff=0.95

targets_val = np.load("targets_val.npy")
val_size = np.load("val_size.npy")

#####################################################
# Filter network predictions (objectness, NMS, etc)
#####################################################

c_tile = np.zeros((yolo_nb_reg*yolo_nb_reg*nb_box,(6+1+nb_class)),dtype="float32")
c_tile_kept = np.zeros((yolo_nb_reg*yolo_nb_reg*nb_box,(6+1+nb_class)),dtype="float32")
c_box = np.zeros((6+1+nb_class),dtype="float32")

final_boxes = []

for block_id in range(0, (nb_val + block_size - 1)//block_size):

  l_b_size = min(block_size, nb_val - block_id*block_size)
  pred_raw = np.fromfile("fwd_res/net0_%04d_b%d.dat"%(load_epoch, block_id), dtype="float32")
  predict = np.reshape(pred_raw, (l_b_size,nb_box*(8+nb_class),yolo_nb_reg,yolo_nb_reg))

  for l in tqdm(range(0, l_b_size)):
    i_d = l + block_id*block_size
    im_id = val_im_id_2017[i_d]

    dim_long = np.argmax(val_size[i_d,:])
    ratio = image_size/val_size[i_d,dim_long]

    other_dim = int(np.mod(dim_long+1,2))
    offset = np.zeros((2))
    offset[dim_long] = 0.0
    offset[other_dim] = max(0.0,image_size - val_size[i_d,other_dim]*ratio)/2.0

    c_tile[:,:] = 0.0
    c_tile_kept[:,:] = 0.0

    c_pred = predict[l,:,:,:]
    c_nb_box = box_extraction(c_pred, c_box, c_tile, obj_threshold, class_soft_limit)

    c_nb_box_final = c_nb_box
    amax_array = np.amax(c_tile[:,7:], axis=1)
    c_nb_box_final = apply_NMS(c_tile, c_tile_kept, c_box, c_nb_box, amax_array, nms_threshold_same, nms_threshold_diff)

    final_boxes.append(np.copy(c_tile_kept[0:c_nb_box_final]))

  del (pred_raw, predict)

visual_w = 6
visual_h = 4
display_target=1
block_id = 0
id_start=0

fig, ax = plt.subplots(visual_h, visual_w, figsize=(1.5*visual_w,1.5*visual_h), dpi=210, constrained_layout=True)

for l in tqdm(range(block_id*block_size + id_start, block_id*block_size + id_start + visual_w*visual_h)):
    c_x = (l - id_start - block_id*block_size) // visual_w
    c_y = (l - id_start - block_id*block_size) % visual_w

    patch = np.load(data_path+"val2017/%s.npy"%(val_im_path_2017[l][:-4]), allow_pickle=False)

    no_box = 0
    if(os.path.exists(data_path+"val2017/bbox_%s.txt"%(val_im_id_2017[l]))):
      bbox_list = np.loadtxt(data_path+"val2017/bbox_%s.txt"%(val_im_id_2017[l]))
    else:
      no_box = 1

    if(no_box == 0):
      if(bbox_list.ndim == 1):
        bbox_list = np.reshape(bbox_list, (1,5))

      transformed = transform_val(image=patch, bboxes=bbox_list)

      patch_aug = transformed['image']
      bbs_aug = np.asarray(transformed['bboxes'])
    else:
      transformed = transform_val(image=patch, bboxes=[])

      patch_aug = transformed['image']
      bbs_aug = np.array([])

    ax[c_x,c_y].imshow(patch_aug)
    ax[c_x,c_y].axis("off")

    im_boxes = final_boxes[l]

    if(display_target):
      targ_boxes = targets_val[l]
      if(targ_boxes[0] == -1):
        targ_boxes[0] = 1
      for k in range(0, int(targ_boxes[0])):
        xmin = targ_boxes[1+k*8+1]
        ymin = targ_boxes[1+k*8+2]
        xmax = targ_boxes[1+k*8+4]
        ymax = targ_boxes[1+k*8+5]
        p_c = int(targ_boxes[1+k*8+0]) - 1
        diff = int(targ_boxes[1+k*8+7])

        el = patches.Rectangle((xmin,ymin), (xmax-xmin), (ymax-ymin), linewidth=0.4, ls="--", fill=False,
          color=plt.cm.tab20((p_c+color_offset)%20), zorder=3)
        c_patch = ax[c_x,c_y].add_patch(el)
        c_text = ax[c_x,c_y].text(xmin+4, ymin+10, "%s %d"%(class_list_short[p_c], (xmax-xmin)*(ymax-ymin)),
          c=plt.cm.tab20((p_c+color_offset)%20), fontsize=2, clip_on=True)
        c_patch.set_path_effects([path_effects.Stroke(linewidth=0.8, foreground="black"), path_effects.Normal()])
        c_text.set_path_effects([path_effects.Stroke(linewidth=0.8, foreground="black"), path_effects.Normal()])

    for k in range(0, np.shape(im_boxes)[0]):

      xmin  = float(round((im_boxes[k,0]),2))
      ymin  = float(round((im_boxes[k,1]),2))
      width  = float(round((im_boxes[k,2]) - (im_boxes[k,0]),2))
      height = float(round((im_boxes[k,3]) - (im_boxes[k,1]),2))
      p_c = np.argmax(im_boxes[k,7:])
      score  = float(round(im_boxes[k,5],4))

      el = patches.Rectangle((xmin,ymin), width, height, linewidth=0.4, fill=False, color=plt.cm.tab20((p_c+color_offset)%20), zorder=3)
      c_patch = ax[c_x,c_y].add_patch(el)
      c_text = ax[c_x,c_y].text(xmin+5, ymin+height-4, "%s:%d-%0.2f-%0.2f"%(class_list[p_c],im_boxes[k,6],im_boxes[k,5],np.max(im_boxes[k,7:])),
        c=plt.cm.tab20((p_c+color_offset)%20), fontsize=2,clip_on=True)
      c_patch.set_path_effects([path_effects.Stroke(linewidth=0.8, foreground="black"), path_effects.Normal()])
      c_text.set_path_effects([path_effects.Stroke(linewidth=0.8, foreground="black"), path_effects.Normal()])

plt.savefig("pred_mosaic.jpg",dpi=500, bbox_inches='tight')

EOF



In [None]:
%cd /content/CIANNA/examples/COCO/

#Display the produced JPG
from IPython.display import Image
Image("pred_mosaic.jpg", width=1280)

## External image prediction

A minimal example of how to use the network to perform a prediction on an external .jpg image.
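
The cell below runs the network at a higher resolution than the 416 used for training (`image_size = 416 + 64*3`). The following sketch shows how the input size determines the output grid and the number of candidate boxes, using the values that appear in the notebook (32-pixel grid cells, 5 boxes per cell, `8+nb_class` values per box). The helper name `yolo_output_shape` is hypothetical.

```python
def yolo_output_shape(image_size, nb_box=5, nb_class=80, reg_size=32):
    """Return (grid cells per side, output channels, candidate boxes)."""
    if image_size % reg_size != 0:
        raise ValueError("image_size must be a multiple of the grid cell size")
    nb_reg = image_size // reg_size
    channels = nb_box * (8 + nb_class)  # matches the reshape of the raw output
    return nb_reg, channels, nb_reg * nb_reg * nb_box


print(yolo_output_shape(416))           # (13, 440, 845)
print(yolo_output_shape(416 + 64 * 3))  # (19, 440, 1805)
```

The channel count does not depend on the resolution; only the grid grows, so the `c_tile` buffers in the cell are sized from `yolo_nb_reg`.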

In [None]:
%%shell

cd /content/CIANNA/examples/COCO/

python3 - <<EOF

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patheffects as path_effects
from matplotlib import patches
from PIL import Image
import albumentations as A
import cv2
from numba import jit

#Comment to access system wide install
import sys, glob, os
sys.path.insert(0,glob.glob('../../src/build/lib.*/')[-1])
import CIANNA as cnn


#Minimal deployment setup for prediction on a single image

def i_ar(int_list):
	return np.array(int_list, dtype="int")

def f_ar(float_list):
	return np.array(float_list, dtype="float32")

@jit(nopython=True, cache=False, fastmath=False)
def fct_IoU(box1, box2):
	inter_w = max(0, min(box1[2], box2[2]) - max(box1[0], box2[0]) + 1)
	inter_h = max(0, min(box1[3], box2[3]) - max(box1[1], box2[1]) + 1)
	inter_2d = inter_w*inter_h
	uni_2d = abs(box1[2]-box1[0] + 1)*abs(box1[3] - box1[1] + 1) + \
		abs(box2[2]-box2[0] + 1)*abs(box2[3] - box2[1] + 1) - inter_2d
	enclose_w = (max(box1[2], box2[2]) - min(box1[0], box2[0]))
	enclose_h = (max(box1[3], box2[3]) - min(box1[1],box2[1]))
	enclose_2d = enclose_w*enclose_h

	cx_a = (box1[2] + box1[0])*0.5; cx_b = (box2[2] + box2[0])*0.5
	cy_a = (box1[3] + box1[1])*0.5; cy_b = (box2[3] + box2[1])*0.5
	dist_cent = np.sqrt((cx_a - cx_b)*(cx_a - cx_b) + (cy_a - cy_b)*(cy_a - cy_b))
	diag_enclose = np.sqrt(enclose_w*enclose_w + enclose_h*enclose_h)

  # DIoU
	return float(inter_2d)/float(uni_2d) - float(dist_cent)/float(diag_enclose)
  # GIoU
	#return float(inter_2d)/float(uni_2d) - float(enclose_2d - uni_2d)/float(enclose_2d)


@jit(nopython=True, cache=False, fastmath=False)
def box_extraction(c_pred, c_box, c_tile, obj_threshold, class_soft_limit):
	c_nb_box = 0
	for i in range(0,yolo_nb_reg):
		for j in range(0,yolo_nb_reg):
			for k in range(0,nb_box):
				offset = int(k*(8+nb_class)) #no +1 for box prior in prediction
				c_box[4] = c_pred[offset+6,i,j]
				c_box[5] = c_pred[offset+7,i,j]
				p_c = np.max(c_pred[offset+8:offset+8+nb_class,i,j])
				cl = np.argmax(c_pred[offset+8:offset+8+nb_class,i,j])

				if(c_box[5] >= obj_threshold and c_box[5]*p_c**1 >= 0.01 and p_c > class_soft_limit):
					c_box[0] = c_pred[offset,i,j]
					c_box[1] = c_pred[offset+1,i,j]
					c_box[2] = c_pred[offset+3,i,j]
					c_box[3] = c_pred[offset+4,i,j]

					c_box[6] = k
					c_box[7:] = c_pred[offset+8:offset+8+nb_class,i,j]
					c_tile[c_nb_box,:] = c_box[:]
					c_nb_box +=1

	return c_nb_box

@jit(nopython=True, cache=False, fastmath=False)
def apply_NMS(c_tile, c_tile_kept, c_box, c_nb_box, amax_array, nms_threshold_same, nms_threshold_diff):
  c_nb_box_final = 0
  c_box_size_prev = c_nb_box

  while(c_nb_box > 0):
    max_objct = np.argmax(c_tile[:c_box_size_prev,5]*amax_array[:c_box_size_prev])
    c_box = np.copy(c_tile[max_objct])
    c_tile[max_objct,5] = 0.0
    c_tile_kept[c_nb_box_final] = c_box
    c_nb_box_final += 1
    c_nb_box -= 1
    i = 0

    for i in range(0,c_box_size_prev):
      if(c_tile[i,5] < 0.00000001):
        continue
      IoU = fct_IoU(c_box[:4], c_tile[i,:4])

      if((IoU > nms_threshold_same and np.argmax(c_box[7:]) == np.argmax(c_tile[i,7:]))
      	or (IoU > nms_threshold_diff and np.argmax(c_box[7:]) != np.argmax(c_tile[i,7:]))):
        c_tile[i] = 0.0
        c_nb_box -= 1

  return c_nb_box_final


#The network is resilient to a slight increase in image resolution, which increases the mAP
#We recommend changing image_size in steps of 64 (2 grid elements)
#Here the training resolution was 416

image_size = 416 + 64*3
flat_image_slice = image_size*image_size
nb_box = 5
nb_class = 80

color_offset = 0

max_nb_obj_per_image = 70

yolo_nb_reg = int(image_size/32)
c_size = 32

with open("classnames.txt") as f:
    class_list = [line.rstrip('\n') for line in f]

class_id_conv = np.arange(1,92)
class_id_conv = np.delete(class_id_conv, [11,25,28,29,44,65,67,68,70,82,90])


if(not os.path.isfile("office_1.jpg")):
	os.system("wget https://share.obspm.fr/s/GynmcyDtkrsbyLe/download/office_1.jpg")

im = Image.open("office_1.jpg", mode='r')

if(im.mode != "RGB"): #im.format is the file format (e.g. "JPEG"), the pixel layout is im.mode
	im = im.convert('RGB')

patch = np.asarray(im)

dim_long = np.argmax(im.size)
ratio = image_size/im.size[dim_long]

other_dim = int(np.mod(dim_long+1,2))
offset = np.zeros((2))
offset[dim_long] = 0.0
offset[other_dim] = max(0.0,image_size - im.size[other_dim]*ratio)/2.0

transform = A.Compose([
	A.LongestMaxSize(max_size=image_size, interpolation=1, p=1.0),
	A.PadIfNeeded(min_width=image_size, min_height=image_size, border_mode=cv2.BORDER_CONSTANT, p=1.0),
])

transformed = transform(image=patch)
patch_aug = transformed['image']

input_data = f_ar(np.zeros((1,3*image_size*image_size)))
empty_target = f_ar(np.zeros((1,1)))

for depth in range(0,3):
	input_data[0,depth*flat_image_slice:(depth+1)*flat_image_slice] = (patch_aug[:,:,depth].flatten("C") - 100.0)/155.0



cnn.init(in_dim=i_ar([image_size,image_size]), in_nb_ch=3, out_dim=1, b_size=1,
	comp_meth='C_CUDA', dynamic_load=1, mixed_precision="FP16C_FP32A", inference_only=1)

cnn.create_dataset("TEST", 1, input_data, empty_target)

cnn.set_yolo_params()

load_epoch = 0
if(load_epoch > 0):
	cnn.load("net_save/net0_s%04d.dat"%load_epoch,load_epoch, bin=1)
else:
	if(not os.path.isfile("CIANNA_net_model_coco_v1.0_darknet19custom_res416_map50_39.9_fp16.dat")):
		os.system("wget https://zenodo.org/records/12801421/files/CIANNA_net_model_coco_v1.0_darknet19custom_res416_map50_39.9_fp16.dat")
	cnn.load("CIANNA_net_model_coco_v1.0_darknet19custom_res416_map50_39.9_fp16.dat", 0, bin=1)

#cnn.print_arch_tex("./", "arch", activation=1, dropout=0)

cnn.forward(repeat=1, no_error=1, saving=2, drop_mode="AVG_MODEL")



pred_raw = np.fromfile("fwd_res/net0_%04d.dat"%load_epoch, dtype="float32")
predict = np.reshape(pred_raw, (1, nb_box*(8+nb_class),yolo_nb_reg,yolo_nb_reg))

c_tile = np.zeros((yolo_nb_reg*yolo_nb_reg*nb_box,(6+1+nb_class)),dtype="float32")
c_tile_kept = np.zeros((yolo_nb_reg*yolo_nb_reg*nb_box,(6+1+nb_class)),dtype="float32")
c_box = np.zeros((6+1+nb_class),dtype="float32")

final_boxes = []

#Choice of filters that produce visually appealing results (!= best mAP )
obj_threshold = 0.45
class_soft_limit = 0.3
nms_threshold_same = 0.4
nms_threshold_diff = 0.9


c_tile[:,:] = 0.0
c_tile_kept[:,:] = 0.0

c_pred = predict[0,:,:,:]
c_nb_box = box_extraction(c_pred, c_box, c_tile, obj_threshold, class_soft_limit)

c_nb_box_final = c_nb_box
amax_array = np.amax(c_tile[:,7:], axis=1)
c_nb_box_final = apply_NMS(c_tile, c_tile_kept, c_box, c_nb_box, amax_array, nms_threshold_same, nms_threshold_diff)
final_boxes.append(np.copy(c_tile_kept[0:c_nb_box_final]))


#The image is displayed at full resolution. Passing patch_aug to imshow and dropping the ratio/offset rescaling shows the prediction at the resolution seen by the network.
fig, ax = plt.subplots(1,1, figsize=(4,4), dpi=200, constrained_layout=True)

ax.imshow(patch)
ax.axis('off')

im_boxes = final_boxes[0]

for k in range(0, np.shape(im_boxes)[0]):
	xmin = max(0.0,(im_boxes[k,0]-offset[0])/ratio)
	ymin = max(0.0,(im_boxes[k,1]-offset[1])/ratio)
	xmax = min(im.size[0],(im_boxes[k,2]-offset[0])/ratio)
	ymax = min(im.size[1],(im_boxes[k,3]-offset[1])/ratio)

	p_c = np.argmax(im_boxes[k,7:])

	el = patches.Rectangle((xmin,ymin), (xmax-xmin), (ymax-ymin), linewidth=0.4, fill=False, color=plt.cm.tab20((p_c+color_offset)%20), zorder=3)
	c_patch = ax.add_patch(el)
	c_text = ax.text(xmin+8, ymax-15, "%s:%d-%0.2f-%0.2f"%(class_list[p_c],im_boxes[k,6],im_boxes[k,5],np.max(im_boxes[k,7:])), c=plt.cm.tab20((p_c+color_offset)%20), fontsize=2,clip_on=True)
	c_patch.set_path_effects([path_effects.Stroke(linewidth=0.8, foreground='black'),
										path_effects.Normal()])
	c_text.set_path_effects([path_effects.Stroke(linewidth=0.8, foreground='black'),
										path_effects.Normal()])

plt.savefig("pred_on_image.jpg",dpi=400, bbox_inches='tight')


EOF

In [None]:
%cd /content/CIANNA/examples/COCO/

#Display the produced JPG
from IPython.display import Image
Image("pred_on_image.jpg", width=960)