### Use this Jupyter Notebook as a guide to run your trained model in inference mode

created by Anton Morgunov

inspired by [tensorflow object detection API tutorial](https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html#exporting-a-trained-model)

Your first step is going to specify which unit you are going to work with for inference. Select between GPU or CPU and follow the below instructions for implementation.

In [1]:
import os # importing OS in order to make GPU visible
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # do not change anything in here

# specify which device you want to work on.
# Use "-1" to work on a CPU. Default value "0" stands for the 1st GPU that will be used
os.environ["CUDA_VISIBLE_DEVICES"]="0" # TODO: specify your computational device

In [2]:
import tensorflow as tf # import tensorflow

# checking that GPU is found
if tf.test.gpu_device_name():
    print('GPU found')
else:
    print("No GPU found")

2023-06-02 13:36:30.875605: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-06-02 13:36:30.906739: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-06-02 13:36:30.907202: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


No GPU found


In [3]:
# other import
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from tqdm import tqdm

Next you will import import scripts that were already provided by Tensorflow API. **Make sure that Tensorflow is your current working directory.**

In [4]:
import sys # importyng sys in order to access scripts located in a different folder

%cd ..
path2scripts = './models/research' # TODO: provide pass to the research folder
%cd workspace
sys.path.insert(0, path2scripts) # making scripts in models/research available for import

/home/satarw/efficientdet
/home/satarw/efficientdet/workspace


In [5]:
# importing all scripts that will be needed to export your model and use it for inference
from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.builders import model_builder


TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). 

For more information see: https://github.com/tensorflow/addons/issues/2807 



Now you can import and build your trained model:

In [6]:
# NOTE: your current working directory should be Tensorflow.

# TODO: specify two pathes: to the pipeline.config file and to the folder with trained model.
path2config ='./exported_models/d0/v3/pipeline.config'

path2model = './exported_models/d0/v3/checkpoint/'    

In [7]:
# do not change anything in this cell
configs = config_util.get_configs_from_pipeline_file(path2config) # importing config
model_config = configs['model'] # recreating model config
detection_model = model_builder.build(model_config=model_config, is_training=False) # importing model

In [8]:
ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
ckpt.restore(os.path.join(path2model, 'ckpt-0')).expect_partial()

<tensorflow.python.checkpoint.checkpoint.CheckpointLoadStatus at 0x7f1203097d60>

Next, path to label map should be provided. Category index will be created based on label map file

In [9]:
path2label_map = './data/button_label_map.pbtxt' # TODO: provide a path to the label map file
category_index = label_map_util.create_category_index_from_labelmap(path2label_map,use_display_name=True)

Now, a few supporting functions will be defined

In [20]:
def detect_fn(image):
    """
    Detect objects in image.
    
    Args:
      image: (tf.tensor): 4D input image
      
    Returs:
      detections (dict): predictions that model made
    """

    image, shapes = detection_model.preprocess(image)
    print(image,shapes)
    prediction_dict = detection_model.predict(image, shapes)
    detections = detection_model.postprocess(prediction_dict, shapes)

    return detections
  
def load_image_into_numpy_array(path):
  """Load an image from file into a numpy array.

  Puts image into numpy array to feed into tensorflow graph.
  Note that by convention we put it into a numpy array with shape
  (height, width, channels), where channels=3 for RGB.

  Args:
    path: the file path to the image

  Returns:
    numpy array with shape (img_height, img_width, 3)
  """
  
  return np.array(Image.open(path))

**Next function is the one that you can use to run inference and plot results an an input image:**

In [21]:
def inference_with_plot(path2images, box_th=0.6):
    """
    Function that performs inference and plots resulting b-boxes
    
    Args:
      path2images: an array with pathes to images
      box_th: (float) value that defines threshold for model prediction.
      
    Returns:
      None
    """
    i = 0
    for image_path in path2images:
        i += 1
        print('Running inference for {}... '.format(image_path), end='')

        image_np = load_image_into_numpy_array(image_path)
        
        input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
        detections = detect_fn(input_tensor)

        # All outputs are batches tensors.
        # Convert to numpy arrays, and take index [0] to remove the batch dimension.
        # We're only interested in the first num_detections.
        num_detections = int(detections.pop('num_detections'))
        detections = {key: value[0, :num_detections].numpy()
                      for key, value in detections.items()}
      
        detections['num_detections'] = num_detections

        # detection_classes should be ints.
        detections['detection_classes'] = detections['detection_classes'].astype(np.int64)

        print(detections)
        label_id_offset = 1
        image_np_with_detections = image_np.copy()

        viz_utils.visualize_boxes_and_labels_on_image_array(
                image_np_with_detections,
                detections['detection_boxes'],
                detections['detection_classes']+label_id_offset,
                detections['detection_scores'],
                category_index,
                use_normalized_coordinates=True,
                max_boxes_to_draw=200,
                min_score_thresh=box_th,
                agnostic_mode=False,
                line_thickness=5)

        plt.figure(figsize=(15,10))
        plt.imshow(image_np_with_detections)
        print('Done')
        plt.savefig(f"results{i}.png")
    # plt.show()

Next, we will define a few other supporting functions:

In [22]:
def nms(rects, thd=0.5):
    """
    Filter rectangles
    rects is array of oblects ([x1,y1,x2,y2], confidence, class)
    thd - intersection threshold (intersection divides min square of rectange)
    """
    out = []

    remove = [False] * len(rects)

    for i in range(0, len(rects) - 1):
        if remove[i]:
            continue
        inter = [0.0] * len(rects)
        for j in range(i, len(rects)):
            if remove[j]:
                continue
            inter[j] = intersection(rects[i][0], rects[j][0]) / min(square(rects[i][0]), square(rects[j][0]))

        max_prob = 0.0
        max_idx = 0
        for k in range(i, len(rects)):
            if inter[k] >= thd:
                if rects[k][1] > max_prob:
                    max_prob = rects[k][1]
                    max_idx = k

        for k in range(i, len(rects)):
            if (inter[k] >= thd) & (k != max_idx):
                remove[k] = True

    for k in range(0, len(rects)):
        if not remove[k]:
            out.append(rects[k])

    boxes = [box[0] for box in out]
    scores = [score[1] for score in out]
    classes = [cls[2] for cls in out]
    return boxes, scores, classes


def intersection(rect1, rect2):
    """
    Calculates square of intersection of two rectangles
    rect: list with coords of top-right and left-boom corners [x1,y1,x2,y2]
    return: square of intersection
    """
    x_overlap = max(0, min(rect1[2], rect2[2]) - max(rect1[0], rect2[0]));
    y_overlap = max(0, min(rect1[3], rect2[3]) - max(rect1[1], rect2[1]));
    overlapArea = x_overlap * y_overlap;
    return overlapArea


def square(rect):
    """
    Calculates square of rectangle
    """
    return abs(rect[2] - rect[0]) * abs(rect[3] - rect[1])

**Next function is the one that you can use to run inference and save results into a file:**

In [23]:
import cv2

def inference_as_raw_output(path2images,box_th = 0.5,nms_th = 0.5,to_file = False,data = None,path2dir = False):
    # print("Start")
     
    """
    Function that performs inference and return filtered predictions and prediction times
    
    Args:
      path2images: an array with pathes to images
      box_th: (float) value that defines threshold for model prediction. Consider 0.25 as a value.
      nms_th: (float) value that defines threshold for non-maximum suppression. Consider 0.5 as a value.
      to_file: (boolean). When passed as True => results are saved into a file. Writing format is
      path2image + (x1abs, y1abs, x2abs, y2abs, score, conf) for box in boxes
      data: (str) name of the dataset you passed in (e.g. test/validation)
      path2dir: (str). Should be passed if path2images has only basenames. If full pathes provided => set False.
      
    Returs:
      detections (dict): filtered predictions that model made
    """
     
    
    print (f'Current data set is {data}')  
    print (f'Ready to start inference on {len(path2images)} images!')
    times = []
    for image_path in tqdm(path2images):
        if path2dir: # if a path to a directory where images are stored was passed in
            image_path = os.path.join(path2dir, image_path.strip())
            
        image_np = load_image_into_numpy_array(image_path)

        input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
        t0 = cv2.getTickCount()
        detections = detect_fn(input_tensor)
        t1 = cv2.getTickCount()
        time = (t1-t0)/cv2.getTickFrequency()
        times.append(time)
        # print("time noted")
        # checking how many detections we got
        num_detections = int(detections.pop('num_detections'))
        
        # filtering out detection in order to get only the one that are indeed detections
        detections = {key: value[0, :num_detections].numpy() for key, value in detections.items()}
        
        # detection_classes should be ints.
        detections['detection_classes'] = detections['detection_classes'].astype(np.int64)
        
        # defining what we need from the resulting detection dict that we got from model output
        key_of_interest = ['detection_classes', 'detection_boxes', 'detection_scores']
        
        # filtering out detection dict in order to get only boxes, classes and scores
        detections = {key: value for key, value in detections.items() if key in key_of_interest}
        
        # print('detections processed')
        if box_th: # filtering detection if a confidence threshold for boxes was given as a parameter
            for key in key_of_interest:
                scores = detections['detection_scores']
                current_array = detections[key]
                filtered_current_array = current_array[scores > box_th]
                detections[key] = filtered_current_array
        
        # print('box_th done')
        if nms_th: # filtering rectangles if nms threshold was passed in as a parameter
            # creating a zip object that will contain model output info as
            output_info = list(zip(detections['detection_boxes'],
                                   detections['detection_scores'],
                                   detections['detection_classes']
                                  )
                              )
            boxes, scores, classes = nms(output_info)
            
            detections['detection_boxes'] = boxes # format: [y1, x1, y2, x2]
            detections['detection_scores'] = scores
            detections['detection_classes'] = classes
        
        # print('nms_th done')
        if to_file and data: # if saving to txt file was requested

            image_h, image_w, _ = image_np.shape
            file_name = f'pred_result_{data}.txt'
            
            line2write = list()
            line2write.append(os.path.basename(image_path))
            
            with open(file_name, 'a+') as text_file:
                # iterating over boxes
                for b, s, c in zip(boxes, scores, classes):
                    
                    y1abs, x1abs = b[0] * image_h, b[1] * image_w
                    y2abs, x2abs = b[2] * image_h, b[3] * image_w
                    
                    list2append = [x1abs, y1abs, x2abs, y2abs, s, c]
                    line2append = ','.join([str(item) for item in list2append])
                    
                    line2write.append(line2append)
                
                line2write = ' '.join(line2write)
                text_file.write(line2write + os.linesep)
        # print('to_file done')
        
    return times

In [24]:
test_images_folder = './data/test_panels/'
path2images = []
for filename in os.listdir(test_images_folder):
    path2images.append(os.path.join(test_images_folder,filename))

In [25]:
# import numpy as np
# times = inference_as_raw_output(path2images)
# arr = np.array(times)
 
# # measures of dispersion
# min = np.amin(arr)
# max = np.amax(arr)
# range = np.ptp(arr)
# variance = np.var(arr)
# sd = np.std(arr)
 
# # print("Array =", arr)
# print("Measures of Dispersion")
# print("Minimum =", min)
# print("Maximum =", max)
# print("Range =", range)
# print("Variance =", variance)
# print("Standard Deviation =", sd)

In [26]:
  #Functions for button detection
  import cv2
  import PIL
  import PIL.Image
  import io

  def button_candidates(boxes, scores, img_path):
      
    with open(img_path, 'rb') as f:
            image = np.asarray(PIL.Image.open(io.BytesIO(f.read())))
    img_height = image.shape[0]
    img_width = image.shape[1]

    button_scores = [] #stores the score of each button (confidence)
    button_patches = [] #stores the cropped image that encloses the button
    button_positions = [] #stores the coordinates of the bounding box on buttons

    for box, score in zip(boxes, scores):
      if score < 0.5: continue

      y_min = int(box[0] * img_height)
      x_min = int(box[1] * img_width)
      y_max = int(box[2] * img_height)
      x_max = int(box[3] * img_width)

      button_patch = image[y_min: y_max, x_min: x_max]
      button_patch = cv2.resize(button_patch, (180, 180))

      button_scores.append(score)
      button_patches.append(button_patch)
      button_positions.append([x_min, y_min, x_max, y_max])
      
    return button_patches, button_positions, button_scores


  def inferences_and_times(image_path,box_th = 0.5,nms_th = 0.5,to_file = False,data = None,path2dir = False):
      # print("Start")
      
      """
      Function that performs inference and return filtered predictions and prediction detections
      
      Args:
        path2images: an array with pathes to images
        box_th: (float) value that defines threshold for model prediction. Consider 0.25 as a value.
        nms_th: (float) value that defines threshold for non-maximum suppression. Consider 0.5 as a value.
        to_file: (boolean). When passed as True => results are saved into a file. Writing format is
        path2image + (x1abs, y1abs, x2abs, y2abs, score, conf) for box in boxes
        data: (str) name of the dataset you passed in (e.g. test/validation)
        path2dir: (str). Should be passed if path2images has only basenames. If full pathes provided => set False.
        
      Returs:
        detections (dict): filtered predictions that model made
      """
      

      if path2dir: # if a path to a directory where images are stored was passed in
          image_path = os.path.join(path2dir, image_path.strip())
          
      image_np = load_image_into_numpy_array(image_path)

      input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
      detections = detect_fn(input_tensor)
      # checking how many detections we got
      num_detections = int(detections.pop('num_detections'))
      
      # filtering out detection in order to get only the one that are indeed detections
      detections = {key: value[0, :num_detections].numpy() for key, value in detections.items()}
      
      # detection_classes should be ints.
      detections['detection_classes'] = detections['detection_classes'].astype(np.int64)
      
      # defining what we need from the resulting detection dict that we got from model output
      key_of_interest = ['detection_classes', 'detection_boxes', 'detection_scores']
      
      # filtering out detection dict in order to get only boxes, classes and scores
      detections = {key: value for key, value in detections.items() if key in key_of_interest}
      
      # print('detections processed')
      if box_th: # filtering detection if a confidence threshold for boxes was given as a parameter
          for key in key_of_interest:
              scores = detections['detection_scores']
              current_array = detections[key]
              filtered_current_array = current_array[scores > box_th]
              detections[key] = filtered_current_array
      
      # print('box_th done')
      if nms_th: # filtering rectangles if nms threshold was passed in as a parameter
          # creating a zip object that will contain model output info as
          output_info = list(zip(detections['detection_boxes'],
                                  detections['detection_scores'],
                                  detections['detection_classes']
                                  )
                              )
          boxes, scores, classes = nms(output_info)
          
          detections['detection_boxes'] = boxes # format: [y1, x1, y2, x2]
          detections['detection_scores'] = scores
          detections['detection_classes'] = classes
      
      # print('nms_th done')
      if to_file and data: # if saving to txt file was requested

          image_h, image_w, _ = image_np.shape
          file_name = f'pred_result_{data}.txt'
          
          line2write = list()
          line2write.append(os.path.basename(image_path))
          
          with open(file_name, 'a+') as text_file:
              # iterating over boxes
              for b, s, c in zip(boxes, scores, classes):
                  
                  y1abs, x1abs = b[0] * image_h, b[1] * image_w
                  y2abs, x2abs = b[2] * image_h, b[3] * image_w
                  
                  list2append = [x1abs, y1abs, x2abs, y2abs, s, c]
                  line2append = ','.join([str(item) for item in list2append])
                  
                  line2write.append(line2append)
              
              line2write = ' '.join(line2write)
              text_file.write(line2write + os.linesep)
      # print('to_file done')
          
      return detections

In [27]:
import os
import imageio
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
from PIL import Image, ImageDraw, ImageFont


charset = {'0': 0,  '1': 1,  '2': 2,  '3': 3,  '4': 4,  '5': 5,
           '6': 6,  '7': 7,  '8': 8,  '9': 9,  'A': 10, 'B': 11,
           'C': 12, 'D': 13, 'E': 14, 'F': 15, 'G': 16, 'H': 17,
           'I': 18, 'J': 19, 'K': 20, 'L': 21, 'M': 22, 'N': 23,
           'O': 24, 'P': 25, 'R': 26, 'S': 27, 'T': 28, 'U': 29,
           'V': 30, 'X': 31, 'Z': 32, '<': 33, '>': 34, '(': 35,
           ')': 36, '$': 37, '#': 38, '^': 39, 's': 40, '-': 41,
           '*': 42, '%': 43, '?': 44, '!': 45, '+': 46} # <nul> = +

class CharacterRecognizer:
  def __init__(self, graph_path=None, verbose=False):
    self.graph_path = graph_path #path to the model which is loaded as a graph
    self.session = None
    self.input = None
    self.output = []
    self.class_num = 1
    self.verbose = verbose

    self.idx_lbl = {} #this is the functionally inverse to charset
    for key in charset.keys():
      self.idx_lbl[charset[key]] = key
    self.init_recognizer()
    print('character recognizer initialized!')

  def init_recognizer(self):

    # load graph and label map from default folder
    if self.graph_path is None:
      self.graph_path = './frozen_models/ocr_graph.pb'

    # check existence of the two files
    if not os.path.exists(self.graph_path):
      raise IOError('Invalid ocr_graph path! {}'.format(self.graph_path))

    # load frozen graph
    recognition_graph = tf.Graph()
    with recognition_graph.as_default():
      od_graph_def = tf.GraphDef()
      with tf.gfile.GFile(self.graph_path, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
    self.session = tf.Session(graph=recognition_graph)

    # prepare input and output request
    self.input = recognition_graph.get_tensor_by_name('ocr_input:0')
    # self.output.append(recognition_graph.get_tensor_by_name('chars_logit:0'))
    # self.output.append(recognition_graph.get_tensor_by_name('chars_log_prob:0'))
    self.output.append(recognition_graph.get_tensor_by_name('predicted_chars:0'))
    self.output.append(recognition_graph.get_tensor_by_name('predicted_scores:0'))
    # self.output.append(recognition_graph.get_tensor_by_name('predicted_text:0'))


  def clear_session(self):
    if self.session is not None:
      self.session.close()

  def predict(self, image_np, draw=False):
    assert image_np.shape == (180, 180, 3)
    img_in = np.expand_dims(image_np, axis=0)
    codes, scores = self.session.run(self.output, feed_dict={self.input: img_in}) #returns codes and scores for each code (single letter)
    codes, scores = [np.squeeze(x) for x in [codes, scores]]
    print(len(codes), codes)
    score_ave = 0
    text = ''
    for char, score in zip(codes, scores):
      if not self.idx_lbl[char] == '+':
        score_ave += score
        text += self.idx_lbl[char]
    score_ave /= len(text)

    if self.verbose:
      self.visualize_recognition_result(image_np, text, score_ave)


    img_show = self.draw_result(image_np, text, score_ave) if draw else image_np
    
    # print(f"text = {text}")

    return text, score_ave, np.array(img_show)

  @staticmethod
  def visualize_recognition_result(image_np, text, scores):
    img_pil = Image.fromarray(image_np)
    img_show = ImageDraw.Draw(img_pil)
    font = ImageFont.truetype('./Arial.ttf', 60)
    img_show.text((45, 60), text=text, font=font, fill=(255, 0, 255))
    img_pil.show()

  @staticmethod
  def draw_result(image_np, text, scores):
    img_pil = Image.fromarray(image_np)
    img_show = ImageDraw.Draw(img_pil)
    font = ImageFont.truetype('./Arial.ttf', 60)
    img_show.text((45, 60), text=text, font=font, fill=(255, 0, 255))
    return img_pil


In [28]:
path2images = []
test_panels = './data/test_panels/'
for filename in os.listdir(test_panels):
    file_path = os.path.join(test_panels, filename)
    path2images.append(file_path)    

In [29]:
recognizer = CharacterRecognizer(verbose=False)
overall_det_times = []
overall_lbl_times = []

for image_path in tqdm(path2images[0:1]):
    time_det, dets = inferences_and_times(image_path)
    boxes, scores, classes = dets['detection_boxes'], dets['detection_scores'], dets['detection_classes']
    boxes, scores, classes = [np.squeeze(x) for x in [boxes, scores, classes]]
    print(boxes.shape, boxes)
    print(scores.shape, scores)
    print(classes.shape, classes)
    button_patches, button_positions, _ = button_candidates(boxes, scores, image_path)
    t0 = cv2.getTickCount()
    for button_imgs in button_patches:
        button_text, button_score, _ = recognizer.predict(button_imgs)
    t1 = cv2.getTickCount()
    time_lbl = (t1-t0)/cv2.getTickFrequency()
    
    overall_det_times.append(time_det)
    overall_lbl_times.append(time_lbl)

    print(f"Det time = {time_det}")
    print(f"Lbl time = {time_lbl}")

character recognizer initialized!


  0%|          | 0/1 [00:00<?, ?it/s]

Tensor("Preprocessor_1/stack:0", shape=(1, 512, 512, 3), dtype=float32) Tensor("Preprocessor_1/stack_1:0", shape=(1, 3), dtype=int32)





ValueError: Your Layer or Model is in an invalid state. This can happen for the following cases:
 1. You might be interleaving estimator/non-estimator models or interleaving models/layers made in tf.compat.v1.Graph.as_default() with models/layers created outside of it. Converting a model to an estimator (via model_to_estimator) invalidates all models/layers made before the conversion (even if they were not the model converted to an estimator). Similarly, making a layer or a model inside a a tf.compat.v1.Graph invalidates all layers/models you previously made outside of the graph.
2. You might be using a custom keras layer implementation with custom __init__ which didn't call super().__init__.  Please check the implementation of <class 'object_detection.models.ssd_efficientnet_bifpn_feature_extractor.SSDEfficientNetB0BiFPNKerasFeatureExtractor'> and its bases.

In [None]:
inference_as_raw_output(path2images[0:1])

Current data set is None
Ready to start inference on 1 images!


  0%|          | 0/1 [00:00<?, ?it/s]


ValueError: Your Layer or Model is in an invalid state. This can happen for the following cases:
 1. You might be interleaving estimator/non-estimator models or interleaving models/layers made in tf.compat.v1.Graph.as_default() with models/layers created outside of it. Converting a model to an estimator (via model_to_estimator) invalidates all models/layers made before the conversion (even if they were not the model converted to an estimator). Similarly, making a layer or a model inside a a tf.compat.v1.Graph invalidates all layers/models you previously made outside of the graph.
2. You might be using a custom keras layer implementation with custom __init__ which didn't call super().__init__.  Please check the implementation of <class 'object_detection.models.ssd_efficientnet_bifpn_feature_extractor.SSDEfficientNetB0BiFPNKerasFeatureExtractor'> and its bases.

python3 model_main_tf2.py --pipeline_config_path=./models/efficientdet_do/v3/pipeline.config --model_dir=./models/efficientdet_do/v3/ --checkpoint_every_n=500 --num_workers=4 --alsologtostderr

python3 model_main_tf2.py --pipeline_config_path=./models/efficientdet_do/v3/pipeline.config --model_dir=./models/efficientdet_do/v3/ --checkpoint_dir=./models/efficientdet_do/v3/ --num_workers=4 --sample_1_of_n_eval_examples=1