# Simple Object Detection in TensorFlow

<a target="_blank" href="https://colab.research.google.com/github/LuisAngelMendozaVelasco/TensorFlow-Advanced_Techniques_Specialization/blob/master/Advanced_Computer_Vision_with_TensorFlow/Week2/Labs/C3_W2_Lab_1_Simple_Object_Detection.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png">Run in Google Colab</a>

This lab walks you through using the object detection models available on [TensorFlow Hub](https://www.tensorflow.org/hub). In the following sections, you will:

* explore TensorFlow Hub for object detection models
* load the models in your workspace
* preprocess an image for inference
* run inference on the models and inspect the output

Let's get started!

## Imports

In [1]:
import tensorflow as tf
import tensorflow_hub as hub
from PIL import Image, ImageOps
import tempfile
from urllib.request import urlopen
from io import BytesIO



### Download the model from TensorFlow Hub

TensorFlow Hub is a repository of trained machine learning models that you can reuse in your own projects.
- You can browse the domains it covers and their subcategories [here](https://tfhub.dev/).
- For this lab, you will want to look at the [image object detection subcategory](https://tfhub.dev/s?module-type=image-object-detection).
- You can select a model to see more information about it and copy its URL so you can download it to your workspace.
- We selected a Faster R-CNN model with an [Inception ResNet version 2](https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1) backbone.
- You can also modify the following cell to choose the other model that we selected, [SSD MobileNet version 2](https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1).

In [2]:
# You can switch the commented lines here to pick the other model

# Inception resnet version 2
module_handle = "https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1"

# You can choose ssd mobilenet version 2 instead and compare the results
# module_handle = "https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1"

#### Load the model

Next, you'll load the model specified by the `module_handle`.
- Loading the model can take a few minutes.

In [3]:
model = hub.load(module_handle)

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


2024-08-29 16:42:21.667018: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2021] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 2018 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5


#### Choose the default signature

Some models on TensorFlow Hub can be used for different tasks, so each model's documentation should state which *signature* to use when running the model.
- To check whether a model has more than one signature, you can run something like `print(hub.load(module_handle).signatures.keys())`. The models used in this lab only have the `default` signature, so you don't have to worry about other types.

In [4]:
# Take a look at the available signatures for this particular model
model.signatures.keys()

KeysView(_SignatureMap({'default': <ConcreteFunction () -> Dict[['detection_class_entities', TensorSpec(shape=(None, 1), dtype=tf.string, name=None)], ['detection_class_names', TensorSpec(shape=(None, 1), dtype=tf.string, name=None)], ['detection_boxes', TensorSpec(shape=(None, 4), dtype=tf.float32, name=None)], ['detection_class_labels', TensorSpec(shape=(None, 1), dtype=tf.int64, name=None)], ['detection_scores', TensorSpec(shape=(None, 1), dtype=tf.float32, name=None)]] at 0x70CCEA4DB980>}))

Choose the `default` signature for your object detector.
- For these object detection models, the `default` signature accepts a batch of image tensors and outputs a dictionary describing the detected objects, which is what you want here.

In [5]:
detector = model.signatures['default']

### download_and_resize_image

This function downloads the image at a given `url`, resizes it, converts it to RGB, and saves it to disk as a JPEG.

In [6]:
def download_and_resize_image(url, new_width=256, new_height=256):
    '''
    Fetches an image online, resizes it and saves it locally.

    Args:
        url (string) -- link to the image
        new_width (int) -- size in pixels used for resizing the width of the image
        new_height (int) -- size in pixels used for resizing the height of the image

    Returns:
        (string) -- path to the saved image
    '''

    # Create a temporary file ending with ".jpg" and close its handle so it is not leaked
    with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as temp_file:
        filename = temp_file.name

    # Opens the given URL
    response = urlopen(url)

    # Reads the image fetched from the URL
    image_data = response.read()

    # Puts the image data in memory buffer
    image_data = BytesIO(image_data)

    # Opens the image
    pil_image = Image.open(image_data)

    # Resizes the image; crops it if the aspect ratio is different
    pil_image = ImageOps.fit(pil_image, (new_width, new_height), Image.LANCZOS)

    # Converts to the RGB colorspace
    pil_image_rgb = pil_image.convert("RGB")

    # Saves the image to the temporary file created earlier
    pil_image_rgb.save(filename, format="JPEG", quality=90)

    print("Image downloaded to %s." % filename)

    return filename
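
The crop mentioned in the resize step can be sketched in plain Python. The helper below, `center_crop_box`, is a hypothetical name used only for illustration. It mirrors the idea behind `ImageOps.fit` with its default centering (keep the largest centered region that has the target aspect ratio, then resize it) and is not Pillow's actual implementation.

```python
def center_crop_box(src_width, src_height, new_width, new_height):
    """Return the (left, top, right, bottom) region that would be kept before resizing."""
    src_ratio = src_width / src_height
    new_ratio = new_width / new_height

    if src_ratio == new_ratio:
        # Same aspect ratio: nothing needs to be cropped
        crop_width, crop_height = src_width, src_height
    elif src_ratio > new_ratio:
        # Source is wider than the target: keep the full height, trim the sides
        crop_width, crop_height = new_ratio * src_height, src_height
    else:
        # Source is taller than the target: keep the full width, trim top and bottom
        crop_width, crop_height = src_width, src_width / new_ratio

    # Center the crop region inside the source image
    left = (src_width - crop_width) * 0.5
    top = (src_height - crop_height) * 0.5
    return (left, top, left + crop_width, top + crop_height)


# The lab's image is 3872 x 2592; the function's default target is 256 x 256
print(center_crop_box(3872, 2592, 256, 256))  # (640.0, 0.0, 3232.0, 2592.0)
```

With the default 256 x 256 target, 640 pixels would be trimmed from each side of the sample image, which is why the lab passes the original width and height instead.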

### Download and preprocess an image

Now, using `download_and_resize_image` you can get a sample image online and save it locally.
- We've provided a URL for you, but feel free to choose another image to run through the object detector.
- You can use the original width and height of the image, but feel free to modify them and see what results you get.
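
If you experiment with a smaller size, choosing a width and height with roughly the same aspect ratio as the original keeps cropping to a minimum. A minimal sketch, assuming a hypothetical helper `scaled_size` and an arbitrary longest side of 1280 pixels (neither is part of the lab):

```python
def scaled_size(width, height, max_side):
    """Shrink (width, height) so the longest side is at most max_side, keeping the aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        # Already small enough: never upscale
        return (width, height)
    # Rounding may shift the aspect ratio very slightly
    return (max(1, round(width * max_side / longest)),
            max(1, round(height * max_side / longest)))


new_width, new_height = scaled_size(3872, 2592, 1280)
print(new_width, new_height)  # 1280 857
```

The resulting values could then be passed as `download_and_resize_image(image_url, new_width, new_height)`.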

In [7]:
# You can choose a different URL that points to an image of your choice
image_url = "https://upload.wikimedia.org/wikipedia/commons/f/fb/20130807_dublin014.JPG"

# Download the image and use the original height and width
downloaded_image_path = download_and_resize_image(image_url, 3872, 2592)

Image downloaded to /tmp/tmpk3ftl3ad.jpg.


### run_detector

This function takes the object detection model `detector` and the path to a sample image, then uses the model to detect objects and print the detection scores, predicted class entities, and detection boxes.
- `run_detector` uses `load_img` to convert the image into a tensor.

In [8]:
def load_img(path):
    '''
    Loads a JPEG image and converts it to a tensor.

    Args:
        path (string) -- path to a locally saved JPEG image

    Returns:
        (tensor) -- an image tensor
    '''

    # Read the file
    img = tf.io.read_file(path)

    # Decode the JPEG bytes into a uint8 tensor with 3 color channels
    img = tf.image.decode_jpeg(img, channels=3)

    return img

def run_detector(detector, path):
    '''
    Runs inference on a local file using an object detection model.

    Args:
        detector (model) -- an object detection model loaded from TF Hub
        path (string) -- path to an image saved locally
    '''

    # Load an image tensor from a local file path
    img = load_img(path)

    # Convert to float32 (values scaled to [0, 1]) and add a batch dimension in front
    converted_img = tf.image.convert_image_dtype(img, tf.float32)[tf.newaxis, ...]

    # Run inference using the model
    result = detector(converted_img)

    # Convert the output tensors to NumPy arrays, keeping the same dictionary keys
    result = {key: value.numpy() for key, value in result.items()}

    # Print results
    print("Found %d objects." % len(result["detection_scores"]))

    print("\nDetection scores:\n", result["detection_scores"])
    print("\nDetection class entities:\n", result["detection_class_entities"])
    print("\nDetection boxes:\n", result["detection_boxes"])
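
The values in `detection_boxes` are easier to interpret once converted to pixels. The sketch below assumes each box is ordered `[ymin, xmin, ymax, xmax]` with coordinates normalized to the range 0 to 1, which is how the TF Hub pages for these Open Images models describe them; check the model page if you switch to a different detector. `box_to_pixels` is a hypothetical helper, not part of the lab.

```python
def box_to_pixels(box, image_width, image_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to (left, top, right, bottom) pixels."""
    ymin, xmin, ymax, xmax = (float(value) for value in box)
    return (int(round(xmin * image_width)), int(round(ymin * image_height)),
            int(round(xmax * image_width)), int(round(ymax * image_height)))


# A made-up box (not model output) on a 3872 x 2592 image
print(box_to_pixels((0.25, 0.5, 0.75, 1.0), 3872, 2592))  # (1936, 648, 3872, 1944)
```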

### Run inference on the image

You can run your detector by calling the `run_detector` function. This will print the number of objects found followed by three arrays:

* the detection scores of each object found (i.e. how confident the model is)
* the classes of each object found
* the bounding boxes of each object

You will see how to overlay this information on the original image in the next sections and in this week's assignment!
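
Because the three outputs are aligned by index, low-confidence detections can be dropped with a simple score threshold. The sketch below uses made-up sample values rather than real model output, and the helper name `filter_detections` and the 0.5 cutoff are assumptions for illustration. It also assumes the class entities arrive as byte strings, which is how `tf.string` tensors appear after calling `.numpy()`.

```python
def filter_detections(scores, entities, boxes, min_score=0.5):
    """Keep only the detections whose score is at least min_score."""
    kept = []
    for score, entity, box in zip(scores, entities, boxes):
        if score >= min_score:
            # Decode byte strings to regular text when needed
            label = entity.decode("utf-8") if isinstance(entity, bytes) else str(entity)
            kept.append({
                "label": label,
                "score": float(score),
                "box": [float(value) for value in box],
            })
    return kept


# Made-up sample values for illustration only (not actual model output)
sample_scores = [0.9, 0.6, 0.2]
sample_entities = [b"Person", b"Building", b"Tree"]
sample_boxes = [[0.1, 0.1, 0.5, 0.5], [0.0, 0.2, 0.9, 0.8], [0.3, 0.6, 0.4, 0.7]]

for detection in filter_detections(sample_scores, sample_entities, sample_boxes):
    print(detection["label"], detection["score"])
```

Inside `run_detector`, the same call could be made with `result["detection_scores"]`, `result["detection_class_entities"]`, and `result["detection_boxes"]`.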

In [9]:
# Runs the object detection model and prints information about the objects found
run_detector(detector, downloaded_image_path)


Found 100 objects.

Detection scores:
 [0.65448576 0.6114539  0.6042274  0.59263176 0.5921922  0.58049124
 0.5514052  0.4946692  0.47515702 0.47342214 0.43995938 0.4148516
 0.40629733 0.3982894  0.397653   0.37621075 0.37279427 0.36574844
 0.35260734 0.33274773 0.30428663 0.2727656  0.26865074 0.25777173
 0.25290567 0.24612108 0.23403783 0.20342937 0.18229423 0.1804577
 0.17571369 0.16435082 0.15849963 0.15665978 0.15470928 0.15452768
 0.14924915 0.13340692 0.12948267 0.12649699 0.12044197 0.11767332
 0.11356067 0.11114843 0.11100272 0.1091493  0.10604016 0.0894051
 0.08598321 0.08280165 0.08104508 0.07806055 0.07760363 0.07628626
 0.07546864 0.0744413  0.07427155 0.07204875 0.07177526 0.07102234
 0.07032738 0.06809694 0.06304476 0.06285913 0.0627095  0.06223931
 0.05882135 0.05815005 0.05795752 0.05787573 0.05462372 0.05274337
 0.05133702 0.0482655  0.04708424 0.04682794 0.04495208 0.04405155
 0.0436071  0.04113495 0.0410996  0.03968572 0.03934988 0.03912779
 0.03879521 0.03878608 0.0