## This notebook is an example of how to pipeline two models. 
A video stream from a local camera is processed by the person detection model. The pose detection model then processes the detection results, one person bounding box at a time. The combined result is then displayed.

This example uses the `degirum_tools.streams` streaming toolkit.

This script works with the following inference options:

1. Run inference on the DeGirum Cloud Platform;
2. Run inference on a DeGirum AI Server deployed on localhost or on another computer in your LAN or VPN;
3. Run inference on a DeGirum ORCA accelerator installed directly on your computer.

To try different options, you just need to uncomment **one** of the lines in the code below.

You also need to specify your cloud API access token, cloud zoo URLs, and AI server hostname in the [env.ini](../../env.ini) file, located in the same directory as this notebook.

**Access to camera is required to run this sample.**

The script needs either a web camera or a local camera connected to the machine running this code. Specify the camera index or URL either in the code below by assigning `camera_id`, or in the [env.ini](../../env.ini) file by defining the `CAMERA_ID` variable and leaving `camera_id = None`.
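
The fallback rule described above can be sketched in a few lines. This is only an illustration: `resolve_camera_id` is a hypothetical helper, not a `degirum_tools` function, the `env` dictionary stands in for the already-parsed contents of `env.ini`, and the URL is a made-up placeholder. How `degirum_tools` actually resolves the camera may differ.

```python
def resolve_camera_id(camera_id, env):
    """Return camera_id if it is set, otherwise fall back to env["CAMERA_ID"].

    Hypothetical helper for illustration only; not part of degirum_tools.
    """
    if camera_id is not None:
        return camera_id  # an explicit value in the notebook wins
    value = env.get("CAMERA_ID")
    if value is None:
        raise ValueError("camera_id is None and CAMERA_ID is not defined")
    value = value.strip()
    # assumption: a purely numeric value is a local camera index, anything else is a URL
    return int(value) if value.isdigit() else value


print(resolve_camera_id(None, {"CAMERA_ID": "0"}))
print(resolve_camera_id(None, {"CAMERA_ID": "rtsp://camera.example/stream"}))
print(resolve_camera_id(1, {"CAMERA_ID": "0"}))
```

With `camera_id = None` the value comes from the settings; any other value assigned in the notebook takes precedence.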

In [None]:
# make sure degirum-tools package is installed
!pip show degirum-tools || pip install degirum-tools

#### Specify camera index here

In [None]:
camera_id = None         # camera index or URL; 0 to use default local camera, None to take from env.ini file

#### Specify where you want to run your inferences

In [None]:
import degirum as dg, degirum_tools

degirum_tools.configure_colab() # configure for Google Colab

#
# Please UNCOMMENT only ONE of the following lines to specify where to run AI inference
#

target = dg.CLOUD # <-- on the Cloud Platform
# target = degirum_tools.get_ai_server_hostname() # <-- on AI Server deployed in your LAN
# target = dg.LOCAL # <-- on ORCA accelerator installed on this computer

# connect to AI inference engine getting zoo URL and token from env.ini file
zoo = dg.connect(target, degirum_tools.get_cloud_zoo_url(), degirum_tools.get_token())

#### The rest of the cells below should run without any modifications

In [None]:
import cv2
from degirum_tools import streams as dgstreams

In [None]:
# load models for DeGirum Orca AI accelerator
# (change model name to "...n2x_cpu_1" to run it on CPU)
people_det_model = zoo.load_model("yolo_v5s_person_det--512x512_quant_n2x_orca1_1")
pose_model = zoo.load_model("mobilenet_v1_posenet_coco_keypoints--353x481_quant_n2x_orca1_1")

# adjust pose model properties
pose_model.output_pose_threshold = 0.2 # lower threshold
pose_model.overlay_line_width = 1
pose_model.overlay_alpha = 1
pose_model.overlay_show_labels = False
pose_model.overlay_color = (255, 0, 0)

# adjust people model properties
people_det_model.overlay_show_probabilities = True

In [None]:
# Define pose detection gizmo (in degirum_tools.streams terminology)
class PoseDetectionGizmo(dgstreams.AiGizmoBase):
    
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._cur_result = None
        
    def on_result(self, result):
        
        # here result.info carries the information AiGizmoBase attached to the frame sent for AI inference;
        # the code below treats result.info as the meta-info dictionary filled in by
        # AiObjectDetectionCroppingGizmo, because in our pipeline it is connected as a source of this gizmo
        meta = result.info
        if "original_result" in meta: # new frame comes
            if self._cur_result is not None:
                # send previous frame
                self.send_result(dgstreams.StreamData(self._cur_result.image, self._cur_result))                
            
            # save first pose result object at the beginning of new frame in order to accumulate all poses into it
            self._cur_result = result
            # replace original image with full annotated image which came from person detector to show person boxes as well as poses
            self._cur_result._input_image = meta["original_result"].image_overlay            
        
        if "cropped_index" in meta and "cropped_result" in meta:            
            # convert pose coordinates back to the original image coordinates
            box = meta["cropped_result"]["bbox"]
            for r in result.results:
                if 'landmarks' in r:
                    for p in r['landmarks']:
                        p['landmark'][0] += box[0]
                        p['landmark'][1] += box[1]
                        
            if self._cur_result is not result:
                # accumulate all other detected poses into current result object
                self._cur_result._inference_results += result.results
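
The two ideas in `on_result` above, shifting landmarks by the top-left corner of the person box and collecting all poses of one frame into a single result, can be tried without any hardware. The sketch below uses plain dictionaries in place of PySDK result objects. `merge_crop_poses` and the sample data are hypothetical, and unlike the gizmo it builds new lists instead of modifying results in place and also returns the last frame.

```python
def merge_crop_poses(crop_results):
    """Group per-crop pose results into per-frame lists in original-image coordinates.

    crop_results: iterable of (meta, poses) pairs, where
      meta  - dict with "cropped_result" -> {"bbox": [x1, y1, x2, y2]} and,
              on the first crop of every frame, an "original_result" key
      poses - list of dicts like {"landmarks": [{"landmark": [x, y]}, ...]}
    """
    frames = []
    for meta, poses in crop_results:
        if "original_result" in meta:
            frames.append([])  # a new frame starts here
        if not frames:
            continue  # ignore crops that arrive before the first frame marker
        x0, y0 = meta["cropped_result"]["bbox"][:2]  # top-left corner of the person box
        for pose in poses:
            shifted = [
                {"landmark": [p["landmark"][0] + x0, p["landmark"][1] + y0]}
                for p in pose.get("landmarks", [])
            ]
            frames[-1].append({"landmarks": shifted})
    return frames


# two persons in the first frame, one person in the second frame (made-up numbers)
stream = [
    ({"original_result": "frame 0", "cropped_result": {"bbox": [100, 50, 200, 300]}},
     [{"landmarks": [{"landmark": [10, 20]}]}]),
    ({"cropped_result": {"bbox": [300, 40, 380, 290]}},
     [{"landmarks": [{"landmark": [5, 5]}]}]),
    ({"original_result": "frame 1", "cropped_result": {"bbox": [0, 0, 50, 50]}},
     [{"landmarks": [{"landmark": [1, 2]}]}]),
]
frames = merge_crop_poses(stream)
print(len(frames), frames[0])
```

A landmark at `[10, 20]` inside a crop whose box starts at `(100, 50)` ends up at `[110, 70]` in the full image.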

In [None]:
# create composition object
c = dgstreams.Composition()

# create gizmos adding them to composition
source = c.add(dgstreams.VideoSourceGizmo(camera_id))  # video source
people_detection = c.add(dgstreams.AiSimpleGizmo(people_det_model))  # people detection gizmo
person_crop = c.add(
    dgstreams.AiObjectDetectionCroppingGizmo(["person"])
)  # cropping gizmo, which outputs cropped image for each detected person
pose_detection = c.add(PoseDetectionGizmo(pose_model))  # pose detection gizmo
display = c.add(
    dgstreams.VideoDisplayGizmo("Person Poses", show_ai_overlay=True, show_fps=True)
)  # display

# connect gizmos to create pipeline
source >> people_detection
person_crop.connect_to(source, 0)
person_crop.connect_to(people_detection, 1)
person_crop >> pose_detection >> display

# start execution of composition
c.start()
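
To make the wiring above easier to follow, here is a toy model of the connection graph. The `Node` class is a hypothetical stand-in and not the `degirum_tools.streams` implementation; it assumes that `a >> b` feeds input 0 of `b` and that the second argument of `connect_to` is the input index, which is how the calls in the cell above read.

```python
class Node:
    """Toy stand-in for a gizmo: records which upstream node feeds each input."""

    def __init__(self, name):
        self.name = name
        self.inputs = {}  # input index -> upstream Node

    def connect_to(self, other, inp=0):
        # connect the output of `other` to input number `inp` of this node
        self.inputs[inp] = other
        return self

    def __rshift__(self, other):
        # a >> b connects a to input 0 of b and returns b, so chains read left to right
        other.connect_to(self, 0)
        return other


names = ["source", "people_detection", "person_crop", "pose_detection", "display"]
source, detect, crop, pose, display = (Node(n) for n in names)

# same wiring as in the notebook cell
source >> detect
crop.connect_to(source, 0)
crop.connect_to(detect, 1)
crop >> pose >> display

for node in (detect, crop, pose, display):
    print(node.name, "<-", {i: up.name for i, up in sorted(node.inputs.items())})
```

The cropping node is the only one with two inputs: the raw frames on input 0 and the detection results on input 1, which lets it cut each detected person out of the matching frame.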
