![Degirum banner](https://raw.githubusercontent.com/DeGirum/PySDKExamples/main/images/degirum_banner.png)
## This notebook is an example of how to pipeline two models. 
A video stream (from a camera or a video file) is processed by the person detection model. Each detected person bounding box is then cropped and processed by the pose detection model, one box at a time. The combined result is then displayed.

This example uses `degirum_tools.streams` streaming toolkit.

This script works with the following inference options:

1. Run inference on DeGirum Cloud Platform;
2. Run inference on DeGirum AI Server deployed on localhost or on another computer in your LAN or VPN;
3. Run inference on DeGirum ORCA accelerator directly installed on your computer.

To switch between these options, set `hw_location` to the appropriate value.

When running this notebook locally, you need to specify your cloud API access token in the [env.ini](../../env.ini) file, located in the same directory as this notebook.

When running this notebook in Google Colab, the cloud API access token should be stored in a user secret named `DEGIRUM_CLOUD_TOKEN`.

The script can read video from a local camera, a network stream, or a video file. Assign the camera index, stream URL, or video file path to `video_source` in the code below.
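
As a minimal sketch of how the different kinds of `video_source` values could be told apart, the helper below classifies a value as a camera index, an RTSP stream, an HTTP(S) URL, or a file path. The function name `classify_video_source` and the returned labels are hypothetical and are not part of `degirum_tools`; the toolkit does its own handling of `video_source`.

```python
from urllib.parse import urlparse


def classify_video_source(video_source):
    """Return a label for what kind of source `video_source` looks like.

    Hypothetical helper for illustration only; not a degirum_tools function.
    """
    # An integer (or a string made only of digits) is treated as a camera index
    if isinstance(video_source, int) or (
        isinstance(video_source, str) and video_source.isdigit()
    ):
        return "camera"

    # Otherwise look at the URL scheme, if any
    scheme = urlparse(str(video_source)).scheme.lower()
    if scheme == "rtsp":
        return "rtsp_stream"
    if scheme in ("http", "https"):
        return "url"

    # Anything else is assumed to be a path to a local video file
    return "file"
```

A check like this can be handy for catching a mistyped source before the pipeline starts.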

In [None]:
# make sure degirum-tools package is installed
!pip show degirum-tools || pip install degirum-tools

#### Specify where you want to run your inferences, the model zoo URL, the model names, and the video source

In [10]:
# hw_location: where you want to run inference
#     "@cloud" to use DeGirum cloud
#     "@local" to run on local machine
#     IP address for AI server inference
# model_zoo_url: url/path for model zoo
#     cloud_zoo_url: valid for @cloud, @local, and ai server inference options
#     '': ai server serving models from local folder
#     path to json file: single model zoo in case of @local inference
# people_det_model_name: name of the model for detecting people
# pose_det_model_name: name of the model for pose detection
# video_source: video source for inference
#     camera index for local camera
#     URL of RTSP stream
#     URL of YouTube Video
#     path to video file (mp4 etc)
hw_location = "@cloud"
model_zoo_url = "degirum/public"
people_det_model_name = "yolo_v5s_person_det--512x512_quant_n2x_orca1_1"
pose_det_model_name = "mobilenet_v1_posenet_coco_keypoints--353x481_quant_n2x_orca1_1"
video_source = "https://raw.githubusercontent.com/DeGirum/PySDKExamples/main/images/WalkingPeople.mp4"

#### The rest of the cells below should run without any modifications

In [11]:
import degirum as dg, degirum_tools
from degirum_tools import streams as dgstreams

# connect to AI inference engine
zoo = dg.connect(hw_location, model_zoo_url, degirum_tools.get_token())

# load person detection model
person_det_model = dg.load_model(
    model_name=people_det_model_name,
    inference_host_address=hw_location,
    zoo_url=model_zoo_url,
    token=degirum_tools.get_token(),
    overlay_show_probabilities=True,
    overlay_line_width=1,
)

# load pose detection model
pose_det_model = dg.load_model(
    model_name=pose_det_model_name,
    inference_host_address=hw_location,
    zoo_url=model_zoo_url,
    token=degirum_tools.get_token(),
    output_pose_threshold=0.2,
    overlay_line_width=1,
    overlay_alpha=1,
    overlay_show_labels=False,
    overlay_color=(255, 0, 0),
)

In [8]:
# Define pose detection gizmo (in degirum_tools.streams terminology)
class PoseDetectionGizmo(dgstreams.AiGizmoBase):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._cur_result = None

    def on_result(self, result):
        # result.info holds a StreamMeta object (AiGizmoBase puts it there)
        meta = result.info

        # find last crop meta in that StreamMeta object
        crop_meta = meta.find_last(dgstreams.tag_crop)

        if self._cur_result is None:
            # save first pose result object at the beginning of new frame in order to accumulate all poses into it
            self._cur_result = result
            # replace cropped image with full annotated image which came from person detector to show person boxes as well as poses
            self._cur_result._input_image = crop_meta["original_result"].image_overlay

        if crop_meta["cropped_result"] is not None:
            # convert pose coordinates from the crop back to the original image
            box = crop_meta["cropped_result"]["bbox"]
            for r in result.results:
                for p in r["landmarks"]:
                    p["landmark"][0] += box[0]
                    p["landmark"][1] += box[1]

            if self._cur_result is not result:
                # accumulate all other detected poses into current result object
                self._cur_result._inference_results += result.results

        # if this is the last crop of the frame:
        if crop_meta["is_last_crop"]:
            # append to meta accumulated result with tag_inference so display will show it with AI annotations
            meta.append(self._cur_result, dgstreams.tag_inference)
            # send accumulated result
            self.send_result(dgstreams.StreamData(self._cur_result.image, meta))
            self._cur_result = None
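
The two ideas in the gizmo above (shifting landmarks from crop coordinates back to full-frame coordinates, and accumulating poses until the last crop of a frame) can be sketched without the SDK using plain lists and dicts. The names `shift_landmarks` and `group_by_frame` are hypothetical, and the sketch assumes that the first two entries of a bounding box are the top-left corner, as the gizmo code implies by using `box[0]` and `box[1]`.

```python
import copy


def shift_landmarks(pose_results, bbox):
    """Return a copy of pose_results moved from crop to full-frame coordinates.

    Assumes bbox[0], bbox[1] is the top-left corner of the person box.
    """
    x0, y0 = bbox[0], bbox[1]
    shifted = copy.deepcopy(pose_results)  # leave the caller's data untouched
    for r in shifted:
        for p in r["landmarks"]:
            p["landmark"][0] += x0
            p["landmark"][1] += y0
    return shifted


def group_by_frame(crop_stream):
    """Yield one list of full-frame poses per video frame.

    crop_stream yields (bbox, pose_results, is_last_crop) tuples; poses are
    accumulated until a tuple with is_last_crop=True closes the frame.
    """
    current = []
    for bbox, pose_results, is_last_crop in crop_stream:
        current += shift_landmarks(pose_results, bbox)
        if is_last_crop:
            yield current
            current = []
```

Unlike the gizmo, which modifies the result object in place, this sketch copies the data first so the inputs stay unchanged.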

In [None]:
# create gizmos

source = dgstreams.VideoSourceGizmo(video_source)  # video source
person = dgstreams.AiSimpleGizmo(person_det_model)  # person detector
pose = PoseDetectionGizmo(pose_det_model)  # pose detector
crop = dgstreams.AiObjectDetectionCroppingGizmo(["person"])  # cropping gizmo
display = dgstreams.VideoDisplayGizmo("Poses", show_ai_overlay=True)  # display

# create pipeline and composition, then start it
dgstreams.Composition(source >> person >> crop >> pose >> display).start()
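
To illustrate the chaining syntax used above, here is a toy sketch of how a `>>` operator can link processing stages in Python. The `Stage` class is hypothetical and deliberately simplistic: it runs stages synchronously in a single thread and is not how `degirum_tools.streams` implements gizmos.

```python
class Stage:
    """Toy pipeline stage: wraps a function and remembers the next stage."""

    def __init__(self, func):
        self.func = func
        self.next = None

    def __rshift__(self, other):
        # `a >> b` links a to b and returns b, so that
        # `a >> b >> c` chains the stages left to right
        self.next = other
        return other

    def push(self, item):
        # Process the item here, then hand the output to the next stage, if any
        out = self.func(item)
        if self.next is not None:
            return self.next.push(out)
        return out
```

For example, after `double = Stage(lambda x: x * 2)` and `inc = Stage(lambda x: x + 1)`, the expression `double >> inc` links the two stages, and `double.push(3)` passes the value through both.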