![Degirum banner](https://raw.githubusercontent.com/DeGirum/PySDKExamples/main/images/degirum_banner.png)
## Example script illustrating asynchronous parallel execution of sound classification on an audio stream and object detection on a video stream
This notebook shows how to use DeGirum PySDK to run parallel inferences on two asynchronous data streams with different frame rates.
To achieve maximum performance, this example uses non-blocking batch prediction mode.

This script works with the following inference options:

1. Run inference on DeGirum Cloud Platform;
2. Run inference on DeGirum AI Server deployed on localhost or on a computer in your LAN or VPN;
3. Run inference on DeGirum ORCA accelerator directly installed on your computer.

To try different options, you need to specify the appropriate `hw_location` option. 

When running this notebook locally, you need to specify your cloud API access token in the [env.ini](../../env.ini) file, located in the same directory as this notebook.

When running this notebook in Google Colab, the cloud API access token should be stored in a user secret named `DEGIRUM_CLOUD_TOKEN`.

**The `pyaudio` package with portaudio is required to run this sample.**

The script may use either a web camera or a local camera connected to the machine running this code. Alternatively, you may use a video file. The camera index, URL,
or file path needs to be specified either in the code below by assigning `video_source` or in the [env.ini](../../env.ini) file by defining the `CAMERA_ID` variable and
assigning `video_source = None`.

The script may use a local microphone connected to the machine running this code. Alternatively, you may use a WAV file.
The mic index or WAV filename needs to be specified either in the code below by assigning `audio_source` or in the [env.ini](../../env.ini) file by defining the `AUDIO_ID` variable
and assigning `audio_source = None`.
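
The "explicit value, else configuration file" rule described above can be sketched as follows. This is a minimal illustration only: `resolve_source` is a hypothetical helper, not a `degirum_tools` function, and the conversion of numeric strings to device indices is an assumption about how a value read from a text file might be interpreted.

```python
from typing import Optional, Union


def resolve_source(
    explicit: Union[int, str, None], fallback: Optional[str]
) -> Union[int, str]:
    """Hypothetical helper: prefer the value set in code, else the configured one."""
    # Compare against None rather than using `or`: index 0 is a valid device
    # and must not be treated as "not specified".
    value = explicit if explicit is not None else fallback
    if value is None:
        raise ValueError("no source specified in code or in the configuration file")
    # Assumption: a purely numeric string denotes a device index.
    if isinstance(value, str) and value.strip().isdigit():
        return int(value.strip())
    return value
```

For example, `resolve_source(None, "0")` falls back to the configured value and returns the index `0`, while `resolve_source("clip.wav", "0")` keeps the explicitly assigned file name.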

In [None]:
# make sure degirum-tools package is installed
!pip show degirum-tools || pip install degirum-tools

# to install pyaudio package, uncomment the following lines
#!apt install libasound2-dev portaudio19-dev libportaudio2 libportaudiocpp0
#!pip show pyaudio || pip install pyaudio

#### Specify video and audio sources

In [None]:
video_source = None  # camera index, URL, or video file path; 0 to use default local camera, None to take from env.ini file
audio_source = None  # mic index or WAV file name; 0 to use default mic, None to take from env.ini file

#### Specify where you want to run your inferences

In [None]:
import degirum as dg
import degirum_tools

#
# Please UNCOMMENT only ONE of the following lines to specify where to run AI inference
#

hw_location = dg.CLOUD  # <-- on the Cloud Platform
# hw_location = degirum_tools.get_ai_server_hostname() # <-- on AI Server deployed in your LAN
# hw_location = dg.LOCAL # <-- on ORCA accelerator installed on this computer

model_zoo_url = "degirum/public"
sound_model_name = "mobilenet_v1_yamnet_sound_cls--96x64_quant_n2x_orca1_1"
detection_model_name = "mobilenet_v2_ssd_coco--300x300_quant_n2x_orca1_1"

#### The rest of the cells below should run without any modifications

In [None]:
# load YAMNET sound classification model for DeGirum Orca AI accelerator
# (change model name to "...n2x_cpu_1" to run it on CPU)
sound_model = dg.load_model(
    model_name=sound_model_name,
    inference_host_address=hw_location,
    zoo_url=model_zoo_url,
    token=degirum_tools.get_token(),
)

# load MobileNetv2+SSD object detection model for DeGirum Orca AI accelerator
# (change model name to "...n2x_cpu_1" to run it on CPU)
detection_model = dg.load_model(
    model_name=detection_model_name,
    inference_host_address=hw_location,
    zoo_url=model_zoo_url,
    token=degirum_tools.get_token(),
)

# set non-blocking mode for both models
sound_model.non_blocking_batch_predict = True
detection_model.non_blocking_batch_predict = True
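
The main loop in the last cell relies on each `next()` call returning immediately: in non-blocking mode a result can be `None` when it is not ready yet. The sketch below imitates that polling pattern with plain Python generators so it can be tried without any hardware. `simulated_non_blocking` and `poll_both` are hypothetical names used only for illustration and are not part of PySDK.

```python
def simulated_non_blocking(results, period):
    """Hypothetical stand-in for a non-blocking predictor.

    Yields None on (period - 1) polls, then one real result, and repeats.
    """
    for result in results:
        for _ in range(period - 1):
            yield None  # result not ready yet
        yield result


def poll_both(fast, slow):
    """Poll two generators in lockstep and record only the ready results."""
    log = []
    while True:
        try:
            fast_result = next(fast)
            slow_result = next(slow)
        except StopIteration:
            break  # stop as soon as either stream is exhausted
        if fast_result is not None:
            log.append(("fast", fast_result))
        if slow_result is not None:
            log.append(("slow", slow_result))
    return log


log = poll_both(
    simulated_non_blocking(["f0", "f1", "f2", "f3"], period=1),
    simulated_non_blocking(["s0", "s1"], period=2),
)
print(log)
```

Because neither generator blocks, the faster stream is never held back by the slower one. The loop simply skips a stream whenever its result is `None`.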

In [None]:
audio_sampling_rate_hz = sound_model.model_info.InputSamplingRate[0]
audio_buffer_size = (
    sound_model.model_info.InputWaveformSize[0] // 2
)  # two read buffers in waveform for half-length overlapping

with degirum_tools.Display("Async Streams") as display, degirum_tools.open_audio_stream(
    audio_sampling_rate_hz, audio_buffer_size, audio_source
) as audio_stream, degirum_tools.open_video_stream(video_source) as video_stream:
    # create prediction result generators:
    sound_predictor = sound_model.predict_batch(
        degirum_tools.audio_overlapped_source(audio_stream, lambda: False, True)
    )
    detection_predictor = detection_model.predict_batch(
        degirum_tools.video_source(video_stream)
    )

    sound_label = ""
    try:
        while True:  # press 'x' or 'q' to abort
            # do asynchronous ML inferences for both models (each one can be None if not ready):
            sound_result = next(sound_predictor)
            detection_result = next(detection_predictor)

            # process sound classification result (just remember the text)
            if sound_result is not None:
                sound_label = f"{sound_result.results[0]['label']}: {sound_result.results[0]['score']}"

            # process video detection result (just display the annotated frame)
            if detection_result is not None:
                img = detection_result.image_overlay
                degirum_tools.put_text(
                    img,
                    sound_label,
                    (1, img.shape[0] - 40),
                    font_color=(0, 0, 0),
                    bg_color=(255, 255, 255),
                )
                display.show(img)
    except StopIteration:
        pass
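
The audio buffer size above is half the model waveform size, so each waveform is made of two read buffers and consecutive waveforms overlap by half their length. The sketch below illustrates that windowing with plain Python lists. `overlapped_waveforms` is a hypothetical function written for this illustration and is not necessarily how `degirum_tools.audio_overlapped_source` is implemented.

```python
def overlapped_waveforms(buffers):
    """Yield waveforms built from two consecutive buffers, advancing one buffer at a time."""
    previous = None
    for buffer in buffers:
        if previous is not None:
            # second half of the previous waveform becomes first half of this one
            yield previous + buffer
        previous = buffer


# three read buffers of 4 samples each; the waveform size is 8 samples
buffers = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
waveforms = list(overlapped_waveforms(buffers))
for waveform in waveforms:
    print(waveform)
```

With this scheme, N read buffers produce N - 1 waveforms, and every sample except those in the first and last buffers is classified twice.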