## Visualization Notebook for RTLS Results

<img src="images/viz_rtls_sample.png" width=1080/>

### This notebook generates a bird's-eye view (BEV) video for the visualization of outputs from the RTLS (Real-Time Location System) reference application.

The floor plan map is the primary display area of the video, and the moving tracklets of each detected global ID are drawn on top of it. Camera views can optionally be displayed alongside the floor plan. When they are enabled, the video also shows the detection and tracking results from the individual cameras, and it begins with a rotation through the cameras: each camera's view is highlighted in turn while its field of view (FOV) is projected onto the floor plan map.
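
The timing of the camera rotation can be pictured with a small sketch. This is not the notebook's implementation: it assumes that the `sensorViewStartTimeSec`, `sensorViewDurationSec`, and `sensorViewGapSec` settings from the sample configuration below define back-to-back highlight windows, one per camera. The names `HighlightWindow`, `rotation_schedule`, and `highlighted_sensor` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HighlightWindow:
    """Time span (in seconds) during which one sensor view is highlighted."""
    sensor_index: int
    start_sec: float
    end_sec: float


def rotation_schedule(num_sensors, start_sec, duration_sec, gap_sec):
    """Return one highlight window per sensor, separated by a fixed gap.

    Assumed interpretation: sensor i is highlighted starting at
    start_sec + i * (duration_sec + gap_sec) for duration_sec seconds.
    """
    windows = []
    for i in range(num_sensors):
        begin = start_sec + i * (duration_sec + gap_sec)
        windows.append(HighlightWindow(i, begin, begin + duration_sec))
    return windows


def highlighted_sensor(windows, t_sec):
    """Return the index of the sensor highlighted at t_sec, or None."""
    for window in windows:
        if window.start_sec <= t_sec < window.end_sec:
            return window.sensor_index
    return None


# Values taken from the sample configuration; 8 sensors is an example count.
schedule = rotation_schedule(num_sensors=8, start_sec=2.0, duration_sec=1.0, gap_sec=0.1)
```

Under this interpretation, `highlighted_sensor(schedule, 2.5)` selects the first camera, and a query inside a gap or before the rotation starts returns `None`.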

#### Configuration

This notebook requires a configuration file in JSON format. A sample configuration file is shown below:

```json
{
    "inputSetting": {
        "calibrationPath": "path/to/calibration.json",
        "mapPath": "path/to/map.png",
        "rtlsLogPath": "path/to/mdx-rtls.log",
        "videoDirPath": "path/to/folder/containing/videos",
        "rawDataPath": "path/to/raw_data.log"
    },
    "outputSetting": {
        "outputVideoPath": "path/to/output_video.mp4",
        "outputMapHeight": 1080,
        "displaySensorViews": false,
        "sensorViewsLayout": "radial",
        "sensorViewDisplayMode": "rotational",
        "sensorFovDisplayMode": "rotational",
        "skippedBeginningTimeSec": 0.0,
        "outputVideoDurationSec": 60.0,
        "sensorSetup": 8,
        "bufferLengthThreshSec": 3.0,
        "trajectoryLengthThreshSec": 5.0,
        "sensorViewStartTimeSec": 2.0,
        "sensorViewDurationSec": 1.0,
        "sensorViewGapSec": 0.1
    }
}
```

- **`calibrationPath`**: Path to the standard Metropolis calibration file.
- **`mapPath`**: Path to the floor plan map image, utilized for plotting in BEV.
- **`rtlsLogPath`**: Path to the log file containing Kafka messages from the `mdx-rtls` topic, which includes the RTLS results in JSON format.
- **`videoDirPath`**: Directory path containing videos from each camera. This parameter is disregarded if `displaySensorViews` is `false`.
- **`rawDataPath`**: Path to the raw data that contains DeepStream perception results in JSON or protobuf format. In this example, the protobuf format with potential AMR data is used. This parameter is ignored if `displaySensorViews` is `false`.
- **`outputVideoPath`**: Path where the output video file will be saved.
- **`outputMapHeight`**: The height of the map image in the output video.
- **`displaySensorViews`**: A boolean flag to enable or disable the display of camera views. Enabling this can significantly increase RAM usage and slow down processing, depending on the number of cameras involved.
- **`sensorViewsLayout`**: The sensor views' layout has two options - "radial" and "split". The radial layout shows sensor views surrounding the map view at the center. The split layout shows the sensor views on the left and the map view on the right. This parameter is required if `displaySensorViews` is `true`.
- **`sensorViewDisplayMode`**: The display mode of the sensor views has two options - "rotational" and "cumulative". The rotational mode highlights each sensor view individually in a rotational manner. The cumulative mode keeps all the previous sensor views highlighted while circling through all sensor views. This parameter is required if `displaySensorViews` is `true`.
- **`sensorFovDisplayMode`**: The display mode of sensors' FOV in the map of floor plan has two options - "rotational" and "cumulative". The rotational mode displays each sensor's FOV individually in a rotational manner. The cumulative mode keeps all the previous sensors' FOV displayed while circling through all sensors. This parameter is required if `displaySensorViews` is `true`.
- **`skippedBeginningTimeSec`**: Duration (in seconds) to skip at the beginning of the output video, to allow for RTLS initialization time.
- **`outputVideoDurationSec`**: Length of the output video in seconds. Use a short duration for quick checks and a longer one for full-length outputs.
- **`sensorSetup`**: Required only if `displaySensorViews` is `true`. Specifies the configuration for tiled windows based on the number of cameras. Supported configurations are for 8, 12, 16, 30, 40, and 100 cameras. Selecting a configuration with fewer cameras than you have will display only that number, and choosing a larger configuration will leave additional windows blank.
- **`bufferLengthThreshSec`**: Threshold for buffered locations in seconds, used for smoothing trajectories.
- **`trajectoryLengthThreshSec`**: Maximum length (in seconds) of the trajectory plotted on the floor plan.
- **`sensorViewStartTimeSec`**: Start time for displaying each camera view in the rotation, required if `displaySensorViews` is `true`.
- **`sensorViewDurationSec`**: Duration for displaying each camera view in the rotation, required if `displaySensorViews` is `true`.
- **`sensorViewGapSec`**: Time gap between displaying each camera view in the rotation, required if `displaySensorViews` is `true`.
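
The conditional requirements in the list above can be checked before running the notebook. The following is a minimal sketch, not part of the `mdx` package: the helper `check_output_setting` and the constant names are hypothetical, and the rules encode only what the parameter descriptions state.

```python
import json

# Keys described above as required when displaySensorViews is true.
REQUIRED_WITH_SENSOR_VIEWS = [
    "sensorViewsLayout",
    "sensorViewDisplayMode",
    "sensorFovDisplayMode",
    "sensorSetup",
    "sensorViewStartTimeSec",
    "sensorViewDurationSec",
    "sensorViewGapSec",
]

# Allowed values as listed in the parameter descriptions.
ALLOWED_VALUES = {
    "sensorViewsLayout": {"radial", "split"},
    "sensorViewDisplayMode": {"rotational", "cumulative"},
    "sensorFovDisplayMode": {"rotational", "cumulative"},
    "sensorSetup": {8, 12, 16, 30, 40, 100},
}


def check_output_setting(output_setting):
    """Return a list of human-readable problems found in "outputSetting"."""
    problems = []
    if not output_setting.get("displaySensorViews", False):
        # The sensor-view parameters only matter when the views are shown.
        return problems
    for key in REQUIRED_WITH_SENSOR_VIEWS:
        if key not in output_setting:
            problems.append(f"missing required key '{key}'")
    for key, allowed in ALLOWED_VALUES.items():
        if key in output_setting and output_setting[key] not in allowed:
            problems.append(f"unsupported value {output_setting[key]!r} for '{key}'")
    return problems


# Example fragment with two deliberate mistakes (layout and sensorSetup).
example = json.loads("""
{
    "displaySensorViews": true,
    "sensorViewsLayout": "grid",
    "sensorViewDisplayMode": "rotational",
    "sensorFovDisplayMode": "cumulative",
    "sensorSetup": 10,
    "sensorViewStartTimeSec": 2.0,
    "sensorViewDurationSec": 1.0,
    "sensorViewGapSec": 0.1
}
""")
problems = check_output_setting(example)
```

Here `problems` holds one entry for the unsupported layout and one for the unsupported `sensorSetup`. The `outputSetting` block from the sample configuration yields an empty list because `displaySensorViews` is `false`.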

## Import modules

In [None]:
# **Copyright (c) 2009-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.**

import os
import cv2
from mdx.mtmc.config import VizRtlsConfig
from mdx.mtmc.utils.io_utils import load_json_from_file
from mdx.mtmc.utils.viz_rtls_utils import VizConfig, GlobalObjects, read_rtls_log, \
  read_protobuf_data_with_amr_data, read_videos, plot_combined_image

## Load config file

In [None]:
viz_config_path = "resources/viz_rtls_config.json"
assert os.path.exists(viz_config_path), "Viz config not found"
viz_config = VizRtlsConfig(**load_json_from_file(viz_config_path))
viz_config = VizConfig(viz_config)

## Load input files

In [None]:
image_map = cv2.imread(viz_config.rtls_config.input.mapPath)
rtls_log, frame_ids = read_rtls_log(viz_config.rtls_config.input.rtlsLogPath)
map_video_name_to_capture = None
content_by_frame_id = None
data_dict = dict()
data_dict, amr_log, amr_frame_ids = read_protobuf_data_with_amr_data(viz_config.rtls_config.input.rawDataPath)
if viz_config.rtls_config.output.displaySensorViews:
    map_video_name_to_capture = read_videos(viz_config.rtls_config.input.videoDirPath)
fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
video = cv2.VideoWriter(viz_config.rtls_config.output.outputVideoPath, fourcc, viz_config.fps,
                        viz_config.output_video_size)
global_people = GlobalObjects(viz_config)
global_amrs = GlobalObjects(viz_config)

## Create output video

In [None]:
import time
start_time = time.time()

if viz_config.rtls_config.output.displaySensorViews:
    if viz_config.sensor_views_layout == "radial":
        print("Creating RTLS visualization with sensor views surrounding the map...")
    elif viz_config.sensor_views_layout == "split":
        print("Creating RTLS visualization with sensor views on the left...")
else:
    print("Creating RTLS visualization without sensor views...")
    
num_frames_to_skip = int(viz_config.rtls_config.output.skippedBeginningTimeSec * viz_config.fps)
num_output_frames = int(viz_config.rtls_config.output.outputVideoDurationSec * viz_config.fps)
num_total_frames = num_frames_to_skip + num_output_frames
for frame_id in range(num_total_frames):
    image_output = plot_combined_image(viz_config, image_map, map_video_name_to_capture, global_people, global_amrs,
                                       data_dict, rtls_log, frame_ids, amr_log, amr_frame_ids, frame_id)

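    # Write every frame after the first num_frames_to_skip frames (">=" so that no extra frame is dropped)
    if frame_id >= num_frames_to_skip: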
        if viz_config.output_video_size[0] != image_output.shape[1]:
            print(f"ERROR: The frame width {image_output.shape[1]} is different from "
                  f"output video width {viz_config.output_video_size[0]}. "
                  f"The assumption is that all videos share the same size.")
            exit(1)
        if viz_config.output_video_size[1] != image_output.shape[0]:
            print(f"ERROR: The frame height {image_output.shape[0]} is different from "
                  f"output video height {viz_config.output_video_size[1]}. "
                  f"The assumption is that all videos share the same size.")
            exit(1)
        video.write(image_output)

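    # Report progress at each 5% milestone. Integer arithmetic is used because a
    # floating-point percentage (e.g. 55.00000000000001) can miss the modulo check.
    if (frame_id * 100) % (num_total_frames * 5) == 0:
        processed_percentage = (frame_id / num_total_frames) * 100
        end_time = time.time()
        print("Time used: {0:.2f} sec. Finished {1:.1f}%.".format(end_time - start_time, processed_percentage))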

video.release() 

print("Done")
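
The frame accounting used by the loop above can be sketched in isolation. This is an illustration only: the helper `frame_plan` is hypothetical, the frame rate of 30 is an assumed example value (the notebook takes it from `viz_config.fps`), and the sketch assumes that the first `num_frames_to_skip` frames are processed but left out of the output video.

```python
def frame_plan(fps, skipped_beginning_sec, output_duration_sec):
    """Return (total frames to process, range of frame ids written to the video)."""
    num_frames_to_skip = int(skipped_beginning_sec * fps)
    num_output_frames = int(output_duration_sec * fps)
    num_total_frames = num_frames_to_skip + num_output_frames
    # Skipped frames are still processed so that the tracking state is built up,
    # but only the frames after them are written.
    written = range(num_frames_to_skip, num_total_frames)
    return num_total_frames, written


# Example: skip the first 2 seconds, then write 60 seconds at an assumed 30 fps.
total, written = frame_plan(fps=30.0, skipped_beginning_sec=2.0, output_duration_sec=60.0)
```

With these example values, 1860 frames are processed and the 1800 frames with ids 60 through 1859 are written.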

## [Optional] Re-encode video and play in the notebook

In [None]:
# [Optional] Re-encode video to reduce the file size
import subprocess
import shutil

# Check if FFmpeg is already installed
if shutil.which("ffmpeg") is None:
    # Install FFmpeg using apt
    subprocess.run(["apt-get", "update"])
    subprocess.run(["apt-get", "install", "-y", "ffmpeg"])
else:
    print("FFmpeg is already installed.")

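# Derive the re-encoded file name from the original output path
input_video_path = viz_config.rtls_config.output.outputVideoPath
output_video_path = os.path.splitext(input_video_path)[0] + "_reencoded.mp4"
# Pass the arguments as a list so that paths containing spaces are handled correctly
subprocess.run(["ffmpeg", "-y", "-i", input_video_path, "-vcodec", "libx265", "-crf", "28", output_video_path])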

In [None]:
from ipywidgets import Video
Video.from_file(output_video_path, width=480)