
# Hot3D Data Provider Tutorial

To use sequences from the HOT3D dataset, you will need to use the Hot3dDataProvider object.

This notebook explains how to use the various "DataProvider" objects. It covers:
- Section 0: DataProvider initialization
- Section 1: Device calibration and Image data
- Section 2: Pose data
  - Section 2.a: Device/Headset pose data
  - Section 2.b: Hand pose data
    - Section 2.b.a: Hand pose data and MESH hands
  - Section 2.c: Object pose data
- Section 3: Bounding boxes
  - Section 3.a: Object bounding boxes (amodal bounding boxes)
  - Section 3.b: Hand bounding boxes (amodal bounding boxes)
- Section 4: Eye Gaze data (only for Aria data)
- Section 5: Camera reprojection (reprojecting hand vertices onto raw fisheye images)

The Hot3dDataProvider API is organized as follows:
```
|- device_data_provider        -> provides device calibration and image data
|- device_pose_data_provider   -> provides device pose data
|- mano_hand_data_provider     -> provides hand pose data (MANO representation)
|- umetrack_hand_data_provider -> provides hand pose data (UmeTrack representation)
|- object_pose_data_provider   -> provides object pose data
|- object_library              -> provides information about the HOT3D 3D objects/assets
|- hand_box2d_data_provider    -> provides hands bbox information
|- object_box2d_data_provider  -> provides objects bbox information
```

## Notes
- All Device/Headset, Hand, and Object pose data are expressed in world coordinates (meters)

In this tutorial you will learn that:
- Device data, such as an image data stream, is indexed with a stream_id
- Headset cameras use camera rig coordinates relative to the DEVICE pose (world_camera_stream_id = world_device @ device_camera_stream_id)
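
The composition rule in the last bullet can be tried out without the dataset. The sketch below is a dependency-free stand-in: it represents each rigid transform as a 4x4 homogeneous matrix (nested lists) and uses made-up poses. The helper names `compose` and `translation` are hypothetical; in the notebook itself the poses are SE3 objects and `@` plays the role of `compose`.

```python
from typing import List

Matrix = List[List[float]]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Return the 4x4 matrix product a @ b (b is applied first, then a)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def translation(t: Matrix) -> List[float]:
    """Extract the translation part of a 4x4 homogeneous transform."""
    return [t[0][3], t[1][3], t[2][3]]


# Made-up device pose: rotated 90 degrees about Z, located at (1, 2, 0) m in the world
T_world_device = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
# Made-up extrinsics: camera sits 0.1 m along the device X axis, no rotation
T_device_camera = [
    [1.0, 0.0, 0.0, 0.1],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# world_camera = world_device @ device_camera
T_world_camera = compose(T_world_device, T_device_camera)
print(translation(T_world_camera))
```

Because the device is rotated, the camera offset along the device X axis ends up along the world Y axis, which is why the order of the two factors matters.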

In [1]:
#
# Section 0: DataProvider initialization
#
# Take home message:
# - Device data, such as an image data stream, is indexed with a stream_id
# - Intrinsic and extrinsic calibration, relative to the device coordinates, is available for each CAMERA/stream_id
#
# Data Requirements:
# - a sequence
# - the object library
# Optional:
# - To use the Mano hand you need to have the LEFT/RIGHT *.pkl hand models (available)

import os
from dataset_api import Hot3dDataProvider
from data_loaders.loader_object_library import load_object_library
from data_loaders.mano_layer import MANOHandModel

home = os.path.expanduser("~")
hot3d_dataset_path = home + "/Dataset/ljh/dataset/hot3d/full-hot3d"
sequence_path = os.path.join(hot3d_dataset_path, "P0003_5766eae8")
object_library_path = home +"/Dataset/ljh/dataset/hot3d/assets"
mano_hand_model_path = home + "/dir/mano_v1_2/models"

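if not os.path.exists(sequence_path) or not os.path.exists(object_library_path):
    # Raise an explicit exception (a bare `raise` outside an except block fails with an unrelated RuntimeError)
    raise FileNotFoundError(
        "Invalid input sequence or library path. "
        "Please update the paths to valid values for your system."
    )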
#
# Init the object library
#
object_library = load_object_library(object_library_folderpath=object_library_path)

#
# Init the HANDs model
# If None, the UmeTrack HANDs model will be used
#
mano_hand_model = None
if mano_hand_model_path is not None:
    mano_hand_model = MANOHandModel(mano_hand_model_path)

#
# Initialize hot3d data provider
#
hot3d_data_provider = Hot3dDataProvider(
    sequence_folder=sequence_path,
    object_library=object_library,
    mano_hand_model=mano_hand_model,
)
print(f"data_provider statistics: {hot3d_data_provider.get_data_statistics()}")



num_betas=10, shapedirs.shape=(778, 3, 10), self.SHAPE_SPACE_DIM=300
num_betas=10, shapedirs.shape=(778, 3, 10), self.SHAPE_SPACE_DIM=300
MPS Data Paths
MPS SLAM Data Paths
--closedLoopTrajectory: /home/jeongho/Dataset/ljh/dataset/hot3d/full-hot3d/P0003_5766eae8/mps/slam/closed_loop_trajectory.csv
--openLoopTrajectory: /home/jeongho/Dataset/ljh/dataset/hot3d/full-hot3d/P0003_5766eae8/mps/slam/open_loop_trajectory.csv
--semidensePoints: /home/jeongho/Dataset/ljh/dataset/hot3d/full-hot3d/P0003_5766eae8/mps/slam/semidense_points.csv.gz
--semidenseObservations: /home/jeongho/Dataset/ljh/dataset/hot3d/full-hot3d/P0003_5766eae8/mps/slam/semidense_observations.csv.gz
--onlineCalibration: /home/jeongho/Dataset/ljh/dataset/hot3d/full-hot3d/P0003_5766eae8/mps/slam/online_calibration.jsonl
--summary: /home/jeongho/Dataset/ljh/dataset/hot3d/full-hot3d/P0003_5766eae8/mps/slam/summary.json
MPS Eyegaze Data Paths
--generalEyegaze: /home/jeongho/Dataset/ljh/dataset/hot3d/full-hot3d/P0003_5766eae8/mps/

[MultiRecordFileReader][DEBUG]: Opened file '/home/jeongho/Dataset/ljh/dataset/hot3d/full-hot3d/P0003_5766eae8/recording.vrs' and assigned to reader #0
[VrsDataProvider][INFO]: streamId 214-1/camera-rgb activated
[VrsDataProvider][INFO]: Timecode stream found: 285-2
[VrsDataProvider][INFO]: Fail to activate streamId 286-1
[VrsDataProvider][INFO]: streamId 1201-1/camera-slam-left activated
[VrsDataProvider][INFO]: streamId 1201-2/camera-slam-right activated
[VrsDataProvider][INFO]: streamId 1202-1/imu-right activated
[VrsDataProvider][INFO]: streamId 1202-2/imu-left activated

In [2]:
# Utility functions
# Used for interactive display in the following sections
#
#
import rerun as rr
import numpy as np

from projectaria_tools.core.sophus import SE3
from projectaria_tools.utils.rerun_helpers import ToTransform3D


def log_image(
    image: np.ndarray,
    label: str,
    static=False
) -> None:
    rr.log(label, rr.Image(image), static=static)


def log_pose(
    pose: SE3,
    label: str,
    static=False
) -> None:
    rr.log(label, ToTransform3D(pose, False), static=static)

# A gentle introduction to the "GT Data" Provider API

Take home message:
- All "GT data providers" use a similar API to query data at a given timestamp and/or StreamID.
- If the requested timestamp does not exist, the closest one can be retrieved along with its delta time (dt).

All of the following "GT data providers" are accessible from Hot3dDataProvider and share a similar API.
```
|- device_pose_data_provider   -> device/headset pose data
|- mano_hand_data_provider     -> hand pose data (MANO hand model)
|- umetrack_hand_data_provider -> hand pose data (UmeTrack hand model)
|- object_pose_data_provider   -> object pose data
|- hand_box2d_data_provider    -> hand information such as amodal BBox and visibility ratio
|- object_box2d_data_provider  -> object information such as amodal BBox and visibility ratio
```

We briefly introduce the retrieval concept here, and then showcase how to use each data_provider.
GT data providers enable retrieving information at a given TIMESTAMP:
- If the timestamp is not an exact match, the closest sample is returned.
- The delta time (dt) between the found sample and the query timestamp is returned,
  so you know whether you have a perfect match to the GT time sample or retrieved a nearby sample.

Note: Some GT data providers are STREAM_ID specific and retrieve information for a given image stream.
```
data_with_dt = device_pose_provider.get_pose_at_timestamp(
   timestamp_ns: int,                           -> Timestamp (in nanoseconds)
   stream_id: StreamID,                         -> If used, specifies for which VRS image stream you query the data
   time_query_options: TimeQueryOptions,        -> Retrieval configuration, i.e. TimeQueryOptions.CLOSEST
   time_domain: TimeDomain,                     -> TimeDomain (always use TimeDomain.TIME_CODE)
   acceptable_time_delta: Optional[int] = None, -> Threshold to reject a delta dt that would be too large (using 0 or None is recommended)
)
```

Here is how most of the interface will be used in the following sections:
```
data_with_dt = X_provider.get_X_at_timestamp(
    timestamp_ns=timestamp_ns,
    time_query_options=TimeQueryOptions.CLOSEST,
    time_domain=TimeDomain.TIME_CODE)
```
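
To make the "closest sample plus dt" idea concrete, here is a minimal sketch of such a lookup over a sorted list of timestamps. It is not the HOT3D implementation: the function name `closest_sample`, the sign convention of `dt_ns` (sample time minus query time), and the way the threshold is applied are all assumptions made for illustration.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def closest_sample(
    sorted_timestamps_ns: List[int],
    query_ns: int,
    acceptable_time_delta: Optional[int] = None,
) -> Optional[Tuple[int, int]]:
    """Return (index, dt_ns) of the sample closest to query_ns, or None.

    dt_ns is sample_time - query_time (assumed convention); 0 means an exact match.
    In this sketch, None disables the threshold.
    """
    if not sorted_timestamps_ns:
        return None
    pos = bisect_left(sorted_timestamps_ns, query_ns)
    # The closest sample is one of the two neighbors of the insertion point
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(sorted_timestamps_ns)]
    best = min(candidates, key=lambda i: abs(sorted_timestamps_ns[i] - query_ns))
    dt_ns = sorted_timestamps_ns[best] - query_ns
    if acceptable_time_delta is not None and abs(dt_ns) > acceptable_time_delta:
        return None  # the closest sample is too far away to be accepted
    return best, dt_ns


gt_timestamps_ns = [1_000, 2_000, 3_000]  # made-up GT sample times
print(closest_sample(gt_timestamps_ns, 2_000))  # exact match, dt is 0
print(closest_sample(gt_timestamps_ns, 2_400))  # nearest neighbor, non-zero dt
print(closest_sample(gt_timestamps_ns, 2_400, acceptable_time_delta=100))  # rejected
```

Checking the returned dt is what lets downstream code decide whether a ground-truth sample is close enough to the queried image timestamp to be used.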

In [9]:
# Section 1: Device calibration and Image data

from tqdm import tqdm

#
# Retrieve some statistics about the "IMAGE" VRS recording
#

# Getting the device data provider (alias)
device_data_provider = hot3d_data_provider.device_data_provider

# Retrieve the list of image stream supported by this sequence
# It will return the RGB and SLAM Left/Right image streams
image_stream_ids = device_data_provider.get_image_stream_ids()
# Retrieve a list of timestamps for the sequence (in nanoseconds)
timestamps = device_data_provider.get_sequence_timestamps()

print(f"Sequence: {os.path.basename(os.path.normpath(sequence_path))}")
print(f"Device type is {hot3d_data_provider.get_device_type()}")
print(f"Image stream ids: {image_stream_ids}")
print(f"Number of timestamp for this sequence: {len(timestamps)}")
print(
    f"Duration of the sequence: {(timestamps[-1] - timestamps[0]) / 1e9} (seconds)"
)  # Timestamps are in nanoseconds


# Init a rerun context to visualize the sequence file images
rr.init("Device images")
rec = rr.memory_recording()

# How to iterate over timestamps using a slice to show one timestamp every 200
timestamps_slice = slice(None, None, 200)
# Loop over the timestamps of the sequence and visualize corresponding data
for timestamp_ns in tqdm(timestamps[timestamps_slice]):

    for stream_id in image_stream_ids:
        # Retrieve the image stream label as string
        image_stream_label = device_data_provider.get_image_stream_label(stream_id)
        # Retrieve the image data for a given timestamp
        image_data = device_data_provider.get_image(timestamp_ns, stream_id)
        # Visualize the image data (it's a numpy array)
        log_image(label=f"img/{image_stream_label}", image=image_data)


#
# Retrieve Camera calibration (intrinsics and extrinsics) for a given stream_id
#
for stream_id in image_stream_ids:
    # Retrieve the camera calibration (intrinsics and extrinsics) for a given stream_id
    [extrinsics, intrinsics] = device_data_provider.get_camera_calibration(stream_id)
    print(intrinsics)
    # We will show in next section how to visualize the position of the camera in the world frame

# Showing the rerun window
# rr.notebook_show()

[2025-05-06T22:08:56Z WARN  re_sdk::log_sink] Dropping data in MemorySink


Sequence: P0003_5766eae8
Device type is Headset.Aria
Image stream ids: [214-1, 1201-1, 1201-2]
Number of timestamp for this sequence: 3687
Duration of the sequence: 122.866632014 (seconds)


100%|██████████| 19/19 [00:00<00:00, 50.53it/s]

CameraCalibration(label: camera-rgb, model name: Fisheye624, principal point: [707.397, 707.154], focal length: [609.195, 609.195], projection params: [609.195, 707.397, 707.154, 0.389402, -0.387784, -0.125112, 1.55909, -1.98291, 0.720775, -0.000469152, 0.000364496, 0.000965818, -0.000290893, -0.00111536, -0.000101303], image size (w,h): [1408, 1408], T_Device_Camera:(translation:[-0.00414696, -0.0122964, -0.00474385], quaternion(x,y,z,w):[0.327852, 0.0378477, 0.0316434, 0.94344]), serialNumber:0450577b730410834401100000000000)
CameraCalibration(label: camera-slam-left, model name: Fisheye624, principal point: [318.229, 236.975], focal length: [241.414, 241.414], projection params: [241.414, 318.229, 236.975, -0.0283439, 0.104745, -0.0759371, 0.0158346, -0.000111902, -0.000199961, 0.000929693, -0.00184501, -0.000276403, -7.76735e-05, 0.00234218, -6.94112e-05], image size (w,h): [640, 480], T_Device_Camera:(translation:[1.56125e-17, -1.38778e-17, 3.46945e-18], quaternion(x,y,z,w):[0, 0,




In [None]:
#
# Section 2.a: Device/Headset pose data
#
# Take home message:
# - the device_pose_provider enables you to retrieve the Headset pose as (T_world_device)
# - moving from the device to a given camera can be done by using calibration data and combining SE3 poses
#   - such as T_world_camera = T_world_device @ T_device_camera
#

from tqdm import tqdm

from projectaria_tools.core.sensor_data import TimeDomain, TimeQueryOptions

# Alias over the HEADSET/Device pose data provider
device_pose_provider = hot3d_data_provider.device_pose_data_provider

# Init a rerun context to visualize the device trajectory
rr.init("Device/Headset trajectory")
rec = rr.memory_recording()

pose_translations = []
# Retrieve the position of the device in the world frame at a given timestamp
for timestamp_ns in tqdm(timestamps):

    rr.set_time_nanos("synchronization_time", int(timestamp_ns))
    rr.set_time_sequence("timestamp", timestamp_ns)

    headset_pose3d_with_dt = None
    if device_pose_provider is None:
        continue
    headset_pose3d_with_dt = device_pose_provider.get_pose_at_timestamp(
        timestamp_ns=timestamp_ns,
        time_query_options=TimeQueryOptions.CLOSEST,
        time_domain=TimeDomain.TIME_CODE,
    )

    if headset_pose3d_with_dt is None:
        continue

    headset_pose3d = headset_pose3d_with_dt.pose3d
    T_world_device = headset_pose3d.T_world_device
    
    log_pose(pose=T_world_device, label="world/device")
    pose_translations.append(T_world_device.translation()[0])
    # This is the pose of the device, to move to a given camera, you need to apply the device_camera transformation
    #for stream_id in image_stream_ids:
       # # Retrieve the camera calibration (intrinsics and extrinsics) for a given stream_id
       # [T_device_camera, intrinsics] = device_data_provider.get_camera_calibration(stream_id)
       # # The pose of the given camera at this timestamp is (world_camera = world_device @ device_camera):
       # T_world_camera = headset_pose3d.T_world_device @ T_device_camera
       # camera_stream_label = device_data_provider.get_image_stream_label(stream_id)
       # print(f"Image stream label: {camera_stream_label} -> world_camera translation: {T_world_camera.translation()[0]}")

rr.log("world/device_trajectory", rr.LineStrips3D([pose_translations]), static=True)

# Showing the rerun window
rr.notebook_show()


In [None]:
#
# Section 2.b: Hand pose data
#
# Take home message:
# - Hands are labelled as LEFT or RIGHT hands
# - "Hand poses" represent the WRIST pose, to which a MESH or LANDMARKS can be attached (see next section)
#

# Alias over the HAND pose data provider
hand_data_provider = (
    hot3d_data_provider.mano_hand_data_provider
    if hot3d_data_provider.mano_hand_data_provider is not None
    else hot3d_data_provider.umetrack_hand_data_provider
)

# Init a rerun context to visualize the hand pose data trajectory
rr.init("Hand pose trajectory (wrist)")
rec = rr.memory_recording()

# Accumulate HAND poses translations as list, to show a LINE strip HAND trajectory
left_hand_pose_translations = []
right_hand_pose_translations = []

# Retrieve the hand poses in the world frame at each timestamp
for timestamp_ns in tqdm(timestamps):

    rr.set_time_nanos("synchronization_time", int(timestamp_ns))
    rr.set_time_sequence("timestamp", timestamp_ns)

    hand_poses_with_dt = None
    if hand_data_provider is None:
        continue
    
    hand_poses_with_dt = hand_data_provider.get_pose_at_timestamp(
        timestamp_ns=timestamp_ns,
        time_query_options=TimeQueryOptions.CLOSEST,
        time_domain=TimeDomain.TIME_CODE,
    )

    if hand_poses_with_dt is None:
        continue
        
    hand_pose_collection = hand_poses_with_dt.pose3d_collection

    for hand_pose_data in hand_pose_collection.poses.values():
        # Retrieve the handedness of the hand (i.e Left or Right)
        handedness_label = hand_pose_data.handedness_label()

        T_world_wrist = hand_pose_data.wrist_pose
        log_pose(pose=T_world_wrist, label=f"world/hand/{handedness_label}")

        # Accumulate HAND poses translations as list, to show a LINE strip HAND trajectory
        if hand_pose_data.is_left_hand():
            left_hand_pose_translations.append(T_world_wrist.translation()[0])
        elif hand_pose_data.is_right_hand():
            right_hand_pose_translations.append(T_world_wrist.translation()[0])

rr.log("world/left_hand", rr.LineStrips3D([left_hand_pose_translations]), static=True)
rr.log("world/right_hand", rr.LineStrips3D([right_hand_pose_translations]), static=True)

# Showing the rerun window
rr.notebook_show()

In [None]:
#
# Section 2.b.a: Hand pose data and MESH hands
#
# Take home message:
# - Hands are labelled as LEFT or RIGHT hands
# - Hands can be retrieved as:
#   - Landmarks (displayed as lines)
#   - Vertices
#   - Mesh (using vertices, face indices and normals)
#

from data_loaders.hand_common import LANDMARK_CONNECTIVITY


# Alias over the HAND pose data provider
hand_data_provider = (
    hot3d_data_provider.mano_hand_data_provider
    if hot3d_data_provider.mano_hand_data_provider is not None
    else hot3d_data_provider.umetrack_hand_data_provider
)

# Init a rerun context
rr.init("Hand pose LANDMARK/MESH")
rec = rr.memory_recording()

left_hand_pose_translations = []
right_hand_pose_translations = []

# Limit to the first 300 timestamps
for timestamp_ns in tqdm(timestamps[:300]):

    rr.set_time_nanos("synchronization_time", int(timestamp_ns))
    rr.set_time_sequence("timestamp", timestamp_ns)

    hand_poses_with_dt = None
    if hand_data_provider is None:
        continue
        
    hand_poses_with_dt = hand_data_provider.get_pose_at_timestamp(
        timestamp_ns=timestamp_ns,
        time_query_options=TimeQueryOptions.CLOSEST,
        time_domain=TimeDomain.TIME_CODE,
    )

    if hand_poses_with_dt is None:
        continue
    
    hand_pose_collection = hand_poses_with_dt.pose3d_collection

    for hand_pose_data in hand_pose_collection.poses.values():
        # Retrieve the handedness of the hand (i.e Left or Right)
        handedness_label = hand_pose_data.handedness_label()

        # Skeleton/Joints landmark representation (for LEFT hand)
        if hand_pose_data.is_left_hand():
            hand_landmarks = hand_data_provider.get_hand_landmarks(
                hand_pose_data
            )
            # Convert landmarks to connected lines for display
            # (i.e. build one list of points per chain in LANDMARK_CONNECTIVITY)
            points = [
                [hand_landmarks[it].numpy().tolist() for it in connectivity]
                for connectivity in LANDMARK_CONNECTIVITY
            ]
            rr.log(
                f"world/{handedness_label}/joints",
                rr.LineStrips3D(points, radii=0.002),
            )

        #
        # Plot RIGHT hand as a Triangular Mesh representation
        #
        if hand_pose_data.is_right_hand():
            hand_mesh_vertices = hand_data_provider.get_hand_mesh_vertices(hand_pose_data)
            hand_triangles, hand_vertex_normals = hand_data_provider.get_hand_mesh_faces_and_normals(hand_pose_data)
            
            rr.log(
                f"world/{handedness_label}/mesh_faces",
                rr.Mesh3D(
                    vertex_positions=hand_mesh_vertices,
                    vertex_normals=hand_vertex_normals,
                    triangle_indices=hand_triangles,
                ),
            )

# Showing the rerun window
rr.notebook_show()
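
The landmark-to-line conversion used in the cell above can be illustrated on toy data. In this sketch `TOY_CONNECTIVITY`, the landmark coordinates, and the helper `landmarks_to_line_strips` are all hypothetical; the real `LANDMARK_CONNECTIVITY` from `data_loaders.hand_common` may have a different number of chains and indices. The only assumption carried over is that each connectivity entry is an ordered chain of landmark indices.

```python
from typing import List, Sequence

Point = List[float]

# Hypothetical connectivity: each entry is an ordered chain of landmark indices
TOY_CONNECTIVITY = [[0, 1, 2], [0, 3, 4]]

# Made-up 3D landmark positions (meters), indexed by landmark id
toy_landmarks = [
    [0.00, 0.00, 0.0],
    [0.01, 0.02, 0.0],
    [0.02, 0.04, 0.0],
    [-0.01, 0.02, 0.0],
    [-0.02, 0.04, 0.0],
]


def landmarks_to_line_strips(
    landmarks: Sequence[Sequence[float]],
    connectivity: Sequence[Sequence[int]],
) -> List[List[Point]]:
    """Build one polyline (list of 3D points) per connectivity chain."""
    return [[list(landmarks[i]) for i in chain] for chain in connectivity]


strips = landmarks_to_line_strips(toy_landmarks, TOY_CONNECTIVITY)
print(len(strips), [len(strip) for strip in strips])
```

Each resulting polyline is a plain list of points, which is the shape passed to `rr.LineStrips3D` in the cell above.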

In [None]:
#
# Section 2.c: Object pose data
#
# Take home message:
# - Each object is associated with a unique identifier (uid)
# - The object library lets you retrieve the 3D asset linked to this UID (a glb file)
#

from data_loaders.loader_object_library import ObjectLibrary

# Alias over the Object pose data provider
object_pose_data_provider = hot3d_data_provider.object_pose_data_provider

# Keep track of which 3D assets have been logged,
# so that each one is loaded only when required for Rerun
object_cache_status = {}

# Init a rerun context
rr.init("Object pose")
rec = rr.memory_recording()

# Limit to some timestamps
for timestamp_ns in tqdm(timestamps[100:300]):

    rr.set_time_nanos("synchronization_time", int(timestamp_ns))
    rr.set_time_sequence("timestamp", timestamp_ns)

    object_poses_with_dt = (
        object_pose_data_provider.get_pose_at_timestamp(
            timestamp_ns=timestamp_ns,
            time_query_options=TimeQueryOptions.CLOSEST,
            time_domain=TimeDomain.TIME_CODE,
        )
    )
    if object_poses_with_dt is None:
        continue

    objects_pose3d_collection = object_poses_with_dt.pose3d_collection

    # Keep a mapping to know what object has been seen, and which one has not
    object_uids = object_pose_data_provider.object_uids_with_poses
    logging_status = {x: False for x in object_uids}

    for (
        object_uid,
        object_pose3d,
    ) in objects_pose3d_collection.poses.items():

        object_name = object_library.object_id_to_name_dict[object_uid]
        object_name = object_name + "_" + str(object_uid)
        object_cad_asset_filepath = ObjectLibrary.get_cad_asset_path(
            object_library_folderpath=object_library.asset_folder_name,
            object_id=object_uid,
        )

        log_pose(pose=object_pose3d.T_world_object, label=f"world/objects/{object_name}")
        
        # Mark the object as seen (lets us know which objects have been logged)
        # i.e. an object that is not logged has not been seen, and its Rerun entity will be cleared
        logging_status[object_uid] = True

        # Link the corresponding 3D object to the pose
        if object_uid not in object_cache_status.keys():
            object_cache_status[object_uid] = True
            rr.log(
                f"world/objects/{object_name}",
                rr.Asset3D(
                    path=object_cad_asset_filepath,
                ),
            )

    # Rerun specifics (if an entity disappears, its last state is still shown)
    # To compensate for that, we clear the entity of any object that is not visible
    for object_uid, displayed in logging_status.items():
        if not displayed:
            object_name = object_library.object_id_to_name_dict[object_uid]
            object_name = object_name + "_" + str(object_uid)
            rr.log(
                f"world/objects/{object_name}",
                rr.Clear.recursive(),
            )
            if object_uid in object_cache_status.keys():
                del object_cache_status[object_uid]  # We will log the mesh again

# Showing the rerun window
rr.notebook_show()
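
The per-frame bookkeeping in the cell above (log an asset once, clear entities that are no longer visible, re-log the asset when the object comes back) can be isolated from Rerun. The sketch below is an illustration only: `plan_frame` is a hypothetical helper and the uids are made up. It returns what would be logged and cleared instead of calling `rr.log`.

```python
from typing import Dict, Iterable, List, Set, Tuple


def plan_frame(
    all_uids: Iterable[str],
    visible_uids: Set[str],
    asset_logged: Dict[str, bool],
) -> Tuple[List[str], List[str]]:
    """Return (uids whose asset must be logged, uids whose entity must be cleared).

    asset_logged is updated in place, mirroring object_cache_status above.
    """
    to_log_asset: List[str] = []
    to_clear: List[str] = []
    for uid in all_uids:
        if uid in visible_uids:
            if uid not in asset_logged:
                asset_logged[uid] = True
                to_log_asset.append(uid)
        else:
            to_clear.append(uid)
            # Forget the asset so it is logged again when the object reappears
            asset_logged.pop(uid, None)
    return to_log_asset, to_clear


uids = ["1", "2", "3"]  # made-up object uids
cache: Dict[str, bool] = {}
print(plan_frame(uids, {"1", "2"}, cache))  # first frame: assets for "1" and "2" are logged
print(plan_frame(uids, {"1"}, cache))       # "2" disappears and is cleared
print(plan_frame(uids, {"1", "2"}, cache))  # "2" reappears and its asset is logged again
```

Dropping a cleared uid from the cache is the step that forces the mesh to be logged again, since a recursive clear removes the asset along with the pose.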

In [None]:
#
# Section 3.a: Object bounding boxes
#
#
from tqdm import tqdm

from projectaria_tools.core.sensor_data import TimeDomain, TimeQueryOptions
from projectaria_tools.core.stream_id import StreamId

import matplotlib.pyplot as plt  # Used to display consistently colored bounding box contours

# Alias over the Object box2d data provider and Device data provider (to get image data)
object_box2d_data_provider = hot3d_data_provider.object_box2d_data_provider
device_data_provider = hot3d_data_provider.device_data_provider

# Retrieve a distinct color mapping for object bounding box
# by using a colormap (i.e associate a object_uid to a specific color)
object_uids = []  # list of available object_uid, used to map them to [0, 1, 2, ...] indices
object_box2d_colors = None
if object_box2d_data_provider is not None:
    object_uids = list(object_box2d_data_provider.object_uids)
    color_map = plt.get_cmap("viridis")
    object_box2d_colors = color_map(
        np.linspace(0, 1, len(object_uids))
    )
else:
    print("This section expects valid bounding box data")


# Init a rerun context
rr.init("Object bounding boxes and visibility ratio")
rec = rr.memory_recording()

# Use SLAM-LEFT image (exists for both Aria and Quest files)
stream_id = StreamId("1201-1")
if stream_id not in object_box2d_data_provider.stream_ids:
    print(f"The object_box2d_data_provider does not have data for this StreamId: {stream_id}")


# Limit to some timestamps
for timestamp_ns in tqdm(timestamps[100:200]):

    rr.set_time_nanos("synchronization_time", int(timestamp_ns))
    rr.set_time_sequence("timestamp", timestamp_ns)

    # Retrieve data for this timestamp and specific stream_id
    box2d_collection_with_dt = (
        object_box2d_data_provider.get_bbox_at_timestamp(
            stream_id=stream_id,
            timestamp_ns=timestamp_ns,
            time_query_options=TimeQueryOptions.CLOSEST,
            time_domain=TimeDomain.TIME_CODE,
        )
    )
    if (
        box2d_collection_with_dt is None
        or box2d_collection_with_dt.box2d_collection is None
    ):
        continue
    
    # We have valid data, returned as a collection
    # i.e for each object_uid, we retrieve its BBOX and visibility
    object_uids_at_query_timestamp = (
        box2d_collection_with_dt.box2d_collection.object_uid_list
    )

    for object_uid in object_uids_at_query_timestamp:
        object_name = object_library.object_id_to_name_dict[object_uid]
        axis_aligned_box2d = box2d_collection_with_dt.box2d_collection.box2ds[object_uid]
        bbox = axis_aligned_box2d.box2d
        visibility_ratio = axis_aligned_box2d.visibility_ratio
        if bbox is None:
            continue

        rr.log(
            f"{stream_id}_raw/bbox/{object_name}",
            rr.Boxes2D(
                mins=[bbox.left, bbox.top],
                sizes=[bbox.width, bbox.height],
                colors=object_box2d_colors[object_uids.index(object_uid)],
            ),
        )
        rr.log(f"visibility_ratio/{object_name}", rr.Scalar(visibility_ratio))
        
        # Log the corresponding image
        image_stream_label = device_data_provider.get_image_stream_label(stream_id)
        # Retrieve the image data for a given timestamp
        image_data = device_data_provider.get_image(timestamp_ns, stream_id)
        # Visualize the image data (it's a numpy array)
        log_image(label=f"{stream_id}_raw", image=image_data)

# Showing the rerun window
rr.notebook_show()
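
As a small follow-up to the cell above, the sketch below shows how a box read as left/top/width/height maps to the `mins`/`sizes` pair, and how the visibility ratio could be used to keep only sufficiently visible objects. `ToyBox2d`, `to_mins_and_sizes`, `visible_boxes`, the threshold, and the data are hypothetical stand-ins, not part of the HOT3D API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyBox2d:
    """Hypothetical stand-in exposing the fields read in the cell above."""

    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top


def to_mins_and_sizes(box: ToyBox2d) -> Tuple[List[float], List[float]]:
    """Convert a box to the (mins, sizes) pair used when logging 2D boxes."""
    return [box.left, box.top], [box.width, box.height]


def visible_boxes(
    boxes: Dict[str, Tuple[Optional[ToyBox2d], float]], min_visibility: float
) -> Dict[str, ToyBox2d]:
    """Keep boxes that exist and whose visibility ratio reaches the threshold."""
    return {
        uid: box
        for uid, (box, ratio) in boxes.items()
        if box is not None and ratio >= min_visibility
    }


toy = {
    "mug": (ToyBox2d(10.0, 20.0, 110.0, 70.0), 0.9),
    "bowl": (ToyBox2d(300.0, 200.0, 340.0, 260.0), 0.2),
    "spoon": (None, 0.0),  # no box available at this timestamp
}
kept = visible_boxes(toy, min_visibility=0.5)
print(list(kept), to_mins_and_sizes(kept["mug"]))
```

Since the boxes are amodal, a low visibility ratio flags an object whose box is mostly occluded or out of view, which is why a filter like this can be useful.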



In [7]:
#
# Section 3.b: Hand bounding boxes
#
#
from projectaria_tools.core.stream_id import StreamId
from projectaria_tools.core.sensor_data import TimeDomain, TimeQueryOptions
from data_loaders.loader_hand_poses import LEFT_HAND_INDEX, RIGHT_HAND_INDEX
import matplotlib.pyplot as plt # Used to display consistent colored Bounding Boxes contours

# Alias over the Hand box2d data provider and Device data provider (to get image data)
hand_box2d_data_provider = hot3d_data_provider.hand_box2d_data_provider
device_data_provider = hot3d_data_provider.device_data_provider

# Retrieve a distinct color mapping for hand bounding box
# by using a colormap (i.e associate a hand_uid to a specific color)
hand_uids = [LEFT_HAND_INDEX, RIGHT_HAND_INDEX]
hand_box2d_colors = None
if hand_box2d_data_provider is not None:
    color_map = plt.get_cmap("viridis")
    hand_box2d_colors = color_map(
        np.linspace(0, 1, len(hand_uids))
    )
else:
    print("This section expects valid bounding box data")


# Init a rerun context
rr.init("Hand bounding boxes and visibility ratio")
rec = rr.memory_recording()

# Use SLAM-LEFT image (exists for both Aria and Quest files)
stream_id = StreamId("1201-1")
if stream_id not in hand_box2d_data_provider.stream_ids:
    print(f"The hand_box2d_data_provider does not have data for this StreamId: {stream_id}")


# Limit to some timestamps
for timestamp_ns in tqdm(timestamps[100:200]):

    rr.set_time_nanos("synchronization_time", int(timestamp_ns))
    rr.set_time_sequence("timestamp", timestamp_ns)

    # Retrieve data for this timestamp and specific stream_id
    box2d_collection_with_dt = (
        hand_box2d_data_provider.get_bbox_at_timestamp(
            stream_id=stream_id,
            timestamp_ns=timestamp_ns,
            time_query_options=TimeQueryOptions.CLOSEST,
            time_domain=TimeDomain.TIME_CODE,
        )
    )
    
    if (
        box2d_collection_with_dt is None
        or box2d_collection_with_dt.box2d_collection is None
    ):
        continue

    
    # We have valid data, returned as a collection
    # i.e for each hand_uid, we retrieve its BBOX and visibility
    for hand_uid in hand_uids:
        hand_name = "left" if hand_uid == LEFT_HAND_INDEX else "right"
        axis_aligned_box2d = box2d_collection_with_dt.box2d_collection.box2ds[hand_uid]
        bbox = axis_aligned_box2d.box2d
        visibility_ratio = axis_aligned_box2d.visibility_ratio
        if bbox is None:
            continue

        rr.log(
            f"{stream_id}_raw/bbox/{hand_name}",
            rr.Boxes2D(
                mins=[bbox.left, bbox.top],
                sizes=[bbox.width, bbox.height],
                colors=hand_box2d_colors[hand_uids.index(hand_uid)],
            ),
        )
        rr.log(f"visibility_ratio/{hand_name}", rr.Scalar(visibility_ratio))
        
        # Log the corresponding image
        image_stream_label = device_data_provider.get_image_stream_label(stream_id)
        # Retrieve the image data for a given timestamp
        image_data = device_data_provider.get_image(timestamp_ns, stream_id)
        # Visualize the image data (it's a numpy array)
        log_image(label=f"{stream_id}_raw", image=image_data)

# Showing the rerun window
rr.notebook_show()

In [None]:

#
# Section 3.a: Object bounding boxes
#
#
from tqdm import tqdm

from projectaria_tools.core.stream_id import StreamId

import matplotlib.pyplot as plt # Used to display consistent colored Bounding Boxes contours

# Alias over the Object box2d data provider and Device data provider (to get image data)
object_box2d_data_provider = hot3d_data_provider.object_box2d_data_provider
device_data_provider = hot3d_data_provider.device_data_provider

# Retrieve a distinct color mapping for object bounding box
# by using a colormap (i.e associate a object_uid to a specific color)
object_uids = list(object_box2d_data_provider.object_uids) # list of available object_uid used to map them to [0, 1, 2, ...] indices
object_box2d_colors = None
if object_box2d_data_provider is not None:
    color_map = plt.get_cmap("viridis")
    object_box2d_colors = color_map(
        np.linspace(0, 1, len(object_uids))
    )
else:
    print("This section expect to have valid bounding box data")


# Init a rerun context
rr.init("Object bounding boxed and visibility ratio")
rec = rr.memory_recording()

# Use SLAM-LEFT image (exists for both Aria and Quest files)
stream_id = StreamId("1201-1")
if stream_id not in object_box2d_data_provider.stream_ids:
    print(f"The object_box2d_data_provider does not have data for this StreamId: {stream_id}")


# Limit to the some timestamps
for timestamp_ns in tqdm(timestamps[100:200]):

    rr.set_time_nanos("synchronization_time", int(timestamp_ns))
    rr.set_time_sequence("timestamp", timestamp_ns)

    # Retrieve data for this timestamp and specific stream_id
    box2d_collection_with_dt = (
        object_box2d_data_provider.get_bbox_at_timestamp(
            stream_id=stream_id,
            timestamp_ns=timestamp_ns,
            time_query_options=TimeQueryOptions.CLOSEST,
            time_domain=TimeDomain.TIME_CODE,
        )
    )
    if box2d_collection_with_dt is None:
        continue
    if (
        box2d_collection_with_dt is None
        and box2d_collection_with_dt.box2d_collection or None
    ):
        continue
    
    # We have valid data, returned as a collection
    # i.e for each object_uid, we retrieve its BBOX and visibility
    object_uids_at_query_timestamp = (
        box2d_collection_with_dt.box2d_collection.object_uid_list
    )

    for object_uid in object_uids_at_query_timestamp:
        object_name = object_library.object_id_to_name_dict[object_uid]
        axis_aligned_box2d = box2d_collection_with_dt.box2d_collection.box2ds[object_uid]
        bbox = axis_aligned_box2d.box2d
        visibility_ratio = axis_aligned_box2d.visibility_ratio
        if bbox is None:
            continue

        rr.log(
            f"{stream_id}_raw/bbox/{object_name}",
            rr.Boxes2D(
                mins=[bbox.left, bbox.top],
                sizes=[bbox.width, bbox.height],
                colors=object_box2d_colors[object_uids.index(object_uid)],
            ),
        )
        rr.log(f"visibility_ratio/{object_name}", rr.Scalar(visibility_ratio))
        
    # Log the corresponding image once per timestamp (not once per object)
    # Retrieve the image data for a given timestamp
    image_data = device_data_provider.get_image(timestamp_ns, stream_id)
    # Visualize the image data (it's a numpy array)
    log_image(label=f"{stream_id}_raw", image=image_data)

# Showing the rerun window
rr.notebook_show()
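
The loop above logs one `visibility_ratio` value per object and per timestamp. A minimal sketch of how such values could be summarized afterwards is shown below. It assumes the `(object_uid, visibility_ratio)` pairs have been collected into a plain list; the function name `summarize_visibility`, the object names, and the sample values are hypothetical and are not part of the HOT3D API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for the (object_uid, visibility_ratio) pairs gathered
# inside the timestamp loop; the values used below are made up for illustration.
Sample = Tuple[str, float]


def summarize_visibility(
    samples: Iterable[Sample], threshold: float = 0.9
) -> Dict[str, Dict[str, float]]:
    """Per object: frame count, mean visibility, share of frames above threshold."""
    ratios: Dict[str, List[float]] = defaultdict(list)
    for object_uid, visibility_ratio in samples:
        ratios[object_uid].append(visibility_ratio)

    summary: Dict[str, Dict[str, float]] = {}
    for object_uid, values in ratios.items():
        summary[object_uid] = {
            "frames": len(values),
            "mean_visibility": sum(values) / len(values),
            "share_above_threshold": sum(v > threshold for v in values) / len(values),
        }
    return summary


samples = [("mug", 1.0), ("mug", 0.5), ("bowl", 0.95), ("bowl", 0.95)]
summary = summarize_visibility(samples)
print(summary["mug"])
```

The 0.9 default mirrors the visibility threshold used by the visualizer class in the next cell; any other cut-off can be passed through `threshold`.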

In [None]:
# Copyright (c) Meta Platforms, Inc. and affiliates.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from typing import Dict, List, Optional

import matplotlib.pyplot as plt

import numpy as np
import rerun as rr  # @manual

from data_loaders.hand_common import LANDMARK_CONNECTIVITY
from data_loaders.headsets import Headset
from data_loaders.loader_hand_poses import HandType
from data_loaders.loader_object_library import ObjectLibrary
from projectaria_tools.core.stream_id import StreamId  # @manual

try:
    from dataset_api import Hot3dDataProvider  # @manual
except ImportError:
    from hot3d.dataset_api import Hot3dDataProvider

from data_loaders.HandDataProviderBase import (  # @manual
    HandDataProviderBase,
    HandPose3dCollectionWithDt,
)
from data_loaders.ObjectBox2dDataProvider import (  # @manual
    ObjectBox2dCollectionWithDt,
    ObjectBox2dProvider,
)

from data_loaders.ObjectPose3dProvider import (  # @manual
    ObjectPose3dCollectionWithDt,
    ObjectPose3dProvider,
)

from projectaria_tools.core.calibration import (
    CameraCalibration,
    DeviceCalibration,
    FISHEYE624,
    LINEAR,
)
from projectaria_tools.core.mps import get_eyegaze_point_at_depth  # @manual

from projectaria_tools.core.mps.utils import (  # @manual
    filter_points_from_confidence,
    filter_points_from_count,
)

from projectaria_tools.core.sensor_data import TimeDomain, TimeQueryOptions  # @manual
from projectaria_tools.core.sophus import SE3  # @manual
from projectaria_tools.utils.rerun_helpers import (  # @manual
    AriaGlassesOutline,
    ToTransform3D,
)


class Hot3DVisualizer:
    def __init__(
        self,
        hot3d_data_provider,
        hand_type: HandType = HandType.Umetrack,
        **kargs
    ) -> None:
        for key, value in kargs.items():
            setattr(self, key, value)
        self._hot3d_data_provider = hot3d_data_provider
        # Device calibration and Image stream data
        self._device_data_provider = hot3d_data_provider.device_data_provider
        # Data provider at time T (for device & objects & hand poses)
        self._device_pose_provider = hot3d_data_provider.device_pose_data_provider
        self._hand_data_provider = (
            hot3d_data_provider.umetrack_hand_data_provider
            if hand_type == HandType.Umetrack
            else hot3d_data_provider.mano_hand_data_provider
        )
        if hand_type is HandType.Umetrack:
            print("Hot3DVisualizer is using UMETRACK hand model")
        elif hand_type is HandType.Mano:
            print("Hot3DVisualizer is using MANO hand model")
        self._object_pose_data_provider = hot3d_data_provider.object_pose_data_provider
        self._object_box2d_data_provider = (
            hot3d_data_provider.object_box2d_data_provider
        )
        # Object library
        self._object_library = hot3d_data_provider.object_library

        # If required
        # Retrieve a distinct color mapping for object bounding box to show consistent color across stream_ids
        # - Use a Colormap for visualizing object bounding box
        self._object_box2d_colors = None
        if self._object_box2d_data_provider is not None:
            color_map = plt.get_cmap("viridis")
            self._object_box2d_colors = color_map(
                np.linspace(0, 1, len(self._object_box2d_data_provider.object_uids))
            )

        # Keep track of which 3D assets have been loaded/unloaded so we load them only when needed
        self._object_cache_status = {}

        # To be parametrized later
        self._jpeg_quality = 75

    def log_static_assets(
        self,
        image_stream_ids: List[StreamId],
    ) -> None:
        """
        Log all static assets (aka Timeless assets)
        - assets that are immutable (but can still move if attached to a 3D Pose)
        """

        # Configure the world coordinate system to ease navigation
        if self._hot3d_data_provider.get_device_type() is Headset.Aria:
            rr.log("world", rr.ViewCoordinates.RIGHT_HAND_Z_UP, static=True)
        else:
            rr.log("world", rr.ViewCoordinates.RIGHT_HAND_Y_UP, static=True)

        if self._hot3d_data_provider.get_device_type() is Headset.Aria:
            ## for Aria devices, we use online calibration which is a dynamic asset
            pass
        elif self._hot3d_data_provider.get_device_type() is Headset.Quest3:
            # For each of the stream ids we want to use, export the camera calibration (intrinsics and extrinsics)
            for stream_id in image_stream_ids:
                #
                # Plot the camera configuration
                [extrinsics, intrinsics] = (
                    self._device_data_provider.get_camera_calibration(stream_id)
                )
                Hot3DVisualizer.log_pose(
                    f"world/device/{stream_id}", extrinsics, static=True
                )
                Hot3DVisualizer.log_calibration(f"world/device/{stream_id}", intrinsics)

        # Deal with Aria specifics
        # - Glasses outline
        # - Point cloud
        if self._hot3d_data_provider.get_device_type() is Headset.Aria:
            Hot3DVisualizer.log_aria_glasses(
                "world/device/glasses_outline",
                self._device_data_provider.get_device_calibration(),
            )

            # Point cloud (downsampled for visualization)
            point_cloud = self._device_data_provider.get_point_cloud()
            if point_cloud:
                # Filter out low confidence points
                threshold_invdep = 5e-4
                threshold_dep = 5e-4
                point_cloud = filter_points_from_confidence(
                    point_cloud, threshold_invdep, threshold_dep
                )
                # Down sample points
                points_data_down_sampled = filter_points_from_count(
                    point_cloud, 500_000
                )
                # Retrieve point position
                point_positions = [it.position_world for it in points_data_down_sampled]
                POINT_COLOR = [200, 200, 200]
                rr.log(
                    "world/points",
                    rr.Points3D(point_positions, colors=POINT_COLOR, radii=0.002),
                    static=True,
                )

    def log_dynamic_assets(
        self,
        stream_ids: List[StreamId],
        timestamp_ns: int,
        idx: int,
    ) -> "Hot3DVisualizer":
        """
        Log dynamic assets:
        i.e. assets that are moving, such as:
        - 3D assets
        - Device pose
        - Hands
        - Object poses
        - Image related specifics assets
        - images (stream_ids)
        - Object Bounding boxes
        - Aria Eye Gaze
        """

        #
        ## Retrieve and log data that is not stream_id dependent (pure 3D data)
        #
        acceptable_time_delta = 0

        # Frame index of the current call, used to measure gaze dwell durations in frames
        self.time = idx

        if self._hot3d_data_provider.get_device_type() is Headset.Aria:
            # For each of the stream ids we want to use, export the camera calibration (intrinsics and extrinsics)
            for stream_id in stream_ids:
                #
                # Plot the camera configuration
                [extrinsics, intrinsics] = (
                    self._device_data_provider.get_online_camera_calibration(
                        stream_id=stream_id, timestamp_ns=timestamp_ns
                    )
                )
                Hot3DVisualizer.log_pose(f"world/device/{stream_id}", extrinsics)
                Hot3DVisualizer.log_calibration(f"world/device/{stream_id}", intrinsics)

        elif self._hot3d_data_provider.get_device_type() is Headset.Quest3:
            ## for Quest devices we will use factory calibration which is a static asset
            pass

        headset_pose3d_with_dt = None
        if self._device_pose_provider is not None:
            headset_pose3d_with_dt = self._device_pose_provider.get_pose_at_timestamp(
                timestamp_ns=timestamp_ns,
                time_query_options=TimeQueryOptions.CLOSEST,
                time_domain=TimeDomain.TIME_CODE,
                acceptable_time_delta=acceptable_time_delta,
            )

        hand_poses_with_dt = None
        if self._hand_data_provider is not None:
            hand_poses_with_dt = self._hand_data_provider.get_pose_at_timestamp(
                timestamp_ns=timestamp_ns,
                time_query_options=TimeQueryOptions.CLOSEST,
                time_domain=TimeDomain.TIME_CODE,
                acceptable_time_delta=acceptable_time_delta,
            )

        object_poses_with_dt = None
        if self._object_pose_data_provider is not None:
            object_poses_with_dt = (
                self._object_pose_data_provider.get_pose_at_timestamp(
                    timestamp_ns=timestamp_ns,
                    time_query_options=TimeQueryOptions.CLOSEST,
                    time_domain=TimeDomain.TIME_CODE,
                    acceptable_time_delta=acceptable_time_delta,
                )
            )

        aria_eye_gaze_data = (
            self._device_data_provider.get_eye_gaze(timestamp_ns)
            if self._hot3d_data_provider.get_device_type() is Headset.Aria
            else None
        )

        #
        ## Log Device pose
        #
        if headset_pose3d_with_dt is not None:
            headset_pose3d = headset_pose3d_with_dt.pose3d
            Hot3DVisualizer.log_pose(
                "world/device", headset_pose3d.T_world_device, static=False
            )

        #
        ## Log Hand poses
        #
        Hot3DVisualizer.log_hands(
            "world/hands",  # /{handedness_label}/... will be added as necessary
            self._hand_data_provider,
            hand_poses_with_dt,
            show_hand_mesh=True,
            show_hand_vertices=False,
            show_hand_landmarks=False,
        )

        #
        ## Log Object poses
        #
        Hot3DVisualizer.log_object_poses(
            "world/objects",
            object_poses_with_dt,
            self._object_pose_data_provider,
            self._object_library,
            self._object_cache_status,
        )

        #
        ## Log stream dependent data
        #
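        for stream_id in stream_ids:
            # Reset per stream so the bounding box step below never reads a stale
            # or undefined value (e.g. on Quest3, where no eye gaze is available)
            eye_gaze_reprojection_data = None
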
            #
            ## Eye Gaze image reprojection
            #
            if self._hot3d_data_provider.get_device_type() is Headset.Aria:
                # We are showing EyeGaze reprojection only on the RGB image stream
                if stream_id != StreamId("214-1"):
                    continue

                # Reproject EyeGaze for raw and pinhole images
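                # FISHEYE624 (raw) is processed last so that eye_gaze_reprojection_data
                # ends up in raw image coordinates, matching the object bounding boxes
                camera_configurations = [LINEAR, FISHEYE624]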
                for camera_model in camera_configurations:
                    eye_gaze_reprojection_data = (
                        self._device_data_provider.get_eye_gaze_in_camera(
                            stream_id, timestamp_ns, camera_model=camera_model
                        )
                    )
                    if (
                        eye_gaze_reprojection_data is None
                        or not eye_gaze_reprojection_data.any()
                    ):
                        continue

                    label = (
                        f"world/device/{stream_id}/eye-gaze_projection"
                        if camera_model == LINEAR
                        else f"world/device/{stream_id}_raw/eye-gaze_projection_raw"
                    )
                    rr.log(
                        label,
                        rr.Points2D(eye_gaze_reprojection_data, radii=20),
                        # TODO consistent color and size depending of camera resolution
                    )

            # Undistorted image (required if you want see reprojected 3D mesh on the images)
            image_data = self._device_data_provider.get_undistorted_image(
                timestamp_ns, stream_id
            )
            if image_data is not None:
                rr.log(
                    f"world/device/{stream_id}",
                    rr.Image(image_data).compress(jpeg_quality=self._jpeg_quality),
                )

            # Raw device images (required for object bounding box visualization)
            image_data = self._device_data_provider.get_image(timestamp_ns, stream_id)
            if image_data is not None:
                rr.log(
                    f"world/device/{stream_id}_raw",
                    rr.Image(image_data).compress(jpeg_quality=self._jpeg_quality),
                )

            if (
                self._object_box2d_data_provider is not None
                and stream_id in self._object_box2d_data_provider.stream_ids
            ):
                box2d_collection_with_dt = (
                    self._object_box2d_data_provider.get_bbox_at_timestamp(
                        stream_id=stream_id,
                        timestamp_ns=timestamp_ns,
                        time_query_options=TimeQueryOptions.CLOSEST,
                        time_domain=TimeDomain.TIME_CODE,
                    )
                )
                
                # Without a gaze point we cannot test gaze/bounding box overlap
                if (
                    eye_gaze_reprojection_data is None
                    or not eye_gaze_reprojection_data.any()
                ):
                    print(f"No eye gaze data at timestamp {timestamp_ns}")
                    continue
                
                self.log_object_bounding_boxes(
                    stream_id,
                    box2d_collection_with_dt,
                    self._object_box2d_data_provider,
                    self._object_library,
                    self._object_box2d_colors,
                    eye_gaze_reprojection_data
                )

        #
        ## Log device dependent remaining 3D data
        #

        # Log 3D eye gaze
        if aria_eye_gaze_data is not None:
            T_device_CPF = self._device_data_provider.get_device_calibration().get_transform_device_cpf()
            # Compute eye_gaze vector at depth_m (30cm for a proxy 3D vector to display)
            gaze_vector_in_cpf = get_eyegaze_point_at_depth(
                aria_eye_gaze_data.yaw, aria_eye_gaze_data.pitch, depth_m=0.3
            )
            # Draw EyeGaze vector
            rr.log(
                "world/device/eye-gaze",
                rr.Arrows3D(
                    origins=[T_device_CPF @ [0, 0, 0]],
                    vectors=[
                        T_device_CPF @ gaze_vector_in_cpf - T_device_CPF @ [0, 0, 0]
                    ],
                ),
            )
        return self
            

    @staticmethod
    def log_aria_glasses(
        label: str,
        device_calibration: DeviceCalibration,
        use_cad_calibration: bool = True,
    ) -> None:
        ## Plot Project Aria Glasses outline (as lines)
        aria_glasses_point_outline = AriaGlassesOutline(
            device_calibration, use_cad_calibration
        )
        rr.log(label, rr.LineStrips3D([aria_glasses_point_outline]), static=True)

    @staticmethod
    def log_calibration(
        label: str,
        camera_calibration: CameraCalibration,
    ) -> None:
        rr.log(
            label,
            rr.Pinhole(
                resolution=[
                    camera_calibration.get_image_size()[0],
                    camera_calibration.get_image_size()[1],
                ],
                focal_length=float(camera_calibration.get_focal_lengths()[0]),
            ),
            static=True,
        )

    @staticmethod
    def log_pose(label: str, pose: SE3, static=False) -> None:
        rr.log(label, ToTransform3D(pose, False), static=static)

    @staticmethod
    def log_hands(
        label: str,
        hand_data_provider: HandDataProviderBase,
        hand_poses_with_dt: HandPose3dCollectionWithDt,
        show_hand_mesh=True,
        show_hand_vertices=True,
        show_hand_landmarks=True,
    ):
        logged_right_hand_data = False
        logged_left_hand_data = False
        if hand_poses_with_dt is None:
            return

        hand_pose_collection = hand_poses_with_dt.pose3d_collection

        for hand_pose_data in hand_pose_collection.poses.values():
            if hand_pose_data.is_left_hand():
                logged_left_hand_data = True
            elif hand_pose_data.is_right_hand():
                logged_right_hand_data = True

            handedness_label = hand_pose_data.handedness_label()

            # Skeleton/Joints landmark representation
            if show_hand_landmarks:
                hand_landmarks = hand_data_provider.get_hand_landmarks(hand_pose_data)
                # convert landmarks to connected lines for display
                # (i.e retrieve points along the HAND LANDMARK_CONNECTIVITY as a list)
                points = [
                    connections
                    for connectivity in LANDMARK_CONNECTIVITY
                    for connections in [
                        [hand_landmarks[it].numpy().tolist() for it in connectivity]
                    ]
                ]
                rr.log(
                    f"{label}/{handedness_label}/joints",
                    rr.LineStrips3D(points, radii=0.002),
                )

            # Update mesh vertices if required
            hand_mesh_vertices = (
                hand_data_provider.get_hand_mesh_vertices(hand_pose_data)
                if show_hand_vertices or show_hand_mesh
                else None
            )

            # Vertices representation
            if show_hand_vertices:
                rr.log(
                    f"{label}/{handedness_label}/mesh",
                    rr.Points3D(hand_mesh_vertices),
                )

            # Triangular Mesh representation
            if show_hand_mesh:
                [hand_triangles, hand_vertex_normals] = (
                    hand_data_provider.get_hand_mesh_faces_and_normals(hand_pose_data)
                )
                rr.log(
                    f"{label}/{handedness_label}/mesh_faces",
                    rr.Mesh3D(
                        vertex_positions=hand_mesh_vertices,
                        vertex_normals=hand_vertex_normals,
                        triangle_indices=hand_triangles,  # TODO: we could avoid sending this list if we want to save memory
                    ),
                )
        # If some hand data has not been logged, do not show it in the visualizer
        if logged_left_hand_data is False:
            rr.log(f"{label}/left", rr.Clear.recursive())
        if logged_right_hand_data is False:
            rr.log(f"{label}/right", rr.Clear.recursive())

    @staticmethod
    def log_object_poses(
        label: str,  # "world/objects",
        object_poses_with_dt: ObjectPose3dCollectionWithDt,
        object_pose_data_provider: ObjectPose3dProvider,
        object_library: ObjectLibrary,
        object_cache_status: Dict[int, bool],
    ):
        if object_poses_with_dt is None:
            return

        objects_pose3d_collection = object_poses_with_dt.pose3d_collection

        # Keep a mapping to know which objects have been seen and which have not
        object_uids = object_pose_data_provider.object_uids_with_poses
        logging_status = {x: False for x in object_uids}

        for (
            object_uid,
            object_pose3d,
        ) in objects_pose3d_collection.poses.items():
            object_name = object_library.object_id_to_name_dict[object_uid]
            object_name = object_name + "_" + str(object_uid)
            object_cad_asset_filepath = ObjectLibrary.get_cad_asset_path(
                object_library_folderpath=object_library.asset_folder_name,
                object_id=object_uid,
            )

            Hot3DVisualizer.log_pose(
                f"world/objects/{object_name}",
                object_pose3d.T_world_object,
                False,
            )
            # Mark object has been seen
            logging_status[object_uid] = True

            # Link the corresponding 3D object
            if object_uid not in object_cache_status.keys():
                object_cache_status[object_uid] = True
                rr.log(
                    f"world/objects/{object_name}",
                    rr.Asset3D(
                        path=object_cad_asset_filepath,
                    ),
                )

        # If some objects are not visible, we clear the entity (last known mesh and pose will not be displayed)
        for object_uid, displayed in logging_status.items():
            if not displayed:
                object_name = object_library.object_id_to_name_dict[object_uid]
                object_name = object_name + "_" + str(object_uid)
                rr.log(
                    f"world/objects/{object_name}",
                    rr.Clear.recursive(),
                )
                if object_uid in object_cache_status.keys():
                    del object_cache_status[object_uid]  # We will log the mesh again
                    
    @staticmethod
    def is_point_in_box(point, box) -> bool:
        """
        Return True if the 2D point lies inside the bounding box (borders included).

        point: (x, y) pixel coordinates
        box: bounding box object; it must expose the attributes
            box.left, box.top, box.width and box.height
        """
        x, y = point
        x_min = box.left
        x_max = box.left + box.width
        y_min = box.top
        y_max = box.top + box.height

        return bool((x_min <= x <= x_max) and (y_min <= y <= y_max))


    def log_object_bounding_boxes(
        self,
        stream_id: StreamId,
        box2d_collection_with_dt: Optional[ObjectBox2dCollectionWithDt],
        object_box2d_data_provider: ObjectBox2dProvider,
        object_library: ObjectLibrary,
        bbox_colors: np.ndarray,
        eye_gaze_reprojection_data
    ):
        """
        Object bounding boxes (valid for native raw images).
        - We assume that the image corresponding to the stream_id has been logged beforehand as 'world/device/{stream_id}_raw/'
        """

        # Keep a mapping to know which objects have been seen and which have not
        object_uids = list(object_box2d_data_provider.object_uids)
        logging_status = {x: False for x in object_uids}

        if (
            box2d_collection_with_dt is None
            or box2d_collection_with_dt.box2d_collection is None
        ):
            # No bounding box are retrieved, we clear all the bounding box visualization existing so far
            rr.log(f"world/device/{stream_id}_raw/bbox", rr.Clear.recursive())
            return

        object_uids_at_query_timestamp = (
            box2d_collection_with_dt.box2d_collection.object_uid_list
        )

        for object_uid in object_uids_at_query_timestamp:
            object_name = object_library.object_id_to_name_dict[object_uid]
            axis_aligned_box2d = box2d_collection_with_dt.box2d_collection.box2ds[
                object_uid
            ]
            box = axis_aligned_box2d.box2d
            if box is None:
                continue
            
            visibility_ratio = axis_aligned_box2d.visibility_ratio

            logging_status[object_uid] = True
            # Count the frames in which the object is more than 90% visible
            if visibility_ratio > 0.9:
                self.vis_num[object_uid] += 1
            
            inside = Hot3DVisualizer.is_point_in_box(eye_gaze_reprojection_data, box)

            if inside and not self.in_box[object_uid]:
                self.enter_time[object_uid] = self.time
                self.in_box[object_uid] = True

            elif not inside and self.in_box[object_uid]:
                duration = self.time - self.enter_time[object_uid]
                self.durations[object_uid].append(duration)
                self.in_box[object_uid] = False
                self.enter_time[object_uid] = None
            
            rr.log(
                f"world/device/{stream_id}_raw/bbox/{object_name}",
                rr.Boxes2D(
                    mins=[box.left, box.top],
                    sizes=[box.width, box.height],
                    colors=bbox_colors[object_uids.index(object_uid)],
                ),
            )
        # If some objects are not visible, we clear their bounding box visualization
        for key, value in logging_status.items():
            if not value:
                object_name = object_library.object_id_to_name_dict[key]
                rr.log(
                    f"world/device/{stream_id}_raw/bbox/{object_name}",
                    rr.Clear.flat(),
                )


rr.init("Hot3D")

stream_id = StreamId("214-1")
device_data_provider = hot3d_data_provider.device_data_provider
timestamps = device_data_provider.get_sequence_timestamps()
object_box2d_data_provider = hot3d_data_provider.object_box2d_data_provider
object_uids = list(object_box2d_data_provider.object_uids)

func_dict = {
    'vis_num':      {uid: 0      for uid in object_uids},
    'in_box':       {uid: False  for uid in object_uids},
    'enter_time':   {uid: None   for uid in object_uids},
    'durations':    {uid: []     for uid in object_uids},
}
gg = Hot3DVisualizer(hot3d_data_provider, HandType.Mano, **func_dict)

# Process the first 100 timestamps; idx is the frame index used for dwell durations
for idx, timestamp_ns in enumerate(timestamps[:100]):
    gg.log_dynamic_assets([stream_id], timestamp_ns, idx)

# Re-key the statistics by object name for readability
uid_to_name = gg._object_library.object_id_to_name_dict
vis_num = {uid_to_name.get(k, k): v for k, v in gg.vis_num.items()}
durations = {uid_to_name.get(k, k): v for k, v in gg.durations.items()}

print(vis_num)
print(durations)


[2025-05-07T12:32:08Z INFO  re_sdk_comms::server] Hosting a SDK server over TCP at 0.0.0.0:9876. Connect with the Rerun logging SDK.
Error: winit EventLoopError: os error at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.29.9/src/platform_impl/linux/mod.rs:776: neither WAYLAND_DISPLAY nor DISPLAY is set. -> os error at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.29.9/src/platform_impl/linux/mod.rs:776: neither WAYLAND_DISPLAY nor DISPLAY is set.
[2025-05-07T12:32:09Z WARN  re_sdk_comms::buffered_client] Failed to send message after 3 attempts: Failed to connect to Rerun server at 127.0.0.1:9876: Connection refused (os error 111)
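
The gaze dwell-time bookkeeping in `log_object_bounding_boxes` (the `in_box`, `enter_time`, and `durations` dictionaries) can be checked in isolation, without HOT3D data or a Rerun viewer. The following is a minimal sketch for a single object. `Box`, `point_in_box`, and `dwell_durations` are hypothetical stand-ins written for this example. It makes two assumptions that differ from the class above: a frame without gaze data counts as "outside the box", and a dwell that is still open at the end of the sequence is closed and recorded.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Box:
    """Hypothetical stand-in exposing the same fields as the HOT3D 2D box."""

    left: float
    top: float
    width: float
    height: float


def point_in_box(point: Point, box: Box) -> bool:
    """True if the point lies inside the box, borders included."""
    x, y = point
    return (
        box.left <= x <= box.left + box.width
        and box.top <= y <= box.top + box.height
    )


def dwell_durations(gaze_points: Sequence[Optional[Point]], box: Box) -> List[int]:
    """Length, in frames, of each run of consecutive frames with the gaze inside box."""
    durations: List[int] = []
    enter_idx: Optional[int] = None
    for idx, point in enumerate(gaze_points):
        # Assumption: a missing gaze sample is treated as "outside"
        inside = point is not None and point_in_box(point, box)
        if inside and enter_idx is None:
            enter_idx = idx  # gaze enters the box
        elif not inside and enter_idx is not None:
            durations.append(idx - enter_idx)  # gaze leaves the box
            enter_idx = None
    # Assumption: close a dwell that is still open when the sequence ends
    if enter_idx is not None:
        durations.append(len(gaze_points) - enter_idx)
    return durations


box = Box(left=10, top=10, width=20, height=20)
gaze = [(0, 0), (15, 15), (20, 25), None, (30, 30), (12, 11)]
print(dwell_durations(gaze, box))
```

Durations are expressed in frames, like `self.time` in the class above. Converting them to seconds would require the timestamps of the entering and leaving frames.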
