# Getting started 🚀

Welcome to the ICRA 2024 Cloth Competition! In this notebook we will load and explore the data.

The download cell further down fetches a part of the dataset (10 samples, ~1 GB) and unzips it.
It skips the download when the dataset directory already exists, so it is safe to re-run.

☁️ For the full dataset, see: https://cloud.ilabt.imec.be/index.php/s/Sy945rbamg8JMgR

## 1. Directory structure 📂

In [None]:
import os
from dataclasses import fields
from pathlib import Path

import cv2
import matplotlib.pyplot as plt
import numpy as np
import open3d as o3d
from airo_camera_toolkit.point_clouds.conversions import point_cloud_to_open3d
from airo_dataset_tools.data_parsers.pose import Pose
from cloth_tools.annotation.grasp_annotation import GraspAnnotation
from cloth_tools.dataset.format import load_competition_observation
from cloth_tools.dataset.download import download_and_extract_dataset
from cloth_tools.visualization.opencv import draw_pose


data_dir = Path("data")
dataset_dir = data_dir / "cloth_competition_dataset_0000_0-9"

In [None]:
os.path.exists(dataset_dir)

In the cell below we download a small part (10 episodes) of the dataset if no dataset was found.

In [None]:
if os.path.exists(dataset_dir):
    print(f"Found existing dataset in: {dataset_dir}")
else:
    print(f"Downloading dataset to directory: {data_dir}")
    dataset_zip_url = "https://cloud.ilabt.imec.be/index.php/s/BMg3c9g2i6oKJgN/download/cloth_competition_dataset_0000_0-9.zip" 
    dataset_dir = download_and_extract_dataset(data_dir, dataset_zip_url)
    dataset_dir = Path(dataset_dir)
    print(f"Downloaded and extracted dataset to directory: {dataset_dir}")

In [None]:
def emoji(directory: Path, file: str) -> str:
    """Return an emoji that indicates the type of a directory entry."""
    if os.path.isdir(os.path.join(directory, file)):
        return "📁"
    elif file.endswith(".jpg") or file.endswith(".png"):
        return "🖼️"
    elif file.endswith(".mp4"):
        return "🎥"
    return "📄"

print("First directories in the dataset:")
for f in sorted(os.listdir(dataset_dir))[:5]:
    print(emoji(dataset_dir, f), f)

One sample in the dataset corresponds to one episode. 
An episode consists of one attempt at unfolding a piece of hanging cloth by grasping it at a human-annotated point.

A sample directory contains the following files:

In [None]:
sample_dir = dataset_dir / "sample_000000"

for f in os.listdir(sample_dir):
    print(emoji(sample_dir, f), f)

One sample thus contains two observations (**start** and **result**), a **grasp** annotation, and a video of the episode.

* 🔎 The **start** observation is taken after the cloth has been grasped by its lowest point.
* 🔎 The **result** observation is taken after the attempt to unfold it.
* 👉 The **grasp** pose used to unfold the garment. Currently, all grasp poses are human-annotated.
* 🎥 The **video** of the entire episode.

Participants of the ICRA 2024 Cloth Competition will be asked to predict a good **grasp**, given the **start** observation.

The grasp will be evaluated based on the **result** observation (using the surface area of the cloth).

## 2. Start Observation 🔎

In this section we explore some of the data contained in the start observation.

In [None]:
observation_start_dir = sample_dir / "observation_start"

observation = load_competition_observation(observation_start_dir)

print("Overview of the fields in a Cloth Competition Observation:")
for field in fields(observation):
    field_name = field.name + ":"
    field_value = getattr(observation, field.name)
    if isinstance(field_value, np.ndarray):
        print(f" - {field_name:<34} np.ndarray {field_value.shape} {field_value.dtype}")
    else:
        print(f" - {field_name:<34} {field.type}")

### 2.1 Color images 🌈

The dataset was collected using a [Zed2i](https://store.stereolabs.com/en-eu/products/zed-2i) stereo RGB-D camera 📷📷.
For this reason, we provide two color images: one from the left camera and one from the right camera.

In [None]:
plt.figure(figsize=(20, 10))
plt.subplot(1, 2, 1)
plt.imshow(observation.image_left)
plt.title("Left image")
plt.subplot(1, 2, 2)
plt.imshow(observation.image_right)
plt.title("Right image")
plt.show()

### 2.2 Depth and confidence maps 🌌

The slight difference in perspective between the left and right view is used by the ZED SDK to estimate depth.
The [ZED SDK](https://www.stereolabs.com/docs/depth-sensing/depth-settings) has several depth modes; we use the NEURAL mode and enable FILL.
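
To build some intuition for how a difference in perspective turns into depth, here is a minimal sketch of the textbook relation for a rectified stereo pair, `depth = focal_length * baseline / disparity`. This is not how the ZED SDK computes depth internally, and the focal length, baseline, and disparities below are made-up illustrative values, not numbers read from the dataset. The helper `disparity_to_depth` is hypothetical.

```python
import numpy as np


def disparity_to_depth(disparity_px, focal_length_px: float, baseline_m: float) -> np.ndarray:
    """Convert disparity (pixels) to depth (meters) for a rectified stereo pair."""
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    # Zero (or negative) disparity carries no depth information: mark it as infinitely far.
    depth_m = np.full(disparity_px.shape, np.inf)
    valid = disparity_px > 0
    depth_m[valid] = focal_length_px * baseline_m / disparity_px[valid]
    return depth_m


# Illustrative values only (assumptions, not taken from the dataset).
focal_length_px = 1000.0
baseline_m = 0.12
disparities = np.array([0.0, 60.0, 120.0, 240.0])

depths = disparity_to_depth(disparities, focal_length_px, baseline_m)
print(depths)  # inf, 2.0, 1.0, 0.5 (meters)
```

Note how doubling the disparity halves the depth: nearby points shift more between the left and right image than distant ones.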

In [None]:
depth_map = observation.depth_map
confidence_map = observation.confidence_map

print(f"depth_map: {depth_map.shape} {depth_map.dtype}, range: {depth_map.min():.2f}-{depth_map.max():.2f}")
print(f"confidence_map: {confidence_map.shape} {confidence_map.dtype}, range: {confidence_map.min():.2f}-{confidence_map.max():.2f}")

plt.figure(figsize=(20, 10))
plt.subplot(1, 2, 1)
plt.imshow(observation.depth_map)
plt.title("Depth map")
plt.colorbar(fraction=0.025, pad=0.04)
plt.subplot(1, 2, 2)
plt.imshow(observation.confidence_map)
plt.title("Confidence map")
plt.colorbar(fraction=0.025, pad=0.04)
plt.show()

### 2.3 Stereo camera parameters 📷📷

In [None]:
with np.printoptions(precision=3, suppress=True):
    print("Resolution:", observation.camera_resolution)
    print("\nIntrinsics (camera image formation characteristics): \n", observation.camera_intrinsics)
    print("\nExtrinsics (pose of the left camera in the world frame): \n", observation.camera_pose_in_world)
    print("\nPose of right camera expressed in left camera frame: \n", observation.right_camera_pose_in_left_camera)
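
The intrinsics and extrinsics together describe how a 3D point in the world frame ends up at a pixel in the left image. The sketch below shows this under the assumption of an undistorted pinhole camera whose Z axis points forward. The function `project_world_point` and all numeric values are hypothetical and only serve as an illustration; with real data you would pass `observation.camera_pose_in_world` and `observation.camera_intrinsics` instead.

```python
import numpy as np


def project_world_point(point_in_world, camera_pose_in_world, intrinsics) -> np.ndarray:
    """Project a 3D point expressed in the world frame to pixel coordinates (u, v)."""
    point_h = np.append(np.asarray(point_in_world, dtype=np.float64), 1.0)
    # The extrinsics give the camera pose in the world; invert to map world -> camera.
    world_to_camera = np.linalg.inv(camera_pose_in_world)
    point_in_camera = (world_to_camera @ point_h)[:3]
    if point_in_camera[2] <= 0:
        raise ValueError("Point is behind the camera.")
    # Pinhole projection: apply the 3x3 intrinsics, then divide by the depth.
    u, v, w = intrinsics @ point_in_camera
    return np.array([u / w, v / w])


# Hypothetical intrinsics: focal length 1000 px, principal point at (640, 360).
intrinsics = np.array(
    [
        [1000.0, 0.0, 640.0],
        [0.0, 1000.0, 360.0],
        [0.0, 0.0, 1.0],
    ]
)
# Hypothetical extrinsics: camera axes aligned with the world, placed at z = -1 m.
camera_pose_in_world = np.identity(4)
camera_pose_in_world[:3, 3] = [0.0, 0.0, -1.0]

pixel = project_world_point([0.1, 0.0, 1.0], camera_pose_in_world, intrinsics)
print(pixel)  # [690. 360.]
```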

### 2.4 Colored point cloud ✨

In [None]:
point_cloud = observation.point_cloud

point_cloud.points.shape, point_cloud.points.dtype

In [None]:
point_cloud.colors.shape, point_cloud.colors.dtype

In [None]:
pcd = point_cloud_to_open3d(point_cloud)

world_frame = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.5, origin=[0, 0, 0])

o3d.visualization.draw_geometries([pcd.to_legacy(), world_frame])

## 3. Grasp Annotation 👉

In [None]:
grasp_dir = sample_dir / "grasp"

for f in os.listdir(grasp_dir):
    print(emoji(grasp_dir, f), f)


image_frontal_annotated = cv2.imread(str(grasp_dir / "frontal_image_grasp.jpg"))
image_top_annotated = cv2.imread(str(grasp_dir / "topdown_image_grasp.jpg"))

plt.figure(figsize=(20, 10))
plt.subplot(1, 2, 1)
plt.imshow(cv2.cvtColor(image_frontal_annotated, cv2.COLOR_BGR2RGB))
plt.title("Grasp annotation window - frontal image")
plt.subplot(1, 2, 2)
plt.imshow(cv2.cvtColor(image_top_annotated, cv2.COLOR_BGR2RGB))
plt.title("Grasp annotation window - (virtual) topdown image")
plt.show()


In [None]:
grasp_pose_file = grasp_dir / "grasp_pose.json"
grasp_annotation_file = grasp_dir / "grasp_annotation.json"

with open(grasp_pose_file, "r") as f:
    grasp_pose = Pose.model_validate_json(f.read()).as_homogeneous_matrix()


with open(grasp_annotation_file, "r") as f:
    grasp_annotation = GraspAnnotation.model_validate_json(f.read())

with np.printoptions(precision=3, suppress=True):
    print("Grasp pose:\n", grasp_pose)
    print("\nGrasp annotation:\n", grasp_annotation)
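
The grasp pose is a 4x4 homogeneous matrix. As a minimal sketch of how to read such a matrix: the upper-left 3x3 block is the orientation (its columns are the axes of the grasp frame expressed in the world frame) and the last column holds the position. The helpers `split_pose` and `is_valid_rotation` and the example pose are hypothetical; with real data you would pass `grasp_pose`.

```python
import numpy as np


def split_pose(pose: np.ndarray):
    """Split a 4x4 homogeneous matrix into its 3x3 rotation and 3D position."""
    return pose[:3, :3], pose[:3, 3]


def is_valid_rotation(rotation: np.ndarray, atol: float = 1e-6) -> bool:
    """Check that a 3x3 matrix is orthonormal with determinant +1."""
    orthonormal = np.allclose(rotation @ rotation.T, np.identity(3), atol=atol)
    right_handed = np.isclose(np.linalg.det(rotation), 1.0, atol=atol)
    return bool(orthonormal and right_handed)


# Hypothetical pose: rotated 90 degrees about the world Z axis, at (0.2, 0.0, 0.5) m.
angle = np.pi / 2
example_pose = np.identity(4)
example_pose[:3, :3] = [
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
]
example_pose[:3, 3] = [0.2, 0.0, 0.5]

rotation, position = split_pose(example_pose)
with np.printoptions(precision=3, suppress=True):
    print("Position:", position)
    print("X axis of the frame in world coordinates:", rotation[:, 0])
    print("Valid rotation:", is_valid_rotation(rotation))
```

A validity check like this can be a useful sanity test on predicted grasp poses before submitting them.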

## 4. Result Observation 🎉

In [None]:
observation_result_dir = sample_dir / "observation_result"

observation_result = load_competition_observation(observation_result_dir)

plt.figure(figsize=(10, 5))
plt.imshow(observation_result.image_left)
plt.title("Result: image of cloth after grasping and stretching")
plt.show()

ℹ️ The precise calculation of the evaluation metric will be released at a later date.

## 5. Coordinate frames 📐

In [None]:
X_W_C = observation.camera_pose_in_world
X_W_TCPL = observation.arm_left_tcp_pose_in_world
X_W_TCPR = observation.arm_right_tcp_pose_in_world
X_W_LB = observation.arm_left_pose_in_world
X_W_RB = observation.arm_right_pose_in_world
intrinsics = observation.camera_intrinsics

X_W_GRASP = grasp_pose

image_bgr = cv2.cvtColor(observation.image_left, cv2.COLOR_RGB2BGR)

draw_pose(image_bgr, np.identity(4), intrinsics, X_W_C, 0.25)
draw_pose(image_bgr, X_W_LB, intrinsics, X_W_C)
draw_pose(image_bgr, X_W_RB, intrinsics, X_W_C)
draw_pose(image_bgr, X_W_TCPL, intrinsics, X_W_C, 0.05)
draw_pose(image_bgr, X_W_TCPR, intrinsics, X_W_C, 0.05)
draw_pose(image_bgr, X_W_GRASP, intrinsics, X_W_C, 0.05)

image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(10, 5))
plt.imshow(image_rgb)
plt.title("Coordinate frames visualization")
plt.show()
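
The `X_W_C`-style names above read as "pose of frame C expressed in frame W". Under that convention, poses can be re-expressed in another frame by chaining matrices so that adjacent frame letters match. Below is a minimal sketch that expresses a grasp pose in the camera frame; the helper `pose_in_frame` and the numeric poses are hypothetical stand-ins for `observation.camera_pose_in_world` and `grasp_pose`.

```python
import numpy as np


def pose_in_frame(X_W_A: np.ndarray, X_W_B: np.ndarray) -> np.ndarray:
    """Given the poses of frames A and B in the world, return the pose of B in A."""
    X_A_W = np.linalg.inv(X_W_A)
    return X_A_W @ X_W_B  # X_A_B = X_A_W @ X_W_B


# Hypothetical camera pose: axes aligned with the world, located at (0, -1, 0.5) m.
X_W_C = np.identity(4)
X_W_C[:3, 3] = [0.0, -1.0, 0.5]

# Hypothetical grasp pose: axes aligned with the world, located at (0.2, 0, 0.5) m.
X_W_GRASP = np.identity(4)
X_W_GRASP[:3, 3] = [0.2, 0.0, 0.5]

X_C_GRASP = pose_in_frame(X_W_C, X_W_GRASP)
with np.printoptions(precision=3, suppress=True):
    print("Grasp position in camera frame:", X_C_GRASP[:3, 3])  # [0.2 1.  0. ]
```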


❔ If you have any questions, feel free to ask on the [GitHub Discussions page](https://github.com/Victorlouisdg/cloth-competition/discussions)!