# Waymo Open Dataset E2ED Visualization

This notebook visualizes every camera image in a Waymo Open Dataset End-to-End Driving (E2ED) frame, along with the ego vehicle intent, sequence number, and index number. It also projects past trajectory states onto the side and rear cameras (side_left, side_right, rear_left, rear, rear_right) and future states onto the front cameras (front, front_left, front_right).

**Dataset Details**:
- Visualizes all eight cameras: front (1), front_left (2), front_right (3), side_left (4), side_right (5), rear_left (6), rear (7), rear_right (8).
- Past states (t = -4s to 0s) are shown on the side and rear cameras; future states (t = 0s to 5s) on the front cameras.
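
The camera-to-trajectory rule above can be sketched in a few lines. This is a minimal illustration, not part of the Waymo API: the numeric IDs mirror the mapping used by `get_all_cameras` later in this notebook, and `CAMERA_NAMES`, `FRONT_CAMERAS`, and `trajectory_for_camera` are hypothetical names introduced only for this example.

```python
# Camera IDs as used by get_all_cameras later in this notebook.
CAMERA_NAMES = {
    1: "FRONT",
    2: "FRONT_LEFT",
    3: "FRONT_RIGHT",
    4: "SIDE_LEFT",
    5: "SIDE_RIGHT",
    6: "REAR_LEFT",
    7: "REAR",
    8: "REAR_RIGHT",
}

# Forward-facing cameras receive the future trajectory.
FRONT_CAMERAS = {"FRONT", "FRONT_LEFT", "FRONT_RIGHT"}


def trajectory_for_camera(camera_id: int) -> str:
    """Return 'future' for forward-facing cameras and 'past' for all others."""
    name = CAMERA_NAMES[camera_id]
    return "future" if name in FRONT_CAMERAS else "past"


for camera_id, name in CAMERA_NAMES.items():
    print(f"{name}: {trajectory_for_camera(camera_id)} states")
```

Keeping the rule in one function means the plotting loop only has to ask which trajectory a camera gets, rather than repeating the name lists.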

## Package Installation

In [2]:
import matplotlib.pyplot as plt
import tensorflow as tf
import os
import numpy as np
import cv2
from waymo_open_dataset import dataset_pb2 as open_dataset
from waymo_open_dataset.wdl_limited.camera.ops import py_camera_model_ops
from waymo_open_dataset.protos import end_to_end_driving_data_pb2 as wod_e2ed_pb2




## Loading the Data

In [4]:
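# GCS bucket that holds the E2ED camera dataset.
DATASET_FOLDER = 'gs://waymo_open_dataset_end_to_end_camera_v_1_0_0'

# NOTE: the three patterns below are identical, so each one matches every
# shard in the bucket regardless of split. Narrow them to select one split.
TRAIN_FILES = os.path.join(DATASET_FOLDER, '*.tfrecord-*')
VALIDATION_FILES = os.path.join(DATASET_FOLDER, '*.tfrecord-*')
TEST_FILES = os.path.join(DATASET_FOLDER, '*.tfrecord-*')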
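
Because `TRAIN_FILES`, `VALIDATION_FILES`, and `TEST_FILES` share one glob, they all select the same shards. The sketch below illustrates this with the standard-library `fnmatch` module. The shard names are hypothetical and only serve the demonstration; the real bucket may name its files differently, and `tf.io.matching_files` may differ from `fnmatch` in edge cases.

```python
from fnmatch import fnmatch

PATTERN = "*.tfrecord-*"

# Hypothetical shard names, used only to illustrate how the glob behaves.
hypothetical_files = [
    "training.tfrecord-00000-of-00010",
    "val.tfrecord-00000-of-00010",
    "test.tfrecord-00000-of-00010",
]

# Every name matches, so the pattern cannot tell the splits apart.
matched = [name for name in hypothetical_files if fnmatch(name, PATTERN)]
print(matched)
```

A pattern with a split-specific prefix would be needed to load a single split.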

In [15]:
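# Set environment variables for authentication.
# These paths are machine-specific; point them at your own CA bundle and credentials file.
os.environ['CURL_CA_BUNDLE'] = '/home/aaylen/Documents/Waymo-Challenge/cacert.pem'
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/home/aaylen/Documents/Waymo-Challenge/token1.json'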

In [14]:
filenames = tf.io.matching_files(TRAIN_FILES)
dataset = tf.data.TFRecordDataset(filenames, compression_type='')
dataset_iter = dataset.as_numpy_iterator()


In [3]:
# Retrieve one example (targeting the specified frame)
target_frame_name = 'b197472f28df9f18c22654a5b514082a-072'
data = None
for bytes_example in dataset_iter:
    frame_data = wod_e2ed_pb2.E2EDFrame()
    frame_data.ParseFromString(bytes_example)
    print(frame_data.frame.context.name)
    if frame_data.frame.context.name == target_frame_name:
        data = frame_data
        break

# Fail early if the iterator was exhausted without finding the frame.
if data is None:
    raise ValueError(f'Frame {target_frame_name!r} not found in {TRAIN_FILES}')




d6cdf6eb1b7d4a8be6dac71f34e6cdb7-164
b197472f28df9f18c22654a5b514082a-072
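
The frame names printed above appear to follow an `<id>-<sequence number>` pattern, which the plotting cell later splits on the last hyphen. A minimal sketch of that parsing, with a validation step added, might look like this. `split_frame_name` is a hypothetical helper, not part of the Waymo API, and the name format is inferred only from the two names shown above.

```python
from __future__ import annotations


def split_frame_name(frame_name: str) -> tuple[str, int]:
    """Split '<id>-<sequence number>' on the last hyphen."""
    segment_id, _, sequence = frame_name.rpartition("-")
    # Reject names without a hyphen or with a non-numeric suffix.
    if not segment_id or not sequence.isdigit():
        raise ValueError(f"unexpected frame name: {frame_name!r}")
    return segment_id, int(sequence)


segment_id, sequence_number = split_frame_name("b197472f28df9f18c22654a5b514082a-072")
print(segment_id, sequence_number)
```

Converting the suffix with `int` drops the leading zero, so `072` becomes `72`.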


## Visualization Functions

In [None]:
def get_all_cameras(data: wod_e2ed_pb2.E2EDFrame):
    """Return all 8 camera images and calibrations."""
    image_list = []
    calibration_list = []
    camera_names = {
        1: 'FRONT',
        2: 'FRONT_LEFT',
        3: 'FRONT_RIGHT',
        4: 'SIDE_LEFT',
        5: 'SIDE_RIGHT',
        6: 'REAR_LEFT',
        7: 'REAR',
        8: 'REAR_RIGHT'
    }
    order = [1, 2, 3, 4, 5, 6, 7, 8]
    
    for camera_name in order:
        for index, image_content in enumerate(data.frame.images):
            if image_content.name == camera_name:
                # Assumes camera_calibrations is stored in the same order as frame.images.
                calibration = data.frame.context.camera_calibrations[index]
                image = tf.io.decode_image(image_content.image).numpy()
                image_list.append((camera_names[camera_name], image))
                calibration_list.append((camera_names[camera_name], calibration))
                break
    
    return image_list, calibration_list

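def project_vehicle_to_image(vehicle_pose, calibration, points):
    """Projects from vehicle coordinate system to image with global shutter.

    Args:
        vehicle_pose: Transform from the vehicle to the world coordinate system.
        calibration: Camera calibration (intrinsics and extrinsics).
        points: Array of shape [N, 3] in the vehicle coordinate system.

    Returns:
        Array of shape [N, 3] whose last dimension is (u, v, ok).
    """
    # Transform points from the vehicle frame to the world frame.
    pose_matrix = np.array(vehicle_pose.transform).reshape(4, 4)
    world_points = np.zeros_like(points)
    for i, point in enumerate(points):
        cx, cy, cz, _ = np.matmul(pose_matrix, [*point, 1])
        world_points[i] = (cx, cy, cz)

    # Pack the calibration into the tensors expected by world_to_image.
    extrinsic = tf.reshape(
        tf.constant(list(calibration.extrinsic.transform), dtype=tf.float32), [4, 4])
    intrinsic = tf.constant(list(calibration.intrinsic), dtype=tf.float32)
    metadata = tf.constant([
        calibration.width,
        calibration.height,
        open_dataset.CameraCalibration.GLOBAL_SHUTTER,
    ], dtype=tf.int32)
    # Camera image metadata: the pose followed by ten zero-filled entries.
    camera_image_metadata = list(vehicle_pose.transform) + [0.0] * 10

    # Project and return image coordinates (u, v, ok).
    return py_camera_model_ops.world_to_image(
        extrinsic, intrinsic, metadata, camera_image_metadata, world_points).numpy()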

def draw_points_on_image(image, points, size, color=(255, 0, 0)):
    """Draws points on an image."""
    image_copy = image.copy()
    for point in points:
        if point[2] > 0:  # Check if point is valid (ok flag)
            cv2.circle(image_copy, (int(point[0]), int(point[1])), size, color, -1)
    return image_copy
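
The per-point loop in `project_vehicle_to_image` applies a 4x4 homogeneous transform to each waypoint. The same step can be written as one matrix product. The sketch below shows only that step with NumPy. `vehicle_to_world` is a hypothetical helper name, and the pose is a made-up translation chosen so the result is easy to check by hand.

```python
import numpy as np


def vehicle_to_world(pose_matrix: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Apply a 4x4 homogeneous transform to an [N, 3] array of points."""
    # Append a column of ones to obtain homogeneous coordinates of shape [N, 4].
    homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])
    # Row-vector form of pose_matrix @ p for every point, then drop the w column.
    return (homogeneous @ pose_matrix.T)[:, :3]


# Hypothetical pose: translate by 10 along x, no rotation.
pose = np.eye(4)
pose[0, 3] = 10.0

points = np.array([[1.0, 2.0, 0.0], [3.0, 4.0, 0.0]])
world = vehicle_to_world(pose, points)
print(world)
```

For the handful of waypoints in one frame the loop is fast enough; the vectorized form mainly helps when many trajectories are projected at once.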


## Visualize All Cameras with Trajectories and Metadata

In [None]:
all_camera_images, all_camera_calibrations = get_all_cameras(data)

# Past waypoints are placed on the ground plane (z = 0); future waypoints use the recorded z.
past_waypoints = np.stack([data.past_states.pos_x, data.past_states.pos_y, np.zeros_like(data.past_states.pos_x)], axis=1)
future_waypoints = np.stack([data.future_states.pos_x, data.future_states.pos_y, data.future_states.pos_z], axis=1)

vehicle_pose = data.frame.images[0].pose

images_with_points = []
calibrations_by_name = dict(all_camera_calibrations)
for camera_name, image in all_camera_images:
    calibration = calibrations_by_name[camera_name]
    if camera_name in ['SIDE_LEFT', 'SIDE_RIGHT', 'REAR', 'REAR_LEFT', 'REAR_RIGHT']:
        waypoints_camera_space = project_vehicle_to_image(vehicle_pose, calibration, past_waypoints)
        image_with_points = draw_points_on_image(image, waypoints_camera_space, size=15, color=(0, 255, 0))
    elif camera_name in ['FRONT', 'FRONT_LEFT', 'FRONT_RIGHT']:
        waypoints_camera_space = project_vehicle_to_image(vehicle_pose, calibration, future_waypoints)
        image_with_points = draw_points_on_image(image, waypoints_camera_space, size=15, color=(255, 0, 0))
    else:
        image_with_points = image.copy()
    images_with_points.append((camera_name, image_with_points))

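# NOTE: create_top_down_view is not defined in this notebook; it must be defined before this cell runs.
top_down_image = create_top_down_view(all_camera_images, all_camera_calibrations, past_waypoints, future_waypoints, vehicle_pose)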

frame_name = data.frame.context.name
uuid, seq_num = frame_name.rsplit('-', 1)
seq_num = int(seq_num)
intent = wod_e2ed_pb2.EgoIntent.Intent.Name(data.intent)
index_num = seq_num

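# Use a 3x4 grid: the top row holds the top-down view and the two rows below
# hold the eight camera panels (a 2x4 grid has no slot for indices above 8).
fig = plt.figure(figsize=(20, 16))
axes = fig.add_subplot(3, 4, (1, 4))
axes.imshow(top_down_image)
axes.set_title('Top-Down Stitched View')
axes.axis('off')

for i, (camera_name, image) in enumerate(images_with_points):
    ax = fig.add_subplot(3, 4, i + 5)
    ax.imshow(image)
    ax.set_title(camera_name)
    ax.axis('off')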

plt.suptitle(f'Frame: {frame_name}\nUUID: {uuid}\nSeqNum: {seq_num}\nIndex: {index_num}\nIntent: {intent}', fontsize=14)
plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.show()

ValueError: too many values to unpack (expected 2)
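
Placing one full-width panel above one panel per camera needs enough grid rows for every subplot index, because matplotlib's `add_subplot` rejects an index larger than `nrows * ncols`. The sketch below computes such a layout in plain Python. `grid_layout` is a hypothetical helper introduced for illustration, and it assumes the top row is reserved for a single wide panel.

```python
import math


def grid_layout(num_cameras: int, ncols: int = 4):
    """Return (nrows, header_span, camera_slots) using 1-based subplot indices.

    Row 1 is reserved for one full-width panel; cameras fill the rows below.
    """
    camera_rows = math.ceil(num_cameras / ncols)
    nrows = 1 + camera_rows
    header_span = (1, ncols)  # first and last index of the top row
    camera_slots = [ncols + 1 + i for i in range(num_cameras)]
    return nrows, header_span, camera_slots


nrows, header_span, camera_slots = grid_layout(8)
print(nrows, header_span, camera_slots)
```

The values can then be passed to matplotlib as `fig.add_subplot(nrows, 4, header_span)` for the wide panel and `fig.add_subplot(nrows, 4, slot)` for each camera slot.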