@WaelDLZ WaelDLZ commented Nov 19, 2025

PR to add all the WOSAC metrics (except for traffic lights)

  • Interaction and Map-based metrics are implemented in torch now

WaelDLZ and others added 8 commits November 17, 2025 15:53
Notes:
- Need to add safeguards to load each map only once
- Might be slow if we increase num_agents per scenario, next step will
be torch.

I added some tests to check that the distance and TTC computations are correct,
and metrics_sanity_check looks okay. I'll keep making plots to
validate it.
Ref in original code:
message BernoulliEstimate {
    // Additive smoothing to apply to the underlying 2-bins histogram, to avoid
    // infinite values for empty bins.
    optional float additive_smoothing_pseudocount = 4 [default = 0.001];
  }
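As a rough illustration of what the additive smoothing in the referenced proto does, here is a minimal sketch of a smoothed two-bin (Bernoulli) log-likelihood. The function name and argument layout are hypothetical and are not taken from this PR or from the WOSAC code; only the default pseudocount of 0.001 comes from the snippet above.

```python
import math


def bernoulli_log_likelihood(sim_samples, log_value, pseudocount=0.001):
    """Log-likelihood of a logged boolean under a smoothed 2-bin histogram.

    sim_samples: iterable of booleans from simulated rollouts (assumed layout).
    log_value:   the boolean observed in the logged scenario.
    """
    samples = list(sim_samples)
    n_true = sum(1 for s in samples if s)
    n_false = len(samples) - n_true
    # Add the pseudocount to both bins so that an empty bin never gets
    # probability zero (which would give a log-likelihood of -inf).
    total = n_true + n_false + 2 * pseudocount
    p_true = (n_true + pseudocount) / total
    p = p_true if log_value else 1.0 - p_true
    return math.log(p)
```

Without the pseudocount, a logged collision in a scenario where no rollout collides would yield an infinite penalty; with it, the penalty is large but finite.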
First step: extract the map from the vecenv
A bunch of small tests in test_map_metric_features.py ensure this does what it is supposed to do.

python -m pufferlib.ocean.benchmark.test_map_metrics

Next steps should be straightforward.

Will need to check at some point whether doing this in numpy is too slow.
This works and passes all the tests. I would still like to make additional checks with the renderer, just in case.

With this, we have the whole set of WOSAC metrics (except for traffic lights), and we might also have the same issue as the original WOSAC code: it is slow.

Next step would be to transition from numpy to torch.
…en WOSAC sees an offroad or a collision

python pufferlib/ocean/benchmark/visual_sanity_check.py
@daphne-cornelisse daphne-cornelisse self-requested a review November 19, 2025 19:28
@daphne-cornelisse

Still need to see how we manage the Tracks_to_predict flag: if we want the interactive metrics to make sense, we should control as many agents as we can. Alternatively, the distance computation (in interaction_features.py) should also take into account agents that were valid but not controlled.

This is fixed now.

Waël Doulazmi and others added 3 commits November 19, 2025 18:47
…etrics.

It makes the computation way faster, and all the tests pass.

I didn't switch kinematics to torch because it was already fast, but I might make the change for consistency.
WaelDLZ commented Nov 20, 2025

This PR would make the WOSAC metrics complete, but also kinda slow (same issue as in the original WOSAC code).

This is also fixed: using torch makes things much faster if you have access to a GPU.

Tested on an A100 80GB with 1024 controlled agents (~110 evaluated agents): rollouts take ~22s and computing the metrics takes about 5s (it took >5min with numpy).

With 4096 controlled agents (~430 evaluated agents): rollouts take ~90s and compute metrics ~20s

It could be made faster in future work by processing all the scenarios in parallel

@daphne-cornelisse daphne-cornelisse changed the title from "Wbd/wosac map metrics" to "Add WOSAC interaction + map metrics. Switch from np -> torch." Nov 20, 2025
daphne-cornelisse

This comment was marked as off-topic.

@daphne-cornelisse daphne-cornelisse marked this pull request as ready for review November 21, 2025 23:20
@daphne-cornelisse daphne-cornelisse merged commit 2d30fa3 into main Nov 21, 2025
14 checks passed
@daphne-cornelisse daphne-cornelisse deleted the wbd/wosac_map_metrics branch November 21, 2025 23:21
greptile-apps bot commented Nov 21, 2025

Greptile Overview

Greptile Summary

This PR implements interaction-based and map-based metrics for the WOSAC (Waymo Open Sim Agents Challenge) evaluation framework, completing the metric coverage except for traffic lights.

Major changes:

  • Added collision detection and time-to-collision computation using PyTorch-based Minkowski sums and signed distance calculations
  • Implemented signed distance to road edges with proper handling of cyclic polylines and complex road geometries (donut roads, acute corners)
  • Migrated distance computations from NumPy to PyTorch for GPU acceleration
  • Added scenario-level likelihood estimation for boolean features (collision, offroad) using Bernoulli distributions
  • Extended evaluator to compute 5 new metrics: distance to nearest object, time to collision, collision indication, distance to road edge, and offroad indication
  • Renamed control_tracks_to_predict to control_wosac for clarity
  • Added C++ bindings to retrieve agent dimensions and road edge polylines
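To make the "signed distance" idea in the list above concrete, here is a minimal pure-Python sketch for a single point and a single convex polygon. It is not the PR's PyTorch implementation. The counter-clockwise vertex order and the sign convention (negative inside, positive outside) are assumptions made for this example.

```python
import math


def signed_distance_to_convex_polygon(point, polygon):
    """Signed distance from a 2D point to a convex polygon.

    polygon: list of (x, y) vertices in counter-clockwise order (assumed).
    Returns a negative value inside the polygon and a positive value outside.
    """
    px, py = point
    min_dist = math.inf
    inside = True
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        ex, ey = bx - ax, by - ay
        # For a CCW polygon, a negative cross product means the point lies
        # to the right of this edge, hence outside the polygon.
        if ex * (py - ay) - ey * (px - ax) < 0:
            inside = False
        # Distance to the edge segment: project onto it and clamp to [0, 1].
        t = ((px - ax) * ex + (py - ay) * ey) / (ex * ex + ey * ey)
        t = max(0.0, min(1.0, t))
        cx, cy = ax + t * ex, ay + t * ey
        min_dist = min(min_dist, math.hypot(px - cx, py - cy))
    return -min_dist if inside else min_dist
```

Under this convention, a collision check reduces to testing whether the signed distance is at or below zero.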

Code organization:

  • Refactored codebase into modular components: geometry_utils.py, kinematic_features.py, interaction_features.py, map_metric_features.py
  • Comprehensive test coverage with visual validation for map metrics

Testing:
The PR includes thorough unit tests covering edge cases like invalid objects, cyclic polylines, and various geometric configurations.

Confidence Score: 4/5

  • Safe to merge, with a minor performance consideration.
  • The implementation is mathematically sound, with comprehensive tests covering edge cases. The geometry utilities properly handle complex scenarios such as cyclic polylines and acute corners. The main concern is a CPU-GPU data transfer inefficiency in the TTC computation, which should be optimized but does not affect correctness.
  • pufferlib/ocean/benchmark/interaction_features.py contains an inefficient CPU-GPU round-trip in the compute_time_to_collision function.

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| pufferlib/ocean/benchmark/metrics.py | 4/5 | added interaction and map metric computation functions (compute_interaction_features, compute_map_features), moved helper functions to separate modules, switched from NumPy to PyTorch for distance calculations |
| pufferlib/ocean/benchmark/interaction_features.py | 3/5 | new file implementing collision detection and time-to-collision computation using PyTorch with Minkowski sums; contains CPU-GPU data transfer inefficiency in compute_time_to_collision (lines 196-204) |
| pufferlib/ocean/benchmark/map_metric_features.py | 4/5 | new file implementing signed distance to road edges with proper handling of cyclic polylines and donut-shaped roads using PyTorch |
| pufferlib/ocean/benchmark/geometry_utils.py | 5/5 | new file with PyTorch implementations of 2D geometry operations (box corners, Minkowski sums, signed distances to convex polygons) |
| pufferlib/ocean/benchmark/evaluator.py | 4/5 | extended WOSAC evaluator to compute interaction (collision, TTC, distance to nearest object) and map-based (offroad, distance to road edge) metrics with proper aggregation |
| pufferlib/ocean/drive/drive.py | 4/5 | renamed control mode from control_tracks_to_predict to control_wosac, added methods to retrieve agent dimensions and road edge polylines for metrics computation |
| pufferlib/ocean/drive/drive.h | 4/5 | refactored should_control_agent to use switch statement, added c_get_road_edge_counts and c_get_road_edge_polylines functions, updated control mode naming |
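The "box corners" operation listed for geometry_utils.py can be illustrated for a single agent as follows. This is a hypothetical scalar sketch, not the batched PyTorch function from the PR; the corner ordering (counter-clockwise, starting at front-left) is an assumption chosen for this example.

```python
import math


def box_corners(cx, cy, length, width, heading):
    """Corners of an oriented 2D box, in counter-clockwise order.

    heading is in radians; length runs along the heading direction.
    """
    c, s = math.cos(heading), math.sin(heading)
    half_l, half_w = length / 2.0, width / 2.0
    # Corners in the box frame: front-left, rear-left, rear-right, front-right.
    local = [(half_l, half_w), (-half_l, half_w),
             (-half_l, -half_w), (half_l, -half_w)]
    # Rotate each corner by the heading, then translate to the box center.
    return [(cx + c * lx - s * ly, cy + s * lx + c * ly) for lx, ly in local]
```

For example, `box_corners(0.0, 0.0, 4.0, 2.0, 0.0)` gives the axis-aligned rectangle spanning x in [-2, 2] and y in [-1, 1].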

Sequence Diagram

sequenceDiagram
    participant Evaluator as WOSACEvaluator
    participant Metrics as metrics.py
    participant Interaction as interaction_features.py
    participant Map as map_metric_features.py
    participant Geometry as geometry_utils.py
    participant Estimators as estimators.py
    
    Evaluator->>Metrics: compute_kinematic_features(x, y, heading)
    Metrics->>Metrics: compute speeds & accelerations
    Metrics-->>Evaluator: kinematic features
    
    Evaluator->>Metrics: compute_interaction_features(x, y, heading, scenario_ids, dimensions, eval_mask)
    loop For each scenario
        Metrics->>Interaction: compute_distance_to_nearest_object()
        Interaction->>Geometry: get_2d_box_corners()
        Interaction->>Geometry: minkowski_sum_of_box_and_box_points()
        Interaction->>Geometry: signed_distance_from_point_to_convex_polygon()
        Interaction-->>Metrics: distances
        Metrics->>Interaction: compute_time_to_collision()
        Interaction->>Metrics: compute_kinematic_features() [CPU transfer]
        Interaction-->>Metrics: TTC values
    end
    Metrics-->>Evaluator: distances, collisions, TTC
    
    Evaluator->>Metrics: compute_map_features(x, y, heading, scenario_ids, dimensions, polylines)
    loop For each scenario
        Metrics->>Map: compute_distance_to_road_edge()
        Map->>Geometry: get_2d_box_corners()
        Map->>Map: _compute_signed_distance_to_polylines()
        Map-->>Metrics: distances
    end
    Metrics-->>Evaluator: road edge distances, offroad indicators
    
    Evaluator->>Estimators: log_likelihood_estimate_timeseries()
    Estimators-->>Evaluator: kinematic likelihoods
    
    Evaluator->>Estimators: log_likelihood_estimate_scenario_level()
    Estimators-->>Evaluator: collision & offroad likelihoods
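
For readers unfamiliar with the `compute_time_to_collision()` step in the diagram, the core quantity can be sketched in one dimension: the gap to a lead object divided by the closing speed, capped at a maximum. This is a deliberate simplification; the PR's function works on batched trajectories and has to select which object counts as "ahead". The name, the signature, and the 5-second cap are assumptions for illustration only.

```python
def time_to_collision(gap_m, ego_speed, lead_speed, max_ttc=5.0):
    """Simplified 1D time-to-collision in seconds.

    gap_m:      longitudinal gap to the object ahead, in meters.
    ego_speed:  speed of the following agent, in m/s.
    lead_speed: speed of the object ahead, in m/s.
    max_ttc:    cap applied when no collision is imminent (assumed value).
    """
    closing_speed = ego_speed - lead_speed
    # If the follower is not catching up, there is no predicted collision.
    if closing_speed <= 0.0:
        return max_ttc
    return min(gap_m / closing_speed, max_ttc)
```

Capping the value keeps the feature bounded, which matters when it is later binned into a histogram for likelihood estimation.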

@greptile-apps greptile-apps bot left a comment

22 files reviewed, 1 comment

Comment on lines +196 to +204
# NOTE: I know this is ugly, I will fix it another day
speed = metrics.compute_kinematic_features(
    x=center_x.cpu().numpy(),
    y=center_y.cpu().numpy(),
    heading=heading.cpu().numpy(),
    seconds_per_step=seconds_per_step,
)[0]
if not isinstance(speed, torch.Tensor):
    speed = torch.as_tensor(speed, device=center_x.device, dtype=center_x.dtype)
style: inefficient CPU-GPU data transfer within TTC computation

transfers all position data to CPU, computes speed in NumPy, then transfers back to GPU. consider implementing speed calculation directly in PyTorch to avoid this round-trip
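
One possible way to avoid the round-trip is to compute the speed with tensor operations on the original device. The sketch below is not the fix adopted in the repository. It assumes that speed is derived from central differences of position along the last (time) axis, with NaN at the two endpoints; whether that matches `compute_kinematic_features` exactly would need to be checked against metrics.py. The helper names are hypothetical.

```python
import torch


def central_diff(t: torch.Tensor) -> torch.Tensor:
    """Central difference along the last axis, NaN-padded at both ends."""
    inner = (t[..., 2:] - t[..., :-2]) / 2
    pad = torch.full_like(t[..., :1], float("nan"))
    return torch.cat([pad, inner, pad], dim=-1)


def compute_speed_torch(x: torch.Tensor, y: torch.Tensor,
                        seconds_per_step: float) -> torch.Tensor:
    """Planar speed in position units per second, on the device of x and y."""
    dx = central_diff(x)
    dy = central_diff(y)
    # Displacement per step, converted to a per-second speed.
    return torch.sqrt(dx**2 + dy**2) / seconds_per_step
```

Because every operation stays in PyTorch, the result inherits the device and dtype of `center_x`, so the `isinstance` check and the `torch.as_tensor` conversion would no longer be needed.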

