@nadarenator
Collaborator

  • Mirrors the classic dynamics fix in the jerk dynamics model.

@nadarenator nadarenator changed the title Joint action sampling with Jerk dynamics Joint action sampling with jerk dynamics Nov 28, 2025
@daphne-cornelisse daphne-cornelisse marked this pull request as ready for review December 1, 2025 15:19
@greptile-apps

greptile-apps bot commented Dec 1, 2025

Greptile Overview

Greptile Summary

This PR migrates the jerk dynamics model from a 2-dimensional multi-discrete action space to a joint action space, matching the pattern already used for classic dynamics. The change treats the longitudinal and lateral jerk actions as dependent (joint sampling) rather than independent.
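To illustrate the difference between independent and joint sampling described above, here is a minimal sketch in plain Python. It is not code from the PR: the helper names (`softmax`, `factorized_probs`, `marginals`) and the example logits are hypothetical, and the only values taken from the summary are the 4 longitudinal and 3 lateral bins and the `long_idx * 3 + lat_idx` ordering implied by the division/modulo decoding.

```python
import math

# Bin counts from the PR summary: 4 longitudinal x 3 lateral jerk actions.
N_LONG, N_LAT = 4, 3


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def factorized_probs(long_logits, lat_logits):
    """Two independent heads: P(long, lat) = P(long) * P(lat)."""
    p_long = softmax(long_logits)
    p_lat = softmax(lat_logits)
    # Flattened in the same order as the joint index: long_idx * N_LAT + lat_idx.
    return [p_long[i] * p_lat[j] for i in range(N_LONG) for j in range(N_LAT)]


def marginals(probs):
    """Recover per-axis marginals from a flattened 12-way distribution."""
    p_long = [sum(probs[i * N_LAT + j] for j in range(N_LAT)) for i in range(N_LONG)]
    p_lat = [sum(probs[i * N_LAT + j] for i in range(N_LONG)) for j in range(N_LAT)]
    return p_long, p_lat


# Hypothetical joint logits that favour exactly two pairs: (0, 0) and (3, 2).
joint_logits = [10.0 if k in (0, 11) else 0.0 for k in range(N_LONG * N_LAT)]
joint = softmax(joint_logits)

# The best a factorized policy can do with the same marginals is their product,
# which leaks probability onto the unwanted pairs (0, 2) and (3, 0).
p_long, p_lat = marginals(joint)
product = [p_long[i] * p_lat[j] for i in range(N_LONG) for j in range(N_LAT)]
```

In this example the joint head puts almost no mass on the pair (0, 2), while the product of marginals assigns it roughly a quarter of the probability. That gap is the kind of correlation between longitudinal and lateral actions that a single 12-way head can represent and two independent heads cannot.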

Key changes:

  • Action space changed from MultiDiscrete([4, 3]) to MultiDiscrete([12]) in drive.py:112
  • Action decoding in drive.h:1638-1646 now uses division/modulo to extract longitudinal and lateral indices from a single integer
  • Neural network output configuration in drivenet.h:60-62 updated to produce a single 12-dimensional logit vector instead of two separate vectors

The implementation correctly mirrors the classic dynamics approach (lines 1571-1579 in drive.h), ensuring consistency across dynamics models.
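The division/modulo decoding mentioned in the key changes can be sketched as follows. This is an illustrative Python version, not the C code in drive.h; the function names `decode_joint_action` and `encode_joint_action` are hypothetical, and the divisor of 3 is taken from the summary's description of 4 longitudinal and 3 lateral options.

```python
# Bin counts from the PR summary: 4 longitudinal x 3 lateral jerk actions.
N_LONG, N_LAT = 4, 3


def decode_joint_action(action):
    """Split a joint action in [0, 11] into (long_idx, lat_idx)."""
    if not 0 <= action < N_LONG * N_LAT:
        raise ValueError(f"action must be in [0, {N_LONG * N_LAT - 1}], got {action}")
    # Integer division selects the longitudinal bin, modulo the lateral bin.
    return action // N_LAT, action % N_LAT


def encode_joint_action(long_idx, lat_idx):
    """Inverse mapping, useful for checking that decoding is a bijection."""
    return long_idx * N_LAT + lat_idx


# Every joint index maps to a distinct (long_idx, lat_idx) pair.
decoded = [decode_joint_action(a) for a in range(N_LONG * N_LAT)]
```

For instance, joint action 7 decodes to longitudinal index 2 and lateral index 1, and encoding that pair returns 7. Because the mapping is one-to-one, the 12 joint actions cover the same combinations as the earlier `MultiDiscrete([4, 3])` space.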

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are straightforward, well-structured, and follow an established pattern from the classic dynamics implementation. The mathematical transformation (using division and modulo to decode joint actions) is correct, and all three components (Python environment, C physics engine, and neural network configuration) are updated consistently. The action space size remains 12 (4×3), preserving the same action semantics while changing only the sampling approach.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| pufferlib/ocean/drive/drive.h | 5/5 | Changed discrete jerk action decoding from 2D array [long_idx, lat_idx] to single integer using division/modulo to extract indices, matching classic dynamics pattern |
| pufferlib/ocean/drive/drive.py | 5/5 | Changed action space from MultiDiscrete([4, 3]) to MultiDiscrete([4 * 3]) for joint action sampling |
| pufferlib/ocean/drive/drivenet.h | 5/5 | Updated neural network output configuration from 2-dimensional action space (action_dim=2, sizes [4,3]) to single joint action space (action_dim=1, size 12) |

Sequence Diagram

sequenceDiagram
    participant Agent as RL Agent
    participant DriveNet as DriveNet (drivenet.h)
    participant Env as Drive Environment (drive.py)
    participant Physics as Physics Engine (drive.h)
    
    Note over Agent,Physics: Jerk Dynamics - Joint Action Space
    
    Agent->>DriveNet: Forward pass with observations
    Note over DriveNet: action_dim=1<br/>logit_sizes[0]=12<br/>(4 long × 3 lat)
    DriveNet->>Agent: Single logit vector [12]
    Agent->>Agent: Sample action from [0-11]
    Agent->>Env: Step(action)
    Note over Env: MultiDiscrete([12])
    Env->>Physics: move_dynamics(action_idx)
    Physics->>Physics: action_val = action_array[action_idx]
    Physics->>Physics: a_long_idx = action_val / 3<br/>a_lat_idx = action_val % 3
    Note over Physics: Decode joint action:<br/>12 possibilities from<br/>4 longitudinal × 3 lateral
    Physics->>Physics: a_long = JERK_LONG[a_long_idx]<br/>a_lat = JERK_LAT[a_lat_idx]
    Physics->>Physics: Update acceleration, velocity, position
    Physics->>Env: Updated agent state
    Env->>Agent: observation, reward, done, info

@greptile-apps greptile-apps bot left a comment

3 files reviewed, no comments

@daphne-cornelisse daphne-cornelisse merged commit 95ceedd into main Dec 1, 2025
14 checks passed