Goal Radius Randomization #299

Merged
riccardosavorgnan merged 6 commits into 3.0_beta from ricky/goal_radius_randomization on Feb 16, 2026

Conversation

@riccardosavorgnan (Collaborator) commented Feb 16, 2026

Goal radius randomization. Based on the work by Kevin; remaking this branch was simpler than rebasing the other branch after all the merge conflicts we had.

  • first commit
  • Working version with rendering working as well
  • commit with config comments

@riccardosavorgnan marked this pull request as ready for review on February 16, 2026, 01:22

greptile-apps bot commented Feb 16, 2026

Greptile Summary

This PR adds per-goal randomization of the goal radius for each agent. Previously, the goal radius was a fixed global value (env->goal_radius). Now, when reward_randomization is enabled, each time a new goal is sampled (via sample_new_goal), the goal radius is independently randomized from a uniform distribution over [reward_bound_goal_radius_min, reward_bound_goal_radius_max] (configured as 2.0–12.0m). When randomization is disabled, the fixed goal_radius value is used as before.

  • New sample_new_goal_radius function added and called at the end of sample_new_goal to decouple goal radius sampling from the rest of reward coefficient generation
  • Distance check in c_step and all three rendering paths updated to use the per-agent reward_coefs[REWARD_COEF_GOAL_RADIUS] instead of the global env->goal_radius (see the sketch after this list)
  • Goal radius assignment is duplicated between generate_reward_coefs (called at init/reset) and the new sample_new_goal_radius (called per new goal) — consider consolidating to reduce redundancy
  • Config comment updated to note that the fixed goal_radius value is only active when reward_randomization = 0
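To make the distance-check change concrete, here is a minimal C sketch; the field names (x, y, goal_x, goal_y), the coefficient index, and the trimmed Agent struct are assumptions for illustration, not the actual definitions in drive.h:

#include <math.h>
#include <stdbool.h>

#define REWARD_COEF_GOAL_RADIUS 0       /* index assumed for illustration */

typedef struct {                        /* trimmed, hypothetical Agent fields */
    float x, y;                         /* current position */
    float goal_x, goal_y;               /* current goal position */
    float reward_coefs[8];              /* per-agent reward coefficients */
} Agent;

/* The goal check now compares against the per-agent radius rather than
   the global env->goal_radius. */
static bool within_goal_radius(const Agent *agent) {
    float dx = agent->goal_x - agent->x;
    float dy = agent->goal_y - agent->y;
    float distance = sqrtf(dx * dx + dy * dy);
    return distance < agent->reward_coefs[REWARD_COEF_GOAL_RADIUS];
}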

Confidence Score: 4/5

  • This PR is safe to merge — the logic is correct and the changes are well-scoped to goal radius randomization.
  • Score of 4 reflects that the core logic is correct and consistent across all code paths (step logic, rendering modes). The only concerns are stylistic: duplicated goal radius assignment between two functions and mixed comment syntax in the INI file. No functional bugs or runtime risks identified.
  • pufferlib/ocean/drive/drive.h — verify that the duplicated goal radius logic between generate_reward_coefs and sample_new_goal_radius behaves as intended across all goal behavior modes.

Important Files Changed

  • pufferlib/config/ocean/drive.ini: Comment for goal_radius updated to clarify that it is only active when reward_randomization = 0; minor mixed comment syntax (; and //) in the INI file.
  • pufferlib/ocean/drive/drive.h: Adds the sample_new_goal_radius function for per-goal radius randomization and updates the distance check and all rendering paths to use the per-agent radius. Logic is correct but duplicates part of generate_reward_coefs.

Flowchart

flowchart TD
    A[Agent Init / Reset] -->|"generate_reward_coefs()"| B[Set all reward coefs including goal_radius]
    B --> C{goal_behavior?}
    C -->|GOAL_GENERATE_NEW| D[Agent reaches goal]
    D --> E["sample_new_goal()"]
    E --> F["sample_new_goal_radius()"]
    F --> G{reward_randomization?}
    G -->|Yes| H["random_uniform(min, max)"]
    G -->|No| I["env->goal_radius (fixed)"]
    H --> J[Set agent reward_coefs GOAL_RADIUS]
    I --> J
    J --> K["within_distance = distance < agent.reward_coefs[GOAL_RADIUS]"]
    C -->|GOAL_RESPAWN| L[Agent respawns]
    L -->|"generate_reward_coefs()"| M[Goal radius set at respawn]
    M --> K
    C -->|GOAL_STOP| N[Agent stops at goal]
    N --> K

Last reviewed commit: b73b835

@greptile-apps (bot) left a comment

2 files reviewed, 2 comments


Comment on lines +469 to +479
static void sample_new_goal_radius(Drive *env, Agent *agent) {
    if (env->reward_randomization) {
        // Standard Uniform Randomization
        agent->reward_coefs[REWARD_COEF_GOAL_RADIUS] = random_uniform(
            env->reward_bounds[REWARD_COEF_GOAL_RADIUS].min_val,
            env->reward_bounds[REWARD_COEF_GOAL_RADIUS].max_val);
    } else {
        // Fixed coefficients
        agent->reward_coefs[REWARD_COEF_GOAL_RADIUS] = env->goal_radius;
    }
}
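For context, a minimal sketch of where this is invoked; the surrounding goal-sampling logic is elided and the body shown for sample_new_goal is an assumption, not the actual drive.h implementation:

static void sample_new_goal(Drive *env, Agent *agent) {
    /* ... choose the agent's new goal position (elided) ... */

    // Called at the end of sample_new_goal so the radius is re-sampled
    // (or reset to the fixed value) for every freshly generated goal.
    sample_new_goal_radius(env, agent);
}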

Duplicated goal radius logic with generate_reward_coefs
generate_reward_coefs (lines 417-418 and 449) already sets REWARD_COEF_GOAL_RADIUS using the same logic. When goal_behavior == GOAL_GENERATE_NEW, the value set by generate_reward_coefs at init/reset is immediately overwritten by this function when sample_new_goal is first called.

This isn't a bug — the separation is intentional so the goal radius can be re-randomized per goal. But you may want to consider removing the REWARD_COEF_GOAL_RADIUS assignment from generate_reward_coefs and instead calling sample_new_goal_radius directly after generate_reward_coefs in the init/reset paths. This would make sample_new_goal_radius the single source of truth for goal radius assignment, reducing duplication.
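A minimal sketch of that consolidation, assuming a combined init/reset helper (reset_agent_rewards is a hypothetical name, not existing code):

static void reset_agent_rewards(Drive *env, Agent *agent) {
    // generate_reward_coefs would no longer touch REWARD_COEF_GOAL_RADIUS
    generate_reward_coefs(env, agent);
    // sample_new_goal_radius becomes the single source of truth for the radius
    sample_new_goal_radius(env, agent);
}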

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@mpragnay left a comment

lgtm!

@riccardosavorgnan merged commit dddf28e into 3.0_beta on Feb 16, 2026
10 checks passed
@riccardosavorgnan deleted the ricky/goal_radius_randomization branch on February 16, 2026, 13:58