broken link trl training script #354

Merged
shreymodi1 merged 1 commit into main from shrey/link
Dec 3, 2025
Conversation

@shreymodi1 (Contributor) commented Dec 2, 2025



Description

Please include a summary of the change and which issue is fixed or feature is implemented. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)
Implements # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Refactoring/Code cleanup
  • Build/CI/CD related changes
  • Other (please describe):

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.

  • Test A
  • Test B

Test Configuration:

  • Firmware version:
  • Hardware:
  • Toolchain:
  • SDK:

Checklist:

  • My code follows the style guidelines of this project (ran black ., isort ., flake8 .)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Screenshots (if applicable)

If applicable, add screenshots to help showcase your changes.

Additional context

Add any other context about the PR here.


Note

Adds examples/trl/train_browsergym.py, an end-to-end GRPO training example integrating TRL’s vLLM server with OpenEnv BrowserGym MiniWoB tasks.

  • Examples:
    • New script: examples/trl/train_browsergym.py
      • Integrates TRL vLLM server (VLLM_URL, vllm_mode="server") with OpenEnv BrowserGymEnv for MiniWoB tasks (TASKS).
      • Implements prompt construction (build_prompt) and action parsing (parse_action) for BrowserGym interactions.
      • Provides reward aggregation (reward_func) consuming env-provided step_rewards.
      • Creates rollout via create_openenv_vllm_rollout_func with task rotation, env params, and sampling settings.
      • Configures and runs GRPOTrainer (GRPOConfig) with tokenizer, dataset stub, logging/saving, and final model save to outputs/simple-vllm/final.
      • Optional LoRA support via peft.LoraConfig.
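
A minimal sketch of what the action parsing and reward aggregation described above might look like. The script itself is not shown in this PR page, so everything here is an assumption for illustration: the function bodies, the regular expression, the set of action names, the `noop()` fallback, and the idea that `step_rewards` arrives as one list of per-step floats per completion are all hypothetical and may differ from `examples/trl/train_browsergym.py`.

```python
import re
from typing import List, Optional, Sequence

# Assumed subset of BrowserGym-style action names; the real script may accept others.
_ACTION_RE = re.compile(r"\b(click|fill|send_keys|scroll|noop)\((.*?)\)", re.DOTALL)


def parse_action(response: str, fallback: str = "noop()") -> str:
    """Return the first action-like call found in the model's text.

    Hypothetical logic: falls back to a no-op when nothing matches.
    The non-greedy match stops at the first ')' so arguments that
    themselves contain ')' are not handled by this sketch.
    """
    match = _ACTION_RE.search(response)
    if match is None:
        return fallback
    return f"{match.group(1)}({match.group(2).strip()})"


def reward_func(
    completions: Sequence[str],
    step_rewards: Optional[Sequence[Sequence[float]]] = None,
    **kwargs,
) -> List[float]:
    """Sum the environment-provided rewards of each episode.

    Assumes step_rewards holds one list of per-step rewards per completion;
    returns zeros when the environment supplied none.
    """
    if step_rewards is None:
        return [0.0] * len(completions)
    return [float(sum(rewards)) for rewards in step_rewards]
```

For example, `parse_action("I will click. click('13')")` yields `click('13')`, and `reward_func(["a", "b"], step_rewards=[[0.0, 1.0], [0.0]])` yields `[1.0, 0.0]`. Summing is only one possible aggregation; taking the final step's reward or applying a discount would be equally plausible choices.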

Written by Cursor Bugbot for commit 1153fa2. This will update automatically on new commits.

@shreymodi1 shreymodi1 merged commit 2ae135a into main Dec 3, 2025
9 checks passed
@shreymodi1 shreymodi1 deleted the shrey/link branch December 3, 2025 00:19