Release Summary
NeMo Gym v0.3.0 ships alongside the NVIDIA Nemotron 3 Ultra model release and open-sources the environments and corresponding datasets used during training.
Highlights:
- 70+ new environments, including benchmarks such as Tau2 and Nemotron RL training environments
- Popular agent harnesses, such as Claude Code and Hermes, available out of the box
- Integrations with OpenEnv and Harbor: use environments from these libraries directly with NeMo Gym
- Integration with VeRL: train with VeRL and scale rollout collection with NeMo Gym
First-Time Contributors
We welcomed 30+ new contributors to this release! Here are a few highlights:
- @grace-lam added the integration to run Harbor environments with NeMo Gym
- @aleksficek added the Competitive Coding Challenges environment
- @jthomson04 improved rollout resilience when models emit malformed tool-call arguments or missing message content
Thank you to all the new contributors for helping make NeMo Gym better!
New Environments & Benchmarks
Added 70+ new environments including novel datasets and integrations of popular benchmarks. New coverage spans:
- Coding — competitive programming, code infilling, SQL generation, and software-engineering benchmarks with execution-based verification
- Math & proofs — olympiad-style problems, proof grading and validation, and formal verification (including Lean)
- Knowledge & science — graduate-level QA, chemistry and physics tasks, and lab-style reasoning (including multimodal figure, table, and protocol tasks)
- Agentic — multi-turn tool use, search, sandboxed execution, finance workflows, and tau-bench-style conversational agents
- Instruction following — format constraints, citation compliance, and IFBench-style rule verification
- Safety & RLHF — jailbreak detection, abstention calibration, prompt-injection resistance, and generative reward modeling
- Multimodal, speech & translation — VLM benchmarks, visual grounding, ASR evaluation, and machine-translation quality metrics
- Chat & broad knowledge — arena-style preference evaluation and MMLU-family benchmarks
- Interactive RL — Gymnasium-style multi-step environments for spatial and game-based training
See the Available Environments table for the full list.
Configure Agent Harnesses
- Claude Code — available out of the box in NeMo Gym
- Hermes — available out of the box in NeMo Gym
- LangGraph agent — an adapter that lets you build custom agents using LangGraph patterns (reflection, subagent orchestration, parallel thinking, rewoo)
- Gymnasium agent — generic multi-turn harness for use with OpenAI Gym-style environments
Configure Models
- Optional max_concurrent_requests on the OpenAI model server to cap in-flight API calls, which is useful for rate-limited external endpoints when rollout concurrency is high
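As a rough illustration of how an opt-in cap on in-flight calls can work, the sketch below gates requests with an asyncio semaphore. It is not NeMo Gym's implementation: the `CappedClient` class, its `_send` stand-in, and the `run_batch` helper are hypothetical names invented for this example, and only the `max_concurrent_requests` option name comes from the notes above.

```python
import asyncio
from typing import Optional


class CappedClient:
    """Hypothetical client wrapper; not NeMo Gym's actual model server class."""

    def __init__(self, max_concurrent_requests: Optional[int] = None) -> None:
        self._max = max_concurrent_requests
        self._semaphore: Optional[asyncio.Semaphore] = None
        self.in_flight = 0
        self.peak_in_flight = 0

    async def _send(self, prompt: str) -> str:
        # Stand-in for the real HTTP call to an external endpoint.
        self.in_flight += 1
        self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
        await asyncio.sleep(0.01)
        self.in_flight -= 1
        return f"echo: {prompt}"

    async def request(self, prompt: str) -> str:
        # No cap configured: behave exactly as before (the option is opt-in).
        if self._max is None:
            return await self._send(prompt)
        # Create the semaphore lazily so it is built inside the running loop.
        if self._semaphore is None:
            self._semaphore = asyncio.Semaphore(self._max)
        async with self._semaphore:
            return await self._send(prompt)


async def run_batch(cap: Optional[int], n: int = 20) -> int:
    """Fire n concurrent requests and report the peak number in flight."""
    client = CappedClient(max_concurrent_requests=cap)
    await asyncio.gather(*(client.request(f"p{i}") for i in range(n)))
    return client.peak_in_flight


if __name__ == "__main__":
    print("peak with cap=4:", asyncio.run(run_batch(cap=4)))
    print("peak without cap:", asyncio.run(run_batch(cap=None)))
```

With a cap in place, extra rollouts wait on the semaphore instead of hitting the endpoint, so rollout concurrency can stay high without exceeding a provider's rate limit.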
Rollout Collection & Profiling
- New ng_aggregate_rollouts command to merge rollout shards collected independently across multiple nodes, enabling distributed eval without requiring a single coordinated collection job
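The idea of merging independently collected shards can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes each node writes its rollouts as a JSONL file, and the `merge_rollout_shards` function and the file names are hypothetical. It does not reproduce the arguments or behavior of the ng_aggregate_rollouts command itself.

```python
import json
from pathlib import Path
from typing import Iterable


def merge_rollout_shards(shard_paths: Iterable[Path], output_path: Path) -> int:
    """Concatenate JSONL shards into one file and return the record count.

    Assumption: one JSON object per line, one shard file per node.
    """
    count = 0
    with output_path.open("w", encoding="utf-8") as out:
        # Sort so the merged order does not depend on filesystem listing order.
        for shard in sorted(shard_paths):
            with shard.open("r", encoding="utf-8") as src:
                for line in src:
                    line = line.strip()
                    if not line:
                        continue  # tolerate blank lines at the end of a shard
                    json.loads(line)  # fail fast if a shard is corrupt
                    out.write(line + "\n")
                    count += 1
    return count
```

Because each shard is read on its own, the nodes never need to coordinate during collection; they only need to write to paths the merge step can find afterwards.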
Environment Library Integrations
- OpenEnv — combine OpenEnv environments with NeMo Gym environments
- Harbor — combine Harbor environments with NeMo Gym environments
Deprecation Notices
- Documentation has moved from Sphinx to Fern. Old Sphinx URLs redirect to the new site at docs.nvidia.com/nemo/gym. The docs/ directory is no longer used for publishing.
Bug Fixes
- Fixed aiohttp connection limit exhaustion under FastAPI/Uvicorn with multiple workers
- Fixed session cookie propagation for Starlette >= 1.0.0
- Fixed duplicated usage counting and errors on empty usage in subsequent model calls
- Improved rollout resilience when models emit malformed tool-call arguments or missing message content
- Fixed prompt-key hashing when inputs contain Pydantic BaseModel objects
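To make the rollout-resilience fix above more concrete, here is a minimal sketch of defensive handling for malformed tool-call arguments and missing message content. It is not the code from the actual fix: the `parse_tool_arguments` and `message_text` helpers are hypothetical, and the sketch assumes tool-call arguments arrive as a JSON string and messages as plain dictionaries.

```python
import json
from typing import Any, Dict, Optional, Tuple


def parse_tool_arguments(raw: Optional[str]) -> Tuple[Dict[str, Any], Optional[str]]:
    """Return (arguments, error) without raising on bad model output."""
    if raw is None or not raw.strip():
        return {}, None  # treat absent arguments as an empty call
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Report the problem so it can be surfaced to the model or logged,
        # instead of letting the exception end the rollout.
        return {}, f"invalid JSON in tool-call arguments: {exc.msg}"
    if not isinstance(parsed, dict):
        return {}, "tool-call arguments must be a JSON object"
    return parsed, None


def message_text(message: Dict[str, Any]) -> str:
    """Return the message content, or an empty string if it is missing or null."""
    content = message.get("content")
    return content if isinstance(content, str) else ""
```

Returning an error value rather than raising lets the caller decide whether to feed the error back to the model as a tool result or to score the rollout as failed.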
Documentation
- New concepts pages for environments, evaluation, and training
- Improved Architecture page to clarify how environments map to NeMo Gym components
- Consolidated detailed setup and quickstart into a single improved quickstart with clearer descriptions
- Expanded Ecosystem page with environment library, training framework, and agent harness integrations
Changelog Details
- feat: VLM circle click environment (#837) by @cmunley1
- feat: LocalVLLMModel bump to vLLM 0.17.0 (#839) by @bxyu-nvidia
- feat: Status updates for agent refs during rollout collection (#843) by @bxyu-nvidia
- feat: ether0 chemistry benchmark environment (#838) by @cmunley1
- docs: prime intellect verifiers dataset generation instruction update (#851) by @cmunley1
- Finance Agent Environment (#742) by @ushnish-de
- feat: Add XSTest safety benchmark resource server (#764) by @dcfarris
- Create a guide to build environments in NeMo Gym (#711) by @shashank3959
- Add multi-step tool-calling data generation example (#778) by @shashank3959
- docs: Fix TRL docs link (#857) by @bxyu-nvidia
- Swap readme table columns (to main) (#856) by @fsiino-nvidia
- Introduce Benchmarks directory (#858) by @gwarmstrong
- add gpqa diamond dataset (#845) by @azkalot1
- docs: rl <> gym compatibility table (#803) by @lbliii
- Updated contributing guide message (#862) by @cwing-nvidia
- docs: Nemotron 3 Super recipe link (#863) by @bxyu-nvidia
- Gym 0.2.0 huggingface dataset pointers (#859) by @fsiino-nvidia
- Add support for SWE-Multilingual benchmark (#822) by @roclark
- chore: Bump python package version to 0.3.0.rc0 and descriptions (#883) by @chtruong814
- feat: add Harbor integration (#751) by @grace-lam
- docs: Fix MultiChallenge train dataset description (#885) by @bxyu-nvidia
- docs: update GPQA-D readme (#888) by @cmunley1
- feat: add spider2_lite resource server (#864) by @ryan-lempka
- Add prompt config for templating (#861) by @gwarmstrong
- Compute aggregate metrics (#890) by @gwarmstrong
- Streamline Benchmark rollouts and add aime24/math_with_judge metrics (#891) by @gwarmstrong
- added bbh-train support to gym (#894) by @arnavkomaragiri
- updated README with license info (#895) by @arnavkomaragiri
- feat: VLMEvalKit (#872) by @vadam5
- bug: Fix README table display (#897) by @bxyu-nvidia
- feat: Initial integration with OpenEnv (#898) by @ahmadki
- feat: add aime25 benchmark (#899) by @gwarmstrong
- GPQA benchmark (#903) by @gwarmstrong
- Structured Outputs update with YAML and XML (#865) by @jkyi-nvidia
- feat: langgraph integration (#877) by @vadam5
- Add proof environments (#907) by @smahdavi4
- feat: Benchmark infra refactors (#906) by @bxyu-nvidia
- [Fix] use venv Python for swerl_gen Ray workers instead of hardcoded PYTHONPATH (#920) by @spacegoing
- [Fix] guard nltk download with local find() to avoid unnecessary remote fetch (#919) by @spacegoing
- [fix] (code_gen): use runtime_env py_executable for Ray workers (#913) by @spacegoing
- docs: version bump, CTA link changes (#880) by @vadam5
- Add zero reward group option for proof judge environment (#923) by @smahdavi4
- fix: always send session cookie for starlette >= 1.0.0 (#942) by @cmunley1
- feat: Fix duplicated usage counting and errors on empty usage in subsequent model calls (#939) by @bxyu-nvidia
- benchmark: LiveCodeBench v5 and v6 (#933) by @bxyu-nvidia
- fix: reasoning gym duplicate license (#947) by @cmunley1
- SWE agent refactor (#934) by @sdevare-nv
- feat: tee gym server subprocess logs to a configurable directory (#950) by @ananthsub
- feat: Browsecomp benchmark exposure (#944) by @bxyu-nvidia
- ci: upgrade GitHub Actions for Node.js 24 compatibility (#932) by @ko3n1g
- docs: add aiohttp-over-httpx guidance and multi-turn agent patterns (#957) by @cwing-nvidia
- feat: add dataset preparation script for spider2_lite (#959) by @ryan-lempka
- feat: Start Nemotron 3 Ultra benchmarks config; expose Spider 2 lite and XSTest benchmarks (#958) by @bxyu-nvidia
- docs: dataset availability (#962) by @cmunley1
- fix: Match torch backend auto in genrm model (#963) by @bxyu-nvidia
- Support for multiple gold choices in swerl_llm_judge (#956) by @atefehsz
- feat(ether0): Add boxed and Answer: LETTER extraction fallbacks (#925) by @jubick1337
- fix: RMtree ignores errors (#964) by @bxyu-nvidia
- feat: AALCR and Ruler benchmarks; Misc infra (#966) by @bxyu-nvidia
- terminus judge improvement for sim only mode (#968) by @jialeiwang
- Abstention Environment (HotpotQA) (#954) by @MahanFathi
- chore: bump _code_freeze workflow to v0.86.0 (#978) by @ko3n1g
- SWE: update OH version (#979) by @sdevare-nv
- fix: Handle BaseModel inputs in prompt-key hashing. (#991) by @ffrujeri
- docs: llm-as-a-judge (#926) by @fsiino-nvidia
- Add the RDKit-Chemistry RL Environment (#984) by @danecor
- feat: mmlu_pro and mmlu_prox benchmarks (#988) by @fsiino-nvidia
- feat: Misc infra (#970) by @bxyu-nvidia
- feat: Introduce NVARC Resource Server with inductive and transductive modes (#1003) by @cmunley1
- Add CVDP benchmark resource server with apptainer instead of docker (#928) by @arti4nvj
- feat: add ifbench (#999) by @fsiino-nvidia
- Upstream 20260408 (#1039) by @bxyu-nvidia
- fix: GenRM lock in order to properly handle concurrent requests. (#1041) by @ffrujeri
- Tau2 benchmark (#1049) by @bxyu-nvidia
- Add tau2 to Nemotron 3 Ultra benchmarks (#1052) by @bxyu-nvidia
- feat: Fix sequential reasoning allowed (#1053) by @bxyu-nvidia
- Fix aiohttp connection limit under FastAPI/Uvicorn workers > 1 (#1054) by @bxyu-nvidia
- fix: pypi (#1056) by @cmunley1
- Additional Tau2 metrics (#1064) by @bxyu-nvidia
- Bump version to 0.2.1 and make wheel test mandatory (#1065) by @kajalj22
- renamed simple_agent to cvdp_agent for consistency (#1024) by @arti4nvj
- feat: VLM counting environment (#930) by @cmunley1
- fix: add value field to circle vlm envs (#1074) by @cmunley1
- Update ns_tools to use NeMo Skills nemo-skills-tools subpackage (#1078) by @gwarmstrong
- Update lc_judge.yaml (#1082) by @fayejf
- fix: remove XSTest string-match fallback, require judge model (#1058) by @dcfarris
- New structured outputs formats envs (#1037) by @jkyi-nvidia
- fix: Revert package info version to 0.3.0.rc0 (#1088) by @chtruong814
- StructEval (Text) Environment (#1085) by @jkyi-nvidia
- terminal pivot multi harness (#1036) by @jialeiwang
- Competitive Coding Challenges Gym Environment (#994) by @aleksficek
- Update jailbreak env: response policy based verification (#1059) by @prasoonvarshney
- RL Environment for Indirect Prompt Injection (#1051) by @makeshn
- fix: remove mini-swe dummy resources server (#1077) by @cmunley1
- feat: add labbench2 VLM benchmark (#1093) by @azkalot1
- feat: add new env for lc retrieval & count ability (#927) by @fayejf
- Add omniscience benchmark and resource server (#1095) by @gwarmstrong
- ci: Fix release workflow (#1084) by @chtruong814
- Add birdbench benchmark and bird_sql resource server (#1098) by @gwarmstrong
- Add MRCR benchmark and resource server (#1100) by @gwarmstrong
- feat: add new browsecomp benchmark (#1087) by @yuki-97
- fix: pass num_repeats_add_seed via metadata.extra_body (#1099) by @gwarmstrong
- docs: add GitHub badges to README (#1002) by @cwing-nvidia
- Add gsm8k and hendrycks_math benchmarks (#1104) by @gwarmstrong
- feat: add miniF2F benchmark (#1111) by @stephencge
- feat: add ProofNet benchmark (#1114) by @stephencge
- feat: add PutnamBench benchmark (#1115) by @stephencge
- feat: Gymnasium style base environment (#1072) by @cmunley1
- feat: add MOBench benchmark (#1113) by @stephencge
- Simplify verifiers_agent to use upstream NeMoRLChatCompletionsClient (#1076) by @mferrato
- Add hmmt_feb25 benchmark (#1112) by @gwarmstrong
- fix: include agent-only environments in readme table (#1091) by @cmunley1
- Add hmmt_nov25 benchmark (#1117) by @gwarmstrong
- Fern docs migration with fidelity fixes (#1045) by @lbliii
- Add proof_bench_judge benchmark and resource server (#1118) by @gwarmstrong
- Add AIME24-X, AIME25-X, and GPQA-X benchmarks (#1120) by @wedu-nvidia
- Add APEX Shortlist benchmark (#1105) by @gwarmstrong
- feat: add aime26 benchmark (symbolic-only, MathArena source) (#1123) by @gwarmstrong
- feat: improve browsecomp (#1109) by @yuki-97
- feat: support disable interleaved reasoning (#1110) by @yuki-97
- Add Stirrup agent + GDPVal eval/RL environment (#1090) by @Kh4L
- rename gymnasium (#1136) by @cmunley1
- add mmlu, mmmlu, and mmlu-redux benchmarks (#1125) by @wedu-nvidia
- Simple version of #700 (#1138) by @tdene
- structured outputs v4 tool calls (#1127) by @jkyi-nvidia
- fix(stirrup_agent): pin Ray worker venv via runtime_env (fixes GDPVal reward=0) (#1140) by @agronskiy
- Add task-info logging to nemo_gym/rollout_collection.py through optional logging flag. + Add codex debugging skill (#1142) by @jkyi-nvidia
- Add LibriSpeech-PC benchmark, asr_with_pc resource server, and audio sidechannel in vllm_model (#1144) by @gwarmstrong
- fix(ruler): factor RULER's thread-unsafe nltk init into its own module (#1150) by @agronskiy
- Add ioi benchmark and resource server (#1124) by @gwarmstrong
- Add ifeval benchmark (data-only) (#1158) by @gwarmstrong
- Add longbench-v2 benchmark (data-only) (#1159) by @gwarmstrong
- Add longcodebench benchmark (data-only) (#1157) by @gwarmstrong
- Add answer-judge, global-piqa, math-500, proof-arena-judge, and supergpqa (#1151) by @wedu-nvidia
- Add livecodebench-x benchmark (data-only) (#1169) by @gwarmstrong
- fix(stirrup_agent): accept local paths in reference_file_urls (#1173) by @Kh4L
- Add file-path audio support to the vllm_model audio sidechannel (#1170) by @gwarmstrong
- SWE Updates 0428 (#1172) by @sdevare-nv
- fix(gdpval): plumb judge_responses_create_params_overrides into create() call (#1174) by @Kh4L
- Add imo_answerbench benchmark (#1155) by @gwarmstrong
- [StirrupAgent] Persist GDPVal deliverables per repeat (task_X/repeat_N/) (#1183) by @Kh4L
- fix: don't crash rollouts on malformed tool-call arguments and missing message content (#1180) by @jthomson04
- [Stirrup] Fix per-task Apptainer code_exec for GDPVal (#1182) by @Kh4L
- fix(gdpval): compare each eval rollout against all reference repeats (#1198) by @agronskiy
- [Stirrup] Lift LLM kwargs from config and clear stale deliverables (#1187) by @Kh4L
- ci: selective PR tests, full suite on merge, health-check polling (#1149) by @kajalj22
- fix: Tau2 propagates max_output_tokens (#1202) by @bxyu-nvidia
- ci(fern): track latest Fern CLI via npx instead of pinning (#1194) by @lbliii
- docs(fern): drop scheme from instance url, enable basepath-aware (#1156) by @lbliii
- feat(benchmarks): BrowseComp fixes and efficiency improvements (#1203) by @e-dobrowolska
- docs: expand CI/CD section in development setup guide (#1212) by @kajalj22
- Fix stirrup summarization tool history (#1181) by @syadav481
- fix: verifiers agent ToolEnv downstream use (#1214) by @cmunley1
- [GDPVal] Add opt-in persistence of raw judge responses (#1225) by @Kh4L
- fix(stirrup_agent): parse Tavily key list + rotate on 401/403/429 (#1226) by @agronskiy
- fix(gdpval): make Office→PDF preconvert actually work in comparison mode (#1228) by @agronskiy
- fix(gdpval): bound /verify wallclock on multimodal long-tail tasks (#1229) by @agronskiy
- HLE Benchmark (#1028) by @fsiino-nvidia
- docs: remove "Design a customer evaluation" page (#1240) by @cwing-nvidia
- docs: document how VLLMModel handles max_seq_length exceeded errors (#1207) by @cwing-nvidia
- docs: verl integration (#1116) by @cmunley1
- docs: improve product overview (#1186) by @cwing-nvidia
- Turing Envs (Covers Multichallenge, InverseIFEval, CFBench and SysBench datasets) (#951) by @MahanFathi
- Add simpleqa benchmark + simpleqa resource server (#1162) by @gwarmstrong
- Add physics benchmark + physics_judge resource server (#1163) by @gwarmstrong
- Add imo-gradingbench benchmark + imo_gradingbench resource server (#1161) by @gwarmstrong
- feat: hermes agent harness (#1033) by @cmunley1
- Add frontierscience-olympiad benchmark + frontierscience_judge resource server (#1164) by @gwarmstrong
- Add hotpotqa_closedbook benchmark + hotpotqa_qa resource server (#1166) by @gwarmstrong
- Add wmt24pp benchmark and wmt_translation resource server (#1199) by @gwarmstrong
- Add human-eval benchmark and resource server (#1201) by @gwarmstrong
- Add arena-hard-v2 benchmark and resource server (#1122) by @gwarmstrong
- ci: skip tests for benchmark-only changes (#1260) by @kajalj22
- Add flores200 benchmark (#1259) by @gwarmstrong
- Add ugphysics benchmark + ugphysics_judge resource server (#1167) by @gwarmstrong
- Add mbpp benchmark (#1257) by @gwarmstrong
- Add arena-hard benchmark (#1261) by @gwarmstrong
- Add m-arena-hard-v2 benchmark (#1262) by @gwarmstrong
- Add m-arena-hard benchmark (#1263) by @gwarmstrong
- Add human-eval-infilling (FIM) benchmark + code_fim resource server (#1258) by @gwarmstrong
- Add speed-bench benchmark and resource server (#1232) by @gwarmstrong
- Add imo_proofbench benchmark and imo_proofbench_judge resource server (#1230) by @gwarmstrong
- Add bigcodebench benchmark and resource server (#1211) by @gwarmstrong
- feat(asr_with_pc): add Hallucination and ASR_LEADERBOARD task_types (#1177) by @gwarmstrong
- Add polymath benchmark + polymath resource server (weighted, per-language) (#1168) by @gwarmstrong
- feat(benchmarks): add musan benchmark (data-only) (#1179) by @gwarmstrong
- feat(benchmarks): add numb3rs benchmark (data-only) (#1178) by @gwarmstrong
- feat(benchmarks): add asr_leaderboard benchmark (data-only) (#1176) by @gwarmstrong
- docs: add explicit guidance on when to use NeMo Gym (#1266) by @cwing-nvidia
- fix(gdpval): normalize python-docx ns0 namespacing before LibreOffice convert (#1270) by @agronskiy
- chore: pin verifiers to 0.1.14 (#1271) by @cmunley1
- benchmark: protocolqa2 labbench2 (#1238) by @azkalot1
- docs: add release notes page to About section (#1279) by @cwing-nvidia
- ci: Major refactor of release-workflows (#1242) by @ko3n1g
- docs: generalize training workflow reference in vLLM page (#1287) by @cwing-nvidia
- fix(docs): replace generic "Index" link text with actual page titles (#1284) by @cwing-nvidia
- fix(gdpval): also apt-install JRE when libreoffice is pre-baked into image (#1268) by @agronskiy
- CVDP Resources Server Fixes for commercial tooling support (#1276) by @arti4nvj
- docs: add Daytona Harbor tutorial (#1227) by @mu-hashmi
- docs(fern): main + latest GA alias, rename v0.2 → v0.2.1, fix CI fork-secret access (#1241) by @lbliii
- SWE - Update openhands, add skip eval (#1288) by @sdevare-nv
- SWE: add golden patch validation (#1296) by @sdevare-nv
- docs: restructure concepts section (#1278) by @cwing-nvidia
- ci(fern): generate library reference before publishing docs (#1297) by @lbliii
- Rollout to Metrics Mapping + Partial Reward Profiling + Reward Profiling Skill (#1145) by @jkyi-nvidia
- docs(fern): fix sidebar ordering to match original site (#1299) by @lbliii
- add openai headers (#1027) by @cdreetz
- add pivot dataset creation skill (#1308) by @jkyi-nvidia
- Codex/equiv judge extraction (#1313) by @jiacheng-xu
- Support sharded rollout aggregation via ng_aggregate_rollouts (#1314) by @gwarmstrong
- docs: improve getting started (#1283) by @cwing-nvidia
- docs: add environment concepts pages (#1265) by @cwing-nvidia
- docs: move index page to About section (#1320) by @cwing-nvidia
- docs: make Main the default docs version (#1321) by @cwing-nvidia
- docs(fern): fix redirects so Sphinx URLs actually resolve (#1310) by @lbliii
- fix(stirrup_agent): libreoffice whitespace + reference_files double-nest (#1333) by @agronskiy
- feat(openai_model): opt-in concurrency cap via per-server semaphore (#1208) by @agronskiy
- docs(fern): retire Latest slug, flatten /latest/ redirects to /main/, drop Main beta, fix duplicate H1 (#1328) by @lbliii
- feat: GRL Tetris Gymnasium Environment (#1331) by @cmunley1
- feat: GRL Sokoban Gymnasium Environment (#1330) by @cmunley1
- fix(stirrup): restore tool messages for model calls (#1277) by @syadav481
- docs: generalize training card links to all training tutorials (#1355) by @cwing-nvidia
- Consolidate benchmarks/prompts/ onto NeMo Skills' directory layout and naming (#1316) by @gwarmstrong
- build: add root Makefile with Fern dev convenience targets (#1255) by @lbliii
- Port codex skills to claude (#1369) by @jkyi-nvidia
- fix(security): upgrade dependencies for CVE remediation (#1370) by @kajalj22
- feat: Gracefully handle hangs in math verify calls (#1354) by @Kipok
- Harden finance_sec_search resource server and agent for GRPO training (#1304) by @ushnish-de
- fix(stirrup_agent): stage GDPVal reference files on Ray worker, not head (#1366) by @agronskiy
- feat(stirrup_agent): per-task timeout + walltime-resilient failure routing (#1367) by @agronskiy
- example dataset/pipeline for terminus judge (#1374) by @kbhardwaj-nvidia
- feat: add environments/ and example_environments/ (#1324) by @cmunley1
- fix(security): upgrade transformers 4.x → 5.8.1 (CVE remediation) (#1372) by @kajalj22
- feat: claude code agent harness (#1336) by @cmunley1
- docs: add Verification Patterns section and move LLM-as-Judge into it (#1364) by @cwing-nvidia
- Revert OpenAI model override (#1380) by @bxyu-nvidia
- ci: remove build-docs and build-test-publish-wheel workflows (#1293) by @ko3n1g
- docs: point root README at Fern canonical URLs (#1325) by @lbliii
- fix(stirrup_agent): reasoning fallback + surface tool-arg validation errors (#1397) by @agronskiy
- feat: example multi turn gymnasium env (#1332) by @cmunley1
- docs: fix agent_ref description to reference agent server instead of resources server (#1400) by @JOBEBOLDER
- fix: wmt_translation - stage COMET Python mirror per-writer to avoid races (#1407) by @ananthsub
- docs(fern): adopt NVIDIA global theme as source of truth (#1413) by @lbliii
- docs: replace core-components with architecture page (#1323) by @cwing-nvidia
- ci: validate release branch-rules (#1392) by @ko3n1g
- chore: align CLAUDE.md with current docs, reduce drift (#1438) by @cwing-nvidia
- docs: retire docs/ Sphinx tree, replace with pointer to fern/ (#1376) by @lbliii
- docs: restore GitHub badges to README (#1443) by @cwing-nvidia
- ci: add request-nvskills-ci workflow (#1441) by @ananthsub
- ci: exclude .oms.sig from secrets-detector baseline (#1444) by @ananthsub
- docs: add tip for FlashInfer JIT cache install for vLLM model server (#1405) by @ananthsub
- docs: fix agent_ref description to reference agent server (#1418) by @ananthsub
- fix: return in-scope-filtered agent configs from load_and_validate_server_instance_configs (#1410) by @ananthsub
- feat: add '/' endpoint to SimpleServer and HeadServer (#1431) by @marta-sd
- docs: fix fern docs issues from NVBugs 6193091 (#1393) by @lbliii
- fix: skip multi-call assistant targets in chat and conversational converters for pivot datasets (#1409) by @ananthsub
- fix: rename workplace assistant environment (#1462) by @cmunley1
- Rename Turing VIF environment into VerifIF (#1470) by @odelalleau
- fix: process cleanup in CLI / rdkit_chemistry/ns_tools/code_gen (#1406) by @ananthsub
- Patch to CCC Environment Code and Docs (#1121) by @aleksficek
- docs: fix dead core-components links and canonical /main URLs (#1480) by @cwing-nvidia
- fix: process lifecycle for apptainer & nstools (#1507) by @ananthsub
- docs: update verl pin (#1504) by @cmunley1
- ci: Add community bot labeler (#1512) by @chtruong814
- ci: Update community bot version to add token fix (#1516) by @chtruong814
- docs: add v0.3.0 release notes (#1486) by @cwing-nvidia
- docs: update RL framework compatibility table (#1523) by @ananthsub
- feat: add ultra-v3 post-training environments and agent updates (#1529) by @ananthsub
- fix: add example data + metrics for swe_pivot and inverse_if (#1530) by @ananthsub
- fix: add example rollouts for swe_pivot/inverse_if and restore core coverage (#1531) by @ananthsub
- docs(fern): add v0.3.0 version snapshot for GA release (#1521) by @kajalj22
- chore: drop rc0 pre-release tag for 0.3.0 release by @kajalj22