Conversation
pankit-eng
reviewed
Oct 5, 2025
Contributor
Did we need to check in the CPython files?
pankit-eng
reviewed
Oct 6, 2025
@dataclass
class ExecutionResult:
Contributor
A few points to consider:
- stdout and stderr could be long streams and may cause the env container to OOM if we store them in memory. Let's discuss how the policy would leverage this information. It is better to minimize the context shared between the inside and the outside of the container.
- One paradigm we are seeing in SWE agent training is that exit_code and a failure reason are generally a good starting point for an execution result. Let's discuss whether this paradigm can be applied here too.
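To make the two points above concrete, here is a minimal sketch of what a bounded result could look like. It is not the PR's actual ExecutionResult: the field names (exit_code, failure_reason, stdout_tail, stderr_tail), the helpers bounded_tail and summarize, and the 200-line cap are all hypothetical, chosen only for illustration. It assumes the streams can be read line by line, and it caps the number of lines rather than bytes, so a single very long line would still be held in full.

```python
from collections import deque
from dataclasses import dataclass
from typing import Iterable, Optional


def bounded_tail(lines: Iterable[str], max_lines: int) -> str:
    """Consume a stream line by line, keeping only the last max_lines lines."""
    # A deque with maxlen drops the oldest entries as new ones arrive,
    # so at most max_lines lines are held at any time.
    return "".join(deque(lines, maxlen=max_lines))


@dataclass
class ExecutionResult:
    # Hypothetical fields, not the ones defined in the PR.
    exit_code: int
    failure_reason: Optional[str] = None
    stdout_tail: str = ""
    stderr_tail: str = ""

    @property
    def success(self) -> bool:
        return self.exit_code == 0


def summarize(
    exit_code: int,
    stdout_lines: Iterable[str],
    stderr_lines: Iterable[str],
    max_lines: int = 200,  # assumed cap, tune to the container's budget
) -> ExecutionResult:
    """Build a compact result: exit code, a short failure reason, bounded tails."""
    stdout_tail = bounded_tail(stdout_lines, max_lines)
    stderr_tail = bounded_tail(stderr_lines, max_lines)

    failure_reason = None
    if exit_code != 0:
        # Heuristic: use the last non-empty stderr line as the reason.
        non_empty = [ln.strip() for ln in stderr_tail.splitlines() if ln.strip()]
        failure_reason = non_empty[-1] if non_empty else f"exit code {exit_code}"

    return ExecutionResult(exit_code, failure_reason, stdout_tail, stderr_tail)
```

With this shape, the policy sees exit_code and failure_reason first, and the tails are optional extra context whose size is fixed regardless of how much the executed code prints.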
jspisak
pushed a commit
that referenced
this pull request
Oct 22, 2025
Updating list of supporters with LastMile AI
This was referenced Oct 27, 2025
pankit-eng
pushed a commit
that referenced
this pull request
Nov 3, 2025
FIX: Handle double-nested observation in client parser
rycerzes
referenced
this pull request
in rycerzes/OpenEnv
Nov 19, 2025
Updating list of supporters with LastMile AI
rycerzes
referenced
this pull request
in rycerzes/OpenEnv
Nov 19, 2025
FIX: Handle double-nested observation in client parser
This was referenced Jan 13, 2026
burtenshaw
pushed a commit
that referenced
this pull request
Jan 13, 2026
* Upload current REPL state * use official prompt * unify REPLEnv api * Update default model in server side * Updated example using IP * Updated with prompt * inject final answer --------- Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
akashkathole7
added a commit
to akashkathole7/OpenEnv
that referenced
this pull request
Apr 23, 2026
…x rank Two config fixes surfaced by Daniel Han's "LoRA Without Regret" guidance at the Scaler workshop 2026-04-22: 1. LORA_TARGETS was attention-only (q/k/v/o). Adding MLP projections (gate_proj, up_proj, down_proj) covers the MLP block. Per Daniel, MLP adapters materially close the gap with full fine-tuning at near-zero VRAM cost and were flagged as the huggingface#1 silent underperformance in attention-only LoRA setups. 2. lora_alpha was LORA_RANK (naive PEFT default = alpha equals rank). New LORA_ALPHA = LORA_RANK * 2 follows the 2x-rank convention that Thinking Machines documented as the regime where LoRA closes the gap with full fine-tuning on small-to-medium models. Both scripts share constants via train_grpo_real.py -> train_sft_warmstart.py import, so the SFT checkpoint slots cleanly into the GRPO phase without re-init. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
akashkathole7
added a commit
to akashkathole7/OpenEnv
that referenced
this pull request
Apr 24, 2026
Grepped src/openenv/core/rubrics/ and confirmed the Rubric base class + container set (WeightedSum, Sequential, Gate, RubricList, LLMJudge) already exist per RFC 004. Updated the README section to show exactly which container our rewards.py functional composition maps to, one row per component in a new mapping table. Does NOT refactor rewards.py (invariant huggingface#1 per ONSITE_BRIEFING.md). The narrative is: functional composition honors the composable-rubrics philosophy in component independence + per-component audit trail + CI contract over multi-component defense-in-depth, even though the class-inheritance refactor is deferred to avoid regressing the 6 red-team ceiling tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>