Conversation
📝 WalkthroughWalkthroughAdds unified task-context (CONTEXT.md) propagation to ExecutionContext and multimodal tools, implements subagent cancellation recovery with configurable min/max timeout clamping and richer SubagentResult semantics, adds CI/workflows and scripts, and introduces extensive docs, specs, and tests supporting these features. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Agent as Main Agent
participant SysMsg as System Message Builder
participant Exec as ExecutionContext
participant Context as CONTEXT.md Loader
participant Tool as Multimodal Tool
participant API as External API
Agent->>SysMsg: build_coordination_message()
SysMsg->>SysMsg: add TaskContextSection if tools/subagents enabled
Agent->>Exec: stream_with_tools()
Exec->>Context: load_task_context_with_warning()
Context-->>Exec: task_context + optional warning
Exec->>Tool: call tool with task_context
Tool->>Tool: format_prompt_with_context(task_context)
Tool->>API: invoke external API with augmented prompt
API-->>Tool: response
Tool-->>Agent: result (+ context warning if present)
sequenceDiagram
autonumber
participant Parent as Parent Agent
participant MCP as Subagent MCP Server
participant SubMgr as SubagentManager
participant Orch as Orchestrator
participant WS as Workspace / full_logs/status.json
Parent->>MCP: spawn_subagents() (no timeout_seconds)
MCP->>SubMgr: start with configured min/max timeouts
SubMgr->>SubMgr: _clamp_timeout(requested)
SubMgr->>Orch: run subagent with clamped timeout
Orch->>WS: write full_logs/status.json (phase, costs, outputs)
Note over Orch,SubMgr: timeout occurs
SubMgr->>WS: read full_logs/status.json + workspace artifacts
SubMgr->>SubMgr: extract recovered_answer, completion_percentage, token_usage
SubMgr-->>MCP: SubagentResult(status, workspace_path, completion_percentage, token_usage, warning?)
MCP-->>Parent: deliver recovered result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 2 inconclusive)
✅ Passed checks (3 passed)
✨ Finishing touches
Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
massgen/tool/_multimodal_tools/understand_file.py (1)
203-283: Document the newtask_contextparameter in the docstring.The
task_contextparameter was added to the function signature but is not documented in the docstring's Args section. This is inconsistent withunderstand_video.py(lines 480-482), which properly documents this parameter.🔎 Suggested docstring addition
Add the following to the Args section of the docstring (after
agent_cwd):agent_cwd: Agent's current working directory (automatically injected, optional) + task_context: Context string used to augment the prompt with task-specific information (Optional[str]) + - When provided, prepends contextual information to help the model better understand + task-specific terminology and requirements. + - If None (default), no context-based augmentation is applied.massgen/tool/_multimodal_tools/generation/generate_media.py (1)
148-167: Image-to-image path dropsCONTEXT.mdtruncation warning fromgenerate_mediaYou now surface
context_warningin the maingenerate_mediaresponses (batch/single), but the image‑to‑image helper_generate_single_with_input_imagesdoesn’t receive or emit that warning. As a result, callers usinginput_imagesnever see truncation or context issues even though the prompt is augmented.To keep behavior consistent with other paths and the spec (“warning” field should be visible on tool responses), propagate
context_warninginto this helper and include it in the JSON payload when present.Concrete patch sketch
async def _generate_single_with_input_images( @@ - timestamp: str, - ext: str, - task_context: Optional[str] = None, + timestamp: str, + ext: str, + task_context: Optional[str] = None, + context_warning: Optional[str] = None, @@ - if result.success: - metadata = dict(result.metadata or {}) - if input_image_paths: - metadata["input_image_paths"] = input_image_paths - - return ExecutionResult( - output_blocks=[ - TextContent( - data=json.dumps( - { - "success": True, - "operation": "generate_media", - "mode": mode, - "file_path": str(result.output_path), - "filename": result.output_path.name if result.output_path else None, - "backend": result.backend_name, - "model": result.model_used, - "file_size": result.file_size_bytes, - "duration_seconds": result.duration_seconds, - "metadata": metadata, - }, - indent=2, - ), - ), - ], - ) + if result.success: + metadata = dict(result.metadata or {}) + if input_image_paths: + metadata["input_image_paths"] = input_image_paths + + response_data = { + "success": True, + "operation": "generate_media", + "mode": mode, + "file_path": str(result.output_path), + "filename": result.output_path.name if result.output_path else None, + "backend": result.backend_name, + "model": result.model_used, + "file_size": result.file_size_bytes, + "duration_seconds": result.duration_seconds, + "metadata": metadata, + } + if context_warning: + response_data["warning"] = context_warning + + return ExecutionResult( + output_blocks=[ + TextContent( + data=json.dumps(response_data, indent=2), + ), + ], + )And in
generate_media, pass the warning through:- task_context, context_warning = load_task_context_with_warning(agent_cwd, task_context) @@ if not is_batch and input_images and media_type == MediaType.IMAGE: return await _generate_single_with_input_images( @@ - timestamp=timestamp, - ext=ext, - task_context=task_context, + timestamp=timestamp, + ext=ext, + task_context=task_context, + context_warning=context_warning, )Also applies to: 201-220, 235-255
🧹 Nitpick comments (22)
openspec/AGENTS.md (3)
43-43: Fix markdown heading syntax—use###instead of bold emphasis.Lines 43, 293, 297, and 301 use bold text (
**Workflow**,**"Change must..."**, etc.) where proper markdown headings (###) should be used. This improves document structure and rendering consistency.🔎 Proposed fixes
- **Workflow** + ### Workflow - **"Change must have at least one delta"** + ### "Change must have at least one delta" - **"Requirement must have at least one scenario"** + ### "Requirement must have at least one scenario" - **"Silent scenario parsing failures"** + ### "Silent scenario parsing failures"Also applies to: 293-293, 297-297, 301-301
125-125: Add language identifiers to fenced code blocks.Lines 125, 147, and 351 contain fenced code blocks without language specifiers. Adding them improves syntax highlighting and document clarity.
🔎 Proposed fixes
- ``` + ```yaml openspec/ ├── project.md # Project conventions - ``` + ``` New request? ├─ Bug fix restoring spec behavior? → Fix directly - ``` + ``` openspec/changes/add-2fa-notify/ ├── proposal.md ├── tasks.mdNote: Line 125 should be
yaml(directory tree structure); lines 147 and 351 are decision/directory trees—considertextor leave asbashdepending on intent.Also applies to: 147-147, 351-351
29-30: Minor: Reduce repeated phrasing in examples.Lines 29–30 repeat "I want to create" twice in succession. Slightly varying the trigger examples would improve writing variety.
🔎 Suggested revision
- "Help me create a change proposal" - "Help me plan a change" - "Help me create a proposal" - - "I want to create a spec proposal" - - "I want to create a spec" + - "I want to create a spec proposal" + - "Let me spec out a new capability"massgen/orchestrator.py (1)
869-923: Subagent min/max timeout wiring looks consistent with existing config patternThe new
min_timeout/max_timeoutdefaults, overrides fromcoordination_config, and propagation via--min-timeout/--max-timeoutinto the subagent MCP args are straightforward and align with howmax_concurrent/default_timeoutare handled. I don't see correctness or safety issues in this wiring.If
subagent_min_timeout/subagent_max_timeoutare allowed to beNoneor non-ints at the config level, consider normalizing/clamping them there so this function can continue to assume plain ints.openspec/project.md (1)
14-24: Consider trimming brittle details (exact models and file sizes) to keep this doc durableThe overall structure and intent of this project context doc are strong, but the very specific bits (exact model lists under “LLM SDKs/Providers” and approximate byte sizes for key modules) are likely to drift quickly as the codebase and dependencies evolve. You might want to either:
- Soften these to examples (“e.g., claude-3-5-sonnet, gpt-4o, gemini-2.x, …”) without implying completeness, and/or
- Drop the
(~XXXK bytes)annotations and instead describe modules by role only.This keeps the document useful for both humans and tools without requiring frequent touch-ups.
Also applies to: 101-120
massgen/tool/_multimodal_tools/understand_file.py (1)
501-504: Move import to module level for consistency and performance.The
format_prompt_with_contextimport is performed inside the function, which has minor performance implications (repeated imports on each call). In contrast,understand_video.pyimports this at the module level (line 25), which is the preferred pattern for consistency across multimodal tools.🔎 Proposed refactor
Move the import to the top of the file with other imports:
from dotenv import load_dotenv from openai import OpenAI +from massgen.context.task_context import format_prompt_with_context from massgen.tool._result import ExecutionResult, TextContentThen remove the import from inside the function:
- # Inject task context into prompt if available - from massgen.context.task_context import format_prompt_with_context - augmented_prompt = format_prompt_with_context(prompt, task_context)massgen/skills/release-prep/SKILL.md (1)
18-20: Consider adding language specifiers to fenced code blocks.Static analysis flagged three code blocks without language specifiers. While these are example outputs rather than executable code, adding specifiers improves rendering and accessibility.
🔎 Suggested improvements
Line 18 (usage example):
-``` +```bash /release-prep v0.1.34Line 164 (example output): ```diff -``` +```text ### 📸 Suggested Screenshot for v0.1.34 ...Line 184 (checklist output): ```diff -``` +```text ## Release Prep Summary for v0.1.34 ...</details> Also applies to: 164-178, 184-207 </blockquote></details> <details> <summary>openspec/changes/add-unified-project-context/design.md (1)</summary><blockquote> `72-76`: **Consider adding language specifier to error message code block.** The error message example would render better with a language specifier. Since this is an error message (text output), `text` would be appropriate. <details> <summary>🔎 Suggested improvement</summary> ```diff **Error message**: -``` +```text CONTEXT.md not found in workspace. Before using multimodal tools or spawning subagents, create a CONTEXT.md file with task context. See system prompt for instructions.</details> </blockquote></details> <details> <summary>massgen/tool/_multimodal_tools/understand_audio.py (1)</summary><blockquote> `310-327`: **Note: OpenAI Whisper doesn't use the prompt parameter.** The augmented prompt is correctly used for Gemini (line 320), but OpenAI's Whisper transcription API (line 326) doesn't actually use the `prompt` parameter—it's only included for API consistency, as noted in the docstring at line 133. The actual API call at lines 150-154 ignores it. This is harmless but creates a false impression that context affects Whisper transcriptions. <details> <summary>🔎 Optional: Use original prompt for OpenAI to clarify intent</summary> ```diff if selected_backend == "gemini": transcription = await _process_with_gemini( audio_path=audio_path, prompt=augmented_prompt, model=selected_model, ) else: # openai transcription = await _process_with_openai( audio_path=audio_path, - prompt=augmented_prompt, + prompt=None, # Whisper doesn't use prompts model=selected_model, )docs/announcements/README.md (1)
19-21: Add language identifier to code block.The fenced code block should specify a language for proper syntax highlighting.
🔎 Proposed fix
-``` +```bash /release-prep v0.1.34</details> </blockquote></details> <details> <summary>docs/announcements/current-release.md (1)</summary><blockquote> `24-26`: **Format URLs as markdown links.** The bare URLs should be formatted as proper markdown links for better rendering and accessibility. <details> <summary>🔎 Proposed fix</summary> ```diff ## Links -- **Release notes:** https://github.com/massgen/MassGen/releases/tag/v0.1.33 -- **X post:** https://x.com/massgen_ai/status/2007169573196615950 -- **LinkedIn post:** https://www.linkedin.com/posts/massgen-ai_were-excited-to-release-massgen-v0133-activity-7412935810997678080-HH1H +- **Release notes:** [v0.1.33](https://github.com/massgen/MassGen/releases/tag/v0.1.33) +- **X post:** [massgen_ai](https://x.com/massgen_ai/status/2007169573196615950) +- **LinkedIn post:** [massgen-ai](https://www.linkedin.com/posts/massgen-ai_were-excited-to-release-massgen-v0133-activity-7412935810997678080-HH1H)Install: `pip install massgen==0.1.33` -Release notes: https://github.com/massgen/MassGen/releases/tag/v0.1.33 +Release notes: [v0.1.33](https://github.com/massgen/MassGen/releases/tag/v0.1.33)Also applies to: 40-40
.github/workflows/pypi-publish.yml (2)
34-36: Quote variables to prevent word splitting.The shellcheck warnings are valid. While these specific variables are unlikely to contain spaces, quoting is a best practice for shell safety.
🔎 Proposed fix
- name: Get version from tag id: version run: | - echo "VERSION=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT - echo "VERSION_WITHOUT_V=${GITHUB_REF#refs/tags/v}" >> $GITHUB_OUTPUT + echo "VERSION=${GITHUB_REF#refs/tags/}" >> "$GITHUB_OUTPUT" + echo "VERSION_WITHOUT_V=${GITHUB_REF#refs/tags/v}" >> "$GITHUB_OUTPUT"
75-87: Quote variables in step summary.Multiple shellcheck warnings flagged unquoted variables. While the
${{ }}syntax is GitHub Actions interpolation (not bash variables) and is safe, the$GITHUB_STEP_SUMMARYredirections should be quoted.🔎 Proposed fix
- name: Post publish summary run: | - echo "### 🎉 Published to PyPI Successfully!" >> $GITHUB_STEP_SUMMARY - echo "" >> $GITHUB_STEP_SUMMARY - echo "**Version:** ${{ steps.version.outputs.VERSION }}" >> $GITHUB_STEP_SUMMARY - echo "" >> $GITHUB_STEP_SUMMARY - echo "**PyPI URL:** https://pypi.org/project/massgen/${{ steps.version.outputs.VERSION_WITHOUT_V }}/" >> $GITHUB_STEP_SUMMARY - echo "" >> $GITHUB_STEP_SUMMARY - echo "**Docs:** https://massgendoc.readthedocs.io/" >> $GITHUB_STEP_SUMMARY - echo "" >> $GITHUB_STEP_SUMMARY - echo "**Install:**" >> $GITHUB_STEP_SUMMARY - echo '```bash' >> $GITHUB_STEP_SUMMARY - echo "pip install massgen==${{ steps.version.outputs.VERSION_WITHOUT_V }}" >> $GITHUB_STEP_SUMMARY - echo '```' >> $GITHUB_STEP_SUMMARY + echo "### 🎉 Published to PyPI Successfully!" >> "$GITHUB_STEP_SUMMARY" + echo "" >> "$GITHUB_STEP_SUMMARY" + echo "**Version:** ${{ steps.version.outputs.VERSION }}" >> "$GITHUB_STEP_SUMMARY" + echo "" >> "$GITHUB_STEP_SUMMARY" + echo "**PyPI URL:** https://pypi.org/project/massgen/${{ steps.version.outputs.VERSION_WITHOUT_V }}/" >> "$GITHUB_STEP_SUMMARY" + echo "" >> "$GITHUB_STEP_SUMMARY" + echo "**Docs:** https://massgendoc.readthedocs.io/" >> "$GITHUB_STEP_SUMMARY" + echo "" >> "$GITHUB_STEP_SUMMARY" + echo "**Install:**" >> "$GITHUB_STEP_SUMMARY" + echo '```bash' >> "$GITHUB_STEP_SUMMARY" + echo "pip install massgen==${{ steps.version.outputs.VERSION_WITHOUT_V }}" >> "$GITHUB_STEP_SUMMARY" + echo '```' >> "$GITHUB_STEP_SUMMARY"massgen/tool/_multimodal_tools/understand_image.py (1)
43-50: Function signature updated correctly for task_context.The
task_contextparameter is added as the last optional parameter, maintaining backward compatibility. The docstring should be updated to document this new parameter.🔎 Suggested docstring addition
Add to the Args section of the docstring:
allowed_paths: List of allowed base paths for validation (optional) agent_cwd: Agent's current working directory (automatically injected) + task_context: Optional task context string to augment the prompt with + project-specific terminology and context (auto-injected from CONTEXT.md)massgen/tests/test_subagent_cancellation_recovery.py (1)
287-321: Verify test assertion on line 321.The test
test_extract_answer_selects_by_votesasserts that"agent_1" in answeron line 321. This assertion checks that the answer content mentions "agent_1", which works because the test creates answers with the patternf"Answer from {agent_id}". However, this is a somewhat indirect assertion.A more robust assertion might directly check the answer content:
assert answer == "Answer from agent_1"This would make the test more precise and easier to debug if it fails.
openspec/changes/fix-subagent-cancellation-recovery/proposal.md (1)
20-39: Consider adding language specifiers to code blocks.The flow diagrams at lines 20-26 and 29-39 are flagged by markdownlint for missing language specifiers. Since these are conceptual diagrams rather than executable code, you could use
textorplaintextas the language:-``` +```text Parent's asyncio.wait_for() times outThis is a minor documentation polish.
openspec/changes/add-unified-project-context/specs/task-context/spec.md (1)
45-51: Add a language identifier to the fenced code block for markdownlint complianceThe fenced code block showing the
[Task Context]/[Request]template has no language tag, which trips MD040. Consider marking it astext(ormarkdown) for cleaner linting.Suggested change
-``` +```text [Task Context] {context} [Request] {original_prompt}massgen/tool/_multimodal_tools/generation/generate_media.py (1)
266-287:generate_mediacontext handling is solid; consider documentingtask_contextin the docstringThe new flow—loading
task_contextviaload_task_context_with_warning, requiringCONTEXT.md, and threading warnings into batch/single responses—looks correct and matches the intended behavior for multimodal tools.Since
task_contextis now a first‑class context param, it would help future maintainers if the docstring’s Args section briefly described it (similar toagent_cwd/multimodal_config), e.g., “auto‑injected task context loaded from CONTEXT.md”.Also applies to: 361-375, 449-471, 553-599
massgen/mcp_tools/subagent/_subagent_mcp_server.py (1)
45-47: Timeout configuration wiring is good; summary could include new recovery statusesPassing
_min_timeout/_max_timeoutintoSubagentManagerand droppingtimeout_secondsfrom the MCP interface is a clean way to centralize timeout policy and prevent models from picking bad values.One behavioral nit:
completed_but_timeoutandpartialresults are not reflected in the summary counters, which still only track"completed"and"timeout". Sincecompleted_but_timeoutis a successful recovery andpartialis a distinct state, you may want to:
- Treat
"completed_but_timeout"as completed in the summary, and/or- Add separate counts for
"partial"and"completed_but_timeout".This would make the high‑level summary better match the richer per‑subagent statuses you now expose.
Also applies to: 50-67, 94-99, 138-149, 199-204, 218-227, 330-356
.github/workflows/release-docs-automation.yml (1)
21-26: Quote GitHub output/summary file paths for shell robustnessThe validation logic is solid, but shellcheck is right to flag unquoted
$GITHUB_OUTPUT/$GITHUB_STEP_SUMMARY. Quoting avoids surprises if these paths ever contain spaces and silences SC2086.You can safely tighten this without changing behavior:
Suggested shell quoting tweaks
-echo "tag=$TAG" >> $GITHUB_OUTPUT -echo "version=${TAG#v}" >> $GITHUB_OUTPUT +echo "tag=$TAG" >> "$GITHUB_OUTPUT" +echo "version=${TAG#v}" >> "$GITHUB_OUTPUT" @@ - echo "has_errors=true" >> $GITHUB_OUTPUT + echo "has_errors=true" >> "$GITHUB_OUTPUT" @@ - echo "has_errors=false" >> $GITHUB_OUTPUT + echo "has_errors=false" >> "$GITHUB_OUTPUT" @@ - echo "has_warnings=true" >> $GITHUB_OUTPUT + echo "has_warnings=true" >> "$GITHUB_OUTPUT" @@ - echo "has_warnings=false" >> $GITHUB_OUTPUT + echo "has_warnings=false" >> "$GITHUB_OUTPUT" @@ - echo "## 📋 Release Documentation Validation" >> $GITHUB_STEP_SUMMARY + echo "## 📋 Release Documentation Validation" >> "$GITHUB_STEP_SUMMARY" @@ - echo "Release documentation validation failed. See logs above." + echo "Release documentation validation failed. See logs above."(Apply the same quoting pattern to all
>> $GITHUB_STEP_SUMMARYlines.)Also applies to: 28-88, 90-118, 119-123
massgen/subagent/models.py (1)
355-414: Type hint mismatch fortoken_usageparameter.The
token_usageparameter typeOptional[Dict[str, Any]]is inconsistent with the class field typeDict[str, int](line 241). Since token usage includesestimated_cost(a float), consider updating the field type annotation to match actual usage:🔎 Suggested fix
# In SubagentResult class definition (around line 241) - token_usage: Dict[str, int] = field(default_factory=dict) + token_usage: Dict[str, Any] = field(default_factory=dict)massgen/subagent/manager.py (1)
245-256: Consider narrowing the exception type for CONTEXT.md copy.The broad
Exceptioncatch (line 254) works but could be more specific. Sinceshutil.copy2typically raisesOSError(or subclasses likePermissionError,FileNotFoundError), narrowing the catch would improve clarity while maintaining the graceful degradation behavior.🔎 Optional refinement
try: shutil.copy2(context_md, dst) copied.append("CONTEXT.md") logger.info(f"[SubagentManager] Auto-copied CONTEXT.md for {subagent_id}") - except Exception as e: + except OSError as e: logger.warning(f"[SubagentManager] Failed to copy CONTEXT.md: {e}")
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (40)
.coderabbit.yaml.github/workflows/pypi-publish.yml.github/workflows/release-docs-automation.yml.gitignoreCLAUDE.mdCONTRIBUTING.mddocs/announcements/README.mddocs/announcements/archive/.gitkeepdocs/announcements/current-release.mddocs/announcements/feature-highlights.mddocs/source/user_guide/advanced/subagents.rstmassgen/agent_config.pymassgen/backend/base_with_custom_tool_and_mcp.pymassgen/mcp_tools/subagent/_subagent_mcp_server.pymassgen/orchestrator.pymassgen/persona_generator.pymassgen/skills/release-prep/SKILL.mdmassgen/subagent/manager.pymassgen/subagent/models.pymassgen/system_message_builder.pymassgen/system_prompt_sections.pymassgen/tests/test_subagent_cancellation_recovery.pymassgen/tool/_multimodal_tools/generation/generate_media.pymassgen/tool/_multimodal_tools/read_media.pymassgen/tool/_multimodal_tools/understand_audio.pymassgen/tool/_multimodal_tools/understand_file.pymassgen/tool/_multimodal_tools/understand_image.pymassgen/tool/_multimodal_tools/understand_video.pyopenspec/AGENTS.mdopenspec/changes/add-unified-project-context/design.mdopenspec/changes/add-unified-project-context/proposal.mdopenspec/changes/add-unified-project-context/specs/task-context/spec.mdopenspec/changes/add-unified-project-context/tasks.mdopenspec/changes/fix-subagent-cancellation-recovery/proposal.mdopenspec/changes/fix-subagent-cancellation-recovery/specs/subagent-orchestration/spec.mdopenspec/changes/fix-subagent-cancellation-recovery/specs/subagent-status-consolidation/spec.mdopenspec/changes/fix-subagent-cancellation-recovery/tasks.mdopenspec/project.mdscripts/precommit_sync_readme_wrapper.shscripts/precommit_validate_configs_wrapper.sh
🧰 Additional context used
📓 Path-based instructions (6)
massgen/skills/**/*.md
⚙️ CodeRabbit configuration file
massgen/skills/**/*.md: Skill documentation review:
- Verify SKILL.md frontmatter is complete (name, description)
- Check workflow steps are clear and actionable
- Ensure reference files are accurate
Files:
massgen/skills/release-prep/SKILL.md
massgen/**/*.py
⚙️ CodeRabbit configuration file
massgen/**/*.py: Review Python code for:
- PEP 8 compliance and Google-style docstrings
- Proper async/await patterns
- Type hints where appropriate
- No security vulnerabilities (command injection, etc.)
- Consistency with existing code patterns
Files:
massgen/tool/_multimodal_tools/understand_video.pymassgen/tool/_multimodal_tools/understand_file.pymassgen/orchestrator.pymassgen/tool/_multimodal_tools/generation/generate_media.pymassgen/tool/_multimodal_tools/understand_audio.pymassgen/tool/_multimodal_tools/read_media.pymassgen/system_message_builder.pymassgen/tool/_multimodal_tools/understand_image.pymassgen/agent_config.pymassgen/backend/base_with_custom_tool_and_mcp.pymassgen/subagent/models.pymassgen/system_prompt_sections.pymassgen/tests/test_subagent_cancellation_recovery.pymassgen/persona_generator.pymassgen/mcp_tools/subagent/_subagent_mcp_server.pymassgen/subagent/manager.py
docs/source/**/*.rst
⚙️ CodeRabbit configuration file
docs/source/**/*.rst: RST documentation review:
- Ensure proper RST syntax
- Check cross-references are valid
- Verify code examples are runnable and match current implementation
- For new features, ensure docs/source/index.rst "Recent Releases" is updated
Documentation quality:
- Keep explanations concise - avoid bloated prose
- Remove redundant or duplicate information
- Ensure file paths and parameter names match actual code
- Flag any examples that may be outdated or inconsistent with implementation
Files:
docs/source/user_guide/advanced/subagents.rst
docs/source/user_guide/**/*.rst
⚙️ CodeRabbit configuration file
docs/source/user_guide/**/*.rst: User guide documentation review:Completeness:
- Clear explanations with working code examples
- Expected output where applicable
- Links to related configuration examples
Consistency with implementation:
- Parameter names must match actual code
- File paths must be accurate
- Behavior descriptions must match what the code does
- Flag any discrepancies between docs and implementation
Usability:
- Include runnable commands users can try immediately
- Provide architecture diagrams for complex features
- Show expected output so users know what to expect
Conciseness:
- Avoid over-documentation and bloated explanations
- One clear explanation is better than multiple redundant ones
- Remove filler text and unnecessary verbosity
Files:
docs/source/user_guide/advanced/subagents.rst
massgen/backend/**/*.py
⚙️ CodeRabbit configuration file
massgen/backend/**/*.py: Backend code review - CHECK FOR THESE UPDATES:
- If adding new model support → ensure massgen/backend/capabilities.py is updated
- If adding new capabilities → update supported_capabilities in capabilities.py
- Verify inheritance from base classes is correct
- Check streaming buffer usage where applicable
- Ensure error handling follows existing patterns
Files:
massgen/backend/base_with_custom_tool_and_mcp.py
massgen/tests/**/*.py
⚙️ CodeRabbit configuration file
massgen/tests/**/*.py: Verify tests use pytest best practices:
- @pytest.mark.expensive for API-calling tests
- @pytest.mark.docker for Docker-dependent tests
- @pytest.mark.asyncio for async tests
- Clear test descriptions and proper fixtures
Files:
massgen/tests/test_subagent_cancellation_recovery.py
🧬 Code graph analysis (5)
massgen/tool/_multimodal_tools/generation/generate_media.py (3)
massgen/tool/_multimodal_tools/generation/_base.py (1)
GenerationConfig(27-56)massgen/tool/_decorators.py (1)
context_params(7-52)massgen/tool/_result.py (2)
ExecutionResult(47-69)TextContent(23-27)
massgen/tool/_multimodal_tools/read_media.py (4)
massgen/tool/_decorators.py (1)
context_params(7-52)massgen/tool/_multimodal_tools/understand_image.py (1)
understand_image(43-322)massgen/tool/_multimodal_tools/understand_audio.py (1)
understand_audio(159-367)massgen/tool/_multimodal_tools/understand_video.py (1)
understand_video(443-662)
massgen/system_message_builder.py (1)
massgen/system_prompt_sections.py (2)
TaskContextSection(2280-2340)add_section(2363-2374)
massgen/persona_generator.py (1)
massgen/structured_logging.py (4)
debug(284-290)info(276-282)error(300-306)warning(292-298)
massgen/subagent/manager.py (1)
massgen/subagent/models.py (6)
SubagentConfig(21-127)create(50-93)SubagentResult(212-414)create_error(333-353)SubagentState(478-506)create_timeout_with_recovery(356-414)
🪛 actionlint (1.7.9)
.github/workflows/pypi-publish.yml
34-34: shellcheck reported issue in this script: SC2086:info:1:44: Double quote to prevent globbing and word splitting
(shellcheck)
34-34: shellcheck reported issue in this script: SC2086:info:2:55: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:10:19: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:11:79: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:12:15: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:1:49: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:2:12: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:3:61: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:4:12: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:5:106: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:6:12: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:7:56: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:8:12: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2086:info:9:24: Double quote to prevent globbing and word splitting
(shellcheck)
75-75: shellcheck reported issue in this script: SC2129:style:1:1: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
.github/workflows/release-docs-automation.yml
30-30: shellcheck reported issue in this script: SC2086:info:48:29: Double quote to prevent globbing and word splitting
(shellcheck)
30-30: shellcheck reported issue in this script: SC2086:info:50:30: Double quote to prevent globbing and word splitting
(shellcheck)
30-30: shellcheck reported issue in this script: SC2086:info:55:31: Double quote to prevent globbing and word splitting
(shellcheck)
30-30: shellcheck reported issue in this script: SC2086:info:57:32: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:10:86: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:12:52: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:13:14: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:14:39: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:16:37: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:17:14: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:18:52: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:21:12: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:22:29: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:23:58: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:24:55: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:25:30: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:2:49: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:3:12: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:4:29: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:5:12: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:8:37: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2086:info:9:14: Double quote to prevent globbing and word splitting
(shellcheck)
92-92: shellcheck reported issue in this script: SC2129:style:12:3: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
92-92: shellcheck reported issue in this script: SC2129:style:16:3: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
92-92: shellcheck reported issue in this script: SC2129:style:21:1: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
92-92: shellcheck reported issue in this script: SC2129:style:2:1: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
92-92: shellcheck reported issue in this script: SC2129:style:8:3: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
🪛 LanguageTool
CLAUDE.md
[style] ~192-~192: This adverb was used twice in the sentence. Consider removing one of them or replacing them with a synonym.
Context: ...d changes - --type committed - Review only committed changes - --base develop - ...
(ADVERB_REPETITION_PREMIUM)
openspec/AGENTS.md
[style] ~30-~30: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ..."I want to create a spec proposal" - "I want to create a spec" Loose matching guidance...
(REP_WANT_TO_VB)
openspec/changes/add-unified-project-context/specs/task-context/spec.md
[style] ~19-~19: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...ng multimodal tools - WHEN an agent needs to use read_media or generate_media - **TH...
(REP_NEED_TO_VB)
🪛 markdownlint-cli2 (0.18.1)
openspec/changes/fix-subagent-cancellation-recovery/specs/subagent-status-consolidation/spec.md
92-92: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
150-150: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
massgen/skills/release-prep/SKILL.md
18-18: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
164-164: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
184-184: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
openspec/changes/add-unified-project-context/design.md
73-73: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
docs/announcements/README.md
19-19: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
openspec/changes/fix-subagent-cancellation-recovery/proposal.md
20-20: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
29-29: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
docs/announcements/feature-highlights.md
11-11: Bare URL used
(MD034, no-bare-urls)
13-13: Bare URL used
(MD034, no-bare-urls)
15-15: Bare URL used
(MD034, no-bare-urls)
17-17: Bare URL used
(MD034, no-bare-urls)
19-19: Bare URL used
(MD034, no-bare-urls)
21-21: Bare URL used
(MD034, no-bare-urls)
23-23: Bare URL used
(MD034, no-bare-urls)
25-25: Bare URL used
(MD034, no-bare-urls)
27-27: Bare URL used
(MD034, no-bare-urls)
openspec/AGENTS.md
43-43: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
125-125: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
147-147: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
293-293: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
297-297: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
301-301: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
351-351: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
openspec/changes/add-unified-project-context/specs/task-context/spec.md
45-45: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
docs/announcements/current-release.md
24-24: Bare URL used
(MD034, no-bare-urls)
25-25: Bare URL used
(MD034, no-bare-urls)
26-26: Bare URL used
(MD034, no-bare-urls)
40-40: Bare URL used
(MD034, no-bare-urls)
🪛 Ruff (0.14.10)
massgen/subagent/manager.py
254-254: Do not catch blind exception: Exception
(BLE001)
1151-1151: Do not catch blind exception: Exception
(BLE001)
openspec/changes/fix-subagent-cancellation-recovery/specs/subagent-status-consolidation/spec.md
Show resolved
Hide resolved
openspec/changes/fix-subagent-cancellation-recovery/specs/subagent-status-consolidation/spec.md
Outdated
Show resolved
Hide resolved
fix: Fix responses compression input issue
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
massgen/backend/response.py (1)
183-234: LGTM - Compression recovery flow enhanced with better user feedback.The implementation correctly handles context length errors by:
- Gathering the streaming buffer for partial response content
- Compressing messages via LLM-based summarization
- Removing
previous_response_idto prevent server-side context re-accumulation (which would defeat compression)- Retrying with compressed messages
- Reporting success with accurate input counts
The logic is sound and the user-facing messages are informative.
Note: The compression recovery logic is duplicated between
_stream_without_custom_and_mcp_tools(lines 183-234) and_stream_with_custom_and_mcp_tools(lines 395-452). Consider extracting the common compression flow into a helper method to improve maintainability and reduce the risk of divergent bug fixes.💡 Suggested refactor to reduce duplication
async def _handle_compression_recovery( self, e: Exception, messages: List[Dict[str, Any]], all_params: Dict[str, Any], tools: List[Dict[str, Any]], agent_id: Optional[str], context_label: str = "", ) -> Tuple[Any, List[Dict[str, Any]]]: """Handle compression recovery for context length errors. Returns: Tuple of (retry_stream, compressed_messages) """ logger.warning( f"[{self.get_provider_name()}] Context length exceeded{context_label}, " f"attempting compression recovery...", ) # Notify user yield StreamChunk( type="compression_status", status="compressing", content=f"\n📦 [Compression] Context limit exceeded - compressing {len(messages)} messages...", source=agent_id, ) # Compress messages buffer_for_compression = self._get_streaming_buffer() compressed_messages = await self._compress_messages_for_context_recovery( messages, buffer_content=buffer_for_compression, ) # Remove previous_response_id compression_params = {k: v for k, v in all_params.items() if k != "previous_response_id"} api_params = await self.api_params_handler.build_api_params( compressed_messages, tools, compression_params, ) # Retry client = self._create_client(**all_params) stream = await client.responses.create(**api_params) # Notify success input_count = len(compressed_messages) if compressed_messages else 0 result_msg = f"✅ [Compression] Recovered via summarization ({input_count} items) - continuing..." yield StreamChunk( type="compression_status", status="compression_complete", content=result_msg, source=agent_id, ) logger.info( f"[{self.get_provider_name()}] Compression recovery successful via summarization " f"({input_count} items)", ) return stream, compressed_messagesThen call it from both locations:
if is_context_length_error(e) and not _compression_retry: stream, compressed_messages = await self._handle_compression_recovery( e, processed_messages, all_params, tools, agent_id, context_label=" in MCP mode" ) current_messages = compressed_messages # Update for subsequent recursions if needed
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
massgen/backend/base_with_custom_tool_and_mcp.pymassgen/backend/response.py
🚧 Files skipped from review as they are similar to previous changes (1)
- massgen/backend/base_with_custom_tool_and_mcp.py
🧰 Additional context used
📓 Path-based instructions (2)
massgen/**/*.py
⚙️ CodeRabbit configuration file
massgen/**/*.py: Review Python code for:
- PEP 8 compliance and Google-style docstrings
- Proper async/await patterns
- Type hints where appropriate
- No security vulnerabilities (command injection, etc.)
- Consistency with existing code patterns
Files:
massgen/backend/response.py
massgen/backend/**/*.py
⚙️ CodeRabbit configuration file
massgen/backend/**/*.py: Backend code review - CHECK FOR THESE UPDATES:
- If adding new model support → ensure massgen/backend/capabilities.py is updated
- If adding new capabilities → update supported_capabilities in capabilities.py
- Verify inheritance from base classes is correct
- Check streaming buffer usage where applicable
- Ensure error handling follows existing patterns
Files:
massgen/backend/response.py
🧬 Code graph analysis (1)
massgen/backend/response.py (3)
massgen/backend/_streaming_buffer_mixin.py (1)
_get_streaming_buffer(183-189)massgen/backend/base_with_custom_tool_and_mcp.py (1)
_compress_messages_for_context_recovery(2890-2924)massgen/api_params_handler/_response_api_params_handler.py (1)
build_api_params(64-176)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: pre-commit (macos-14, 3.10)
- GitHub Check: pre-commit (macos-latest, 3.11)
- GitHub Check: pre-commit (macos-latest, 3.10)
- GitHub Check: pre-commit (macos-latest, 3.12)
- GitHub Check: pre-commit (macos-14, 3.10)
🔇 Additional comments (3)
massgen/backend/response.py (3)
395-452: LGTM - Consistent compression recovery implementation.This block mirrors the compression recovery logic from
_stream_without_custom_and_mcp_toolswith appropriate adaptations for the MCP tool execution path. The implementation is correct and properly:
- Updates
current_messagesfor subsequent recursive calls (line 432)- Tracks API timing with
end_api_call_timing(line 402)- Provides context-appropriate log messages ("in MCP mode")
The duplication concern noted in the previous comment applies here as well.
587-605: LGTM - Clearer logic for response item handling.The introduction of the
will_use_previous_response_idvariable (line 595) significantly improves readability by making the conditional logic explicit:
- When
previous_response_idwill be used, the API automatically includes all items from that response- When staying stateless (compression retry), items must be manually added to avoid missing reasoning items
The extended comments (lines 587-593) provide valuable context about reasoning model requirements and the rationale for the conditional injection. This defensive approach prevents "Duplicate item found with id" errors while ensuring reasoning continuity.
1488-1520: LGTM - Improved docstring clarity.The updated docstring accurately describes the compression mechanism as "LLM-based summarization" (line 1493) and provides a more precise return description (line 1500). This aligns with the implementation details and the comments added in the compression recovery blocks.
The method signature and implementation remain consistent with the parent class (base_with_custom_tool_and_mcp.py).
There was a problem hiding this comment.
Actionable comments posted: 2
Fix all issues with AI Agents 🤖
In @CLAUDE.md:
- Around line 1-337: The CLAUDE.md doc references missing files/dirs and a
misnamed skill; update the doc to either create or stop referencing them: remove
or correct the reference to docs/scripts/generate_backend_tables.py (or add that
script to docs/scripts if intended), remove or create the missing directories
docs/announcements/, docs/dev_notes/, and docs/source/ (or replace their
mentions with the actual existing paths), and fix the symlink instruction to use
the real skill directory name evolving-skill-creator instead of skill-creator so
the command and examples match the repository.
In @massgen/skills/pr-checks/SKILL.md:
- Around line 156-162: Add a short explanatory comment above the gh api graphql
mutation that tells the reader to replace THREAD_ID with the actual review
thread ID obtained from the /pr-comments output (step 3) and that the
resolveReviewThread mutation will mark that review thread as resolved after
addressing CodeRabbit's suggestion; reference the mutation name
resolveReviewThread and the placeholder THREAD_ID so it's clear what to
substitute and where the ID comes from.
🧹 Nitpick comments (5)
massgen/skills/pr-checks/SKILL.md (1)
83-83: Minor: Templates reference Claude-specific tooling.The PR description and commit templates (lines 83 and 218–220) include footers referencing "Claude Code" and Claude authorship. While these are in template form meant to be customized, consider clarifying that users should adapt these for their development environment and team conventions.
For example, you might note inline:
🤖 Generated with [Claude Code](https://claude.com/claude-code) ← Adapt this footer to your team's conventionsAlso applies to: 218-220
massgen/agent_config.py (1)
146-148: Consider simplifying f-string concatenation.The adjacent f-strings on line 147 could be combined into a single f-string for clarity:
f"subagent_min_timeout ({self.subagent_min_timeout}) must be <= subagent_max_timeout ({self.subagent_max_timeout})"This also applies to line 152. Minor stylistic improvement—not blocking.
massgen/orchestrator.py (2)
870-923: Timeout min/max propagation looks correct; consider documenting expected units and valid ranges.The new
min_timeout/max_timeoutdefaults and coordination-config overrides are wired cleanly into the subagent MCP args, and the CLI flag names are consistent within this file. The only thing I’d consider is making it explicit (in config docs or a brief comment) that these values are seconds and should satisfy0 < min_timeout <= default_timeout <= max_timeoutso misconfigurations are easier to spot and don’t rely solely on_clamp_timeout()downstream.
1788-1898: Reading subagent metrics fromfull_logs/status.jsonis aligned; maybe distinguish “phase” vs “status”.Using
full_logs/status.jsonas the single source of truth and pullingcosts.total_*,meta.elapsed_seconds, andmeta.questionlooks robust and matches the new status file shape. One minor nit: you currently expose the coordinationphaseunder the key"status", which is fine for internal metrics but can be confusing next to the spec’s notion of a simplified status (running/pending/failed/etc.). Either rename this field to"phase"or add a short comment that, in this metrics summary,"status"reflects raw coordination phase rather than the model-facing derived status.openspec/changes/fix-subagent-cancellation-recovery/specs/subagent-status-consolidation/spec.md (1)
65-86: Clarify source ofelapsed_secondsin the status transformation example.In the “Status transformation” scenario, the input example JSON doesn’t show a
metablock, but the derived status includeselapsed_seconds. To avoid confusion, consider either adding ametasection (withelapsed_seconds) to the input example or a short note that this field comes frommeta.elapsed_secondsinfull_logs/status.json, matching the implementation.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
CLAUDE.mddocs/source/user_guide/advanced/subagents.rstmassgen/agent_config.pymassgen/orchestrator.pymassgen/skills/pr-checks/SKILL.mdmassgen/subagent/models.pyopenspec/changes/fix-subagent-cancellation-recovery/specs/subagent-status-consolidation/spec.md
🧰 Additional context used
📓 Path-based instructions (4)
docs/source/**/*.rst
⚙️ CodeRabbit configuration file
docs/source/**/*.rst: RST documentation review:
- Ensure proper RST syntax
- Check cross-references are valid
- Verify code examples are runnable and match current implementation
- For new features, ensure docs/source/index.rst "Recent Releases" is updated
Documentation quality:
- Keep explanations concise - avoid bloated prose
- Remove redundant or duplicate information
- Ensure file paths and parameter names match actual code
- Flag any examples that may be outdated or inconsistent with implementation
Files:
docs/source/user_guide/advanced/subagents.rst
docs/source/user_guide/**/*.rst
⚙️ CodeRabbit configuration file
docs/source/user_guide/**/*.rst: User guide documentation review:Completeness:
- Clear explanations with working code examples
- Expected output where applicable
- Links to related configuration examples
Consistency with implementation:
- Parameter names must match actual code
- File paths must be accurate
- Behavior descriptions must match what the code does
- Flag any discrepancies between docs and implementation
Usability:
- Include runnable commands users can try immediately
- Provide architecture diagrams for complex features
- Show expected output so users know what to expect
Conciseness:
- Avoid over-documentation and bloated explanations
- One clear explanation is better than multiple redundant ones
- Remove filler text and unnecessary verbosity
Files:
docs/source/user_guide/advanced/subagents.rst
massgen/skills/**/*.md
⚙️ CodeRabbit configuration file
massgen/skills/**/*.md: Skill documentation review:
- Verify SKILL.md frontmatter is complete (name, description)
- Check workflow steps are clear and actionable
- Ensure reference files are accurate
Files:
massgen/skills/pr-checks/SKILL.md
massgen/**/*.py
⚙️ CodeRabbit configuration file
massgen/**/*.py: Review Python code for:
- PEP 8 compliance and Google-style docstrings
- Proper async/await patterns
- Type hints where appropriate
- No security vulnerabilities (command injection, etc.)
- Consistency with existing code patterns
Files:
massgen/agent_config.pymassgen/orchestrator.pymassgen/subagent/models.py
🧬 Code graph analysis (2)
massgen/agent_config.py (1)
massgen/structured_logging.py (1)
warning(292-298)
massgen/subagent/models.py (1)
massgen/agent_config.py (1)
to_dict(875-914)
🪛 LanguageTool
CLAUDE.md
[style] ~192-~192: This adverb was used twice in the sentence. Consider removing one of them or replacing them with a synonym.
Context: ...d changes - --type committed - Review only committed changes - --base develop - ...
(ADVERB_REPETITION_PREMIUM)
🪛 Ruff (0.14.10)
massgen/agent_config.py
140-140: Avoid specifying long messages outside the exception class
(TRY003)
143-143: Avoid specifying long messages outside the exception class
(TRY003)
146-148: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: pre-commit (macos-14, 3.10)
- GitHub Check: pre-commit (macos-14, 3.11)
- GitHub Check: pre-commit (macos-latest, 3.11)
🔇 Additional comments (17)
CLAUDE.md (1)
191-192: Duplicate comment on adverbial structure.This flagged item was already dismissed in a previous review as intentional parallel structure for documentation clarity. The "only" adverbs in both lines maintain consistent structure across the options list.
massgen/skills/pr-checks/SKILL.md (1)
200-204: The script reference is correct—scripts/validate_all_configs.pyexists and the invocation pattern is consistent with other steps in the workflow.docs/source/user_guide/advanced/subagents.rst (6)
77-85: LGTM!The timeout configuration documentation is clear and matches the implementation defaults in
agent_config.py(min=60, max=600). The note about clamping behavior provides helpful context for users.
210-241: LGTM!The Status Values table provides clear documentation of all possible status outcomes with appropriate success flags and descriptions. The note about
completed_but_timeoutbeing a success case is helpful for users who might otherwise treat it as a failure.
296-302: LGTM - past concern addressed.The
completion_percentagedocumentation now correctly describes it as a continuous 0-100 value with the formula(answers + votes) / (agents × 2) × 100. The percentages are appropriately labeled as approximate milestones rather than discrete phase markers.
346-378: LGTM - past concern addressed.The Mixed Results example now includes all fields from the actual return structure:
workspace,execution_time_seconds,token_usage,completion_percentageper result, and thesummaryobject at top level. The example effectively demonstrates handling of mixed outcomes (completed, completed_but_timeout, timeout).
465-479: LGTM - past concern addressed.The
check_subagent_statusexample now includes all fields:task,workspace,started_at, along with the existing status and metrics fields. This matches the actual transformed output from_transform_orchestrator_status.
389-429: The fix has been confirmed. The_collect_subagent_costs()method correctly reads fromfull_logs/status.json(line 1814) and uses the proper field paths:costs.total_input_tokens,costs.total_output_tokens,costs.total_estimated_cost,meta.elapsed_seconds, andcoordination.phase. The implementation aligns with the documentation's claim thatfull_logs/status.jsonis the single source of truth.massgen/agent_config.py (3)
94-95: LGTM!The field docstrings clearly describe the purpose of each timeout bound.
125-126: LGTM!The default values (60s min, 600s max) are reasonable bounds that match the documentation. The inline comments clarify the human-readable durations.
135-153: LGTM - past concern addressed.The
_validate_timeout_configmethod implements comprehensive validation:
- Positive value checks for both min and max
- Logical ordering constraint (min ≤ max)
- Warning when default falls outside the configured bounds
This provides early feedback to users at config time, complementing the runtime clamping in
SubagentManager.massgen/subagent/models.py (6)
147-208: LGTM: enable_web_search field properly implemented.The
enable_web_searchfield is correctly added with:
- Optional type with None default for inheritance semantics
- Conditional serialization in
to_dict(only when not None) for backwards compatibility- Clear documentation of inheritance behavior
218-284: LGTM: SubagentResult expanded with recovery semantics.The expanded status literals and new fields properly support timeout recovery scenarios:
- Clear documentation distinguishes "completed_but_timeout" vs "partial" vs "timeout"
completion_percentageandwarningfields properly typed and serialized- Backwards-compatible serialization (conditional inclusion in
to_dict)
286-353: LGTM: Factory methods consistently updated with warning parameter.All factory methods (
create_success,create_timeout,create_error) consistently accept an optionalwarningparameter and properly pass it to the constructor.
468-475: Verify: SubagentPointer doesn't distinguish "partial" status.Line 470 maps
SubagentResultstatuses toSubagentPointerstatuses. The new "partial" status (indicating partial work with no winner) gets mapped to "failed" becauseresult.success = Falseandresult.status != "timeout". This loses the distinction between partial work and outright failure.While this may be an intentional simplification (SubagentPointer's status type doesn't include "partial"), consider whether tracking partial completions separately would be valuable for debugging and observability.
477-506: LGTM: SubagentState status expanded consistently.The status literal in
SubagentStateis correctly expanded to include "completed_but_timeout" and "partial", maintaining consistency with theSubagentResultstatus values.
355-414: The parameter type at line 363 is alreadyOptional[Dict[str, Any]]— the correct type for token usage data that includes float values likeestimated_cost.The review comment incorrectly claims line 363 uses
Optional[Dict[str, Any]]due to a type mismatch with the field. However, examination of the code shows:
- The parameter is already
Optional[Dict[str, Any]](line 363)- Tests demonstrate token_usage contains float values (e.g.,
"estimated_cost": 0.048142at line 57-61 of test file)- The manager constructs token_usage with
"estimated_cost": costs.get("total_estimated_cost", 0.0), explicitly using float defaults- The field definition at line 241 incorrectly declares
Dict[str, int], which cannot accommodate these floatsThe error message set at line 409 even when
success=Trueis intentional design—the test at line 71 explicitly asserts this behavior ("Still notes it was a timeout").No changes are needed for this method; the type annotation is correct.
Likely an incorrect or invalid review comment.
| # Find most recent log directory | ||
| log_dirs = sorted(massgen_logs.glob("log_*"), reverse=True) | ||
| for log_dir in log_dirs: | ||
| nested_status = log_dir / "turn_1" / "attempt_1" / "status.json" | ||
| if nested_status.exists(): | ||
| status_file = nested_status | ||
| break |
There was a problem hiding this comment.
This is not necessarily true because we could've run another massgen cmd afterwards so most recent doesn't belong here.
PR: Recover partial persona output on timeout
Linear Issue
Summary
When persona generation subagent times out, the system now checks for valid
personas.jsonfiles produced before the timeout instead of immediately falling back to generic personas.Problem
The persona generation feature uses a 5-minute timeout. When
diversity_mode: implementationtriggers a nested multi-agent orchestration, this often exceeds the timeout. Previously, the code only checked for output files whenresult.successwas True, causing the system to fall back to generic personas (analytical/creative/systematic) even when valid task-specific personas had been generated.Changes
massgen/persona_generator.pyCheck output files regardless of success status (lines 615-659)
personas.jsoneven when subagent fails/times outEnhanced
_find_personas_jsonsearch patterns (lines 452-469)full_logs/final/agent_*/workspace/- completed workspacesfull_logs/agent_*/*/*/- timestamped run directoriesworkspace/snapshots/agent_*/- snapshot locationsworkspace/agent_*/- direct workspace directoriesworkspace/temp/agent_*/agent*/- temp directories from nested agentsTesting
log_20260102_120203_588712that had the timeout issuepytest massgen/tests/ -k persona)Before/After
Before (generic fallback):
After (recovered task-specific):
PR: Fix subagent cancellation recovery and status consolidation
Linear Issues
Summary
This PR fixes subagent timeout handling to properly recover completed work and consolidates the status.json files to a single source of truth.
Problems Fixed
full_logs/status.json, the code was overwriting this with hardcoded"timeout"or"failed"statusChanges
Status File Consolidation
Removed the outer
status.jsonfile (subagents/{id}/status.json). Now there is a single source of truth:full_logs/status.jsonwritten by the subagent's Orchestrator.Before:
After:
massgen/subagent/manager.py_write_status()method - No longer writing outer status.jsonget_subagent_status()- Now reads fromfull_logs/status.json_transform_orchestrator_status()- Transforms rich Orchestrator status into simplified view for MCPmin_timeoutandmax_timeoutparameters_clamp_timeout()method - Consistent timeout clampingfull_logs/{agentId}/{timestamp}/answer.txtfor persisted answersmassgen/subagent/models.pySUBAGENT_MIN_TIMEOUTfrom 300 to 60 (now a default, not hardcoded)SubagentConfig(moved to manager level)massgen/agent_config.pysubagent_min_timeoutandsubagent_max_timeouttoCoordinationConfigmassgen/orchestrator.pysubagent_min_timeoutandsubagent_max_timeoutto subagent MCP servermassgen/mcp_tools/subagent/_subagent_mcp_server.py--min-timeoutand--max-timeoutCLI argumentstimeout_secondsparameter fromspawn_subagentstoolDocumentation
docs/source/user_guide/advanced/subagents.rstwith new directory structure and status valuesSpecs
openspec/changes/fix-subagent-cancellation-recovery/specs/subagent-orchestration/spec.md- Recovery requirementsopenspec/changes/fix-subagent-cancellation-recovery/specs/subagent-status-consolidation/spec.md- Status consolidationTesting
Status Values
completedtruecompleted_but_timeouttruepartialfalsetimeoutfalseerrorfalseBefore/After
Before (bug):
{ "status": "failed", "token_usage": {}, "answer": null }After (fixed):
{ "status": "completed_but_timeout", "token_usage": {"input_tokens": 50000, "output_tokens": 3000, "estimated_cost": 0.05}, "answer": "Research completed. Created movies.md with...", "completion_percentage": 100 }PR: Release Automation and CodeRabbit Integration
Closes MAS-195
Summary
Adds automated release workflows, CodeRabbit AI code review configuration, and a
release-prepskill to streamline the release process.Changes
CodeRabbit Configuration
.coderabbit.yaml- AI code review configuration with:dev/v0.1.X)Release Automation
.github/workflows/pypi-publish.yml- New workflow triggered on GitHub Release publish:.github/workflows/release-docs-automation.yml- Modified to validate announcement files exist on tag pushAnnouncement System
docs/announcements/README.md- Documents the announcement workflowdocs/announcements/feature-highlights.md- Long-lived feature list (~3k chars for LinkedIn)docs/announcements/current-release.md- Template for active release announcementdocs/announcements/archive/- Directory for past announcementsRelease Prep Skill
massgen/skills/release-prep/SKILL.md- Claude Code skill that:Documentation Updates
CLAUDE.md- Added release workflow section, now tracked in gitCONTRIBUTING.md- Added sections for:.gitignore- RemovedCLAUDE.mdfrom ignore listFiles Changed
.coderabbit.yaml.github/workflows/pypi-publish.yml.github/workflows/release-docs-automation.yml.gitignoreCLAUDE.mdCONTRIBUTING.mddocs/announcements/README.mddocs/announcements/feature-highlights.mddocs/announcements/current-release.mddocs/announcements/archive/.gitkeepmassgen/skills/release-prep/SKILL.mdSecrets Required
PYPI_API_TOKENREADTHEDOCS_API_TOKENUsage
Release Workflow
release-prep v0.1.X- generates CHANGELOG, announcement, screenshot suggestionsgit tag v0.1.X && git push origin v0.1.XCodeRabbit Commands
In PR comments:
@coderabbitai review- Trigger incremental review@coderabbitai resolve- Mark all comments as resolvedSummary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.
` - Regenerate PR summaryPR: Unified Task Context for Tools and Subagents
Linear Issue
Summary
Custom tools (
read_media,generate_media) and subagents making external API calls now receive task context, preventing hallucinations like GPT-4.1 interpreting "MassGen" as "Massachusetts General Hospital."Problem
External API calls from multimodal tools and subagents had no project context. When analyzing images or generating media for a "MassGen website," external models would misinterpret terminology because they lacked context about what the user was building.
Solution
Main agents dynamically create a
CONTEXT.mdfile before spawning subagents or using multimodal tools. This file is automatically:Changes
New Files
massgen/context/__init__.pymassgen/context/task_context.pyload_task_context(),load_task_context_with_warning(),format_prompt_with_context(),TaskContextResultclassopenspec/changes/add-unified-project-context/Modified Files
massgen/system_prompt_sections.pyTaskContextSectionwith instructions for creating CONTEXT.mdmassgen/system_message_builder.pymassgen/backend/base_with_custom_tool_and_mcp.pytask_contextfield to ExecutionContextmassgen/tool/_multimodal_tools/read_media.pymassgen/tool/_multimodal_tools/understand_image.pymassgen/tool/_multimodal_tools/understand_audio.pymassgen/tool/_multimodal_tools/understand_video.pymassgen/tool/_multimodal_tools/understand_file.pymassgen/tool/_multimodal_tools/generation/generate_media.pymassgen/subagent/manager.pyload_task_context(), propagate truncation warning to SubagentResultmassgen/subagent/models.pywarningfield to SubagentResult, updated all factory methods to accept warning parametermassgen/tests/test_generate_media_batch.pySubagent System Prompt Enhancement
Added
SUBAGENT RELIABILITYsection to main agent's subagent delegation instructions:Key Design Decisions
Context Injection Format
Example CONTEXT.md (agent-generated)
Testing
Verified with real MassGen runs:
[TaskContext] Loaded 441 chars from .../CONTEXT.md- dynamic loading worksgenerate_mediasucceeds with"success": trueBefore/After
Before (hallucination):
After (grounded):