feat: Comprehensive logfire and analysis command by ncrispino · Pull Request #761 · massgen/MassGen

ncrispino · 2026-01-07T00:05:36Z

PR: Enhanced Log Analysis CLI

Closes MAS-225, MAS-227

Summary

This PR adds the massgen logs analyze command for generating log analysis prompts and running multi-agent self-analysis, along with improvements to massgen logs list with analysis status tracking.

Changes

New Features

`massgen logs analyze` Command

Prompt mode (default): Generates a prompt referencing the massgen-log-analyzer skill for use in coding CLIs
Self-analysis mode (--mode self): Runs MassGen with a 3-agent team to analyze logs from different perspectives
Supports --log-dir to specify a log directory (defaults to latest)
Supports --ui flag to choose UI mode: automation, rich_terminal (default), or webui
Supports --config for custom analysis configuration
Supports --turn/-t to specify which turn to analyze (defaults to latest turn)
Supports --force/-f to overwrite existing reports without prompting

Per-Turn Analysis Reports (MAS-227)

Reports are now placed at turn_N/ANALYSIS_REPORT.md (per-turn) instead of turn_N/attempt_M/ANALYSIS_REPORT.md (per-attempt)
Each turn (coordination session) gets its own analysis report as a sibling to attempt directories
Overwrite handling:
- Interactive mode: Prompts "Report exists for turn_N. Overwrite? [y/N]"
- With --force: Overwrites without prompting
- Automation mode without --force: Errors with guidance to use --force

Enhanced `massgen logs list`

New "Analyzed" column showing which logs have ANALYSIS_REPORT.md
--analyzed flag to filter to only analyzed logs
--unanalyzed flag to filter to only unanalyzed logs

Multi-Agent Log Analysis Config

New massgen/configs/analysis/log_analysis.yaml for 3-agent analysis
Agents use Gemini Flash with Docker execution and MCP tools
Dynamically injects context_paths for safe log file access (read-only log dir, write access only for report)
Automatically disables Logfire MCP if LOGFIRE_READ_TOKEN not set

`direct_mcp_servers` Config Option

New YAML parameter to keep specific MCP servers as direct protocol tools when using code-based tools
When enable_code_based_tools: true, user MCP servers are normally filtered to code-only access
direct_mcp_servers exempts specified servers, keeping them callable as native tools in the prompt
Useful for debugging/monitoring tools (e.g., Logfire) that need immediate access
Subagents automatically inherit direct_mcp_servers from parent
Logs warning if a server in direct_mcp_servers isn't found in mcp_servers

Example:

enable_code_based_tools: true
direct_mcp_servers:
  - logfire  # Kept as native tool in prompt
mcp_servers:
  - name: logfire
    type: stdio
    command: uvx
    args: ["logfire-mcp@latest"]

Bug Fixes

Fixed subprocess execution in self-analysis to use uv run massgen instead of sys.executable -m massgen
Config loader now returns both expanded and raw configs to avoid logging secrets in execution metadata
Fixed unknown tool handling in orchestrator: Unknown/malformed tool names (e.g., Gemini's default_api: prefix) no longer cause agent termination. Only client-provided external tools trigger the external tool call path; unknown tools are now logged and skipped gracefully.
Fixed vote-only mode wasting rounds (MAS-225): When agents reach max_new_answers_per_agent, they now correctly enter vote-only mode where:
- System message no longer mentions new_answer tool (only vote)
- Tools array excludes new_answer
- Internal tool name filtering uses agent-specific tools (prevents hallucinated new_answer calls from passing validation)
- Previously, agents would waste a full API round attempting new_answer before being rejected

Files Changed

File	Changes
`massgen/cli.py`	Added `analyze` subparser with `--mode`, `--log-dir`, `--ui`, `--config` flags; added `--analyzed`/`--unanalyzed` to list; config loader returns raw config for safe logging
`massgen/logs_analyzer.py`	Added `has_analysis_report()`, `display_analyze_prompt()`, `run_self_analysis()`; enhanced `display_list()` with analysis status column and filtering
`massgen/configs/analysis/log_analysis.yaml`	New 3-agent config for log analysis with Docker, MCP, Logfire support; uses `direct_mcp_servers` for logfire
`massgen/skills/massgen-log-analyzer/SKILL.md`	Updated with CLI quick reference section
`massgen/orchestrator.py`	Fixed unknown tool handling; added vote-only mode check before system message build; fixed `internal_tool_names` to use agent-specific tools
`massgen/system_message_builder.py`	Added `vote_only` parameter to `build_coordination_message()`
`massgen/system_prompt_sections.py`	Added `vote_only` parameter to `EvaluationSection` with vote-only content
`massgen/filesystem_manager/_filesystem_manager.py`	Added `direct_mcp_servers` parameter; skip direct MCPs in code generation; validation warning
`massgen/backend/base_with_custom_tool_and_mcp.py`	Updated filtering to preserve direct MCP servers as protocol tools
`massgen/backend/base.py`	Added `direct_mcp_servers` to filesystem_params and exclusion set
`massgen/api_params_handler/_api_params_handler_base.py`	Added `direct_mcp_servers` to exclusion set
`massgen/subagent/manager.py`	Added `direct_mcp_servers` to inheritance list
`docs/source/reference/yaml_schema.rst`	Added `direct_mcp_servers` parameter documentation
`docs/source/user_guide/tools/code_based_tools.rst`	Added "Direct MCP Servers" section
`massgen/configs/tools/mcp/direct_mcp_servers_example.yaml`	New example config demonstrating the feature

Usage Examples

# List logs with analysis status
massgen logs list                    # Shows "Analyzed" column
massgen logs list --analyzed         # Only logs with ANALYSIS_REPORT.md
massgen logs list --unanalyzed       # Only logs needing analysis

# Generate analysis prompt (for use in coding CLI)
massgen logs analyze                 # Analyze latest turn of latest log
massgen logs analyze --log-dir PATH  # Analyze specific log
massgen logs analyze --turn 1        # Analyze specific turn

# Run multi-agent self-analysis
massgen logs analyze --mode self                  # 3-agent analysis team (prompts if report exists)
massgen logs analyze --mode self --force          # Overwrite existing report without prompting
massgen logs analyze --mode self --turn 2         # Analyze specific turn
massgen logs analyze --mode self --ui automation --force  # Headless mode (--force required)
massgen logs analyze --mode self --config PATH    # Custom config

Test Plan

Notes

Self-analysis mode runs MassGen as a subprocess with the analysis config
Log directory is mounted read-only to protect existing files
Only ANALYSIS_REPORT.md path has write permission
Logfire MCP is automatically disabled if LOGFIRE_READ_TOKEN not set
Reports are placed per-turn at turn_N/ANALYSIS_REPORT.md, not per-attempt

Summary by CodeRabbit

New Features
- Logs: added --analyzed / --unanalyzed filters and a new logs analyze command with "prompt" and "self" modes
- New config option to mark selected MCP servers as direct protocol tools
- Self-analysis workflow to generate ANALYSIS_REPORT.md and run multi-agent analyses
- Vote-only coordination mode for orchestration
Bug Fixes
- Metadata saving now uses raw/unexpanded config to avoid logging expanded secrets
Documentation
- Expanded CLI, user guide, schema, analysis, and observability docs and examples

_{✏️ Tip: You can customize this high-level summary in your review settings.}

…ermissions fix for Gemini; add jq to Docker

coderabbitai · 2026-01-07T00:05:51Z

📝 Walkthrough

Walkthrough

Preserves raw (unexpanded) config for metadata, adds direct_mcp_servers to bypass code-based MCP filtering, introduces log analysis and self-analysis CLI workflows, expands observability/logging attributes (MAS-199), adds vote-only coordination mode, and threads log paths through coordination/subagent flows; includes docs, configs, and Docker packaging tweaks.

Changes

Cohort / File(s)	Summary
Documentation `docs/source/reference/cli.rst`, `docs/source/reference/yaml_schema.rst`, `docs/source/user_guide/logging.rst`, `docs/source/user_guide/tools/code_based_tools.rst`, `massgen/skills/massgen-log-analyzer/SKILL.md`	Doc updates: CLI logs docs (list filters, `logs analyze`, analyze modes/UI/config), YAML schema documents new `direct_mcp_servers`, expanded log-analysis guidance and templates.
Analysis Configs `massgen/configs/analysis/log_analysis.yaml`, `massgen/configs/analysis/log_analysis_cli.yaml`	New multi-agent analysis configs for self-analysis, context_paths injection notes, orchestrator/timeouts and UI preferences.
CLI & Raw Config Handling `massgen/cli.py`, `massgen/__init__.py`, `massgen/frontend/web/server.py`	`load_config_file` now returns `(expanded_config, raw_config)`; `raw_config_for_metadata` threaded through interactive, run, and web flows to avoid logging expanded secrets; callsites updated.
Logs Analyzer & CLI Integration `massgen/logs_analyzer.py`	New LogAnalyzer, ANALYSIS_REPORT detection, natural sorting, list filtering (analyzed/unanalyzed), prompt display, and `run_self_analysis` (inject temp config, spawn MassGen subprocess); CLI wiring for `logs list`/`logs analyze`.
Direct MCP Servers `massgen/filesystem_manager/_filesystem_manager.py`, `massgen/backend/base.py`, `massgen/backend/base_with_custom_tool_and_mcp.py`, `massgen/api_params_handler/_api_params_handler_base.py`, `massgen/subagent/manager.py`	Added `direct_mcp_servers` config and propagation: filesystem manager skips listed MCPs during code-tool extraction; backend recognizes/excludes key; MCP filtering retains framework OR direct MCPs; subagent YAML generation passes setting; API param exclusion updated.
Observability & Structured Logging `massgen/structured_logging.py`, `massgen/structured_logging_utils.py`, `massgen/coordination_tracker.py`, `massgen/orchestrator.py`	MAS-199 attributes and helpers: truncation utilities, preview constants, MAX_FILES_CREATED; new logging params (log_path, agent_log_path, answer_path, error_context, restart triggers, agents_with_answers, file/workspace telemetry); propagate log_path through coordination/orchestrator; added `log_agent_workspace_files`.
Vote-Only / Prompt Sections `massgen/system_prompt_sections.py`, `massgen/system_message_builder.py`	Added `vote_only` flag to EvaluationSection and propagated into build_coordination_message to support vote-only workflows; added GrokGuidanceSection.
Gemini Backend `massgen/backend/gemini.py`	Added Gemini tool-name normalization; removed `GeminiBackend._setup_permission_hooks` override.
Frontend / Web Server `massgen/frontend/web/server.py`	Use raw (unexpanded) config for metadata when starting coordination runs; callsites updated.
Docker Images `massgen/docker/Dockerfile`, `massgen/docker/Dockerfile.sudo`	Added `jq` to system packages.
Proposals & Specs `openspec/changes/add-logfire-workflow-analysis/...`	New proposal, spec, and tasks describing Logfire workflow-analysis attributes, truncation rules, and implementation tasks.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant User as User/CLI
    participant Logs as LogAnalyzer
    participant FS as Filesystem (logs)
    participant Subproc as MassGen subprocess
    participant Logfire as Logfire (optional)

    Note over User,Logs: User runs `massgen logs analyze --mode self`
    User->>Logs: analyze(log_dir, mode, config?, ui_mode, turn, force)
    Logs->>FS: validate log_dir, check for ANALYSIS_REPORT.md
    alt prompt mode
      Logs->>User: render generated analysis prompt
    else self-analysis mode
      Logs->>Logs: prepare temp config (inject context_paths, UI adjustments)
      Logs->>Subproc: spawn MassGen subprocess with temp config
      Subproc->>FS: read logs/snapshots
      Subproc->>Subproc: run multi-agent analysis (agents/orchestrator)
      Subproc->>FS: write ANALYSIS_REPORT.md
      par optional publish
        Subproc->>Logfire: push traces/results
      end
      Subproc-->>Logs: exit status & output
      Logs->>FS: load ANALYSIS_REPORT.md and display results to User
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: v0.1.33 #724 — Overlaps edits to get_base_excluded_params in the API params handler.
feat: Fix subagent responses #740 — Overlaps subagent orchestration/status and subagent config propagation changes.
feat: Reactive Context Compression, Tool Result Eviction, Coordination Enhancements, Bug Fixes #697 — Related MCP/tool-handling and coordination/vote-only behavior changes.

Suggested reviewers

a5507203

🚥 Pre-merge checks | ✅ 4 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Documentation Updated	⚠️ Warning	PR lacks architectural design document in docs/dev_notes/ for significant MAS-199 workflow analysis feature and vote_only mode user documentation.	Add design doc at docs/dev_notes/logfire_workflow_analysis_design.md and user guide section explaining vote_only mode behavior and limitations.
Capabilities Registry Check	⚠️ Warning	PR introduces new Gemini 3 models in config files but token_manager.py lacks pricing data for these models, causing token cost tracking to fail.	Update token_manager.py to add pricing entries for gemini-3-flash-preview and gemini-3-pro-preview models, then run test suite to verify token tracking.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title clearly and concisely describes the main feature addition: comprehensive logfire and analysis command. It is specific, directly related to the primary changes (new logs analyze command, logfire workflow integration), and follows the required format.
Description check	✅ Passed	The PR description is comprehensive and well-structured, covering summary, new features, bug fixes, files changed, usage examples, test plan, and notes. It includes links to related issues, specific command examples, and clear explanations of the direct_mcp_servers feature and vote-only mode fixes.
Docstring Coverage	✅ Passed	Docstring coverage is 88.52% which is sufficient. The required threshold is 80.00%.
Config Parameter Sync	✅ Passed	The new `direct_mcp_servers` YAML parameter has been correctly added to both required locations: `massgen/backend/base.py` lines 85-98 and `massgen/api_params_handler/_api_params_handler_base.py` lines 40-53.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8952077 and 9d7ed12.

📒 Files selected for processing (4)

massgen/backend/base_with_custom_tool_and_mcp.py
massgen/backend/gemini.py
massgen/system_message_builder.py
massgen/system_prompt_sections.py

🚧 Files skipped from review as they are similar to previous changes (1)

massgen/backend/base_with_custom_tool_and_mcp.py

🧰 Additional context used

📓 Path-based instructions (4)

massgen/backend/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Backend implementations must inherit from base.py and register in backend/init.py

Files:

massgen/backend/gemini.py

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always use --automation flag for programmatic MassGen execution
Use Google-style docstrings for auto-generating API documentation

For new or changed functions, include Google-style docstrings

Files:

massgen/backend/gemini.py
massgen/system_prompt_sections.py
massgen/system_message_builder.py

massgen/**/*.py

⚙️ CodeRabbit configuration file

massgen/**/*.py: Review Python code for:

PEP 8 compliance and Google-style docstrings

Proper async/await patterns

Type hints where appropriate

No security vulnerabilities (command injection, etc.)

Consistency with existing code patterns

Files:

massgen/backend/gemini.py
massgen/system_prompt_sections.py
massgen/system_message_builder.py

massgen/backend/**/*.py

⚙️ CodeRabbit configuration file

massgen/backend/**/*.py: Backend code review - CHECK FOR THESE UPDATES:

If adding new model support → ensure massgen/backend/capabilities.py is updated

If adding new capabilities → update supported_capabilities in capabilities.py

Verify inheritance from base classes is correct

Check streaming buffer usage where applicable

Ensure error handling follows existing patterns

Files:

massgen/backend/gemini.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: docs/source/development/writing_configs.rst:0-0
Timestamp: 2026-01-07T03:58:40.702Z
Learning: Applies to docs/source/development/examples/tools/**/*.yaml : When using MCP tools, configure mcp_servers under agent.backend and do not add a filesystem MCP server; filesystem access is via cwd

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: docs/source/development/writing_configs.rst:0-0
Timestamp: 2026-01-07T03:58:40.702Z
Learning: Applies to docs/source/development/examples/tools/**/*.yaml : For multi-agent configs, configure voting_sensitivity and coordination only under orchestrator; do not duplicate per agent

📚 Learning: 2026-01-07T03:57:17.438Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: docs/dev_notes/release_checklist.md:0-0
Timestamp: 2026-01-07T03:57:17.438Z
Learning: Applies to docs/dev_notes/massgen/config_builder.py : Update `massgen/config_builder.py` to add new YAML parameters to interactive wizard, parameter validation, and help text when new configuration options are added

Applied to files:

massgen/system_prompt_sections.py
massgen/system_message_builder.py

🧬 Code graph analysis (1)

massgen/system_message_builder.py (1)

massgen/system_prompt_sections.py (2)

GrokGuidanceSection (249-276)

add_section (2408-2419)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: pre-commit (macos-14, 3.10)
GitHub Check: pre-commit (macos-14, 3.12)
GitHub Check: pre-commit (macos-latest, 3.10)
GitHub Check: pre-commit (macos-latest, 3.10)
GitHub Check: pre-commit (macos-14, 3.10)

🔇 Additional comments (10)

massgen/system_message_builder.py (4)

29-29: LGTM!

Import is correctly added and used for the Grok-specific guidance section.

96-97: LGTM!

The vote_only parameter is properly added with a sensible default, correct type hint, and well-documented in the docstring following Google-style format.

Also applies to: 113-114

135-141: LGTM!

The logic correctly adds GPT5GuidanceSection for both GPT-5 and Grok models (both benefit from solution persistence and tool preamble guidance), while additionally adding GrokGuidanceSection specifically for Grok models to address the HTML-escaping issue.

149-155: LGTM!

The vote_only parameter is correctly propagated to EvaluationSection, enabling the vote-only mode when the agent has reached their maximum answer count.

massgen/system_prompt_sections.py (3)

249-276: LGTM!

The GrokGuidanceSection class is well-implemented:

Follows the established pattern of GPT5GuidanceSection

Priority 4 aligns with related guidance sections

Docstring is comprehensive and explains the known Grok 4.1 issue

Content provides clear, actionable instructions to prevent HTML-escaping of file content

1495-1511: LGTM!

The vote_only parameter is correctly added to EvaluationSection:

Docstring is updated with proper description

Default value of False maintains backward compatibility

Instance variable is properly stored for use in build_content

1513-1527: LGTM!

The vote-only mode implementation is correct:

Early return pattern keeps the code clean

Prompt is clear that only the vote tool is available

Timestamp is included for context (consistent with the normal evaluation prompt)

The message appropriately instructs the agent that they've reached their answer limit

massgen/backend/gemini.py (3)

69-91: LGTM! Clean implementation of tool name normalization.

The function correctly strips Gemini-specific prefixes and is well-documented with a Google-style docstring. Using a tuple for prefixes allows easy extension if additional prefixes are discovered.

757-758: LGTM! Normalization correctly applied at extraction point.

The tool name is normalized immediately upon extraction from the Gemini API response, ensuring consistent matching against registered tools.

1343-1344: Consistent normalization of tool names across all extraction points.

All tool name extractions from Gemini API responses correctly apply the _normalize_gemini_tool_name() function. The normalization is applied consistently at both extraction points (lines 758 and 1344), properly stripping Gemini-specific prefixes like 'default_api:' to match registered tools.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In @massgen/configs/analysis/log_analysis_cli.yaml:
- Around line 122-151: Remove the hard-coded, user-specific path from
orchestrator.context_paths and replace it with an empty list or a runtime
placeholder so the config is portable; update the context_paths entry under the
orchestrator section (symbol: orchestrator.context_paths / context_paths) to be
[] or a clearly-named placeholder to be populated when running with --mode self,
ensuring no absolute personal file paths remain committed.

In @massgen/logs_analyzer.py:
- Around line 616-618: The current selection of the first attempt directory uses
lexicographic sorting via sorted(attempt_dirs)[0], which misorders multi-digit
turns/attempts (e.g., turn_10 vs turn_2); replace this with a natural sort so
attempt_dir picks the true earliest attempt: either use
natsorted(attempt_dirs)[0] (from natsort) or implement a key that tokenizes the
path string into numeric and non-numeric parts (e.g., splitting with a regex and
converting digit tokens to int) and pass that as the key to sorted, then set
report_path = attempt_dir / "ANALYSIS_REPORT.md" as before.

In @massgen/skills/massgen-log-analyzer/SKILL.md:
- Around line 68-77: The markdown table under "Key Local Log Files:" in SKILL.md
lacks surrounding blank lines which breaks proper rendering; edit the SKILL.md
content around the table (the block containing `metrics_summary.json`,
`coordination_events.json`, etc.) and add a blank line before the table and a
blank line after the table so it is separated from surrounding
paragraphs/headings.

In @massgen/structured_logging.py:
- Around line 762-764: Create a new module named
massgen.structured_logging_utils and define the missing constants referenced by
massgen/structured_logging.py: PREVIEW_ERROR_CONTEXT, PREVIEW_RESTART_REASON,
PREVIEW_ANSWER_EACH, PREVIEW_VOTE_REASON, PREVIEW_SUBAGENT_TASK, and
MAX_FILES_CREATED; export them with sensible default values (strings for the
PREVIEW_* constants and an integer for MAX_FILES_CREATED) so that the imports in
structured_logging.py (e.g., the from massgen.structured_logging_utils import
PREVIEW_ERROR_CONTEXT lines and the other seven import sites) succeed without
raising ModuleNotFoundError.

🧹 Nitpick comments (19)

massgen/skills/massgen-log-analyzer/SKILL.md (3)
619-639: Clarify working directory for analysis commands.

The shell commands (lines 621-639) assume execution from the log directory but this isn't explicitly stated. Consider adding a note at the beginning of this section to clarify that commands should be run from [log_dir]/turn_1/attempt_1/.

Example addition:
### Analysis Commands

**Note:** Run these commands from the log directory (`[log_dir]/turn_1/attempt_1/`).
860-876: Review phase categorization logic in efficiency query.

The CASE statement at lines 861-865 has a potential logic issue:

agent.%.round_0 matches initial_answer (line 862)

agent.%.round_% matches voting (line 863)

Since round_0 matches both patterns, the order matters. SQL CASE statements are evaluated top-to-bottom, so round_0 will correctly match as 'initial_answer'. However, this is subtle and could be clarified.

Consider making the patterns more explicit:
CASE
  WHEN span_name LIKE 'agent.%.round_0' THEN 'initial_answer'
  WHEN span_name LIKE 'agent.%.round_%' AND span_name NOT LIKE 'agent.%.round_0' THEN 'voting'
  WHEN span_name LIKE 'agent.%.presentation' THEN 'presentation'
  ELSE 'other'
END as phase
Or add a comment clarifying the evaluation order dependency.

93-93: Consider wrapping bare URLs for markdown compliance (optional).

Static analysis flagged the bare URLs at lines 93 and 107. While acceptable in instructional documentation, you could optionally wrap them in angle brackets for markdown compliance: <https://logfire.pydantic.dev/>

Also applies to: 107-107
massgen/filesystem_manager/_filesystem_manager.py (1)
1282-1289: Helpful validation for misconfigured direct_mcp_servers.

The warning for servers listed in direct_mcp_servers but not configured in mcp_servers helps catch configuration errors early. This is a good defensive check.

One minor observation: the warning message formatting uses f-string concatenation which could be simplified.
🔎 Optional: Simplify the warning message formatting
                 if direct_server not in configured_server_names:
                     logger.warning(
-                        f"[FilesystemManager] direct_mcp_servers contains '{direct_server}' " f"but no MCP server with that name is configured in mcp_servers",
+                        f"[FilesystemManager] direct_mcp_servers contains '{direct_server}' "
+                        f"but no MCP server with that name is configured in mcp_servers",
                     )
openspec/changes/add-logfire-workflow-analysis/tasks.md (1)

1-64: Checklist is comprehensive; track remaining manual validation

The tasks file is well-structured and maps cleanly onto the proposal (constants, logging hooks, coordination tracker, file references, docs). The only remaining item is 10.3 (manual verification with Logfire); once you’ve completed that, it’d be good to flip it to [x] so OpenSpec reflects reality.
massgen/configs/analysis/log_analysis_cli.yaml (2)
1-15: Tighten usage comments for automation

Since this config is intended for scripted/CLI analysis, consider documenting the --automation flag in the usage example so downstream tooling invokes MassGen in automation mode:
Suggested tweak to usage header
-#   uv run massgen --config massgen/configs/analysis/log_analysis_cli.yaml "Analyze the log directory at …"
+#   uv run massgen --automation --config massgen/configs/analysis/log_analysis_cli.yaml \
+#     "Analyze the log directory at …"
16-121: Config structure is consistent; minor model & doc polish

Backend-level cwd and orchestrator-level context_paths placement is correct.

All three agents share identical backend/system settings, which is what you want for a symmetric multi‑agent analysis config.

You might consider defaulting to a more cost-effective Gemini model (e.g., gemini-2.5-flash) unless there’s a specific need for gemini-3-flash-preview.

If you expect others to reuse this, a short “What happens” comment summarizing the 3-agent analysis flow (who plans, who critiques, who synthesizes) near the header would make the config more self-explanatory.
massgen/coordination_tracker.py (2)
515-582: Answer-path construction is correct; could leverage pathlib for robustness

The new block that derives answer_path from self.log_path and the snapshot mapping:
answer_path = None
if self.log_path and snapshot_timestamp:
    relative_path = self._make_snapshot_path("answer", agent_id, snapshot_timestamp)
    answer_path = f"{self.log_path}/{relative_path}"
is functionally correct and will yield the same shape used elsewhere (agent_id/timestamp/answer.txt under the log root). To make this a bit more robust across OS/path edge cases, you could optionally switch to Path-based joining:
Optional pathlib-based refactor
-        answer_path = None
-        if self.log_path and snapshot_timestamp:
-            relative_path = self._make_snapshot_path("answer", agent_id, snapshot_timestamp)
-            answer_path = f"{self.log_path}/{relative_path}"
+        answer_path = None
+        if self.log_path and snapshot_timestamp:
+            relative_path = self._make_snapshot_path("answer", agent_id, snapshot_timestamp)
+            from pathlib import Path
+            answer_path = str(Path(self.log_path) / relative_path)
584-690: Vote-context enrichment is correct; consider breaking early in label lookup

The new MAS-199 vote-context fields are wired sensibly:

agents_with_answers counts agents with at least one answer.

answer_label_mapping maps each iteration_available_labels entry back to its owning agent.

The triple loop to build answer_label_mapping is correct but does more work than needed once a label is found:
answer_label_mapping = {}
for label in self.iteration_available_labels:
    for aid, answers_list in self.answers_by_agent.items():
        for ans in answers_list:
            if ans.label == label:
                answer_label_mapping[label] = aid
                break
You could add an outer break or factor the search into a helper to avoid scanning remaining agents once a label is resolved. This is minor (answer counts are usually small) but would simplify reasoning and avoid redundant iterations.
massgen/backend/base_with_custom_tool_and_mcp.py (1)

2348-2379: Direct MCP server filtering correctly preserves intended protocol tools

The updated filtering that treats both FRAMEWORK_MCPS and filesystem_manager.direct_mcp_servers as protocol MCPs, while pushing other user MCPs to code-based access, is consistent and safe. The conservative fallback of retaining tools with unknown server_name avoids accidental regression, and the logging of removed_tools should make behavior observable during rollout.

massgen/orchestrator.py (3)

1245-1248: Session reinitialization with log_path is correct; consider centralizing initialization

Re‑initializing the coordination session once you have self.current_task and log_path is aligned with MAS‑199 and looks correct. Right now initialize_session is called in __init__, here in chat, and again in handle_restart; functionally this is fine (last call wins), but if initialization becomes heavier you might consider centralizing it (e.g., only from chat/handle_restart) to avoid multiple resets per session.

4191-4192: Tighten “unknown tool” detection to avoid mislabeling valid MCP/custom tools

internal_tool_names is built only from workflow tools, so any other tool name (including valid MCP/custom tools such as mcp__... or custom_tool...) will satisfy tool_name and tool_name not in internal_tool_names and hit the “unknown tool” warning path, even though they’re later handled correctly in the MCP/custom branches. That can produce confusing double logging (“unknown” then “using … tool”) and noisy traces.

Consider narrowing the “unknown” check to names that are genuinely unsupported, e.g.:

Skip the warning for names starting with mcp, containing '__', or starting with custom_tool.

Or, reuse the same predicates you use later (startswith("mcp"), "__" in tool_name, startswith("custom_tool")) to decide which tools are considered known before emitting the warning.

This keeps the new graceful handling for truly hallucinated tools (e.g., default_api:foo) without mislabeling valid back‑end tools.

Also applies to: 4260-4280

5519-5523: Restart tracker re‑init: optionally pass the current task for clearer logs

Recreating CoordinationTracker on restart and seeding it with agent IDs and log_path aligns with the new log‑path‑aware session model. Since self.current_task is already known here, you might also pass it as the question argument (mirroring the call in chat) so the restarted tracker has full context immediately, rather than only after the next chat() re‑initialization.
massgen/cli.py (2)
4370-4378: raw_config_for_metadata plumbing is solid; adjust typing and a small nit

The end-to-end use of raw_config_for_metadata for save_execution_metadata in both interactive and single-question flows looks correct and achieves the goal of logging the raw (unexpanded) config where available.

Two small improvements:

Make the annotation explicitly optional to satisfy PEP 484 / Ruff (RUF013) and improve type-checking:
Suggested type-hint tweak
-async def run_interactive_mode(
-    agents: Dict[str, SingleAgent],
-    ui_config: Dict[str, Any],
-    original_config: Dict[str, Any] = None,
-    orchestrator_cfg: Dict[str, Any] = None,
-    config_path: Optional[str] = None,
-    memory_session_id: Optional[str] = None,
-    initial_question: Optional[str] = None,
-    restore_session_if_exists: bool = False,
-    debug: bool = False,
-    raw_config_for_metadata: Dict[str, Any] = None,
-    **kwargs,
-):
+async def run_interactive_mode(
+    agents: Dict[str, SingleAgent],
+    ui_config: Dict[str, Any],
+    original_config: Dict[str, Any] = None,
+    orchestrator_cfg: Dict[str, Any] = None,
+    config_path: Optional[str] = None,
+    memory_session_id: Optional[str] = None,
+    initial_question: Optional[str] = None,
+    restore_session_if_exists: bool = False,
+    debug: bool = False,
+    raw_config_for_metadata: Optional[Dict[str, Any]] = None,
+    **kwargs,
+):
In the single-question metadata call, the "resolved_path" in locals() check is redundant because resolved_path is always defined at the top of main. You can simplify and slightly clarify the expression:
Suggested config_path simplification
-            save_execution_metadata(
-                query=args.question,
-                config_path=(str(resolved_path) if args.config and "resolved_path" in locals() else None),
-                config_content=raw_config_for_metadata,
-                cli_args=vars(args),
-            )
+            save_execution_metadata(
+                query=args.question,
+                config_path=(str(resolved_path) if args.config and resolved_path else None),
+                config_content=raw_config_for_metadata,
+                cli_args=vars(args),
+            )
Overall, the metadata logging changes themselves look correct.

Also applies to: 4954-4964, 5128-5129, 5227-5228, 5450-5456, 5470-5487

5672-5681: logs list/analyze flags align with the new workflow; consider tightening CLI semantics

The additions for --analyzed / --unanalyzed and the new logs analyze subcommand fit the described log-analysis workflow and are wired into the dedicated logs_command parser cleanly.

Two refinements to consider:

Make --analyzed and --unanalyzed mutually exclusive at the parser level, so contradictory filters are rejected early instead of relying on downstream logic:
Suggested mutually exclusive filter group for logs list
-        list_parser = logs_subparsers.add_parser("list", help="List recent runs")
-        list_parser.add_argument(
-            "--limit",
-            type=int,
-            default=10,
-            help="Number of runs to show",
-        )
-        list_parser.add_argument(
-            "--analyzed",
-            action="store_true",
-            help="Show only logs with ANALYSIS_REPORT.md",
-        )
-        list_parser.add_argument(
-            "--unanalyzed",
-            action="store_true",
-            help="Show only logs without ANALYSIS_REPORT.md",
-        )
+        list_parser = logs_subparsers.add_parser("list", help="List recent runs")
+        list_parser.add_argument(
+            "--limit",
+            type=int,
+            default=10,
+            help="Number of runs to show",
+        )
+
+        filter_group = list_parser.add_mutually_exclusive_group()
+        filter_group.add_argument(
+            "--analyzed",
+            action="store_true",
+            help="Show only logs with ANALYSIS_REPORT.md",
+        )
+        filter_group.add_argument(
+            "--unanalyzed",
+            action="store_true",
+            help="Show only logs without ANALYSIS_REPORT.md",
+        )
For logs analyze, --config and --ui only make sense in --mode self. Ensuring logs_command warns or errors when they’re passed with --mode prompt would help prevent accidental misuse, even if it’s just a runtime validation rather than additional argparse wiring.

Also applies to: 5695-5721
massgen/logs_analyzer.py (4)
478-483: Consider catching specific exceptions instead of bare Exception.

The broad except Exception at line 480 catches all errors during metrics parsing. While this prevents crashes, it may hide unexpected issues. Consider catching more specific exceptions like json.JSONDecodeError and KeyError.
🔎 Suggested improvement
-            except Exception:
+            except (json.JSONDecodeError, KeyError, TypeError) as e:
+                logger.debug(f"Error reading metrics for {log_dir}: {e}")
                 table.add_row(str(i), timestamp, "?", "?", analyzed_str, "[red]Error reading metrics[/red]")
603-608: Consider catching specific exceptions for config loading.

The broad except Exception may hide issues like YAML syntax errors or permission problems. Consider catching specific exceptions and providing more informative error messages.
🔎 Suggested improvement
     try:
         with open(config_path) as f:
             config = yaml.safe_load(f)
-    except Exception as e:
-        console.print(f"[red]Error loading config:[/red] {e}")
+    except yaml.YAMLError as e:
+        console.print(f"[red]Error parsing YAML config:[/red] {e}")
+        return 1
+    except OSError as e:
+        console.print(f"[red]Error reading config file:[/red] {e}")
         return 1
706-711: Consider logging cleanup failures instead of silently ignoring them.

The try-except-pass pattern silently swallows cleanup errors. While cleanup failures shouldn't crash the program, logging them aids debugging.
🔎 Suggested improvement
     finally:
         # Clean up temporary config file
         try:
             Path(tmp_config_path).unlink()
-        except Exception:
-            pass
+        except OSError as e:
+            logger.debug(f"Failed to clean up temp config file: {e}")
Note: This requires adding from loguru import logger at the top of the file if not already present.
392-405: Consider optimizing filtering for large log directories.

The current implementation iterates through all logs before applying the limit. For directories with thousands of logs, this could be slow. Consider early termination once limit matching logs are found.
🔎 Suggested improvement
     # Filter by analysis status if requested
     if analyzed_only or unanalyzed_only:
         filtered_logs = []
         for log_dir in all_logs:
+            if len(filtered_logs) >= limit:
+                break
             has_report = has_analysis_report(log_dir)
             if analyzed_only and has_report:
                 filtered_logs.append(log_dir)
             elif unanalyzed_only and not has_report:
                 filtered_logs.append(log_dir)
-        logs = filtered_logs[:limit]
+        logs = filtered_logs
     else:
         logs = all_logs[:limit]

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68cc406 and 99cc7e4.

📒 Files selected for processing (27)

docs/source/reference/cli.rst
docs/source/reference/yaml_schema.rst
docs/source/user_guide/logging.rst
docs/source/user_guide/tools/code_based_tools.rst
massgen/__init__.py
massgen/api_params_handler/_api_params_handler_base.py
massgen/backend/base.py
massgen/backend/base_with_custom_tool_and_mcp.py
massgen/backend/gemini.py
massgen/cli.py
massgen/configs/analysis/log_analysis.yaml
massgen/configs/analysis/log_analysis_cli.yaml
massgen/coordination_tracker.py
massgen/docker/Dockerfile
massgen/docker/Dockerfile.sudo
massgen/filesystem_manager/_filesystem_manager.py
massgen/frontend/web/server.py
massgen/logs_analyzer.py
massgen/orchestrator.py
massgen/skills/massgen-log-analyzer/SKILL.md
massgen/structured_logging.py
massgen/subagent/manager.py
massgen/system_message_builder.py
massgen/system_prompt_sections.py
openspec/changes/add-logfire-workflow-analysis/proposal.md
openspec/changes/add-logfire-workflow-analysis/specs/observability/spec.md
openspec/changes/add-logfire-workflow-analysis/tasks.md

💤 Files with no reviewable changes (1)

massgen/backend/gemini.py

🧰 Additional context used

📓 Path-based instructions (15)

docs/source/**/*.rst

⚙️ CodeRabbit configuration file

docs/source/**/*.rst: RST documentation review:

Ensure proper RST syntax

Check cross-references are valid

Verify code examples are runnable and match current implementation

For new features, ensure docs/source/index.rst "Recent Releases" is updated

Documentation quality:

Keep explanations concise - avoid bloated prose

Remove redundant or duplicate information

Ensure file paths and parameter names match actual code

Flag any examples that may be outdated or inconsistent with implementation

Files:

docs/source/reference/cli.rst
docs/source/user_guide/logging.rst
docs/source/reference/yaml_schema.rst
docs/source/user_guide/tools/code_based_tools.rst

massgen/configs/**/*.yaml

📄 CodeRabbit inference engine (CLAUDE.md)

Configuration YAML files should be organized in massgen/configs/ directories: basic/, tools/, providers/, and teams/

Files:

massgen/configs/analysis/log_analysis.yaml
massgen/configs/analysis/log_analysis_cli.yaml

⚙️ CodeRabbit configuration file

massgen/configs/**/*.yaml: Validate YAML configs follow MassGen conventions:

Property placement: cwd at BACKEND-level, context_paths at ORCHESTRATOR-level

Prefer cost-effective models (gpt-5-nano, gpt-5-mini, gemini-2.5-flash)

All agents should have identical system_message for multi-agent setups

Never reference legacy paths or massgen v1

Include "What happens" comments explaining execution flow

Files:

massgen/configs/analysis/log_analysis.yaml
massgen/configs/analysis/log_analysis_cli.yaml

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always use --automation flag for programmatic MassGen execution
Use Google-style docstrings for auto-generating API documentation

Files:

massgen/coordination_tracker.py
massgen/backend/base.py
massgen/backend/base_with_custom_tool_and_mcp.py
massgen/system_prompt_sections.py
massgen/subagent/manager.py
massgen/__init__.py
massgen/orchestrator.py
massgen/api_params_handler/_api_params_handler_base.py
massgen/frontend/web/server.py
massgen/filesystem_manager/_filesystem_manager.py
massgen/system_message_builder.py
massgen/structured_logging.py
massgen/logs_analyzer.py
massgen/cli.py

massgen/**/*.py

⚙️ CodeRabbit configuration file

massgen/**/*.py: Review Python code for:

PEP 8 compliance and Google-style docstrings

Proper async/await patterns

Type hints where appropriate

No security vulnerabilities (command injection, etc.)

Consistency with existing code patterns

Files:

massgen/coordination_tracker.py
massgen/backend/base.py
massgen/backend/base_with_custom_tool_and_mcp.py
massgen/system_prompt_sections.py
massgen/subagent/manager.py
massgen/__init__.py
massgen/orchestrator.py
massgen/api_params_handler/_api_params_handler_base.py
massgen/frontend/web/server.py
massgen/filesystem_manager/_filesystem_manager.py
massgen/system_message_builder.py
massgen/structured_logging.py
massgen/logs_analyzer.py
massgen/cli.py

openspec/changes/*/specs/**/*.md

📄 CodeRabbit inference engine (openspec/AGENTS.md)

openspec/changes/*/specs/**/*.md: Use ## ADDED|MODIFIED|REMOVED|RENAMED Requirements headers in spec delta files
Include at least one #### Scenario: per requirement in spec delta files using the format #### Scenario: [Name] with bullets - **WHEN** ... - **THEN** ...
Use #### Scenario: (4 hashtags) for scenario headers, not 3 hashtags, bold, or bullet points
Use SHALL/MUST for normative requirements in spec delta files, avoid should/may unless intentionally non-normative
When using MODIFIED for a requirement, copy the entire existing requirement block and paste under ## MODIFIED Requirements with updated content
In multi-capability changes, create separate delta spec files under changes/[change-id]/specs/[capability]/spec.md for each affected capability
Distinguish between ADDED (new standalone requirement), MODIFIED (changed behavior of existing requirement), and RENAMED (name change) operations in spec deltas

Files:

openspec/changes/add-logfire-workflow-analysis/specs/observability/spec.md

massgen/backend/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Backend implementations must inherit from base.py and register in backend/init.py

Files:

massgen/backend/base.py
massgen/backend/base_with_custom_tool_and_mcp.py

massgen/backend/base.py

📄 CodeRabbit inference engine (CLAUDE.md)

When adding new YAML parameters, update both massgen/backend/base.py and massgen/api_params_handler/_api_params_handler_base.py in get_base_excluded_config_params()

Files:

massgen/backend/base.py

massgen/backend/**/*.py

⚙️ CodeRabbit configuration file

massgen/backend/**/*.py: Backend code review - CHECK FOR THESE UPDATES:

If adding new model support → ensure massgen/backend/capabilities.py is updated

If adding new capabilities → update supported_capabilities in capabilities.py

Verify inheritance from base classes is correct

Check streaming buffer usage where applicable

Ensure error handling follows existing patterns

Files:

massgen/backend/base.py
massgen/backend/base_with_custom_tool_and_mcp.py

openspec/changes/*/tasks.md

📄 CodeRabbit inference engine (openspec/AGENTS.md)

Create tasks.md with numbered implementation checklist in every change proposal

Files:

openspec/changes/add-logfire-workflow-analysis/tasks.md

massgen/skills/**/*.md

⚙️ CodeRabbit configuration file

massgen/skills/**/*.md: Skill documentation review:

Verify SKILL.md frontmatter is complete (name, description)

Check workflow steps are clear and actionable

Ensure reference files are accurate

Files:

massgen/skills/massgen-log-analyzer/SKILL.md

openspec/changes/*/proposal.md

📄 CodeRabbit inference engine (openspec/AGENTS.md)

Create proposal.md with sections: Why, What Changes, and Impact when creating a change proposal

Files:

openspec/changes/add-logfire-workflow-analysis/proposal.md

docs/source/user_guide/*.rst

📄 CodeRabbit inference engine (CLAUDE.md)

For new features, update docs/source/user_guide/ RST files with runnable commands and expected output

Files:

docs/source/user_guide/logging.rst

docs/source/user_guide/**/*.rst

⚙️ CodeRabbit configuration file

docs/source/user_guide/**/*.rst: User guide documentation review:

Completeness:

Clear explanations with working code examples

Expected output where applicable

Links to related configuration examples

Consistency with implementation:

Parameter names must match actual code

File paths must be accurate

Behavior descriptions must match what the code does

Flag any discrepancies between docs and implementation

Usability:

Include runnable commands users can try immediately

Provide architecture diagrams for complex features

Show expected output so users know what to expect

Conciseness:

Avoid over-documentation and bloated explanations

One clear explanation is better than multiple redundant ones

Remove filler text and unnecessary verbosity

Files:

docs/source/user_guide/logging.rst
docs/source/user_guide/tools/code_based_tools.rst

massgen/api_params_handler/_api_params_handler_base.py

📄 CodeRabbit inference engine (CLAUDE.md)

When adding new YAML parameters, update both massgen/backend/base.py and massgen/api_params_handler/_api_params_handler_base.py in get_base_excluded_config_params()

Files:

massgen/api_params_handler/_api_params_handler_base.py

docs/source/reference/yaml_schema.rst

📄 CodeRabbit inference engine (CLAUDE.md)

For new YAML parameters, update docs/source/reference/yaml_schema.rst

Files:

docs/source/reference/yaml_schema.rst

🧠 Learnings (12)

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to massgen/configs/**/*.yaml : Configuration YAML files should be organized in massgen/configs/ directories: basic/, tools/, providers/, and teams/

Applied to files:

massgen/configs/analysis/log_analysis.yaml
massgen/configs/analysis/log_analysis_cli.yaml

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Use OpenSpec proposal format from @/openspec/AGENTS.md for significant changes and reference Linear issue IDs

Applied to files:

openspec/changes/add-logfire-workflow-analysis/specs/observability/spec.md
openspec/changes/add-logfire-workflow-analysis/tasks.md
openspec/changes/add-logfire-workflow-analysis/proposal.md

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Open @/openspec/AGENTS.md when requests mention planning, proposals, specs, changes, plans, new capabilities, breaking changes, or architecture shifts

Applied to files:

openspec/changes/add-logfire-workflow-analysis/specs/observability/spec.md

📚 Learning: 2026-01-05T17:34:18.455Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: openspec/AGENTS.md:0-0
Timestamp: 2026-01-05T17:34:18.455Z
Learning: Applies to openspec/changes/*/specs/**/*.md : Distinguish between ADDED (new standalone requirement), MODIFIED (changed behavior of existing requirement), and RENAMED (name change) operations in spec deltas

Applied to files:

openspec/changes/add-logfire-workflow-analysis/specs/observability/spec.md

📚 Learning: 2026-01-05T17:34:18.455Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: openspec/AGENTS.md:0-0
Timestamp: 2026-01-05T17:34:18.455Z
Learning: Applies to openspec/changes/*/design.md : Create `design.md` only if the change involves cross-cutting concerns, new dependencies, data model changes, security/performance complexity, or architectural ambiguity

Applied to files:

openspec/changes/add-logfire-workflow-analysis/specs/observability/spec.md

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to massgen/backend/base.py : When adding new YAML parameters, update both massgen/backend/base.py and massgen/api_params_handler/_api_params_handler_base.py in get_base_excluded_config_params()

Applied to files:

massgen/backend/base.py
massgen/api_params_handler/_api_params_handler_base.py

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to massgen/api_params_handler/_api_params_handler_base.py : When adding new YAML parameters, update both massgen/backend/base.py and massgen/api_params_handler/_api_params_handler_base.py in get_base_excluded_config_params()

Applied to files:

massgen/backend/base.py
massgen/api_params_handler/_api_params_handler_base.py

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to massgen/backend/capabilities.py : Add new backend capabilities to massgen/backend/capabilities.py

Applied to files:

massgen/backend/base.py

📚 Learning: 2026-01-05T17:34:18.455Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: openspec/AGENTS.md:0-0
Timestamp: 2026-01-05T17:34:18.455Z
Learning: Applies to openspec/changes/*/tasks.md : Create `tasks.md` with numbered implementation checklist in every change proposal

Applied to files:

openspec/changes/add-logfire-workflow-analysis/tasks.md

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to **/*.py : Always use --automation flag for programmatic MassGen execution

Applied to files:

massgen/skills/massgen-log-analyzer/SKILL.md

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to massgen/tool/*/TOOL.md : All custom tools in massgen/tool/ require a TOOL.md file with YAML frontmatter including name, description, category, requires_api_keys, tasks, and keywords

Applied to files:

massgen/skills/massgen-log-analyzer/SKILL.md

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to docs/source/reference/yaml_schema.rst : For new YAML parameters, update docs/source/reference/yaml_schema.rst

Applied to files:

docs/source/reference/yaml_schema.rst

🧬 Code graph analysis (7)

massgen/coordination_tracker.py (1)

massgen/structured_logging.py (3)

trace_coordination_session (904-954)

log_agent_answer (1114-1152)

log_agent_vote (1196-1251)

massgen/backend/base_with_custom_tool_and_mcp.py (1)

massgen/tests/test_code_based_tools_integration.py (1)

filesystem_manager (127-135)

massgen/__init__.py (2)

massgen/cli.py (1)

load_config_file (386-443)

massgen/logger_config.py (1)

save_execution_metadata (262-318)

massgen/frontend/web/server.py (2)

massgen/cli.py (1)

load_config_file (386-443)

massgen/logger_config.py (1)

save_execution_metadata (262-318)

massgen/filesystem_manager/_filesystem_manager.py (1)

massgen/structured_logging.py (1)

warning (292-298)

massgen/logs_analyzer.py (2)

massgen/frontend/displays/rich_terminal_display.py (3)

Console (54-55)

Panel (51-52)

Table (63-64)

massgen/__init__.py (1)

run (262-540)

massgen/cli.py (2)

massgen/v1/config.py (1)

ConfigurationError (24-25)

massgen/logger_config.py (1)

save_execution_metadata (262-318)

🪛 markdownlint-cli2 (0.18.1)

massgen/skills/massgen-log-analyzer/SKILL.md

69-69: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)

93-93: Bare URL used

(MD034, no-bare-urls)

107-107: Bare URL used

(MD034, no-bare-urls)

🪛 Ruff (0.14.10)

massgen/__init__.py

409-409: Avoid specifying long messages outside the exception class

(TRY003)

massgen/logs_analyzer.py

480-480: Do not catch blind exception: Exception

(BLE001)

606-606: Do not catch blind exception: Exception

(BLE001)

689-689: subprocess call: check for execution of untrusted input

(S603)

710-711: try-except-pass detected, consider logging the exception

(S110)

710-710: Do not catch blind exception: Exception

(BLE001)

massgen/cli.py

435-437: Abstract raise to an inner function

(TRY301)

435-437: Avoid specifying long messages outside the exception class

(TRY003)

4370-4370: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: pre-commit (ubuntu-latest, 3.11)
GitHub Check: pre-commit (macos-14, 3.11)
GitHub Check: pre-commit (macos-14, 3.12)
GitHub Check: pre-commit (ubuntu-latest, 3.12)
GitHub Check: pre-commit (macos-latest, 3.10)
GitHub Check: pre-commit (ubuntu-latest, 3.10)
GitHub Check: pre-commit (macos-14, 3.12)
GitHub Check: pre-commit (macos-latest, 3.12)

🔇 Additional comments (48)

massgen/docker/Dockerfile (1)

28-39: LGTM! Clean addition of jq utility.

The addition of jq is appropriate for JSON processing in shell scripts and aligns well with the new log analysis workflows. The inline documentation clearly explains its purpose, and the tool is lightweight and stable.

massgen/docker/Dockerfile.sudo (1)

31-43: LGTM! Consistent with main Dockerfile.

The addition of jq mirrors the change in the main Dockerfile, maintaining consistency between the standard and sudo-enabled variants.

massgen/skills/massgen-log-analyzer/SKILL.md (3)

1-4: Frontmatter is complete and accurate.

The YAML frontmatter includes both required fields (name, description) with clear, comprehensive information. The description accurately reflects the new ANALYSIS_REPORT.md generation capability.

20-48: CLI reference is clear and actionable.

The quick reference provides well-structured examples with clear flag descriptions. The commands follow the pattern of using uv run massgen as documented in coding guidelines.

642-825: Report template is comprehensive and well-structured.

The markdown template provides clear sections with appropriate placeholders and tables. The structure aligns well with the Standard Analysis Questions and provides actionable guidance for analysts.

massgen/subagent/manager.py (1)

693-703: LGTM! Correctly propagates direct_mcp_servers to subagents.

The addition of "direct_mcp_servers" to the code_tools_settings list ensures that this new configuration parameter is properly inherited from parent agents to subagents. This is consistent with how other code-based tools settings are propagated and aligns with the PR's goal of allowing specific MCP servers to remain callable even when code-based tools are enabled.

massgen/system_message_builder.py (1)

84-96: LGTM! Clean integration of vote_only mode.

The addition of the vote_only parameter is well-implemented:

Non-breaking default value (False)

Properly documented in the docstring

Cleanly passed through to EvaluationSection

Type-hinted appropriately

This aligns with the vote-only mode feature described in the PR objectives, enabling agents to be restricted to voting when they've reached their maximum answer count.

Also applies to: 112-113, 144-150

docs/source/user_guide/tools/code_based_tools.rst (1)

284-331: Documentation for Direct MCP Servers is well-structured and comprehensive.

The new section clearly explains:

The purpose of direct_mcp_servers (bypassing code-based filtering)

Practical YAML example with logfire and weather servers

Use cases (debugging tools, frequently-used MCPs)

Subagent inheritance behavior

The example correctly demonstrates the difference between direct protocol tools (logfire) and code-filtered tools (weather).

docs/source/reference/cli.rst (2)

346-377: CLI documentation for log analysis commands is clear and complete.

The new entries properly document:

Filtering options (--analyzed, --unanalyzed)

The analyze subcommand with prompt and self modes

UI mode options and custom config path for self-analysis

The table format is consistent with existing documentation structure.

453-459: Updated example output correctly shows the new Analyzed column.

The example demonstrates the visual indicators (✓ for analyzed, - for unanalyzed) which aligns with the filtering options documented above.

massgen/filesystem_manager/_filesystem_manager.py (2)

70-70: direct_mcp_servers parameter correctly added to FilesystemManager.

The implementation properly:

Accepts the parameter with a sensible default (None → empty list)

Documents it thoroughly with description and example in the docstring

Stores it on the instance for later use in tool schema extraction

Also applies to: 108-110, 128-128

852-856: Correct filtering logic for direct MCP servers.

Direct MCP servers are properly skipped during tool schema extraction, ensuring they remain as protocol tools rather than being converted to code-based tools. The debug logging provides good visibility for troubleshooting.

docs/source/user_guide/logging.rst (2)

411-419: Clear documentation for filtering logs by analysis status.

The code examples show both the visual indicator in listings and the filter flags. This helps users quickly understand how to find analyzed vs unanalyzed logs.

461-499: Comprehensive self-analysis documentation.

The section effectively covers:

Basic usage with massgen logs analyze

Prompt mode for external coding CLIs

Self-analysis mode with 3-agent team

UI mode options

Custom config path

Key details: ANALYSIS_REPORT.md output, read-only log directory safeguard

The note about LOGFIRE_READ_TOKEN is important for users setting up Logfire integration.

docs/source/reference/yaml_schema.rst (1)

697-701: direct_mcp_servers correctly documented in YAML schema reference.

The entry accurately describes:

Type as list

Optional parameter

Supported by all backends with MCP support

Clear explanation of behavior (keeps servers as direct protocol tools)

Practical example values

This aligns with the implementation and other documentation updates. Based on learnings, this is the correct place to document new YAML parameters.

massgen/configs/analysis/log_analysis.yaml (4)

1-13: Good header documentation explaining usage and prerequisites.

The comments clearly document:

Purpose of the config

How to use it (massgen logs analyze --mode self)

That it's for internal use (not direct execution)

Prerequisites (LOGFIRE_READ_TOKEN)

This aligns with the guideline to include "What happens" comments.

14-88: Agent configuration follows best practices.

Uses cost-effective gemini-3-flash-preview model per guidelines

cwd correctly placed at BACKEND-level

direct_mcp_servers: [logfire] aligns with the new feature

Docker configuration is comprehensive

One observation: The PR summary mentions "3 agents" but only 2 are active (agent_a, agent_b). agent_c is commented out. If 3 agents are intended for the "different perspectives" analysis, consider uncommenting agent_c.

Is the 2-agent setup intentional, or should agent_c be uncommented for the described "3 agents analyzing from different perspectives" workflow?

126-158: Orchestrator configuration is well-structured.

context_paths: [] correctly placed at ORCHESTRATOR-level per guidelines

Clear comments explaining dynamic injection for --mode self

Subagent orchestrator uses cost-effective grok-4-1-fast-reasoning

Good coordination settings (skills, task planning, memory)

160-165: Timeout and UI settings are appropriate.

30-minute timeout is reasonable for complex analysis

rich_terminal display aligns with the default UI mode documented in CLI reference

massgen/api_params_handler/_api_params_handler_base.py (1)

93-93: LGTM - direct_mcp_servers correctly added to excluded params.

The parameter is properly grouped with other code-based tools parameters and is consistently updated in both massgen/api_params_handler/_api_params_handler_base.py (line 93) and massgen/backend/base.py (lines 164 and 272), maintaining the required synchronization between these files.

openspec/changes/add-logfire-workflow-analysis/proposal.md (1)

1-73: Proposal structure and scope look solid

The proposal follows the OpenSpec format (Why / What Changes / Impact), clearly enumerates the new attributes and their limits, and references MAS-199 and affected modules. Reads as implementable and internally consistent.

massgen/__init__.py (2)

373-441: Config-source and raw-config tracking logic is consistent

The introduction of raw_config_for_metadata and its population across all config-selection branches (config_dict, models, model+filesystem, config file, lightweight model, default config) looks correct and ensures the logging path always has a non-None config payload when a config exists. Using the second return of load_config_file() only in the file-based branches keeps behavior aligned with the CLI’s (expanded_config, raw_config) contract.

Just keep this branch table in sync if you add new config sources in the future so metadata logging remains consistent.

452-465: Good: metadata now logs raw config instead of expanded secrets

Switching save_execution_metadata(..., config_content=raw_config_for_metadata) to use the unexpanded config is a solid hardening step: it preserves the effective structure for reconstruction while avoiding logging environment-expanded secrets.

No issues from this change as long as load_config_file continues to guarantee (expanded, raw) and callers don’t mutate raw_config_for_metadata in-place later.

massgen/coordination_tracker.py (1)

165-243: Session-level log_path wiring into tracing looks correct

Storing self.log_path during initialize_session() and passing the same value into trace_coordination_session(..., log_path=log_path) cleanly ties MAS-199’s massgen.log_path attribute back to the coordination tracker. This keeps the session span and subsequent per-answer paths consistent without changing existing behavior.

massgen/frontend/web/server.py (2)

4130-4310: History-based coordination now logs raw (unexpanded) config safely

Updating run_coordination_with_history to:

unpack config, raw_config_for_metadata = load_config_file(...), and

pass config_content=raw_config_for_metadata into save_execution_metadata

aligns this path with the new CLI/config contract and avoids logging env-expanded secrets while preserving the true on-disk config. The rest of the function still uses the expanded config for runtime behavior, so semantics are unchanged.

4507-4705: Initial WebUI runs also correctly switched to raw config for metadata

The analogous change in run_coordination—capturing both (config, raw_config_for_metadata) and using the raw config in save_execution_metadata once display.log_session_dir is known—brings the Web UI’s first-turn execution in line with CLI and programmatic flows.

This preserves backward-compatible behavior while improving metadata hygiene; no further issues from this change.

massgen/system_prompt_sections.py (1)

1463-1497: Vote-only evaluation mode is cleanly implemented and safely short-circuited

The new vote_only flag and early-return branch give a clear, self-contained prompt that forbids new_answer while preserving the existing behavior when vote_only is false. No functional or prompt-consistency issues spotted.

openspec/changes/add-logfire-workflow-analysis/specs/observability/spec.md (1)

1-149: Observability spec structure and scenarios align with OpenSpec guidelines

The spec uses the required ## ADDED Requirements header, #### Scenario: blocks with WHEN/THEN bullets, and normative SHALL language, and it clearly defines Logfire attribute names, truncation limits, and presence/omission rules for each case. This is ready to implement against.

As per coding guidelines, this matches the expected OpenSpec delta format.

massgen/backend/base.py (1)

140-171: direct_mcp_servers is correctly threaded through filesystem setup and excluded from backend options

Passing direct_mcp_servers into FilesystemManager and adding it to get_base_excluded_config_params() cleanly separates this new YAML/config knob from provider API params while making it available for MCP/tool filtering logic. This matches the intended pattern for new YAML parameters; assuming _api_params_handler_base.get_base_excluded_config_params() has the same entry, the wiring looks complete. Based on learnings, this is the right place to handle it.

Also applies to: 242-293

massgen/orchestrator.py (2)

3861-3863: Per‑agent agent_log_path wiring into structured logging looks good

Deriving agent_log_path from get_log_session_dir() and threading it into log_agent_round_context gives the tracker enough context to link round metadata to on‑disk logs; the None fallback also keeps this safe when logging is disabled. This hunk looks solid.

Also applies to: 3869-3869

3892-3896: Vote‑only semantics now consistently influence both prompt and tools

Using _is_vote_only_mode(agent_id) to derive vote_only_for_system_message and passing it into build_coordination_message, while separately recomputing vote_only when selecting agent_workflow_tools, keeps the system prompt and available tools in sync for agents that have exhausted their answer budget. This should address prior inconsistencies where vote‑only agents could still see new_answer.

Also applies to: 3907-3907

massgen/cli.py (1)

386-441: load_config_file tuple return and env-var expansion look good

The refactor to return (expanded_config, raw_config) and to run _expand_env_vars on a deepcopy of raw_config cleanly separates runtime vs logging concerns without mutating the original config. Error handling and search order remain consistent with the existing CLI behavior. No changes needed here.

massgen/logs_analyzer.py (5)

8-21: LGTM!

New imports and the DEFAULT_ANALYSIS_CONFIG constant are appropriately added for the new analysis functionality. The path construction using Path(__file__).parent is the correct pattern for locating sibling config files.

40-52: LGTM!

The has_analysis_report function correctly checks for the presence of analysis reports with an efficient early-return pattern. The Google-style docstring is complete with Args and Returns sections.

522-555: LGTM!

The prompt generation functions are clean and focused. Good UX consideration to warn users when an analysis report already exists before displaying the prompt.

688-693: LGTM!

The subprocess invocation is secure: it uses list form (not shell=True), the UI mode is validated against a whitelist, and the config path is generated internally. Passing the environment copy ensures necessary tokens like LOGFIRE_READ_TOKEN are available to the child process.

728-767: LGTM!

The logs_command modifications correctly handle the new analyze subcommand with both "prompt" and "self" modes. Error handling for missing directories is appropriate, and the default "automation" UI mode aligns with the coding guidelines for programmatic MassGen execution.

massgen/structured_logging.py (11)

762-786: LGTM!

The error_context parameter addition follows the existing patterns in this file. The inline import from structured_logging_utils is consistent with other functions in this module, and the truncation logic with conditional attribute emission is correctly implemented.

875-895: LGTM!

The restart_trigger parameter addition is consistent with the MAS-199 workflow attributes pattern. The reason truncation correctly applies to both the log message and the structured attributes.

908-946: LGTM!

The log_path parameter addition to trace_coordination_session correctly propagates the run's log directory path as a workflow attribute for observability.

1075-1111: LGTM!

The round_intent and agent_log_path parameters extend the round context logging with useful workflow analysis attributes. The preview truncation is consistent with the constants from structured_logging_utils.

1114-1145: LGTM!

The answer_path parameter addition enables tracking where agent answers are stored, which is valuable for the hybrid local/Logfire analysis workflow.

1155-1193: LGTM!

The new log_agent_workspace_files function is well-designed for detecting repeated work and understanding agent outputs. The file list limiting with MAX_FILES_CREATED prevents log bloat, and the conditional logging avoids unnecessary empty events.

1196-1244: LGTM!

The agents_with_answers and answer_label_mapping parameters enhance vote logging with context about the voting state. The reason truncation with PREVIEW_VOTE_REASON (500 chars) provides more context than the previous 200-char limit.

1379-1436: LGTM!

The subagent_log_path parameter addition enables tracking subagent log locations. The task preview is correctly applied to both legacy (subagent.task_preview) and new (massgen.subagent.task) attributes for backward compatibility.

1439-1492: LGTM!

The subagent_log_path parameter addition to log_subagent_spawn is consistent with the trace_subagent_execution changes. The task preview truncation follows the same pattern.

1495-1562: LGTM!

The workspace file tracking in log_subagent_complete follows the same pattern as log_agent_workspace_files. Setting file_count unconditionally (even when 0) is a good design choice for observability queries.

1686-1689: LGTM!

The log_agent_workspace_files function is correctly added to __all__ for public API exposure.

massgen/configs/analysis/log_analysis_cli.yaml

massgen/logs_analyzer.py

coderabbitai · 2026-01-07T00:14:11Z

massgen/skills/massgen-log-analyzer/SKILL.md

+**Key Local Log Files:**
+| File | Contains |
+|------|----------|
+| `metrics_summary.json` | Cost, tokens, tool stats, round history |
+| `coordination_events.json` | Full event timeline with tool calls |
+| `coordination_table.txt` | Human-readable coordination flow |
+| `streaming_debug.log` | Raw streaming data including command strings |
+| `agent_*/*/vote.json` | Vote reasoning and context |
+| `execution_metadata.yaml` | Config and session metadata |
+


⚠️ Potential issue | 🟡 Minor

Add blank lines around the table for proper markdown formatting.

The table at line 69 should be surrounded by blank lines to comply with markdown best practices.

🔎 Proposed fix

**Key Local Log Files:** + | File | Contains | |------|----------| | `metrics_summary.json` | Cost, tokens, tool stats, round history | | `coordination_events.json` | Full event timeline with tool calls | | `coordination_table.txt` | Human-readable coordination flow | | `streaming_debug.log` | Raw streaming data including command strings | | `agent_*/*/vote.json` | Vote reasoning and context | | `execution_metadata.yaml` | Config and session metadata | + ## Logfire Setup

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

**Key Local Log Files:**

| File | Contains |

|------|----------|

| `metrics_summary.json` | Cost, tokens, tool stats, round history |

| `coordination_events.json` | Full event timeline with tool calls |

| `coordination_table.txt` | Human-readable coordination flow |

| `streaming_debug.log` | Raw streaming data including command strings |

| `agent_*/*/vote.json` | Vote reasoning and context |

| `execution_metadata.yaml` | Config and session metadata |

**Key Local Log Files:**

| File | Contains |

|------|----------|

| `metrics_summary.json` | Cost, tokens, tool stats, round history |

| `coordination_events.json` | Full event timeline with tool calls |

| `coordination_table.txt` | Human-readable coordination flow |

| `streaming_debug.log` | Raw streaming data including command strings |

| `agent_*/*/vote.json` | Vote reasoning and context |

| `execution_metadata.yaml` | Config and session metadata |

🧰 Tools

🪛 markdownlint-cli2 (0.18.1)

69-69: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)

🤖 Prompt for AI Agents

In @massgen/skills/massgen-log-analyzer/SKILL.md around lines 68 - 77, The markdown table under "Key Local Log Files:" in SKILL.md lacks surrounding blank lines which breaks proper rendering; edit the SKILL.md content around the table (the block containing `metrics_summary.json`, `coordination_events.json`, etc.) and add a blank line before the table and a blank line after the table so it is separated from surrounding paragraphs/headings.

massgen/structured_logging.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

massgen/skills/massgen-log-analyzer/SKILL.md (2)
94-94: Format URL as markdown link for better rendering.

Bare URLs should use angle brackets or markdown link format.
🔎 Proposed fix
-Go to https://logfire.pydantic.dev/ and create a free account.
+Go to <https://logfire.pydantic.dev/> and create a free account.
Or as a markdown link:
-Go to https://logfire.pydantic.dev/ and create a free account.
+Go to [Logfire](https://logfire.pydantic.dev/) and create a free account.
108-108: Format URL as markdown link for better rendering.

Bare URLs should use angle brackets or markdown link format.
🔎 Proposed fix
-1. Go to https://logfire.pydantic.dev/ and log in
+1. Go to <https://logfire.pydantic.dev/> and log in
Or as a markdown link:
-1. Go to https://logfire.pydantic.dev/ and log in
+1. Go to [Logfire](https://logfire.pydantic.dev/) and log in

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99cc7e4 and 2f817a7.

📒 Files selected for processing (5)

massgen/configs/analysis/log_analysis.yaml
massgen/configs/analysis/log_analysis_cli.yaml
massgen/logs_analyzer.py
massgen/skills/massgen-log-analyzer/SKILL.md
massgen/structured_logging_utils.py

🚧 Files skipped from review as they are similar to previous changes (2)

massgen/configs/analysis/log_analysis.yaml
massgen/configs/analysis/log_analysis_cli.yaml

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always use --automation flag for programmatic MassGen execution
Use Google-style docstrings for auto-generating API documentation

Files:

massgen/structured_logging_utils.py
massgen/logs_analyzer.py

massgen/**/*.py

⚙️ CodeRabbit configuration file

massgen/**/*.py: Review Python code for:

PEP 8 compliance and Google-style docstrings

Proper async/await patterns

Type hints where appropriate

No security vulnerabilities (command injection, etc.)

Consistency with existing code patterns

Files:

massgen/structured_logging_utils.py
massgen/logs_analyzer.py

massgen/skills/**/*.md

⚙️ CodeRabbit configuration file

massgen/skills/**/*.md: Skill documentation review:

Verify SKILL.md frontmatter is complete (name, description)

Check workflow steps are clear and actionable

Ensure reference files are accurate

Files:

massgen/skills/massgen-log-analyzer/SKILL.md

🧠 Learnings (3)

📚 Learning: 2026-01-05T17:34:18.455Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: openspec/AGENTS.md:0-0
Timestamp: 2026-01-05T17:34:18.455Z
Learning: Applies to openspec/changes/*/design.md : Create `design.md` only if the change involves cross-cutting concerns, new dependencies, data model changes, security/performance complexity, or architectural ambiguity

Applied to files:

massgen/skills/massgen-log-analyzer/SKILL.md

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to **/*.py : Always use --automation flag for programmatic MassGen execution

Applied to files:

massgen/skills/massgen-log-analyzer/SKILL.md

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to massgen/tool/*/TOOL.md : All custom tools in massgen/tool/ require a TOOL.md file with YAML frontmatter including name, description, category, requires_api_keys, tasks, and keywords

Applied to files:

massgen/skills/massgen-log-analyzer/SKILL.md

🧬 Code graph analysis (1)

massgen/logs_analyzer.py (3)

massgen/frontend/displays/rich_terminal_display.py (3)

Console (54-55)

Panel (51-52)

Table (63-64)

massgen/filesystem_manager/background_shell.py (1)

append (56-63)

massgen/__init__.py (1)

run (262-540)

🪛 markdownlint-cli2 (0.18.1)

massgen/skills/massgen-log-analyzer/SKILL.md

94-94: Bare URL used

(MD034, no-bare-urls)

108-108: Bare URL used

(MD034, no-bare-urls)

🪛 Ruff (0.14.10)

massgen/structured_logging_utils.py

185-210: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

massgen/logs_analyzer.py

501-501: Do not catch blind exception: Exception

(BLE001)

627-627: Do not catch blind exception: Exception

(BLE001)

711-711: subprocess call: check for execution of untrusted input

(S603)

732-733: try-except-pass detected, consider logging the exception

(S110)

732-732: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: pre-commit (macos-latest, 3.10)
GitHub Check: pre-commit (macos-14, 3.12)
GitHub Check: pre-commit (ubuntu-latest, 3.12)
GitHub Check: pre-commit (macos-latest, 3.11)
GitHub Check: pre-commit (macos-latest, 3.12)
GitHub Check: pre-commit (macos-latest, 3.10)

🔇 Additional comments (12)

massgen/skills/massgen-log-analyzer/SKILL.md (3)
1-19: LGTM - Comprehensive skill documentation setup.

The frontmatter is complete with name and description. The expanded description appropriately reflects the comprehensive analysis workflow including ANALYSIS_REPORT.md generation. The purpose section clearly outlines the skill's capabilities.

562-839: Excellent comprehensive analysis workflow documentation.

The structured report template with standard analysis questions provides clear guidance for generating ANALYSIS_REPORT.md. The warning about avoiding infinite recursion when running massgen logs commands (line 683) is critical and well-placed. The workflow steps are actionable and logically sequenced.

Based on learnings, this follows best practices for skill documentation by providing clear, actionable steps.

68-78: Add blank lines around the markdown table.

The table requires blank lines before and after for proper markdown rendering. This issue was previously identified.
🔎 Proposed fix
 **Key Local Log Files:**
+
 | File | Contains |
 |------|----------|
 | `metrics_summary.json` | Cost, tokens, tool stats, round history |
 | `coordination_events.json` | Full event timeline with tool calls |
 | `coordination_table.txt` | Human-readable coordination flow |
 | `streaming_debug.log` | Raw streaming data including command strings |
 | `agent_*/*/vote.json` | Vote reasoning and context |
 | `execution_metadata.yaml` | Config and session metadata |
+
 ## Logfire Setup
Likely an incorrect or invalid review comment.
massgen/structured_logging_utils.py (3)

1-67: Excellent module structure with self-documenting constants.

The comprehensive docstring with usage examples and the semantic aliases (PREVIEW_TASK, PREVIEW_ANSWER, etc.) make this module highly maintainable. The PreviewLengths class provides a clear centralized location for truncation constants, and the MAS-199 workflow context is documented.

75-182: Well-implemented utility functions with robust error handling.

The functions demonstrate excellent practices:

truncate_for_log properly handles None and non-string inputs, accounts for suffix length

extract_workspace_info uses PurePath for cross-platform path parsing

Comprehensive Google-style docstrings with examples

Appropriate type hints throughout

185-210: Current logical grouping of __all__ is preferable to alphabetical sorting.

While static analysis suggests sorting __all__, the current organization by category (constants, workflow constants, workspace constants, functions) with comments is more maintainable and readable. Alphabetical sorting would scatter related items.

massgen/logs_analyzer.py (6)

1-42: Good addition of natural sorting to handle multi-digit turn/attempt numbers.

The _natural_sort_key function correctly addresses the lexicographic sorting issue mentioned in previous reviews. The implementation properly splits paths into numeric and non-numeric components, converting digits to integers for correct numeric comparison (turn_2 < turn_10).

61-507: Clean implementation of analysis status filtering.

The has_analysis_report function and enhanced display_list properly support filtering by analysis status. The exception handling at line 501 is appropriate for display code that needs to be resilient to malformed metrics files - the error is handled gracefully by showing "Error reading metrics" rather than crashing.

543-577: LGTM - Clear prompt generation interface.

The prompt generation and display functions provide a clean interface for Claude Code integration. The check for existing ANALYSIS_REPORT.md at line 572 provides helpful feedback to users.

579-640: Solid validation and setup for self-analysis workflow.

The function properly validates inputs, warns about missing LOGFIRE_READ_TOKEN (allowing graceful degradation), and uses natural sorting at line 639 to select the correct attempt directory. The exception handling at line 627 for YAML parsing is appropriate as it provides a clear error message to users.

642-727: Well-implemented self-analysis execution with proper security measures.

The implementation demonstrates good practices:

Context paths injection (lines 650-653) correctly restricts access: read-only for log directory, write-only for the report

Logfire MCP removal when token unavailable (lines 656-664) enables graceful degradation

--automation flag usage at line 703 follows coding guidelines for programmatic execution

Subprocess call at line 711: The S603 static analysis hint is a false positive. The command array is safely constructed with validated inputs (config path validated at lines 619-621, question is internally constructed).

Based on learnings, the --automation flag usage is correct for programmatic MassGen execution.

728-820: Appropriate exception handling in cleanup and clean command routing.

The silent exception handling at line 732 is appropriate for best-effort cleanup in a finally block - failure to delete a temporary file shouldn't affect the main operation or produce noise. The enhanced command handler cleanly routes list, analyze, and open commands with proper argument handling.

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In @massgen/logs_analyzer.py:
- Around line 692-695: The current code creates an empty ANALYSIS_REPORT.md via
report_path.touch() before running the analysis subprocess which can leave a
bogus "analysis present" state; wrap the subprocess invocation in a
try/except/finally so that if the subprocess fails or is interrupted you remove
the empty report file (check report_path.exists() and report_path.stat().st_size
== 0 before deleting), and also update has_analysis_report to return True only
when report_path.exists() and report_path.stat().st_size > 0 (i.e., non-empty)
so the presence check is based on content not just existence; reference
report_path.touch(), has_analysis_report, and ANALYSIS_REPORT.md when locating
the changes.

In @massgen/skills/massgen-log-analyzer/SKILL.md:
- Line 114: Lines containing bare URLs (e.g., the line "Go to
https://logfire.pydantic.dev/ and create a free account." and the other
occurrence at line 128) violate MD034; replace each bare URL with a markdown
link or inline code. For example, change the text to use
[https://logfire.pydantic.dev/](https://logfire.pydantic.dev/) or wrap the URL
in backticks (`https://logfire.pydantic.dev/`) in the SKILL.md content where
those bare URLs appear.
- Around line 56-65: The fenced directory-tree code block in SKILL.md (the block
showing "log_YYYYMMDD_HHMMSS/ … attempt_1/") lacks a language tag; update that
triple-backtick fence to include a language (e.g., ```bash or ```text) so the
tree is highlighted correctly—edit the block beginning before
"log_YYYYMMDD_HHMMSS/" in SKILL.md to use ```bash (or ```text) instead of plain
``` to fix the issue.
- Around line 141-149: Update the Prerequisites paragraph to clarify that
Logfire MCP is optional: note that the skill will automatically disable Logfire
MCP if the environment variable LOGFIRE_READ_TOKEN is not set, and that the
listed MCP tools (mcp__logfire__arbitrary_query, mcp__logfire__schema_reference,
mcp__logfire__find_exceptions_in_file, mcp__logfire__logfire_link) are only
available when LOGFIRE_READ_TOKEN/configured; add a short sentence stating basic
log analysis workflows still work without Logfire MCP enabled.

🧹 Nitpick comments (6)

massgen/cli.py (3)
4360-4378: Threading raw_config_for_metadata into interactive metadata is sound; consider Optional typing

Using raw_config_for_metadata in run_interactive_mode and passing config_content=raw_config_for_metadata or original_config to save_execution_metadata gives interactive runs the same “raw vs expanded” separation as single-question runs, and still falls back gracefully if callers don’t supply the new arg. One small nit: to satisfy Ruff’s RUF013 and keep type hints precise, you may want to change the parameter annotation to an explicit optional:
async def run_interactive_mode(
    ...,
-   raw_config_for_metadata: Dict[str, Any] = None,
+   raw_config_for_metadata: Optional[Dict[str, Any]] = None,
    **kwargs,
):
(or dict[str, Any] | None), and optionally align the existing original_config/orchestrator_cfg annotations in a follow-up.

Also applies to: 4954-4964

5672-5680: Clarify behavior when both --analyzed and --unanalyzed are passed

The new --analyzed / --unanalyzed filters on logs list are useful, but the CLI currently allows both to be set simultaneously, which is semantically ambiguous and leaves resolution entirely to logs_command. Consider either making these options mutually exclusive at the parser level or explicitly defining the precedence (e.g., treat both set as “no filter” or as an error) so users get predictable behavior.

5695-5733: logs analyze subcommand surface is coherent; ensure downstream self-mode uses automation

The logs analyze subparser and its options (--log-dir, --mode, --config, --ui, --turn, --force) line up well with the PR’s described behavior and give enough flexibility for both prompt- and self-analysis workflows. One follow-up to keep in mind: when implementing --mode self in massgen/logs_analyzer.py, ensure that the spawned MassGen subprocess always includes the --automation flag for programmatic execution, per the coding guidelines.
massgen/logs_analyzer.py (3)
413-426: Minor inefficiency in filtering: all logs are scanned before applying limit.

When analyzed_only or unanalyzed_only is set, the code checks has_analysis_report() for all logs in all_logs before slicing to limit. For large log directories, this could be slow since has_analysis_report() performs filesystem I/O.

Consider early termination once enough matching logs are found:
🔎 Proposed optimization
     # Filter by analysis status if requested
     if analyzed_only or unanalyzed_only:
         filtered_logs = []
         for log_dir in all_logs:
+            if len(filtered_logs) >= limit:
+                break
             has_report = has_analysis_report(log_dir)
             if analyzed_only and has_report:
                 filtered_logs.append(log_dir)
             elif unanalyzed_only and not has_report:
                 filtered_logs.append(log_dir)
-        logs = filtered_logs[:limit]
+        logs = filtered_logs
     else:
         logs = all_logs[:limit]
644-650: Consider logging or specifying the exception type for config loading errors.

The static analysis flags catching a blind Exception. While yaml.safe_load can raise various errors, specifying yaml.YAMLError would be more precise and follow best practices.
🔎 Proposed fix
     try:
         with open(config_path) as f:
             config = yaml.safe_load(f)
-    except Exception as e:
+    except (yaml.YAMLError, OSError) as e:
         console.print(f"[red]Error loading config:[/red] {e}")
         return 1
783-786: Consider logging temp file cleanup failures for debugging.

The static analysis flagged the try-except-pass pattern. While acceptable for cleanup code, silently ignoring cleanup failures could mask filesystem permission issues during debugging.
🔎 Proposed improvement
     finally:
         # Clean up temporary config file
         try:
             Path(tmp_config_path).unlink()
-        except Exception:
-            pass
+        except OSError:
+            pass  # Best-effort cleanup; temp dir will be cleaned eventually
Using OSError (which covers PermissionError, FileNotFoundError, etc.) is more precise than catching all exceptions.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f817a7 and 5d7af2c.

📒 Files selected for processing (3)

massgen/cli.py
massgen/logs_analyzer.py
massgen/skills/massgen-log-analyzer/SKILL.md

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always use --automation flag for programmatic MassGen execution
Use Google-style docstrings for auto-generating API documentation

Files:

massgen/cli.py
massgen/logs_analyzer.py

massgen/**/*.py

⚙️ CodeRabbit configuration file

massgen/**/*.py: Review Python code for:

PEP 8 compliance and Google-style docstrings

Proper async/await patterns

Type hints where appropriate

No security vulnerabilities (command injection, etc.)

Consistency with existing code patterns

Files:

massgen/cli.py
massgen/logs_analyzer.py

massgen/skills/**/*.md

⚙️ CodeRabbit configuration file

massgen/skills/**/*.md: Skill documentation review:

Verify SKILL.md frontmatter is complete (name, description)

Check workflow steps are clear and actionable

Ensure reference files are accurate

Files:

massgen/skills/massgen-log-analyzer/SKILL.md

🧠 Learnings (2)

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to **/*.py : Always use --automation flag for programmatic MassGen execution

Applied to files:

massgen/skills/massgen-log-analyzer/SKILL.md

📚 Learning: 2026-01-05T17:34:01.001Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-05T17:34:01.001Z
Learning: Applies to massgen/tool/*/TOOL.md : All custom tools in massgen/tool/ require a TOOL.md file with YAML frontmatter including name, description, category, requires_api_keys, tasks, and keywords

Applied to files:

massgen/skills/massgen-log-analyzer/SKILL.md

🧬 Code graph analysis (2)

massgen/cli.py (2)

massgen/v1/config.py (1)

ConfigurationError (24-25)

massgen/logger_config.py (1)

save_execution_metadata (262-318)

massgen/logs_analyzer.py (2)

massgen/frontend/displays/rich_terminal_display.py (3)

Console (54-55)

Panel (51-52)

Table (63-64)

massgen/__init__.py (1)

run (262-540)

🪛 markdownlint-cli2 (0.18.1)

massgen/skills/massgen-log-analyzer/SKILL.md

56-56: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

114-114: Bare URL used

(MD034, no-bare-urls)

128-128: Bare URL used

(MD034, no-bare-urls)

🪛 Ruff (0.14.10)

massgen/cli.py

435-437: Abstract raise to an inner function

(TRY301)

435-437: Avoid specifying long messages outside the exception class

(TRY003)

4370-4370: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

massgen/logs_analyzer.py

501-501: Do not catch blind exception: Exception

(BLE001)

648-648: Do not catch blind exception: Exception

(BLE001)

764-764: subprocess call: check for execution of untrusted input

(S603)

785-786: try-except-pass detected, consider logging the exception

(S110)

785-785: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: pre-commit (ubuntu-latest, 3.12)
GitHub Check: pre-commit (macos-latest, 3.12)
GitHub Check: pre-commit (macos-latest, 3.11)
GitHub Check: pre-commit (macos-14, 3.10)
GitHub Check: pre-commit (macos-14, 3.12)
GitHub Check: pre-commit (macos-latest, 3.12)

🔇 Additional comments (12)

massgen/skills/massgen-log-analyzer/SKILL.md (3)

20-50: CLI Quick Reference is clear, comprehensive, and well-structured.

The new section provides quick access to all log analysis commands with concrete examples for listing, prompt generation, and self-analysis modes. The command structure aligns perfectly with the PR objectives and should help users quickly discover and invoke the new features.

582-859: Part 6 analysis report template is comprehensive and actionable.

The new "Comprehensive Log Analysis Report" section provides a detailed template with:

Standard Analysis Questions covering correctness, efficiency, command patterns, work duplication, agent behavior, cost/tokens, and errors

Clear data source mappings showing where to find answers

Practical analysis commands (grep, jq) for extracting insights

A markdown report template with checkboxes and structured sections

Workflow steps for generating and saving reports

This directly supports the PR objective of generating ANALYSIS_REPORT.md per turn with a documented multi-turn and multi-agent workflow.

88-98: Table markdown formatting is now correct.

The table at lines 90–97 is properly surrounded by blank lines (line 89 before, line 98 after), resolving the previous markdown formatting issue flagged in earlier reviews.

massgen/cli.py (2)

386-444: Config loader tuple return and env expansion look correct

The change to return (expanded_config, raw_config) and to derive expanded_config via _expand_env_vars(copy.deepcopy(raw_config)) cleanly separates runtime config from a log-safe raw view while preserving original ${VAR} syntax. Path resolution and YAML/JSON handling remain consistent with prior behavior, so this is a safe additive API evolution; downstream unpacking in main() matches the new return shape.

5121-5233: Main flow correctly preserves raw vs expanded config and uses raw for all metadata logging

Unpacking config, raw_config_for_metadata = load_config_file(...) and, for simple configs, setting raw_config_for_metadata = copy.deepcopy(config) before later mutation gives you an immutable, log-safe snapshot of the original configuration. That same raw view is then used consistently in both single-question (save_execution_metadata(..., config_content=raw_config_for_metadata, ...)) and interactive modes (via the new raw_config_for_metadata argument), which meets the goal of avoiding env-expanded secret leakage in metadata while leaving runtime behavior unchanged. The wiring and fallbacks look correct in all branches.

Also applies to: 5448-5487

massgen/logs_analyzer.py (7)

21-22: LGTM!

The DEFAULT_ANALYSIS_CONFIG constant is well-defined using Path(__file__).parent for reliable relative path resolution.

25-42: LGTM!

Good implementation addressing the natural sort issue from the previous review. The regex-based tokenization correctly handles multi-digit numbers for proper ordering (e.g., turn_2 < turn_10).

61-73: LGTM!

Clean implementation with early return for efficiency. Docstring follows Google style.

543-562: LGTM!

Simple utility function with proper type hints and docstring.

565-593: LGTM!

Good use of _natural_sort_key for turn directory ordering and helpful UX with the existing report notification.

808-844: LGTM!

The analyze subcommand handling is well-structured with proper validation of paths and modes. The ui_mode defaults to "automation" which aligns with the coding guideline for programmatic MassGen execution.

746-768: Subprocess argument handling is secure; no changes needed.

The subprocess call correctly uses subprocess.run() with list arguments rather than shell=True, which prevents shell interpretation of the embedded log_dir path. The Path.resolve() call at line 637 ensures the path is valid and absolute before use. The --automation flag is properly appended per coding guidelines for automation mode.

coderabbitai · 2026-01-07T03:09:37Z

massgen/logs_analyzer.py

+    # Create empty ANALYSIS_REPORT.md if it doesn't exist
+    # This is required for context_paths validation
+    if not report_path.exists():
+        report_path.touch()


⚠️ Potential issue | 🟡 Minor

Creating an empty file before analysis may cause issues if the subprocess fails.

An empty ANALYSIS_REPORT.md is created before running the subprocess. If the subprocess fails or is interrupted, the empty file remains, potentially making the log appear "analyzed" when it's not (since has_analysis_report only checks for file existence, not content).

Consider either:

Cleaning up the empty file on failure

Having has_analysis_report check for non-empty file content

🔎 Proposed fix: Clean up empty report on failure

if result.returncode == 0: console.print("\n[green]Analysis complete![/green]") - if report_path.exists(): + if report_path.exists() and report_path.stat().st_size > 0: console.print(f"[green]Report saved to:[/green] {report_path}") else: console.print("[yellow]Note: ANALYSIS_REPORT.md was not created by the agents.[/yellow]") + # Clean up empty placeholder file + if report_path.exists() and report_path.stat().st_size == 0: + report_path.unlink() else: console.print(f"\n[red]Analysis failed with exit code {result.returncode}[/red]") + # Clean up empty placeholder file on failure + if report_path.exists() and report_path.stat().st_size == 0: + report_path.unlink()

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In @massgen/logs_analyzer.py around lines 692 - 695, The current code creates an empty ANALYSIS_REPORT.md via report_path.touch() before running the analysis subprocess which can leave a bogus "analysis present" state; wrap the subprocess invocation in a try/except/finally so that if the subprocess fails or is interrupted you remove the empty report file (check report_path.exists() and report_path.stat().st_size == 0 before deleting), and also update has_analysis_report to return True only when report_path.exists() and report_path.stat().st_size > 0 (i.e., non-empty) so the presence check is based on content not just existence; reference report_path.touch(), has_analysis_report, and ANALYSIS_REPORT.md when locating the changes.

massgen/skills/massgen-log-analyzer/SKILL.md

coderabbitai · 2026-01-07T03:09:37Z

massgen/skills/massgen-log-analyzer/SKILL.md

 ## Prerequisites

 **Required MCP Server:**
-This skill requires the Logfire MCP server to be configured. The MCP server provides these tools:
+This skill requires the Logfire MCP server to be configured (see setup above). The MCP server provides these tools:
 - `mcp__logfire__arbitrary_query` - Run SQL queries against logfire data
 - `mcp__logfire__schema_reference` - Get the database schema
 - `mcp__logfire__find_exceptions_in_file` - Find exceptions in a file
 - `mcp__logfire__logfire_link` - Create links to traces in the UI



⚠️ Potential issue | 🟠 Major

Clarify that Logfire MCP is optional when LOGFIRE_READ_TOKEN is unavailable.

The Prerequisites section states that the Logfire MCP server is required, but the PR objectives indicate that "Logfire MCP is automatically disabled if LOGFIRE_READ_TOKEN is not set." This automatic fallback behavior should be explicitly documented to avoid user confusion about whether Logfire is truly required for basic log analysis workflows.

🔎 Suggested clarification

## Prerequisites **Required MCP Server:** -This skill requires the Logfire MCP server to be configured (see setup above). The MCP server provides these tools: +The Logfire MCP server is **optional but recommended** for advanced trace analysis. If `LOGFIRE_READ_TOKEN` is not set, the MCP server is automatically disabled and the skill falls back to local log file analysis. When configured, the MCP server provides these tools: - `mcp__logfire__arbitrary_query` - Run SQL queries against logfire data

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

## Prerequisites

**Required MCP Server:**

This skill requires the Logfire MCP server to be configured. The MCP server provides these tools:

This skill requires the Logfire MCP server to be configured (see setup above). The MCP server provides these tools:

- `mcp__logfire__arbitrary_query` - Run SQL queries against logfire data

- `mcp__logfire__schema_reference` - Get the database schema

- `mcp__logfire__find_exceptions_in_file` - Find exceptions in a file

- `mcp__logfire__logfire_link` - Create links to traces in the UI

## Prerequisites

**Required MCP Server:**

The Logfire MCP server is **optional but recommended** for advanced trace analysis. If `LOGFIRE_READ_TOKEN` is not set, the MCP server is automatically disabled and the skill falls back to local log file analysis. When configured, the MCP server provides these tools:

- `mcp__logfire__arbitrary_query` - Run SQL queries against logfire data

- `mcp__logfire__schema_reference` - Get the database schema

- `mcp__logfire__find_exceptions_in_file` - Find exceptions in a file

- `mcp__logfire__logfire_link` - Create links to traces in the UI

🤖 Prompt for AI Agents

In @massgen/skills/massgen-log-analyzer/SKILL.md around lines 141 - 149, Update the Prerequisites paragraph to clarify that Logfire MCP is optional: note that the skill will automatically disable Logfire MCP if the environment variable LOGFIRE_READ_TOKEN is not set, and that the listed MCP tools (mcp__logfire__arbitrary_query, mcp__logfire__schema_reference, mcp__logfire__find_exceptions_in_file, mcp__logfire__logfire_link) are only available when LOGFIRE_READ_TOKEN/configured; add a short sentence stating basic log analysis workflows still work without Logfire MCP enabled.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

massgen/orchestrator.py (1)

5609-5615: Bug: follow-up irreversibility analysis always enables planning mode and drops blocked tools

In _handle_followup, the result of _analyze_question_irreversibility is a dict, but it’s used directly as the has_irreversible flag:

has_irreversible = await self._analyze_question_irreversibility(user_message, conversation_context or {})
...
agent.backend.set_planning_mode(has_irreversible)

Since any non-empty dict is truthy, this will always enable planning mode for follow-ups, regardless of analysis outcome, and it never propagates the blocked_tools set you already use in the initial coordination path.

This is a behaviour bug that can cause all follow-up questions to run with MCP tools blocked (or at least in “planning” mode) unexpectedly.

You likely want the same pattern as in the main chat flow:

Proposed fix for `_handle_followup` planning-mode usage

-        # Analyze the follow-up question for irreversibility before re-coordinating
-        has_irreversible = await self._analyze_question_irreversibility(user_message, conversation_context or {})
-
-        # Set planning mode for all agents based on analysis
-        for agent_id, agent in self.agents.items():
-            if hasattr(agent.backend, "set_planning_mode"):
-                agent.backend.set_planning_mode(has_irreversible)
-                log_orchestrator_activity(
-                    self.orchestrator_id,
-                    f"Set planning mode for {agent_id} (follow-up)",
-                    {"planning_mode_enabled": has_irreversible, "reason": "follow-up irreversibility analysis"},
-                )
+        # Analyze the follow-up question for irreversibility before re-coordinating
+        analysis_result = await self._analyze_question_irreversibility(user_message, conversation_context or {})
+        has_irreversible = analysis_result.get("has_irreversible", False)
+        blocked_tools = analysis_result.get("blocked_tools", set())
+
+        # Set planning mode for all agents based on analysis
+        for agent_id, agent in self.agents.items():
+            backend = getattr(agent, "backend", None)
+            if not backend:
+                continue
+
+            if hasattr(backend, "set_planning_mode"):
+                backend.set_planning_mode(has_irreversible)
+            if hasattr(backend, "set_planning_mode_blocked_tools"):
+                backend.set_planning_mode_blocked_tools(blocked_tools)
+
+            log_orchestrator_activity(
+                self.orchestrator_id,
+                f"Set planning mode for {agent_id} (follow-up)",
+                {
+                    "planning_mode_enabled": has_irreversible,
+                    "blocked_tools_count": len(blocked_tools),
+                    "reason": "follow-up irreversibility analysis",
+                },
+            )

This restores the intended semantics and keeps initial and follow-up flows consistent.

🧹 Nitpick comments (2)

massgen/orchestrator.py (2)
3892-3896: Vote-only mode and tool classification logic are solid; consider minor robustness tweaks

The new wiring does several good things:

vote_only_for_system_message = self._is_vote_only_mode(agent_id) (Line 3892) and passing vote_only into build_coordination_message (Line 3907) correctly align system prompt instructions with the agent’s remaining answer budget.

Later, re-checking vote-only status per attempt and regenerating agent_workflow_tools with vote_only=True (Lines 4127-4135) ensures only the vote tool is exposed once the limit is hit.

Computing internal_tool_names from agent_workflow_tools (Line 4191) and external_tool_names from self._external_tools (Line 4260) then:

Routing external tool calls into external_tool_calls and short-circuiting the stream (Lines 4268-4273, 4339-4344).

Treating MCP/custom tools via is_mcp_tool_call / is_custom_tool_call without warnings (Lines 4273-4281, 4603-4607).

Emitting explicit warnings and non-fatal coordination traces for unknown tools (Lines 4283-4290).

In the workflow pass, marking MCP/custom calls as workflow_tool_found only if the agent can still answer (Lines 4603-4613) is a nice guard so answer-limited agents are pushed into voting instead of indefinitely tool-calling.

Two minor robustness nits you may want to tighten:
Name extraction for internal/external tools
Right now internal_tool_names / external_tool_names are built via (t.get("function", {}) or {}).get("name"). If any tool definitions ever use a different shape (e.g., a top-level "name" field only), they’ll silently fall out of these sets and be treated as “unknown”. Consider using a small helper that mirrors the backend’s extract_tool_name semantics and filters out None:
Possible refactor for tool-name extraction
-internal_tool_names = {(t.get("function", {}) or {}).get("name") for t in (agent_workflow_tools or []) if isinstance(t, dict)}
+def _get_tool_def_name(tool_def: dict) -> str | None:
+    # Mirrors backend.extract_tool_name: supports {"function": {"name": ...}} and {"name": ...}
+    if "function" in tool_def:
+        return tool_def.get("function", {}).get("name")
+    return tool_def.get("name")
+
+internal_tool_names = {
+    name
+    for t in (agent_workflow_tools or [])
+    if isinstance(t, dict) and (name := _get_tool_def_name(t))
+}
External tools mixed with workflow tools
The current flow immediately returns when any external_tool_calls exist (Lines 4339-4344). That’s correct for the “client owns tool execution” contract, but it means that a mixed response (external + vote/new_answer) will ignore the workflow tools. If that’s not desired, you may want to either prevent such mixes via enforcement or document the precedence explicitly.
Overall, the behaviour is a clear improvement (no crashes on unknown tools, correct vote-only gating); the above are just hardening suggestions.

Also applies to: 3907-3908, 4127-4142, 4191-4192, 4260-4261, 4268-4291, 4603-4613

5527-5531: Close previous coordination session before replacing CoordinationTracker on restart

Now that initialize_session opens a Logfire coordination session (MAS-199), handle_restart replaces self.coordination_tracker without calling _end_session() on the old instance. That can leave tracing spans open and drop final SESSION_END events for aborted attempts.

Consider best-effort closing before reassigning:
Suggested change to cleanly close prior session
-        # Reset coordination tracker for new attempt (MAS-199: includes log_path)
-        self.coordination_tracker = CoordinationTracker()
+        # Reset coordination tracker for new attempt (MAS-199: includes log_path)
+        # Close existing coordination session first so spans end cleanly.
+        try:
+            self.coordination_tracker._end_session()
+        except Exception:
+            # Best-effort: logging/telemetry issues should not block restart
+            pass
+        self.coordination_tracker = CoordinationTracker()
         log_dir = get_log_session_dir()
         log_path = str(log_dir) if log_dir else None
         self.coordination_tracker.initialize_session(list(self.agents.keys()), log_path=log_path)
This keeps observability cleaner without changing runtime behaviour.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d7af2c and 8952077.

📒 Files selected for processing (1)

massgen/orchestrator.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always use --automation flag for programmatic MassGen execution
Use Google-style docstrings for auto-generating API documentation

For new or changed functions, include Google-style docstrings

Files:

massgen/orchestrator.py

massgen/**/*.py

⚙️ CodeRabbit configuration file

massgen/**/*.py: Review Python code for:

PEP 8 compliance and Google-style docstrings

Proper async/await patterns

Type hints where appropriate

No security vulnerabilities (command injection, etc.)

Consistency with existing code patterns

Files:

massgen/orchestrator.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: docs/source/development/writing_configs.rst:0-0
Timestamp: 2026-01-07T03:58:40.702Z
Learning: Applies to docs/source/development/examples/tools/**/*.yaml : When using MCP tools, configure mcp_servers under agent.backend and do not add a filesystem MCP server; filesystem access is via cwd

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: docs/source/development/writing_configs.rst:0-0
Timestamp: 2026-01-07T03:58:40.702Z
Learning: Applies to docs/source/development/examples/tools/**/*.yaml : For multi-agent configs, configure voting_sensitivity and coordination only under orchestrator; do not duplicate per agent

📚 Learning: 2026-01-07T03:58:40.702Z

Learnt from: CR
Repo: massgen/MassGen PR: 0
File: docs/source/development/writing_configs.rst:0-0
Timestamp: 2026-01-07T03:58:40.702Z
Learning: Applies to docs/source/development/examples/tools/**/*.yaml : For multi-agent configs, configure voting_sensitivity and coordination only under orchestrator; do not duplicate per agent

Applied to files:

massgen/orchestrator.py

🧬 Code graph analysis (1)

massgen/orchestrator.py (5)

massgen/logger_config.py (1)

get_log_session_dir (63-116)

massgen/coordination_tracker.py (1)

initialize_session (208-247)

massgen/structured_logging.py (2)

log_agent_round_context (1045-1111)

info (276-282)

massgen/backend/base.py (2)

extract_tool_name (677-699)

extract_tool_arguments (701-736)

massgen/backend/base_with_custom_tool_and_mcp.py (1)

is_mcp_tool_call (3570-3572)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: pre-commit (ubuntu-latest, 3.12)
GitHub Check: pre-commit (ubuntu-latest, 3.11)
GitHub Check: pre-commit (macos-latest, 3.11)
GitHub Check: pre-commit (macos-14, 3.10)
GitHub Check: pre-commit (macos-14, 3.10)
GitHub Check: pre-commit (macos-14, 3.12)

🔇 Additional comments (2)

massgen/orchestrator.py (2)

1245-1248: MAS-199: Reinitializing coordination session with log_path and user prompt looks correct

Re-calling initialize_session in the idle branch with user_prompt=self.current_task and log_path derived from get_log_session_dir() wires the coordination tracker into the real task and log directory, which the earlier constructor-time init did not have. This matches the MAS-199 design and should give better session tracing without affecting flow control.

3861-3870: Per-agent agent_log_path propagation into structured logging is consistent with MAS-199

Deriving agent_log_path as log_session_dir / agent_id and threading it into log_agent_round_context aligns with the new logging schema (massgen.agent.log_path) and doesn’t change behaviour elsewhere. This will make downstream analysis (e.g., log analyzers) much easier while remaining low-risk.

ncrispino · 2026-01-07T11:16:50Z

docs/source/reference/cli.rst

+   * - ``massgen logs analyze``
+     - Generate analysis prompt for use in coding CLIs
+   * - ``massgen logs analyze --mode self``
+     - Run multi-agent self-analysis


clarify that this will run via massgen with a preset config, but can be adjusted

ncrispino added 7 commits January 4, 2026 18:04

Improve compr logging

33a0e60

Update log analyzer skill

ac1fdb3

Merge branch 'main' into comprehensive_logfire

e261dfd

Fix metadata, external tools, add massgen log analysis commands

496e5ec

Vote-only mode fix to prevent extra answer and enforcement restart; p…

4947576

…ermissions fix for Gemini; add jq to Docker

Merge branch 'main' into comprehensive_logfire

51bad06

Let MCPs run directly, improve skill & voting

99cc7e4

coderabbitai bot reviewed Jan 7, 2026

View reviewed changes

Small fixes

2f817a7

coderabbitai bot reviewed Jan 7, 2026

View reviewed changes

Make analysis per-turn

5d7af2c

coderabbitai bot reviewed Jan 7, 2026

View reviewed changes

Unknown tool warning fix

8952077

coderabbitai bot reviewed Jan 7, 2026

View reviewed changes

Grok, gemini tool fixes

9d7ed12

ncrispino commented Jan 7, 2026

View reviewed changes

Henry-811 changed the base branch from main to dev/v0.1.35 January 7, 2026 15:50

Henry-811 merged commit 33f3677 into dev/v0.1.35 Jan 7, 2026
24 checks passed

This was referenced Jan 7, 2026

feat: v0.1.35 #764

Merged

feat: Hook Framework for Agent Lifecycle Events #769

Merged

feat: v0.1.36 #774

Merged

coderabbitai bot mentioned this pull request Jan 14, 2026

feat: v0.1.38 #787

Merged

18 tasks

Conversation

ncrispino commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR: Enhanced Log Analysis CLI

Summary

Changes

New Features

massgen logs analyze Command

Per-Turn Analysis Reports (MAS-227)

Enhanced massgen logs list

Multi-Agent Log Analysis Config

direct_mcp_servers Config Option

Bug Fixes

Files Changed

Usage Examples

Test Plan

Notes

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

ncrispino Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ncrispino commented Jan 7, 2026 •

edited

Loading

`massgen logs analyze` Command

Enhanced `massgen logs list`

`direct_mcp_servers` Config Option

coderabbitai bot commented Jan 7, 2026 •

edited

Loading