feat: Improve skill use and system prompt organization by ncrispino · Pull Request #515 · massgen/MassGen

ncrispino · 2025-11-14T13:33:20Z

Description

Major refactoring of system prompt architecture using priority-based XML sections, addition of semantic search capabilities via semtools and serena skills, enhanced browser automation with image persistence, and support for local skill execution outside Docker.

Key improvements:

System prompt architecture: New class-based, priority-driven design with XML structure (Closes MAS-76)
Semantic search: Added semtools and serena skills for meaning-based code/document search (Closes MAS-72)
Browser automation: Screenshots now persist as files in workspace
Local skills: Skills can run locally without Docker container requirement
Code quality: Reduced complexity in orchestrator (-428 lines) and message templates (-682 lines)

Type of change

Detailed Changes

1. System Prompt Architecture Refactor (Closes MAS-76)

New files:

massgen/system_prompt_sections.py (+1,284 lines): Class-based section architecture
- Priority enum: CRITICAL → AUXILIARY ordering
- SystemPromptSection base class with XML support
- 20+ specialized section classes (AgentIdentity, MassGenPrimitives, Skills, Memory, etc.)
- Based on Lakera AI Prompt Engineering Guide 2025 & Anthropic best practices
massgen/system_message_builder.py (+488 lines): Declarative prompt builder
- Automatic section ordering by priority
- XML wrapping with priority attributes
- Subsection support for hierarchical structure
docs/dev_notes/system_prompt_architecture_redesign.md (+593 lines): Design rationale
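
To make the priority-driven design concrete, here is a minimal sketch of how sections could be ordered by priority and wrapped in XML with a priority attribute. It is an illustration under assumptions, not the code in this PR: `Section`, `render`, and `build_system_message` are hypothetical names, and only the two priority levels named above (CRITICAL and AUXILIARY) are modeled.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Priority(IntEnum):
    """Lower value renders earlier. Intermediate levels are omitted here."""
    CRITICAL = 1
    AUXILIARY = 5


@dataclass
class Section:
    """Hypothetical stand-in for a SystemPromptSection subclass."""
    tag: str
    priority: Priority
    content: str

    def render(self) -> str:
        # Wrap the content in an XML element carrying its priority.
        level = self.priority.name.lower()
        return f'<{self.tag} priority="{level}">\n{self.content}\n</{self.tag}>'


def build_system_message(sections: List[Section]) -> str:
    # sorted() is stable, so sections with equal priority keep insertion order.
    ordered = sorted(sections, key=lambda s: s.priority)
    return "\n\n".join(s.render() for s in ordered)


if __name__ == "__main__":
    print(build_system_message([
        Section("skills", Priority.AUXILIARY, "Available skills: ..."),
        Section("agent_identity", Priority.CRITICAL, "You are an agent."),
    ]))
```

Callers add sections in any order and the builder decides placement, which is the property that lets the orchestrator delegate prompt assembly.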

Modified files:

massgen/orchestrator.py (-428 lines): Cleaner, delegates to SystemMessageBuilder
massgen/message_templates.py (-682 lines): Removed redundant prompt logic
massgen/backend/claude_code.py: Integration with new system message builder

2. Semantic Search Skills (Closes MAS-72)

New skills:

massgen/skills/semtools/SKILL.md (+635 lines)
- Rust-based CLI for embedding-based semantic search
- Workspace management for large codebases
- Document parsing (PDF, DOCX, PPTX) with API key support
- Find code by meaning, not just keywords
massgen/skills/serena/SKILL.md (+522 lines)
- Complementary semantic search capabilities
- Alternative approach for different use cases
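
As a rough illustration of "meaning, not just keywords", the toy sketch below contrasts a substring match with a nearest-neighbor lookup over embedding vectors. The vectors are hand-made assumptions for the example; a real tool would compute them with an embedding model, and nothing here reflects the internals or CLI of semtools or serena.

```python
import math

# Hand-made 3-d "embeddings" (assumed values, for illustration only).
TOY_EMBEDDINGS = {
    "def login(user, password): ...": [0.9, 0.1, 0.0],
    "def resize_image(img, width): ...": [0.0, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def keyword_search(query, docs):
    """Return the documents that contain the query as a substring."""
    return [d for d in docs if query.lower() in d.lower()]


def semantic_search(query_vec, embeddings):
    """Return the document whose embedding is closest to the query vector."""
    return max(embeddings, key=lambda d: cosine(query_vec, embeddings[d]))


if __name__ == "__main__":
    # "authenticate" appears in neither snippet, so keyword search finds nothing.
    print(keyword_search("authenticate", TOY_EMBEDDINGS))
    # An assumed query vector for "authenticate" sits close to the login snippet.
    print(semantic_search([0.8, 0.2, 0.1], TOY_EMBEDDINGS))
```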

Documentation:

docs/source/user_guide/skills.rst (+222 lines): Comprehensive semantic vs. keyword search guide
docs/source/reference/yaml_schema.rst (+36 lines): Updated schema docs

3. Local Skill Execution

massgen/filesystem_manager/skills_manager.py: Refactored for local mode
massgen/skills/file-search/SKILL.md: Renamed from always/file_search
massgen/backend/claude_code.py: Ensures Claude Code (CC) uses execute_command instead of raw bash
Skills now run directly on host without Docker requirement (Docker still supported)
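
The difference between the two modes can be sketched as a choice of how a skill's command line is launched. This is a hypothetical illustration, not the logic in skills_manager.py: `build_argv`, `run_skill_command`, and the container name are assumptions, and the Docker branch simply prefixes the standard `docker exec` form.

```python
import subprocess
import sys
from typing import List


def build_argv(command: List[str], use_docker: bool = False,
               container: str = "hypothetical-container") -> List[str]:
    """Return the argv to launch: unchanged on the host, prefixed for Docker."""
    if use_docker:
        return ["docker", "exec", container, *command]
    return list(command)


def run_skill_command(command: List[str], use_docker: bool = False,
                      container: str = "hypothetical-container"):
    """Run the command and capture its output as text."""
    argv = build_argv(command, use_docker, container)
    return subprocess.run(argv, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Local mode: runs directly on the host, no container involved.
    result = run_skill_command([sys.executable, "-c", "print('ok')"])
    print(result.stdout.strip())
```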

4. Browser Automation Enhancement

massgen/tools/custom_tools/_browser_automation/browser_automation_tool.py (+39 lines)
- Screenshots saved as files within workspace
- Similar functionality to crawl4ai
- Enables persistent visual context for agents
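
A minimal sketch of the persistence idea: write the screenshot bytes under the workspace and hand back a workspace-relative path that an agent can reference on later turns. The function name, the `screenshots` subdirectory, and the filename pattern are assumptions for illustration, not the tool's actual behavior.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional, Union


def save_screenshot(png_bytes: bytes, workspace: Union[str, Path],
                    name: Optional[str] = None) -> Path:
    """Write image bytes into <workspace>/screenshots and return the relative path."""
    workspace = Path(workspace)
    target_dir = workspace / "screenshots"  # assumed layout
    target_dir.mkdir(parents=True, exist_ok=True)
    if name is None:
        # Timestamped name so repeated captures do not overwrite each other.
        name = datetime.now(timezone.utc).strftime("screenshot_%Y%m%dT%H%M%S_%f.png")
    path = target_dir / name
    path.write_bytes(png_bytes)
    return path.relative_to(workspace)


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        rel = save_screenshot(b"\x89PNG", tmp, "page.png")
        print(rel)
```

Returning a relative path keeps references valid if the workspace directory is mounted at a different location, for example inside a container.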

5. Other Improvements

Updated Docker README and Dockerfiles
Memory filesystem mode refactored to use direct file operations
Multiple example configs for new features
README updates for PyPI and main repo

Checklist

I have run pre-commit on my changed files and all checks pass
My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Pre-commit status

# Run pre-commit on all changed files:
git diff --name-only main...HEAD | xargs uv run pre-commit run --files

How to Test

Test 1: System Prompt Architecture

CLI Command:

# Run existing orchestration tests to verify system prompt generation
uv run pytest massgen/tests/test_orchestration_restart.py -v

Expected Results:

All tests pass
System prompts use new XML structure internally
No behavior changes from user perspective

Test 2: Semantic Search Skills

CLI Command:

# Test semtools skill (requires semtools installed: cargo install semtools)
uv run massgen massgen/configs/skills/test_semantic_skills.yaml

# Test with semantic search example
uv run massgen massgen/configs/skills/semantic_search_comprehensive.yaml

Expected Results:

Skills load successfully
Semantic search finds code by meaning, not just keywords
Document parsing works with API key configured

Test 3: Local Skill Execution

CLI Command:

# Test skills running locally without Docker
uv run massgen massgen/configs/skills/skills_local_mode.yaml

Expected Results:

Skills execute on host machine
No Docker container required
Same functionality as Docker mode

Test 4: Browser Automation with Image Persistence

CLI Command:

# Test browser automation with screenshot saving
uv run massgen massgen/configs/tools/custom_tools/multimodal_tools/playwright_with_img_understanding.yaml

Expected Results:

Screenshots saved as files in workspace
Agents can reference saved images
Visual context persists across turns

Additional context

Impact: This is a significant architectural improvement that:

Makes system prompts more maintainable and debuggable
Adds powerful semantic search capabilities beyond keyword matching
Simplifies skill deployment (no Docker required for basic use)
Improves visual context handling in browser automation

Stats: 26 files changed: +4,234 insertions, -1,122 deletions (net +3,112 lines)

Future work (separate PRs):

Allow custom tools to run within Docker (MAS-79)
MCPs in Docker for better isolation
Evaluate browser automation approach (custom tool vs. direct code execution)

ncrispino added 8 commits November 12, 2025 20:00

Clean up system prompts

73a4766

Refactoring system prompt sections seems done; also added semtools and serena for semantic search

2d26e39

Playwright browser automation saves imgs as file

b9269e9

Refactor skills and associated messaging; allow local mode

db25277

Ensure CC always uses our execute_command instead of bash; fix local skill execution

b4901e2

End cleanup of sysprompt and skill docs

58823d9

Fix minor pr problems

a238e43

Fix change in precommit

b6eae2e

Henry-811 changed the base branch from main to dev/v0.1.12 November 14, 2025 15:27

Henry-811 merged commit 5a08d6f into dev/v0.1.12 Nov 14, 2025
22 checks passed

Henry-811 deleted the improve_skills_memory branch November 14, 2025 15:29
