Deep Work

Maximum reasoning + multi-agent orchestration for any task.
"Don't think harder. Think deeper, in parallel, and verify."


Quick Start · How It Works · Comparison · Patterns · Quality Gate


The Problem

AI coding assistants often:

  • Rush to a solution without thinking deeply
  • Handle complex tasks sequentially when parts could run in parallel
  • Skip verification and deliver unchecked output
  • Use the same approach for a typo fix and an architecture migration

Deep Work fixes all four. It's a meta-orchestrator skill that automatically classifies task complexity, scales agent count to match, parallelizes independent work, enforces deep reasoning, and quality-gates every deliverable with a fresh-eyes critic.


Quick Start

Install (Claude Code)

# Option 1: Clone the repo
git clone https://github.com/Harsh9005/deep-work.git
cp deep-work/SKILL.md ~/.claude/skills/deep-work/SKILL.md

# Option 2: Direct download
mkdir -p ~/.claude/skills/deep-work
curl -o ~/.claude/skills/deep-work/SKILL.md \
  https://raw.githubusercontent.com/Harsh9005/deep-work/main/SKILL.md

Use It

In Claude Code, just say:

deep work: refactor the authentication module
deep mode: write a technical design doc for the new API
max mode: debug why the payment flow is failing in production

Deep Work activates automatically and orchestrates the full pipeline.
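
The trigger phrases above are plain prefixes, so detection is trivial. A minimal sketch of how an agent shell might recognize them (the function name and regex are illustrative, not part of the skill):

```python
import re

# Trigger prefixes taken from the examples above; matching is case-insensitive.
_TRIGGER = re.compile(r"^\s*(deep work|deep mode|max mode)\s*:\s*(.+)$", re.IGNORECASE)

def extract_task(message: str):
    """Return the task text if the message starts with a trigger phrase, else None."""
    m = _TRIGGER.match(message)
    return m.group(2).strip() if m else None
```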

Manual Integration

The skill is defined entirely in SKILL.md as a prompt-based orchestration framework. You can:

  • Use it directly as a system prompt for any Claude-based agent
  • Adapt the phase structure for your own multi-agent pipelines
  • Extract individual phases (e.g., just the Quality Gate) for simpler workflows

How It Works

flowchart TD
    A["Phase 0: Deep Analysis\n🧠 ultrathink"] --> B{"Complexity?"}
    B -->|LIGHT| D
    B -->|MEDIUM| C["Phase 1: Parallel Research\n🔍 2-4 Explore agents"]
    B -->|HEAVY| C
    B -->|WRITING| C
    C -->|HEAVY only| P["Phase 2: Architect\n📋 Plan agent"]
    C -->|MEDIUM/WRITING| D
    P --> D["Phase 3: Parallel Execution\n⚡ 2-5 Worker agents (opus)"]
    D --> E["Phase 4: Synthesis\n🔗 ultrathink merge"]
    E --> F["Phase 5: Quality Gate\n🔎 Fresh-eyes Critic"]
    F -->|PASS| G["Phase 6: Deliver ✅"]
    F -->|NEEDS_FIX| H["Fix Issues"]
    H --> F
    H -->|"Max 2 iterations"| G

    style A fill:#6C5CE7,color:#fff
    style C fill:#00B894,color:#fff
    style D fill:#0984E3,color:#fff
    style E fill:#6C5CE7,color:#fff
    style F fill:#E17055,color:#fff
    style G fill:#00B894,color:#fff
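
The PASS/NEEDS_FIX loop at the bottom of the diagram can be sketched as a plain control loop; `review` and `apply_fixes` are stand-in names for the critic and fix agents:

```python
# Sketch of the Phase 5 -> 6 control flow: the critic reviews, fixes are
# applied, and after at most 2 fix rounds the result is delivered regardless,
# with any remaining issues disclosed rather than hidden.

def quality_gate(output, review, apply_fixes, max_iterations=2):
    for _ in range(max_iterations):
        verdict, issues = review(output)
        if verdict == "PASS":
            return output, []
        output = apply_fixes(output, issues)
    # Final check; deliver with transparent disclosure if issues remain.
    verdict, issues = review(output)
    return output, ([] if verdict == "PASS" else issues)
```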

Adaptive Complexity Scaling

Not every task needs 10 agents. Deep Work classifies and adapts:

| Complexity | Signals | Total Agents | Phases Used |
|---|---|---|---|
| LIGHT | Single file, clear fix, < 3 steps | 2 | 0 → 3 → 5 → 6 |
| MEDIUM | Multi-file, some ambiguity, 3-7 steps | 5-6 | 0 → 1 → 3 → 4 → 5 → 6 |
| HEAVY | Architecture change, multi-system, 8+ steps | 8-10 | All 7 phases |
| WRITING | Documents, reports, manuscripts | 5-7 | 0 → 1 → 3 → 4 → 5 → 6 |
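
The tiers above can be read as a simple decision rule. A rough sketch (the signal names and thresholds are illustrative; the skill's actual classifier is prompt-based, not code):

```python
def classify(task_type: str, files_touched: int, steps: int, is_architectural: bool) -> str:
    """Map rough task signals to a complexity tier, following the table above."""
    if task_type == "writing":
        return "WRITING"
    if is_architectural or steps >= 8:
        return "HEAVY"
    if files_touched > 1 or steps >= 3:
        return "MEDIUM"
    return "LIGHT"
```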

Phase Breakdown

| Phase | What Happens | Model | Concurrency |
|---|---|---|---|
| 0. Deep Analysis | Classify task, assess complexity, decompose into subtasks | ultrathink | — |
| 1. Parallel Research | Gather context: code patterns, dependencies, test coverage, prior work | haiku | All agents parallel |
| 2. Architect | Design execution strategy, dependency map, risk assessment | opus | Single |
| 3. Parallel Execution | Workers execute subtasks with deep reasoning | opus + ultrathink | All independent workers parallel |
| 4. Synthesis | Merge outputs, resolve conflicts, fill gaps, verify completeness | ultrathink | — |
| 5. Quality Gate | Fresh-eyes critic reviews against original task | opus + ultrathink | Single |
| 6. Fix + Deliver | Address issues (max 2 iterations), deliver with summary | — | — |
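
The phase sequences from the two tables above can be expressed as data a dispatcher would walk (a sketch; phase numbers match the table):

```python
# Phase sequences per complexity tier; each number is a phase from the table.
PHASES = {
    "LIGHT":   [0, 3, 5, 6],
    "MEDIUM":  [0, 1, 3, 4, 5, 6],
    "HEAVY":   [0, 1, 2, 3, 4, 5, 6],
    "WRITING": [0, 1, 3, 4, 5, 6],
}

def pipeline(tier: str) -> list:
    """Return the ordered phases to run for a classified tier."""
    return PHASES[tier]
```

Note that only HEAVY includes Phase 2 (Architect); MEDIUM and WRITING share the same sequence and differ only in which agents Phase 3 spawns.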

How It Compares

| | Single Agent | CrewAI | AutoGen | LangGraph | Deep Work |
|---|---|---|---|---|---|
| Adaptive complexity | — | Fixed roles | Fixed roles | Manual graph | 4-tier auto-scaling |
| Quality gate | — | — | — | — | Mandatory fresh-eyes critic |
| Deep reasoning | Optional | — | — | — | Enforced at every phase |
| Zero config | Yes | Python setup | Python setup | Python setup | Copy 1 file, done |
| Parallel execution | — | Sequential | Round-robin | Configurable | Auto-parallel independent tasks |
| Fix iteration loop | — | — | — | Manual | Built-in (max 2 rounds) |
| Task decomposition | Manual | Manual | Manual | Manual | Automatic in Phase 0 |
| Works without code | N/A | No | No | No | Prompt-only, no dependencies |
| Error recovery | — | — | Retry | Configurable | Retry → absorb → downgrade |

Key differentiator: Deep Work is a prompt-only skill — no Python packages, no API wrappers, no infrastructure. It works with any Claude-based agent by defining the orchestration protocol in a single markdown file.


Task Patterns

Code β€” Feature Implementation

Worker 1: Implement core logic (main module)       ─┐
Worker 2: Write tests (test file)                   ├─ All parallel
Worker 3: Update configuration/types (if needed)   ─┘

Code β€” Refactoring

Worker 1: Refactor module A  ─┐
Worker 2: Refactor module B  ──── Parallel
                              │
Worker 3: Update imports     ─┘── Sequential (needs 1-2 output)

Writing β€” Document/Report

Worker 1: Write sections 1-3     ─┐
Worker 2: Write sections 4-6     ──── Parallel
                                  │
Worker 3: Fact-check all claims  ─┘── Sequential (needs draft)

Debug β€” Investigation

Worker 1: Reproduce + isolate
    ↓
Worker 2: Root cause analysis
    ↓
Worker 3: Implement + test fix

Analysis β€” Multi-Source

Worker 1: Analyze data source A  ─┐
Worker 2: Analyze data source B  ──── Parallel
                                  │
Worker 3: Cross-source synthesis ─┘── Sequential
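
All five patterns share one shape: independent workers fan out in parallel, then any dependent step consumes their combined outputs. A minimal sketch of that fan-out/fan-in using a thread pool (the worker callables are placeholders for agent spawns):

```python
from concurrent.futures import ThreadPoolExecutor

def run_pattern(parallel_workers, sequential_step):
    """Run independent workers concurrently, then feed all their results
    to the step that depends on them (e.g. a fact-checker or synthesizer)."""
    with ThreadPoolExecutor(max_workers=len(parallel_workers)) as pool:
        results = list(pool.map(lambda worker: worker(), parallel_workers))
    return sequential_step(results)
```

The fully sequential Debug pattern is the degenerate case: each stage is a one-worker "parallel" phase whose output seeds the next.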

Quality Gate

The Phase 5 critic has fresh eyes — it never saw intermediate work. It only sees the original task and the final output. This catches category errors that insiders miss.

Critic Output Format

{
  "verdict": "PASS | NEEDS_FIX",
  "confidence": "HIGH | MEDIUM | LOW",
  "strengths": ["what's done well"],
  "issues": [
    {
      "severity": "CRITICAL | MAJOR | MINOR",
      "location": "where in the output",
      "issue": "what's wrong",
      "fix": "specific fix recommendation"
    }
  ],
  "missing": ["anything the output should include but doesn't"],
  "overall_assessment": "1-2 sentence summary"
}
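
Because the critic's report is structured JSON, the orchestrator can validate it before acting on the verdict. A sketch of that check (the required keys come from the format above; `parse_critic` is an illustrative name):

```python
import json

# Top-level fields from the critic output format above.
REQUIRED = {"verdict", "confidence", "strengths", "issues", "missing", "overall_assessment"}

def parse_critic(raw: str) -> dict:
    """Parse the critic's JSON and enforce the schema; a ValueError here
    would trigger the 'retry with simplified prompt' path."""
    report = json.loads(raw)
    if not REQUIRED <= report.keys():
        raise ValueError(f"missing fields: {REQUIRED - report.keys()}")
    if report["verdict"] not in ("PASS", "NEEDS_FIX"):
        raise ValueError(f"bad verdict: {report['verdict']}")
    return report
```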

Review Checklist

Universal checks
  • Does the output fully address the original task? (completeness)
  • Are there any logical errors or inconsistencies? (correctness)
  • Is the quality at staff-engineer / senior-researcher level? (quality)
  • Would a domain expert find issues? (expertise check)
  • Is there unnecessary complexity that should be simplified? (elegance)

Code-specific checks
  • Edge cases not handled?
  • Potential bugs, race conditions, or security issues?
  • Follows codebase's existing patterns and conventions?
  • Tests comprehensive? Cover failure cases?
  • Will this break existing functionality?

Writing-specific checks
  • All claims supported by evidence?
  • Argumentation logically sound?
  • Gaps in coverage?
  • Tone and style appropriate and consistent?
  • Citations/references correct and complete?

Error Handling

| Scenario | Deep Work Response |
|---|---|
| Agent fails or returns garbage | Retry once with simplified prompt; if still fails, absorb into main context |
| Workers produce conflicting outputs | Ultrathink evaluates both, picks the better approach, documents decision |
| Critic finds CRITICAL issues after 2 rounds | Deliver with transparent disclosure — never hide problems |
| Task simpler than classified | Downgrade mid-execution, kill unnecessary agents |
| Task harder than classified | Upgrade, spawn additional agents as needed |
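
The first row's retry-then-absorb policy is just a two-attempt loop with an inline fallback. A sketch, where `spawn_agent` and `run_inline` are stand-ins for delegating to a subagent versus doing the work in the main context:

```python
def run_subtask(spawn_agent, run_inline, prompt, simplified_prompt):
    """Try the agent with the full prompt, retry once with a simplified one,
    then absorb the subtask into the main context rather than failing."""
    for attempt in (prompt, simplified_prompt):
        try:
            return spawn_agent(attempt)
        except RuntimeError:  # agent failed or returned garbage
            continue
    return run_inline(prompt)
```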

Design Decisions

Why ultrathink everywhere?

Deep thinking at Phase 0 prevents wasted agent spawns. At Phase 4 it catches merge conflicts. At Phase 5 it catches what workers missed. The cost of extended thinking is negligible compared to delivering wrong output.

Why a fresh-eyes critic?

The orchestrator and workers develop "tunnel vision." A critic that only sees the original task and final output catches category errors that insiders miss.

Why max 2 fix iterations?

Diminishing returns. If the critic still finds critical issues after 2 rounds, the problem is likely structural and needs human judgment, not more agent cycles.

Why adaptive complexity?

Spawning 5 agents for a typo fix is wasteful. Using 1 agent for an architecture migration is negligent. The complexity classifier ensures the right resources for the right task.

Why prompt-only (no code)?

Zero dependencies = zero friction. Works with any Claude-based agent. No package manager, no API wrappers, no infrastructure. Copy one file, done.


Agent Spawn Reference

LIGHT (2 agents)

Message 1: Worker (opus, ultrathink)
Message 2: Critic (opus, ultrathink)

MEDIUM (5-6 agents)

Message 1: 2 Explore agents ──── parallel
Message 2: 2-3 Workers ───────── parallel
Message 3: 1 Critic

HEAVY (8-10 agents)

Message 1: 3-4 Explore agents ── parallel
Message 2: 1 Architect
Message 3: 3-5 Workers ───────── parallel
Message 4: 1 Critic

WRITING (5-7 agents)

Message 1: 2 Explore agents ──── parallel
Message 2: 2 Writers ─────────── parallel
Message 3: 1 Fact-checker
Message 4: 1 Critic

Works With

  • Claude Code — native skill integration
  • Any Claude-based agent — use SKILL.md as a system prompt
  • Custom pipelines — extract and adapt individual phases
  • Other AI frameworks — the orchestration pattern is model-agnostic

Contributing

Contributions welcome! Ideas:

  • Additional task type patterns (DESIGN, DEVOPS, MIGRATION)
  • Benchmarks: Deep Work vs. single-agent completion rates
  • Integration templates for specific frameworks
  • Phase customization hooks
  • Translations (Chinese, Korean, Japanese)

License

MIT License


A prompt-only multi-agent orchestration framework. No dependencies. No infrastructure. Just intelligence.
