Closed
440 commits
e0cee08
Task 32: RelatedResourcesTask - Resources stage starting - 54% DONE
82deutschmark Oct 1, 2025
0040d83
Tasks 33-36: Team enrichment (FindTeam, ContractType, Background, Env…
82deutschmark Oct 1, 2025
6125214
Tasks 37-40: Resources stage complete (ReviewTeam, TeamMarkdown, SWOT…
82deutschmark Oct 1, 2025
4126b9f
Tasks 41-42: Documents stage (DataCollection, IdentifyDocuments) - 71…
82deutschmark Oct 1, 2025
6e8dd4b
Tasks 43-45: Documents stage (FilterToFind, FilterToCreate, DraftDocs…
82deutschmark Oct 1, 2025
7c757e3
Tasks 46-48: Documents + WBS (DraftDocsToCreate, DocsMarkdown, WBSLev…
82deutschmark Oct 1, 2025
7b5074a
Tasks 49-50: WBS (WBSLevel2, WBSProject12) - 85% DONE
82deutschmark Oct 1, 2025
13e8360
Tasks 51-52: Pitch (CreatePitch, ConvertPitchToMarkdown) - 88% DONE
82deutschmark Oct 1, 2025
aeadd41
Task 53: IdentifyTaskDependencies - 90% DONE
82deutschmark Oct 1, 2025
d5f1a4f
Task 54: EstimateTaskDurations (multi-chunk LLM task) - 91% DONE
82deutschmark Oct 1, 2025
f3d82be
Tasks 55-62: Complete Luigi database integration refactor - 100% DONE
82deutschmark Oct 1, 2025
b71ceda
docs: Add v0.3.0 changelog entry for Luigi database integration refac…
82deutschmark Oct 1, 2025
4506de9
fix: Use /tmp for Railway run directory instead of /app/run
82deutschmark Oct 1, 2025
ca1b03c
fix: Pass DATABASE_URL to Luigi subprocess for database writes
82deutschmark Oct 1, 2025
3ecd9f3
diagnostic: Add comprehensive logging to debug Luigi task execution f…
82deutschmark Oct 1, 2025
f29c0e1
docs: Add comprehensive diagnostic logs interpretation guide
82deutschmark Oct 1, 2025
4668ee9
diagnostic: Add task execution and luigi.build() exception logging
82deutschmark Oct 1, 2025
a398b41
docs: Add Luigi hang diagnostic guide for Scenario A-D identification
82deutschmark Oct 1, 2025
d2b061a
docs: Add comprehensive Railway Luigi hang debugging session document…
82deutschmark Oct 1, 2025
8f7f7d4
fix: Change Luigi workers=1 to workers=0 for Railway synchronous exec…
82deutschmark Oct 1, 2025
3def2fb
docs: Update Railway debug documentation with resolution
82deutschmark Oct 1, 2025
acc8e56
fix: Update diagnostic log messages to show workers=0
82deutschmark Oct 1, 2025
6019cff
Agents
82deutschmark Oct 1, 2025
c8dbfe4
Changing Workers to 1
82deutschmark Oct 1, 2025
eb48cb8
Create 01Oct-LuigiWorkers0-RootCause.md
82deutschmark Oct 1, 2025
a76305b
diagnostic: Add thread monitoring and timeout detection for Luigi hang
82deutschmark Oct 1, 2025
ba494fc
Update run_plan_pipeline.py
82deutschmark Oct 1, 2025
30ae9c5
Create LUIGI-WORKERS.md
82deutschmark Oct 1, 2025
13cc9d0
fix: Move threading imports before usage to avoid UnboundLocalError
82deutschmark Oct 2, 2025
75ed09f
Update util-types.ts
82deutschmark Oct 2, 2025
020a0e5
Update run_plan_pipeline.py
82deutschmark Oct 2, 2025
9e03e27
Update pipeline_execution_service.py
82deutschmark Oct 2, 2025
2bfd6da
CRITICAL FIX: Clean run directory before pipeline execution
82deutschmark Oct 2, 2025
e0d51d1
fix: Add database connection timeout to prevent Luigi worker deadlock
82deutschmark Oct 2, 2025
1fa3054
Update pipeline_execution_service.py
82deutschmark Oct 2, 2025
283e35a
Update pipeline_execution_service.py
82deutschmark Oct 2, 2025
aef3a06
Update run_plan_pipeline.py
82deutschmark Oct 2, 2025
1e21692
Update pipeline_execution_service.py
82deutschmark Oct 2, 2025
b817d62
Update run_plan_pipeline.py
82deutschmark Oct 2, 2025
b2ab360
Update app_text2plan.py
82deutschmark Oct 2, 2025
d4f928f
Create 2025-10-02-Windows-Unicode-Subprocess-Fix.md
82deutschmark Oct 2, 2025
845d97d
Update 01Oct-LuigiWorkers0-RootCause.md
82deutschmark Oct 2, 2025
f4ee15f
Create 2OctPostMortem.md
82deutschmark Oct 2, 2025
3011f61
Codex attempts some fixes
82deutschmark Oct 2, 2025
f323845
Create 2025-10-02-E2E-Env-Propagation-Runbook.md
82deutschmark Oct 2, 2025
2237152
Cleaning up some notes
82deutschmark Oct 2, 2025
32e25f0
Update pipeline_execution_service.py
82deutschmark Oct 2, 2025
b757c29
Create 2Oct-CRITICAL-SubprocessFix.md
82deutschmark Oct 2, 2025
edf7175
Update pipeline_execution_service.py
82deutschmark Oct 2, 2025
82e7790
Docker fixes
82deutschmark Oct 2, 2025
5358074
Docs
82deutschmark Oct 2, 2025
52db5f0
GPT-5 Tries some fixes
82deutschmark Oct 3, 2025
3ba3e9c
Update CHANGELOG.md
82deutschmark Oct 3, 2025
05ebbdd
Plans and Docs
82deutschmark Oct 3, 2025
7028c38
Docs
82deutschmark Oct 3, 2025
9cdce95
Create 02OctCodexPlan-ImplementationPlan-Cascade.md
82deutschmark Oct 3, 2025
e7b99e4
Add fallback report assembler endpoint
82deutschmark Oct 3, 2025
d8e1a04
Surface recovered reports in Files tab
82deutschmark Oct 3, 2025
e179a8b
Document fallback report recovery
82deutschmark Oct 3, 2025
f3dff25
Sort plans queue by newest first
82deutschmark Oct 3, 2025
18f052b
Update CHANGELOG.md
82deutschmark Oct 3, 2025
6ed9009
Update ReportTaskFallback.tsx
82deutschmark Oct 3, 2025
8a9c461
docs: audit and refresh for v0.3.2
82deutschmark Oct 3, 2025
c725f94
feat(frontend): add recovery workspace and sanitize file listings
82deutschmark Oct 3, 2025
58cccb4
Create 3Oct.md
82deutschmark Oct 3, 2025
52cf778
docs: note recovery workspace usage
82deutschmark Oct 3, 2025
56907d3
docs: high-level Workspace redesign plan and tasklist (3OctWorkspace.md)
82deutschmark Oct 3, 2025
6d27774
feat(frontend): make workspace access primary in plans queue
82deutschmark Oct 3, 2025
949ca83
feat: pivot workspace to database artefacts and streamlined UX
82deutschmark Oct 3, 2025
974bc22
feat: add artefact endpoint and document workspace progress
82deutschmark Oct 3, 2025
74d542f
Tweaks
82deutschmark Oct 4, 2025
4e7128a
Update page.tsx
82deutschmark Oct 4, 2025
71a8d6e
fix: ensure workspace navigation after plan creation and resolve Type…
82deutschmark Oct 4, 2025
ab63c2e
fix(workspace): repair corrupted recovery page with proper imports an…
82deutschmark Oct 4, 2025
aa392a7
Create page.tsx.bak
82deutschmark Oct 4, 2025
5c2eb04
fix(recovery): prevent deployment crashes from undefined artefacts an…
82deutschmark Oct 4, 2025
4c6fbb3
Fix: Prevent TypeError on empty/null stage values in artefact display
82deutschmark Oct 5, 2025
cddba94
Update PipelineDetails.tsx
82deutschmark Oct 5, 2025
b5e6520
Update api.py
82deutschmark Oct 5, 2025
a502528
Docs
82deutschmark Oct 15, 2025
8303d40
Update CHANGELOG.md
82deutschmark Oct 15, 2025
72b2a13
Dependencies
82deutschmark Oct 15, 2025
ba8ed6b
Update fastapi-client.ts
82deutschmark Oct 15, 2025
d6057d9
Update package-lock.json
82deutschmark Oct 15, 2025
7333574
Update favicon.svg
82deutschmark Oct 15, 2025
0f900a1
Update favicon.ico
82deutschmark Oct 15, 2025
ed4c54f
Update PlanForm.tsx
82deutschmark Oct 15, 2025
6a56182
Update tsconfig.json
82deutschmark Oct 15, 2025
96ff59e
Update CHANGELOG.md
82deutschmark Oct 15, 2025
008b6a3
Changes
82deutschmark Oct 15, 2025
d168fca
Harden recovery report handling in production
82deutschmark Oct 15, 2025
4d2a88c
Fix canonical report filename references
82deutschmark Oct 15, 2025
f0d6572
Merge branch 'ui' into codex/identify-gaps-in-project-deployment-9iejrv
82deutschmark Oct 15, 2025
51d0ac5
Merge pull request #4 from 82deutschmark/codex/identify-gaps-in-proje…
82deutschmark Oct 15, 2025
b1c537b
Merge pull request #3 from 82deutschmark/codex/identify-gaps-in-proje…
82deutschmark Oct 15, 2025
4ecf4b7
Docs
82deutschmark Oct 15, 2025
da886e2
Switch GPT-5 client to Responses API with schema registry
82deutschmark Oct 15, 2025
adbdd7b
Merge pull request #5 from 82deutschmark/codex/set-up-gpt-5-streaming…
82deutschmark Oct 15, 2025
524da89
Create 15OctFunctionCalling.md
82deutschmark Oct 15, 2025
4173171
Create 15OctStructuredOutputs.md
82deutschmark Oct 15, 2025
0e30e7a
Merge branch 'ui' of https://github.com/82deutschmark/PlanExe into ui
82deutschmark Oct 15, 2025
47559f4
feat: surface responses reasoning streams
82deutschmark Oct 16, 2025
a4e8367
Merge pull request #6 from 82deutschmark/codex/complete-migration-pro…
82deutschmark Oct 16, 2025
f85c50e
Docs
82deutschmark Oct 16, 2025
3e40ef8
Update AGENTS.md
82deutschmark Oct 16, 2025
cb3c9d9
docs: capture responses agent migration plan
82deutschmark Oct 16, 2025
1917e8b
Merge pull request #7 from 82deutschmark/codex/research-responses-api…
82deutschmark Oct 16, 2025
836a9d2
chore: harden lint workflow and update docs dates
82deutschmark Oct 16, 2025
01521db
Merge pull request #9 from 82deutschmark/codex/create-ui/ux-plan-with…
82deutschmark Oct 16, 2025
0c969ae
Fetch landing badge version from changelog
82deutschmark Oct 16, 2025
ab5ca94
Merge pull request #10 from 82deutschmark/codex/fix-main-page-changel…
82deutschmark Oct 16, 2025
ca77a93
Ensure OpenAI simple adapter streams all completions
82deutschmark Oct 16, 2025
319dcb4
Merge pull request #11 from 82deutschmark/codex/update-simpleopenaill…
82deutschmark Oct 16, 2025
56da54b
Centralize websocket URL construction
82deutschmark Oct 16, 2025
501d1f4
Merge pull request #12 from 82deutschmark/codex/add-websocket-url-hel…
82deutschmark Oct 16, 2025
2f80c96
Expose Responses usage telemetry in monitoring UI
82deutschmark Oct 16, 2025
2450275
Merge pull request #17 from 82deutschmark/codex/fix-implementation-of…
82deutschmark Oct 16, 2025
6d13cdd
Expose final Responses payloads in monitor
82deutschmark Oct 17, 2025
6dab50f
Merge branch 'ui' into codex/fix-implementation-of-responsesapi-m48pp1
82deutschmark Oct 17, 2025
b7afc7c
Merge pull request #19 from 82deutschmark/codex/fix-implementation-of…
82deutschmark Oct 17, 2025
4fe0ec1
Flatten recovery workspace layout
82deutschmark Oct 17, 2025
2a97d8b
Merge pull request #20 from 82deutschmark/codex/fix-ui-layout-of-reco…
82deutschmark Oct 17, 2025
b5ab9b9
Refine recovery workspace relaunch and preview
82deutschmark Oct 17, 2025
0c324df
Merge pull request #21 from 82deutschmark/codex/improve-ui-for-plan-r…
82deutschmark Oct 17, 2025
ad75d15
feat: add streaming analysis SSE pipeline and UI
82deutschmark Oct 17, 2025
5eba45b
Merge pull request #22 from 82deutschmark/codex/plan-integration-with…
82deutschmark Oct 17, 2025
2b91d9e
fix: Responses API migration build issues
82deutschmark Oct 17, 2025
7c46766
docs: add landing page redesign plan for conversation-first experience
82deutschmark Oct 17, 2025
5ceaae4
fix: Restructure landing page layout - prioritize form and queue\n\n-…
82deutschmark Oct 17, 2025
9623c97
Fix Responses API resource initialization
Oct 18, 2025
e8089c0
refactor: replace emoji log markers with [PIPELINE] prefix for better…
Oct 18, 2025
14f1b7a
docs: record Responses client hardening
Oct 18, 2025
f2af326
Cors fix
Oct 18, 2025
aa4ef02
Docs
Oct 18, 2025
17a95d2
Add conversation stream harness to fix import error
82deutschmark Oct 18, 2025
5edb466
Merge pull request #27 from 82deutschmark/codex/fix-syntaxerror-in-co…
82deutschmark Oct 18, 2025
3026624
docs: document streaming env flags for railway
82deutschmark Oct 18, 2025
83b2b82
Merge pull request #28 from 82deutschmark/codex/compare-staging-branc…
82deutschmark Oct 18, 2025
933ee15
Idiot Codex error with headers
Oct 18, 2025
ca706cc
Fix invalid TypeScript headers in Python modules
82deutschmark Oct 18, 2025
07d3883
Merge pull request #29 from 82deutschmark/codex/locate-python-files-w…
82deutschmark Oct 18, 2025
137eea6
Fix SSE ping stream payload formatting
82deutschmark Oct 18, 2025
bb2e341
Merge branch 'staging' into codex/locate-python-files-with-ts-headers…
82deutschmark Oct 18, 2025
292540a
Merge pull request #30 from 82deutschmark/codex/locate-python-files-w…
82deutschmark Oct 18, 2025
fba02bd
Implement conversation-first intake and fix pipeline bootstrap
82deutschmark Oct 18, 2025
92d8ba6
Merge pull request #31 from 82deutschmark/codex/redesign-conversation…
82deutschmark Oct 18, 2025
dc00512
Ensure streaming defaults align with frontend health badge
82deutschmark Oct 18, 2025
57e799d
Merge pull request #32 from 82deutschmark/codex/investigate-404-and-4…
82deutschmark Oct 18, 2025
77ca0d3
feat: wire Conversations API streaming for intake modal
82deutschmark Oct 18, 2025
49f62d0
Merge pull request #33 from 82deutschmark/codex/update-conversationmo…
82deutschmark Oct 18, 2025
182ca13
fix: remove sentinel marker from conversation service
82deutschmark Oct 18, 2025
28511ea
Merge branch 'staging' into codex/update-conversationmodal-and-api-in…
82deutschmark Oct 18, 2025
5df59ee
Conversation modal
82deutschmark Oct 18, 2025
f3e13b0
Relax conversation intake to rely on Responses API state
82deutschmark Oct 19, 2025
3dfd42e
Merge pull request #35 from 82deutschmark/codex/investigate-conversat…
82deutschmark Oct 19, 2025
19b804e
Improve intake conversation modal reliability
82deutschmark Oct 19, 2025
ac19c87
Merge pull request #36 from 82deutschmark/codex/fix-conversation-moda…
82deutschmark Oct 19, 2025
150626a
Ensure intake conversation restarts for each prompt submission
82deutschmark Oct 19, 2025
3628a9b
Merge pull request #37 from 82deutschmark/codex/revise-conversation-m…
82deutschmark Oct 19, 2025
2a0ebdf
Fix Luigi pipeline crashes by upgrading to OpenAI SDK v2.5.0 and fixi…
Oct 19, 2025
715f149
Fix OpenRouter key handoff to pipeline
82deutschmark Oct 19, 2025
23a4905
Merge pull request #39 from 82deutschmark/codex/debug-frontend-backen…
82deutschmark Oct 19, 2025
cfc84c7
Fix dev API base URL resolution for non-local hosts
82deutschmark Oct 19, 2025
5789ba9
Merge pull request #40 from 82deutschmark/codex/debug-frontend-backen…
82deutschmark Oct 19, 2025
8dd3590
chore: update permissions to allow web search, OpenAI docs access, an…
Oct 19, 2025
301bcd4
Merge branch 'staging' of https://github.com/82deutschmark/PlanExe in…
Oct 19, 2025
3662dc6
Fix conversation summary persistence locking
82deutschmark Oct 19, 2025
67e9a2d
Merge pull request #41 from 82deutschmark/codex/fix-conversation-hand…
82deutschmark Oct 19, 2025
803b8a6
chore: remove 8 unused llama-index provider packages, fix pip depende…
Oct 19, 2025
6e9678e
Update settings.local.json
Oct 19, 2025
e55e5eb
Update pyproject.toml
Oct 19, 2025
dc156a1
chore: MAJOR - eliminate unused llama-index meta-package, fix pip dep…
Oct 19, 2025
99dbb2c
Update analysis_stream_service.py
Oct 20, 2025
16da05d
Align intake conversation defaults with GPT-5 Mini
82deutschmark Oct 20, 2025
bba1c4f
Merge pull request #43 from 82deutschmark/codex/update-fallback-resol…
82deutschmark Oct 20, 2025
2fad894
Persist response ids across conversation turns
82deutschmark Oct 20, 2025
26d8b4d
Merge pull request #44 from 82deutschmark/codex/update-useresponsesco…
82deutschmark Oct 20, 2025
fe7977c
Fix dev API base URL detection
82deutschmark Oct 20, 2025
b9aa785
Merge branch 'staging' into codex/debug-frontend-backend-file-creatio…
82deutschmark Oct 20, 2025
6e2fb59
Merge pull request #45 from 82deutschmark/codex/debug-frontend-backen…
82deutschmark Oct 20, 2025
e5d65b5
feat: MAJOR - landing page redesign with conversation-first UX (v0.4.0)
Oct 20, 2025
b6174b2
Centralize Responses control defaults
82deutschmark Oct 20, 2025
2f61b29
Merge pull request #46 from 82deutschmark/codex/locate-max_output_tok…
82deutschmark Oct 20, 2025
67ab59f
Centralize streaming token defaults
82deutschmark Oct 20, 2025
1717795
Merge pull request #47 from 82deutschmark/codex/centralize-output-tok…
82deutschmark Oct 20, 2025
59d241f
Unify streaming token defaults under 120k ceiling
82deutschmark Oct 20, 2025
39b7ad7
Merge pull request #48 from 82deutschmark/codex/centralize-maximum-ou…
82deutschmark Oct 20, 2025
90c39d0
fix: CRITICAL - conversation modal API error and width issues
Oct 20, 2025
90b8db2
fix: ROOT CAUSE - correct Responses API content type in LLM base class
Oct 21, 2025
c18c922
fix: CRITICAL - conversation modal width tripled to show all elements
Oct 21, 2025
7ad8ccd
fix: MAJOR UX - make landing page 60% more info-dense and reduce scro…
Oct 21, 2025
3882c9b
Update settings.local.json
Oct 21, 2025
bb85172
Update ConversationModal.tsx
Oct 21, 2025
ea6bae9
Conversation Modal
Oct 21, 2025
ab49871
Align Responses conversation defaults with API enums
82deutschmark Oct 21, 2025
512db58
Merge pull request #49 from 82deutschmark/codex/investigate-responses…
82deutschmark Oct 21, 2025
ad9f297
Conversations and Responses Support
Oct 21, 2025
206e4c1
Normalize OpenAI input segments and refresh Responses docs
82deutschmark Oct 21, 2025
fe578de
Merge pull request #50 from 82deutschmark/codex/verify-supported-inpu…
82deutschmark Oct 21, 2025
4923d75
Fix conversation streaming to use Responses API directly
82deutschmark Oct 21, 2025
c507c0c
Merge pull request #51 from 82deutschmark/codex/find-openai-docs-for-…
82deutschmark Oct 21, 2025
49ba043
Sanitize structured schema names for Responses API
82deutschmark Oct 21, 2025
32cba77
Merge pull request #52 from 82deutschmark/codex/debug-identifypurpose…
82deutschmark Oct 21, 2025
dc92708
Update simple_openai_llm.py
Oct 21, 2025
441320d
Merge branch 'staging' of https://github.com/82deutschmark/PlanExe in…
Oct 21, 2025
3f9fd33
Update simple_openai_llm.py
Oct 21, 2025
ae4830c
docs: note recovery log panel addition
82deutschmark Oct 21, 2025
db75416
Merge pull request #53 from 82deutschmark/codex/remove-streaming-moda…
82deutschmark Oct 21, 2025
741690b
Update CHANGELOG.md
Oct 21, 2025
dfcfa0c
Update simple_openai_llm.py
Oct 21, 2025
d08a40c
Update AGENTS.md
Oct 21, 2025
ee562e5
Align Responses structured output requests
82deutschmark Oct 21, 2025
a833c31
Merge pull request #54 from 82deutschmark/codex/refactor-project-to-u…
82deutschmark Oct 21, 2025
7e345de
Refine Responses streaming structured output handling
82deutschmark Oct 21, 2025
a5580dd
Merge branch 'staging' into codex/refactor-project-to-use-responses-api
82deutschmark Oct 21, 2025
f0fae18
Merge pull request #55 from 82deutschmark/codex/refactor-project-to-u…
82deutschmark Oct 21, 2025
8cd6151
Fix Responses text format schema handling
Oct 21, 2025
45743dd
Update ConversationModal.tsx
Oct 21, 2025
4db1e88
Refine structured output schema handling
82deutschmark Oct 21, 2025
b68be72
Merge pull request #56 from 82deutschmark/codex/consolidate-schema-ha…
82deutschmark Oct 21, 2025
58defcd
Fix recovery fallbacks and persist pipeline logs
82deutschmark Oct 21, 2025
9ffeece
Merge pull request #57 from 82deutschmark/codex/investigate-log-loss-…
82deutschmark Oct 21, 2025
521223d
Fix Responses input normalization regression
82deutschmark Oct 21, 2025
edad013
Merge pull request #58 from 82deutschmark/codex/fix-regression-from-r…
82deutschmark Oct 21, 2025
2ebd5b2
feat: Implement enriched plan intake schema with Responses API struct…
Oct 21, 2025
703e684
feat: Complete enriched plan intake system integration (backend + API)
Oct 21, 2025
1419431
Docs
Oct 21, 2025
fa9a6ff
docs: Add implementation summary and strict mode compliance guide
Oct 21, 2025
a771006
Fix Responses schema enforcement for Redline Gate
Oct 21, 2025
ea93c29
Implement end-to-end enriched plan intake system with Luigi optimization
Oct 21, 2025
a8d485f
Fixes
Oct 21, 2025
a62840f
feat: redesign landing page conversation flow
82deutschmark Oct 21, 2025
8b04203
Merge pull request #60 from 82deutschmark/codex/redesign-landing-page…
82deutschmark Oct 21, 2025
5305068
Update AGENTS.md
Oct 21, 2025
be70a2e
Claude and Agents
Oct 21, 2025
f56096f
Fix landing fallback model keys
82deutschmark Oct 21, 2025
155f4b5
Merge pull request #61 from 82deutschmark/codex/update-fallback-model…
82deutschmark Oct 21, 2025
c48429a
Docs
Oct 22, 2025
9cbad77
Create track_activity.jsonl
Oct 22, 2025
d2cd4b4
Improve conversation modal input emphasis
82deutschmark Oct 22, 2025
d07da69
Merge pull request #62 from 82deutschmark/codex/update-conversation-m…
82deutschmark Oct 22, 2025
8fff4dc
Guard Responses SDK and refresh model metadata
Oct 22, 2025
f862621
Minor fixes
Oct 22, 2025
7f0ad3e
Fix IdentifyPotentialLevers chat history serialization
82deutschmark Oct 22, 2025
6c0371c
Merge pull request #63 from 82deutschmark/codex/diagnose-failing-luig…
82deutschmark Oct 22, 2025
0535716
chore: release 0.4.2 plan file metadata
Oct 22, 2025
136c356
Update identify_potential_levers.py
Oct 22, 2025
b722213
Merge branch 'main' into staging
82deutschmark Oct 22, 2025
56f1228
Fix lever assistant message serialisation
82deutschmark Oct 22, 2025
875498d
Merge pull request #64 from 82deutschmark/codex/convert-chatmessageco…
82deutschmark Oct 22, 2025
293 changes: 293 additions & 0 deletions .agents/README.md
@@ -0,0 +1,293 @@
# Custom Agents

Create specialized agent workflows that coordinate multiple AI agents to tackle complex engineering tasks. Instead of a single agent trying to handle everything, you can orchestrate teams of focused specialists that work together.

## Getting Started

1. **Edit an existing agent**: Start with `my-custom-agent.ts` and modify it for your needs
2. **Test your agent**: Run `codebuff --agent your-agent-name`
3. **Publish your agent**: Run `codebuff publish your-agent-name`

## Need Help?

- For examples, check the `examples/` directory.
- Join our [Discord community](https://codebuff.com/discord) and ask your questions!
- Check our [documentation](https://codebuff.com/docs) for more details.

# What is Codebuff?

Codebuff is an **open-source AI coding assistant** that edits your codebase through natural language instructions. Instead of using one model for everything, it coordinates specialized agents that work together to understand your project and make precise changes.

On [our evals](https://github.com/CodebuffAI/codebuff/tree/main/evals), Codebuff scores 61% versus 53% for Claude Code across 175+ coding tasks, drawn from multiple open-source repos, that simulate real-world work.

## How Codebuff Works

When you ask Codebuff to "add authentication to my API," it might invoke:

1. A **File Explorer Agent** to scan your codebase, understand the architecture, and find relevant files
2. A **Planner Agent** to plan which files need changes and in what order
3. An **Editor Agent** to make precise edits
4. A **Reviewer Agent** to validate changes

This multi-agent approach gives you better context understanding, more accurate edits, and fewer errors compared to single-model tools.

## Context Window Management

### Why Agent Workflows?

Modern software projects are complex ecosystems with thousands of files, multiple frameworks, intricate dependencies, and domain-specific requirements. A single AI agent trying to understand and modify such systems faces fundamental limitations—not just in knowledge, but in the sheer volume of information it can process at once.

### The Solution: Focused Context Windows

Agent workflows elegantly solve this by breaking large tasks into focused sub-problems. When working with large codebases (100k+ lines), each specialist agent receives only the narrow context it needs—a security agent sees only auth code, not UI components—keeping the context for each agent manageable while ensuring comprehensive coverage.
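
One way to picture this narrowing is a filter from agent to file paths. The sketch below is illustrative only: `contextRules`, `selectContext`, the agent ids, and the path prefixes are hypothetical names chosen for the example, not part of Codebuff.

```typescript
// Hypothetical mapping from a specialist agent to the path prefixes it may see.
const contextRules: Record<string, string[]> = {
  'security-agent': ['src/auth/', 'src/middleware/'],
  'ui-agent': ['src/components/'],
}

// Return only the files whose path starts with one of the agent's prefixes.
// An agent with no rules receives an empty context.
function selectContext(agentId: string, files: string[]): string[] {
  const prefixes = contextRules[agentId] ?? []
  return files.filter((file) => prefixes.some((prefix) => file.startsWith(prefix)))
}

const repoFiles = [
  'src/auth/login.ts',
  'src/components/Button.tsx',
  'src/middleware/session.ts',
  'README.md',
]

const securityContext = selectContext('security-agent', repoFiles)
console.log(securityContext)
```

Here the security agent receives the auth and middleware files and never sees the UI component, which is the point of the paragraph above: each specialist's context stays small even when the repository is large.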

### Why Not Just Mimic Human Roles?

This is about efficient AI context management, not recreating a human department. Simply creating a "frontend-developer" agent misses the point. AI agents don't have human constraints like context-switching or meetings. Their power comes from hyper-specialization, allowing them to process a narrow domain more deeply than a human could, then coordinating seamlessly with other specialists.

## Agent Workflows in Action

Here's an example of a `git-committer` agent that creates good commit messages:

```typescript
export default {
  id: 'git-committer',
  displayName: 'Git Committer',
  model: 'openai/gpt-5-nano',
  toolNames: ['read_files', 'run_terminal_command', 'end_turn'],

  instructionsPrompt:
    'You create meaningful git commits by analyzing changes, reading relevant files for context, and crafting clear commit messages that explain the "why" behind changes.',

  async *handleSteps() {
    // Analyze what changed
    yield { tool: 'run_terminal_command', command: 'git diff' }
    yield { tool: 'run_terminal_command', command: 'git log --oneline -5' }

    // Stage files and create commit with good message
    yield 'STEP_ALL'
  },
}
```

This agent systematically analyzes changes, reads relevant files for context, then creates commits with clear, meaningful messages that explain the "why" behind changes.

# Agent Development Guide

This guide covers everything you need to know about building custom Codebuff agents.

## Agent Structure

Each agent is a TypeScript file that exports an `AgentDefinition` object:

```typescript
export default {
  id: 'my-agent', // Unique identifier (lowercase, hyphens only)
  displayName: 'My Agent', // Human-readable name
  model: 'anthropic/claude-sonnet-4', // AI model to use (OpenRouter model ID)
  toolNames: ['read_files', 'write_file'], // Available tools
  instructionsPrompt: 'You are...', // Agent behavior instructions
  spawnerPrompt: 'Use this agent when...', // When others should spawn this
  spawnableAgents: ['helper-agent'], // Agents this can spawn

  // Optional: Programmatic control
  async *handleSteps() {
    yield { tool: 'read_files', paths: ['src/config.ts'] }
    yield 'STEP' // Let AI process and respond
  },
}
```

## Core Properties

### Required Fields

- **`id`**: Unique identifier using lowercase letters and hyphens only
- **`displayName`**: Human-readable name shown in UI
- **`model`**: AI model from OpenRouter (see [available models](https://openrouter.ai/models))
- **`instructionsPrompt`**: Detailed instructions defining the agent's role and behavior

### Optional Fields

- **`toolNames`**: Array of tools the agent can use (defaults to common tools)
- **`spawnerPrompt`**: Instructions for when other agents should spawn this one
- **`spawnableAgents`**: Array of agent names this agent can spawn
- **`handleSteps`**: Generator function for programmatic control
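
The field rules above can be checked before an agent is tested or published. The following is a minimal sketch, not Codebuff's own validation: `AgentDefinitionSketch` and `validateAgent` are hypothetical names, the interface only mirrors the fields listed in this section, and the `id` pattern is a literal reading of "lowercase letters and hyphens only".

```typescript
// Shape based only on the fields described above; the real AgentDefinition
// type in the project's types/ directory may differ.
interface AgentDefinitionSketch {
  id: string
  displayName: string
  model: string
  instructionsPrompt: string
  toolNames?: string[]
  spawnerPrompt?: string
  spawnableAgents?: string[]
}

// Hypothetical helper: returns a list of problems, empty when the
// definition satisfies the documented rules.
function validateAgent(def: AgentDefinitionSketch): string[] {
  const problems: string[] = []
  if (!/^[a-z-]+$/.test(def.id)) {
    problems.push(`id "${def.id}" must use lowercase letters and hyphens only`)
  }
  if (def.displayName.trim() === '') problems.push('displayName is required')
  if (def.model.trim() === '') problems.push('model is required')
  if (def.instructionsPrompt.trim() === '') problems.push('instructionsPrompt is required')
  return problems
}

const validProblems = validateAgent({
  id: 'git-committer',
  displayName: 'Git Committer',
  model: 'openai/gpt-5-nano',
  instructionsPrompt: 'You create meaningful git commits.',
})

const invalidProblems = validateAgent({
  id: 'My_Agent', // uppercase and underscore break the id rule
  displayName: '',
  model: 'openai/gpt-5-nano',
  instructionsPrompt: 'You are...',
})

console.log(validProblems, invalidProblems)
```

Returning a list of problems rather than throwing on the first one lets a caller report every issue in a definition at once.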

## Available Tools

### File Operations

- **`read_files`**: Read file contents
- **`write_file`**: Create or modify entire files
- **`str_replace`**: Make targeted string replacements
- **`code_search`**: Search for patterns across the codebase

### Execution

- **`run_terminal_command`**: Execute shell commands
- **`spawn_agents`**: Delegate tasks to other agents
- **`end_turn`**: Finish the agent's response

### Web & Research

- **`web_search`**: Search the internet for information
- **`read_docs`**: Read technical documentation
- **`browser_logs`**: Navigate and inspect web pages

See `types/tools.ts` for detailed parameter information.

## Programmatic Control

Use the `handleSteps` generator function to mix AI reasoning with programmatic logic:

```typescript
async *handleSteps() {
  // Execute a tool
  yield { tool: 'read_files', paths: ['package.json'] }

  // Let AI process results and respond
  yield 'STEP'

  // Conditional logic (`needsMoreAnalysis` is a placeholder for your own check)
  if (needsMoreAnalysis) {
    yield { tool: 'spawn_agents', agents: ['deep-analyzer'] }
    yield 'STEP_ALL' // Wait for spawned agents to complete
  }

  // Final AI response
  yield 'STEP'
}
```

### Control Commands

- **`'STEP'`**: Let AI process and respond once
- **`'STEP_ALL'`**: Let AI continue until completion
- **Tool calls**: `{ tool: 'tool_name', ...params }`
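
To make the three kinds of yielded value concrete, here is a small sketch of how a runtime might tell them apart. It is not Codebuff's runtime: `traceSteps` is a hypothetical helper that only records what each yield would trigger, the `Step` type is inferred from the examples in this guide, and a synchronous generator stands in for the `async *handleSteps()` form to keep the example short.

```typescript
// A yielded value is either a tool call object or one of the two control strings.
type ToolCall = { tool: string; [param: string]: unknown }
type Step = ToolCall | 'STEP' | 'STEP_ALL'

// Hypothetical driver: walks the generator and records what a runtime would
// do at each yield. It does not call any model or tool.
function traceSteps(steps: Iterable<Step>): string[] {
  const trace: string[] = []
  for (const step of steps) {
    if (step === 'STEP') {
      trace.push('model: one turn')
    } else if (step === 'STEP_ALL') {
      trace.push('model: run until completion')
    } else {
      trace.push(`tool: ${step.tool}`)
    }
  }
  return trace
}

function* handleSteps(): Generator<Step, void, unknown> {
  yield { tool: 'read_files', paths: ['package.json'] }
  yield 'STEP'
  yield 'STEP_ALL'
}

const trace = traceSteps(handleSteps())
console.log(trace)
```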

## Model Selection

Choose models based on your agent's needs:

- **`anthropic/claude-sonnet-4`**: Best for complex reasoning and code generation
- **`openai/gpt-5`**: Strong general-purpose capabilities
- **`x-ai/grok-4-fast`**: Fast and cost-effective for simple or medium-complexity tasks

**Any model on OpenRouter**: Unlike Claude Code, which locks you into Anthropic's models, Codebuff supports any model available on [OpenRouter](https://openrouter.ai/models), from Claude and GPT to specialized models like Qwen, DeepSeek, and others. Switch models for different tasks or use the latest releases without waiting for platform updates.

See [OpenRouter](https://openrouter.ai/models) for all available models and pricing.
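
If several agents share the same selection logic, the guidance above can be kept in one place. This is only a sketch: the `Complexity` categories and the `pickModel` helper are hypothetical, and the mapping simply restates the three suggestions listed in this section.

```typescript
// Hypothetical task categories for choosing a model.
type Complexity = 'simple' | 'general' | 'complex'

// Assumed mapping, taken from the suggestions above; adjust to taste.
const modelFor: Record<Complexity, string> = {
  simple: 'x-ai/grok-4-fast',
  general: 'openai/gpt-5',
  complex: 'anthropic/claude-sonnet-4',
}

function pickModel(complexity: Complexity): string {
  return modelFor[complexity]
}

const reviewerModel = pickModel('complex')
const formatterModel = pickModel('simple')
console.log(reviewerModel, formatterModel)
```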

## Agent Coordination

Agents can spawn other agents to create sophisticated workflows:

```typescript
// Parent agent spawns specialists
async *handleSteps() {
  yield {
    tool: 'spawn_agents',
    agents: ['security-scanner', 'performance-analyzer', 'code-reviewer'],
  }
  yield 'STEP_ALL' // Wait for all to complete

  // Synthesize results
  yield 'STEP'
}
```
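
A parent that spawns specialists also declares them in `spawnableAgents`. The sketch below assumes a parent may only spawn agents it has listed there, which the field description suggests but this guide does not spell out. `undeclaredSpawns` and the `release-checker` agent are hypothetical, not part of Codebuff.

```typescript
// Hypothetical pre-flight check: list the agent ids a spawn request names
// that the parent has not declared in spawnableAgents.
function undeclaredSpawns(spawnableAgents: string[], requested: string[]): string[] {
  const allowed = new Set(spawnableAgents)
  return requested.filter((id) => !allowed.has(id))
}

const parent = {
  id: 'release-checker', // hypothetical parent agent
  spawnableAgents: ['security-scanner', 'performance-analyzer'],
}

const missing = undeclaredSpawns(parent.spawnableAgents, [
  'security-scanner',
  'performance-analyzer',
  'code-reviewer',
])
console.log(missing)
```

Here `code-reviewer` is reported because the parent spawns it without declaring it; adding it to `spawnableAgents` would make the list empty.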

**Reuse any published agent**: Compose existing [published agents](https://www.codebuff.com/store) to get a leg up. Codebuff agents are the new MCP!

## Best Practices

### Instructions

- Be specific about the agent's role and expertise
- Include examples of good outputs
- Specify when the agent should ask for clarification
- Define the agent's limitations

### Tool Usage

- Start with file exploration tools (`read_files`, `code_search`)
- Use `str_replace` for targeted edits, `write_file` for major changes
- Always use `end_turn` to finish responses cleanly

### Error Handling

- Include error checking in programmatic flows
- Provide fallback strategies for failed operations
- Log important decisions for debugging

### Performance

- Choose appropriate models for the task complexity
- Minimize unnecessary tool calls
- Use spawnable agents for parallel processing

## Testing Your Agent

1. **Local Testing**: `codebuff --agent your-agent-name`
2. **Debug Mode**: Add logging to your `handleSteps` function
3. **Unit Testing**: Test individual functions in isolation
4. **Integration Testing**: Test agent coordination workflows
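
Because `handleSteps` is a generator, it can be unit tested without a model by feeding it canned tool results and recording what it yields. The sketch below assumes the runtime passes a tool's output back into the generator as a string; that shape is an assumption, as are the `collect` helper and the simplified synchronous generator.

```typescript
type ToolCall = { tool: string; [param: string]: unknown }
type Step = ToolCall | 'STEP' | 'STEP_ALL'
type StepGenerator = Generator<Step, void, string | undefined>

// Agent logic under test: stop early when there is nothing to commit.
function* handleSteps(): StepGenerator {
  const diff = yield { tool: 'run_terminal_command', command: 'git diff' }
  if (!diff) return // empty diff, nothing to do
  yield 'STEP_ALL'
}

// Hypothetical test helper: drives the generator, answering each tool call
// with the next canned result and each control string with undefined.
function collect(gen: StepGenerator, toolResults: string[]): Step[] {
  const seen: Step[] = []
  let used = 0
  let next = gen.next()
  while (!next.done) {
    const step = next.value
    seen.push(step)
    const reply = typeof step === 'string' ? undefined : toolResults[used++]
    next = gen.next(reply)
  }
  return seen
}

const withChanges = collect(handleSteps(), ['diff --git a/app.ts b/app.ts'])
const withoutChanges = collect(handleSteps(), [''])
console.log(withChanges.length, withoutChanges.length)
```

With a non-empty diff the agent goes on to `'STEP_ALL'`; with an empty diff it stops after the first tool call, so both branches are covered without any network access.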

## Publishing & Sharing

1. **Validate**: Ensure your agent works across different codebases
2. **Document**: Include clear usage instructions
3. **Publish**: `codebuff publish your-agent-name`
4. **Maintain**: Update as models and tools evolve

## Advanced Patterns

### Conditional Workflows

```typescript
async *handleSteps() {
  // Assumes the tool result is returned from `yield` as a string
  const config = yield { tool: 'read_files', paths: ['config.json'] }
  yield 'STEP'

  if (config.includes('typescript')) {
    yield { tool: 'spawn_agents', agents: ['typescript-expert'] }
  } else {
    yield { tool: 'spawn_agents', agents: ['javascript-expert'] }
  }
  yield 'STEP_ALL'
}
```

### Iterative Refinement

```typescript
async *handleSteps() {
  for (let attempt = 0; attempt < 3; attempt++) {
    yield { tool: 'run_terminal_command', command: 'npm test' }
    yield 'STEP'

    // `allTestsPass` is a placeholder for your own check of the test output
    if (allTestsPass) break

    yield { tool: 'spawn_agents', agents: ['test-fixer'] }
    yield 'STEP_ALL'
  }
}
```
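
The loop above leaves open how the pass/fail check is computed. One possibility, sketched here under stated assumptions, is to derive it from the command output handed back by `yield`. The output format (a leading `PASS`), the `fixUntilGreen` generator, and the `simulate` driver are all hypothetical, and a synchronous generator is used so the example runs without a runtime.

```typescript
type ToolCall = { tool: string; [param: string]: unknown }
type Step = ToolCall | 'STEP' | 'STEP_ALL'

// Retry up to maxAttempts times; returns true once the tests pass.
function* fixUntilGreen(maxAttempts: number): Generator<Step, boolean, string | undefined> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Assumption: the runtime passes the command output back as a string.
    const output = yield { tool: 'run_terminal_command', command: 'npm test' }
    if (output !== undefined && output.startsWith('PASS')) return true

    yield { tool: 'spawn_agents', agents: ['test-fixer'] }
    yield 'STEP_ALL'
  }
  return false
}

// Hypothetical driver: answers each test run with the next canned output.
function simulate(outputs: string[]): { passed: boolean; testRuns: number } {
  const gen = fixUntilGreen(3)
  let testRuns = 0
  let result = gen.next()
  while (!result.done) {
    const step = result.value
    if (typeof step !== 'string' && step.tool === 'run_terminal_command') {
      result = gen.next(outputs[testRuns++])
    } else {
      result = gen.next(undefined)
    }
  }
  return { passed: result.value === true, testRuns }
}

const fixedOnSecondTry = simulate(['FAIL: 2 tests', 'PASS'])
const neverFixed = simulate(['FAIL', 'FAIL', 'FAIL'])
console.log(fixedOnSecondTry, neverFixed)
```

The bound on attempts matters: without it, a fixer agent that never succeeds would loop indefinitely.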

## Why Choose Codebuff for Custom Agents

**Deep customizability**: Create sophisticated agent workflows with TypeScript generators that mix AI generation with programmatic control. Define custom agents that spawn subagents, implement conditional logic, and orchestrate complex multi-step processes that adapt to your specific use cases.

**Fully customizable SDK**: Build Codebuff's capabilities directly into your applications with a complete TypeScript SDK. Create custom tools, integrate with your CI/CD pipeline, build AI-powered development environments, or embed intelligent coding assistance into your products.

Learn more about the SDK [here](https://www.npmjs.com/package/@codebuff/sdk).

## Community & Support

- **Discord**: [Join our community](https://codebuff.com/discord) for help and inspiration
- **Examples**: Study the `examples/` directory for patterns
- **Documentation**: [codebuff.com/docs](https://codebuff.com/docs); see `types/` for detailed type information
- **Issues**: [Report bugs and request features on GitHub](https://github.com/CodebuffAI/codebuff/issues)
- **Support**: [support@codebuff.com](mailto:support@codebuff.com)

Happy agent building! 🤖