From 274c391ee23d368a248461d97329143d2d6fb963 Mon Sep 17 00:00:00 2001
From: Brian Casel <brian@briancasel.com>
Date: Fri, 10 Oct 2025 11:10:19 -0400
Subject: [PATCH 1/8] Updated instructions and default standards to reduce
 excessive test writing and test running during feature development to
 improve speed and token usage.

---
 CHANGELOG.md                                  |   4 +
 config.yml                                    |   2 +-
 profiles/default/agents/templates/verifier.md |   6 +-
 .../default/standards/testing/coverage.md     |   7 -
 .../default/standards/testing/test-writing.md |   9 ++
 .../default/standards/testing/unit-tests.md   |   4 -
 .../implementation/implement-task.md          |   6 +-
 .../implementer-responsibilities.md           |   2 +-
 .../verifier-responsibilities.md              |   8 +-
 .../specification/create-tasks-list.md        | 121 +++++++++---------
 .../workflows/specification/verify-spec.md    |  28 +++-
 11 files changed, 110 insertions(+), 87 deletions(-)
 delete mode 100644 profiles/default/standards/testing/coverage.md
 create mode 100644 profiles/default/standards/testing/test-writing.md

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5358ae6c..8f39ef2f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,10 @@ All notable changes to Agent OS will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [2.0.3] - 2025-10-10
+
+- Updated instructions and default standards to reduce excessive tests writing and test running during feature development to improve speed and token useage.
+
 ## [2.0.2] - 2025-10-09
 
 - Clarified /create-spec command so that task list creation doesn't begin until spec.md has been written.
diff --git a/config.yml b/config.yml
index 989cf525..49568ce3 100644
--- a/config.yml
+++ b/config.yml
@@ -1,4 +1,4 @@
-version: 2.0.0
+version: 2.0.3
 base_install: true
 
 # ================================================
diff --git a/profiles/default/agents/templates/verifier.md b/profiles/default/agents/templates/verifier.md
index 342450ad..b402cd90 100644
--- a/profiles/default/agents/templates/verifier.md
+++ b/profiles/default/agents/templates/verifier.md
@@ -48,11 +48,11 @@ Read the following files to understand the user's standards and preferences so t
 
 {{verifier_standards}}
 
-### Step 4: Run the tests that were written for the tasks under your verification purview
+### Step 4: Run ONLY the tests that were written by the implementer of the tasks under your verification purview
 
-IF the tasks under your verification purview involved writing of tests, then run ONLY those specific tests and note how many are passing and failing.
+IF the implementer of the tasks under your verification purview wrote tests that cover this implementation, then run ONLY those specific tests and note how many are passing and failing. Do NOT run the entire app's test suite.
 
-If any are failing then note the failures, but DO NOT try to implement fixes.
+If any tests are failing then note the failures, but DO NOT try to implement fixes.
 
 ### Step 5: (if applicable) view and screenshot the implemented features in a browser
 
diff --git a/profiles/default/standards/testing/coverage.md b/profiles/default/standards/testing/coverage.md
deleted file mode 100644
index 233dc3bf..00000000
--- a/profiles/default/standards/testing/coverage.md
+++ /dev/null
@@ -1,7 +0,0 @@
-## Test coverage best practices
-
-- **Set Minimum Thresholds**: Establish baseline coverage requirements (e.g., 80% overall) but prioritize meaningful tests over percentages
-- **Prioritize Critical Paths**: Aim for higher coverage (90%+) on business logic, authentication, and payment flows
-- **Track Coverage Trends**: Monitor coverage over time to catch regressions and ensure new code is tested
-- **Quality Over Quantity**: 100% coverage doesn't guarantee bug-free code; focus on testing behavior and edge cases
-- **Exclude Appropriately**: Don't count generated code, test files, or configuration files against coverage metrics
diff --git a/profiles/default/standards/testing/test-writing.md b/profiles/default/standards/testing/test-writing.md
new file mode 100644
index 00000000..a57de5ed
--- /dev/null
+++ b/profiles/default/standards/testing/test-writing.md
@@ -0,0 +1,9 @@
+## Test coverage best practices
+
+- **Write Minimal Tests During Development**: Do NOT write tests for every change or intermediate step. Focus on completing the feature implementation first, then add strategic tests only at logical completion points
+- **Test Only Core User Flows**: Write tests exclusively for critical paths and primary user workflows. Skip writing tests for non-critical utilities and secondary workflows until you are instructed to do so.
+- **Defer Edge Case Testing**: Do NOT test edge cases, error states, or validation logic unless they are business-critical. These can be addressed in dedicated testing phases, not during feature development.
+- **Test Behavior, Not Implementation**: Focus tests on what the code does, not how it does it, to reduce brittleness
+- **Clear Test Names**: Use descriptive names that explain what's being tested and the expected outcome
+- **Mock External Dependencies**: Isolate units by mocking databases, APIs, file systems, and other external services
+- **Fast Execution**: Keep unit tests fast (milliseconds) so developers run them frequently during development
diff --git a/profiles/default/standards/testing/unit-tests.md b/profiles/default/standards/testing/unit-tests.md
index fa9da7b7..cea74632 100644
--- a/profiles/default/standards/testing/unit-tests.md
+++ b/profiles/default/standards/testing/unit-tests.md
@@ -2,9 +2,5 @@
 
 - **Test Behavior, Not Implementation**: Focus tests on what the code does, not how it does it, to reduce brittleness
 - **Clear Test Names**: Use descriptive names that explain what's being tested and the expected outcome
-- **Independent Tests**: Each test should run independently without relying on execution order or shared state
-- **Test Edge Cases**: Include boundary conditions, empty inputs, null values, and error scenarios
 - **Mock External Dependencies**: Isolate units by mocking databases, APIs, file systems, and other external services
 - **Fast Execution**: Keep unit tests fast (milliseconds) so developers run them frequently during development
-- **One Concept Per Test**: Test one behavior or scenario per test to make failures easy to diagnose
-- **Maintain Test Code Quality**: Apply the same code quality standards to tests as to production code
diff --git a/profiles/default/workflows/implementation/implement-task.md b/profiles/default/workflows/implementation/implement-task.md
index a0c85011..b57fbeec 100644
--- a/profiles/default/workflows/implementation/implement-task.md
+++ b/profiles/default/workflows/implementation/implement-task.md
@@ -6,6 +6,6 @@ Guide your implementation using:
 - **The existing patterns** that you've found and analyzed.
 - **User Standards & Preferences** which are defined below.
 
-Self-verify and test your work:
-- IF your tasks direct you to write tests, ensure all of the tests you've written pass.
-- Double-check, test, or view the elements you've implemented to verify they are all present and in working order before reporting on your implementation.
+Self-verify and test your work by:
+- Running ONLY the tests you've written (if any) and ensuring those tests pass.
+- IF your task involves user-facing UI, and IF you have access to browser testing tools, open a browser and use the feature you've implemented as if you are a user to ensure a user can use the feature in the intended way.
diff --git a/profiles/default/workflows/implementation/implementer-responsibilities.md b/profiles/default/workflows/implementation/implementer-responsibilities.md
index 37cffe18..f073a782 100644
--- a/profiles/default/workflows/implementation/implementer-responsibilities.md
+++ b/profiles/default/workflows/implementation/implementer-responsibilities.md
@@ -1,5 +1,5 @@
 1. **Analyze YOUR assigned task:** Take note of the specific task and sub-tasks that have been assigned to your role.  Do NOT implement task(s) that are assigned to other roles.
 2. **Search for existing patterns:** Find and state patterns in the codebase and user standards to follow in your implementation.
-3. **Implement accoding to requirements & standards:** Implement your tasks by following your provided tasks, spec and ensuring alignment with "User's Standards & Preferences Compliance" and self-test and verify your own work.
+3. **Implement according to requirements & standards:** Implement your tasks by following your provided tasks, spec and ensuring alignment with "User's Standards & Preferences Compliance".
 4. **Update tasks.md with your tasks status:** Mark the task and sub-tasks in `tasks.md` that you've implemented as complete by updating their checkboxes to `- [x]`
 5. **Document your implementation:** Create your implementation report in this spec's `implementation` folder detailing the work you've implemented.
diff --git a/profiles/default/workflows/implementation/verifier-responsibilities.md b/profiles/default/workflows/implementation/verifier-responsibilities.md
index 1b40144d..cea3e1ea 100644
--- a/profiles/default/workflows/implementation/verifier-responsibilities.md
+++ b/profiles/default/workflows/implementation/verifier-responsibilities.md
@@ -1,8 +1,8 @@
 1. **Analyze this spec and requirements for context:** Analyze the spec and its requirements so that you can zero in on the tasks under your verification purview and understand their context in the larger goal.
 2. **Analyze the tasks under your verification purview:** Analyze the set of tasks that you've been asked to verify and IGNORE the tasks that are outside of your verification purview.
 3. **Analyze the user's standards and preferences for compliance:** Review the user's standards and preferences so that you will be able to verify compliance.
-4. **Run the tests that were written for the tasks under your verification purview:** Verify how many are passing and failing.
-5. **(if applicable) view the implementation in a browser:** If your verification purview involves UI implementations, open a browser to view, verify and take screenshots.
+4. **Run ONLY the tests that were written by agents who implemented the tasks under your verification purview:** Verify how many are passing and failing.
+5. **(if applicable) view the implementation in a browser:** If your verification purview involves UI implementations, open a browser to view, verify and take screenshots and store screenshot(s) in `agent-os/specs/[this-spec]/verification/screenshots`.
 6. **Verify tasks.md status has been updated:** Verify and ensure that the tasks in `tasks.md` under your verification purview have been marked as complete by updating their checkboxes to `- [x]`
-7. **Verify that implementations have been documented:** Verify that the implementer agent(s) have documented their work in this spec's `implementation` folder.
-8. **Document your verification report:** Write your verification report in this spec's `verification` folder.
+7. **Verify that implementations have been documented:** Verify that the implementer agent(s) have documented their work in this spec's `agent-os/specs/[this-spec]/implementation` folder.
+8. **Document your verification report:** Write your verification report in this spec's `agent-os/specs/[this-spec]/verification` folder.
diff --git a/profiles/default/workflows/specification/create-tasks-list.md b/profiles/default/workflows/specification/create-tasks-list.md
index 4b0d50ec..46e82bf6 100644
--- a/profiles/default/workflows/specification/create-tasks-list.md
+++ b/profiles/default/workflows/specification/create-tasks-list.md
@@ -42,11 +42,10 @@ Assigned roles: [list from registry]
 **Dependencies:** None
 
 - [ ] 1.0 Complete database layer
-  - [ ] 1.1 Write tests for [Model] functionality
-    - Model validation tests
-    - Association tests
-    - Method behavior tests
-    - Migration tests
+  - [ ] 1.1 Write 2-8 focused tests for [Model] functionality
+    - Limit to 2-8 highly focused tests maximum
+    - Test only critical model behaviors (e.g., primary validation, key association, core method)
+    - Skip exhaustive coverage of all methods and edge cases
   - [ ] 1.2 Create [Model] with validations
     - Fields: [list]
     - Validations: [list]
@@ -57,13 +56,13 @@ Assigned roles: [list from registry]
   - [ ] 1.4 Set up associations
     - [Model] has_many [related]
     - [Model] belongs_to [parent]
-  - [ ] 1.5 Ensure all database layer tests pass
-    - Run model tests written in 1.1
+  - [ ] 1.5 Ensure database layer tests pass
+    - Run ONLY the 2-8 tests written in 1.1
     - Verify migrations run successfully
-    - Confirm associations work correctly
+    - Do NOT run the entire test suite at this stage
 
 **Acceptance Criteria:**
-- All tests written in 1.1 pass
+- The 2-8 tests written in 1.1 pass
 - Models pass validation tests
 - Migrations run successfully
 - Associations work correctly
@@ -75,11 +74,10 @@ Assigned roles: [list from registry]
 **Dependencies:** Task Group 1
 
 - [ ] 2.0 Complete API layer
-  - [ ] 2.1 Write tests for API endpoints
-    - Controller action tests (index, show, create, update, destroy)
-    - Authentication/authorization tests
-    - Request/response format tests
-    - Error handling tests
+  - [ ] 2.1 Write 2-8 focused tests for API endpoints
+    - Limit to 2-8 highly focused tests maximum
+    - Test only critical controller actions (e.g., primary CRUD operation, auth check, key error case)
+    - Skip exhaustive testing of all actions and scenarios
   - [ ] 2.2 Create [resource] controller
     - Actions: index, show, create, update, destroy
     - Follow pattern from: [existing controller]
@@ -90,13 +88,13 @@ Assigned roles: [list from registry]
     - JSON responses
     - Error handling
     - Status codes
-  - [ ] 2.5 Ensure all API layer tests pass
-    - Run controller tests written in 2.1
-    - Verify all CRUD operations work
-    - Confirm proper authorization enforced
+  - [ ] 2.5 Ensure API layer tests pass
+    - Run ONLY the 2-8 tests written in 2.1
+    - Verify critical CRUD operations work
+    - Do NOT run the entire test suite at this stage
 
 **Acceptance Criteria:**
-- All tests written in 2.1 pass
+- The 2-8 tests written in 2.1 pass
 - All CRUD operations work
 - Proper authorization enforced
 - Consistent response format
@@ -108,12 +106,10 @@ Assigned roles: [list from registry]
 **Dependencies:** Task Group 2
 
 - [ ] 3.0 Complete UI components
-  - [ ] 3.1 Write tests for UI components
-    - Component rendering tests
-    - Form validation tests
-    - User interaction tests
-    - Responsive design tests
-    - Accessibility tests
+  - [ ] 3.1 Write 2-8 focused tests for UI components
+    - Limit to 2-8 highly focused tests maximum
+    - Test only critical component behaviors (e.g., primary user interaction, key form submission, main rendering case)
+    - Skip exhaustive testing of all component states and interactions
   - [ ] 3.2 Create [Component] component
     - Reuse: [existing component] as base
     - Props: [list]
@@ -137,53 +133,50 @@ Assigned roles: [list from registry]
     - Hover states
     - Transitions
     - Loading states
-  - [ ] 3.8 Ensure all UI component tests pass
-    - Run component tests written in 3.1
-    - Verify components render correctly
-    - Confirm forms validate and submit properly
+  - [ ] 3.8 Ensure UI component tests pass
+    - Run ONLY the 2-8 tests written in 3.1
+    - Verify critical component behaviors work
+    - Do NOT run the entire test suite at this stage
 
 **Acceptance Criteria:**
-- All tests written in 3.1 pass
+- The 2-8 tests written in 3.1 pass
 - Components render correctly
 - Forms validate and submit
 - Matches visual design
 
 ### Testing
 
-#### Task Group 4: End-to-End Testing & Validation
+#### Task Group 4: Test Review & Gap Analysis
 **Assigned implementer:** testing-engineer
 **Dependencies:** Task Groups 1-3
 
-- [ ] 4.0 Complete end-to-end test coverage
-  - [ ] 4.1 Write end-to-end integration tests
-    - Full user workflow tests
-    - Cross-layer integration tests
-    - API-to-UI data flow tests
-    - Error scenario tests
-  - [ ] 4.2 Create performance tests
-    - Load testing for API endpoints
-    - Frontend performance tests
-    - Database query optimization tests
-  - [ ] 4.3 Implement accessibility tests
-    - Screen reader compatibility
-    - Keyboard navigation tests
-    - WCAG compliance tests
-  - [ ] 4.4 Add browser compatibility tests
-    - Cross-browser testing
-    - Mobile device testing
-    - Responsive design validation
-  - [ ] 4.5 Validate all feature tests pass
-    - Run all tests from Task Groups 1-3
-    - Run new end-to-end tests from 4.1-4.4
-    - Ensure 100% test coverage for new feature
-    - Verify all edge cases are covered
+- [ ] 4.0 Review existing tests and fill critical gaps only
+  - [ ] 4.1 Review tests from Task Groups 1-3
+    - Review the 2-8 tests written by database-engineer (Task 1.1)
+    - Review the 2-8 tests written by api-engineer (Task 2.1)
+    - Review the 2-8 tests written by ui-designer (Task 3.1)
+    - Total existing tests: approximately 6-24 tests
+  - [ ] 4.2 Analyze test coverage gaps for THIS feature only
+    - Identify critical user workflows that lack test coverage
+    - Focus ONLY on gaps related to this spec's feature requirements
+    - Do NOT assess entire application test coverage
+    - Prioritize end-to-end workflows over unit test gaps
+  - [ ] 4.3 Write up to 10 additional strategic tests maximum
+    - Add maximum of 10 new tests to fill identified critical gaps
+    - Focus on integration points and end-to-end workflows
+    - Do NOT write comprehensive coverage for all scenarios
+    - Skip edge cases, performance tests, and accessibility tests unless business-critical
+  - [ ] 4.4 Run feature-specific tests only
+    - Run ONLY tests related to this spec's feature (tests from 1.1, 2.1, 3.1, and 4.3)
+    - Expected total: approximately 16-34 tests maximum
+    - Do NOT run the entire application test suite
+    - Verify critical workflows pass
 
 **Acceptance Criteria:**
-- All tests from previous task groups pass
-- End-to-end user workflows work correctly
-- 100% test coverage for new feature
-- Performance meets requirements
-- Accessibility standards met
+- All feature-specific tests pass (approximately 16-34 tests total)
+- Critical user workflows for this feature are covered
+- No more than 10 additional tests added by testing-engineer
+- Testing focused exclusively on this spec's feature requirements
 
 ## Execution Order
 
@@ -191,7 +184,7 @@ Recommended implementation sequence:
 1. Database Layer (Task Group 1)
 2. API Layer (Task Group 2)
 3. Frontend Design (Task Group 3)
-4. End-to-End Testing & Validation (Task Group 4)
+4. Test Review & Gap Analysis (Task Group 4)
 ```
 
 **Note**: Adapt this structure based on the actual feature requirements. Some features may need:
@@ -205,6 +198,12 @@ Recommended implementation sequence:
 - **Base implementer assignments** on only the available implementers present in the list in implementers.yml.
 - **Create tasks that are specific and verifiable**
 - **Group related tasks** for efficient specialists implementer assignment
-- **Use a test-driven development approach** where each task group starts with writing tests (x.1 sub-task) and ends with ensuring those tests pass (final sub-task).
+- **Limit test writing during development**:
+  - Each task group (1-3) should write 2-8 focused tests maximum
+  - Tests should cover only critical behaviors, not exhaustive coverage
+  - Test verification should run ONLY the newly written tests, not the entire suite
+  - The testing-engineer's task group should only add a maximum of 10 additional tests IF NECESSARY to fill critical gaps
+  - Total expected tests per feature: approximately 16-34 tests maximum
+- **Use a focused test-driven approach** where each task group starts with writing 2-8 tests (x.1 sub-task) and ends with running ONLY those tests (final sub-task)
 - **Include acceptance criteria** for each task group
 - **Reference visual assets** if visuals are available
diff --git a/profiles/default/workflows/specification/verify-spec.md b/profiles/default/workflows/specification/verify-spec.md
index eb56acdf..f4c567cf 100644
--- a/profiles/default/workflows/specification/verify-spec.md
+++ b/profiles/default/workflows/specification/verify-spec.md
@@ -6,7 +6,7 @@
 2. **Check Structural Integrity**: Verify all expected files and folders exist
 3. **Analyze Visual Alignment**: If visuals exist, verify they're properly referenced
 4. **Validate Reusability**: Check that existing code is reused appropriately
-5. **Verify TDD Approach**: Ensure tasks follow test-first development
+5. **Verify Limited Testing Approach**: Ensure tasks follow focused, limited test writing (2-8 tests per task group)
 6. **Document Findings**: Create verification report
 
 ## Workflow
@@ -84,6 +84,11 @@ Look for these issues:
 
 #### Check 6: Task List Detailed Validation
 Read `agent-os/specs/[this-spec]/tasks.md` and check each task group's tasks:
+1. **Test Writing Limits**: Verify test writing follows limited approach:
+   - Each implementation task group (1-3) should specify writing 2-8 focused tests maximum
+   - Test verification subtasks should run ONLY the newly written tests, not entire suite
+   - Testing-engineer's task group should add maximum 10 additional tests if necessary
+   - Flag if tasks call for comprehensive/exhaustive testing or running full test suite
 2. **Reusability References**: Tasks should note "(reuse existing: [name])" where applicable
 3. **Specificity**: Each task must reference a specific feature/component
 4. **Traceability**: Each task must trace back to requirements
@@ -110,7 +115,7 @@ Create `agent-os/specs/[this-spec]/verification/spec-verification.md` with the f
 - Date: [Current date]
 - Spec: [Spec name]
 - Reusability Check: ✅ Passed / ⚠️ Concerns / ❌ Failed
-- TDD Compliance: ✅ Passed / ⚠️ Partial / ❌ Failed
+- Test Writing Limits: ✅ Compliant / ⚠️ Partial / ❌ Excessive Testing
 
 ## Structural Verification (Checks 1-2)
 
@@ -163,6 +168,17 @@ Create `agent-os/specs/[this-spec]/verification/spec-verification.md` with the f
 
 ### Check 6: Task List Issues
 
+**Test Writing Limits:**
+- ✅ Task Group 1 specifies 2-8 focused tests
+- ❌ Task Group 2 calls for "comprehensive test coverage" (violates limits)
+- ⚠️ Task Group 3 doesn't specify test limits
+- ❌ Testing-engineer group plans 25 additional tests (exceeds 10 max)
+- ❌ Tasks call for running entire test suite (should run only new tests)
+[OR if compliant:]
+- ✅ All task groups specify 2-8 focused tests maximum
+- ✅ Test verification limited to newly written tests only
+- ✅ Testing-engineer adds maximum 10 tests
+
 **Reusability References:**
 - ❌ Task 3.2 doesn't mention reusing existing form partial
 - ❌ Task 4.3 recreates validation that exists in UserValidator
@@ -208,6 +224,8 @@ Create `agent-os/specs/[this-spec]/verification/spec-verification.md` with the f
 1. Creating new components instead of reusing: FormField, DataTable
 2. Audit logging system not requested
 3. Complex state management for simple form
+4. Excessive test coverage planned (e.g., 50+ tests when 16-34 is appropriate)
+5. Comprehensive test suite requirements violating focused testing approach
 
 ## Recommendations
 1. Update spec to reuse existing form components
@@ -230,15 +248,17 @@ Specification verification complete!
 ✅ Verified requirements accuracy
 ✅ Checked structural integrity
 ✅ Validated specification alignment
+✅ Verified test writing limits (2-8 tests per task group, ~16-34 total)
 [If visuals] ✅ Analyzed [X] visual assets
 ⚠️ Reusability check: [Y issues found]
 
 [If passed]
-All specifications accurately reflect requirements and properly leverage existing code
+All specifications accurately reflect requirements, follow limited testing approach, and properly leverage existing code
 
 [If issues found]
 ⚠️ Found [X] issues requiring attention:
 - [Number] reusability issues
+- [Number] test writing limit violations
 - [Number] critical issues
 - [Number] minor issues
 - [Number] over-engineering concerns
@@ -250,6 +270,8 @@ See agent-os/specs/[this-spec]/verification/spec-verification.md for full detail
 
 - Compare user's raw answers against requirements.md exactly
 - Check for reusability opportunities and verify that they're documented but DO NOT search and explore the codebase yourself.
+- Verify test writing limits strictly: Flag any tasks that call for comprehensive testing, exhaustive coverage, or running full test suites
+- Expected test counts: Implementation task groups should write 2-8 tests each, testing-engineer adds maximum 10, total ~16-34 tests per feature
 - Don't add new requirements or specifications
 - Focus on alignment and accuracy, not style
 - Be specific about any issues found

From d66414fbfe834580bf20e04ec1bf672aa6584b9f Mon Sep 17 00:00:00 2001
From: Brian Casel <brian@briancasel.com>
Date: Fri, 10 Oct 2025 11:17:24 -0400
Subject: [PATCH 2/8] Replaced hard-coding of 'opus' model setting on agents
 with 'inherit' so that it inherits whichever model your Claude Code is
 currently using.

---
 CHANGELOG.md                                               | 3 +++
 profiles/default/agents/implementation-verifier.md         | 2 +-
 profiles/default/agents/product-planner.md                 | 2 +-
 profiles/default/agents/specification/spec-researcher.md   | 2 +-
 profiles/default/agents/specification/spec-writer.md       | 2 +-
 .../default/agents/specification/tasks-list-creator.md     | 2 +-
 scripts/create-role.sh                                     | 7 ++++++-
 7 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8f39ef2f..1a804da9 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -8,6 +8,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [2.0.3] - 2025-10-10
 
 - Updated instructions and default standards to reduce excessive tests writing and test running during feature development to improve speed and token useage.
+- For Claude Code users:
+  - Replaced hard-coding of 'opus' model setting on agents with 'inherit' so that it inherits whichever model your Claude Code is currently using.
+  - Updated create-role script to add the "Inherit" option when creating new agents.
 
 ## [2.0.2] - 2025-10-09
 
diff --git a/profiles/default/agents/implementation-verifier.md b/profiles/default/agents/implementation-verifier.md
index 2058bcdd..4e49de33 100644
--- a/profiles/default/agents/implementation-verifier.md
+++ b/profiles/default/agents/implementation-verifier.md
@@ -3,7 +3,7 @@ name: implementation-verifier
 description: Verify the end-to-end implementation of a spec
 tools: Write, Read, Bash, WebFetch, Playwright
 color: green
-model: opus
+model: inherit
 ---
 
 You are a product spec verifier responsible for verifying the end-to-end implementation of a spec, updating the product roadmap (if necessary), and producing a final verification report.
diff --git a/profiles/default/agents/product-planner.md b/profiles/default/agents/product-planner.md
index 8b61e4ec..b555cad3 100644
--- a/profiles/default/agents/product-planner.md
+++ b/profiles/default/agents/product-planner.md
@@ -3,7 +3,7 @@ name: product-planner
 description: Create product documentation including mission, and roadmap
 tools: Write, Read, Bash, WebFetch
 color: cyan
-model: opus
+model: inherit
 ---
 
 You are a product planning specialist. Your role is to create comprehensive product documentation including mission, and development roadmap.
diff --git a/profiles/default/agents/specification/spec-researcher.md b/profiles/default/agents/specification/spec-researcher.md
index d05e9f63..f31f665c 100644
--- a/profiles/default/agents/specification/spec-researcher.md
+++ b/profiles/default/agents/specification/spec-researcher.md
@@ -3,7 +3,7 @@ name: spec-researcher
 description: Gather detailed requirements through targeted questions and visual analysis
 tools: Write, Read, Bash, WebFetch
 color: blue
-model: opus
+model: inherit
 ---
 
 You are a software product requirements research specialist. Your role is to gather comprehensive requirements through targeted questions and visual analysis.
diff --git a/profiles/default/agents/specification/spec-writer.md b/profiles/default/agents/specification/spec-writer.md
index 003aad01..6cf591fb 100644
--- a/profiles/default/agents/specification/spec-writer.md
+++ b/profiles/default/agents/specification/spec-writer.md
@@ -3,7 +3,7 @@ name: spec-writer
 description: Create a detailed specification document for development
 tools: Write, Read, Bash, WebFetch
 color: purple
-model: opus
+model: inherit
 ---
 
 You are a software product specifications writer. Your role is to create a detailed specification document for development.
diff --git a/profiles/default/agents/specification/tasks-list-creator.md b/profiles/default/agents/specification/tasks-list-creator.md
index 27941982..293186cd 100644
--- a/profiles/default/agents/specification/tasks-list-creator.md
+++ b/profiles/default/agents/specification/tasks-list-creator.md
@@ -3,7 +3,7 @@ name: task-list-creator
 description: Create a detailed and strategic tasks list for development of a spec
 tools: Write, Read, Bash, WebFetch
 color: orange
-model: opus
+model: inherit
 ---
 
 You are a software product tasks list writer and planner. Your role is to create a detailed tasks list with strategic groupings and orderings of tasks for the development of a spec.
diff --git a/scripts/create-role.sh b/scripts/create-role.sh
index 0a60b941..bd6766ab 100755
--- a/scripts/create-role.sh
+++ b/scripts/create-role.sh
@@ -265,9 +265,10 @@ select_model() {
     echo ""
     echo "  1) Sonnet"
     echo "  2) Opus"
+    echo "  3) Inherit your current Claude Code model setting"
     echo ""
 
-    read -p "$(echo -e "${BLUE}Enter selection (1-2): ${NC}")" selection
+    read -p "$(echo -e "${BLUE}Enter selection (1-3): ${NC}")" selection
 
     case $selection in
         1)
@@ -278,6 +279,10 @@ select_model() {
             ROLE_MODEL="opus"
             print_success "Selected model: Opus"
             ;;
+        3)
+            ROLE_MODEL="inherit"
+            print_success "Selected model: Inherit from Claude Code"
+            ;;
         *)
             print_error "Invalid selection"
             exit 1

From cd1fe553a3c4d065f3328755637fd1e0c31adcba Mon Sep 17 00:00:00 2001
From: Brian Casel <brian@briancasel.com>
Date: Fri, 10 Oct 2025 11:20:32 -0400
Subject: [PATCH 3/8] consolidated test-writing standards

---
 profiles/default/standards/testing/unit-tests.md | 6 ------
 1 file changed, 6 deletions(-)
 delete mode 100644 profiles/default/standards/testing/unit-tests.md

diff --git a/profiles/default/standards/testing/unit-tests.md b/profiles/default/standards/testing/unit-tests.md
deleted file mode 100644
index cea74632..00000000
--- a/profiles/default/standards/testing/unit-tests.md
+++ /dev/null
@@ -1,6 +0,0 @@
-## Unit testing best practices
-
-- **Test Behavior, Not Implementation**: Focus tests on what the code does, not how it does it, to reduce brittleness
-- **Clear Test Names**: Use descriptive names that explain what's being tested and the expected outcome
-- **Mock External Dependencies**: Isolate units by mocking databases, APIs, file systems, and other external services
-- **Fast Execution**: Keep unit tests fast (milliseconds) so developers run them frequently during development

From 54a919c0e3f4e56ab71a563191fbe69d3de4956f Mon Sep 17 00:00:00 2001
From: Brian Casel <brian@briancasel.com>
Date: Fri, 10 Oct 2025 11:25:37 -0400
Subject: [PATCH 4/8] updated implementer roles from opus to inherit

---
 profiles/default/roles/implementers.yml | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/profiles/default/roles/implementers.yml b/profiles/default/roles/implementers.yml
index b2f37038..a75845a3 100644
--- a/profiles/default/roles/implementers.yml
+++ b/profiles/default/roles/implementers.yml
@@ -3,7 +3,7 @@ implementers:
     description: Handles migrations, models, schemas, database queries
     your_role: You are a database engineer. Your role is to implement database migrations, models, schemas, and database queries.
     tools: Write, Read, Bash, WebFetch
-    model: opus
+    model: inherit
     color: orange
     areas_of_responsibility:
       - Create database migrations
@@ -29,7 +29,7 @@ implementers:
     description: Handles API endpoints, controllers, business logic, request/response handling
     your_role: You are an API engineer. Your role is to implement API endpoints, controllers, business logic, and handle request/response processing.
     tools: Write, Read, Bash, WebFetch
-    model: opus
+    model: inherit
     color: blue
     areas_of_responsibility:
       - Create API endpoints
@@ -55,7 +55,7 @@ implementers:
     description: Handles UI components, views, layouts, styling, responsive design
     your_role: You are a UI designer. Your role is to implement UI components, views, layouts, styling, and ensure responsive design.
     tools: Write, Read, Bash, WebFetch, Playwright
-    model: opus
+    model: inherit
     color: purple
     areas_of_responsibility:
       - Create UI components
@@ -81,7 +81,7 @@ implementers:
     description: Handles test files, test coverage, test fixtures
     your_role: You are a testing engineer. Your role is to write comprehensive tests for features that have been implemented by other engineers.
     tools: Write, Read, Bash, WebFetch
-    model: opus
+    model: inherit
     color: green
     areas_of_responsibility:
       - Write unit tests

From ab36a0b095f0e02a6df9ff9501933bfe91da0996 Mon Sep 17 00:00:00 2001
From: Brian Casel <brian@briancasel.com>
Date: Fri, 10 Oct 2025 11:29:22 -0400
Subject: [PATCH 5/8] clarified output when updating roles files in a project

---
 scripts/project-install.sh | 2 +-
 scripts/project-update.sh  | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/scripts/project-install.sh b/scripts/project-install.sh
index a3784bdb..a5ca4829 100755
--- a/scripts/project-install.sh
+++ b/scripts/project-install.sh
@@ -218,7 +218,7 @@ install_roles() {
 
     if [[ "$DRY_RUN" != "true" ]]; then
         if [[ $roles_count -gt 0 ]]; then
-            echo "✓ Installed $roles_count roles in agent-os/roles"
+            echo "✓ Installed $roles_count files in agent-os/roles"
         fi
     fi
 }
diff --git a/scripts/project-update.sh b/scripts/project-update.sh
index 04ea9ecb..58131012 100755
--- a/scripts/project-update.sh
+++ b/scripts/project-update.sh
@@ -412,10 +412,10 @@ update_roles() {
 
     if [[ "$DRY_RUN" != "true" ]]; then
         if [[ $roles_new -gt 0 ]]; then
-            echo "✓ Added $roles_new roles in agent-os/roles"
+            echo "✓ Added $roles_new files in agent-os/roles"
         fi
         if [[ $roles_updated -gt 0 ]]; then
-            echo "✓ Updated $roles_updated roles in agent-os/roles"
+            echo "✓ Updated $roles_updated files in agent-os/roles"
         fi
         if [[ $roles_skipped -gt 0 ]]; then
             echo -e "${YELLOW}$roles_skipped files in agent-os/roles were not updated and overwritten.${NC}"

From cbbbd054e2d979e3b3ffaa31669e58112f0cc7dd Mon Sep 17 00:00:00 2001
From: Brian Casel <brian@briancasel.com>
Date: Fri, 10 Oct 2025 11:41:05 -0400
Subject: [PATCH 6/8] injected global standards into single-agent-mode
 plan-product command.

---
 .../commands/plan-product/single-agent/1-plan-product.md    | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/profiles/default/commands/plan-product/single-agent/1-plan-product.md b/profiles/default/commands/plan-product/single-agent/1-plan-product.md
index 4e20d62e..921de120 100644
--- a/profiles/default/commands/plan-product/single-agent/1-plan-product.md
+++ b/profiles/default/commands/plan-product/single-agent/1-plan-product.md
@@ -5,3 +5,9 @@ The FIRST STEP is to confirm the product details by following these instructions
 {{workflows/planning/gather-product-info}}
 
 Then WAIT for me to give you specific instructions on how to use the information you've gathered to create the mission and roadmap.
+
+## User Standards & Preferences Compliance
+
+When planning the product's tech stack, mission statement and roadmap, use the user's standards and preferences for context and baseline assumptions, as documented in these files:
+
+{{standards/global/*}}

From fcefda01648542558b347191fdfa58a24280c125 Mon Sep 17 00:00:00 2001
From: Brian Casel <brian@briancasel.com>
Date: Fri, 10 Oct 2025 11:56:57 -0400
Subject: [PATCH 7/8] clarifying next command to run when in single-agent-mode

---
 .../create-spec/single-agent/2-create-tasks-list.md  |  4 +++-
 .../create-spec/single-agent/3-verify-spec.md        |  2 +-
 .../commands/new-spec/single-agent/1-new-spec.md     | 10 ++++++++++
 .../new-spec/single-agent/2-research-spec.md         | 10 ++++++++++
 .../plan-product/single-agent/1-plan-product.md      | 10 ++++++++++
 .../plan-product/single-agent/2-create-mission.md    | 12 ++++++++++++
 .../plan-product/single-agent/3-create-roadmap.md    | 12 ++++++++++++
 .../plan-product/single-agent/4-create-tech-stack.md | 12 ++++++++++++
 8 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/profiles/default/commands/create-spec/single-agent/2-create-tasks-list.md b/profiles/default/commands/create-spec/single-agent/2-create-tasks-list.md
index 992552be..1fd19527 100644
--- a/profiles/default/commands/create-spec/single-agent/2-create-tasks-list.md
+++ b/profiles/default/commands/create-spec/single-agent/2-create-tasks-list.md
@@ -11,7 +11,9 @@ The tasks list has created at `agent-os/specs/[this-spec]/tasks.md`.
 
 Review it closely to make sure it all looks good.
 
-Next step: Run the command, 3-verify-spec.md to closely verify your spec and tasks list for accuracy and alignment.  Or you can skip straight to running the implement-spec.md command if you're ready.
+Next step: Run the command, `3-verify-spec.md`.
+
+Or if you want, you can skip straight to running the `implement-spec.md` command.
 ```
 
 ## User Standards & Preferences Compliance
diff --git a/profiles/default/commands/create-spec/single-agent/3-verify-spec.md b/profiles/default/commands/create-spec/single-agent/3-verify-spec.md
index 8ce4131a..a27a2578 100644
--- a/profiles/default/commands/create-spec/single-agent/3-verify-spec.md
+++ b/profiles/default/commands/create-spec/single-agent/3-verify-spec.md
@@ -13,7 +13,7 @@ Your spec verification report is ready at `agent-os/specs/[this-spec]/verificati
 
 Review it closely to make sure it all looks good.
 
-Next step: Run the command, implement-spec.md to generate prompts for implementation.
+Next step: Run the command, `implement-spec.md`, to generate prompts for implementation.
 ```
 
 ## User Standards & Preferences Compliance
diff --git a/profiles/default/commands/new-spec/single-agent/1-new-spec.md b/profiles/default/commands/new-spec/single-agent/1-new-spec.md
index 0a4ef72b..5a58ee76 100644
--- a/profiles/default/commands/new-spec/single-agent/1-new-spec.md
+++ b/profiles/default/commands/new-spec/single-agent/1-new-spec.md
@@ -3,3 +3,13 @@ This begins a multi-step process for planning a new spec for our next product in
 The FIRST STEP is to initialize the spec by following these instructions:
 
 {{workflows/specification/initialize-spec}}
+
+## Display confirmation and next step
+
+Once you've initialized the spec folder, output the following message (replace `[this-spec]` with the folder name for this spec):
+
+```
+✅ I have initialized the spec folder at `agent-os/specs/[this-spec]`.
+
+Next step: Run the command, `2-research-spec.md`.
+```
diff --git a/profiles/default/commands/new-spec/single-agent/2-research-spec.md b/profiles/default/commands/new-spec/single-agent/2-research-spec.md
index d8b05540..2e1670f4 100644
--- a/profiles/default/commands/new-spec/single-agent/2-research-spec.md
+++ b/profiles/default/commands/new-spec/single-agent/2-research-spec.md
@@ -4,6 +4,16 @@ Follow these instructions for researching this spec's requirements:
 
 {{workflows/specification/research-spec}}
 
+## Display confirmation and next step
+
+Once you've completed your research and documented it, output the following message:
+
+```
+✅ I have documented this spec's research and requirements in `agent-os/specs/[this-spec]/planning`.
+
+Next step: Run the command, `1-create-spec.md`.
+```
+
 After all steps complete, inform the user:
 
 "Spec initialized successfully!
diff --git a/profiles/default/commands/plan-product/single-agent/1-plan-product.md b/profiles/default/commands/plan-product/single-agent/1-plan-product.md
index 921de120..7bcca797 100644
--- a/profiles/default/commands/plan-product/single-agent/1-plan-product.md
+++ b/profiles/default/commands/plan-product/single-agent/1-plan-product.md
@@ -6,6 +6,16 @@ The FIRST STEP is to confirm the product details by following these instructions
 
 Then WAIT for me to give you specific instructions on how to use the information you've gathered to create the mission and roadmap.
 
+## Display confirmation and next step
+
+Once you've gathered all of the necessary information, output the following message:
+
+```
+I have all the info I need to help you plan this product.
+
+Next step: Run the command, `2-create-mission.md`
+```
+
 ## User Standards & Preferences Compliance
 
 When planning the product's tech stack, mission statement and roadmap, use the user's standards and preferences for context and baseline assumptions, as documented in these files:
diff --git a/profiles/default/commands/plan-product/single-agent/2-create-mission.md b/profiles/default/commands/plan-product/single-agent/2-create-mission.md
index bdea0e16..6684b028 100644
--- a/profiles/default/commands/plan-product/single-agent/2-create-mission.md
+++ b/profiles/default/commands/plan-product/single-agent/2-create-mission.md
@@ -2,6 +2,18 @@ Now that you've gathered information about this product, use that info to create
 
 {{workflows/planning/create-product-mission}}
 
+## Display confirmation and next step
+
+Once you've created mission.md, output the following message:
+
+```
+✅ I have documented the product mission at `agent-os/product/mission.md`.
+
+Review it to ensure it matches your vision and strategic goals for this product.
+
+Next step: Run the command, `3-create-roadmap.md`
+```
+
 ## User Standards & Preferences Compliance
 
 IMPORTANT: Ensure the product mission is ALIGNED and DOES NOT CONFLICT with the user's preferences and standards as detailed in the following files:
diff --git a/profiles/default/commands/plan-product/single-agent/3-create-roadmap.md b/profiles/default/commands/plan-product/single-agent/3-create-roadmap.md
index 7d4e5eb5..26d2f119 100644
--- a/profiles/default/commands/plan-product/single-agent/3-create-roadmap.md
+++ b/profiles/default/commands/plan-product/single-agent/3-create-roadmap.md
@@ -2,6 +2,18 @@ Now that you've created this product's mission.md, use that to guide your creati
 
 {{workflows/planning/create-product-roadmap}}
 
+## Display confirmation and next step
+
+Once you've created roadmap.md, output the following message:
+
+```
+✅ I have documented the product roadmap at `agent-os/product/roadmap.md`.
+
+Review it to ensure it aligns with how you see this product roadmap going forward.
+
+Next step: Run the command, `4-create-tech-stack.md`
+```
+
 ## User Standards & Preferences Compliance
 
 IMPORTANT: Ensure the product roadmap is ALIGNED and DOES NOT CONFLICT with the user's preferences and standards as detailed in the following files:
diff --git a/profiles/default/commands/plan-product/single-agent/4-create-tech-stack.md b/profiles/default/commands/plan-product/single-agent/4-create-tech-stack.md
index 2a5bef1f..b6e0d569 100644
--- a/profiles/default/commands/plan-product/single-agent/4-create-tech-stack.md
+++ b/profiles/default/commands/plan-product/single-agent/4-create-tech-stack.md
@@ -2,6 +2,18 @@ The final part of our product planning process is to document this product's tec
 
 {{workflows/planning/create-product-tech-stack}}
 
+## Display confirmation and next step
+
+Once you've created tech-stack.md, output the following message:
+
+```
+✅ I have documented the product's tech stack at `agent-os/product/tech-stack.md`.
+
+Review it to ensure all of the tech stack details are correct for this product.
+
+You're ready to start planning a feature spec! You can do so by running the command, `1-new-spec.md`.
+```
+
 ## User Standards & Preferences Compliance
 
 The user may provide information regarding their tech stack, which should take precidence when documenting the product's tech stack.  To fill in any gaps, find the user's usual tech stack information as documented in any of these files:

From bf431c289df1c5afa4f25bc40e4c36955a5ec877 Mon Sep 17 00:00:00 2001
From: Brian Casel <brian@briancasel.com>
Date: Fri, 10 Oct 2025 11:57:24 -0400
Subject: [PATCH 8/8] changelog

---
 CHANGELOG.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1a804da9..9386a681 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -11,6 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - For Claude Code users:
   - Replaced hard-coding of 'opus' model setting on agents with 'inherit' so that it inherits whichever model your Claude Code is currently using.
   - Updated create-role script to add the "Inherit" option when creating new agents.
+- Clarified next command to run when in single-agent mode.
 
 ## [2.0.2] - 2025-10-09