@ericelliott
Collaborator

Add dual-output user testing system that generates both human and AI agent
test scripts from user journey specifications. Human scripts use think-aloud
protocol with video recording, while AI agent scripts execute with screenshots
and variable behavior based on persona traits.

  • Add ai/rules/user-testing.mdc with SudoLang templates
  • Add ai/commands/user-test.md command
  • Register /user-test command in please.mdc
  • Reuse UserJourney and Persona types from productmanager.mdc

claude added 6 commits January 1, 2026 17:52
Add dual-output user testing system that generates both human and AI agent
test scripts from user journey specifications. Human scripts use think-aloud
protocol with video recording, while AI agent scripts execute with screenshots
and variable behavior based on persona traits.

- Add ai/rules/user-testing.mdc with SudoLang templates
- Add ai/commands/user-test.md command
- Register /user-test command in please.mdc
- Reuse UserJourney and Persona types from productmanager.mdc

Add complete documentation, tests, and CLI help for the /user-test command:

- Comprehensive user testing guide (docs/user-testing.md)
  - References Nielsen Norman Group research on 3-5 user testing
  - Clarifies AI agents use real browser automation (Playwright/Puppeteer)
  - Explains dual-output approach and best practices
- Unit tests for user-testing.mdc (ai/rules/user-testing.test.js)
  - Validates file structure and frontmatter
  - Verifies browser automation requirements
  - Checks template existence and documentation links
- README integration
  - Adds User Testing section with quick start
  - Links to comprehensive guide
  - Includes /user-test in workflow commands
- CLI help updates
  - Adds /user-test to aidd CLI help text
- Enhanced user-testing.mdc
  - Explicit browser automation requirements in AgentScript
  - Clarifies real UI interaction (not mocked/simulated)

Sources:
- https://www.nngroup.com/articles/why-you-only-need-to-test-with-5-users/
- https://www.nngroup.com/articles/how-many-test-users/
… frameworks

Remove references to Playwright/Puppeteer/Selenium automation frameworks.
AI agents have built-in capability to drive regular browsers (IDE browser,
Chrome, etc.) directly like a human would - clicking, typing, scrolling
through actual UI elements.

Updated:
- ai/rules/user-testing.mdc - Agent drives browser like human
- docs/user-testing.md - Remove automation framework references
- ai/rules/user-testing.test.js - Update test assertions

Clarify the critical distinction between AI agent testing and automation frameworks:

- AI agents discover UI by looking at the page (like real users)
- NO privileged access to source code or pre-written selectors
- Validates UI discoverability, not just technical functionality
- Automation frameworks (Playwright/Puppeteer) require pre-knowledge of selectors
- This approach catches issues where UI isn't actually discoverable/understandable

This is the key value proposition: agents figure out what to click the same
way users do, validating that the interface is genuinely usable.

Remove ~40 lines of redundant content:

AgentScript template: 6 lines → 2 lines (environment section)
Constraints: 7 lines → 4 lines (removed overlapping statements)
docs: Removed 4 redundant 'discover UI' explanations (kept 2 key mentions)
docs: Removed 4 excessive '3-5 users' references (kept Nielsen research + final note)
docs: Condensed 'Run AI Agent Tests' from 10 lines to 1 paragraph

Token savings: ~700 tokens
Principle: Perfection is attained when there is nothing more to remove.

Removed marginal value-adds:
- 'like a human' (implied by 'discover UI by looking')
- '(IDE, Chrome)' (unnecessary examples)
- 'varies between runs' (redundant with 'stochastic')
- Duplicate 'discover UI by looking' in Constraints (already in template)
- 'the same way users do' (already stated)
- 'what to click by looking (no source code access)' (3rd mention, trim to core)

Every word now earns its keep.
Copilot AI review requested due to automatic review settings January 1, 2026 18:17
claude added 3 commits January 1, 2026 18:19
Tests check for 'real UI' mention to verify emphasis on testing
actual rendered UI vs mocked components.

Add Constraints block to match other command files (discover.md, execute.md, etc.)

Remove exact string matching tests for markdown documentation.
Tests now verify:
- File existence
- Proper frontmatter structure
- Template presence

Not testing exact documentation wording - too brittle.
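
A structural check of this kind might look like the sketch below. It is an illustration only, not the contents of ai/rules/user-testing.test.js: the helper names `parseFrontmatter` and `checkRuleFile`, the `description` and `alwaysApply` keys, and the sample text are all assumptions. The test runner and the file-existence check are left out so the sketch runs without touching the disk.

```javascript
// Hypothetical helpers; the real test file may be organized differently.

// Return the frontmatter key/value pairs, or null when the text does not
// start with a `---` delimited block.
const parseFrontmatter = (text) => {
  const match = /^---\r?\n([\s\S]*?)\r?\n---(\r?\n|$)/.exec(text);
  if (!match) return null;
  const entries = match[1]
    .split(/\r?\n/)
    .map((line) => line.match(/^([\w-]+):\s*(.*)$/))
    .filter(Boolean)
    .map(([, key, value]) => [key, value]);
  return Object.fromEntries(entries);
};

// Structural checks only: no assertions on exact documentation wording.
const checkRuleFile = (text, templateNames) => {
  const frontmatter = parseFrontmatter(text);
  return {
    hasFrontmatter: frontmatter !== null,
    missingTemplates: templateNames.filter((name) => !text.includes(name)),
  };
};

// Sample content is made up for illustration.
const sample = [
  '---',
  'description: Generate user test scripts',
  'alwaysApply: false',
  '---',
  '# User Testing',
  'HumanScript { ... }',
  'AgentScript { ... }',
].join('\n');

const result = checkRuleFile(sample, ['HumanScript', 'AgentScript']);
console.log(result);
```

Because the checks look only at structure (a frontmatter block and the presence of template names), rewording the documentation would not break them.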
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a /user-test command to the AIDD framework that generates dual test scripts (human and AI agent) from user journey specifications. The implementation includes comprehensive documentation explaining the Nielsen Norman Group's research on user testing effectiveness.

Key changes:

  • New user testing system with SudoLang templates for generating both human think-aloud scripts and AI agent executable scripts
  • Integration with existing ProductManager types for user journeys and personas
  • Comprehensive documentation with best practices and research-backed guidance

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| docs/user-testing.md | Comprehensive guide explaining user testing methodology, Nielsen Norman Group research, and step-by-step instructions for using the dual-output approach |
| ai/rules/user-testing.mdc | SudoLang template definitions for generating HumanScript and AgentScript from user journeys with persona-mapped behavior |
| ai/commands/user-test.md | Command documentation specifying inputs (UserJourney) and outputs (test scripts) |
| ai/rules/please.mdc | Registered /user-test command in the main command registry |
| ai/rules/user-testing.test.js | Test suite verifying file existence, frontmatter, template presence, and documentation completeness |
| README.md | Added User Testing section with quick start guide and link to full documentation |
| bin/aidd.js | Added /user-test command to CLI help text |

claude added 2 commits January 1, 2026 18:26
Add UserTestPersona type that extends Persona from productmanager.mdc
with user testing specific fields (role, techLevel, patience, goals).

Base Persona only includes ...meta fields. Templates reference additional
fields that need to be defined.

Add UserTestStep type that extends Step from productmanager.mdc
with user testing specific fields (action, intent, success, checkpoint).

Base Step only includes ...meta and userStories. Templates reference
additional fields that need to be defined.
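
The two extensions can be pictured as plain objects. This sketch is written in JavaScript rather than SudoLang, and only the field names come from the commit messages above. The base `Persona` and `Step` shapes, the `extend` helper, the value types, and the sample values are all assumptions made for illustration.

```javascript
// Field names come from the commit messages; everything else is assumed.
const PERSONA_FIELDS = ['role', 'techLevel', 'patience', 'goals'];
const STEP_FIELDS = ['action', 'intent', 'success', 'checkpoint'];

// Merge a base object with its extension, failing loudly if a field the
// templates reference is absent.
const extend = (label, requiredFields) => (base, extra) => {
  const merged = { ...base, ...extra };
  const missing = requiredFields.filter((field) => !(field in merged));
  if (missing.length > 0) {
    throw new TypeError(`${label} is missing: ${missing.join(', ')}`);
  }
  return merged;
};

const createUserTestPersona = extend('UserTestPersona', PERSONA_FIELDS);
const createUserTestStep = extend('UserTestStep', STEP_FIELDS);

// Hypothetical base objects standing in for Persona and Step.
const persona = createUserTestPersona(
  { id: 'persona-1', name: 'Busy parent' },
  { role: 'shopper', techLevel: 'novice', patience: 3, goals: ['reorder groceries'] },
);

const step = createUserTestStep(
  { id: 'step-1', name: 'Find last order', userStories: [] },
  {
    action: 'Open the order history',
    intent: 'Locate the previous order',
    success: 'Previous order is visible',
    checkpoint: true,
  },
);

console.log(persona, step);
```

Failing on a missing field mirrors the reason given for the commits: a template that references `techLevel` or `checkpoint` needs those fields defined somewhere.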
Copilot AI review requested due to automatic review settings January 1, 2026 18:27

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.


claude added 2 commits January 1, 2026 18:32
Add instruction to read steps out loud in human test template.
Helps reviewers follow along with video recordings.

1. Clarify agents should narrate thoughts/confusion like human testers
2. Add structured markdown output format for agent test reports
3. Create /run-test command file
4. Register /run-test in please.mdc command list
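
A structured markdown report of the kind item 2 describes could be assembled roughly as follows. The actual format is defined in user-testing.mdc and is not shown in this conversation, so the section headings, the `renderAgentReport` name, and the `journey`, `persona`, `narration`, and per-step field layout are all assumptions.

```javascript
// Hypothetical report shape; the real format lives in user-testing.mdc.
const renderAgentReport = ({ journey, persona, steps }) => {
  const passed = steps.filter((step) => step.success).length;
  const lines = [
    `# Test Report: ${journey}`,
    '',
    `Persona: ${persona}`,
    `Result: ${passed}/${steps.length} steps succeeded`,
    '',
    // One section per step, including the agent's think-aloud narration.
    ...steps.flatMap((step, index) => [
      `## Step ${index + 1}: ${step.action}`,
      `- Outcome: ${step.success ? 'success' : 'failure'}`,
      `- Narration: ${step.narration}`,
      '',
    ]),
  ];
  return lines.join('\n').trimEnd() + '\n';
};

// Made-up run data for illustration.
const report = renderAgentReport({
  journey: 'Checkout',
  persona: 'Busy parent',
  steps: [
    { action: 'Add item to cart', success: true, narration: 'The button was easy to find.' },
    { action: 'Apply coupon', success: false, narration: 'I could not find a coupon field.' },
  ],
});

console.log(report);
```

Keeping narration next to each outcome lets a reviewer see where the agent was confused, in the same way a think-aloud recording would for a human tester.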
Copilot AI review requested due to automatic review settings January 1, 2026 18:49

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.


claude added 2 commits January 1, 2026 19:22
The --persona flag isn't documented in user-testing.mdc Interface.
Instead, use separate journey files for different personas.
Copilot AI review requested due to automatic review settings January 1, 2026 19:28

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.


Adds test coverage for the /run-test command file to match the
existing test coverage for /user-test. Both commands are part of
the user testing feature and should have equivalent validation.

Tests verify:
- run-test.md file exists
- run-test.md references user-testing.mdc
@ericelliott ericelliott requested a review from Copilot January 1, 2026 19:36

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated no new comments.


@ericelliott ericelliott merged commit f5e6d5b into main Jan 1, 2026
12 checks passed
@ericelliott ericelliott deleted the claude/ai-user-testing-agents-2ZZ8O branch January 1, 2026 19:42