AI‐First SDLC — Specification Framework

steven-dracker edited this page Mar 26, 2026 · 1 revision

Companion to: docs/AI_FIRST_SDLC.md
Document Owner: Steven Dracker
Version: 1.0
Date: March 26, 2026
Status: Draft — Validated against ERATE Workbench POC


1. Purpose

The AI-First SDLC Framework defines who does what across the advisory, architecture, and implementation layers. This companion document defines how — specifically how stakeholder requirements are captured, structured, and translated into prompts that produce code which is architecturally consistent, scope-controlled, and compliant with non-functional requirements.

This specification framework is the repeatable, stack-agnostic layer of the AI-First SDLC. It can be applied to any project regardless of language, framework, or domain.

This document is not a complete reference for AI-assisted coding methodologies. Tools and agentic capabilities are evolving rapidly at the time of writing, and what follows records the approach I took on this project.


2. The Core Problem This Solves

AI implementation tools like Claude Code are extraordinarily capable at execution. Their failure mode is not incompetence — it is assumption. When given an ambiguous prompt, an AI implementation tool will make plausible assumptions and produce plausible code. That code may be functionally correct and architecturally wrong. It may solve the stated problem while violating an unstated constraint. It may introduce a dependency that conflicts with an existing decision made three prompts ago.

The specification framework exists to eliminate ambiguity before execution begins. It does this by requiring that every prompt carries four distinct layers of context:

  1. Functional requirements — what the feature must do
  2. Non-functional requirements — how it must do it
  3. Architectural constraints — what it must not violate
  4. Scope boundaries — what is explicitly out of scope for this prompt

When all four layers are present, the implementation AI executes with precision. When any layer is missing, the practitioner is delegating a decision to the AI that should have been made by a human.
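
As a rough illustration of the four-layer rule, the sketch below models a prompt as a small Python record and reports which layers are empty. The names `PromptSpec` and `missing_layers` are hypothetical and not part of the framework; the only point is that a missing layer can be detected before the prompt is sent.

```python
from dataclasses import dataclass, field


# Hypothetical record for illustration; the four fields mirror the
# four layers listed above.
@dataclass
class PromptSpec:
    functional: list[str] = field(default_factory=list)
    non_functional: list[str] = field(default_factory=list)
    architectural: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)


def missing_layers(spec: PromptSpec) -> list[str]:
    """Return the names of layers that have no entries, in list order."""
    layers = {
        "functional requirements": spec.functional,
        "non-functional requirements": spec.non_functional,
        "architectural constraints": spec.architectural,
        "scope boundaries": spec.out_of_scope,
    }
    return [name for name, items in layers.items() if not items]
```

Any name returned by `missing_layers` marks a decision that would otherwise be delegated to the implementation AI.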


3. The Specification Hierarchy

Requirements flow downward through three levels. Each level inherits from the one above and adds specificity.

Level 1 — Project Constitution

The project constitution is established once at project initiation and governs every prompt for the life of the project. It lives in the engineering runbook (CLAUDE.md or equivalent) and is loaded automatically at the start of every implementation session.

The project constitution contains:

Stack definition: The complete, locked technology stack, covering language, framework, ORM, database, frontend library, API pattern, and test framework. No implementation prompt may introduce a dependency outside this stack without an explicit architectural decision record.

Example from ERATE Workbench:

Stack: C# / ASP.NET Core Razor Pages / Entity Framework Core / SQLite / Chart.js / Swashbuckle
Pattern: Repository pattern with async/await throughout
API: RESTful endpoints documented via Swagger
Authentication: None (POC scope)
Deployment: Local WSL2 / Ubuntu — no cloud deployment in scope

Naming conventions: Class names, method names, file names, route patterns, database table names, and column names. Established once, enforced by every prompt.

Architectural patterns: Repository pattern, service layer, dependency injection conventions, error handling approach, logging approach. These are decisions that must be consistent across the entire codebase.

Non-negotiable constraints: Items that can never be violated regardless of what a prompt requests. Examples: no credentials in code, idempotent data operations, no silent test deletion, append-only log files.

Out of scope for entire project: Features, capabilities, or integrations that are explicitly deferred. Every implementation AI session begins knowing what is permanently off the table.
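
One way to make the locked-stack rule checkable is sketched below, assuming the constitution's stack and any ADR-approved additions are available as plain sets of names. The function `unapproved_dependencies` and its parameters are hypothetical, and the package name used in the usage note is only an example of something outside the ERATE Workbench stack.

```python
def unapproved_dependencies(requested, approved_stack, adr_exceptions=frozenset()):
    """Return requested dependencies that are neither in the locked stack
    nor covered by an architectural decision record (case-insensitive)."""
    allowed = {name.lower() for name in approved_stack}
    allowed |= {name.lower() for name in adr_exceptions}
    return sorted(dep for dep in requested if dep.lower() not in allowed)
```

For example, with the stack `{"Entity Framework Core", "SQLite", "Chart.js", "Swashbuckle"}`, a request for `["Chart.js", "Dapper"]` returns `["Dapper"]` until an ADR adds it to `adr_exceptions`.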


Level 2 — Feature Specification

Written by the Architect AI for each backlog item before Claude Code is invoked. The feature specification translates a backlog item into a structured prompt that carries all four requirement layers.

Feature specification template:

FEATURE: [CC-XXXXXX] [Feature name]

CONTEXT
[2-3 sentences describing why this feature exists, who uses it, 
and what business problem it solves. This is the stakeholder 
requirement translated into technical context.]

FUNCTIONAL REQUIREMENTS
[Numbered list of what the feature must do. Each item is 
testable — it can be verified as done or not done.]
1. 
2.
3.

NON-FUNCTIONAL REQUIREMENTS
[Performance, security, accessibility, data integrity, 
error handling. Each item specifies the standard, not just 
the category.]
- Performance: [specific threshold if applicable]
- Error handling: [specific behavior on failure]
- Data integrity: [idempotency, validation rules]
- Security: [specific constraints]

ARCHITECTURAL CONSTRAINTS
[What must not be violated. References the project constitution 
plus any feature-specific constraints.]
- Must use existing [Repository/Service/Pattern]
- Must not introduce new dependencies outside approved stack
- Must follow existing naming convention: [specific convention]
- Database changes must be implemented as EF Core migrations
- [Any feature-specific architectural constraint]

IMPLEMENTATION SCOPE
[Explicit list of what is included in this prompt. 
Prevents scope creep mid-execution.]
IN SCOPE:
- 
- 
OUT OF SCOPE (defer to future prompts):
- 
- 

ACCEPTANCE CRITERIA
[How the practitioner verifies the output is correct before committing.]
- [ ] 
- [ ] 
- [ ] 

COMMIT MESSAGE
[The exact commit message to use when this feature is accepted.]
feat: [description]
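
A draft specification can be checked against this template before it reaches the implementation AI. The sketch below is one possible check, assuming each section heading sits on its own line exactly as in the template; `REQUIRED_SECTIONS` and `missing_sections` are hypothetical names.

```python
# Section headings copied from the feature specification template.
REQUIRED_SECTIONS = [
    "CONTEXT",
    "FUNCTIONAL REQUIREMENTS",
    "NON-FUNCTIONAL REQUIREMENTS",
    "ARCHITECTURAL CONSTRAINTS",
    "IMPLEMENTATION SCOPE",
    "ACCEPTANCE CRITERIA",
    "COMMIT MESSAGE",
]


def missing_sections(spec_text: str) -> list[str]:
    """Return template headings that do not appear as a full line in the draft."""
    present = {line.strip() for line in spec_text.splitlines()}
    return [section for section in REQUIRED_SECTIONS if section not in present]
```

Matching whole lines rather than substrings keeps "FUNCTIONAL REQUIREMENTS" from being satisfied by the "NON-FUNCTIONAL REQUIREMENTS" heading.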

Level 3 — Task Prompt

The executable prompt sent to Claude Code. Derived directly from the feature specification but written as an instruction, not a document. The task prompt is the feature specification collapsed into the form that produces the best implementation output.

Task prompt structure:

[CONTEXT]
[2-3 sentences of business context from the feature spec. 
This grounds the AI in why, not just what.]

[CURRENT STATE]
[Brief description of what already exists that this task 
builds on or modifies. Prevents the AI from rebuilding 
what is already there.]

[REQUIREMENTS]
[Numbered functional requirements from the feature spec, 
written as imperatives.]
1. Create...
2. Add...
3. Ensure...

[CONSTRAINTS]
[Non-negotiable items from the project constitution and 
feature spec. Written as prohibitions.]
- Do NOT introduce any new NuGet packages
- Do NOT modify existing [specific file] 
- All database changes MUST be implemented as EF Core migrations
- Follow existing naming convention: [specific convention]
- Error handling must follow existing pattern in [reference file]

[OUT OF SCOPE]
[Explicit list of what this prompt does not cover. 
Prevents the AI from over-building.]
Do not implement:
- [item 1]
- [item 2]

[OUTPUT FORMAT]
[What files to create or modify, in what order, 
with what structure.]
Produce the following:
1. [filename] — [purpose]
2. [filename] — [purpose]

[VERIFICATION]
[How to confirm it worked. The AI should tell you 
how to test the output.]
After implementation, provide the command to verify [specific behavior].
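
The collapse from feature specification to task prompt can be sketched as a small rendering function. This is an illustration under assumptions, not the method used in ERATE Workbench: `render_task_prompt` and its parameter names are hypothetical, and the separator in the output list is simplified to a hyphen.

```python
def render_task_prompt(context, current_state, requirements, constraints,
                       out_of_scope, outputs, verification):
    """Assemble the bracketed sections of a task prompt in template order.

    `outputs` is a list of (filename, purpose) pairs.
    """
    numbered = "\n".join(f"{i}. {item}" for i, item in enumerate(requirements, 1))
    prohibited = "\n".join(f"- {item}" for item in constraints)
    excluded = "\n".join(f"- {item}" for item in out_of_scope)
    files = "\n".join(
        f"{i}. {name} - {purpose}" for i, (name, purpose) in enumerate(outputs, 1)
    )
    sections = [
        ("CONTEXT", context),
        ("CURRENT STATE", current_state),
        ("REQUIREMENTS", numbered),
        ("CONSTRAINTS", prohibited),
        ("OUT OF SCOPE", "Do not implement:\n" + excluded),
        ("OUTPUT FORMAT", "Produce the following:\n" + files),
        ("VERIFICATION", verification),
    ]
    return "\n\n".join(f"[{title}]\n{body}" for title, body in sections)
```

Generating the prompt from structured fields keeps the section order fixed, so a missing constraint or scope list shows up as an empty section rather than going unnoticed.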

4. Schema Discovery as Prompt Zero

Before any data-dependent feature prompt is executed, a schema discovery prompt must run first. This is non-negotiable.

The failure mode it prevents: the Architect AI designs a feature against assumed column names, data types, or relationships that do not match the actual data source. The implementation AI then builds against those assumptions. The result is code that compiles but fails at runtime — the worst category of error because it is invisible until execution.

Schema discovery prompt template:

SCHEMA DISCOVERY — [Dataset or table name]

Before implementing any feature against [data source], 
retrieve the actual schema.

For API sources:
Fetch [endpoint URL] with $limit=1 and output every field 
name, its data type, and a sample value to a file called 
docs/schema_[name].md

For database sources:
Query the information_schema or equivalent and output 
every table, column name, data type, and nullable status 
to docs/schema_[name].md

Do not write any feature code in this prompt. 
Output schema documentation only.

Schema discovery outputs become permanent project artifacts. Every subsequent prompt that touches that data source references the schema document rather than assumptions.
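
For a SQLite source, which has no information_schema, the equivalent query is `PRAGMA table_info`. The sketch below shows roughly what a schema discovery script might produce; the function name `describe_sqlite_schema` and the Markdown layout of the output are assumptions, not a format the framework prescribes.

```python
import sqlite3


def describe_sqlite_schema(conn: sqlite3.Connection) -> str:
    """Return a Markdown listing of every user table and its columns."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        lines.append(f"## {table}")
        lines.append("| Column | Type | Nullable |")
        lines.append("| --- | --- | --- |")
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, notnull, _default, _pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            # "Nullable" reflects only a declared NOT NULL constraint.
            nullable = "no" if notnull else "yes"
            lines.append(f"| {name} | {col_type} | {nullable} |")
        lines.append("")
    return "\n".join(lines)
```

The returned text could be written to docs/schema_[name].md so that later prompts reference the recorded schema instead of assumed column names.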


5. The Context Transfer Protocol

AI implementation tools have no memory between sessions. Every session begins with zero context unless context is explicitly provided. The context transfer protocol ensures that the project constitution, current state, and active task are loaded correctly at the start of every session.

Session opener (sent to Claude Code at the start of every session):

Read CLAUDE.md and confirm current state.

After reading, summarize:
1. The current project stack
2. The last completed feature
3. The active backlog item
4. Any open issues or deferred items flagged in CLAUDE.md

Do not write any code until you have confirmed current state.

This single discipline guards against one of the most common failure modes in AI-assisted development: the implementation AI building something that conflicts with a decision made in a previous session, which it cannot remember.

Session handoff (written to CLAUDE.md at the end of every session):

## Session Handoff — [Date]

COMPLETED THIS SESSION:
- [CC-XXXXXX]: [Description] — committed as [commit hash]

CURRENT STATE:
- [Brief description of where the codebase stands]
- [Any known issues or incomplete items]

NEXT SESSION SHOULD START WITH:
- [CC-XXXXXX]: [Next backlog item]
- [Any prerequisite that must be confirmed before starting]

DECISIONS MADE THIS SESSION:
- [Any architectural decision that future prompts must respect]

DO NOT IN NEXT SESSION:
- [Any explicit prohibition based on what was learned this session]
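
Handoff entries are easier to keep consistent when they are generated rather than typed. The following is a minimal sketch, assuming each section is supplied as a list of strings; `render_handoff` is a hypothetical helper, and the heading separator is simplified to a hyphen.

```python
from datetime import date


def _bullets(items):
    """Format a list of strings as a dash-prefixed bullet list."""
    return "\n".join(f"- {item}" for item in items)


def render_handoff(day: date, completed, current_state, next_steps,
                   decisions, prohibitions) -> str:
    """Render one session handoff entry with the template's five sections."""
    sections = [
        ("COMPLETED THIS SESSION:", completed),
        ("CURRENT STATE:", current_state),
        ("NEXT SESSION SHOULD START WITH:", next_steps),
        ("DECISIONS MADE THIS SESSION:", decisions),
        ("DO NOT IN NEXT SESSION:", prohibitions),
    ]
    body = "\n\n".join(f"{title}\n{_bullets(items)}" for title, items in sections)
    return f"## Session Handoff - {day.isoformat()}\n\n{body}\n"
```

The rendered entry can then be appended to CLAUDE.md at the end of the session.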

6. Non-Functional Requirements Reference

Non-functional requirements are the most commonly omitted layer in AI-assisted development. They are rarely explicit in stakeholder requests but always implicit in production expectations. The following reference defines the standard categories and how to specify them in prompts.

Performance

Specify thresholds, not just intent.

Weak: "The page should load quickly."
Strong: "The page must return a response in under 2 seconds against the full dataset. Use async/await throughout and avoid N+1 query patterns."

Security

Specify what is prohibited, not just what is required.

Weak: "Handle credentials securely."
Strong: "No credentials, connection strings, or API keys may appear in source code. All sensitive configuration must be read from environment variables. The .gitignore must exclude .env files."
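
A short sketch of what the strong version asks for: configuration is read from the environment and the program fails loudly when a value is absent. The helper `require_env` and the variable name in the usage note are hypothetical.

```python
import os
from collections.abc import Mapping


def require_env(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return a required setting from the environment, or fail with a clear error."""
    value = environ.get(name)
    if not value:
        # The error names the variable but never echoes a secret value.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

A call such as `require_env("ERATE_API_KEY")` keeps the key out of source control; the `environ` parameter exists only so the lookup can be tested with a plain dictionary.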

Data Integrity

Specify idempotency explicitly.

Weak: "Import the data."
Strong: "The import operation must be idempotent. Running it multiple times against the same source must produce the same database state. Use upsert logic keyed on [specific field]. Do not duplicate records on re-run."
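
The upsert behavior described in the strong version can be sketched with SQLite's `ON CONFLICT ... DO UPDATE` clause, which requires SQLite 3.24 or later. The table `applicants` and its columns are hypothetical stand-ins for the "[specific field]" in the prompt.

```python
import sqlite3


def upsert_records(conn: sqlite3.Connection, records) -> None:
    """Insert new rows and update existing ones, keyed on external_id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applicants ("
        "external_id TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO applicants (external_id, name) "
        "VALUES (:external_id, :name) "
        "ON CONFLICT(external_id) DO UPDATE SET name = excluded.name",
        records,
    )
    conn.commit()
```

Running `upsert_records` twice with the same records leaves the table in the same state as running it once, which is the idempotency the prompt demands.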

Error Handling

Specify behavior, not just existence.

Weak: "Handle errors appropriately."
Strong: "On API failure, log the error to the append-only log file at /logs/import.log with timestamp, endpoint, and HTTP status code. Do not throw unhandled exceptions. Return a user-readable error message to the UI."
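
The behavior in the strong version might look like the sketch below. The function names, the tab-separated line format, and the wording of the user message are assumptions; opening the file in append mode means the code never rewrites earlier entries, although it does not enforce append-only access at the operating system level.

```python
from datetime import datetime, timezone
from pathlib import Path

USER_MESSAGE = "The data source could not be reached. Please try again later."


def format_failure(timestamp: datetime, endpoint: str, status_code: int) -> str:
    """Build one log line: timestamp, endpoint, HTTP status, tab-separated."""
    return f"{timestamp.isoformat()}\t{endpoint}\t{status_code}\n"


def log_api_failure(log_path: str, endpoint: str, status_code: int) -> str:
    """Append the failure to the log and return a message suitable for the UI."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever adds to the end of the file.
    with path.open("a", encoding="utf-8") as log:
        log.write(format_failure(datetime.now(timezone.utc), endpoint, status_code))
    return USER_MESSAGE
```

The caller catches the API failure, calls `log_api_failure`, and shows the returned message instead of letting the exception propagate.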

Scope Boundaries

Specify what is out of scope as explicitly as what is in scope.

Weak: [no scope statement]
Strong: "This prompt covers the data ingestion layer only. Do not implement any UI components, API endpoints, or analytics queries in this prompt. Those are separate backlog items."


7. Architectural Decision Records

Every time a decision is made that affects the entire project — a new pattern, a new dependency, a change to an existing convention — an Architectural Decision Record (ADR) is written and stored in docs/decisions/.

ADR template:

# ADR-[NUMBER]: [Decision title]
Date: [Date]
Status: Accepted

## Context
[What situation or problem prompted this decision]

## Decision
[What was decided]

## Rationale
[Why this decision was made over alternatives]

## Consequences
[What this decision enables and what it constrains 
for all future prompts]

## Alternatives Considered
[What else was evaluated and why it was rejected]

ADRs are referenced in feature specifications and task prompts whenever a constraint derives from a prior decision. This creates an auditable chain of reasoning from stakeholder requirement to implementation choice.


8. The Prompt Traceability Convention

Every feature prompt carries a unique identifier that traces from backlog to commit. In the ERATE Workbench project this convention is the CC-ERATE-XXXXXX numbering system.

The traceability chain:

Backlog item CC-ERATE-000042
  → Feature specification written by Architect AI
    → Task prompt executed in Claude Code session
      → Code reviewed and accepted by practitioner
        → Committed with message: "feat: [description] [CC-ERATE-000042]"
          → Backlog item marked Done in BACKLOG.md

This chain means that any commit in the repository can be traced back to the original backlog item, the feature specification that defined it, and the architectural decisions that constrained it. This is the audit trail that makes AI-assisted development governable.
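
The last link in the chain can be checked mechanically. The sketch below assumes identifiers follow the six-digit `CC-ERATE-000042` form and appear in square brackets in the commit message, as in the example above; the function names are hypothetical.

```python
import re

# Matches a bracketed backlog identifier such as [CC-ERATE-000042].
BACKLOG_ID = re.compile(r"\[(CC-ERATE-\d{6})\]")


def backlog_ids(commit_message: str) -> list[str]:
    """Return every backlog identifier referenced in a commit message."""
    return BACKLOG_ID.findall(commit_message)


def untraceable_commits(messages: list[str]) -> list[str]:
    """Return commit messages that reference no backlog item."""
    return [message for message in messages if not backlog_ids(message)]
```

Feeding the repository's commit subjects into `untraceable_commits` lists any commit that breaks the audit trail.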


9. The Repeatable Project Bootstrap

When starting a new project using this framework, the following sequence initializes the specification layer correctly.

Step 1: Write the project constitution. Define stack, naming conventions, architectural patterns, non-negotiable constraints, and out-of-scope items. Store in CLAUDE.md. This document governs every prompt for the life of the project.

Step 2: Write the ChatGPT primer. Store in docs/context/chatgpt-primer.md. This document restores the Architect AI session with full project context. It includes the project constitution, current backlog state, active conventions, and the prompt numbering convention.

Step 3: Run schema discovery. Before any feature work, discover the actual schema of every data source the project will touch. Store results in docs/schema_[name].md. No feature prompt runs against an assumed schema.

Step 4: Seed the backlog. Create BACKLOG.md with the four-state model: Backlog, To-Do, Active, Done. Populate with all known CC- items and TD- items from the project constitution. Every session begins by reading this file.

Step 5: Establish the handoff convention. Write the first session handoff entry in CLAUDE.md. From this point forward, every session ends with an updated handoff and every session begins with the session opener.

Step 6: Write the first feature specification. Using the Level 2 template, write the specification for the first backlog item. Have the Architect AI review it against the project constitution for consistency before executing.

The project is now bootstrapped. Every subsequent feature follows the same specification → task prompt → review → commit → handoff cycle.
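
Whether a repository has completed the bootstrap can be checked from the files it contains. The sketch below is one possible check, using the file locations named in the steps above; the function names are hypothetical, and the check confirms only that the files exist, not that their contents are complete.

```python
from fnmatch import fnmatch
from pathlib import Path

REQUIRED_FILES = ("CLAUDE.md", "docs/context/chatgpt-primer.md", "BACKLOG.md")
SCHEMA_PATTERN = "docs/schema_*.md"


def missing_bootstrap_artifacts(existing: set[str]) -> list[str]:
    """Return bootstrap artifacts absent from a set of repo-relative paths."""
    missing = [name for name in REQUIRED_FILES if name not in existing]
    # At least one schema discovery document is expected.
    if not any(fnmatch(path, SCHEMA_PATTERN) for path in existing):
        missing.append(SCHEMA_PATTERN)
    return missing


def list_project_files(root: str) -> set[str]:
    """Collect repo-relative POSIX paths for every file under root."""
    base = Path(root)
    return {p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()}
```

Calling `missing_bootstrap_artifacts(list_project_files("."))` from the repository root returns an empty list once every bootstrap file is in place.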


10. Why This Framework Is Repeatable

The ERATE Workbench POC validated this framework against a real production-quality application. The framework is repeatable because:

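It is stack-agnostic. The templates define structure, not syntax. Stack-specific lines in them, such as the EF Core migration and NuGet constraints, are examples carried over from ERATE Workbench and should be replaced with the equivalents for the project's own stack.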

It is role-agnostic. The Architect AI role can be filled by ChatGPT, Claude in conversation, or any capable language model. The implementation AI role can be filled by Claude Code, GitHub Copilot, or any capable code generation tool. The practitioner role is always human.

It scales with project complexity. A simple project needs a lightweight project constitution and basic feature specifications. A complex project needs detailed ADRs, rigorous schema discovery, and extensive non-functional requirement specification. The framework accommodates both without changing its structure.

It produces an audit trail by default. The prompt numbering convention, the backlog state model, the session handoff protocol, and the commit message convention together create a complete record of every decision made during the project. This is the governance layer that makes AI-assisted development trustworthy in professional and regulated environments.


11. Version History

| Version | Date | Author | Notes |
| --- | --- | --- | --- |
| 1.0 | March 2026 | Steven Dracker | Initial draft, validated against ERATE Workbench POC |