ruchit07/ai-spec

ai-spec

Scaffold a production-grade AI feature spec before you write a line of code.

Most AI features are vibe-coded: call the LLM, the output looks reasonable, ship. Then three weeks later a prompt change silently breaks 20% of queries — and nobody knows, because "looks reasonable" was never a measurable criterion.

ai-spec fixes the root cause. One command generates the spec, the eval criteria, a seed set of golden test cases, and an ADR starter — so "done" has a precise definition before you start.

By Ruchit Suthar, Software Architect & Technical Leader.

📖 Method: AI-Driven Development: The Spec-First Workflow


Quick start

npx @ruchit07/ai-spec init "semantic search for support tickets"

That's it. You'll be prompted for the feature kind, problem statement, latency/cost budgets, and owner — then it writes:

specs/
└── semantic-search-for-support-tickets/
    ├── spec.md              # Problem, I/O contract, acceptance criteria, failure modes
    ├── eval-criteria.md     # The metrics that gate CI, with threshold rationale
    ├── eval-criteria.json   # Machine-readable thresholds for your CI
    ├── test-cases.json      # Seed golden test cases tailored to the feature kind
    └── adr.md               # Architecture Decision Record starter
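
As a rough illustration of how the machine-readable thresholds could gate a CI step, here is a minimal TypeScript sketch. The `Thresholds` and `Measured` shapes and the `checkThresholds` helper are hypothetical; the actual schema of eval-criteria.json is not shown in this README and may differ.

```typescript
// Hypothetical shapes: the real eval-criteria.json schema may differ.
interface Thresholds {
  accuracyMin: number;     // e.g. 0.8
  latencyP95MsMax: number; // e.g. 2000
}

interface Measured {
  accuracy: number;
  latencyP95Ms: number;
}

// Returns human-readable failures; an empty array means the gate passes.
function checkThresholds(t: Thresholds, m: Measured): string[] {
  const problems: string[] = [];
  if (m.accuracy < t.accuracyMin) {
    problems.push(`accuracy ${m.accuracy} is below ${t.accuracyMin}`);
  }
  if (m.latencyP95Ms > t.latencyP95MsMax) {
    problems.push(`latency p95 ${m.latencyP95Ms} ms exceeds ${t.latencyP95MsMax} ms`);
  }
  return problems;
}

// Example run: accuracy misses the bar, latency is within budget.
const gateFailures = checkThresholds(
  { accuracyMin: 0.8, latencyP95MsMax: 2000 },
  { accuracy: 0.76, latencyP95Ms: 1850 },
);
for (const failure of gateFailures) {
  console.error(failure);
}
```

In a real pipeline the script would typically exit with a non-zero status whenever the returned array is non-empty, so the build fails on a regression.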

Non-interactive (for scripts / CI)

ai-spec init "ticket classifier" --kind classification --yes
ai-spec init "support agent" -k agent -d ./ai-features --force

Why this matters

| Without a spec | With ai-spec |
| --- | --- |
| "Looks good" is the bar | Measurable thresholds (accuracy ≥ 0.8, latency p95 ≤ 2000 ms) |
| Regressions found by users | Regressions caught in CI |
| Prompt is the only documentation | Spec + ADR explain intent and decisions |
| No baseline when you switch models | Golden test set is the baseline |
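
To make the example thresholds concrete, the sketch below computes accuracy and p95 latency from a run over a golden test set. The `CaseResult` shape and both helpers are assumptions for illustration, not part of ai-spec; the percentile uses the nearest-rank method, which is one of several common definitions.

```typescript
// Hypothetical result of running one golden test case; not the package's format.
interface CaseResult {
  passed: boolean;
  latencyMs: number;
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Collapses per-case results into the two metrics named in the table.
function summarize(results: CaseResult[]): { accuracy: number; latencyP95Ms: number } {
  const passedCount = results.filter((r) => r.passed).length;
  return {
    accuracy: passedCount / results.length,
    latencyP95Ms: percentile(results.map((r) => r.latencyMs), 95),
  };
}

const goldenRun: CaseResult[] = [
  { passed: true, latencyMs: 900 },
  { passed: true, latencyMs: 1200 },
  { passed: false, latencyMs: 2400 },
  { passed: true, latencyMs: 1100 },
  { passed: true, latencyMs: 1500 },
];
const summary = summarize(goldenRun);
console.log(summary);
```

With only five samples the p95 is simply the slowest case, so this run meets accuracy ≥ 0.8 but misses latency p95 ≤ 2000 ms. Running the same golden set against a second model gives a like-for-like comparison.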

The discipline is the value. ai-spec makes the disciplined path the easy path.


Feature kinds

The seed test cases and spec guidance adapt to what you're building:

| Kind | Tailored guidance |
| --- | --- |
| rag | Retrieval quality + groundedness as the dominant metric |
| chat | Conversation scope, tools, context-window budget |
| classification | Exact label set, out-of-distribution handling |
| extraction | Typed output schema, field-level accuracy |
| agent | Action space, stopping conditions, safety guardrails |

Programmatic API

import { generateFiles, slugify } from '@ruchit07/ai-spec';

const files = generateFiles({
  featureName: 'My Feature',
  slug: slugify('My Feature'),
  kind: 'rag',
  problem: '...',
  primaryProvider: 'openai',
  latencyP95Ms: 2000,
  costPerQueryUsd: 0.005,
  accuracyThreshold: 0.8,
  groundednessThreshold: 0.85,
  owner: '@you',
});
// files: { path, content }[]
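
Since `generateFiles` returns `{ path, content }[]`, the caller has to persist the files itself. The following sketch shows one way to do that with Node's built-in `fs` module. The `writeGenerated` helper and the sample data are hypothetical, and it assumes each `path` is relative to an output directory, which this README does not confirm.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Shape taken from the comment above: { path, content }[].
interface GeneratedFile {
  path: string;
  content: string;
}

// Writes each file under baseDir, creating parent directories as needed.
// Assumes `path` is relative; returns the full paths that were written.
function writeGenerated(files: GeneratedFile[], baseDir: string): string[] {
  return files.map((file) => {
    const target = join(baseDir, file.path);
    mkdirSync(dirname(target), { recursive: true });
    writeFileSync(target, file.content, "utf8");
    return target;
  });
}

// Stand-in for the array returned by generateFiles(...).
const sampleFiles: GeneratedFile[] = [
  { path: "specs/my-feature/spec.md", content: "# My Feature\n" },
  { path: "specs/my-feature/adr.md", content: "# ADR\n" },
];

// A temporary directory keeps the demo from touching the working tree.
const outDir = mkdtempSync(join(tmpdir(), "ai-spec-"));
const written = writeGenerated(sampleFiles, outDir);
console.log(written);
console.log(readFileSync(written[0], "utf8"));
```

In a real script you would pass the result of `generateFiles(...)` and your repository root instead of the sample array and temporary directory.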

License

MIT
