Refactor fingerprinting: AI-only, embedded decisions, cac CLI #32

Merged
nahiyankhan merged 3 commits into main from refactor/fingerprint
Apr 17, 2026

nahiyankhan commented Apr 17, 2026

Summary

  • AI-only pipeline: collapses extract / extractMulti and Director.profile / profileMulti into single Target[]-taking implementations, removes deterministic/fallback paths, and fixes a latent bug where setToolContext didn't fire when the LLM requested tools (causing compare/drift/comply/fleet to silently finish without fingerprints).
  • Embedded decision comparison: DesignDecision.embedding?: number[] is populated at profile time; compareFingerprints now matches decisions by cosine similarity above a 0.75 threshold, blending unmatched coverage with matched-pair distance. Missing embeddings are reported qualitatively and excluded from the scalar so they don't pollute the number.
  • citty → cac: all 17 commands migrated. Three process.argv scanning hacks (profile, fleet, viz) deleted — cac's [...targets] handles variadic natively, so ghost profile ./a -v ./b now captures both targets.
  • Docs: concepts page updated for the three-layer model (observation / decisions / values), stale "Architecture 15%" replaced with "Decisions 15%", radar axis renamed, and a paragraph added explaining paraphrase-robust decision matching.
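For reviewers, here is a minimal sketch of the embedded decision comparison summarized above. It is not the code in `fingerprint/compare.ts`: only `DesignDecision.embedding?: number[]` and the 0.75 threshold come from this PR. The `id` and `summary` fields, the `compareDecisions` name, the greedy pairing strategy, and the exact blend of unmatched coverage with matched-pair distance are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; only `embedding?: number[]` and the
// 0.75 threshold are taken from this PR.
interface DesignDecision {
  id: string;
  summary: string;
  embedding?: number[];
}

type Embedded = DesignDecision & { embedding: number[] };

interface DecisionComparison {
  matched: { a: string; b: string; similarity: number }[];
  unmatchedA: string[];
  unmatchedB: string[];
  skipped: string[]; // no embedding: reported qualitatively, not scored
  distance: number | null; // null when nothing could be scored
}

const MATCH_THRESHOLD = 0.75;

const hasEmbedding = (d: DesignDecision): d is Embedded =>
  Array.isArray(d.embedding) && d.embedding.length > 0;

function cosineSimilarity(x: number[], y: number[]): number {
  if (x.length === 0 || x.length !== y.length) return 0;
  let dot = 0;
  let normX = 0;
  let normY = 0;
  x.forEach((xi, i) => {
    const yi = y[i] ?? 0;
    dot += xi * yi;
    normX += xi * xi;
    normY += yi * yi;
  });
  const denom = Math.sqrt(normX) * Math.sqrt(normY);
  return denom === 0 ? 0 : dot / denom;
}

function compareDecisions(a: DesignDecision[], b: DesignDecision[]): DecisionComparison {
  const embA = a.filter(hasEmbedding);
  const embB = b.filter(hasEmbedding);
  const skipped = [...a, ...b].filter((d) => !hasEmbedding(d)).map((d) => d.id);

  // Assumed strategy: greedily pair each A decision with its most similar
  // unused B decision, accepting only similarities above the threshold.
  const usedB = new Set<string>();
  const matched: DecisionComparison["matched"] = [];
  for (const da of embA) {
    let best: { id: string; similarity: number } | null = null;
    for (const db of embB) {
      if (usedB.has(db.id)) continue;
      const similarity = cosineSimilarity(da.embedding, db.embedding);
      if (similarity > MATCH_THRESHOLD && (best === null || similarity > best.similarity)) {
        best = { id: db.id, similarity };
      }
    }
    if (best !== null) {
      usedB.add(best.id);
      matched.push({ a: da.id, b: best.id, similarity: best.similarity });
    }
  }

  const matchedA = new Set(matched.map((m) => m.a));
  const unmatchedA = embA.filter((d) => !matchedA.has(d.id)).map((d) => d.id);
  const unmatchedB = embB.filter((d) => !usedB.has(d.id)).map((d) => d.id);

  // Assumed blend: the unmatched share counts as full distance, and the
  // remaining share is weighted by the mean (1 - similarity) of matched pairs.
  const scorable = embA.length + embB.length;
  let distance: number | null = null;
  if (scorable > 0) {
    const unmatchedShare = (unmatchedA.length + unmatchedB.length) / scorable;
    const meanPairDistance =
      matched.length === 0
        ? 0
        : matched.reduce((sum, m) => sum + (1 - m.similarity), 0) / matched.length;
    distance = unmatchedShare + (1 - unmatchedShare) * meanPairDistance;
  }

  return { matched, unmatchedA, unmatchedB, skipped, distance };
}

// Usage: toy 3-dimensional vectors stand in for real embeddings.
const result = compareDecisions(
  [
    { id: "a1", summary: "Spacing follows a 4px scale", embedding: [1, 0, 0] },
    { id: "a2", summary: "Buttons are fully rounded", embedding: [0, 1, 0] },
    { id: "a3", summary: "No embedding available" },
  ],
  [
    { id: "b1", summary: "All spacing is a multiple of 4px", embedding: [0.9, 0.1, 0] },
    { id: "b2", summary: "Dark mode is the default", embedding: [0, 0, 1] },
  ],
);
console.log(result);
```

In this toy input, `a1` and `b1` are paraphrases and pair up, `a2` and `b2` diverge and stay unmatched, and `a3` lands in `skipped` without affecting `distance`.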

Deleted legacy symbols

FINGERPRINT_SCHEMA, buildFingerprintPrompt, buildDesignLanguagePrompt, analyzeStructure, DesignLanguageProfile, EnrichedFingerprint.languageProfile, profileWithAnalysis, fingerprintFromExtraction, provider.interpret.

Test plan

  • pnpm check — biome + typecheck + file-size
  • pnpm test — 101/101 passing (new compare-decisions.test.ts covers paraphrase matching, divergence, and the missing-embedding fallback)
  • pnpm build
  • Manual: ghost profile ./a -v ./b captures both targets (was dropping the second with citty)
  • Manual: ghost compare / drift / comply / fleet complete end-to-end when the LLM uses tools (previously silently empty)
  • Review the updated concepts page rendering at /docs/concepts

🤖 Generated with Claude Code

nahiyankhan and others added 3 commits April 17, 2026 00:24
Unify the fingerprinting pipeline as AI-only with a single multi-target
extract path, add embedding-based decision comparison, and replace citty
with cac for proper variadic CLI support.

**Pipeline unification (Director.profile)**

Collapse `extract` / `extractMulti` and `Director.profile` / `profileMulti`
into single `Target[]`-taking implementations. `setToolContext` now fires
unconditionally, fixing a latent bug where `compare`, `drift`, `comply`,
and `fleet` silently completed without fingerprints whenever the LLM
requested tools.

**Clean path hygiene**

Stop mutating `file.path` with `[label]` prefixes in multi-source
extraction; `sourceLabel` alone carries provenance. `run_extractor` takes
an optional `source` arg to disambiguate and errors on collision.
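A minimal sketch of the idea, for illustration only: `file.path` stays untouched, `sourceLabel` carries provenance, and a lookup raises an error when a path is ambiguous across sources unless `source` is given. The `ExtractedFile` shape and the `resolveFile` helper are hypothetical names, not the actual `run_extractor` implementation.

```typescript
// Hypothetical file shape: the path is never rewritten with a "[label]" prefix.
interface ExtractedFile {
  path: string;
  sourceLabel: string;
  content: string;
}

// Hypothetical helper: resolve a file by path, optionally narrowed by source.
function resolveFile(files: ExtractedFile[], path: string, source?: string): ExtractedFile {
  const candidates = files.filter(
    (f) => f.path === path && (source === undefined || f.sourceLabel === source),
  );
  const [first, ...rest] = candidates;
  if (first === undefined) {
    throw new Error(`No file found at ${path}${source ? ` in source ${source}` : ""}`);
  }
  if (rest.length > 0) {
    // Collision: the same path exists in several sources and no source was given.
    const labels = candidates.map((f) => f.sourceLabel).join(", ");
    throw new Error(`Path ${path} is ambiguous across sources (${labels}); pass a source`);
  }
  return first;
}

// Usage: two sources ship a file at the same relative path.
const files: ExtractedFile[] = [
  { path: "src/tokens.css", sourceLabel: "web", content: ":root { --gap: 4px; }" },
  { path: "src/tokens.css", sourceLabel: "docs", content: ":root { --gap: 8px; }" },
];
const webTokens = resolveFile(files, "src/tokens.css", "web");
console.log(webTokens.sourceLabel, webTokens.path);
```

Keeping the label out of `path` means the path can still be used directly for file access and display, while disambiguation becomes an explicit, checkable argument.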

**AI-only, no fallbacks**

Remove `provider.interpret` and both fallback paths inside
`FingerprintAgent` (no-chat-support, 20-iteration safety). `profile()`
throws without LLM config and routes cwd through the agent pipeline.
`profileRegistry` stays as a distinct deterministic shadcn path.

Delete legacy symbols: `FINGERPRINT_SCHEMA`, `buildFingerprintPrompt`,
`buildDesignLanguagePrompt`, `analyzeStructure`, `DesignLanguageProfile`,
`EnrichedFingerprint.languageProfile`, `profileWithAnalysis`,
`fingerprintFromExtraction`.

**Decision comparison via embeddings**

Add `DesignDecision.embedding?: number[]` and a batched `embedTexts`
primitive. Both agent paths embed decisions at profile time when an
embedding provider is configured. `compareFingerprints` replaces word-
overlap matching with cosine similarity above a 0.75 threshold,
blending unmatched coverage with matched-pair distance. Missing
embeddings → decisions reported qualitatively, excluded from the
weighted scalar so they don't pollute the number. New test covers
paraphrase matching, divergence, and the missing-embedding fallback.

**citty → cac**

All 17 commands migrated. Three `process.argv` scanning hacks
(`profile`, `fleet`, `viz`) deleted — cac's `[...targets]` handles
variadic natively. `ghost profile ./a -v ./b` now captures both
targets instead of dropping the second one.

Verified: typecheck, biome check, file-size check, build, 101/101 tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the "64-dim vector" framing with the observation/decisions/values
model. Swap the stale "Architecture 15%" weight card for "Decisions 15%"
(matches WEIGHTS_WITH_DECISIONS in fingerprint/compare.ts) and rename the
radar axis to match. Add a paragraph describing paraphrase-robust
decision matching via cosine similarity above 0.75.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nahiyankhan nahiyankhan merged commit bab9dcc into main Apr 17, 2026
6 checks passed