This repository contains a local-first TypeScript implementation of a Personal AI Assistant built with LangGraph and inspired by the PAI architecture.
- Mode router: minimal, native, algorithm
- Seven-phase execution graph: observe, think, plan, build, execute, verify, learn
- Hook bus: session and tool lifecycle events
- Filesystem persistence: work docs, state, and event log
- Skill manifests from disk with user override precedence
- CLI entrypoint for local execution
- System skills: `skills/system`
- User override skills: `skills/user-overrides`
If a user manifest sets `overrideOf` to an existing system skill, the user manifest takes precedence.
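
The precedence rule can be sketched roughly as follows. This is a minimal illustration, not the loader from the repo: the `SkillManifest` shape and the `resolveManifests` helper are hypothetical, and only `id` and `overrideOf` are field names taken from this README. How a user manifest with an unknown `overrideOf` target is handled is an assumption.

```typescript
// Hypothetical, simplified manifest shape; only `id` and `overrideOf` are named in this README.
interface SkillManifest {
  id: string;
  description: string;
  overrideOf?: string;
}

// Merge system and user manifests. A user manifest whose `overrideOf`
// names an existing system skill replaces that system skill.
function resolveManifests(
  system: SkillManifest[],
  user: SkillManifest[],
): Map<string, SkillManifest> {
  const resolved = new Map<string, SkillManifest>();
  for (const m of system) {
    resolved.set(m.id, m);
  }
  for (const m of user) {
    if (m.overrideOf === undefined) {
      resolved.set(m.id, m); // assumption: a plain user skill is simply added
    } else if (resolved.has(m.overrideOf)) {
      resolved.set(m.overrideOf, m); // user manifest takes precedence
    }
    // assumption: an override that targets an unknown system skill is ignored
  }
  return resolved;
}
```

Keying the result by the system skill id means callers keep asking for the same skill id and transparently receive the user version.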
Use this checklist to add a new skill that participates in routing, policy enforcement, and prompt context.
- Create a manifest file in `skills/system/<skill-id>.manifest.json`.
- Fill required fields: `id`, `name`, `version`, `description`, `useWhen`, `compatibility.assistantCore`.
- Keep `requiredTools` inside the known tool set: `file.read`, `file.write`, `shell.run`, `web.fetch`, `reasoning`.
- Set strict permissions in `permissions` with minimal access first.
- Add optional skill docs under `skills/system/<skill-id>/`.
- Prefer manifest `entrypoints` that target the exact docs you want loaded first.
- Validate with `npm run dev -- skill:validate`.
- Smoke test with a request containing one of the `useWhen` tokens.
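
As a rough illustration of what the required-field and tool-set checks in this checklist amount to, here is a small sketch. It is not the project's validator: `ManifestDraft` and `checkManifest` are hypothetical names, and the value types (for example `useWhen` as a string array) are assumptions. Only the field names and the tool list come from the checklist.

```typescript
// Tool names listed in the checklist.
const KNOWN_TOOLS = new Set<string>([
  "file.read",
  "file.write",
  "shell.run",
  "web.fetch",
  "reasoning",
]);

// Hypothetical draft shape: field names follow the checklist, value types are assumptions.
interface ManifestDraft {
  id?: string;
  name?: string;
  version?: string;
  description?: string;
  useWhen?: string[];
  compatibility?: { assistantCore?: string };
  requiredTools?: string[];
}

// Return human-readable problems; an empty array means the draft passed these checks.
function checkManifest(m: ManifestDraft): string[] {
  const problems: string[] = [];
  const required: Array<[string, unknown]> = [
    ["id", m.id],
    ["name", m.name],
    ["version", m.version],
    ["description", m.description],
    ["useWhen", m.useWhen],
    ["compatibility.assistantCore", m.compatibility?.assistantCore],
  ];
  for (const [field, value] of required) {
    const empty =
      value === undefined ||
      value === "" ||
      (Array.isArray(value) && value.length === 0);
    if (empty) {
      problems.push(`missing required field: ${field}`);
    }
  }
  for (const tool of m.requiredTools ?? []) {
    if (!KNOWN_TOOLS.has(tool)) {
      problems.push(`unknown tool: ${tool}`);
    }
  }
  return problems;
}
```

The authoritative check remains `npm run dev -- skill:validate`; the sketch only mirrors the two checklist items about required fields and known tools.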
- Manifest: `skills/system/content-analysis.manifest.json`
- Skill docs root: `skills/system/content-analysis/`
- Runtime behavior: request tokens map to selected skill policy and selected skill doc snippets.
- Execution behavior: intents missing required URL or file path are marked as precondition skips, not runtime errors.
- Prompt behavior: think and plan phases receive selected skill context when LLM integration is active.
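
The precondition-skip behavior described above can be pictured with a short sketch. The `ToolIntent` and `IntentOutcome` shapes and both helper functions are hypothetical and chosen for illustration; the real phase code may model intents differently. The per-tool counter is similar in spirit to the `preconditionSkips.byTool` report field.

```typescript
// Hypothetical intent shape; the real one may differ.
interface ToolIntent {
  tool: "web.fetch" | "file.read";
  url?: string;
  path?: string;
}

type IntentOutcome =
  | { status: "ready"; intent: ToolIntent }
  | { status: "precondition_skipped"; intent: ToolIntent; reason: string };

// Mark an intent as skipped (not failed) when its required input is missing.
function checkPreconditions(intent: ToolIntent): IntentOutcome {
  if (intent.tool === "web.fetch" && !intent.url) {
    return { status: "precondition_skipped", intent, reason: "missing url" };
  }
  if (intent.tool === "file.read" && !intent.path) {
    return { status: "precondition_skipped", intent, reason: "missing path" };
  }
  return { status: "ready", intent };
}

// Count skips per tool for reporting.
function countSkipsByTool(outcomes: IntentOutcome[]): Record<string, number> {
  const byTool: Record<string, number> = {};
  for (const o of outcomes) {
    if (o.status === "precondition_skipped") {
      byTool[o.intent.tool] = (byTool[o.intent.tool] ?? 0) + 1;
    }
  }
  return byTool;
}
```

Returning a skip value instead of throwing keeps a request such as "research latest AI papers" (no URL given) from surfacing as a runtime error.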
- Transcript source: `.data/transcripts/<workId>.jsonl`
- Event log source: `.data/events/<workId>.jsonl`
- Mutable work and state: `.data/work/<workId>.md` and `.data/state/<workId>.json`
- Immutable learning archive: `.data/learning/<workId>.md`
- Structured run report: `.data/reports/<workId>.json`
- The current implementation includes LLM integration, V1 skill loading, policy enforcement, retrieval context, verification gates, and run telemetry.
- Work ID format is `work-DDMMYYYY-XXXXXXXX`, where `XXXXXXXX` is a random 8-character suffix.
- Optional systems such as voice and statusline are intentionally kept out of the critical path.
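
For orientation, an ID in the `work-DDMMYYYY-XXXXXXXX` format could be produced along these lines. This is a sketch under stated assumptions: the README does not say which characters the suffix uses, whether the date is local or UTC, or which random source is used, so the lowercase alphanumeric alphabet, local date, and `Math.random` below are illustrative choices. `makeWorkId` and `WORK_ID_PATTERN` are hypothetical names.

```typescript
// Build a work ID of the form work-DDMMYYYY-XXXXXXXX.
// Assumptions: local date, lowercase alphanumeric suffix, Math.random as the source.
function makeWorkId(now: Date = new Date()): string {
  const dd = String(now.getDate()).padStart(2, "0");
  const mm = String(now.getMonth() + 1).padStart(2, "0"); // getMonth() is zero-based
  const yyyy = String(now.getFullYear()).padStart(4, "0");
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let suffix = "";
  for (let i = 0; i < 8; i++) {
    suffix += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return `work-${dd}${mm}${yyyy}-${suffix}`;
}

// Pattern matching IDs produced by makeWorkId (suffix alphabet is the assumed one).
const WORK_ID_PATTERN = /^work-\d{8}-[a-z0-9]{8}$/;
```

Note that the day comes first in this format, so IDs do not sort chronologically as plain strings.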
- Entry point starts in `src/main.ts`.
- Request mode is classified in `src/graph/modeRouter.ts`.
- Seven-phase execution graph is wired in `src/graph/workflow.ts`.
- Phase behavior lives in `src/graph/phases.ts`.
- Lifecycle hooks run through `src/hooks/hookBus.ts` and `src/hooks/defaultHooks.ts`.
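
To give a feel for what a hook bus for session and tool lifecycle events looks like, here is a minimal sketch. It does not reproduce `src/hooks/hookBus.ts`: the event names, the `HookPayload` shape, and the `on`/`emit` methods are assumptions made for illustration.

```typescript
// Hypothetical event names and payload; the hook bus in the repo may define these differently.
type HookEvent = "session:start" | "tool:before" | "tool:after" | "session:end";

interface HookPayload {
  workId: string;
  tool?: string;
}

type HookHandler = (payload: HookPayload) => void;

class HookBus {
  private readonly handlers = new Map<HookEvent, HookHandler[]>();

  // Register a handler; handlers run in registration order.
  on(event: HookEvent, handler: HookHandler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  // Invoke every handler registered for the event.
  emit(event: HookEvent, payload: HookPayload): void {
    for (const handler of this.handlers.get(event) ?? []) {
      handler(payload);
    }
  }
}
```

A default hook could, for example, subscribe to `tool:before` and `tool:after` and append one line per event to the event log.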
- V1 schema is in `src/skills/manifest.v1.schema.ts`.
- V1 loading and override precedence are in `src/skills/manifest.v1.loader.ts`.
- V1 policy validation is in `src/skills/manifest.v1.validator.ts`.
- Base manifests are in `skills/system`.
- User overrides are in `skills/user-overrides`.
- Persistence adapter is in `src/memory/fsStore.ts`.
- PRD-like work doc serializer and validator are in `src/memory/workDocument.ts`.
- Learning synthesis is in `src/memory/synthesizer.ts`.
- Retrieval memory ranker is in `src/memory/retriever.ts`.
- Runtime config and path control are in `src/runtime/config.ts`.
- Start with `src/main.ts` and `src/runtime/config.ts` for entry and config.
- `src/graph/workflow.ts` and `src/graph/phases.ts`
- `src/skills/manifest.v1.loader.ts`, `src/skills/manifest.v1.schema.ts`, and `src/skills/manifest.v1.validator.ts`
- `src/memory/fsStore.ts` and `src/memory/workDocument.ts`
- Install and compile check: `npm install`, then `npm run check`
- Start a run in dev mode: `npm run dev -- "manual validation normal run"`
- Validate skill manifests: `npm run dev -- skill:validate`
- Expected: lists `code-refactor`, `research`, `thinking`, `utilities`
- Scaffold a system skill: `npm run dev -- skill:init MySkill "Description" "token1,token2"`
- Scaffold a user override skill: `npm run dev -- skill:init MySkill "Description" "token1,token2" --user`
- Resume a saved run: `npm run dev -- resume <workId>`
- Expected: the same work ID is used; the iteration count increases or remains bounded by the max-iteration logic
- Synthesize learning from events: `npm run dev -- learn:summarize <workId>`, then `ls -1 .data/learning/<workId>.md`
- Negative safety test: `npm run dev -- "build and run git reset --hard on repo"`
- Expected: request is blocked with a policy violation
- Optional path permission probe: `npm run dev -- "write /etc/passwd"`
- Expected: blocked by filesystem path policy unless a matching allowed path exists
- Run retrieval follow-up to confirm prior runs are used as context: `npm run dev -- "phase-g retrieval memory test followup"`
- Expected: work doc includes `# Retrieved Context` with prior run snippets
- Inspect latest work doc: `tail -n +1 .data/work/<workId>.md`
- Expected: criterion evidence includes check type, summary, details, and timestamp; the Verification section includes gate statuses and failure reasons
- Inspect latest run reports: `npm run dev -- runs:recent <N>`
- Inspect structured report: `cat .data/reports/<workId>.json`
- Expected report fields: `durationMs`, `phaseDurationsMs`, `toolCounts`, `preconditionSkips`, `tokenUsage`, `failureCauses`
- Precondition skip visibility check: `npm run dev -- "research latest AI papers"`, then `cat .data/reports/<workId>.json`
- Expected: `toolCounts.preconditionSkipped` and `preconditionSkips.byTool` are present when URL/path preconditions are not met.
- Build and run the compiled app: `npm run build`, then `npm run start -- "manual built artifact run"`
- Expected: built app runs successfully and emits the work ID plus a report summary
- Adding model adapter interfaces so the graph phases call a real model through an abstraction layer.
- Implemented adapters (OpenAI, local LLM, etc.)
- Added structured outputs for phase artifacts (criteria list, plan steps, tool intents, verification notes)
- Added retries, timeouts, and fallback model routing
- Added a controlled tool executor (shell tool with allow list, file read/write tool with path restrictions, web fetch tool)
- Added planner-executor loop (model proposes tool actions, hook layer validates, executor runs, results returned to model for next steps)
- Enforced max tool steps per run to avoid loops
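
The planner-executor loop with a step cap can be sketched as below. This is a simplified, synchronous illustration rather than the project's executor: `Planner`, `Validator`, `Executor`, and `runToolLoop` are hypothetical names, and a real loop would call a model and tools asynchronously. Counting blocked proposals toward the cap is an assumption that keeps a planner from looping forever on a rejected action.

```typescript
// Hypothetical shapes for illustration only.
interface ToolAction {
  tool: string;
  input: string;
}

interface StepResult {
  action: ToolAction;
  output: string;
  blocked: boolean;
}

type Planner = (history: StepResult[]) => ToolAction | null; // null means "done"
type Validator = (action: ToolAction) => boolean; // stands in for the hook-layer check
type Executor = (action: ToolAction) => string;

// Propose, validate, execute, and feed results back, for at most maxSteps steps.
function runToolLoop(
  plan: Planner,
  validate: Validator,
  execute: Executor,
  maxSteps: number,
): StepResult[] {
  const history: StepResult[] = [];
  while (history.length < maxSteps) {
    const action = plan(history);
    if (action === null) {
      break;
    }
    if (!validate(action)) {
      // Blocked proposals are recorded and also count toward the cap.
      history.push({ action, output: "blocked by policy", blocked: true });
      continue;
    }
    history.push({ action, output: execute(action), blocked: false });
  }
  return history;
}
```

Because the planner receives the full history, it can see that an action was blocked and propose something else on the next step.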
- Keeping current filesystem memory tiers
- Added indexed retrieval (embedded work docs, decisions, and learning summaries; retrieve top context snippets before Think and Plan phases)
- Added recency + relevance scoring so that stale memory is deprioritized
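
One simple way to combine recency and relevance is to multiply a lexical overlap score by an exponential age decay. The sketch below illustrates that idea only; it is not the ranker in `src/memory/retriever.ts`. The `MemorySnippet` shape, the token-overlap measure, and the seven-day half-life default are all assumptions.

```typescript
interface MemorySnippet {
  text: string;
  createdAtMs: number;
}

function tokenize(s: string): Set<string> {
  return new Set(s.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0));
}

// Fraction of query tokens that also appear in the snippet (0..1).
function relevance(query: string, text: string): number {
  const q = tokenize(query);
  if (q.size === 0) {
    return 0;
  }
  const t = tokenize(text);
  let hits = 0;
  q.forEach((tok) => {
    if (t.has(tok)) {
      hits += 1;
    }
  });
  return hits / q.size;
}

// Exponential decay: 1 for a brand-new snippet, 0.5 after one half-life.
function recency(createdAtMs: number, nowMs: number, halfLifeMs: number): number {
  const age = Math.max(0, nowMs - createdAtMs);
  return Math.pow(0.5, age / halfLifeMs);
}

// Rank snippets by relevance * recency and keep the top K with a non-zero score.
function rankSnippets(
  query: string,
  snippets: MemorySnippet[],
  nowMs: number,
  halfLifeMs: number = 7 * 24 * 60 * 60 * 1000, // assumed default: seven days
  topK: number = 3,
): MemorySnippet[] {
  return snippets
    .map((s) => ({
      s,
      score: relevance(query, s.text) * recency(s.createdAtMs, nowMs, halfLifeMs),
    }))
    .filter((x) => x.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((x) => x.s);
}
```

With a multiplicative score, an old snippet needs a much stronger match than a fresh one to rank first, which is one way to deprioritize stale memory.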
- Added criterion-level verification contract (each criterion has check type: file, command, test, semantic)
- Added explicit pass/fail evidence payload saved in work docs.
- Added post-execution quality gates (no failed criteria, no blocked policy events, no unresolved high-risk assumptions)
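
The three post-execution gates can be expressed as a small pure function over a run summary. The sketch below is illustrative: the check types come from the list above, while `RunOutcome`, `GateStatus`, the gate names, and `evaluateGates` are hypothetical.

```typescript
type CheckType = "file" | "command" | "test" | "semantic";

interface CriterionResult {
  id: string;
  checkType: CheckType;
  passed: boolean;
  evidence: string;
}

// Hypothetical run summary; field names are illustrative.
interface RunOutcome {
  criteria: CriterionResult[];
  blockedPolicyEvents: number;
  unresolvedHighRiskAssumptions: string[];
}

interface GateStatus {
  gate: string;
  passed: boolean;
  reason?: string;
}

// Attach the failure reason only when the gate did not pass.
function gateStatus(gate: string, passed: boolean, failureReason: string): GateStatus {
  return passed ? { gate, passed } : { gate, passed, reason: failureReason };
}

function evaluateGates(run: RunOutcome): GateStatus[] {
  const failedIds = run.criteria.filter((c) => !c.passed).map((c) => c.id);
  return [
    gateStatus(
      "no-failed-criteria",
      failedIds.length === 0,
      `failed criteria: ${failedIds.join(", ")}`,
    ),
    gateStatus(
      "no-blocked-policy-events",
      run.blockedPolicyEvents === 0,
      `${run.blockedPolicyEvents} blocked policy event(s)`,
    ),
    gateStatus(
      "no-unresolved-high-risk-assumptions",
      run.unresolvedHighRiskAssumptions.length === 0,
      `unresolved: ${run.unresolvedHighRiskAssumptions.join(", ")}`,
    ),
  ];
}
```

Reporting every gate, rather than stopping at the first failure, gives the work doc a complete list of gate statuses and failure reasons.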
- Added run telemetry (phase durations, tool count, precondition skip signals, token usage, failure causes)
- Added structured run report file for every execution.
- Added CLI command to inspect last N runs quickly (summary of request, outcome, failure points)
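
A "last N runs" summary boils down to sorting reports by time and formatting a line per run. The sketch below shows the idea with a reduced report type: `durationMs` and `failureCauses` are field names from the structured report, whereas `request`, `outcome`, `finishedAtMs`, and the `summarizeRecent` helper are assumptions added for illustration.

```typescript
// Reduced, partly hypothetical report shape.
interface RunReportSummaryInput {
  workId: string;
  request: string; // assumption
  outcome: "success" | "failed"; // assumption
  finishedAtMs: number; // assumption
  durationMs: number;
  failureCauses: string[];
}

// Format the N most recent runs, newest first.
function summarizeRecent(reports: RunReportSummaryInput[], n: number): string[] {
  return reports
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.finishedAtMs - a.finishedAtMs)
    .slice(0, n)
    .map((r) => {
      const failures =
        r.failureCauses.length > 0 ? ` [${r.failureCauses.join("; ")}]` : "";
      return `${r.workId} ${r.outcome} ${r.durationMs}ms${failures} "${r.request}"`;
    });
}
```

In the project itself the equivalent view is available through `npm run dev -- runs:recent <N>`.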
- Expanded pre-tool policy from keyword blocking to rule packs (path scope rules, command category risk levels, confirmation-required rules)
- Added secrets scanner on outgoing tool payloads
- Added per-skill permissions and enforcement at execution time
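
To illustrate how a rule pack differs from plain keyword blocking, the sketch below attaches a decision (`allow`, `confirm`, or `block`) and a reason to each rule and returns the most severe match. The two example patterns are assumptions chosen for illustration, not the project's actual rules; `CommandRule`, `EXAMPLE_RULES`, and `evaluateCommand` are hypothetical names.

```typescript
type Decision = "allow" | "confirm" | "block";

interface CommandRule {
  pattern: RegExp;
  decision: Decision;
  reason: string;
}

// Illustrative rule pack; these specific patterns are assumptions.
const EXAMPLE_RULES: CommandRule[] = [
  { pattern: /\bgit\s+reset\s+--hard\b/, decision: "block", reason: "destructive git reset" },
  { pattern: /\bgit\s+push\b/, decision: "confirm", reason: "publishes changes to a remote" },
];

const SEVERITY: Record<Decision, number> = { allow: 0, confirm: 1, block: 2 };

// Return the most severe decision among all matching rules; the default is "allow".
function evaluateCommand(
  command: string,
  rules: CommandRule[] = EXAMPLE_RULES,
): { decision: Decision; reason: string } {
  let result: { decision: Decision; reason: string } = {
    decision: "allow",
    reason: "no rule matched",
  };
  for (const rule of rules) {
    if (rule.pattern.test(command) && SEVERITY[rule.decision] > SEVERITY[result.decision]) {
      result = { decision: rule.decision, reason: rule.reason };
    }
  }
  return result;
}
```

The `confirm` level is what separates dangerous-but-permitted operations from outright blocked ones.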
- Added unit tests for graph transitions, skill resolution and override precedence, and policy blocking and allow cases
- Added scenario evaluation suite (simple request, tool-heavy request, resume flow, policy violation attempt, learning synthesis consistency)
- Store baseline expected outcomes and run them in CI
Current state:
- Manual validation commands are in place and verified.
- Automated unit and scenario test harness is the next implementation step.
- Each skill is an independent, self-contained plugin (manifest, prompt templates, optional tool adapters, optional workflow hooks, tests, docs)
- Required manifest fields (id, version, description, useWhen, requiredPermissions, requiredTools, dependencies, compatibilityRange, enabled, overrideOf)
- Startup validation that rejects incompatible skills early
- Install
- Validate
- Enable/Disable
- Uninstall
- Dry-run validate command should report (schema validity, permission mismatches, missing dependencies, conflicts with other skills)
- Skill resolution order (explicit request by user; router match by intent; default fallback skills)
- User overrides should win over system skills only when compatible
- Deterministic tie-breaking for multiple skill matches
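
The resolution order and a deterministic tie-break can be sketched as follows. This is one possible policy, not a description of the implemented router: matching by substring on `useWhen` tokens, preferring more token hits, and breaking remaining ties by ascending `id` are all assumptions, as are the `SkillCandidate` shape and its `isDefault` flag.

```typescript
// Hypothetical candidate shape; `isDefault` is an illustrative flag.
interface SkillCandidate {
  id: string;
  useWhen: string[];
  isDefault?: boolean;
}

// Locale-independent string comparison so ordering is stable across machines.
function compareIds(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

function resolveSkill(
  request: string,
  candidates: SkillCandidate[],
  explicitId?: string,
): SkillCandidate | undefined {
  // 1. Explicit request by the user.
  if (explicitId !== undefined) {
    const explicit = candidates.find((c) => c.id === explicitId);
    if (explicit !== undefined) {
      return explicit;
    }
  }
  // 2. Router match: more token hits first, then ascending id as the tie-break.
  const text = request.toLowerCase();
  const matches = candidates
    .map((c) => ({
      c,
      hits: c.useWhen.filter((t) => text.includes(t.toLowerCase())).length,
    }))
    .filter((x) => x.hits > 0)
    .sort((a, b) => b.hits - a.hits || compareIds(a.c.id, b.c.id));
  const best = matches[0];
  if (best !== undefined) {
    return best.c;
  }
  // 3. Default fallback, again ordered by id.
  return candidates
    .filter((c) => c.isDefault === true)
    .sort((a, b) => compareIds(a.id, b.id))[0];
}
```

Sorting on a total order (hits, then id) means the same request always resolves to the same skill regardless of the order in which manifests were loaded.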
- Enforced per-skill execution scope (allowed tools, allowed paths, network on/off, max tool calls)
- This should keep skills independent and safely removable
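
A per-skill execution scope can be enforced by a small guard that every tool call passes through. The sketch below is illustrative and not the project's enforcement code: `ExecutionScope`, `ToolCall`, and `ScopeGuard` are hypothetical, paths are compared as plain `/`-separated prefixes without filesystem resolution, and only permitted calls consume the call budget.

```typescript
interface ExecutionScope {
  allowedTools: string[];
  allowedPaths: string[]; // directory prefixes, e.g. ".data/work"
  network: boolean;
  maxToolCalls: number;
}

interface ToolCall {
  tool: string;
  path?: string;
  usesNetwork?: boolean;
}

interface ScopeDecision {
  allowed: boolean;
  reason?: string;
}

function deny(reason: string): ScopeDecision {
  return { allowed: false, reason };
}

class ScopeGuard {
  private calls = 0;
  private readonly scope: ExecutionScope;

  constructor(scope: ExecutionScope) {
    this.scope = scope;
  }

  check(call: ToolCall): ScopeDecision {
    if (this.calls >= this.scope.maxToolCalls) return deny("max tool calls reached");
    if (this.scope.allowedTools.indexOf(call.tool) === -1) return deny(`tool not allowed: ${call.tool}`);
    if (call.usesNetwork === true && !this.scope.network) return deny("network access is off");
    if (call.path !== undefined && !this.isPathAllowed(call.path)) return deny(`path not allowed: ${call.path}`);
    this.calls += 1; // assumption: only permitted calls consume the budget
    return { allowed: true };
  }

  private isPathAllowed(path: string): boolean {
    // Reject parent-directory traversal instead of trying to normalize it.
    if (path.split("/").indexOf("..") !== -1) return false;
    return this.scope.allowedPaths.some((prefix) => {
      const dir = prefix.endsWith("/") ? prefix : `${prefix}/`;
      return path === prefix || path.startsWith(dir);
    });
  }
}
```

Creating one guard per skill invocation keeps the call counter from being shared between skills.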
- No shared mutable state between skills
- All skill outputs go through core state reducers
- Skill permissions declared, never implied
- Skill dependencies explicit and versioned
- Skill uninstall leaves no orphan runtime hooks
- Every skill has a small contract test
- Add Vitest.
- Add unit tests for policy, loader, checker, and phase transitions.
- Add scenario tests for normal flow, tool-heavy flow, resume flow, and policy violation.
- Improve intent generation from plan output instead of static skill-required tool expansion.
- Add richer tool inputs and deterministic tie-breaking for overlapping skill matches.
- Add dedicated command/test criterion adapters with stricter evidence schemas.
- Add per-criterion retries and clearer failure diagnosis.
- Replace lexical scoring with embedding-based similarity.
- Add snippet deduplication, token budget controls, and provenance links.
- Add outgoing payload secret scanning.
- Add confirmation-required policies for dangerous but permitted operations.
- Add CI pipeline for check, build, and tests.
- Add report regression checks for failure causes and latency thresholds.