feat: skills editor, memory files editor, dashboard shell, five built-in skills by mcheemaa · Pull Request #56 · ghostwright/phantom

mcheemaa · 2026-04-13T23:41:53Z

Summary

Ships PR1 of Project 3: the operator dashboard for Phantom.

Two tabs are live and production-grade in this PR:

Skills: create, read, update, delete, and lint markdown skills under the user-scope .claude/skills/ tree. Structured YAML frontmatter form plus a Monaco-quality body textarea with keyboard save, dirty-state tracking, atomic writes, and every edit audited in SQLite.
Memory files: the same CRUD story for arbitrary .md files under the user-scope .claude/ tree (excluding skills, plugins, agents, and settings JSON). CLAUDE.md, rules, and free-form memory all live here.

Six additional tabs (sessions, cost, scheduler, evolution, memory explorer, settings) ship as Coming Soon placeholders in the same dashboard shell.

Architecture

Storage (src/skills/, src/memory-files/): path validation, Zod-validated YAML frontmatter, linter, atomic tmp-then-rename writes, audit log tables.
API (src/ui/api/skills.ts, src/ui/api/memory-files.ts): JSON CRUD routes wired behind the existing cookie auth check in src/ui/serve.ts. Every mutating call records a row in skill_audit_log or memory_file_audit_log.
Reflective tools (src/agent/in-process-reflective-tools.ts): a new in-process MCP server (phantom-reflective) that exposes phantom_memory_search (semantic + temporal) and phantom_list_sessions directly to the agent, so the built-in reflective skills can actually fire.
Dashboard awareness (src/agent/prompt-blocks/dashboard-awareness.ts): a short block added to the environment section of the system prompt so the agent knows the dashboard exists and can direct the operator to it.
Dashboard UI (public/dashboard/): a single static HTML shell with a sidebar, hash router, and two JS modules. Vanilla JS, no React, no build step. Tailwind v4 tokens inherited from the existing phantom design vocabulary.
Built-in skills (skills-builtin/): mirror, thread, echo, overheard, ritual, show-my-tools. Seeded into the user-scope skills volume on container first boot; existing edits are preserved.
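
The atomic tmp-then-rename write named under Storage can be sketched roughly as follows. This is an illustration only, not the code in src/skills/ or src/memory-files/; the function name writeFileAtomic and the temp-file naming scheme are assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: write to a sibling temp file, then rename over the target.
// rename() within one directory is atomic on POSIX filesystems, so a reader sees
// either the old file or the new one, never a half-written mix.
function writeFileAtomic(targetPath: string, content: string): void {
  const dir = path.dirname(targetPath);
  const tmpName = `.${path.basename(targetPath)}.${process.pid}.${Date.now()}.tmp`;
  const tmpPath = path.join(dir, tmpName);
  try {
    // "wx" fails if the temp file already exists instead of clobbering it.
    fs.writeFileSync(tmpPath, content, { encoding: "utf8", flag: "wx" });
    fs.renameSync(tmpPath, targetPath);
  } catch (err) {
    // Best-effort cleanup so a failed write does not leave debris behind.
    fs.rmSync(tmpPath, { force: true });
    throw err;
  }
}
```

Keeping the temp file in the same directory as the target matters: a rename across filesystems is not atomic and can fail outright.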

Test plan

bun test passes (1040 pass, 0 fail, +62 new tests vs main)
bun run lint clean
bun run typecheck clean
Manual walk-through of the dashboard in a browser: create, edit, and delete a skill; create, edit, and delete a memory file; verify the beforeunload guard fires on dirty state; verify theme toggle; verify Coming Soon placeholders render
Deploy a container build and verify the six built-in skills land in ~/.claude/skills/ on first boot
Send a Slack message that triggers one of the reflective skills and confirm it loads memory via the in-process tool

Rollback

Single commit-range revert on the branch. No schema rollback is needed because the two new migrations are additive tables with indices; leaving them in an inactive deployment is safe. The pre-existing /ui/ surface has no coupling to the new dashboard, so removing the new /ui/dashboard/ tree and the new API routes is a clean undo.

Add the on-disk storage layers for the PR1 dashboard's skills and memory files tabs. Both subsystems validate paths, parse and serialize content atomically via tmp-then-rename, cap content size, and record every edit in a new SQLite audit log. Skills at /home/phantom/.claude/skills/<name>/SKILL.md are parsed with a Zod-validated strict YAML frontmatter schema (name, description, when_to_use required; allowed-tools, argument-hint, arguments, context, disable-model-invocation optional) and linted for missing fields, body size, and shell red-list patterns. Memory files are any markdown under /home/phantom/.claude/ excluding skills/, plugins/, agents/, and the settings.json pair. Adds skill_audit_log and memory_file_audit_log tables with indices. Updates the migration test for the four new migrations.
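
As a rough illustration of the memory-file path rules described above (markdown only, under /home/phantom/.claude/, excluding skills/, plugins/, and agents/), a validator might look like the sketch below. The function name, the result type, and the error strings are assumptions and are not taken from src/memory-files/.

```typescript
import * as path from "node:path";

const MEMORY_ROOT = "/home/phantom/.claude";
// Top-level directories that belong to other subsystems.
const EXCLUDED_DIRS = ["skills", "plugins", "agents"];

type PathCheck = { ok: true; absolute: string } | { ok: false; reason: string };

// Hypothetical validator for a path supplied relative to the memory root.
function validateMemoryFilePath(relative: string, root: string = MEMORY_ROOT): PathCheck {
  if (relative.length === 0 || relative.includes("\0")) {
    return { ok: false, reason: "empty or malformed path" };
  }
  if (path.posix.isAbsolute(relative)) {
    return { ok: false, reason: "path must be relative to the memory root" };
  }
  // resolve() collapses "." and ".." segments, so traversal shows up as a prefix mismatch.
  const absolute = path.posix.resolve(root, relative);
  if (!absolute.startsWith(root + "/")) {
    return { ok: false, reason: "path escapes the memory root" };
  }
  // Requiring .md also rules out the settings JSON files without naming them.
  if (!absolute.endsWith(".md")) {
    return { ok: false, reason: "only .md files are allowed" };
  }
  const firstSegment = absolute.slice(root.length + 1).split("/")[0] ?? "";
  if (EXCLUDED_DIRS.includes(firstSegment)) {
    return { ok: false, reason: `${firstSegment}/ is managed elsewhere` };
  }
  return { ok: true, absolute };
}
```

Under these assumptions, "CLAUDE.md" and "rules/style.md" pass, while "../x.md", "skills/mirror/SKILL.md", and "settings.json" are rejected.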

Wire /ui/api/skills and /ui/api/memory-files into the existing serve.ts request pipeline. Both route sets live behind the phantom_session cookie check and return JSON bodies. Every mutating call records a row in the appropriate audit log table.

Routes:
- GET /ui/api/skills list
- POST /ui/api/skills create
- GET /ui/api/skills/:name read
- PUT /ui/api/skills/:name update
- DELETE /ui/api/skills/:name delete
- GET /ui/api/memory-files list
- POST /ui/api/memory-files create (body includes path)
- GET /ui/api/memory-files/<encoded-path> read
- PUT /ui/api/memory-files/<encoded-path> update
- DELETE /ui/api/memory-files/<encoded-path> delete

Adds setDashboardDb() to hand the database into the api dispatch so the audit log writes go through the already-open connection used elsewhere.
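
The route table in this commit maps an HTTP method plus path onto one of five CRUD actions per resource. A dependency-free sketch of that mapping is shown below; matchApiRoute and the RouteMatch shape are hypothetical names for illustration and do not reflect how src/ui/serve.ts actually dispatches.

```typescript
type Resource = "skills" | "memory-files";
type Action = "list" | "create" | "read" | "update" | "delete";

interface RouteMatch {
  resource: Resource;
  action: Action;
  target?: string; // skill name, or decoded memory-file path
}

const COLLECTION_ACTIONS = new Map<string, Action>([
  ["GET", "list"],
  ["POST", "create"],
]);
const ITEM_ACTIONS = new Map<string, Action>([
  ["GET", "read"],
  ["PUT", "update"],
  ["DELETE", "delete"],
]);

// Hypothetical matcher: returns null for anything outside the ten routes.
function matchApiRoute(method: string, pathname: string): RouteMatch | null {
  const match = /^\/ui\/api\/(skills|memory-files)(?:\/(.*))?$/.exec(pathname);
  if (!match) return null;
  const resource = match[1] as Resource;
  const rawTarget = match[2] ?? "";
  if (rawTarget === "") {
    const action = COLLECTION_ACTIONS.get(method);
    return action ? { resource, action } : null;
  }
  let target: string;
  try {
    target = decodeURIComponent(rawTarget);
  } catch {
    return null; // malformed percent-encoding
  }
  // A skill is addressed by a single name segment; memory files carry an encoded path.
  if (resource === "skills" && target.includes("/")) return null;
  const action = ITEM_ACTIONS.get(method);
  return action ? { resource, action, target } : null;
}
```

For example, PUT /ui/api/memory-files/rules%2Fstyle.md would resolve to an update on "rules/style.md" under this sketch. The decoded target would still need the storage layer's path validation before touching disk.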

Expose phantom_memory_search and phantom_list_sessions to the agent itself as a new in-process MCP server (phantom-reflective). This is what makes the five reflective built-in skills actually fireable: they can query memory with temporal filters and enumerate recent sessions without having to round-trip through the external MCP endpoint at /mcp. phantom_memory_search wraps MemorySystem.recallEpisodes and recallFacts with an optional days_back filter that maps to RecallOptions.timeRange and the 'temporal' strategy. phantom_list_sessions reads the sessions SQLite table directly with channel and days_back filters. Also adds a small 'dashboard awareness' prompt block wired into the environment section of the system prompt. The agent now knows the dashboard exists at /ui/dashboard/, that its skills and memory files are editable there, and that it should point the operator at those URLs when asked 'what can I edit' or similar.
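
To make the days_back mapping concrete, here is a minimal sketch of turning that optional input into a time-bounded recall request. The RecallOptions and TimeRange shapes below are assumed for illustration; the real types live in the memory system and may differ.

```typescript
// Assumed shapes, not copied from the phantom codebase.
interface TimeRange {
  from: Date;
  to: Date;
}

interface RecallOptions {
  limit: number;
  strategy?: "temporal";
  timeRange?: TimeRange;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: without days_back the recall is unbounded in time;
// with it, the window is [now - days_back, now] and the temporal strategy is used.
function buildRecallOptions(
  limit: number,
  daysBack: number | undefined,
  now: Date = new Date(),
): RecallOptions {
  if (daysBack === undefined) return { limit };
  if (!Number.isFinite(daysBack) || daysBack <= 0) {
    throw new RangeError("days_back must be a positive number");
  }
  const from = new Date(now.getTime() - daysBack * MS_PER_DAY);
  return { limit, strategy: "temporal", timeRange: { from, to: now } };
}
```

Passing `now` as a parameter keeps the helper deterministic in tests.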

The hand-crafted operator dashboard at /ui/dashboard/. A single static HTML shell with a sticky nav, a sidebar of eight tabs, and a main content area that hash-routes between live tabs and Coming Soon placeholders.

Two tabs are live in PR1:
- Skills: left column of skill cards grouped by source (built-in vs yours) with substring search. Right column is a full-fidelity editor with a structured YAML frontmatter form (name, description, when_to_use, allowed-tools chip input with autocomplete, argument hint, context select, user-invoke-only toggle) and a large JetBrains Mono body textarea. Tab inserts two spaces, Cmd/Ctrl+S saves, dirty-state dot pulses next to the title, lint hints render under the body, delete has a confirm modal. New skill modal offers blank or duplicate-from-mirror templates.
- Memory files: same split layout over any .md file under /home/phantom/.claude/. New file modal accepts any path ending in .md, creates nested directories automatically, and opens the editor on the new file. CLAUDE.md gets a small info banner noting it is the top-level memory loaded every session.

Six Coming Soon placeholders (sessions, cost, scheduler, evolution, memory explorer, settings) render a serif headline, the expected PR, and a link back to skills.

No React, no build step. Vanilla JS with one namespaced helper per tab. Tailwind and DaisyUI tokens are not loaded on this shell; the dashboard inherits the phantom- vocabulary spiritually by declaring its own phantom-nav, phantom-chip, phantom-mono, phantom-meta, and phantom-muted classes from the same token values as the base template. Light and dark themes share the same primary indigo with cream or warm-deep-dark surfaces.

Also adds a Dashboard quick-link to the existing /ui/ landing page. beforeunload guards unsaved edits and the router respects the dirty state on navigation.
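
The hash routing and dirty-state guard described in this commit could be modeled along these lines. The dashboard itself is vanilla JS; this TypeScript sketch only illustrates the logic, and the tab slugs, the fallback to the skills tab, and both function names are assumptions.

```typescript
// Assumed slugs for the eight tabs; the real hash values may differ.
const LIVE_TABS: readonly string[] = ["skills", "memory-files"];
const PLACEHOLDER_TABS: readonly string[] = [
  "sessions", "cost", "scheduler", "evolution", "memory-explorer", "settings",
];

interface View {
  tab: string;
  kind: "live" | "coming-soon";
}

// Map a location hash such as "#/skills/mirror" onto a tab.
function resolveHash(hash: string): View {
  const slug = hash.replace(/^#\/?/, "").split("/")[0] ?? "";
  if (LIVE_TABS.includes(slug)) return { tab: slug, kind: "live" };
  if (PLACEHOLDER_TABS.includes(slug)) return { tab: slug, kind: "coming-soon" };
  return { tab: "skills", kind: "live" }; // assumed default for unknown hashes
}

// Refuse to leave a tab with unsaved edits unless the operator confirms.
function navigate(
  current: View,
  requestedHash: string,
  dirty: boolean,
  confirmDiscard: () => boolean,
): View {
  const next = resolveHash(requestedHash);
  if (dirty && next.tab !== current.tab && !confirmDiscard()) return current;
  return next;
}
```

In a browser, `navigate` would be called from a hashchange listener with `confirmDiscard` wrapping a confirm dialog, while beforeunload covers the separate case of closing or reloading the page.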

Ship a small catalog of genuinely novel reflective skills that make a fresh phantom feel alive from message one. Each is a full SKILL.md with a strict YAML frontmatter, a Goal, numbered Steps with per-step success criteria, and Rules.

- mirror: weekly self-audit playback. Pulls the last 7 days of memory, anchors observations to specific episodes, renders three sections (what I noticed, what I am unsure about, one question for you).
- thread: evolution of thinking on a topic. Takes a topic, clusters mentions chronologically, identifies turning points, renders a short narrative with callouts.
- echo: prior-answer surfacer. Before deriving a new answer to a substantive question, checks memory for semantically similar prior questions and surfaces the conclusion if the match is strong.
- overheard: promises audit. Scans the last 14 days for commitment phrases, checks for follow-through, surfaces the top 3-5 open promises with draft followup offers.
- ritual: latent patterns to scheduled jobs. Finds recurring behaviors in 60 days of sessions, verifies them against memory, proposes formalization as phantom_schedule jobs.
- show-my-tools: utility skill that lists current skills, memory files, and the dashboard URLs. The discovery path for everything the operator can edit.

All five reflective skills list the new in-process MCP tools (mcp__phantom-reflective__phantom_memory_search, mcp__phantom-reflective__phantom_list_sessions, mcp__phantom-scheduler__phantom_schedule) in their allowed-tools field so they can actually fire.

The skills ship in /app/skills-builtin/ inside the image. The docker entrypoint copies each directory to /home/phantom/.claude/skills/ on first boot only. Existing directories are preserved, so operator edits survive container rebuilds. Dockerfile copies the skills-builtin tree in both the builder and runtime stages.
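
The seeding rule (copy a built-in skill only when no directory of that name already exists) is implemented in the docker entrypoint, which is not shown here. The decision logic alone can be sketched as a pure function; planSeed and SeedPlan are hypothetical names used only for this illustration.

```typescript
interface SeedPlan {
  copy: string[]; // built-in skills to copy into the user-scope skills directory
  skip: string[]; // built-in skills left alone because a directory already exists
}

// Decide what to seed without touching the filesystem. Directories that are
// not built-ins (operator-created skills) never appear in either list.
function planSeed(
  builtinSkills: readonly string[],
  existingDirs: ReadonlySet<string>,
): SeedPlan {
  const plan: SeedPlan = { copy: [], skip: [] };
  for (const name of builtinSkills) {
    if (existingDirs.has(name)) {
      plan.skip.push(name);
    } else {
      plan.copy.push(name);
    }
  }
  return plan;
}
```

On a fresh volume every built-in lands in `copy`; on later boots an edited mirror/ directory lands in `skip`, which is what lets operator edits survive rebuilds.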

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 96a018f9f7

chatgpt-codex-connector · 2026-04-14T01:36:07Z

src/agent/in-process-reflective-tools.ts

+					results.episodes = await memory.recallEpisodes(input.query, opts).catch(() => []);
+				}
+				if (input.memory_type === "semantic" || input.memory_type === "all") {
+					results.facts = await memory.recallFacts(input.query, { limit }).catch(() => []);


Pass temporal filters to semantic fact recall

When days_back is provided, phantom_memory_search builds a temporal RecallOptions for episodes, but semantic facts are still fetched with { limit } only, so facts are not time-bounded. In reflective skills that request recent windows (for example weekly reviews), this leaks older facts into the result set and contradicts the tool contract that days_back limits returned memory items.

chatgpt-codex-connector · 2026-04-14T01:36:07Z

src/skills/frontmatter.ts

+		context: SkillContextSchema.optional(),
+		"disable-model-invocation": z.boolean().optional(),
+	})
+	.strict();


Permit built-in source marker in frontmatter schema

The schema is strict, so any extra key is rejected at parse time, but source detection later relies on an optional x-phantom-source marker (detectSource in src/skills/storage.ts) to classify built-in/agent skills. As written, that marker can never be present without causing a parse failure, so skills are always treated as user-sourced and built-in-specific UI behavior (grouping/guardrails) cannot work.
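
The interaction between a strict schema and an undeclared marker can be reproduced without Zod. The hand-rolled check below is a stand-in for the strict Zod object in src/skills/frontmatter.ts, not a copy of it: it rejects unrecognized keys the way a strict schema does, and it shows that declaring x-phantom-source as an optional field is what lets the marker through. The allowed marker values follow the fix described later in this thread; the function bodies are assumptions.

```typescript
type SkillSource = "built-in" | "agent" | "user";

const REQUIRED_KEYS = ["name", "description", "when_to_use"];
const OPTIONAL_KEYS = [
  "allowed-tools", "argument-hint", "arguments", "context",
  "disable-model-invocation",
  "x-phantom-source", // declared so the strict check accepts the marker
];
const SOURCES: readonly string[] = ["built-in", "agent", "user"];

// Returns a list of problems; an empty list means the frontmatter is accepted.
function checkFrontmatter(frontmatter: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const known = new Set([...REQUIRED_KEYS, ...OPTIONAL_KEYS]);
  for (const key of REQUIRED_KEYS) {
    const value = frontmatter[key];
    if (typeof value !== "string" || value === "") {
      errors.push(`missing required field: ${key}`);
    }
  }
  // Strictness: any key outside the declared set is an error.
  for (const key of Object.keys(frontmatter)) {
    if (!known.has(key)) errors.push(`unrecognized key: ${key}`);
  }
  const marker = frontmatter["x-phantom-source"];
  if (marker !== undefined && !(typeof marker === "string" && SOURCES.includes(marker))) {
    errors.push("x-phantom-source must be one of built-in, agent, user");
  }
  return errors;
}

// Skills without a marker are treated as user-sourced.
function detectSource(frontmatter: Record<string, unknown>): SkillSource {
  const marker = frontmatter["x-phantom-source"];
  return marker === "built-in" || marker === "agent" ? marker : "user";
}
```

Removing "x-phantom-source" from OPTIONAL_KEYS reproduces the reported behavior: every skill carrying the marker fails the check, so detectSource could only ever see unmarked, user-sourced skills.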

1. Pass temporal filters to semantic fact recall in the in-process reflective MCP server. phantom_memory_search was building a temporal RecallOptions for episodes when days_back was set, but recallFacts was being called with only { limit }. Reflective skills like mirror that ask for a 7-day window were leaking older facts into the result set. Fix is one word: pass the same opts object to recallFacts. The downstream semantic.recall already honors timeRange via a Qdrant range filter on valid_from.

2. Permit the x-phantom-source provenance marker in the SKILL.md frontmatter schema. The Zod schema was strict and detectSource() in src/skills/storage.ts read frontmatter['x-phantom-source'] without the field being declared, so any built-in skill that set the marker would have been rejected at parse time. Added the field to the schema as an optional enum of "built-in" | "agent" | "user", added the marker to all six built-in SKILL.md files (echo, mirror, overheard, ritual, show-my-tools, thread), and added tests for both the schema acceptance and the source classification.

Quality gates: bun test 1044 pass / 0 fail (+4 new tests), bun run lint clean, bun run typecheck clean.
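
For readers unfamiliar with the Qdrant side of the first fix, a range filter on valid_from has roughly the shape sketched below. This is not the project's semantic.recall code: the helper name is hypothetical, and storing valid_from as an RFC 3339 timestamp string (rather than, say, epoch milliseconds) is an assumption.

```typescript
interface TimeRange {
  from: Date;
  to: Date;
}

interface RangeCondition {
  key: string;
  range: { gte: string; lte: string };
}

interface QdrantFilter {
  must: RangeCondition[];
}

// Hypothetical helper: no time range means no filter, so recall stays unbounded.
// With a range, only points whose valid_from falls inside the window match.
function validFromFilter(timeRange: TimeRange | undefined): QdrantFilter | undefined {
  if (timeRange === undefined) return undefined;
  return {
    must: [
      {
        key: "valid_from",
        range: {
          gte: timeRange.from.toISOString(),
          lte: timeRange.to.toISOString(),
        },
      },
    ],
  };
}
```

This also shows why the original call leaked older facts: with only { limit } passed, no time range reaches this layer and no filter is applied.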

mcheemaa added 5 commits April 13, 2026 16:40

mcheemaa marked this pull request as ready for review April 14, 2026 01:30

chatgpt-codex-connector bot reviewed Apr 14, 2026

mcheemaa merged commit 98f09b2 into main Apr 14, 2026
1 check passed

mcheemaa merged 6 commits into main from feat/pr1-skills-memory-dashboard