Skillshog is PostHog for coding-agent skills.
Agents use real skills, self-report how those skills performed, optionally ask the human for feedback, and send structured telemetry back to Skillshog. Skill creators then inspect a dashboard and knowledge graph to understand what is working, what is breaking, and what to improve next.
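As a rough illustration of what "structured telemetry" could look like, the sketch below defines one report and a validator for incoming payloads. The `SkillReport` type, its field names, and `parseSkillReport` are assumptions made for this example, not Skillshog's actual schema or API.

```typescript
// Hypothetical shape of one telemetry report. Every name here is an
// assumption for illustration, not the real Skillshog schema.
type Outcome = "success" | "partial" | "failure";

interface SkillReport {
  skill: string;          // skill identifier, e.g. "garrytan/gstack"
  outcome: Outcome;       // the agent's self-assessed result
  agentNotes: string;     // the agent's self-report in free text
  repoContext?: string;   // optional description of the repo the skill ran in
  humanComment?: string;  // optional feedback from the human
}

const OUTCOMES: readonly Outcome[] = ["success", "partial", "failure"];

// Returns a typed report, or null when a required field is missing or invalid.
// Optional fields that are not strings are dropped rather than rejected.
function parseSkillReport(payload: unknown): SkillReport | null {
  if (typeof payload !== "object" || payload === null) return null;
  const { skill, outcome, agentNotes, repoContext, humanComment } =
    payload as Record<string, unknown>;

  if (typeof skill !== "string" || skill.length === 0) return null;
  if (typeof agentNotes !== "string") return null;

  const knownOutcome = OUTCOMES.find((o) => o === outcome);
  if (knownOutcome === undefined) return null;

  return {
    skill,
    outcome: knownOutcome,
    agentNotes,
    repoContext: typeof repoContext === "string" ? repoContext : undefined,
    humanComment: typeof humanComment === "string" ? humanComment : undefined,
  };
}
```

Validating at the boundary like this keeps malformed agent output from reaching the dashboard; a real ingest route would likely also record a timestamp and the skill version.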
Skill creators do not have a good feedback loop. They publish skills, but they cannot easily answer:
- Which instructions confuse agents?
- Which repo contexts make the skill fail?
- What do humans like or dislike after an agent uses the skill?
- What change would improve the skill most?
Skillshog turns that missing feedback loop into a product.
- Statement 2: Humanless. The operational user is the coding agent.
- Statement 1: Codex-powered. Codex can run the feedback loop, planning loop, and creator chat.
Creator dashboard overview: semantic graph for garrytan/gstack with themes, contexts, and fix nodes.
- A coding agent uses a real public skill, with `garrytan/gstack` as the hero example.
- A feedback skill runs automatically and captures the agent's self-report.
- The human adds one short comment.
- Skillshog ingests the report, clusters it semantically, and displays the results.
- The creator clicks the `gstack` card, opens the graph, and asks the chat what to improve first.
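The "clusters it semantically" step in the flow above could be sketched as a greedy, single-pass grouping over embedding vectors. This is only one possible approach under stated assumptions: the `EmbeddedReport` type, the `clusterReports` function, and the 0.8 threshold are hypothetical, and the embeddings are assumed to come from some external model.

```typescript
// Hypothetical types: a report that already carries an embedding vector.
interface EmbeddedReport {
  id: string;
  embedding: number[];
}

interface Cluster {
  seed: number[];       // embedding of the first report placed in the cluster
  reportIds: string[];  // ids of all reports grouped under this seed
}

// Cosine similarity between two vectors; returns 0 if either has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    dot += x * (b[i] ?? 0);
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Greedy clustering: each report joins the first cluster whose seed is
// similar enough, otherwise it starts a new cluster.
function clusterReports(reports: EmbeddedReport[], threshold = 0.8): Cluster[] {
  const clusters: Cluster[] = [];
  for (const report of reports) {
    const home = clusters.find(
      (c) => cosine(c.seed, report.embedding) >= threshold,
    );
    if (home) {
      home.reportIds.push(report.id);
    } else {
      clusters.push({ seed: report.embedding, reportIds: [report.id] });
    }
  }
  return clusters;
}
```

The result depends on input order and on the threshold, which is acceptable for a demo-sized dataset; each resulting cluster could then back one theme node in the graph.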
- `web/`: Next.js frontend and backend routes
- `specs/`: JTBD/topic specs for the Ralph planning/build loop
- `docs/seed-skills.md`: real public skills to seed and highlight
- `PROMPT_plan.md`: planning loop prompt
- `PROMPT_build.md`: building loop prompt
- `PROMPT_plan_work.md`: scoped planning prompt
- `loop.sh`: Codex-based Ralph loop runner
- `AGENTS.md`: operational guide loaded every iteration
- `IMPLEMENTATION_PLAN.md`: current prioritized work list
Planning: `./loop.sh plan 2`

Scoped planning: `./loop.sh plan-work "creator dashboard and graph for gstack telemetry" 2`

Building: `./loop.sh 20`

The loop uses `codex exec`, not a special desktop-app toggle. If we want a five-plus-hour unattended run, the practical path is:
- run `./loop.sh <N>` in a terminal and leave it running
- or schedule repeated work through the Codex app automation system
Ship a lovable first release where:
- real public skills are listed on the home page
- a live or seeded `gstack` feedback report is visible
- the creator can open a graph for one skill
- the creator can inspect evidence and ask focused questions in chat
