
Fix/agent repl ux #151

Merged
simongdavies merged 4 commits into hyperlight-dev:main from simongdavies:fix/agent-repl-ux
May 15, 2026

@simongdavies
Member

This pull request improves user guidance, skill-management safety, and the REPL experience in HyperAgent. It gives clearer instructions for missing prerequisites, prevents accidental overwriting of built-in skills, improves profile preview rendering, and makes newly created skills usable immediately.

User Guidance and Experience:

  • Improved missing MCP server guidance: The system now surfaces unconfigured MCP servers as the first item in task guidance, with clear, actionable instructions for the user and LLM, reducing hallucinated or incorrect suggestions. The guidance block distinguishes between servers with explicit setup shortcuts and those requiring manual config edits.
  • Expanded MCP setup command mapping: Added explicit CLI flag mappings for more server types in MCP_SETUP_COMMANDS, ensuring accurate shortcut recommendations.

Skill Management Safety:

  • Prevents accidental shadowing of built-in skills: The generate_skill tool now checks for name collisions with both user and built-in (system) skills. It blocks silent overwrites, warns when a built-in is about to be shadowed, and requires explicit confirmation, protecting curated skills from being inadvertently replaced.
  • Hot-reloads skill registry after saving: After a skill is saved, the SDK's skill registry is reloaded so the new or updated skill is immediately available, improving workflow and reducing confusion. Errors in reload are surfaced but do not block the save.

Profile Preview Improvements:

  • Enhanced profile-apply preview rendering: The profile preview shown before applying changes is now rendered as a markdown table when markdown is enabled, making it easier to read. The underlying data is structured, so both plain text and markdown renderers stay in sync.

Other Notable Updates:

  • Added debug logging for sandbox lifecycle traces, routing verbose logs to disk when --debug is used, keeping the terminal output clean.

- Add `## Quick Install` block (gh auth login + npm install -g + run)
  immediately after the warning callouts, so new users see the install
  path within the first screen
- Link forward to the existing `## Install and Run` section for Docker,
  source builds, and full prerequisites
- Prettier also realigned the capability table dividers in `How It
  Works` (pre-existing alignment drift from when the `execute_bash` row
  was added) — pure formatting, no content change

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
Bug 1 — sandbox verbose traces leak to user terminal
  src/sandbox/tool.js gained a debugLog callback option that ~25
  verbose-gated console.error sites now route through, defaulting to
  console.error when no callback is wired.  src/agent/index.ts passes
  its existing debugLog (which writes ~/.hyperagent/logs/agent-debug-
  *.log when --debug is on) into both sandbox factories, so the noisy
  '[sandbox] setPlugins / invalidateSandboxWithSave / autoSaveState'
  chatter stops leaking into the REPL.
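
A minimal sketch of the routing described for Bug 1, assuming a sandbox factory that accepts an options object. Only the `debugLog` option name and the `console.error` default come from the commit message; `createSandboxTool`, `SandboxOptions`, and the `trace` helper are hypothetical names used for illustration.

```typescript
// Hypothetical sketch: only the `debugLog` option and its console.error
// default are described in the commit message above.
type DebugLog = (message: string) => void;

interface SandboxOptions {
  verbose?: boolean;
  debugLog?: DebugLog;
}

function createSandboxTool(options: SandboxOptions = {}) {
  // Fall back to console.error when no callback is wired.
  const sink: DebugLog =
    options.debugLog ?? ((message) => console.error(message));

  // Every verbose-gated trace site goes through this one helper.
  const trace = (message: string): void => {
    if (options.verbose) sink(`[sandbox] ${message}`);
  };

  return {
    setPlugins(names: string[]): void {
      trace(`setPlugins ${names.join(", ")}`);
    },
  };
}

// Usage: the caller supplies a sink that collects lines instead of
// printing them (a stand-in for a file-backed debug log).
const captured: string[] = [];
const tool = createSandboxTool({
  verbose: true,
  debugLog: (message) => captured.push(message),
});
tool.setPlugins(["fetch", "fs"]);
```

With a sink wired in, nothing reaches the terminal; without one, the behaviour is the same as before the change.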

Bug 2 — generate_skill silently shadowed bundled built-in skills
  skill-writer now exports systemSkillExists(name, dir) alongside the
  existing userSkillExists.  generate_skill refuses (without an
  explicit overwrite=true) when the requested name matches a curated
  built-in under <CONTENT_ROOT>/skills/, with an error pointing at a
  safer '-custom' suffix.  When overwrite=true *is* set on a system
  collision, the approval banner shouts ⚠️ SHADOW built-in skill and
  the y/n prompt reads 'Shadow built-in skill? [y/n]' so the user
  can't sleep through a destructive override.  Added 4 tests for
  systemSkillExists covering present/missing/invalid-name cases.
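
The collision handling described for Bug 2 can be sketched as a pure decision function. This is an illustration under assumptions, not the code in `generate_skill`: `decideSkillSave` and `SaveDecision` are hypothetical, and only the shadow banner, the shadow prompt, and the `-custom` suggestion are taken from the commit message.

```typescript
// Hypothetical decision helper; the real generate_skill logic is not
// shown in this PR text.
type SaveDecision =
  | { kind: "refuse"; error: string }
  | { kind: "ask"; banner: string; prompt: string };

function decideSkillSave(
  name: string,
  overwrite: boolean,
  userExists: boolean,
  systemExists: boolean,
): SaveDecision {
  // Any collision without an explicit overwrite is refused outright.
  if ((systemExists || userExists) && !overwrite) {
    const hint = systemExists ? ` Consider "${name}-custom" instead.` : "";
    return { kind: "refuse", error: `Skill "${name}" already exists.${hint}` };
  }
  // Shadowing a built-in gets a louder banner and prompt (wording from the PR).
  if (systemExists) {
    return {
      kind: "ask",
      banner: "⚠️ SHADOW built-in skill",
      prompt: "Shadow built-in skill? [y/n]",
    };
  }
  // Banner and prompt wording below is assumed for illustration.
  return {
    kind: "ask",
    banner: userExists ? "Overwrite skill" : "Save skill",
    prompt: "Save skill? [y/n]",
  };
}

const refused = decideSkillSave("kql-expert", false, false, true);
const shadow = decideSkillSave("kql-expert", true, false, true);
```

Keeping the decision separate from the prompt makes the refuse, shadow, and plain-save paths easy to test without a terminal.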

Bug 3 — '/skills <name>' didn't actually invoke the skill
  The Copilot SDK speaks /<skillname>, not /skills <skillname>.  The
  REPL was forwarding the raw '/skills kql-expert' string, which the
  SDK couldn't parse — the LLM saw it as a natural-language request
  and sometimes mis-fired generate_skill to 'save kql-expert',
  triggering Bug 2.  The intake layer in agent/index.ts now rewrites
  '/skills <name>' → '/<name>' before dispatch (subcommands info|edit|
  delete|list are left alone).  slash-commands.ts bridges the same
  way for synthetic callers and the default case now recognises user
  skills in addition to system skills.  Listing footer updated to
  show 'Invoke: /<name>'.
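
A rough sketch of the intake rewrite described for Bug 3. The subcommand names come from the commit message; `rewriteSkillInvocation` and `SKILLS_SUBCOMMANDS` are hypothetical names, and the real intake code may differ.

```typescript
// Subcommands that must reach the /skills handler untouched
// (list taken from the commit message above).
const SKILLS_SUBCOMMANDS = new Set(["info", "edit", "delete", "list"]);

function rewriteSkillInvocation(input: string): string {
  // Capture the first token after "/skills" and anything that follows it.
  const match = /^\/skills\s+(\S+)([\s\S]*)$/.exec(input.trim());
  if (!match) return input;
  const [, token, rest] = match;
  if (SKILLS_SUBCOMMANDS.has(token)) return input;
  // The SDK expects "/<name>", so drop the "/skills " prefix.
  return `/${token}${rest}`;
}

const rewritten = rewriteSkillInvocation("/skills kql-expert");
const untouched = rewriteSkillInvocation("/skills list");
```

A hardcoded set like this one has to be kept in step with every new subcommand; the review-round commit later in this PR describes exactly that drift for `reload`.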

Quality gate: just lint clean, 2452 TS tests pass, 124 Rust tests pass.

🎬 Number Five says verbose-log is alive!

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
…ofile preview table

Four intertwined fixes for the interactive REPL experience:

• Markdown rendering: patched marked-terminal v7's broken `text`
  renderer (it used token.text raw instead of recursing via
  parseInline, so **bold** inside tight list items leaked
  asterisks to the terminal); flipped showSectionPrefix to false
  so headings render cleanly without ### prefix; made /markdown
  queryable (status/on/off/toggle, no toggle-trap); removed the
  looksLikeMarkdown() gate that produced inconsistent output for
  opted-in users; fixed two callsites printing literal
  **Configuration:** via console.log.

• MCP shortcut surfacing: expanded MCP_SETUP_COMMANDS map to all
  5 supported shortcuts (was 1); restructured formatGuidance() so
  missing-MCP prerequisites land at the TOP under MISSING
  PREREQUISITES (was buried mid-document, model ignored it);
  distinguished supported (→specific shortcut) vs unsupported
  (→config.json) servers; removed the synthesized
  --mcp-setup-${name} fallback that would emit non-existent
  flags for unsupported servers.

• Skill hot-reload: exposed the SDK's built-in skill tool in
  ALLOWED_TOOLS + availableTools; added /skills reload subcommand
  calling session.rpc.skills.reload(); added 'reload' to
  RESERVED_SKILL_NAMES; auto-reload after generate_skill writes
  so freshly authored skills are invocable mid-session.

• Profile-apply preview table: applyProfileImpl now emits a
  proper markdown table (Limit/Before/After columns) via
  renderMarkdown when /markdown is on; marked-terminal converts
  it to a unicode box-drawing table with bold headers, much
  easier to scan than the previous flat 'cpu: 1000ms → 2000ms'
  list.  Plain-text fallback preserved verbatim for markdown-off
  sessions.
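
To illustrate the profile-apply preview bullet, here is a minimal sketch of rendering one structured change list two ways. The `LimitChange` shape and both function names are assumptions; only the Limit/Before/After columns and the flat `cpu: 1000ms → 2000ms` format come from the commit message. The step where marked-terminal turns the table into box-drawing output is not shown.

```typescript
// Assumed shape for one row of the preview.
interface LimitChange {
  limit: string;
  before: string;
  after: string;
}

// Plain-text fallback: one "name: before → after" line per limit.
function renderPreviewPlain(changes: LimitChange[]): string {
  return changes.map((c) => `${c.limit}: ${c.before} → ${c.after}`).join("\n");
}

// Markdown variant: a GFM table built from the same data.
function renderPreviewMarkdown(changes: LimitChange[]): string {
  const header = ["| Limit | Before | After |", "| --- | --- | --- |"];
  const rows = changes.map((c) => `| ${c.limit} | ${c.before} | ${c.after} |`);
  return [...header, ...rows].join("\n");
}

const changes: LimitChange[] = [
  { limit: "cpu", before: "1000ms", after: "2000ms" },
];
const plainPreview = renderPreviewPlain(changes);
const markdownPreview = renderPreviewMarkdown(changes);
```

Because both renderers read the same array, the plain-text and markdown previews cannot drift apart.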

Regression tests added throughout (markdown-renderer asserts
absence of ** markers, absence of ### heading prefix, and GFM
table cells survive rendering; pattern-integrity asserts MCP
shortcuts list stays in sync with CLI; approach-resolver
asserts ordering and no-fake-flag policy).
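
The ordering and no-fake-flag policy that the approach-resolver tests assert can be sketched like this. It is an illustration only: the function signatures are assumed, and the `example-server` entry and its flag are placeholders, since the PR text does not list the five real shortcuts.

```typescript
// Build the prerequisite block; servers without a known shortcut are
// pointed at config.json instead of a synthesized flag.
function formatMissingPrerequisites(
  missing: string[],
  setupCommands: ReadonlyMap<string, string>,
): string {
  if (missing.length === 0) return "";
  const lines = ["MISSING PREREQUISITES"];
  for (const name of missing) {
    const shortcut = setupCommands.get(name);
    lines.push(
      shortcut !== undefined
        ? `- ${name}: restart with ${shortcut}`
        : `- ${name}: no setup shortcut; add the server to config.json`,
    );
  }
  return lines.join("\n");
}

// Prerequisites go first so they are not buried under the task guidance.
function formatGuidance(
  missing: string[],
  setupCommands: ReadonlyMap<string, string>,
  body: string,
): string {
  const prerequisites = formatMissingPrerequisites(missing, setupCommands);
  return prerequisites ? `${prerequisites}\n\n${body}` : body;
}

// Placeholder mapping, not the real MCP_SETUP_COMMANDS contents.
const shortcuts = new Map([["example-server", "--mcp-setup-example"]]);
const guidance = formatGuidance(
  ["example-server", "unlisted-server"],
  shortcuts,
  "Task guidance goes here.",
);
```

Looking the shortcut up in an explicit map, rather than building a flag from the server name, is what keeps non-existent flags out of the guidance.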

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 15, 2026 18:31
Contributor

Copilot AI left a comment

Pull request overview

This PR improves the HyperAgent REPL user experience and safety around skills and prerequisites by enhancing MCP setup guidance, preventing accidental built-in skill shadowing, adding skill hot-reload, improving markdown rendering/UX, and routing noisy sandbox lifecycle logs to debug output.

Changes:

  • Add missing-prerequisites guidance for unconfigured MCP servers, plus an expanded MCP setup shortcut mapping.
  • Add skill safety features: detect collisions with built-in skills, add /skills reload, and hot-reload the SDK skill registry after saving.
  • Improve terminal markdown output (tight-list inline formatting, headings) and render profile previews as readable markdown tables when markdown is enabled.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/skill-writer.test.ts | Extends reserved-name coverage and adds tests for system-skill existence checks. |
| tests/pattern-integrity.test.ts | Adds regression tests for SDK built-in tool gating, skill reload wiring, markdown UX invariants, and MCP setup shortcut consistency. |
| tests/markdown-renderer.test.ts | Adds regression tests for tight-list inline formatting, heading prefix suppression, and GFM table rendering. |
| tests/approach-resolver.test.ts | Updates MCP guidance tests to validate prerequisite surfacing and honest unsupported-server fallback. |
| src/sandbox/tool.js | Adds debugLog sink for verbose sandbox lifecycle traces and routes verbose logs through it. |
| src/sandbox/tool.d.ts | Updates typings to include debugLog option. |
| src/agent/tool-gating.ts | Whitelists the SDK built-in skill tool in the gate. |
| src/agent/slash-commands.ts | Adds /markdown subcommands, /skills reload, better config banner rendering, and improved skill invocation detection. |
| src/agent/skill-writer.ts | Reserves /skills reload name and adds systemSkillExists() helper. |
| src/agent/markdown-renderer.ts | Configures marked-terminal heading rendering and patches tight-list inline rendering via renderer override. |
| src/agent/index.ts | Wires sandbox debug logging, adds profile preview markdown rendering, adds built-in skill collision protection + post-save reload in generate_skill, and updates markdown rendering behavior. |
| src/agent/approach-resolver.ts | Adds prerequisite-first MCP guidance and expands MCP_SETUP_COMMANDS with supported servers. |
| README.md | Adds quick install section and improves capability table formatting. |

Comment thread src/agent/index.ts
Comment thread src/agent/slash-commands.ts Outdated
…path traversal

Two issues caught by the Copilot reviewer on PR hyperlight-dev#151:

• /skills <name> → /<name> rewrite was breaking /skills reload.
  The local KNOWN_SKILLS_SUBS set in src/agent/index.ts omitted
  "reload", so the new hot-reload subcommand got rewritten to
  /reload and never reached the handler. Fix: gate the rewrite on
  validateSkillName(token) === null instead — that consults
  RESERVED_SKILL_NAMES in skill-writer.ts, which already lists every
  /skills subcommand (single source of truth, no parallel hardcoded
  set that can drift).

• Slash-command default-case skill detection used
  existsSync(join(skillsDir, cmd.slice(1), "SKILL.md")) with
  unvalidated user input. A literal /../etc would resolve outside
  skillsDir, turning the "is this a skill?" check into an arbitrary
  filesystem probe. Fix: route through systemSkillExists(skillName,
  skillsDir) which calls validateSkillName() first, rejecting empty
  / oversized / path-traversal / reserved names before any join
  touches disk.

Regression tests in tests/pattern-integrity.test.ts assert both the
new gating mechanism and the absence of the old unsafe patterns.
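
A minimal sketch of the validate-before-join pattern behind both fixes. The reserved names are the `/skills` subcommands mentioned in this PR and the `null`-means-valid convention follows the commit message; the length limit, the character rule, and the injected `fileExists` probe are assumptions made to keep the example self-contained.

```typescript
// Reserved because they are /skills subcommands (names from this PR).
const RESERVED_SKILL_NAMES = new Set(["info", "edit", "delete", "list", "reload"]);
const MAX_SKILL_NAME_LENGTH = 64; // assumed limit

// Returns null for a valid name, otherwise a short reason.
function validateSkillName(name: string): string | null {
  if (name.length === 0) return "name is empty";
  if (name.length > MAX_SKILL_NAME_LENGTH) return "name is too long";
  // Assumed rule: lowercase letters, digits, and hyphens only, which
  // excludes "/", "\" and "." and therefore any "../" traversal.
  if (!/^[a-z0-9][a-z0-9-]*$/.test(name)) return "name has invalid characters";
  if (RESERVED_SKILL_NAMES.has(name)) return "name is reserved";
  return null;
}

// The existence probe is injected so the sketch needs no filesystem access.
function skillExists(
  name: string,
  skillsDir: string,
  fileExists: (path: string) => boolean,
): boolean {
  if (validateSkillName(name) !== null) return false; // never touches disk
  return fileExists(`${skillsDir}/${name}/SKILL.md`);
}

// Usage: record every path the probe is asked about.
const probed: string[] = [];
const probe = (path: string): boolean => {
  probed.push(path);
  return true;
};
const traversal = skillExists("../etc", "/content/skills", probe);
const legitimate = skillExists("kql-expert", "/content/skills", probe);
```

The same check also serves as the rewrite gate: a token such as `reload` fails validation because it is reserved, so it is not rewritten to `/reload`.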

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
@simongdavies simongdavies merged commit ef827c3 into hyperlight-dev:main May 15, 2026
12 checks passed
simongdavies added a commit that referenced this pull request May 15, 2026
Patch release covering three PRs merged after v0.6.0:

• #149 — README quick-install section near the top for faster onboarding
• #150 — @github/copilot dependency bump 1.0.39 → 1.0.48
• #151 — agent REPL UX overhaul (skill hot-reload, /skills <name> rewrite,
  marked-terminal v7 tight-list patch, MCP prerequisite surfacing, profile
  preview as markdown table, sandbox verbose-trace debug-log routing,
  built-in skill shadow protection, plus 2 review-round fixes for /skills
  reload rewrite + path traversal in skill detection)

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>