feat(llm,tests): live-Ollama tests + chat UX fixes#65

Merged: fentas merged 2 commits into master from feat/chat-render-and-live-tests on May 17, 2026
@fentas (Owner) commented May 17, 2026

Summary

Adds zig build ollama for real-endpoint integration tests (skipped when unreachable) AND fixes four UX issues spotted in real chat sessions after PR #64 merged.

Tests

OLLAMA_HOST=http://localhost:11434 \
ATTY_TEST_MODEL=qwen3-coder:latest \
zig build ollama

Three tests covering protocol SHAPE (not content):

  • single-mode round-trip → non-empty command
  • dialog-mode round-trip → parses as valid envelope
  • destructive-task dialog → parses (or skips on model drift)

All three pass locally against qwen3-coder:latest. With no OLLAMA_HOST set: all three skip.
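The skip-when-unreachable behaviour can be sketched as follows. This is an illustrative Python sketch, not the project's Zig test code: the helper names (`resolve_endpoint`, `http_probe`, `should_skip`), the preference for `OLLAMA_HOST` over `LLM_API_BASE`, and the choice of Ollama's `/api/tags` route as the probe target are all assumptions.

```python
import urllib.error
import urllib.request
from typing import Callable, Mapping, Optional


def resolve_endpoint(env: Mapping[str, str]) -> Optional[str]:
    """Return the configured base URL, or None when neither variable is set.

    Assumption: OLLAMA_HOST wins over LLM_API_BASE when both are present.
    """
    for key in ("OLLAMA_HOST", "LLM_API_BASE"):
        value = env.get(key, "").strip()
        if value:
            return value.rstrip("/")
    return None


def http_probe(base_url: str, timeout: float = 2.0) -> bool:
    """True if the endpoint answers; /api/tags is Ollama's model-list route."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, or malformed URL.
        return False


def should_skip(env: Mapping[str, str],
                probe: Callable[[str], bool] = http_probe) -> bool:
    """A test skips when no endpoint is configured or the probe fails."""
    base = resolve_endpoint(env)
    return base is None or not probe(base)
```

Passing the probe as a parameter keeps the skip decision testable without a network; a real test would call `should_skip(os.environ)` first and return early when it is true.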

UX fixes

  1. Exec command echoed into chat input, not shell prompt. paintInlineChat now hides the real cursor + DECRC on every paint so bash's echo of an injected command lands at the SHELL prompt above, while the chat panel's block-cursor glyph stays in the input row as the visible marker.

  2. Raw JSON envelope shown in chat scrollback — new renderTurnContent parses assistant_exec turns at paint time and renders: description → cyan command for exec, italic question for question, mauve ✓ + reason for done. Falls back to raw on parse failure.

  3. Question/done from LLM never appeared as turns in chat. The .question branch now pushes the assistant envelope so it lands in scrollback alongside the hint. (.done skipped: dialogReset wipes the ring; the conclusion banner scrolls into shell history.)

  4. Retry was generic ("wasn't valid JSON"). requestParseRetry now detects three specific drift patterns (trailing {"open_chat":true} second object, markdown fences, prose preamble) and includes a concrete fix example.
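
The paint-time rendering in fix 2 can be sketched roughly as below. This is a Python illustration, not the Zig `renderTurnContent`. Only the `"action"` key and its `exec` value appear in this PR; the field names `description`, `command`, `question`, and `reason`, and the 256-color code standing in for "mauve", are assumptions.

```python
import json

CYAN = "\x1b[36m"
ITALIC = "\x1b[3m"
MAUVE = "\x1b[38;5;183m"  # assumed 256-color approximation of "mauve"
RESET = "\x1b[0m"


def render_turn_content(raw: str) -> str:
    """Format a stored assistant envelope for display; fall back to raw text."""
    try:
        env = json.loads(raw)
    except ValueError:
        return raw  # not JSON (single-mode output or model drift)
    if not isinstance(env, dict):
        return raw
    action = env.get("action")
    if action == "exec" and isinstance(env.get("command"), str):
        desc = env.get("description") or ""
        prefix = f"{desc} → " if desc else ""
        return f"{prefix}{CYAN}{env['command']}{RESET}"
    if action == "question" and isinstance(env.get("question"), str):
        return f"{ITALIC}{env['question']}{RESET}"
    if action == "done":
        reason = env.get("reason") or ""
        return f"{MAUVE}✓{RESET} {reason}".rstrip()
    return raw  # unknown action or missing fields: show what was stored
```

Because the stored turn is left untouched and formatting happens only when painting, the model still sees the raw envelope as context while the user sees the formatted line.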

Plus: conclusion banner went from \n\n to \r\n — fixes the "banner indented to prompt column" rendering in image #7.
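
The cursor handling behind fix 1 relies on standard VT escape sequences: DECSC (`ESC 7`) saves the cursor position, DECRC (`ESC 8`) restores it, and `CSI ?25l` / `CSI ?25h` hide and show the cursor. The Python sketch below only shows how such sequences could be composed. The function names and the exact ordering are hypothetical and do not reproduce `paintInlineChat`.

```python
ESC = "\x1b"
DECSC = ESC + "7"            # save cursor position (and attributes)
DECRC = ESC + "8"            # restore the saved cursor position
HIDE_CURSOR = ESC + "[?25l"
SHOW_CURSOR = ESC + "[?25h"
ERASE_LINE = ESC + "[2K"
BLOCK_GLYPH = "█"            # painted marker standing in for the real cursor


def cup(row: int, col: int) -> str:
    """CUP: move the cursor to a 1-based (row, col)."""
    return f"{ESC}[{row};{col}H"


def open_panel() -> str:
    """Remember the shell-prompt position, then hide the real cursor."""
    return DECSC + HIDE_CURSOR


def paint_panel(input_row: int, text: str) -> str:
    """Draw the input row, then return the real cursor to the shell prompt."""
    return cup(input_row, 1) + ERASE_LINE + text + BLOCK_GLYPH + DECRC


def close_panel() -> str:
    """Restore the cursor position and make the cursor visible again."""
    return DECRC + SHOW_CURSOR
```

Since every paint ends with DECRC, anything the shell echoes afterwards is written at the saved prompt position rather than inside the panel's input row.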

Test plan

  • zig build test — 471/471 pass
  • zig build itest — 6/6 pass
  • zig build ollama — 3/3 pass with OLLAMA_HOST set; 3/3 skip without
  • zig fmt --check — clean
  • Manual verify: Alt+C → type test → Enter → command echo lands at shell prompt (not in panel input)
  • Manual verify: chat scrollback shows formatted "description → command" not raw JSON
  • Manual verify: question-mode response appears in chat scrollback above input

🤖 Generated with Claude Code

fentas and others added 2 commits May 17, 2026 11:28
…bug reports

Adds a new `zig build ollama` step that exercises the LLM module
against a real `OLLAMA_HOST` / `LLM_API_BASE` endpoint instead of
mocking HTTP. Each test starts with a reachability probe and returns
`SkipZigTest` when the endpoint isn't responding — so CI without
Ollama sees "N skipped" instead of a failure.

Three tests cover the protocol shape (not model content):
- single-mode round-trip returns a non-empty command
- dialog-mode round-trip parses as a valid envelope
- destructive-task dialog parses (skips on model drift)

Run with: `OLLAMA_HOST=http://localhost:11434 ATTY_TEST_MODEL=qwen3-coder:latest zig build ollama`

Also fixes four real-session bugs from screenshots:

1. **Exec command echoed into chat input row instead of shell prompt.**
   `paintInlineChat` was leaving the real cursor inside the chat
   panel's input row, so bash's echo of the injected command landed
   there. Now hide the real cursor (`?25l`) + DECSC on open, DECRC
   on close — the visible block-cursor glyph stays in the input row
   for the user, but bash's echo lands at the SHELL prompt above.

2. **Raw JSON envelope shown in chat scrollback.**
   `assistant_exec` turns held the raw `{"action":"exec",...}` for
   model context. New `renderTurnContent` helper at paint time parses
   the envelope and shows: description + cyan command for `.exec`,
   italic question for `.question`, mauve ✓ + reason for `.done`.
   Falls back to raw when parse fails (single-mode, drift).

3. **Question/done from LLM never appeared as turns in chat.**
   The `.question` branch now also pushes the assistant envelope so
   it lands in scrollback alongside the existing hint surface. (`.done`
   skipped — `dialogReset` wipes the turn ring; the conclusion banner
   scrolls into shell history instead.)

4. **Retry message was generic.**
   `requestParseRetry` now detects three specific drift patterns —
   trailing `{"open_chat": true}` second-object, markdown fences,
   prose preamble — and includes a concrete fix example in the
   corrective user turn ("emit ONE object with `open_chat` as a
   FIELD, not a separate object").
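
The three drift patterns in item 4 could be classified along these lines. This is a Python sketch under stated assumptions, not the Zig `requestParseRetry`: the function name, the labels, the check order, and all hint texts except the `open_chat` one quoted above are illustrative.

```python
import json
from typing import Optional

FENCE = "`" * 3  # a markdown code fence, built without writing it literally

HINTS = {
    "trailing_second_object":
        "emit ONE object with `open_chat` as a FIELD, not a separate object",
    # The two hints below are illustrative wording, not taken from the PR.
    "markdown_fence": "emit the bare JSON object without markdown fences",
    "prose_preamble": "start the reply with '{' and add no text before it",
}


def classify_drift(reply: str) -> Optional[str]:
    """Return a drift label for a reply that failed to parse, or None."""
    text = reply.strip()
    if FENCE in text:
        return "markdown_fence"
    start = text.find("{")
    if start == -1:
        return None  # no JSON object at all; none of the three patterns
    if start > 0:
        return "prose_preamble"
    try:
        # raw_decode parses one value and reports where it ended.
        _, end = json.JSONDecoder().raw_decode(text)
    except ValueError:
        return None
    if text[end:].lstrip().startswith("{"):
        return "trailing_second_object"
    return None
```

A caller would look up `HINTS[label]` to build the corrective user turn, and fall back to a generic "wasn't valid JSON" message when the label is `None`.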

Plus: leading newlines in the conclusion banner went from `\n\n` to
`\r\n` — fixes the "banner indented" rendering shown in image #7
where the banner inherited the prompt's cursor column instead of
starting at column 1.
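
Why `\r\n` fixes the indentation can be seen with a toy cursor model. The Python sketch below assumes the terminal does no newline translation (as is typical when a program writes to a tty in raw mode, which this PR does not state explicitly): a line feed moves down one row but keeps the column, and only a carriage return goes back to column 1.

```python
from typing import Tuple


def cursor_after(text: str, row: int, col: int) -> Tuple[int, int]:
    """Track a 1-based (row, col) cursor through plain text, CR and LF."""
    for ch in text:
        if ch == "\r":
            col = 1      # carriage return: back to the first column
        elif ch == "\n":
            row += 1     # line feed: next row, column unchanged
        else:
            col += 1     # printable character advances the column
    return row, col


# Cursor sitting after a 14-character prompt on row 5:
print(cursor_after("\n\n", row=5, col=15))  # (7, 15): banner starts indented
print(cursor_after("\r\n", row=5, col=15))  # (6, 1): banner starts at column 1
```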

`zig build test` (471 pass) + `zig build itest` + `zig build ollama`
(3 pass on local qwen3-coder) + `zig fmt --check` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fentas merged commit cecc609 into master May 17, 2026
3 checks passed
@fentas deleted the feat/chat-render-and-live-tests branch May 17, 2026 10:15
fentas added a commit that referenced this pull request May 17, 2026
docs: reflect recently-shipped features

`docs/llm.md`:
- New **Chat surfaces** section explaining Alt+C inline panel vs.
  Alt+Shift+C full overlay (same conversation ring, mutually
  exclusive cursor focus).
- New **Keybindings** section listing every shipped LLM binding +
  Alt+H cheat-sheet pointer.
- Configuration reference split into sections (Core / Chat surfaces
  / Persistence / Visual signals / Buffer sizes / Model struct) so
  the table doesn't overwhelm — and adds rows for `models`,
  `inline_chat_rows`, `overlay_open_policy`, `chat_persist_enabled`,
  `chat_persist_path`, `chat_persist_max_bytes`, `history_turns_max`,
  `max_turn_bytes`, `dialog_system_prompt`, `dialog_parse_retry_max`.
- Documents the `Model` struct with the per-model
  `history_turns_max` trim knob.

`docs/modules.md`:
- Adds `default_bindings`, `onResize`, `isInlineChatActive`,
  `extraReserveRows` to the hook list.
- New **default_bindings** section walks the module-owned keymap
  pattern (dispatcher concat + user-overrides-win precedence).
- New **extraReserveRows** section documents the inline-panel
  reservation contract used by the LLM module's Alt+C panel.
@github-actions github-actions Bot mentioned this pull request May 19, 2026