feat(cli): add optional summarize preprocessing via local LLM#1
Add --summarize and --llm-* flags/env defaults, integrate OpenAI-compatible summarization through pydantic-ai, and enforce no-op failure semantics (abort audio generation on summarization errors). Also externalizes the summarization prompt to src/koko_cli/prompts/summarize_for_speech.txt and adds a smoke-llm just recipe.
📝 Walkthrough

Adds an optional CLI summarization step that preprocesses input text via an OpenAI-compatible LLM before synthesis. Adds CLI flags/env vars, settings/models fields, a new summarization module and prompt, service integration to run summarization before TTS, tests, docs, and tooling/build updates.
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as CLI Client
    participant Service as run_speak
    participant Summarizer as Summarization Module
    participant LLM as LLM Provider
    participant Synth as Speech Synthesizer
    CLI->>Service: invoke run_speak(text, summarize=true, llm_params)
    Service->>Summarizer: summarize_for_speech(text, base_url, model, api_key, timeout, max_chars)
    Summarizer->>Summarizer: truncate_summary_input()
    Summarizer->>LLM: build Agent/Model and call LLM
    LLM-->>Summarizer: model response
    Summarizer->>Summarizer: normalize_summary_output()
    alt summary produced
        Summarizer-->>Service: summarized_text
        Service->>Synth: synthesize(summarized_text)
        Synth-->>Service: audio_output
        Service-->>CLI: exit 0 (audio)
    else failure
        Summarizer-->>Service: raise SummarizationError
        Service-->>CLI: exit 1 (no audio)
    end
```
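The control flow above can be sketched in Python. This is an illustrative stand-in only: the bodies of `summarize_for_speech` and `synthesize` are fakes, and only the branch structure (summarize first, abort with exit 1 on failure, synthesize otherwise) mirrors the diagram.

```python
class SummarizationError(RuntimeError):
    """Raised when the LLM summarization step fails; no audio is produced."""


def summarize_for_speech(text: str, base_url: str, model: str) -> str:
    # Fake stand-in for the pydantic-ai call; the real version talks to an
    # OpenAI-compatible endpoint and raises on any request/response error.
    if not base_url:
        raise SummarizationError("--llm-base-url is required for --summarize")
    return text[:100]


def synthesize(text: str) -> bytes:
    # Fake stand-in for the TTS engine.
    return b"RIFF" + text.encode()


def run_speak(text: str, summarize: bool, base_url: str = "", model: str = "") -> int:
    """Return 0 on success (audio produced), 1 on summarization failure."""
    try:
        if summarize:
            text = summarize_for_speech(text, base_url, model)
    except SummarizationError as error:
        print(f"error: {error}")
        return 1  # abort before any audio is generated
    synthesize(text)
    return 0
```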
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 168eed7355
🧹 Nitpick comments (3)
tests/test_cli.py (1)
272-292: Consider using `tmp_path` fixture instead of hardcoded `/tmp` path.

Static analysis flags the hardcoded `/tmp/koko-offline-summary.wav` path (S108). While acceptable in tests, using `tmp_path` would be more consistent with other tests in this file and avoids potential issues in restricted environments.

♻️ Proposed fix
```diff
 def test_offline_summarize_requires_local_llm_base_url(
     monkeypatch: pytest.MonkeyPatch,
     tmp_path: Path,
     capsys: pytest.CaptureFixture[str],
 ) -> None:
     monkeypatch.setattr(cli, "load_settings", lambda: make_settings(default_model_dir=tmp_path / "missing-model-dir"))
+    output_path = tmp_path / "koko-offline-summary.wav"
     exit_code = cli.main(
         [
             "--summarize",
             "--llm-base-url",
             "https://api.example.com/v1",
             "--no-play",
             "--output",
-            "/tmp/koko-offline-summary.wav",
+            str(output_path),
             "hello",
         ]
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_cli.py` around lines 272 - 292, In test_offline_summarize_requires_local_llm_base_url replace the hardcoded "/tmp/koko-offline-summary.wav" with a file inside the provided tmp_path fixture (e.g. str(tmp_path / "koko-offline-summary.wav")) when calling cli.main so the test uses an isolated temp file; update the argument passed in the cli.main invocation and keep the rest of the test (monkeypatch, expected exit code and error assertion) unchanged.

src/koko_cli/summarization.py (1)
95-101: Consider handling additional local address formats.

The function correctly identifies `127.0.0.1`, `localhost`, and `::1`, but may miss edge cases like:

- `0.0.0.0` (binds to all interfaces but often used for local servers)
- IPv6 localhost with brackets: `[::1]`

This is minor since the primary use case (Ollama at `127.0.0.1`) is covered.

💡 Optional: Extend local address detection
```diff
 def is_local_llm_base_url(base_url: str) -> bool:
     """Return True when the model endpoint resolves to a local host."""
     parse_target = base_url if "://" in base_url else f"http://{base_url}"
     parsed = urlparse(parse_target)
     hostname = (parsed.hostname or "").lower()
-    return hostname in {"127.0.0.1", "localhost", "::1"}
+    return hostname in {"127.0.0.1", "localhost", "::1", "0.0.0.0"}
```

🤖 Prompt for AI Agents
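For reference, a standalone sketch of the extended check. Note that `urllib.parse.urlparse` already strips the brackets from IPv6 hosts when exposing `.hostname`, so `[::1]` matches `"::1"` without extra normalization.

```python
from urllib.parse import urlparse


def is_local_llm_base_url(base_url: str) -> bool:
    """Return True when the model endpoint resolves to a local host."""
    # Bare "host:port" strings have no scheme; prepend one so urlparse
    # treats the leading token as a host rather than a path.
    parse_target = base_url if "://" in base_url else f"http://{base_url}"
    hostname = (urlparse(parse_target).hostname or "").lower()
    # urlparse strips brackets from "[::1]", so "::1" matches directly.
    return hostname in {"127.0.0.1", "localhost", "::1", "0.0.0.0"}
```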
Verify each finding against the current code and only fix it if needed. In `@src/koko_cli/summarization.py` around lines 95 - 101, The is_local_llm_base_url function only checks for "127.0.0.1", "localhost", and "::1"; update it to also treat "0.0.0.0" and bracketed IPv6 addresses as local by normalizing the parsed hostname (strip surrounding brackets if present, lower-case) and then checking against the expanded set {"127.0.0.1","localhost","::1","0.0.0.0"}; apply this change inside is_local_llm_base_url (use the existing parse_target/urlparse logic and hostname variable) so bracketed IPv6 like "[::1]" and 0.0.0.0 are correctly recognized as local.

src/koko_cli/parsers.py (1)
119-124: Minor validation gap: CLI allows values 1-255 that fail at model level.

The `positive_int` validator permits any value > 0, but `SpeakCommand.llm_max_input_chars` requires `ge=256`. Users passing values like `--llm-max-input-chars 100` will get a less intuitive validation error from the model layer.

Consider adding a custom validator for `>= 256` or documenting the minimum in the help text.

💡 Option: Add minimum to help text

```diff
 parser.add_argument(
     "--llm-max-input-chars",
     type=positive_int,
     default=settings.llm_max_input_chars,
-    help="Maximum input characters sent to summarization model (>0).",
+    help="Maximum input characters sent to summarization model (min: 256).",
 )
```

🤖 Prompt for AI Agents
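The parse-time option would look roughly like the sketch below. The `min_256_int` name mirrors the validator mentioned later in this review; the `default=6000` is a placeholder for the real settings value.

```python
import argparse


def min_256_int(value: str) -> int:
    """argparse type callback rejecting values below the model's ge=256 floor."""
    number = int(value)
    if number < 256:
        raise argparse.ArgumentTypeError(f"must be >= 256, got {number}")
    return number


parser = argparse.ArgumentParser()
parser.add_argument(
    "--llm-max-input-chars",
    type=min_256_int,
    default=6000,  # placeholder; the real default comes from settings
    help="Maximum input characters sent to summarization model (min: 256).",
)
```

With this, `--llm-max-input-chars 100` fails immediately with an argparse usage error instead of surfacing a pydantic validation error later.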
Verify each finding against the current code and only fix it if needed. In `@src/koko_cli/parsers.py` around lines 119 - 124, The CLI currently uses positive_int for --llm-max-input-chars which allows values >0 but SpeakCommand.llm_max_input_chars requires ge=256; update the parser.add_argument for "--llm-max-input-chars" to enforce the model's minimum (>=256) by replacing or wrapping positive_int with a validator that rejects values <256 (or alternatively append "minimum 256" to the help string) so invalid inputs are caught at the CLI level and reference the same constraint as SpeakCommand.llm_max_input_chars.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/koko_cli/parsers.py`:
- Around line 119-124: The CLI currently uses positive_int for
--llm-max-input-chars which allows values >0 but
SpeakCommand.llm_max_input_chars requires ge=256; update the parser.add_argument
for "--llm-max-input-chars" to enforce the model's minimum (>=256) by replacing
or wrapping positive_int with a validator that rejects values <256 (or
alternatively append "minimum 256" to the help string) so invalid inputs are
caught at the CLI level and reference the same constraint as
SpeakCommand.llm_max_input_chars.
In `@src/koko_cli/summarization.py`:
- Around line 95-101: The is_local_llm_base_url function only checks for
"127.0.0.1", "localhost", and "::1"; update it to also treat "0.0.0.0" and
bracketed IPv6 addresses as local by normalizing the parsed hostname (strip
surrounding brackets if present, lower-case) and then checking against the
expanded set {"127.0.0.1","localhost","::1","0.0.0.0"}; apply this change inside
is_local_llm_base_url (use the existing parse_target/urlparse logic and hostname
variable) so bracketed IPv6 like "[::1]" and 0.0.0.0 are correctly recognized as
local.
In `@tests/test_cli.py`:
- Around line 272-292: In test_offline_summarize_requires_local_llm_base_url
replace the hardcoded "/tmp/koko-offline-summary.wav" with a file inside the
provided tmp_path fixture (e.g. str(tmp_path / "koko-offline-summary.wav")) when
calling cli.main so the test uses an isolated temp file; update the argument
passed in the cli.main invocation and keep the rest of the test (monkeypatch,
expected exit code and error assertion) unchanged.
Address review feedback by enforcing --llm-max-input-chars >= 256 at parse time, broadening local URL detection (0.0.0.0/[::1]), using tmp_path in tests, and lazy-loading pydantic-ai so base CLI remains usable when summarize deps are absent.
Remove unresolved pydantic-ai requirement from wheel metadata during Nix build until nixpkgs exposes the package. Summarize mode now lazy-loads the dependency and errors only when invoked without it.
Run fast format/lint/type hooks on pre-commit and move full lint/test/nix build gates to pre-push. Update just recipes and README examples for precommit-run/prepush-run.
Addressed the review comments and CI failure in follow-up commits:
Local verification rerun:
Actionable comments posted: 2
🧹 Nitpick comments (2)
flake.nix (1)
49-55: Reasonable temporary workaround for missing nixpkgs dependency.

The use of `pythonRemoveDeps` to strip `pydantic-ai` from wheel metadata is appropriate given the dependency isn't yet packaged in nixpkgs. The lazy-loading strategy mentioned in PR objectives ensures the base CLI remains functional.

Consider adding a TODO or issue reference to track when `pydantic-ai` becomes available in nixpkgs, so this workaround can be removed:

```diff
-  # `pydantic-ai` is not yet packaged in nixpkgs; remove it from wheel
+  # TODO: Remove this workaround once pydantic-ai is packaged in nixpkgs.
+  # `pydantic-ai` is not yet packaged in nixpkgs; remove it from wheel
   # runtime metadata so Nix package checks stay green while base CLI
   # remains usable. Summarization path requires pydantic-ai at runtime.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@flake.nix` around lines 49 - 55, Add a short TODO comment and/or an issue tracker reference near the pythonRemoveDeps array that removes "pydantic-ai" (the pythonRemoveDeps entry) so future maintainers know this is a temporary workaround and can revert it once pydantic-ai lands in nixpkgs; include the expected condition (e.g., "remove when pydantic-ai is available in nixpkgs" or a link/ID to the tracking issue) and attach it adjacent to the pythonRemoveDeps block to make it easy to find.

src/koko_cli/summarization.py (1)
67-70: Remove unused `noqa` directive.

Static analysis indicates `# noqa: BLE001` is unused. The broad `Exception` catch is appropriate here to handle any LLM errors gracefully, but the directive can be removed.

Proposed fix

```diff
 try:
     result = agent.run_sync(prepared_text, model_settings=ModelSettings(timeout=timeout_seconds))
-except Exception as error:  # noqa: BLE001
+except Exception as error:
     raise SummarizationError(f"Summarization request failed: {error}") from error
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/koko_cli/summarization.py` around lines 67 - 70, Remove the unused noqa directive on the broad exception handler: in the try/except around agent.run_sync(prepared_text, model_settings=ModelSettings(timeout=timeout_seconds)) delete the "# noqa: BLE001" from the line "except Exception as error:" so the except block remains catching Exception and raising SummarizationError(f"Summarization request failed: {error}") from error unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@justfile`:
- Around line 68-71: The justfile targets smoke-llm (and similarly smoke-e2e)
pass the output path unquoted which breaks on paths with spaces; edit the uv run
command in the smoke-llm recipe (and the smoke-e2e recipe on the adjacent line)
to wrap the {{ output }} variable in quotes in the command that invokes koko so
the --output argument becomes "--output '{{ output }}'"; this change ensures
path safety while keeping {{ text }} handling unchanged.
In `@README.md`:
- Around line 122-127: Update the README flag description for
--llm-max-input-chars to reflect the actual enforced constraint: replace "(> 0)"
with "(>= 256)" since the code uses the min_256_int validator; locate the list
entry for --llm-max-input-chars in the flags/options section and change the
constraint text to match min_256_int.
---
Nitpick comments:
In `@flake.nix`:
- Around line 49-55: Add a short TODO comment and/or an issue tracker reference
near the pythonRemoveDeps array that removes "pydantic-ai" (the pythonRemoveDeps
entry) so future maintainers know this is a temporary workaround and can revert
it once pydantic-ai lands in nixpkgs; include the expected condition (e.g.,
"remove when pydantic-ai is available in nixpkgs" or a link/ID to the tracking
issue) and attach it adjacent to the pythonRemoveDeps block to make it easy to
find.
In `@src/koko_cli/summarization.py`:
- Around line 67-70: Remove the unused noqa directive on the broad exception
handler: in the try/except around agent.run_sync(prepared_text,
model_settings=ModelSettings(timeout=timeout_seconds)) delete the "# noqa:
BLE001" from the line "except Exception as error:" so the except block remains
catching Exception and raising SummarizationError(f"Summarization request
failed: {error}") from error unchanged.
- align README llm-max-input-chars docs with enforced >=256 constraint
- quote just smoke recipe output paths for space-safe handling
- add TODO on temporary pythonRemoveDeps pydantic-ai workaround in Nix
- remove stale noqa marker from summarize exception handler
Follow-up bot comments are now addressed and pushed in
I also resolved the corresponding review threads.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/koko_cli/summarization.py`:
- Around line 79-88: The function truncate_summary_input currently appends a
truncation notice after slicing to max_input_chars which can make the final
string exceed the intended cap; update truncate_summary_input to reserve space
for the notice by computing note = "\n\n[Input truncated before summarization.]"
and using allowed = max_input_chars - len(note) to slice normalized[:allowed]
when allowed > 0, then append the note; if allowed <= 0, return
normalized[:max_input_chars].rstrip() (i.e., omit the notice when there isn’t
room) so the returned string never exceeds max_input_chars.
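The behavior specified above can be sketched as follows; this is an assumed function shape, not the repository's exact implementation.

```python
def truncate_summary_input(text: str, max_input_chars: int) -> str:
    """Cap input length, reserving room for the truncation notice."""
    normalized = text.strip()
    if len(normalized) <= max_input_chars:
        return normalized
    note = "\n\n[Input truncated before summarization.]"
    allowed = max_input_chars - len(note)
    if allowed <= 0:
        # Cap too small to fit the notice; omit it rather than overflow.
        return normalized[:max_input_chars].rstrip()
    return normalized[:allowed].rstrip() + note
```

Reserving `len(note)` characters up front is what guarantees the returned string never exceeds `max_input_chars`, covering both the notice-fit and small-cap fallback paths mentioned in the follow-up commit.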
Reserve truncation-note budget in truncate_summary_input so returned text never exceeds max_input_chars, and add regression tests for both notice-fit and small-cap fallback paths.
Addressed the new CodeRabbit feedback in
Verification rerun:
Resolved the corresponding review thread.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_summarization.py`:
- Around line 62-63: The fake agent stub method run_sync currently declares
parameters as args and kwargs which triggers Ruff ARG002; rename them to _args
and _kwargs (or prefix each unused parameter with an underscore) in run_sync and
the other stub methods at the locations referenced (e.g., the similar stub
around lines 102-103) so unused parameters are prefixed with underscores and the
functions still return FakeResult() unchanged.
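The stub shape Ruff is asking for looks roughly like this (hypothetical names matching the test file's fakes): underscore-prefixing the unused parameters silences ARG002 while the double still accepts any call signature.

```python
class FakeResult:
    """Minimal stand-in for a pydantic-ai run result."""

    output = "short summary"


class FakeAgent:
    """Test double; unused params are underscore-prefixed to satisfy ARG002."""

    def run_sync(self, *_args, **_kwargs) -> FakeResult:
        return FakeResult()
```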
Summary
- `--summarize` preprocessing path using `pydantic-ai` over an OpenAI-compatible API
- `src/koko_cli/prompts/summarize_for_speech.txt` (domain-agnostic prompt)
- `--llm-*` CLI/env configuration and `just smoke-llm` recipe

Verification

- `just fmt`
- `just fmt-nix`
- `just lint`
- `just test` (29 passed)
- `just smoke-e2e /tmp/koko-local-assets /tmp/koko-smoke.wav "Koko local inference smoke test"`
- `just smoke-llm /tmp/koko-local-assets http://127.0.0.1:11434/v1 mistralai_Mistral-Small-3.2-24B-Instruct-2506-Q6_K /tmp/koko-llm-smoke.wav "### Build\\n- tests passed\\n- changed 14 files"`
- `uv build --wheel` (verified prompt resource is packaged)

Notes

- `pydantic-ai` is intentionally deferred for a follow-up packaging pass.