Wire regex format presets into sources#21
Conversation
Blacksmith Account SuspendedThis Blacksmith account requires additional verification. Jobs targeting Blacksmith runners will not be picked up and will remain queued until they timeout. Please contact Blacksmith Support for assistance. |
3 similar comments
Blacksmith Account SuspendedThis Blacksmith account requires additional verification. Jobs targeting Blacksmith runners will not be picked up and will remain queued until they timeout. Please contact Blacksmith Support for assistance. |
Blacksmith Account SuspendedThis Blacksmith account requires additional verification. Jobs targeting Blacksmith runners will not be picked up and will remain queued until they timeout. Please contact Blacksmith Support for assistance. |
Blacksmith Account SuspendedThis Blacksmith account requires additional verification. Jobs targeting Blacksmith runners will not be picked up and will remain queued until they timeout. Please contact Blacksmith Support for assistance. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR wires named regex-format presets into the Flyctl source path and the iteration summariser: a new ChangesFormat Preset Support for Sources
Possibly related PRs
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/CONFIG.md`:
- Around line 238-245: The note currently warns that `format =
"apache-combined"` on a `flyctl` source will be dropped by paperbark's cursor
filter because lines lack a leading ISO timestamp; update this caveat to also
mention `syslog-rfc5424` (RFC5424 lines start with `<PRI>1` and likewise lack a
leading ISO timestamp) so readers know `format = "syslog-rfc5424"` will be
dropped by the cursor filter before parsing; mirror the same wording change in
the corresponding ROADMAP/CHANGELOG entries and ensure the docs reference
`format = "apache-combined"` and `format = "syslog-rfc5424"` explicitly so
there's no ambiguity about flyctl compatibility.
In `@src/paperbark/iteration.py`:
- Around line 102-108: summarise_lines currently silently ignores format_keys
when line_format is provided; mirror build_source's behavior by rejecting the
combination at the API boundary: inside summarise_lines (the branch that calls
_summarise_with_format when line_format is not None) detect if format_keys is
not None and raise a clear ValueError (or custom exception) indicating
format_keys is incompatible with line_format (referencing format_keys and
line_format in the message); this ensures the same fail-closed behavior as
build_source and prevents the silent contract change when callers use
summarise_lines calling _summarise_with_format.
- Around line 319-327: The helper _format_record_parsed currently treats a
record as parsed only if one of timestamp, level, message, or component is
non-empty, which misclassifies records that only populate status or duration_ms;
update _format_record_parsed to also consider record.status and
record.duration_ms (or their equivalent keys on the parsed record) when deciding
if a record is parsed so that any non-empty canonical field including status or
duration_ms returns True; reference the _format_record_parsed function and
ensure the boolean expression includes record.status and record.duration_ms.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: f48d21f9-4e41-4846-8ecb-234f6796fba9
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (9)
CHANGELOG.mddocs/CONFIG.mddocs/ROADMAP.mddocs/SOURCES.mdsrc/paperbark/dispatcher.pysrc/paperbark/iteration.pysrc/paperbark/sources/flyctl.pytests/test_dispatcher.pytests/test_iteration.py
Summary
[[sources]]acceptsformat = "<preset>"(one ofjson,apache-combined,nginx-default,syslog-rfc5424); the iteration parser routes through the format layer when set, so non-JSON payloads parse cleanly without forking.format_keysstays JSON-only and is rejected when combined with a non-jsonformatso the operator's intent is never silently dropped.file/kubectl/cloudwatchland.Top-level v0.2 shortlist item from
docs/ROADMAP.md.Changes
src/paperbark/iteration.py: newline_formatparameter onsummarise_lines/summarise_log_file; format-driven path mirrors the JSON shape, treats all-empty regex matches asfailed_to_parse.src/paperbark/dispatcher.py: new_resolve_formatresolver againstpaperbark.formats.registered_formats; flyctl branch acceptsformat, conflict-checks withformat_keys, attachessource.line_format.capture_iterationreads the attribute and threads it through.src/paperbark/sources/flyctl.py: acceptsline_formatconstructor arg.docs/CONFIG.md,docs/SOURCES.md,docs/ROADMAP.md,CHANGELOG.mdupdated.Test plan
uv run ruff check src/ tests/uv run ruff format --check src/ tests/uv run mypy src/uv run pytest -q(368 passed)uv run --with pip-audit pip-audit(no vulnerabilities)uv run pre-commit run --files <touched>(all hooks pass)Need help on this PR? Tag
@codesmithwith what you need.Summary by CodeRabbit
New Features
formatoption for sources with presets (json, apache-combined, nginx-default, syslog-rfc5424) to parse non‑JSON logs.format_keysis now JSON‑only and cannot be combined with non‑JSON formats.Documentation
Tests