[Mirror] feat: jinja engine improvements for reka-edge #96
Port three Jinja engine improvements needed for the reka-edge model:
1. Python-style string repetition ("ab" * 3 → "ababab")
2. ensure_ascii=true support for tojson filter (escapes non-ASCII to \uXXXX)
3. int() builtin on value_int_t (identity, needed for Reka Edge template)
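As a rough illustration of item 1, the repetition semantics can be sketched in a few lines of C++. This is only a sketch: `repeat_string` is a hypothetical helper name, not the function added in `common/jinja/runtime.cpp`, and the zero/negative behavior is assumed to follow Python, where such counts yield an empty string.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not the PR's actual code): repeat `s` `count` times.
// Assumes Python semantics: a zero or negative count yields an empty string.
std::string repeat_string(const std::string & s, int64_t count) {
    std::string out;
    if (count <= 0 || s.empty()) {
        return out;
    }
    for (int64_t i = 0; i < count; ++i) {
        out += s;  // append one more copy of the input
    }
    return out;
}
```

Because multiplication is allowed in either operand order, a template engine would presumably call a helper like this for both `'ab' * 3` and `3 * 'ab'`, swapping the operands as needed.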
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/test-jinja.cpp (1)
450-460: Add boundary coverage for repetition counts.
Please add `0` and negative count cases so semantics are locked down (`''` output), not just positive-path behavior.
🧪 Suggested test additions
```diff
 test_template(t, "reversed string repetition",
     "{{ 3 * 'ab' }}",
     json::object(),
     "ababab"
 );
+
+test_template(t, "string repetition zero",
+    "{{ 'ab' * 0 }}",
+    json::object(),
+    ""
+);
+
+test_template(t, "string repetition negative",
+    "{{ 'ab' * -2 }}",
+    json::object(),
+    ""
+);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test-jinja.cpp` around lines 450 - 460, Add boundary tests for zero and negative repetition counts by extending the existing repetition tests using test_template; specifically, add cases for "{{ 'ab' * 0 }}" and "{{ 0 * 'ab' }}" expecting an empty string, and similarly for a negative count e.g. "{{ 'ab' * -1 }}" and "{{ -1 * 'ab' }}" expecting an empty string. Place these new test_template calls alongside the existing "string repetition" and "reversed string repetition" tests in tests/test-jinja.cpp so the semantics for zero and negative repetition are explicitly asserted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@common/jinja/value.cpp`:
- Around line 221-225: The code currently pushes the raw byte 'ch' into output
when UTF-8 parsing fails (in the block handling parsed.status !=
utf8_parse_result::SUCCESS), which breaks the ASCII-only guarantee when
ensure_ascii is true; update that branch (around variables parsed, json_str,
pos, ch, and the ensure_ascii flag) to emit an ASCII-safe escape for the
offending byte (e.g., a \u00XX hex escape or a \\xXX sequence consistent with
the surrounding JSON emitter) instead of pushing the raw byte, and then advance
pos as before so invalid/incomplete UTF-8 bytes are consumed while preserving
the ensure_ascii contract.
---
Nitpick comments:
In `@tests/test-jinja.cpp`:
- Around line 450-460: Add boundary tests for zero and negative repetition
counts by extending the existing repetition tests using test_template;
specifically, add cases for "{{ 'ab' * 0 }}" and "{{ 0 * 'ab' }}" expecting an
empty string, and similarly for a negative count e.g. "{{ 'ab' * -1 }}" and "{{
-1 * 'ab' }}" expecting an empty string. Place these new test_template calls
alongside the existing "string repetition" and "reversed string repetition"
tests in tests/test-jinja.cpp so the semantics for zero and negative repetition
are explicitly asserted.
📒 Files selected for processing (3)
- common/jinja/runtime.cpp
- common/jinja/value.cpp
- tests/test-jinja.cpp
The json_ensure_ascii_preserving_format function mishandled an edge case: when UTF-8 parsing failed, it copied the offending byte to the output as a raw byte, which breaks the ASCII-only guarantee of ensure_ascii. This commit fixes that by emitting the Unicode replacement character as the escape \ufffd instead. This matches the lossy UTF-8 decoding behavior available in languages such as Python, Rust, and Go.
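The behavior described in this commit can be sketched as follows. This is a simplified, self-contained approximation and not the code in `common/jinja/value.cpp`: the names `ensure_ascii_sketch` and `append_u_escape` are hypothetical, and the sketch assumes that ASCII bytes (including existing escapes and whitespace) are copied through unchanged, that code points above U+FFFF are written as UTF-16 surrogate pairs, and that each byte which fails UTF-8 decoding is replaced by one `\ufffd` escape.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Append a \uXXXX escape for one UTF-16 code unit (hypothetical helper).
static void append_u_escape(std::string & out, uint32_t unit) {
    char buf[8];
    std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(unit));
    out += buf;
}

// Hypothetical sketch: escape every non-ASCII code point as \uXXXX.
// Bytes that do not form a valid UTF-8 sequence become \ufffd.
std::string ensure_ascii_sketch(const std::string & in) {
    static const uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    std::string out;
    std::size_t pos = 0;
    while (pos < in.size()) {
        unsigned char ch = static_cast<unsigned char>(in[pos]);
        if (ch < 0x80) {  // plain ASCII: copy through untouched
            out.push_back(static_cast<char>(ch));
            ++pos;
            continue;
        }
        // Derive the sequence length and payload bits from the lead byte.
        std::size_t len = 0;
        uint32_t cp = 0;
        if ((ch & 0xE0) == 0xC0)      { len = 2; cp = ch & 0x1F; }
        else if ((ch & 0xF0) == 0xE0) { len = 3; cp = ch & 0x0F; }
        else if ((ch & 0xF8) == 0xF0) { len = 4; cp = ch & 0x07; }

        bool ok = len != 0 && pos + len <= in.size();
        for (std::size_t i = 1; ok && i < len; ++i) {
            unsigned char c = static_cast<unsigned char>(in[pos + i]);
            if ((c & 0xC0) != 0x80) { ok = false; }
            else { cp = (cp << 6) | (c & 0x3F); }
        }
        // Reject overlong forms, surrogates, and out-of-range values.
        if (ok && (cp < min_cp[len] || cp > 0x10FFFF ||
                   (cp >= 0xD800 && cp <= 0xDFFF))) {
            ok = false;
        }
        if (!ok) {  // consume one bad byte, keep the output ASCII-only
            append_u_escape(out, 0xFFFD);
            ++pos;
            continue;
        }
        if (cp >= 0x10000) {  // encode as a UTF-16 surrogate pair
            uint32_t v = cp - 0x10000;
            append_u_escape(out, 0xD800 + (v >> 10));
            append_u_escape(out, 0xDC00 + (v & 0x3FF));
        } else {
            append_u_escape(out, cp);
        }
        pos += len;
    }
    return out;
}
```

Advancing by a single byte on failure means a truncated multi-byte sequence produces one `\ufffd` per bad byte. Other replacement policies exist, so treat that choice as an assumption of this sketch.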
@coderabbitai review
✅ Actions performed
Review triggered.
1. Add TODO comment for supporting string repetition for array/tuples
2. Add support for float identity operation
3. Move invalid ascii test case to test_fuzzing
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
@coderabbitai use "mirror" preset for review
ggml-org#21623
Summary by CodeRabbit
New Features
- String repetition with `*` in either operand order (e.g., `'ab' * 3` and `3 * 'ab'` → `'ababab'`)
- `int` and `float` numeric builtins (identity behavior)
- `tojson(ensure_ascii=true)` now escapes non-ASCII as `\uXXXX` (preserves formatting and existing escapes; handles invalid UTF-8 safely)
Tests
- `tojson` `ensure_ascii` behavior (formatting and escape preservation), invalid UTF-8 fuzzing, and `int` builtin