[Mirror] feat: jinja engine improvements for reka-edge by ngxson · Pull Request #96 · ngxson/llama.cpp

ngxson · 2026-04-08T12:57:11Z

@coderabbitai use "mirror" preset for review

Summary by CodeRabbit

New Features
- String repetition with * in either operand order (e.g., 'ab' * 3 and 3 * 'ab' → 'ababab')
- int and float numeric builtins (identity behavior)
- tojson(ensure_ascii=true) now escapes non‑ASCII as \uXXXX (preserves formatting and existing escapes; handles invalid UTF‑8 safely)
Tests
- Added tests for string repetition, tojson ensure_ascii behavior (formatting and escape preservation), invalid UTF‑8 fuzzing, and int builtin

Port three Jinja engine improvements needed for the reka-edge model: 1. Python-style string repetition ("ab" * 3 → "ababab") 2. ensure_ascii=true support for tojson filter (escapes non-ASCII to \uXXXX) 3. int() builtin on value_int_t (identity, needed for Reka Edge template)

coderabbitai · 2026-04-08T12:57:36Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e23a78e6-0924-460d-ab2d-ca878b9fd7d1

📥 Commits

Reviewing files that changed from the base of the PR and between 0bd6415 and 16af73b.

📒 Files selected for processing (1)

common/jinja/value.cpp

🚧 Files skipped from review as they are similar to previous changes (1)

common/jinja/value.cpp

📝 Walkthrough

Walkthrough

Adds Python-style string repetition for * when mixing strings and integers, implements ensure_ascii=true post-processing for tojson to escape non‑ASCII code points (including surrogate pairs) while preserving JSON layout and escapes, adds int and float builtins, and extends tests.

Changes

Cohort / File(s)	Summary
String Repetition `common/jinja/runtime.cpp`	Adds `*` operator handling for `value_string` × `value_int` (either operand order); derives repeat count from the integer, returns empty string for repeat ≤ 0; placed before membership (`in`/`not in`) logic.
JSON, UTF‑8, & Builtins `common/jinja/value.cpp`, `common/jinja/unicode.h`	Adds UTF‑8 helper utilities and `unicode.h`; changes `tojson` to always produce JSON then, when `ensure_ascii` is truthy, post‑process JSON string literals to replace non‑ASCII code points with `\uXXXX` escapes or surrogate pairs, preserve existing escapes, emit `\ufffd` on UTF‑8 parse errors; adds `{"int", ...}` and `{"float", ...}` builtins that return the numeric value unchanged.
Tests `tests/test-jinja.cpp`	Adds tests for `'ab' * 3` and `3 * 'ab'`, expanded `tojson(ensure_ascii=true)` coverage (nested objects, exact indent/newline formatting, preservation of existing escapes, invalid UTF‑8 fuzz), and an `int` builtin identity test.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Poem

🐇 I hopped through code and found a star,
* makes echoes of "ab" both near and far,
JSON shields odd letters with a careful cloak,
int and float stay simple — no folk joke,
Tests twinkle bright; I nibble a carrot, hark! 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is largely incomplete, missing required sections like Overview, detailed information about the changes, and proper AI usage disclosure as required by the template.	Complete the description with an Overview section explaining the changes, provide detailed additional information, and explicitly state AI usage disclosure (YES/NO).
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly describes the main change: Jinja engine improvements for reka-edge support, which aligns with the substantial modifications to Jinja runtime, value handling, and test coverage.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/test-jinja.cpp (1)

450-460: Add boundary coverage for repetition counts.

Please add 0 and negative count cases so semantics are locked down ('' output), not just positive-path behavior.

🧪 Suggested test additions

     test_template(t, "reversed string repetition",
         "{{ 3 * 'ab' }}",
         json::object(),
         "ababab"
     );
+
+    test_template(t, "string repetition zero",
+        "{{ 'ab' * 0 }}",
+        json::object(),
+        ""
+    );
+
+    test_template(t, "string repetition negative",
+        "{{ 'ab' * -2 }}",
+        json::object(),
+        ""
+    );

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/test-jinja.cpp` around lines 450 - 460, Add boundary tests for zero and
negative repetition counts by extending the existing repetition tests using
test_template; specifically, add cases for "{{ 'ab' * 0 }}" and "{{ 0 * 'ab' }}"
expecting an empty string, and similarly for a negative count e.g. "{{ 'ab' * -1
}}" and "{{ -1 * 'ab' }}" expecting an empty string. Place these new
test_template calls alongside the existing "string repetition" and "reversed
string repetition" tests in tests/test-jinja.cpp so the semantics for zero and
negative repetition are explicitly asserted.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@common/jinja/value.cpp`:
- Around line 221-225: The code currently pushes the raw byte 'ch' into output
when UTF-8 parsing fails (in the block handling parsed.status !=
utf8_parse_result::SUCCESS), which breaks the ASCII-only guarantee when
ensure_ascii is true; update that branch (around variables parsed, json_str,
pos, ch, and the ensure_ascii flag) to emit an ASCII-safe escape for the
offending byte (e.g., a \u00XX hex escape or a \\xXX sequence consistent with
the surrounding JSON emitter) instead of pushing the raw byte, and then advance
pos as before so invalid/incomplete UTF-8 bytes are consumed while preserving
the ensure_ascii contract.

---

Nitpick comments:
In `@tests/test-jinja.cpp`:
- Around line 450-460: Add boundary tests for zero and negative repetition
counts by extending the existing repetition tests using test_template;
specifically, add cases for "{{ 'ab' * 0 }}" and "{{ 0 * 'ab' }}" expecting an
empty string, and similarly for a negative count e.g. "{{ 'ab' * -1 }}" and "{{
-1 * 'ab' }}" expecting an empty string. Place these new test_template calls
alongside the existing "string repetition" and "reversed string repetition"
tests in tests/test-jinja.cpp so the semantics for zero and negative repetition
are explicitly asserted.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d01b0fb5-11e0-478e-9722-ce641ab984e7

📥 Commits

Reviewing files that changed from the base of the PR and between 87f4744 and d29575d.

📒 Files selected for processing (3)

common/jinja/runtime.cpp
common/jinja/value.cpp
tests/test-jinja.cpp

The json_ensure_ascii_preserving_format function does not correctly handle an edge case where if UTF-8 parsing fails, it adds the non-ascii character back to the output as a raw byte. This commit fixes that by adding the unicode standard replacement character \\ufffd to the output instead. This is the standard behavior for various programming languages like Python, Rust, Go, etc.

ngxson · 2026-04-08T13:55:03Z

@coderabbitai review

coderabbitai · 2026-04-08T13:55:10Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

1. Add todo comment for supporting string repetition for array/tuples 2. Add support for float identity operation 3. Move invalid ascii test case to test_fuzzing

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

coderabbitai Bot reviewed Apr 8, 2026

View reviewed changes

Comment thread common/jinja/value.cpp

kwajiehao and others added 2 commits April 9, 2026 12:35

chore: address PR comments

0bd6415

1. Add todo comment for supporting string repetition for array/tuples 2. Add support for float identity operation 3. Move invalid ascii test case to test_fuzzing

chore: accept suggestion for common/jinja/value.cpp

16af73b

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

github-actions Bot added testing jinja parser labels Apr 9, 2026

ngxson closed this Apr 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Mirror] feat: jinja engine improvements for reka-edge#96

[Mirror] feat: jinja engine improvements for reka-edge#96
ngxson wants to merge 4 commits into
ngxson:masterfrom
kwajiehao:feat/jinja-engine-additions-reka-edge

ngxson commented Apr 8, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 8, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

ngxson commented Apr 8, 2026

Uh oh!

coderabbitai Bot commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ngxson commented Apr 8, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ngxson commented Apr 8, 2026

Uh oh!

coderabbitai Bot commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ngxson commented Apr 8, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 8, 2026 •

edited

Loading