fix(engine): actionable error when a Jinja field is missing/None/empty by nabinchha · Pull Request #633 · NVIDIA-NeMo/DataDesigner

nabinchha · 2026-05-11T17:30:22Z

Summary

Closes #629.

Empty-render and missing-attribute Jinja failures used to surface as the same unhelpful message:

🛑 Failed to process column 'preferred_english_name': User provided prompt generation template is invalid.

Two distinct failure modes were collapsing into that generic string:

Bug 1 — empty render (e.g. {{ person.preferred_english_name }} where the key is missing): _assert_rendered_text_not_empty raised a descriptive UserTemplateError, but sanitize_user_exceptions then replaced the message with the generic one.
Bug 3 — raw UndefinedError (e.g. {{ person.address.street }} where address is missing): a jinja2.exceptions.UndefinedError leaked all the way up because the sanitizer only caught UserTemplateError / TemplateSyntaxError.

Both paths now raise a new EmptyTemplateRenderError(UserTemplateError) subclass that the sanitizer is explicitly told to pass through. The exception carries an actionable, row-level message naming the offending chain plus copy-pasteable Jinja conditional and SkipConfig remediation patterns:

Template rendered to empty text. This usually happens when one or more
referenced fields are missing, None, or empty in this row.

Likely culprits in this row:
  - person.address.street (missing from record)

To handle missing values, you can:

  1. Provide a fallback in your template using a Jinja conditional:
       {{ person.address.street if person.address else 'N/A' }}

  2. Skip rows where required fields are missing using SkipConfig:
       skip=SkipConfig(when="{{ not person.address }}")

The "gate" expression in the suggestions stops one accessor past the resolvable prefix so it stays safe to evaluate (avoiding a fresh UndefinedError on the suggested fix itself).

What changed

packages/data-designer-engine/src/data_designer/engine/processing/ginja/:

exceptions.py — new EmptyTemplateRenderError subclass.
ast.py — new public helpers ast_extract_access_chains and resolve_access_chain (plus an AccessChain type alias).
environment.py:
- _assert_rendered_text_not_empty / validate_rendered_text thread the template + record through and raise the new error with the per-row diagnostic.
- UserTemplateSandboxEnvironment.safe_render catches UndefinedError and converts it to the same EmptyTemplateRenderError. NativeJinjaSandboxEnvironment.render_template mirrors the conversion for consistency.
- sanitize_user_exceptions bypasses EmptyTemplateRenderError (same treatment as UserTemplateUnsupportedFiltersError).

End-to-end propagation already prepends the column name via _run_batch, so no other layers needed changes.

Test plan

New regression tests for both bugs (test_empty_render_raises_empty_template_render_error_with_culprit_chain, test_undefined_nested_attr_raises_empty_template_render_error_with_safe_gate).
Coverage for the "resolved-to-None" / "resolved-to-empty-string" / multi-culprit branches.
Test that the sanitizer no longer collapses the new error to the generic message.
Test that NativeJinjaSandboxEnvironment also converts UndefinedError.
Parametrized coverage for ast_extract_access_chains (plain text, single name, nested attr, subscript, mixed, dynamic subscript skipped, loop/conditional context) and resolve_access_chain (resolved, partially-resolved, type-mismatch cases).
Happy-path render still works; one image-cell-generator test was updated from match="invalid" to match="empty" to reflect the new (more informative) message.
uv run pytest packages/data-designer-engine/tests/engine/processing/ginja/ — 95 passed.
uv run ruff format + uv run ruff check — clean.

Empty-render and missing-attribute failures used to surface as the generic "User provided prompt generation template is invalid." either because `sanitize_user_exceptions` stripped the detail or because Jinja's raw `UndefinedError` leaked through. Both now raise a new `EmptyTemplateRenderError` carrying a row-level diagnostic that names the offending chain and includes copy-pasteable Jinja conditional and SkipConfig fix patterns. Closes #629.

github-actions · 2026-05-11T17:32:57Z

PR #633 Review — `fix(engine): actionable error when a Jinja field is missing/None/empty`

Summary

Closes #629. Two distinct failure modes in Jinja rendering used to collapse into the same unhelpful "User provided prompt generation template is invalid." message:

Empty render — template like {{ person.preferred_english_name }} where the key is missing on the record.
Raw UndefinedError — template like {{ person.address.street }} where person.address is missing, leaking Jinja's internal "'dict object' has no attribute 'address'".

The PR introduces EmptyTemplateRenderError(UserTemplateError) and teaches sanitize_user_exceptions to pass it through unaltered. Both the secure (UserTemplateSandboxEnvironment.safe_render) and native (NativeJinjaSandboxEnvironment.render_template) engines now convert UndefinedError + empty-render cases to this new error, which carries a row-level diagnostic naming culprit chains and giving copy-pasteable Jinja-conditional and SkipConfig remediation patterns. Two new AST helpers (ast_extract_access_chains, resolve_access_chain) back the culprit detection.

Findings

Correctness

Exception chaining inconsistency (minor). NativeJinjaSandboxEnvironment.render_template uses raise EmptyTemplateRenderError(...) from exception (environment.py:248-251), preserving the original UndefinedError in __cause__. UserTemplateSandboxEnvironment.safe_render re-raises without from exception (environment.py:402). The messages are curated so there's no user-visible difference, but __cause__ is useful when someone is debugging from a traceback. Recommend raise EmptyTemplateRenderError(...) from exception in safe_render too.
_build_empty_render_message fallback path is robust. Wrapping parser(user_template) + ast_extract_access_chains(...) in try/except Exception and falling back to a generic message avoids masking the original render failure. Good defensive posture.
resolve_access_chain list bounds. Correctly rejects out-of-bounds indices (acc >= len(current) or acc < -len(current)). Negative literal subscripts (e.g. {{ items[-1] }}) would reach resolve_access_chain as int accessors through _build_access_chain, and Python's native negative indexing on current[acc] handles them. Good.
Dict/list/scalar branching is complete. isinstance(current, dict), isinstance(current, list), fallthrough else — the three cases cover the sanitized record value types.
Gate expression derivation is safe. The one-past-resolvable-prefix slice (sample_accessors[: len(sample_prefix) + 1]) yields a Jinja Undefined leaf on evaluation, which stringifies to "" and is falsy — doesn't re-trigger UndefinedError. This is subtle; the inline comment at environment.py explains the reasoning clearly.
culprits[0] for the remediation suggestion. Only the first culprit becomes the template for the remediation example. With multiple culprits listed in the bullets, a user might fix only the first and rerun. Arguably OK (they'll iterate), but consider noting "(showing fix for first culprit)" if user feedback reports confusion.
Sanitizer order. sanitize_user_exceptions now catches (UserTemplateUnsupportedFiltersError, EmptyTemplateRenderError) before (UserTemplateError, TemplateSyntaxError). Since Python matches except clauses in order and both are UserTemplateError subclasses, the ordering is load-bearing. Correct as written. A short comment noting "ordering matters — these are UserTemplateError subclasses" would help future maintainers.
Dynamic subscripts. _build_access_chain returns None for a[b]-style chains, and the visitor descends into the subscript expression with in_chain=False, extracting b as its own chain. Matches the docstring behavior and the parametrized test coverage.

Project conventions

from __future__ import annotations present in all new/modified files ✓
Absolute imports only ✓
Modern type syntax (list[str], str | None, AccessChain = tuple[str, list[str | int]]) ✓
Type annotations on all new public callables ✓
EmptyTemplateRenderError subclasses UserTemplateError, preserving the canonical-error-at-boundaries invariant from AGENTS.md ✓
New AST helpers land in ginja/ast.py alongside existing helpers; _build_access_chain is private — good separation.
Module-level _CULPRIT_* sentinels are fine, but a small Enum or Literal type alias would make the classifier signature self-documenting. Not worth blocking on.

Tests

95 tests in ginja/ passing per the PR description; diff shows thorough new coverage:
- test_ast.py: parametrized chain-extraction (plain, nested, subscript, mixed, dynamic-subscript skip, loop/conditional context), parametrized resolve_access_chain (fully resolved, partially resolved, type-mismatch, list bounds).
- test_environment.py: both bug regressions, culprit-classification branches (missing / None / empty string), multi-culprit listing, sanitizer-bypass assertion, native-engine parity, happy path, skip_template_validation=True path.
The image-cell-generator test was updated from match="invalid" to match="empty" with an explanatory comment pointing at issue Cryptic "User provided prompt generation template is invalid" error when a referenced template field is empty or missing #629. Good — this is exactly the kind of test churn worth annotating.
Gap (small): No test asserts the fallback message when _build_empty_render_message's parse branch fails (e.g., a monkey-patched parser that throws). The code path exists (environment.py:346-352) but is not covered. Arguably not worth chasing given how defensive it is.
Gap (small): No test for a chain with a non-identifier string key rendered via [...] bracket notation (e.g. {{ data["weird key"] }}). _format_access_chain has the branch; a parametrized case would close the loop.

Performance

Culprit detection runs only on the error path. It reparses user_template once per failure. Negligible.
ast_extract_access_chains is O(n) in AST size; visit does no backtracking.

Security

No new attack surface. The helpers operate on already-sanitized records and the existing trusted parse path.
The curated error message embeds user data (record keys in the access chain) into the exception string. This is the same blast radius as prior UserTemplateError messages — acceptable.

Documentation

Docstrings on new helpers are thorough and accurate. safe_render's docstring is updated to mention EmptyTemplateRenderError. ✓
Inline comments explain the non-obvious bits (top-level chain extraction, safe gate derivation, why the sanitizer bypasses this error class).

Suggested follow-ups (non-blocking)

Add from exception to the UndefinedError → EmptyTemplateRenderError re-raise in UserTemplateSandboxEnvironment.safe_render for traceback consistency with the native engine path.
Consider a one-line comment on the sanitize_user_exceptions except-clause noting that ordering is load-bearing.
Add a test case for non-identifier string subscripts in _format_access_chain.
If user reports come in about the multi-culprit case being ambiguous, consider labelling the remediation example ("fix for first culprit shown").

Verdict

Approve with minor nits. The change is well-scoped, well-tested, and genuinely fixes the reported bug class (both the "empty render" and "undefined nested attr" variants) without expanding blast radius. Exception taxonomy and sanitizer bypass are implemented correctly. The nits above are small-cost polish; none block merge.

greptile-apps · 2026-05-11T17:37:21Z

Greptile Summary

This PR replaces two different sources of a generic, unactionable Jinja failure message with a new EmptyTemplateRenderError subclass that names the offending access chain and provides copy-pasteable remediation patterns (Jinja conditional fallback and SkipConfig gate).

Exception hierarchy: EmptyTemplateRenderError(UserTemplateError) is added so sanitize_user_exceptions can pass it through unchanged, preserving the diagnostic detail that the generic catch was previously discarding.
AST analysis: ast_extract_access_chains and resolve_access_chain are introduced to walk the template AST and identify which fields in the current row are missing/None/empty, then _build_empty_render_message computes a "safe gate" expression one accessor past the last resolvable prefix so the suggested template avoids re-triggering UndefinedError.
Propagation: UserTemplateSandboxEnvironment.safe_render and NativeJinjaSandboxEnvironment.render_template both now catch UndefinedError and convert it to the new error type with the per-row diagnostic.

Confidence Score: 3/5

The error-detection and propagation paths are solid, but the gate expression embedded in the suggested template can itself raise UndefinedError when the root template variable is entirely absent from the row record, making the copy-paste fix actively harmful in that case.

The core rendering and sanitizer changes are correct and well-tested. The gate computation in _build_empty_render_message has a logic gap: resolve_access_chain returns prefix = [] for both 'root variable missing' and 'first accessor missing', so the formula always picks the first accessor as the gate. For a template like {{ person.address.street }} where person is completely absent from the record, the generated suggestion raises UndefinedError on person.address — the exact failure mode the PR is trying to fix. No test covers this scenario.

environment.py around _build_empty_render_message, specifically the gate_accessors computation at line 736.

Important Files Changed

Filename	Overview
packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py	Adds UndefinedError catch paths and diagnostic message builder; gate expression logic has an edge-case bug when the template's root variable is entirely absent from the row record.
packages/data-designer-engine/src/data_designer/engine/processing/ginja/ast.py	Adds ast_extract_access_chains and resolve_access_chain helpers; visitor logic correctly deduplicates prefix chains and handles dynamic subscripts.
packages/data-designer-engine/src/data_designer/engine/processing/ginja/exceptions.py	Adds EmptyTemplateRenderError as a UserTemplateError subclass; straightforward and correctly placed in the hierarchy.
packages/data-designer-engine/tests/engine/processing/ginja/test_environment.py	Comprehensive regression tests covering both bug paths; missing coverage for the root-variable-absent-from-record scenario that reveals the gate-expression bug.
packages/data-designer-engine/tests/engine/processing/ginja/test_ast.py	Parametrized coverage for all chain-extraction and resolution cases; thorough and correct.
packages/data-designer-engine/tests/engine/column_generators/generators/test_image.py	Updated match string from 'invalid' to 'empty' to reflect the new error message; correct and minimal change.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[safe_render / render_template called] --> B[template.render record]
    B -->|Success| C[validate_rendered_text]
    C -->|rendered_text != empty| D[Return rendered_text]
    C -->|rendered_text == empty| E[_build_empty_render_message]
    B -->|UndefinedError| E
    B -->|SecurityError| F[UserTemplateError]
    B -->|OverflowError| F
    E --> G[Parse AST to ast_extract_access_chains]
    G --> H[resolve_access_chain per chain]
    H --> I{Any culprit?}
    I -->|Yes| J[Build gate_accessors len prefix plus 1]
    I -->|No| K[Generic fallback message]
    J -->|root in record| L[Correct gate expression]
    J -->|root NOT in record| M[gate goes one step too deep suggested fix raises UndefinedError]
    L --> N[EmptyTemplateRenderError raised]
    M --> N
    N --> O[sanitize_user_exceptions bypasses it]
    O --> P[Actionable message reaches user]

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py:736-737
**Gate expression is one step too deep when the root variable is absent from the record**

`resolve_access_chain` returns `(False, None, [])` both when the root name is missing from `record` (`name not in record`) and when the first accessor fails — both yield `prefix = []`. The gate computation `sample_accessors[:len(sample_prefix)+1]` therefore produces `sample_accessors[:1]` in either scenario. For `{{ person.address.street }}` with `record = {}`, that gives `gate_chain = "person.address"`, so the suggestion becomes `{{ person.address.street if person.address else 'N/A' }}`. But in Jinja2, `person` is `Undefined` when it is absent from the context, and `Undefined.__getattr__` raises `UndefinedError` immediately — the suggested template itself fails with the same kind of error the user is trying to fix.

The gate should fall back to the root name alone (`gate_accessors = []`) when `sample_name` is not present in `record`.

_{Reviews (1): Last reviewed commit: "fix(engine): actionable error when a Jin..." | Re-trigger Greptile}

greptile-apps · 2026-05-11T17:37:26Z

+    gate_accessors = sample_accessors[: len(sample_prefix) + 1]
+    gate_chain = _format_access_chain(sample_name, gate_accessors)


Gate expression is one step too deep when the root variable is absent from the record

resolve_access_chain returns (False, None, []) both when the root name is missing from record (name not in record) and when the first accessor fails — both yield prefix = []. The gate computation sample_accessors[:len(sample_prefix)+1] therefore produces sample_accessors[:1] in either scenario. For {{ person.address.street }} with record = {}, that gives gate_chain = "person.address", so the suggestion becomes {{ person.address.street if person.address else 'N/A' }}. But in Jinja2, person is Undefined when it is absent from the context, and Undefined.__getattr__ raises UndefinedError immediately — the suggested template itself fails with the same kind of error the user is trying to fix.

The gate should fall back to the root name alone (gate_accessors = []) when sample_name is not present in record.

Prompt To Fix With AI

This is a comment left during a code review. Path: packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py Line: 736-737 Comment: **Gate expression is one step too deep when the root variable is absent from the record** `resolve_access_chain` returns `(False, None, [])` both when the root name is missing from `record` (`name not in record`) and when the first accessor fails — both yield `prefix = []`. The gate computation `sample_accessors[:len(sample_prefix)+1]` therefore produces `sample_accessors[:1]` in either scenario. For `{{ person.address.street }}` with `record = {}`, that gives `gate_chain = "person.address"`, so the suggestion becomes `{{ person.address.street if person.address else 'N/A' }}`. But in Jinja2, `person` is `Undefined` when it is absent from the context, and `Undefined.__getattr__` raises `UndefinedError` immediately — the suggested template itself fails with the same kind of error the user is trying to fix. The gate should fall back to the root name alone (`gate_accessors = []`) when `sample_name` is not present in `record`. How can I resolve this? If you propose a fix, please make it concise.

nabinchha requested a review from a team as a code owner May 11, 2026 17:30

nabinchha deployed to agentic-ci May 11, 2026 17:30 — with GitHub Actions Active

greptile-apps Bot reviewed May 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(engine): actionable error when a Jinja field is missing/None/empty#633

fix(engine): actionable error when a Jinja field is missing/None/empty#633
nabinchha wants to merge 1 commit into
mainfrom
fix/template-render-errors-629

nabinchha commented May 11, 2026

Uh oh!

github-actions Bot commented May 11, 2026

Uh oh!

greptile-apps Bot commented May 11, 2026

Confidence Score: 3/5

Flowchart

Uh oh!

greptile-apps Bot May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		gate_accessors = sample_accessors[: len(sample_prefix) + 1]
		gate_chain = _format_access_chain(sample_name, gate_accessors)

Conversation

nabinchha commented May 11, 2026

Summary

What changed

Test plan

Uh oh!

github-actions Bot commented May 11, 2026

PR #633 Review — fix(engine): actionable error when a Jinja field is missing/None/empty

Summary

Findings

Correctness

Project conventions

Tests

Performance

Security

Documentation

Suggested follow-ups (non-blocking)

Verdict

Uh oh!

greptile-apps Bot commented May 11, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

PR #633 Review — `fix(engine): actionable error when a Jinja field is missing/None/empty`