Skip to content

fix(engine): actionable error when a Jinja field is missing/None/empty#633

Open
nabinchha wants to merge 1 commit into
mainfrom
fix/template-render-errors-629
Open

fix(engine): actionable error when a Jinja field is missing/None/empty#633
nabinchha wants to merge 1 commit into
mainfrom
fix/template-render-errors-629

Conversation

@nabinchha
Copy link
Copy Markdown
Contributor

Summary

Closes #629.

Empty-render and missing-attribute Jinja failures used to surface as the same unhelpful message:

🛑 Failed to process column 'preferred_english_name': User provided prompt generation template is invalid.

Two distinct failure modes were collapsing into that generic string:

  • Bug 1 — empty render (e.g. {{ person.preferred_english_name }} where the key is missing): _assert_rendered_text_not_empty raised a descriptive UserTemplateError, but sanitize_user_exceptions then replaced the message with the generic one.
  • Bug 3 — raw UndefinedError (e.g. {{ person.address.street }} where address is missing): a jinja2.exceptions.UndefinedError leaked all the way up because the sanitizer only caught UserTemplateError / TemplateSyntaxError.

Both paths now raise a new EmptyTemplateRenderError(UserTemplateError) subclass that the sanitizer is explicitly told to pass through. The exception carries an actionable, row-level message naming the offending chain plus copy-pasteable Jinja conditional and SkipConfig remediation patterns:

Template rendered to empty text. This usually happens when one or more
referenced fields are missing, None, or empty in this row.

Likely culprits in this row:
  - person.address.street (missing from record)

To handle missing values, you can:

  1. Provide a fallback in your template using a Jinja conditional:
       {{ person.address.street if person.address else 'N/A' }}

  2. Skip rows where required fields are missing using SkipConfig:
       skip=SkipConfig(when="{{ not person.address }}")

The "gate" expression in the suggestions stops one accessor past the resolvable prefix so it stays safe to evaluate (avoiding a fresh UndefinedError on the suggested fix itself).

What changed

packages/data-designer-engine/src/data_designer/engine/processing/ginja/:

  • exceptions.py — new EmptyTemplateRenderError subclass.
  • ast.py — new public helpers ast_extract_access_chains and resolve_access_chain (plus an AccessChain type alias).
  • environment.py:
    • _assert_rendered_text_not_empty / validate_rendered_text thread the template + record through and raise the new error with the per-row diagnostic.
    • UserTemplateSandboxEnvironment.safe_render catches UndefinedError and converts it to the same EmptyTemplateRenderError. NativeJinjaSandboxEnvironment.render_template mirrors the conversion for consistency.
    • sanitize_user_exceptions bypasses EmptyTemplateRenderError (same treatment as UserTemplateUnsupportedFiltersError).

End-to-end propagation already prepends the column name via _run_batch, so no other layers needed changes.

Test plan

  • New regression tests for both bugs (test_empty_render_raises_empty_template_render_error_with_culprit_chain, test_undefined_nested_attr_raises_empty_template_render_error_with_safe_gate).
  • Coverage for the "resolved-to-None" / "resolved-to-empty-string" / multi-culprit branches.
  • Test that the sanitizer no longer collapses the new error to the generic message.
  • Test that NativeJinjaSandboxEnvironment also converts UndefinedError.
  • Parametrized coverage for ast_extract_access_chains (plain text, single name, nested attr, subscript, mixed, dynamic subscript skipped, loop/conditional context) and resolve_access_chain (resolved, partially-resolved, type-mismatch cases).
  • Happy-path render still works; one image-cell-generator test was updated from match="invalid" to match="empty" to reflect the new (more informative) message.
  • uv run pytest packages/data-designer-engine/tests/engine/processing/ginja/ — 95 passed.
  • uv run ruff format + uv run ruff check — clean.

Empty-render and missing-attribute failures used to surface as the
generic "User provided prompt generation template is invalid." either
because `sanitize_user_exceptions` stripped the detail or because
Jinja's raw `UndefinedError` leaked through. Both now raise a new
`EmptyTemplateRenderError` carrying a row-level diagnostic that names
the offending chain and includes copy-pasteable Jinja conditional and
SkipConfig fix patterns.

Closes #629.
@nabinchha nabinchha requested a review from a team as a code owner May 11, 2026 17:30
@nabinchha nabinchha deployed to agentic-ci May 11, 2026 17:30 — with GitHub Actions Active
@github-actions
Copy link
Copy Markdown
Contributor

PR #633 Review — fix(engine): actionable error when a Jinja field is missing/None/empty

Summary

Closes #629. Two distinct failure modes in Jinja rendering used to collapse into the same unhelpful "User provided prompt generation template is invalid." message:

  1. Empty render — template like {{ person.preferred_english_name }} where the key is missing on the record.
  2. Raw UndefinedError — template like {{ person.address.street }} where person.address is missing, leaking Jinja's internal "'dict object' has no attribute 'address'".

The PR introduces EmptyTemplateRenderError(UserTemplateError) and teaches sanitize_user_exceptions to pass it through unaltered. Both the secure (UserTemplateSandboxEnvironment.safe_render) and native (NativeJinjaSandboxEnvironment.render_template) engines now convert UndefinedError + empty-render cases to this new error, which carries a row-level diagnostic naming culprit chains and giving copy-pasteable Jinja-conditional and SkipConfig remediation patterns. Two new AST helpers (ast_extract_access_chains, resolve_access_chain) back the culprit detection.

Findings

Correctness

  • Exception chaining inconsistency (minor). NativeJinjaSandboxEnvironment.render_template uses raise EmptyTemplateRenderError(...) from exception (environment.py:248-251), preserving the original UndefinedError in __cause__. UserTemplateSandboxEnvironment.safe_render re-raises without from exception (environment.py:402). The messages are curated so there's no user-visible difference, but __cause__ is useful when someone is debugging from a traceback. Recommend raise EmptyTemplateRenderError(...) from exception in safe_render too.
  • _build_empty_render_message fallback path is robust. Wrapping parser(user_template) + ast_extract_access_chains(...) in try/except Exception and falling back to a generic message avoids masking the original render failure. Good defensive posture.
  • resolve_access_chain list bounds. Correctly rejects out-of-bounds indices (acc >= len(current) or acc < -len(current)). Negative literal subscripts (e.g. {{ items[-1] }}) would reach resolve_access_chain as int accessors through _build_access_chain, and Python's native negative indexing on current[acc] handles them. Good.
  • Dict/list/scalar branching is complete. isinstance(current, dict), isinstance(current, list), fallthrough else — the three cases cover the sanitized record value types.
  • Gate expression derivation is safe. The one-past-resolvable-prefix slice (sample_accessors[: len(sample_prefix) + 1]) yields a Jinja Undefined leaf on evaluation, which stringifies to "" and is falsy — doesn't re-trigger UndefinedError. This is subtle; the inline comment at environment.py explains the reasoning clearly.
  • culprits[0] for the remediation suggestion. Only the first culprit becomes the template for the remediation example. With multiple culprits listed in the bullets, a user might fix only the first and rerun. Arguably OK (they'll iterate), but consider noting "(showing fix for first culprit)" if user feedback reports confusion.
  • Sanitizer order. sanitize_user_exceptions now catches (UserTemplateUnsupportedFiltersError, EmptyTemplateRenderError) before (UserTemplateError, TemplateSyntaxError). Since Python matches except clauses in order and both are UserTemplateError subclasses, the ordering is load-bearing. Correct as written. A short comment noting "ordering matters — these are UserTemplateError subclasses" would help future maintainers.
  • Dynamic subscripts. _build_access_chain returns None for a[b]-style chains, and the visitor descends into the subscript expression with in_chain=False, extracting b as its own chain. Matches the docstring behavior and the parametrized test coverage.

Project conventions

  • from __future__ import annotations present in all new/modified files ✓
  • Absolute imports only ✓
  • Modern type syntax (list[str], str | None, AccessChain = tuple[str, list[str | int]]) ✓
  • Type annotations on all new public callables ✓
  • EmptyTemplateRenderError subclasses UserTemplateError, preserving the canonical-error-at-boundaries invariant from AGENTS.md ✓
  • New AST helpers land in ginja/ast.py alongside existing helpers; _build_access_chain is private — good separation.
  • Module-level _CULPRIT_* sentinels are fine, but a small Enum or Literal type alias would make the classifier signature self-documenting. Not worth blocking on.

Tests

  • 95 tests in ginja/ passing per the PR description; diff shows thorough new coverage:
    • test_ast.py: parametrized chain-extraction (plain, nested, subscript, mixed, dynamic-subscript skip, loop/conditional context), parametrized resolve_access_chain (fully resolved, partially resolved, type-mismatch, list bounds).
    • test_environment.py: both bug regressions, culprit-classification branches (missing / None / empty string), multi-culprit listing, sanitizer-bypass assertion, native-engine parity, happy path, skip_template_validation=True path.
  • The image-cell-generator test was updated from match="invalid" to match="empty" with an explanatory comment pointing at issue Cryptic "User provided prompt generation template is invalid" error when a referenced template field is empty or missing #629. Good — this is exactly the kind of test churn worth annotating.
  • Gap (small): No test asserts the fallback message when _build_empty_render_message's parse branch fails (e.g., a monkey-patched parser that throws). The code path exists (environment.py:346-352) but is not covered. Arguably not worth chasing given how defensive it is.
  • Gap (small): No test for a chain with a non-identifier string key rendered via [...] bracket notation (e.g. {{ data["weird key"] }}). _format_access_chain has the branch; a parametrized case would close the loop.

Performance

  • Culprit detection runs only on the error path. It reparses user_template once per failure. Negligible.
  • ast_extract_access_chains is O(n) in AST size; visit does no backtracking.

Security

  • No new attack surface. The helpers operate on already-sanitized records and the existing trusted parse path.
  • The curated error message embeds user data (record keys in the access chain) into the exception string. This is the same blast radius as prior UserTemplateError messages — acceptable.

Documentation

  • Docstrings on new helpers are thorough and accurate. safe_render's docstring is updated to mention EmptyTemplateRenderError. ✓
  • Inline comments explain the non-obvious bits (top-level chain extraction, safe gate derivation, why the sanitizer bypasses this error class).

Suggested follow-ups (non-blocking)

  1. Add from exception to the UndefinedError → EmptyTemplateRenderError re-raise in UserTemplateSandboxEnvironment.safe_render for traceback consistency with the native engine path.
  2. Consider a one-line comment on the sanitize_user_exceptions except-clause noting that ordering is load-bearing.
  3. Add a test case for non-identifier string subscripts in _format_access_chain.
  4. If user reports come in about the multi-culprit case being ambiguous, consider labelling the remediation example ("fix for first culprit shown").

Verdict

Approve with minor nits. The change is well-scoped, well-tested, and genuinely fixes the reported bug class (both the "empty render" and "undefined nested attr" variants) without expanding blast radius. Exception taxonomy and sanitizer bypass are implemented correctly. The nits above are small-cost polish; none block merge.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 11, 2026

Greptile Summary

This PR replaces two different sources of a generic, unactionable Jinja failure message with a new EmptyTemplateRenderError subclass that names the offending access chain and provides copy-pasteable remediation patterns (Jinja conditional fallback and SkipConfig gate).

  • Exception hierarchy: EmptyTemplateRenderError(UserTemplateError) is added so sanitize_user_exceptions can pass it through unchanged, preserving the diagnostic detail that the generic catch was previously discarding.
  • AST analysis: ast_extract_access_chains and resolve_access_chain are introduced to walk the template AST and identify which fields in the current row are missing/None/empty, then _build_empty_render_message computes a "safe gate" expression one accessor past the last resolvable prefix so the suggested template avoids re-triggering UndefinedError.
  • Propagation: UserTemplateSandboxEnvironment.safe_render and NativeJinjaSandboxEnvironment.render_template both now catch UndefinedError and convert it to the new error type with the per-row diagnostic.

Confidence Score: 3/5

The error-detection and propagation paths are solid, but the gate expression embedded in the suggested template can itself raise UndefinedError when the root template variable is entirely absent from the row record, making the copy-paste fix actively harmful in that case.

The core rendering and sanitizer changes are correct and well-tested. The gate computation in _build_empty_render_message has a logic gap: resolve_access_chain returns prefix = [] for both 'root variable missing' and 'first accessor missing', so the formula always picks the first accessor as the gate. For a template like {{ person.address.street }} where person is completely absent from the record, the generated suggestion raises UndefinedError on person.address — the exact failure mode the PR is trying to fix. No test covers this scenario.

environment.py around _build_empty_render_message, specifically the gate_accessors computation at line 736.

Important Files Changed

Filename Overview
packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py Adds UndefinedError catch paths and diagnostic message builder; gate expression logic has an edge-case bug when the template's root variable is entirely absent from the row record.
packages/data-designer-engine/src/data_designer/engine/processing/ginja/ast.py Adds ast_extract_access_chains and resolve_access_chain helpers; visitor logic correctly deduplicates prefix chains and handles dynamic subscripts.
packages/data-designer-engine/src/data_designer/engine/processing/ginja/exceptions.py Adds EmptyTemplateRenderError as a UserTemplateError subclass; straightforward and correctly placed in the hierarchy.
packages/data-designer-engine/tests/engine/processing/ginja/test_environment.py Comprehensive regression tests covering both bug paths; missing coverage for the root-variable-absent-from-record scenario that reveals the gate-expression bug.
packages/data-designer-engine/tests/engine/processing/ginja/test_ast.py Parametrized coverage for all chain-extraction and resolution cases; thorough and correct.
packages/data-designer-engine/tests/engine/column_generators/generators/test_image.py Updated match string from 'invalid' to 'empty' to reflect the new error message; correct and minimal change.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[safe_render / render_template called] --> B[template.render record]
    B -->|Success| C[validate_rendered_text]
    C -->|rendered_text != empty| D[Return rendered_text]
    C -->|rendered_text == empty| E[_build_empty_render_message]
    B -->|UndefinedError| E
    B -->|SecurityError| F[UserTemplateError]
    B -->|OverflowError| F
    E --> G[Parse AST to ast_extract_access_chains]
    G --> H[resolve_access_chain per chain]
    H --> I{Any culprit?}
    I -->|Yes| J[Build gate_accessors len prefix plus 1]
    I -->|No| K[Generic fallback message]
    J -->|root in record| L[Correct gate expression]
    J -->|root NOT in record| M[gate goes one step too deep suggested fix raises UndefinedError]
    L --> N[EmptyTemplateRenderError raised]
    M --> N
    N --> O[sanitize_user_exceptions bypasses it]
    O --> P[Actionable message reaches user]
Loading
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py:736-737
**Gate expression is one step too deep when the root variable is absent from the record**

`resolve_access_chain` returns `(False, None, [])` both when the root name is missing from `record` (`name not in record`) and when the first accessor fails — both yield `prefix = []`. The gate computation `sample_accessors[:len(sample_prefix)+1]` therefore produces `sample_accessors[:1]` in either scenario. For `{{ person.address.street }}` with `record = {}`, that gives `gate_chain = "person.address"`, so the suggestion becomes `{{ person.address.street if person.address else 'N/A' }}`. But in Jinja2, `person` is `Undefined` when it is absent from the context, and `Undefined.__getattr__` raises `UndefinedError` immediately — the suggested template itself fails with the same kind of error the user is trying to fix.

The gate should fall back to the root name alone (`gate_accessors = []`) when `sample_name` is not present in `record`.

Reviews (1): Last reviewed commit: "fix(engine): actionable error when a Jin..." | Re-trigger Greptile

Comment on lines +736 to +737
gate_accessors = sample_accessors[: len(sample_prefix) + 1]
gate_chain = _format_access_chain(sample_name, gate_accessors)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Gate expression is one step too deep when the root variable is absent from the record

resolve_access_chain returns (False, None, []) both when the root name is missing from record (name not in record) and when the first accessor fails — both yield prefix = []. The gate computation sample_accessors[:len(sample_prefix)+1] therefore produces sample_accessors[:1] in either scenario. For {{ person.address.street }} with record = {}, that gives gate_chain = "person.address", so the suggestion becomes {{ person.address.street if person.address else 'N/A' }}. But in Jinja2, person is Undefined when it is absent from the context, and Undefined.__getattr__ raises UndefinedError immediately — the suggested template itself fails with the same kind of error the user is trying to fix.

The gate should fall back to the root name alone (gate_accessors = []) when sample_name is not present in record.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py
Line: 736-737

Comment:
**Gate expression is one step too deep when the root variable is absent from the record**

`resolve_access_chain` returns `(False, None, [])` both when the root name is missing from `record` (`name not in record`) and when the first accessor fails — both yield `prefix = []`. The gate computation `sample_accessors[:len(sample_prefix)+1]` therefore produces `sample_accessors[:1]` in either scenario. For `{{ person.address.street }}` with `record = {}`, that gives `gate_chain = "person.address"`, so the suggestion becomes `{{ person.address.street if person.address else 'N/A' }}`. But in Jinja2, `person` is `Undefined` when it is absent from the context, and `Undefined.__getattr__` raises `UndefinedError` immediately — the suggested template itself fails with the same kind of error the user is trying to fix.

The gate should fall back to the root name alone (`gate_accessors = []`) when `sample_name` is not present in `record`.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cryptic "User provided prompt generation template is invalid" error when a referenced template field is empty or missing

1 participant