Cryptic "User provided prompt generation template is invalid" error when a referenced template field is empty or missing #629

@nabinchha

Describe the bug

When a Jinja prompt template references a field that is None, an empty string, or missing from the row's record, the user sees a cryptic and unactionable error:

🛑 Failed to process column 'preferred_english_name': User provided prompt generation template is invalid.

Two distinct render-time failures are involved, and both leave the user without actionable information:

Bug 1 — empty render (the more common case): when {{ x }} renders to '' (because x is '', missing from the record dict, or a chain that resolves to empty), the SECURE Jinja path raises a sensible internal error ("User template renders to empty text.") but sanitize_user_exceptions strips that detail and replaces it with the generic "User provided prompt generation template is invalid.". The user has no way to know:

  • which field caused the issue,
  • which row caused it,
  • or what to do about it.

Bug 3 — raw UndefinedError for missing nested attributes: when {{ person.address.street }} is evaluated and address is missing from the person dict, a raw Jinja UndefinedError (e.g. 'dict object' has no attribute 'address') leaks all the way up because the sanitizer only catches UserTemplateError / TemplateSyntaxError. Same root cause as Bug 1, same right answer, but currently surfaces differently and just as confusingly.

This is common in two real-world patterns:

  • Person sampler with non-en_SG locales — fields like preferred_english_name only exist for en_SG personas (see PII_FIELDS in packages/data-designer-config/src/data_designer/config/utils/constants.py). Templates that reference them break for any other locale.
  • Seed datasets with sparse columns — real-world tabular data has empty cells everywhere; any template referencing a sparse column hits this on those rows.

Steps/Code to reproduce bug

from data_designer.engine.processing.ginja.environment import WithJinja2UserTemplateRendering
from data_designer.config.run_config import JinjaRenderingEngine


class Demo(WithJinja2UserTemplateRendering):
    def __init__(self):
        self._jinja_rendering_engine = JinjaRenderingEngine.SECURE


# Bug 1 — empty render via missing key
demo = Demo()
demo.prepare_jinja2_template_renderer(
    "{{ person.preferred_english_name }}",
    dataset_variables=["person"],
)
demo.render_template({"person": {"first_name": "John", "last_name": "Doe"}})
# UserTemplateError: User provided prompt generation template is invalid.

# Bug 3 — raw UndefinedError on missing nested key
demo = Demo()
demo.prepare_jinja2_template_renderer(
    "Hi {{ person.address.street }}",
    dataset_variables=["person"],
)
demo.render_template({"person": {}})
# UndefinedError: 'dict object' has no attribute 'address'

End-to-end inside _run_batch, the user sees:

DataDesignerGenerationError: 🛑 Error generating preview dataset: 🛑 Failed to process column 'preferred_english_name': User provided prompt generation template is invalid.

Expected behavior

The error should be actionable and identify:

  • the column being processed (already happens via _run_batch),
  • which referenced field(s) in the row were None, empty, or missing,
  • the recommended remedies: Jinja conditional fallback and SkipConfig.

Example shape we should aim for:

🛑 Failed to process column 'preferred_english_name':
Template rendered to empty text. This usually happens when one or more referenced fields are None, empty, or missing.

Likely culprits in this row:
  - person.preferred_english_name = None

To handle missing values, you can:

  1. Provide a fallback in your template using a Jinja conditional:
       {{ person.preferred_english_name if person.preferred_english_name else 'N/A' }}

  2. Skip rows where required fields are missing using SkipConfig:
       skip=SkipConfig(when="{{ person.preferred_english_name is none }}")

Proposed fix (high level)

  1. New EmptyTemplateRenderError(UserTemplateError) subclass that bypasses sanitize_user_exceptions (mirrors the existing UserTemplateUnsupportedFiltersError pattern).
  2. AST helper that extracts dotted/bracketed access chains from the parsed template and pairs each chain with the value it would resolve to in the current row.
  3. _assert_rendered_text_not_empty and a new UndefinedError branch in UserTemplateSandboxEnvironment.safe_render use the helper to build an actionable, copy-pasteable error message naming the offending chain(s).
  4. Sanitizer bypass for EmptyTemplateRenderError so the message survives.
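
To illustrate item 2, here is a rough sketch of pairing access chains with the values they resolve to in a row. It is an assumption-laden stand-in: a regular expression replaces the proposed AST walk, it only handles plain dotted expressions such as `{{ a.b.c }}` (no filters, brackets, or conditionals), and the names `extract_chains`, `resolve_chain`, `find_culprits`, and `MISSING` are all hypothetical.

```python
import re
from typing import Any

# Hypothetical sentinel meaning "this key is absent from the row".
MISSING = object()

# Simplification: a regex stands in for the AST-based helper in the proposal.
_EXPR = re.compile(r"\{\{(.*?)\}\}", re.S)
_CHAIN = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*")


def extract_chains(template: str) -> list[str]:
    """Return the dotted chains used as whole {{ ... }} expressions, in order."""
    chains: list[str] = []
    for expr in _EXPR.findall(template):
        expr = expr.strip()
        if _CHAIN.fullmatch(expr) and expr not in chains:
            chains.append(expr)
    return chains


def resolve_chain(chain: str, row: dict[str, Any]) -> Any:
    """Walk the row one key at a time; return MISSING at the first absent key."""
    current: Any = row
    for part in chain.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return MISSING
    return current


def find_culprits(template: str, row: dict[str, Any]) -> dict[str, Any]:
    """Map each chain that resolves to None, '', or MISSING to that value."""
    culprits: dict[str, Any] = {}
    for chain in extract_chains(template):
        value = resolve_chain(chain, row)
        if value is MISSING or value is None or value == "":
            culprits[chain] = value
    return culprits


if __name__ == "__main__":
    row = {"person": {"first_name": "John", "last_name": "Doe"}}
    print(find_culprits("{{ person.preferred_english_name }}", row))
```

Because the walk stops at the first absent key, the same helper covers both the missing leaf key and the missing intermediate key (`person.address`) from the reproduction cases.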

End-to-end propagation already prepends the column name via _run_batch, so no other layers need changes.
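
Items 1 and 4 could look roughly like the sketch below. The classes here are local stand-ins rather than the real data_designer exceptions, and `sanitize_sketch` is a hypothetical context manager: the actual shape of sanitize_user_exceptions (decorator, context manager, or otherwise) is not shown in this issue, so treat this purely as an illustration of the bypass idea.

```python
from contextlib import contextmanager
from typing import Iterator

GENERIC_MESSAGE = "User provided prompt generation template is invalid."


class UserTemplateError(Exception):
    """Stand-in for the existing user-facing template error."""


class EmptyTemplateRenderError(UserTemplateError):
    """Proposed subclass whose detailed message should reach the user."""


@contextmanager
def sanitize_sketch() -> Iterator[None]:
    """Hypothetical sanitizer: hide details unless the error opts out."""
    try:
        yield
    except EmptyTemplateRenderError:
        # Bypass: the actionable message is propagated unchanged.
        raise
    except UserTemplateError:
        # Every other template error is replaced by the generic text.
        raise UserTemplateError(GENERIC_MESSAGE) from None


def outcome(exc: Exception) -> tuple[type, str]:
    """Raise exc inside the sanitizer and report what comes out."""
    try:
        with sanitize_sketch():
            raise exc
    except UserTemplateError as caught:
        return type(caught), str(caught)


if __name__ == "__main__":
    print(outcome(UserTemplateError("User template renders to empty text.")))
    print(outcome(EmptyTemplateRenderError("person.preferred_english_name = None")))
```

Listing the subclass first in the `except` chain is what lets it bypass the generic replacement while remaining catchable as a UserTemplateError by callers further up.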

Affected files

Labels: bug (Something isn't working)