feat: add template tag parser and compiler #14

JuroOravec · 2025-10-23T10:15:34Z

No description provided.

JuroOravec · 2025-10-23T14:37:42Z

Cargo.toml

 pyo3 = { version = "0.27.1", features = ["extension-module"] }
 quick-xml = "0.38.3"
+pest = "2.8.3"
+pest_derive = "2.8.3"


The template syntax parsing was implemented using Pest. Pest works in 3 parts:

"Grammar rules" - definitions of the patterns that the language supports (I'm not sure about the correct terminology).

Pest defines its own language for defining these rules, see djc-template-parser/src/grammar.pest.

This is similar to Backus–Naur Form, e.g.

<postal-address> ::= <name-part> <street-address> <zip-part>
<name-part> ::= <personal-part> <last-name> <opt-suffix-part> <EOL> | <name-part>
<street-address> ::= <house-num> <street-name> <opt-apt-num> <EOL>
<zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL>

Or the MDN's formal syntax, e.g. here:

border-left-width = <line-width>
<line-width> = <length [0,∞]> | thin | medium | thick

This Pest grammar is where all the permissible patterns are defined. E.g. here's a high-level example for a {% ... %} template tag (NOTE: outdated version):

// The full tag is a sequence of attributes
// E.g. `{% slot key=val key2=val2 %}`
tag_wrapper = { SOI ~ django_tag ~ EOI }

django_tag = { "{%" ~ tag_content ~ "%}" }

// The contents of a tag, without the delimiters
tag_content = ${
    spacing*                      // Optional leading whitespace/comments
    ~ tag_name                    // The tag name must come first, MAY be preceded by whitespace
    ~ (spacing+ ~ attribute)*     // Then zero or more attributes, MUST be separated by whitespace/comments
    ~ spacing*                    // Optional trailing whitespace/comments
    ~ self_closing_slash?         // Optional self-closing slash
    ~ spacing*                    // More optional trailing whitespace
}

Parsing and handling of the matched grammar rules.

So each defined rule has its own name, e.g. django_tag.

When text is parsed with Pest in Rust, we get a list of parsed rules (or a single rule?).

Since the grammar definition specifies the entire {% .. %} template tag, and we pass in a string starting and ending in {% ... %}, we should match exactly the top-level tag_wrapper rule.

If we match anything else in its place, we raise an error.

Once we have tag_wrapper, we walk down it, rule by rule, constructing the AST from the patterns we come across.
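To make the walk concrete, here's a rough Python sketch of the same idea. It is not the Rust implementation and doesn't use Pest's API: `Pair` and `build_tag` are made-up names, and the parse tree is built by hand instead of coming from a parser.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pair:
    """Hypothetical stand-in for one matched grammar rule."""
    rule: str
    text: str
    inner: List["Pair"] = field(default_factory=list)


def build_tag(root: Pair) -> dict:
    # The top-level match must be `tag_wrapper`; anything else is an error.
    if root.rule != "tag_wrapper":
        raise ValueError(f"expected tag_wrapper, got {root.rule}")
    django_tag = root.inner[0]
    tag_content = django_tag.inner[0]

    name: Optional[str] = None
    attrs: List[str] = []
    # Walk the children rule by rule and collect what the AST needs.
    for pair in tag_content.inner:
        if pair.rule == "tag_name":
            name = pair.text
        elif pair.rule == "attribute":
            attrs.append(pair.text)
        # Other rules (e.g. `spacing`) carry nothing the AST needs.
    return {"name": name, "attrs": attrs}


# Hand-built tree for `{% slot key=val %}` (assumed shape, for illustration).
tree = Pair("tag_wrapper", "{% slot key=val %}", [
    Pair("django_tag", "{% slot key=val %}", [
        Pair("tag_content", " slot key=val ", [
            Pair("spacing", " "),
            Pair("tag_name", "slot"),
            Pair("spacing", " "),
            Pair("attribute", "key=val"),
            Pair("spacing", " "),
        ]),
    ]),
])
result = build_tag(tree)
```

In the real parser each attribute is walked further down into values and filters; the sketch stops at the raw attribute text to keep it short.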

Constructing the AST.

The AST consists of these nodes - Tag, TagAttr, TagToken, TagValue, TagValueFilter

Tag - the entire {% ... %}, e.g. {% my_tag x ...[1, 2, 3] key=val / %}

The first word inside a Tag is the tag_name, e.g. my_tag.

After the tag name, there are zero or more TagAttrs. These are ALL the inputs, both positional and keyword.

Here the tag attrs are x, ...[1, 2, 3], key=val

If a tag attribute has a key, the key is stored on the TagAttr.

But ALL TagAttrs MUST have a value.

TagValue holds a single value, may have a filter, e.g. "cool"|upper

TagValue may be of different kinds, e.g. string, int, float, literal list, literal dict, variable, translation _('mystr'), etc. The specific kind is identified by what rules we parse, and the resulting TagValue nodes are distinguished by the ValueKind, an enum with values like "string", "float", etc.

Since a TagValue can also be e.g. a literal list, TagValues may contain other TagValues. This implies that:

Lists and dicts themselves can have filters applied to them, e.g. [1, 2, 3]|append:4

Items inside lists and dicts can have filters applied to them too, e.g. [1|add:1, 2|add:2]

Any TagValue can have 0 or more filters applied to it. Filters have a name and an optional argument, e.g. 3|add:2 - filter name add, arg 2. These filters are held by TagValueFilter.

While the filter name is a plain identifier, the argument can be yet another TagValue. So even literal lists and dicts are permitted at the position of a filter argument, e.g. [1]|extend:[2, 3]

Lastly, TagToken is a secondary object used by the nodes above. It contains info about the original raw string, and the line / col where the string was found.

The final AST can look like this:

INPUT:

{% my_tag value|lower %}

AST:

Tag {
    name: TagToken {
        token: "my_tag".to_string(),
        start_index: 3,
        end_index: 9,
        line_col: (1, 4),
    },
    attrs: vec![TagAttr {
        key: None,
        value: TagValue {
            token: TagToken {
                token: "value".to_string(),
                start_index: 10,
                end_index: 15,
                line_col: (1, 11),
            },
            children: vec![],
            spread: None,
            filters: vec![TagValueFilter {
                arg: None,
                token: TagToken {
                    token: "lower".to_string(),
                    start_index: 16,
                    end_index: 21,
                    line_col: (1, 17),
                },
                start_index: 15,
                end_index: 21,
                line_col: (1, 16),
            }],
            kind: ValueKind::Variable,
            start_index: 10,
            end_index: 21,
            line_col: (1, 11),
        },
        is_flag: false,
        start_index: 10,
        end_index: 21,
        line_col: (1, 11),
    }],
    is_self_closing: false,
    syntax: TagSyntax::Django,
    start_index: 0,
    end_index: 24,
    line_col: (1, 4),
}
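For readers who think in Python, here's a simplified sketch of the same node shapes as dataclasses. These are NOT the classes exposed by djc_core: the field names are borrowed from the Rust dump above, `spread`, `is_flag`, `syntax` and the node-level positions are left out, and `kind` is a plain string. The `tokens()` helper is made up to show how the start/end indices relate to the input string.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class TagToken:
    token: str
    start_index: int
    end_index: int
    line_col: Tuple[int, int]


@dataclass
class TagValueFilter:
    token: TagToken                      # filter name, e.g. `lower`
    arg: Optional["TagValue"] = None     # optional argument, itself a TagValue


@dataclass
class TagValue:
    token: TagToken
    kind: str                            # e.g. "variable", "string", "list"
    children: List["TagValue"] = field(default_factory=list)
    filters: List[TagValueFilter] = field(default_factory=list)


@dataclass
class TagAttr:
    value: TagValue                      # every attribute has a value
    key: Optional[TagToken] = None       # only keyword attributes have a key


@dataclass
class Tag:
    name: TagToken
    attrs: List[TagAttr] = field(default_factory=list)


source = "{% my_tag value|lower %}"
tag = Tag(
    name=TagToken("my_tag", 3, 9, (1, 4)),
    attrs=[TagAttr(value=TagValue(
        token=TagToken("value", 10, 15, (1, 11)),
        kind="variable",
        filters=[TagValueFilter(token=TagToken("lower", 16, 21, (1, 17)))],
    ))],
)


def tokens(t: Tag) -> Iterator[TagToken]:
    """Yield every TagToken in the tag (name, keys, values, filter names)."""
    yield t.name
    stack: List[TagValue] = []
    for attr in t.attrs:
        if attr.key is not None:
            yield attr.key
        stack.append(attr.value)
    while stack:
        value = stack.pop()
        yield value.token
        stack.extend(value.children)
        for flt in value.filters:
            yield flt.token
            if flt.arg is not None:
                stack.append(flt.arg)


# Each token's indices should slice its text back out of the source.
offsets_ok = all(source[t.start_index:t.end_index] == t.token for t in tokens(tag))
```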

JuroOravec · 2025-10-23T15:47:29Z

crates/djc-template-parser/src/tag_compiler.rs

@@ -0,0 +1,749 @@
+//! # Django Template Tag Compiler


Another important part is the "tag compiler". This turns the parsed AST into an executable Python function. When this function is called with the Context object, it resolves the inputs to a tag into Python args and kwargs.

from djc_core import parse_tag, compile_tag

ast = parse_tag('{% my_tag var1 ...[2, 3] key=val ...{"other": "x"} / %}')
tag_fn = compile_tag(ast)

args, kwargs = tag_fn({"var1": "hello", "val": "abc"})

assert args == ["hello", 2, 3]
assert kwargs == {"key": "abc", "other": "x"}

How it works is:

We start with the AST of the template tag.

TagAttrs with keys become the function's kwargs, and TagAttrs without keys become the function's args.

For each TagAttr, we walk down its value, and handle each ValueKind differently:

Literals - 1, 1.5, "abc", etc - These are compiled as literal Python values

Variables - e.g. my_var - we replace that with function call variable(context, "my_var")

Filters - my_var|add:"txt" - replaced with function call filter(context, "add", my_var, "txt")

Translation _("abc") - function call translation(context, "abc")

String with nested template tags, e.g. "Hello {{ first_name }}" - function call template_string(context, "Hello {{ first_name }}")

Literal lists and dicts - structure preserved, and we walk down and convert each item, key, value.
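The rules above could be sketched roughly like this in Python. This is not the Rust compiler: the dict-based node shape (`kind`, `value`, `children`, `filters`) and the name `compile_value` are assumptions made for the example. It only shows how each kind maps to a Python expression string, and how filters wrap whatever expression came before them.

```python
def compile_value(node: dict) -> str:
    """Turn a (hypothetical) value node into a Python expression string."""
    kind = node["kind"]
    if kind in ("int", "float", "string"):
        # Literals are emitted as literal Python values.
        expr = repr(node["value"])
    elif kind == "variable":
        expr = f"variable(context, {node['value']!r})"
    elif kind == "translation":
        expr = f"translation(context, {node['value']!r})"
    elif kind == "template_string":
        expr = f"template_string(context, {node['value']!r})"
    elif kind == "list":
        # Structure is preserved; each item is compiled recursively.
        expr = "[" + ", ".join(compile_value(c) for c in node["children"]) + "]"
    elif kind == "dict":
        # Assumed shape: children is a list of (key_node, value_node) pairs.
        pairs = (f"{compile_value(k)}: {compile_value(v)}" for k, v in node["children"])
        expr = "{" + ", ".join(pairs) + "}"
    else:
        raise ValueError(f"unknown kind: {kind}")

    # Each filter wraps the expression built so far.
    for flt in node.get("filters", []):
        arg = compile_value(flt["arg"]) if flt.get("arg") is not None else "None"
        expr = f"filter(context, {flt['name']!r}, {expr}, {arg})"
    return expr


# my_var|add:"txt"
node = {
    "kind": "variable",
    "value": "my_var",
    "filters": [{"name": "add", "arg": {"kind": "string", "value": "txt"}}],
}
expr = compile_value(node)

# Evaluate with plain-Python stand-ins for the helper functions.
helpers = {
    "context": {"my_var": "hello "},
    "variable": lambda ctx, name: ctx[name],
    "filter": lambda ctx, name, value, arg: value + arg,
}
value = eval(expr, dict(helpers))
```

Because filters are applied after the base expression is built, the same code path handles `[1, 2, 3]|append:4` and `[1|add:1, 2|add:2]` without special cases.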

Input:

{% component my_var|add:"txt" / %}

Generated function:

def compiled_func(context, *, template_string, translation, variable, filter):
    args = []
    kwargs = []
    args.append(filter(context, 'add', variable(context, 'my_var'), "txt"))
    return args, kwargs

Apply Django-specific logic

As you can see, the generated function accepts the definitions for the functions variable(), filter(), etc.

This means that the implementation for these is defined in Python. So we can still easily change how individual features are handled. These definitions of variable(), etc are NOT exposed to the users of django-components.

The implementation is defined in django-components, and it looks something like below.

There you can see e.g. that when the Rust compiler came across a variable my_var, it generated a variable(...) call. And the implementation of variable(...) calls Django's Variable(var).resolve(ctx).

So at the end of the day we're still using the same Django logic to actually resolve variables into actual values.

def resolve_template_string(ctx: Context, expr: str) -> Any:
    return DynamicFilterExpression(
        expr_str=expr,
        filters=filters,
        tags=tags,
    ).resolve(ctx)


def resolve_filter(_ctx: Context, name: str, value: Any, arg: Any) -> Any:
    if name not in filters:
        raise TemplateSyntaxError(f"Invalid filter: '{name}'")

    filter_func = filters[name]
    if arg is None:
        return filter_func(value)
    else:
        return filter_func(value, arg)


def resolve_variable(ctx: Context, var: str) -> Any:
    try:
        return Variable(var).resolve(ctx)
    except VariableDoesNotExist:
        return ""


def resolve_translation(ctx: Context, var: str) -> Any:
    # The compiler gives us the variable stripped of `_("` and `")`,
    # so we put it back for Django's Variable class to interpret it as a translation.
    translation_var = "_('" + var + "')"
    return Variable(translation_var).resolve(ctx)


args, kwargs = compiled_tag(
    context=context,
    template_string=resolve_template_string,
    variable=resolve_variable,
    translation=resolve_translation,
    filter=resolve_filter,
)
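To see the injection mechanism in isolation, here's a rough sketch that runs the generated function from the earlier `{% component my_var|add:"txt" / %}` example with plain-Python stand-ins instead of the Django-backed helpers. `stub_variable`, `stub_filter` and `unused` are made up for illustration, and the context is a plain dict rather than a Django Context.

```python
# Generated source, copied from the earlier example.
func_source = '''
def compiled_func(context, *, template_string, translation, variable, filter):
    args = []
    kwargs = []
    args.append(filter(context, 'add', variable(context, 'my_var'), "txt"))
    return args, kwargs
'''

scope = {}
exec(func_source, {}, scope)
compiled_func = scope["compiled_func"]


def stub_variable(ctx, name):
    # Missing variables resolve to "", mirroring resolve_variable.
    return ctx.get(name, "")


def stub_filter(ctx, name, value, arg):
    # Only `add` is supported in this sketch.
    if name != "add":
        raise ValueError(f"Invalid filter: '{name}'")
    return value + arg


def unused(ctx, value):
    raise AssertionError("not expected to be called for this tag")


args, kwargs = compiled_func(
    {"my_var": "hello "},
    template_string=unused,
    translation=unused,
    variable=stub_variable,
    filter=stub_filter,
)
```

Swapping the stubs for the Django-backed functions changes how values are resolved without touching the generated code, which is the point of passing the helpers in as keyword arguments.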

Call the component with the args and kwargs

The compiled function returned a list of args and a dict of kwargs. We then simply pass these further to the implementation of the {% component %} node.

So a template tag like this:

{% component "my_table" var1 ...[2, 3] key=val ...{"other": "x"} / %}

Eventually gets resolved to something like so:

ComponentNode.render("my_table", var1, 2, 3, key=val, other="x")

Validation

The template tag inputs respect Python's convention of not allowing args after kwargs.

When compiling AST into a Python function, we're able to detect obvious cases and raise an error early, like:

{% component key=val my_var / %} {# Error! #}
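A minimal sketch of such a compile-time check, in Python. The flat `(key, raw_value, is_spread)` tuples and the name `check_arg_order` are assumptions for the example, not the actual compiler code. Spreads are skipped here because their kind is only known at render time.

```python
from typing import List, Optional, Tuple

# Hypothetical flat view of tag attributes: (key, raw_value, is_spread).
Attr = Tuple[Optional[str], str, bool]


def check_arg_order(attrs: List[Attr]) -> None:
    kwarg_seen = False
    for key, raw_value, is_spread in attrs:
        if is_spread:
            # Could be a list or a dict; this can't be decided yet.
            continue
        if key is not None:
            kwarg_seen = True
        elif kwarg_seen:
            raise SyntaxError(
                f"positional argument follows keyword argument: {raw_value}"
            )


# {% component my_var key=val / %} is fine.
check_arg_order([(None, "my_var", False), ("key", "val", False)])

# {% component key=val my_var / %} is rejected.
try:
    check_arg_order([("key", "val", False), (None, "my_var", False)])
    error = None
except SyntaxError as exc:
    error = str(exc)
```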

However, some cases can only be figured out at render time, because the spread syntax ...my_var can be used with either a list of args or a dict of kwargs.

So we need to wait for the Context object to figure out whether this:

{% component ...items my_var / %}

Resolves to a list (OK):

{% component ...[1, 2, 3] my_var / %}

Or to a dict (Error):

{% component ...{"key": "x"} my_var / %}

So when we detect that there is a spread within the template tag, we add a render-time function that checks whether the spread resolves to a list or a dict, and raises if that's not permitted at that position:

INPUT:

{% component key1="value1" ...options1 key2="value2" ...options2 / %}

Generated function:

def compiled_func(context, *, expression, translation, variable, filter):
    def _handle_spread(value, raw_token_str, args, kwargs, kwarg_seen):
        if hasattr(value, "keys"):
            kwargs.extend(value.items())
            return True
        else:
            if kwarg_seen:
                raise SyntaxError("positional argument follows keyword argument")
            try:
                args.extend(value)
            except TypeError:
                raise TypeError(
                    f"Value of '...{raw_token_str}' must be a mapping or an iterable, "
                    f"not {type(value).__name__}."
                )
            return False

    args = []
    kwargs = []
    kwargs.append(('key1', "value1"))
    kwarg_seen = True
    kwarg_seen = _handle_spread(variable(context, 'options1'), """options1""", args, kwargs, kwarg_seen)
    kwargs.append(('key2', "value2"))
    kwarg_seen = _handle_spread(variable(context, 'options2'), """options2""", args, kwargs, kwarg_seen)
    return args, kwargs

JuroOravec · 2025-10-23T15:52:45Z

crates/djc-template-parser/src/tag_parser.rs

+}
+
+#[cfg(test)]
+mod tests {


The reason why this PR is so huge is mainly the tests. In this file, tag_parser.rs, they take up 7500 lines. There are around 120 tests, in an attempt to cover the interactions between different kinds of values - e.g. filters in lists, lists in filters, etc.

JuroOravec · 2025-10-23T15:53:44Z

crates/djc-template-parser/src/tag_parser.rs

+}
+
+impl TagParser {
+    pub fn parse_tag(input: &str, flags: &HashSet<String>) -> Result<Tag, ParseError> {


This is where the AST generation is implemented.

JuroOravec · 2025-10-23T15:56:48Z

djc_core/djc_template_parser.pyi

+
+TContext = TypeVar("TContext")
+
+class ValueKind:


The Python API generated by PyO3 / maturin doesn't include type hints. To a type checker it's a black box, so the Python API has to be re-declared in a .pyi file.

JuroOravec · 2025-10-23T15:57:29Z

crates/djc-template-parser/src/grammar.pest

+// The full tag is a sequence of attributes
+// E.g. `{% slot key=val key2=val2 %}`
+// NOTE: tag_wrapper is used when parsing exclusively a single Django template tag.
+tag_wrapper = { SOI ~ django_tag ~ EOI }


Here is the definition of the grammar rules. For easier reading, there's also a VSCode extension that adds syntax highlighting for .pest files.

JuroOravec · 2025-10-23T16:04:41Z

djc_core/djc_template_parser.py

+        attributes = tag_or_attrs
+    func_string = compile_ast_to_string(attributes)
+    local_scope = {}
+    exec(func_string, {}, local_scope)


When compiling the AST into a Python function, I call exec() with the generated code.

This is safe because we construct the contents of the generated function ourselves. The generated code never calls or accesses a template variable directly; instead, it calls helper functions like variable(...), and we define those implementations.

One place where I had to be careful: for better debugging, when there is a problem with a spread (args after kwargs), the error message includes the original string to point the user to where the error occurred.

So there I wrapped the raw string in """ and escaped all " inside the string, to prevent the raw string from terminating the string literal early.
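A small sketch of that kind of escaping, in Python. `quote_raw` is a made-up name, and escaping backslashes as well as double quotes is my assumption here (the comment above only mentions escaping `"`); it is included so that a trailing backslash can't escape the closing quotes.

```python
def quote_raw(raw: str) -> str:
    """Wrap `raw` in a triple-quoted literal that evaluates back to `raw`."""
    # Escape backslashes first, then double quotes, so that neither can
    # end the literal early or form an unintended escape sequence.
    escaped = raw.replace("\\", "\\\\").replace('"', '\\"')
    return '"""' + escaped + '"""'


# A string that tries to break out of the literal.
tricky = 'x""" + evil() + """'
literal = quote_raw(tricky)

# Evaluating the literal gives the original text back; evil() is never called.
round_tripped = eval(literal)
```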

JuroOravec · 2025-10-23T16:08:09Z

djc_core/djc_template_parser.py

+    """
+
+
+def compile_tag(tag_or_attrs: Union[Tag, List[TagAttr]]):


Another thing - converting the function code string into an actual Python object is much easier to do on the Python side with exec() than trying to re-construct the entire function body inside Rust (if that's even possible).

So compile_tag() and CompiledFunc are part of the public API of djc_core, but are defined directly in Python.

Because of that, these two entities had to be exposed and added to __all__ in djc_core/__init__.py.

JuroOravec added 2 commits October 23, 2025 12:08

feat: add template tag parser and compiler

878899d

refactor: minor updates for the pest grammar

e4c389a


JuroOravec added 3 commits October 23, 2025 18:28

refactor: fix tests

78b4fc9

docs: document flow for parsing and compiling

f4846fc

refactor: rename dynamic expressiosn to template strings

de9b038

JuroOravec marked this pull request as ready for review October 23, 2025 16:49
