@JuroOravec

No description provided.

pyo3 = { version = "0.27.1", features = ["extension-module"] }
quick-xml = "0.38.3"
pest = "2.8.3"
pest_derive = "2.8.3"

@JuroOravec JuroOravec Oct 23, 2025

The template syntax parsing is implemented using Pest. Pest works in 3 parts:

  1. "Grammar rules" - definitions of the patterns that the language supports. Pest grammars are parsing expression grammars (PEGs).

    Pest defines its own language for writing these rules, see djc-template-parser/src/grammar.pest.

    This is similar to Backus–Naur Form, e.g.

    <postal-address> ::= <name-part> <street-address> <zip-part>
    <name-part> ::= <personal-part> <last-name> <opt-suffix-part> <EOL> | <personal-part> <name-part>
    <street-address> ::= <house-num> <street-name> <opt-apt-num> <EOL>
    <zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL>
    

    Or MDN's formal syntax for CSS properties, e.g. for border-left-width:

    border-left-width =
      <line-width>

    <line-width> =
      <length [0,∞]>  |
      thin            |
      medium          |
      thick
    

    This Pest grammar is where all the permissible patterns are defined. E.g. here's a high-level example for a {% ... %} template tag (NOTE: outdated version):

 // The full tag is a sequence of attributes
 // E.g. `{% slot key=val key2=val2 %}`
 tag_wrapper = { SOI ~ django_tag ~ EOI }
 
 django_tag = { "{%" ~ tag_content ~ "%}" }
 
 // The contents of a tag, without the delimiters
 tag_content = ${
     spacing*                             // Optional leading whitespace/comments
     ~ tag_name                           // The tag name must come first, MAY be preceded by whitespace
     ~ (spacing+ ~ attribute)*            // Then zero or more attributes, MUST be separated by whitespace/comments
     ~ spacing*                           // Optional trailing whitespace/comments
     ~ self_closing_slash?                // Optional self-closing slash
     ~ spacing*                           // More optional trailing whitespace
 }
  2. Parsing and handling of the matched grammar rules.

    Each defined rule has its own name, e.g. django_tag.

    When a text is parsed with Pest in Rust, we get back the matched rules as Pairs, an iterator over the top-level matches.

    Since the grammar definition specifies the entire {% .. %} template tag, and we pass in a string starting and ending in {% ... %}, we should match exactly the top-level tag_wrapper rule.

    If we match anything else in its place, we raise an error.

    Once we have tag_wrapper, we walk down it, rule by rule, constructing the AST from the patterns we come across.

  3. Constructing the AST.

    The AST consists of these nodes - Tag, TagAttr, TagToken, TagValue, TagValueFilter

    • Tag - the entire {% ... %}, e.g. {% my_tag x ...[1, 2, 3] key=val / %}

    • The first word inside a Tag is the tag_name, e.g. my_tag.

    • After the tag name, there are zero or more TagAttrs. These are ALL the inputs, both positional and keyword.

      • Tag attrs are x, ...[1, 2, 3], key=val
      • If a tag attribute has a key, that's stored on the TagAttr.
      • But ALL TagAttrs MUST have a value.
    • TagValue holds a single value, may have a filter, e.g. "cool"|upper

      • TagValue may be of different kinds, e.g. string, int, float, literal list, literal dict, variable, translation _('mystr'), etc. The specific kind is identified by what rules we parse, and the resulting TagValue nodes are distinguished by the ValueKind, an enum with values like "string", "float", etc.
      • Since TagValue can be also e.g. literal lists, TagValues may contain other TagValues. This implies that:
        1. Lists and dicts themselves can have filters applied to them, e.g. [1, 2, 3]|append:4
        2. Items inside lists and dicts can also have filters applied to them, e.g. [1|add:1, 2|add:2]
    • Any TagValue can have 0 or more filters applied to it. Filters have a name and an optional argument, e.g. 3|add:2 - filter name add, arg 2. These filters are held by TagValueFilter.

      • While the filter name is a plain identifier, the argument can be yet another TagValue. So even literal lists and dicts are permitted in the position of the filter argument, e.g. [1]|extend:[2, 3]
    • Lastly, TagToken is a secondary object used by the nodes above. It contains info about the original raw string, and the line / col where the string was found.
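
To make the nesting described above concrete, here is a rough Python sketch of the node types. This is not the actual Rust definition: the field names are taken from the AST example in this comment, the ValueKind members are only a small subset, and the helpers tok(), int_value() and count_values() are hypothetical names used for illustration. It builds the value [1]|extend:[2, 3] and shows that a filter argument is just another TagValue.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class ValueKind(Enum):
    # Only a few kinds are listed here; the real enum has more.
    INT = "int"
    LIST = "list"
    VARIABLE = "variable"


@dataclass
class TagToken:
    token: str                 # raw source text
    start_index: int
    end_index: int             # assumed to be exclusive
    line_col: Tuple[int, int]  # assumed to be 1-based


@dataclass
class TagValueFilter:
    token: TagToken                   # the filter name
    arg: Optional["TagValue"] = None  # the argument is itself a TagValue


@dataclass
class TagValue:
    token: TagToken
    kind: ValueKind
    children: List["TagValue"] = field(default_factory=list)
    filters: List[TagValueFilter] = field(default_factory=list)


@dataclass
class TagAttr:
    value: TagValue                 # every attribute has a value
    key: Optional[TagToken] = None  # set only for key=val attributes


@dataclass
class Tag:
    name: TagToken
    attrs: List[TagAttr] = field(default_factory=list)
    is_self_closing: bool = False


SOURCE = "{% my_tag [1]|extend:[2, 3] %}"


def tok(text: str) -> TagToken:
    """Build a token for the first occurrence of `text` in the single-line SOURCE."""
    start = SOURCE.index(text)
    return TagToken(text, start, start + len(text), (1, start + 1))


def int_value(text: str) -> TagValue:
    return TagValue(tok(text), ValueKind.INT)


def count_values(value: TagValue) -> int:
    """Count TagValue nodes reachable through list items and filter arguments."""
    total = 1 + sum(count_values(child) for child in value.children)
    for flt in value.filters:
        if flt.arg is not None:
            total += count_values(flt.arg)
    return total


filter_arg = TagValue(tok("[2, 3]"), ValueKind.LIST, children=[int_value("2"), int_value("3")])
list_value = TagValue(
    tok("[1]"),
    ValueKind.LIST,
    children=[int_value("1")],
    filters=[TagValueFilter(tok("extend"), arg=filter_arg)],
)
tag = Tag(name=tok("my_tag"), attrs=[TagAttr(value=list_value)])
```

Here count_values(list_value) visits the outer list, its single item, the filter argument list, and that list's two items.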

The final AST can look like this:

INPUT:

{% my_tag value|lower %}

AST:

Tag {
    name: TagToken {
        token: "my_tag".to_string(),
        start_index: 3,
        end_index: 9,
        line_col: (1, 4),
    },
    attrs: vec![TagAttr {
        key: None,
        value: TagValue {
            token: TagToken {
                token: "value".to_string(),
                start_index: 10,
                end_index: 15,
                line_col: (1, 11),
            },
            children: vec![],
            spread: None,
            filters: vec![TagValueFilter {
                arg: None,
                token: TagToken {
                    token: "lower".to_string(),
                    start_index: 16,
                    end_index: 21,
                    line_col: (1, 17),
                },
                start_index: 15,
                end_index: 21,
                line_col: (1, 16),
            }],
            kind: ValueKind::Variable,
            start_index: 10,
            end_index: 21,
            line_col: (1, 11),
        },
        is_flag: false,
        start_index: 10,
        end_index: 21,
        line_col: (1, 11),
    }],
    is_self_closing: false,
    syntax: TagSyntax::Django,
    start_index: 0,
    end_index: 24,
    line_col: (1, 4),
}
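
The positions in the AST above can be cross-checked with a few lines of Python. This is only a sketch: it assumes that end_index is exclusive and that line_col is 1-based, which is what the example suggests, and it uses a regex to find identifiers instead of the Pest grammar. The helper name line_col() is made up for this example.

```python
import re

source = "{% my_tag value|lower %}"


def line_col(text: str, index: int) -> tuple:
    """Return a 1-based (line, column) pair for a character offset."""
    line = text.count("\n", 0, index) + 1
    last_newline = text.rfind("\n", 0, index)  # -1 when on the first line
    return (line, index - last_newline)


# Map each identifier to (start_index, end_index, line_col).
spans = {
    match.group(): (match.start(), match.end(), line_col(source, match.start()))
    for match in re.finditer(r"[A-Za-z_]\w*", source)
}

for name, span in spans.items():
    print(name, span)
```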

@@ -0,0 +1,749 @@
//! # Django Template Tag Compiler

Another important part is the "tag compiler". This turns the parsed AST into an executable Python function. When this function is called with the Context object, it resolves the inputs to a tag into Python args and kwargs.

from djc_core import parse_tag, compile_tag

ast = parse_tag('{% my_tag var1 ...[2, 3] key=val ...{"other": "x"} / %}')
tag_fn = compile_tag(ast)

args, kwargs = tag_fn({"var1": "hello", "val": "abc"})

assert args == ["hello", 2, 3]
assert kwargs == {"key": "abc", "other": "x"}

How it works is:

  1. We start with the AST of the template tag.

  2. TagAttrs with keys become the function's kwargs, and TagAttrs without keys become the function's args.

  3. For each TagAttr, we walk down its value, and handle each ValueKind differently:

    • Literals - 1, 1.5, "abc", etc - These are compiled as literal Python values
    • Variables - e.g. my_var - we replace that with function call variable(context, "my_var")
    • Filters - my_var|add:"txt" - replaced with function call filter(context, "add", my_var, "txt")
    • Translation _("abc") - function call translation(context, "abc")
    • String with nested template tags, e.g. "Hello {{ first_name }}" - function call template_string(context, "Hello {{ first_name }}")
    • Literal lists and dicts - structure preserved, and we walk down and convert each item, key, value.

    Input:

    {% component my_var|add:"txt" / %}

    Generated function:

    def compiled_func(context, *, template_string, translation, variable, filter):
        args = []
        kwargs = []
        args.append(filter(context, 'add', variable(context, 'my_var'), "txt"))
        return args, kwargs
  4. Apply Django-specific logic

    As you can see, the generated function accepts the definitions for the functions variable(), filter(), etc.

    This means that the implementation for these is defined in Python. So we can still easily change how individual features are handled. These definitions of variable(), etc are NOT exposed to the users of django-components.

    The implementation is defined in django-components, and it looks something like below.

    There you can see e.g. that when the Rust compiler came across a variable my_var, it generated variable(..) call. And the implementation for variable(...) calls Django's Variable(var).resolve(ctx).

    So at the end of the day we're still using the same Django logic to actually resolve variables into actual values.

    def resolve_template_string(ctx: Context, expr: str) -> Any:
        return DynamicFilterExpression(
            expr_str=expr,
            filters=filters,
            tags=tags,
        ).resolve(ctx)
    
    def resolve_filter(_ctx: Context, name: str, value: Any, arg: Any) -> Any:
        if name not in filters:
            raise TemplateSyntaxError(f"Invalid filter: '{name}'")
    
        filter_func = filters[name]
        if arg is None:
            return filter_func(value)
        else:
            return filter_func(value, arg)
    
    def resolve_variable(ctx: Context, var: str) -> Any:
        try:
            return Variable(var).resolve(ctx)
        except VariableDoesNotExist:
            return ""
    
    def resolve_translation(ctx: Context, var: str) -> Any:
        # The compiler gives us the string stripped of the `_("` prefix and the `")` suffix,
        # so we put them back for Django's Variable class to interpret it as a translation.
        translation_var = "_('" + var + "')"
        return Variable(translation_var).resolve(ctx)
    
    args, kwargs = compiled_tag(
        context=context,
        template_string=resolve_template_string,
        variable=resolve_variable,
        translation=resolve_translation,
        filter=resolve_filter,
    )
  5. Call the component with the args and kwargs

    The compiled function returned a list of args and a dict of kwargs. We then simply pass these further to the implementation of the {% component %} node.

    So a template tag like this:

    {% component "my_table" var1 ...[2, 3] key=val ...{"other": "x"} / %}

    Eventually gets resolved to something like so:

    ComponentNode.render("my_table", var1, 2, 3, key=val, other="x")
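
The whole pipeline from steps 1 to 5 can be imitated in a few dozen lines of Python. The sketch below is not the Rust compiler: the tuple-based node format, compile_value(), compile_attrs(), and the FILTERS table are hypothetical stand-ins, the context is a plain dict instead of Django's Context, and only literals, variables and filters are handled. It does follow the same idea, though: generate source code that calls injected helpers, exec() it, and call the result.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical simplified value nodes:
#   ("literal", python_value)
#   ("variable", name)
#   ("filter", inner_node, filter_name, arg_node_or_None)
Attr = Tuple[Optional[str], tuple]


def compile_value(node: tuple) -> str:
    """Turn one value node into a Python expression string."""
    kind = node[0]
    if kind == "literal":
        return repr(node[1])
    if kind == "variable":
        return f"variable(context, {node[1]!r})"
    if kind == "filter":
        _, inner, name, arg = node
        arg_src = "None" if arg is None else compile_value(arg)
        return f"filter(context, {name!r}, {compile_value(inner)}, {arg_src})"
    raise ValueError(f"Unknown node kind: {kind!r}")


def compile_attrs(attrs: List[Attr]) -> Callable[..., Any]:
    """Generate the function source, exec() it, and return the function."""
    lines = [
        "def compiled_func(context, *, variable, filter):",
        "    args = []",
        "    kwargs = []",
    ]
    for key, node in attrs:
        if key is None:
            lines.append(f"    args.append({compile_value(node)})")
        else:
            lines.append(f"    kwargs.append(({key!r}, {compile_value(node)}))")
    lines.append("    return args, kwargs")
    scope: Dict[str, Any] = {}
    exec("\n".join(lines), {}, scope)
    return scope["compiled_func"]


# Stand-ins for the Django-backed helpers.
FILTERS = {"add": lambda value, arg: value + arg, "upper": lambda value: value.upper()}


def resolve_variable(ctx: dict, name: str) -> Any:
    return ctx.get(name, "")


def resolve_filter(ctx: dict, name: str, value: Any, arg: Any) -> Any:
    func = FILTERS[name]
    return func(value) if arg is None else func(value, arg)


# Roughly: {% component my_var|add:"txt" key=val / %}
compiled = compile_attrs([
    (None, ("filter", ("variable", "my_var"), "add", ("literal", "txt"))),
    ("key", ("variable", "val")),
])
args, kwargs = compiled(
    {"my_var": "abc", "val": 1},
    variable=resolve_variable,
    filter=resolve_filter,
)
```

Because the helpers are passed in as keyword arguments, the generated code never touches the context directly, which is what keeps the resolution logic swappable on the Python side.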

Validation

The template tag inputs respect Python's convention of not allowing args after kwargs.

When compiling AST into a Python function, we're able to detect obvious cases and raise an error early, like:

{% component key=val my_var / %}  {# Error! #}

However, some cases can be figured out only at render time, because the spread syntax ...my_var can be used with either a list of args or a dict of kwargs.

So we need to wait for the Context object to figure out whether this:

{% component ...items my_var  / %}

Resolves to a list (OK):

{% component ...[1, 2, 3] my_var  / %}

Or to a dict (Error):

{% component ...{"key": "x"} my_var  / %}

So when we detect that there is a spread within the template tag, we add a render-time function that checks whether the spread resolves to a list or a dict, and raises if it's not permitted:

INPUT:

{% component key1="value1" ...options1 key2="value2" ...options2 / %}

Generated function:

def compiled_func(context, *, template_string, translation, variable, filter):
    def _handle_spread(value, raw_token_str, args, kwargs, kwarg_seen):
        if hasattr(value, "keys"):
            kwargs.extend(value.items())
            return True
        else:
            if kwarg_seen:
                raise SyntaxError("positional argument follows keyword argument")
            try:
                args.extend(value)
            except TypeError:
                raise TypeError(
                    f"Value of '...{raw_token_str}' must be a mapping or an iterable, "
                    f"not {type(value).__name__}."
                )
            return False

    args = []
    kwargs = []
    kwargs.append(('key1', "value1"))
    kwarg_seen = True
    kwarg_seen = _handle_spread(variable(context, 'options1'), """options1""", args, kwargs, kwarg_seen)
    kwargs.append(('key2', "value2"))
    kwarg_seen = _handle_spread(variable(context, 'options2'), """options2""", args, kwargs, kwarg_seen)
    return args, kwargs
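
To see the render-time check in action, the helper can be run on its own. The snippet below is a standalone copy of the spread helper from the generated function (renamed to handle_spread for this example), called with a list, then a dict, then a list again. Like the generated code, it collects kwargs as a list of (key, value) pairs.

```python
def handle_spread(value, raw_token_str, args, kwargs, kwarg_seen):
    """Route a spread value into args or kwargs; return the new kwarg_seen flag."""
    if hasattr(value, "keys"):
        # Mapping: its items become keyword arguments.
        kwargs.extend(value.items())
        return True
    if kwarg_seen:
        # Iterable after a kwarg: same rule as Python function calls.
        raise SyntaxError("positional argument follows keyword argument")
    try:
        args.extend(value)
    except TypeError:
        raise TypeError(
            f"Value of '...{raw_token_str}' must be a mapping or an iterable, "
            f"not {type(value).__name__}."
        ) from None
    return False


args, kwargs = [], []

# A list before any kwarg is fine and extends the positional args.
seen = handle_spread([1, 2, 3], "items", args, kwargs, False)

# A dict extends the kwargs and flips the flag.
seen = handle_spread({"key": "x"}, "options", args, kwargs, seen)

# A list after a kwarg is rejected.
try:
    handle_spread([4], "more", args, kwargs, seen)
    error = None
except SyntaxError as err:
    error = str(err)

# A value that is neither a mapping nor an iterable is rejected too.
try:
    handle_spread(5, "number", [], [], False)
    type_error = None
except TypeError as err:
    type_error = str(err)
```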

}

#[cfg(test)]
mod tests {

The reason why this PR is so huge is mainly the tests. In this file, tag_parser.rs, they take up 7500 lines. There are around 120 tests, in an attempt to cover the interactions between different kinds of values - e.g. filters in lists, lists in filters, etc.

}

impl TagParser {
pub fn parse_tag(input: &str, flags: &HashSet<String>) -> Result<Tag, ParseError> {

This is where the AST generation is implemented.


TContext = TypeVar("TContext")

class ValueKind:

The Python API generated by PyO3 / maturin doesn't include type hints. For a type checker it's a black box, so the Python API has to be re-declared in a .pyi file.
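
For readers who haven't written stubs before, a .pyi file is plain Python syntax with the bodies replaced by "...". The fragment below is only an illustration of that shape, not the stub from this PR: the attribute names are copied from the AST example in the first comment, and the flags parameter with its default is an assumption based on the Rust signature of parse_tag.

```python
from typing import List, Optional, Set, Tuple


class TagToken:
    token: str
    start_index: int
    end_index: int
    line_col: Tuple[int, int]


class TagAttr:
    key: Optional[TagToken]
    is_flag: bool
    start_index: int
    end_index: int
    line_col: Tuple[int, int]


class Tag:
    name: TagToken
    attrs: List[TagAttr]
    is_self_closing: bool


# Assumed signature; the Python-facing parameters may differ.
def parse_tag(input: str, flags: Optional[Set[str]] = None) -> Tag: ...
```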

// The full tag is a sequence of attributes
// E.g. `{% slot key=val key2=val2 %}`
// NOTE: tag_wrapper is used when parsing exclusively a single Django template tag.
tag_wrapper = { SOI ~ django_tag ~ EOI }

@JuroOravec JuroOravec Oct 23, 2025

Here is the definition of the grammar rules. For easier reading, there's also a VSCode extension for syntax highlighting of .pest files.

attributes = tag_or_attrs
func_string = compile_ast_to_string(attributes)
local_scope = {}
exec(func_string, {}, local_scope)

When compiling the AST into a Python function, I call exec() with the generated code.

This is safe because we construct the entire body of the generated function ourselves. We never call or access a variable directly - instead, we call helper functions like variable(...), and we define those implementations.

One place where I had to be careful: for better debugging, when there is a problem with the spread (args after kwargs), the error message includes the original string to point the user to where the error occurred.

So there I wrapped the raw string in """ and escaped all " inside the string to prevent the raw string from terminating the entire expression.
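
A minimal sketch of that kind of escaping is shown below. quote_raw() is a hypothetical helper, not the function from this PR, and it also escapes backslashes, which is an extra assumption on top of what is described above (without it, a raw string ending in a backslash would swallow the closing quotes).

```python
def quote_raw(raw: str) -> str:
    """Wrap `raw` in triple double-quotes so it is a valid Python string literal."""
    # Escape backslashes first, then quotes, so the added backslashes are not doubled.
    escaped = raw.replace("\\", "\\\\").replace('"', '\\"')
    return '"""' + escaped + '"""'


# A raw token that tries to close the literal early and inject code.
raw = 'options"""); import os; ("""'

source = "value = " + quote_raw(raw)
scope = {}
exec(source, {}, scope)

# The text survives as plain data; nothing after the fake closing quotes is executed.
print(scope["value"] == raw)
```

If the exact quoting style doesn't matter, Python's built-in repr() also produces a literal that evaluates back to the same string.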

"""


def compile_tag(tag_or_attrs: Union[Tag, List[TagAttr]]):

Another thing - converting the function code string into an actual Python object is much easier to do on the Python side with exec() than trying to re-construct the entire function body inside Rust (if it's even possible).

So compile_tag() and CompiledFunc are part of the public API of djc_core, but they are defined directly in Python.

Because of that, these two entities had to be exposed and added to __all__ in djc_core/__init__.py.

@JuroOravec JuroOravec marked this pull request as ready for review October 23, 2025 16:49