Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbt templater should be updated to use the same source slicing/mapping algorithm as the Jinja templater (i.e. JinjaTracer) #1783

Closed
barrywhart opened this issue Oct 29, 2021 · 5 comments · Fixed by #1788
Assignees
Labels
dbt Related to Data Build Tool enhancement New feature or request

Comments

@barrywhart
Copy link
Member

In issue #1437 (and the PR #1678), the Jinja templater was updated with a new, more robust source slicing/mapping algorithm. The dbt templater should be updated to use the same algorithm.

@barrywhart barrywhart added the enhancement New feature or request label Oct 29, 2021
@barrywhart
Copy link
Member Author

I captured a call stack for the SQLFluff dbt templater -> dbt -> Jinja:

  /Users/bhart/dev/sqlfluff/plugins/sqlfluff-templater-dbt/sqlfluff_templater_dbt/templater.py(383)_unsafe_process()
-> manifest=self.dbt_manifest,
  /Users/bhart/.pyenv/versions/3.7.9/envs/sqlfluff-3.7.9/lib/python3.7/site-packages/dbt/compilation.py(509)compile_node()
-> node = self._compile_node(node, manifest, extra_context)
  /Users/bhart/.pyenv/versions/3.7.9/envs/sqlfluff-3.7.9/lib/python3.7/site-packages/dbt/compilation.py(413)_compile_node()
-> node,
> /Users/bhart/.pyenv/versions/3.7.9/envs/sqlfluff-3.7.9/lib/python3.7/site-packages/dbt/clients/jinja.py(556)get_rendered()
-> def get_rendered(

@barrywhart
Copy link
Member Author

From dbt/clients/jinja.py, these are the functions that interact closely with Jinja:

def get_template(
    string: str,
    ctx: Dict[str, Any],
    node=None,
    capture_macros: bool = False,
    native: bool = False,
):
    with catch_jinja(node):
        env = get_environment(node, capture_macros, native=native)

        template_source = str(string)
        return env.from_string(template_source, globals=ctx)

def render_template(template, ctx: Dict[str, Any], node=None) -> str:
    with catch_jinja(node):
        return template.render(ctx)

Nothing too exotic looking.

@barrywhart
Copy link
Member Author

The main technical challenge for this work is that dbt/compilation.py does not expose the Jinja environment and context through a public class or function. Instead, it builds the context and calls the above functions in the same function, _compile_node():

    def _compile_node(
        self,
        node: ManifestNode,
        manifest: Manifest,
        extra_context: Optional[Dict[str, Any]] = None,
    ) -> NonSourceCompiledNode:
        if extra_context is None:
            extra_context = {}

        logger.debug("Compiling {}".format(node.unique_id))

        data = node.to_dict(omit_none=True)
        data.update({
            'compiled': False,
            'compiled_sql': None,
            'extra_ctes_injected': False,
            'extra_ctes': [],
        })
        compiled_node = _compiled_type_for(node).from_dict(data)

        context = self._create_node_context(
            compiled_node, manifest, extra_context
        )

        compiled_node.compiled_sql = jinja.get_rendered(
            node.raw_sql,
            context,
            node,
        )

        compiled_node.relation_name = self._get_relation_name(node)

        compiled_node.compiled = True

        # add ctes for specific test nodes, and also for
        # possible future use in adapters
        compiled_node = self._add_ctes(
            compiled_node, manifest, extra_context
        )

        return compiled_node

dbt is a fast-moving, relatively young project. Rather than rely on dbt internals, I think the best option may be to monkeypatch Jinja to "intercept" the calls to env.from_string() and template.render(ctx). It's not pretty, but if dbt continues using the same two functions to call Jinja, it will be pretty stable.

We might also consider reaching out to the dbt community and/or creating a PR to basically split _compile_node() into two functions:

  • Build the Jinja "context" dictionary
  • Build and compile template

and to expose the first function publicly for use by SQLFluff.

@barrywhart
Copy link
Member Author

What do you think about the proposal here, @alanmcruickshank?

@barrywhart barrywhart changed the title dbt templater should be updated to use the same source slicing/mapping algorithm as the Jinja templater (i.e. TemplateTracer) dbt templater should be updated to use the same source slicing/mapping algorithm as the Jinja templater (i.e. JinjaTracer) Oct 31, 2021
@alanmcruickshank
Copy link
Member

alanmcruickshank commented Oct 31, 2021

What do you think about the proposal here, @alanmcruickshank?

I think your idea of monkey patching feels the most pragmatic for now. I can ask the dbt crew about their internal API but if I was them I wouldn't want someone depending on the internal API like that.

The parts of the dbt templater that already reference internals are the ones that often need changing so relying on a more stable project like Jinja to do the patching feels good.

This has the added benefit that we can do something similar if any other tools also rely on Jinja in the future 👍

@tunetheweb tunetheweb added the dbt Related to Data Build Tool label Oct 31, 2021
@barrywhart barrywhart self-assigned this Nov 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dbt Related to Data Build Tool enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants