Feat: enable dbt transpilation #2806

Merged: izeigerman merged 7 commits into main from trey/dbt-transpilation on Jun 24, 2024

@treysp commented Jun 21, 2024

This PR allows dbt project models and tests to be written in a different SQL dialect than the target engine on which the project is being run.

A project has a default dialect: all models/tests without an explicitly configured dialect are assumed to be written in it. If a dbt project does not explicitly specify a default dialect, the target engine's dialect is the default.
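
The fallback rule above can be sketched in a few lines. This is only an illustration of the rule as described here, not the code in this PR; the function names `project_default_dialect` and `model_dialect` are hypothetical.

```python
from typing import Optional


def project_default_dialect(configured_default: Optional[str], target_dialect: str) -> str:
    """Dialect assumed for models/tests that do not configure one (hypothetical helper)."""
    # An explicitly configured project default wins; otherwise use the target engine's dialect.
    return configured_default or target_dialect


def model_dialect(explicit_dialect: Optional[str], project_default: str) -> str:
    """Dialect a single model/test is assumed to be written in (hypothetical helper)."""
    # A dialect configured on the model/test itself overrides the project default.
    return explicit_dialect or project_default
```

For example, a project targeting Snowflake with no configured default would treat unconfigured models as Snowflake SQL, while a model that sets its own dialect keeps it.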

To use a default dialect other than the target engine's, set it in the model_defaults argument in the project's config.py file:

  • Example: `config = sqlmesh_config(Path(__file__).parent, model_defaults=ModelDefaultsConfig(dialect="bigquery"))`
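
For context, a complete config.py might look roughly like the sketch below. The import paths are an assumption based on SQLMesh's usual dbt setup and are not shown in this PR description, so verify them against the installed version.

```python
from pathlib import Path

# Assumed import locations; check them against your SQLMesh version.
from sqlmesh.core.config import ModelDefaultsConfig
from sqlmesh.dbt.loader import sqlmesh_config

# Models/tests without an explicit dialect are treated as BigQuery SQL,
# regardless of which engine the dbt target points at.
config = sqlmesh_config(
    Path(__file__).parent,
    model_defaults=ModelDefaultsConfig(dialect="bigquery"),
)
```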

Specify dialects for individual models/tests in the standard way:

  • Configuration jinja at top of file: {{ config(dialect='bigquery') }}
  • YAML configuration files in file directories
  • dbt_project.yml
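
As a rough illustration of how a dialect could be picked when it may appear in any of these three places, the sketch below checks them from most to least specific. It assumes dbt's usual precedence (in-file config over YAML properties over dbt_project.yml); the function `lookup_model_dialect` and its arguments are hypothetical and do not reflect SQLMesh's loader code.

```python
from typing import Mapping, Optional


def lookup_model_dialect(
    in_file_config: Mapping[str, str],
    yaml_properties: Mapping[str, str],
    project_yaml: Mapping[str, str],
    default_dialect: str,
) -> str:
    """Return the first dialect found, from most to least specific source."""
    # Assumed order: {{ config(...) }} in the model file, then YAML
    # properties files, then dbt_project.yml.
    for layer in (in_file_config, yaml_properties, project_yaml):
        dialect: Optional[str] = layer.get("dialect")
        if dialect:
            return dialect
    # Nothing configured for this model/test: use the project default.
    return default_dialect
```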

Limitations

  • SQLMesh allows kind-specific parameters like time_column to be specified in a different dialect than the model query itself. In dbt, kind-specific parameters are parsed with the model's dialect.

Contributor

Since we now need to handle dialect in the jinja, would it make sense to always pass in dialect here? It would simplify create_builtin_globals code.

Contributor

Also, do we want to pass in the default dialect? Wouldn't it make more sense to pass in the model dialect during to_sqlmesh in the model?

Contributor Author

We can't pass in self.default_dialect because target hasn't been set the first time this is called, so it would raise an error.

I'm not following your second comment - where in to_sqlmesh can we modify the globals? Here?
https://github.com/TobikoData/sqlmesh/blob/b7c659e66b6eab5ec30a6f5d341937d8746de683/sqlmesh/dbt/model.py#L382

Contributor

@crericha I think this is only relevant in the context of dbt projects and doesn't apply to native projects, so not sure whether we need the dialect as a top-level argument.

Contributor @crericha, Jun 21, 2024

I'm not sure this is needed. Aren't we creating directly into the generic AST form and the sqlglot generator will handle transpiling for us? How is partitioned_by parsing handled for a native sqlmesh model?

Contributor Author

exp.to_column(val) is parsing the raw column string from the model definition and returning an expression. If someone includes dialect-specific characters, I think it will choke.

cluster_by also parses column names just below and passes a dialect, so maybe the missing dialect here is an oversight?
https://github.com/TobikoData/sqlmesh/blob/b7c659e66b6eab5ec30a6f5d341937d8746de683/sqlmesh/dbt/model.py#L377

Either way, this initial implementation is messy, so I'll clean it up and we can remove it if it's not necessary.
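
To illustrate the concern about dialect-specific characters in raw column strings, here is a small standalone sketch. It does not use sqlglot and is not how exp.to_column works; the table `QUOTE_CHARS` and the helper `unquote_column` are hypothetical, and the table covers only a few dialects.

```python
# Identifier quote characters for a few dialects (illustrative, not exhaustive).
QUOTE_CHARS = {
    "bigquery": ("`", "`"),
    "mysql": ("`", "`"),
    "postgres": ('"', '"'),
    "snowflake": ('"', '"'),
    "tsql": ("[", "]"),
}


def unquote_column(raw: str, dialect: str) -> str:
    """Strip the dialect's identifier quotes from a raw column string, if present."""
    # Unknown dialects fall back to double quotes in this sketch.
    open_q, close_q = QUOTE_CHARS.get(dialect, ('"', '"'))
    name = raw.strip()
    if len(name) >= 2 and name.startswith(open_q) and name.endswith(close_q):
        return name[1:-1]
    # Quotes from another dialect are left in place, which is the failure mode
    # a dialect-unaware parser would hit.
    return name
```

Reading a backtick-quoted BigQuery column under the wrong dialect leaves the backticks in the name, which is why passing the model's dialect to the parser matters.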

@treysp treysp force-pushed the trey/dbt-transpilation branch 2 times, most recently from c1ffc0f to 324b6b1 Compare June 21, 2024 21:24
@treysp treysp force-pushed the trey/dbt-transpilation branch from 324b6b1 to 9ae6ebd Compare June 21, 2024 21:26
@treysp treysp force-pushed the trey/dbt-transpilation branch from 9f2ef2a to 738fa58 Compare June 21, 2024 22:13
@treysp treysp force-pushed the trey/dbt-transpilation branch from 0d68419 to 738fa58 Compare June 24, 2024 03:04
@izeigerman izeigerman force-pushed the trey/dbt-transpilation branch from b4b3c54 to 370ae0e Compare June 24, 2024 19:13
@izeigerman izeigerman force-pushed the trey/dbt-transpilation branch from 370ae0e to 6ccf7b9 Compare June 24, 2024 20:02
@izeigerman izeigerman merged commit e0f5dcc into main Jun 24, 2024
@izeigerman izeigerman deleted the trey/dbt-transpilation branch June 24, 2024 21:47

4 participants