Support for BigQuery User-Defined Functions (UDFs) #1289

Open
dirkjonker opened this issue Nov 25, 2021 · 6 comments

Comments

@dirkjonker

dirkjonker commented Nov 25, 2021

It would be useful to be able to create User-Defined Functions (UDFs) from Dataform and to reference them the same way you reference a table. Right now I create UDFs in the pre_operations section, but I get no syntax highlighting and have no way to test them. Calling the UDFs from a query is also a bit hacky, because you need to fully qualify the function name (project_name.dataset.function_name) to make sure you are calling the right function.

Bonus for being able to unit-test JavaScript UDFs!

@lewish
Collaborator

lewish commented Nov 26, 2021

You can kind of do this already:

// my_udf.sqlx
config {
  type: "operations",
  hasOutput: true,
}

create function ${self()}(x INT64) ...

And then you can reference it:

// my_table.sqlx
config { type: "table" }
select ${ref("my_udf")}(column) from ...

This is a bit of a workaround, but it does make the UDFs part of the graph, because they are stored as permanent functions inside BigQuery.

This doesn't address your point about unit testing, however!
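
On the unit-testing point, one possible approach is to keep the JavaScript body of the UDF as a string in a regular JavaScript file, so the same text can be exercised locally under Node and also embedded in the `create function` statement. The sketch below is only an illustration: `slugifyBody`, `slugify`, and `createFunctionSql` are hypothetical names that are not part of Dataform, and it assumes the body uses nothing that is unavailable outside BigQuery's JavaScript runtime.

```javascript
// Hypothetical example: the body of a BigQuery JavaScript UDF kept as a string.
// BigQuery wraps this text in a function whose parameter is named `s`.
const slugifyBody = `
  return String(s)
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
`;

// Build a local function from the same text so it can be tested under Node.
const slugify = new Function("s", slugifyBody);

// Render the DDL for a persistent UDF. `qualifiedName` is assumed to be
// already quoted, for example the value of self() in an operations file.
function createFunctionSql(qualifiedName) {
  return [
    `CREATE OR REPLACE FUNCTION ${qualifiedName}(s STRING)`,
    "RETURNS STRING",
    'LANGUAGE js AS r"""' + slugifyBody + '""";',
  ].join("\n");
}

console.log(slugify("  Hello, World! ")); // hello-world
```

Because the local function and the generated statement share one string, a passing local test exercises the same code that BigQuery will run, although differences between Node and BigQuery's JavaScript engine are not covered by this kind of test.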

@dirkjonker
Author

dirkjonker commented Nov 30, 2021

@lewish thanks, that's really useful!

Now I'm just missing the syntax highlighting and automatic formatting that respects the JavaScript in the UDF string ;)

@dirkjonker
Author

dirkjonker commented Nov 30, 2021

I do get query validation warnings: "Some dependencies don't yet exist in the warehouse". It seems like Dataform doesn't know that the functions exist, even though it can successfully create them. It's just a warning though, so it doesn't block any dependent resources from being created, but it's a false warning which is inconvenient.

@sanimesa

Just tried this, but dataform did not recognize the function until it was executed and persisted in the database.

@davidsr2r

I was looking into this as a data engineer at a company that uses dbt, where I helped develop and maintain our dbt tooling. This would be a fairly compelling reason to switch from dbt, since dbt also lacks this feature. We use UDFs to share functionality across models.
If Dataform had first-class UDF support and a proper dry-run feature (dry running incremental models, for example), it'd be an easy sell to make the switch.

@Ekrekr
Contributor

Ekrekr commented Mar 27, 2024

To expand on this, I would love to see some examples of functionality that can't be achieved without JavaScript UDFs.

Having logic computed at runtime in BigQuery removes a lot of the benefits of the determinism of a compiled Dataform graph, so I'm trying to better understand the motivation for them.
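
To make the compile-time versus runtime distinction concrete, here is a minimal sketch of the compile-time alternative: a plain JavaScript helper that returns a SQL expression as text, so the logic is expanded into the compiled query instead of being called as a function inside BigQuery. The name `slugifySql` is hypothetical, and the expression is only one possible way to write this in BigQuery SQL.

```javascript
// Hypothetical compile-time helper: returns SQL text, runs no query itself.
// `columnExpr` is a SQL expression (for example a column name) supplied by the caller.
function slugifySql(columnExpr) {
  // Lowercase, collapse runs of non-alphanumerics into "-", strip edge dashes.
  return `TRIM(REGEXP_REPLACE(LOWER(TRIM(${columnExpr})), r'[^a-z0-9]+', '-'), '-')`;
}

console.log(`select ${slugifySql("title")} as slug from my_dataset.posts`);
```

A helper like this covers logic that SQL can express. Cases that need loops, recursion, or an external JavaScript library are where a runtime UDF may still be required.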
