ENH: Add JavaScript UDFs for BigQuery #1377
7d5d9dd
to
139e572
Compare
    def generate_setup_queries(self):
        result = list(
            map(partial(BigQueryUDFDefinition, context=self.context),
                lin.traverse(find_bigquery_udf, self.expr)))
"A local variable wouldn't kill you here" :)
it might :)
@@ -291,10 +291,11 @@ def _build_ast(self, expr, context):
         result = comp.build_ast(expr, context)
         return result

-    def _execute_query(self, ddl, async=False):
+    def _execute_query(self, dml, async=False):
Did you mean sql, like above?
Nope, dml is coming in here.
Actually I think you're right. Let me take another look.
ibis/bigquery/compiler.py
Outdated
def find_bigquery_udf(expr):
    if isinstance(expr.op(), BigQueryUDFNode):
        return lin.halt, expr
This will find only the topmost BigQueryUDFNodes in the hierarchy. Is that the expected behaviour?
Nope. Fixing.
Fixed. Multiple UDFs now tested and working
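For readers following this thread, here is a minimal sketch of why the control signal matters. It does not use ibis; `Node`, `traverse`, `HALT` and `PROCEED` are hypothetical stand-ins modelled on the names in the diff, and the real `lin.traverse` may behave differently in its details. The point is only that a visitor returning a halt signal on a match never sees UDFs nested inside that match.

```python
# Hypothetical control signals, named after lin.halt / lin.proceed in the diff.
HALT = "halt"
PROCEED = "proceed"


class Node:
    """Toy expression node; is_udf marks nodes standing in for UDF calls."""

    def __init__(self, name, children=(), is_udf=False):
        self.name = name
        self.children = list(children)
        self.is_udf = is_udf


def traverse(visitor, root):
    """Depth-first walk; the visitor decides whether to descend further."""
    stack = [root]
    results = []
    while stack:
        node = stack.pop()
        control, result = visitor(node)
        if result is not None:
            results.append(result.name)
        if control == PROCEED:
            # Push children reversed so they are visited left to right.
            stack.extend(reversed(node.children))
    return results


def find_udf_halting(node):
    # Stops at the first UDF on each branch, so nested UDFs are missed.
    if node.is_udf:
        return HALT, node
    return PROCEED, None


def find_udf_proceeding(node):
    # Keeps descending after a match, so nested UDFs are also collected.
    if node.is_udf:
        return PROCEED, node
    return PROCEED, None


inner = Node("inner_udf", is_udf=True)
outer = Node("outer_udf", children=[inner], is_udf=True)
root = Node("projection", children=[outer])

halting_result = traverse(find_udf_halting, root)
proceeding_result = traverse(find_udf_proceeding, root)
```

With a UDF applied to the output of another UDF, the halting visitor collects only the outer one, while the proceeding visitor collects both.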
ibis/bigquery/compiler.py
Outdated
-    (query, rest) = (ast.queries[0], ast.queries[1:])
-    assert not rest
+    query, rest = (ast.queries[0], ast.queries[1:])
+    assert not rest, '*rest should be empty in bigquery.compiler._get_query'
Does it mean that multi-statement queries are not supported?
I'm removing this function from this module.
ibis/bigquery/udf/__init__.py
Outdated
@@ -0,0 +1 @@
+from ibis.bigquery.udf.udf import udf  # noqa: F401
I think we should avoid namespaces like that.
Ok, I'll rename the udf module to core.
ibis/bigquery/udf/udf.py
Outdated
from ibis.bigquery.compiler import compiles, BigQueryUDFNode


ibis_type_to_bigquery_type = Dispatcher('ibis_type_to_bigquery_type')
We could reuse this conversion in ibis.bigquery.client. The other backends store the datatype conversion logic in ibis.{backend}.client, which we should factor out (probably) to ibis.{backend}.datatypes in a followup PR.
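As a rough illustration of what a factored-out datatypes module could look like, here is a sketch of type-directed conversion. It swaps in the standard library's `functools.singledispatch` for the `Dispatcher` used in the diff so that it runs without extra dependencies, and the `DataType` classes below are hypothetical stand-ins, not ibis's own datatype classes. Only the BigQuery type names (`INT64`, `FLOAT64`, `STRING`, `ARRAY<...>`) are taken as given.

```python
from functools import singledispatch


# Hypothetical stand-ins for a backend-independent type hierarchy.
class DataType:
    pass


class Int64(DataType):
    pass


class Float64(DataType):
    pass


class String(DataType):
    pass


class Array(DataType):
    def __init__(self, value_type):
        self.value_type = value_type


@singledispatch
def to_bigquery_type(dtype):
    """Fallback for types that have no registered conversion."""
    raise NotImplementedError(
        'No BigQuery type registered for {}'.format(type(dtype).__name__))


@to_bigquery_type.register(Int64)
def _int64_to_bigquery(dtype):
    return 'INT64'


@to_bigquery_type.register(Float64)
def _float64_to_bigquery(dtype):
    return 'FLOAT64'


@to_bigquery_type.register(String)
def _string_to_bigquery(dtype):
    return 'STRING'


@to_bigquery_type.register(Array)
def _array_to_bigquery(dtype):
    # Recurse so the element type is converted by its own rule.
    return 'ARRAY<{}>'.format(to_bigquery_type(dtype.value_type))


array_of_ints = to_bigquery_type(Array(Int64()))
```

Keeping one registered function per type means the client and the UDF code could both import the same converter instead of each holding its own mapping.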
ibis/bigquery/udf/udf.py
Outdated
    return textwrap.indent(text, ' ' * spaces)


def udf(input_type, output_type, strict=True):
How about moving this to ibis.bigquery.udf.api or something?
done
ibis/bigquery/udf/udf.py
Outdated
    )


class PythonToJavaScriptTranslator:
How about moving this to transpiler.py along with the find utility in find.py? Find is only used during translation (maybe Rewriter too).
ibis/client.py
Outdated
@@ -221,15 +218,12 @@ def execute(self, expr, params=None, limit='default', async=False,
             Scalar expressions: Python scalar value
         """
         ast = self._build_ast_ensure_limit(expr, limit, params=params)
         result = self._execute_query(ast, async=async, **kwargs)
ast has become confusing now that the js2py transpiler is introduced; we might prefer query_ast here.
renamed.
ibis/sql/alchemy.py
Outdated
@@ -683,6 +670,18 @@ def __init__(self, *args, **kwargs):
         super(AlchemyContext, self).__init__(*args, **kwargs)
         self._table_objects = {}

+    def collapse(self, queries):
+        if not isinstance(queries, six.string_types):
Maybe the following is more idiomatic:
    if isinstance(queries, six.string_types):
        return queries
    if len(queries) > 1:
        raise ValueError('Multi-statement queries not supported')
    return queries[0]
yep, agree.
done
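A standalone sketch of the early-return shape suggested above, for anyone who wants to try it outside the class. It assumes Python 3, so plain `str` replaces `six.string_types`, and it is written as a free function rather than the `AlchemyContext.collapse` method; the final code in the PR may differ.

```python
def collapse(queries):
    """Return a single query string, rejecting multi-statement input."""
    # A bare string is already a single query, so pass it through.
    if isinstance(queries, str):
        return queries
    if len(queries) > 1:
        raise ValueError('Multi-statement queries not supported')
    # Exactly one query is expected here; an empty list raises IndexError.
    return queries[0]


single_from_string = collapse('SELECT 1')
single_from_list = collapse(['SELECT 1'])
```

Raising an exception instance rather than a bare string matters: Python 3 rejects `raise 'message'` with a `TypeError`, which would hide the intended error.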
ibis/bigquery/compiler.py
Outdated
def find_bigquery_udf(expr):
    if isinstance(expr.op(), BigQueryUDFNode):
        result = expr
        return lin.proceed, expr
I guess this is not required.
13b71ee
to
96c9ec0
Compare
eadf4f8
to
ae70d3f
Compare
ca89eb9
to
c7538e6
Compare

No description provided.