Add support for Lambda functions to parser #1313

Merged
merged 7 commits into from
Jan 22, 2021

Mytherin
Collaborator

This PR adds basic support for lambda functions to the parser. They are not supported anywhere else yet and not bound yet, but the plan is to use them later on in functions that can apply to lists.

Lambda expressions look like this:

class LambdaExpression : public ParsedExpression {
public:
	string capture_name;
	unique_ptr<ParsedExpression> expression;
};

Example syntax:

SELECT map(i, x -> x + 1) FROM (VALUES (list_value(1, 2, 3))) tbl(i);

@hannes
Member

hannes commented Jan 21, 2021

CC @mbasmanova

@mbasmanova

Mark, this is great. To confirm, does this PR include support for multiple arguments for lambda, e.g. map_filter(m, (k, v) -> k > 10 AND v < 0) and does it include support for captures, e.g. filter(array_column, x -> x > int_column)?

@Mytherin
Collaborator Author

It supports only single argument captures right now, e.g. filter(array_column, x -> x > column) works, but map_filter(m, (k, v) -> k > 10 and v < 0) does not. I can have a look at extending the lambdas to support multiple captures.

@Mytherin
Collaborator Author

As for captures, the parser will not do anything besides transforming the expression; it is up to the binder to actually resolve columns. For example, filter(array_column, x -> x > int_column) will pass the parser just fine and generate a lambda expression containing the following:

capture_name: x
expression: `Comparison(Column(x), Column(int_column), GREATER_THAN)`

The binder is then in charge of resolving "x" back to the lambda, and "int_column" to another data source (e.g. a table).
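The resolution step described above could be sketched roughly as follows. This is only an illustration, not DuckDB's binder: `RefKind` and `ResolveName` are hypothetical names, and the rule that a lambda parameter shadows a table column of the same name is an assumption made for the sketch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical classification of a column reference found in a lambda body.
enum class RefKind { LAMBDA_PARAMETER, OUTER_COLUMN, UNRESOLVED };

// Hypothetical helper (not part of DuckDB): decides what a name refers to.
// Assumption: lambda parameters are checked first, so they shadow table columns.
RefKind ResolveName(const std::string &name,
                    const std::vector<std::string> &lambda_parameters,
                    const std::vector<std::string> &table_columns) {
	auto contains = [&](const std::vector<std::string> &names) {
		return std::find(names.begin(), names.end(), name) != names.end();
	};
	if (contains(lambda_parameters)) {
		return RefKind::LAMBDA_PARAMETER; // e.g. "x" in x -> x > int_column
	}
	if (contains(table_columns)) {
		return RefKind::OUTER_COLUMN; // e.g. "int_column", a capture
	}
	return RefKind::UNRESOLVED; // a real binder would report an error here
}
```

With parameters `{x}` and table columns `{array_column, int_column}`, the name `x` resolves to the lambda parameter and `int_column` resolves to an outer column.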

@hannes
Member

hannes commented Jan 21, 2021

+optional type specification for lambda

@mbasmanova

Mark, thanks for explaining. To clarify the terminology, the lambda in filter(array_column, x -> x > int_column) has a single capture, int_column, and a single argument, x. In map_filter(m, (k, v) -> k > 10 and v < 0) we have a lambda with no captures and two arguments: k and v. Hence, I think we should rename capture_name above to something like argument_names and make it a vector, not a single value.

@Mytherin
Collaborator Author

That makes a lot of sense; will do. Thanks for the feedback!

@Mytherin
Collaborator Author

All the changes are implemented now; lambda expressions now look like this:

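class LambdaExpression : public ParsedExpression {
public:
	// Names of the lambda parameters, e.g. { x, y } for (x, y) -> x + y
	vector<string> parameters;
	// Body of the lambda, e.g. x + y
	unique_ptr<ParsedExpression> expression;
};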
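To make the shape of this node concrete, here is a small self-contained sketch. It is not DuckDB's code: `ParsedExpression` is reduced to a stand-in that only stores SQL text, and the constructors and `ToString` methods are assumptions added for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for DuckDB's ParsedExpression: it only stores SQL text.
struct ParsedExpression {
	explicit ParsedExpression(std::string text_p) : text(std::move(text_p)) {}
	virtual ~ParsedExpression() = default;
	virtual std::string ToString() const { return text; }
	std::string text;
};

// Sketch of a lambda node holding a parameter list and a body expression.
struct LambdaExpression : public ParsedExpression {
	LambdaExpression(std::vector<std::string> parameters_p,
	                 std::unique_ptr<ParsedExpression> expression_p)
	    : ParsedExpression(""), parameters(std::move(parameters_p)),
	      expression(std::move(expression_p)) {}

	// Renders the node in the same style as the SQL comments in this thread.
	std::string ToString() const override {
		std::string result = "lambda: parameters { ";
		for (std::size_t i = 0; i < parameters.size(); i++) {
			if (i > 0) {
				result += ", ";
			}
			result += parameters[i];
		}
		result += " }, function: " + expression->ToString();
		return result;
	}

	std::vector<std::string> parameters;
	std::unique_ptr<ParsedExpression> expression;
};
```

Constructing the node with parameters `{x, y}` and a body whose text is `x + y` makes `ToString()` return `lambda: parameters { x, y }, function: x + y`.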

I also fixed several operator precedence rules so that the lambda arrow binds more loosely than the other operators. As a result, e.g. x -> x + 1 AND y + 1 is correctly parsed as a lambda whose body is x + 1 AND y + 1, without requiring brackets.

select map(i, (x, y) -> x + y) from tbl;
-- lambda: parameters { x, y }, function: x + y
select map(i, x -> x + 1) from (values (list_value(1, 2, 3))) tbl(i);
-- lambda: parameters { x }, function: x + 1
select map(i, x -> x + 1 AND y + 1) from (values (list_value(1, 2, 3))) tbl(i);
-- lambda: parameters { x }, function: x + 1 AND y + 1
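
The precedence behaviour in the last example can be illustrated with a tiny precedence-climbing parser. This is a sketch under stated assumptions, not the grammar used by DuckDB's parser: the precedence numbers, the function names, and the restriction to bare identifiers, numbers, `+`, `<`, `>`, uppercase `AND`, and `->` (no brackets, no parameter lists) are all choices made for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Splits the input into identifiers/numbers, "->", and single-character operators.
std::vector<std::string> Tokenize(const std::string &input) {
	std::vector<std::string> tokens;
	std::size_t i = 0;
	auto is_word = [](char ch) {
		return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
	};
	while (i < input.size()) {
		if (std::isspace(static_cast<unsigned char>(input[i]))) {
			i++;
		} else if (is_word(input[i])) {
			std::size_t start = i;
			while (i < input.size() && is_word(input[i])) {
				i++;
			}
			tokens.push_back(input.substr(start, i - start));
		} else if (input.compare(i, 2, "->") == 0) {
			tokens.push_back("->");
			i += 2;
		} else {
			tokens.push_back(std::string(1, input[i]));
			i++;
		}
	}
	return tokens;
}

// Assumed binding strengths: a higher number binds tighter; "->" is the loosest.
int Precedence(const std::string &op) {
	if (op == "->") return 1;
	if (op == "AND") return 2;
	if (op == "<" || op == ">") return 3;
	if (op == "+") return 4;
	return 0; // not a binary operator
}

// Precedence climbing: returns a fully parenthesized rendering of the input.
std::string ParseExpression(const std::vector<std::string> &tokens, std::size_t &pos,
                            int min_precedence) {
	if (pos >= tokens.size()) {
		return ""; // missing operand; a real parser would raise an error
	}
	std::string left = tokens[pos++]; // operand: identifier or number
	while (pos < tokens.size()) {
		const std::string op = tokens[pos];
		int prec = Precedence(op);
		if (prec == 0 || prec < min_precedence) {
			break;
		}
		pos++;
		// "->" is treated as right-associative, the other operators as left-associative
		int next_min = (op == "->") ? prec : prec + 1;
		std::string right = ParseExpression(tokens, pos, next_min);
		left = "(" + left + " " + op + " " + right + ")";
	}
	return left;
}

std::string Parenthesize(const std::string &input) {
	std::vector<std::string> tokens = Tokenize(input);
	std::size_t pos = 0;
	return ParseExpression(tokens, pos, 1);
}
```

Because the arrow has the lowest precedence in this sketch, `Parenthesize("x -> x + 1 AND y + 1")` returns `(x -> ((x + 1) AND (y + 1)))`: everything to the right of the arrow becomes the lambda body.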

@mbasmanova

Nice. Thank you, Mark.

…end_parser_scan.cpp to avoid triggering R CRAN warnings
@Mytherin Mytherin merged commit f79660c into duckdb:master Jan 22, 2021
@Mytherin Mytherin deleted the lambdas branch February 18, 2021 14:22
3 participants