Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Qualify macros only when called, and throw error on recursive macros #8224

Merged
merged 7 commits into from
Aug 1, 2023

Conversation

lnkuiper
Copy link
Contributor

Fixes #8141

@github-actions github-actions bot marked this pull request as draft July 14, 2023 11:29
@lnkuiper lnkuiper marked this pull request as ready for review July 14, 2023 11:32
Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Looks good - one more comment:

case ExpressionClass::SUBQUERY: {
// replacing parameters within a subquery is slightly different
auto &sq = (expr->Cast<SubqueryExpression>()).subquery;
ParsedExpressionIterator::EnumerateQueryNodeChildren(*sq->node, [&](unique_ptr<ParsedExpression> &child) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this will not work with TABLE macros, correct? i.e. this will still fail:

create macro m1(a) as a+1;
create macro m2(a) as table select m1(a);
create or replace macro m1(a) as (from m2(42));
select m1(42);

I think it might make sense to have an expression recursion limit in the binder similar to what we have in the transformer (look for StackCheck). Macros can also be used to circumvent the expression depth limit otherwise which can lead to a stack overflow.

@github-actions github-actions bot marked this pull request as draft July 17, 2023 08:53
@lnkuiper
Copy link
Contributor Author

I've implemented StackCheck in ExpressionBinder, seems to work well. I've set the maximum depth to 128. With higher depths, we get stack overflows instead. I hope 128 is OK.

Also, I found a bug unrelated (I think) to this PR:

$ duckdb
-- Loading resources from /Users/laurens/.duckdbrc
v0.8.1-dev1402 aada97ec5f
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D drop macro sum;
ABORT THROWN BY INTERNAL EXCEPTION: Database "system" not found
^C[1]    53269 abort (core dumped)  duckdb

@lnkuiper lnkuiper marked this pull request as ready for review July 17, 2023 08:56
@github-actions github-actions bot marked this pull request as draft July 17, 2023 14:10
@lnkuiper lnkuiper marked this pull request as ready for review July 17, 2023 14:10
@Alex-Monahan
Copy link
Contributor

I think Python has a 10,000 level default on the stack and it can be edited with a parameter, in case that is helpful!

@Mytherin
Copy link
Collaborator

I've implemented StackCheck in ExpressionBinder, seems to work well. I've set the maximum depth to 128. With higher depths, we get stack overflows instead. I hope 128 is OK.

Also, I found a bug unrelated (I think) to this PR:

$ duckdb
-- Loading resources from /Users/laurens/.duckdbrc
v0.8.1-dev1402 aada97ec5f
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D drop macro sum;
ABORT THROWN BY INTERNAL EXCEPTION: Database "system" not found
^C[1]    53269 abort (core dumped)  duckdb

Could you open an issue with that bug?

@lnkuiper
Copy link
Contributor Author

Could you open an issue with that bug?

Done #8287

I've also investigated why I could only set the limit to 128, and it's because there's a lot of recursive calls in the Binder, especially when binding table macro's. This is one recursive binding iteration:

ExpressionBinder::BindExpression
BaseSelectBinder::BindExpression
ExpressionBinder::BindMacro
ExpressionBinder::BindExpression
ExpressionBinder::BindExpression
BaseSelectBinder::BindExpression
ExpressionBinder::Bind
ExpressionBinder::Bind
Binder::BindSelectNode
Binder::BindNode
Binder::BindNode
Binder::Bind
Binder::Bind
Binder::BindNode
Binder::BindNode
ExpressionBinder::BindExpression

In this cycle, the stack counter of StackCheck is incremented once (in the main switch in expression_binder.cpp), despite there being 16 function calls.

@Mytherin
Copy link
Collaborator

I wonder if there is any place where a large amount of data is placed onto the stack (e.g. a large array)? 128 does seem very low for a limit. As Alex mentioned Python defaults to 10K, SQLite defaults to 1K.

We might also want to add more recursion checking in e.g. Binder::BindSelectNode as well.

@lnkuiper
Copy link
Contributor Author

I don't think any large stack allocations are being done. For every pass through the main switch in ExpressionBinder, we have 16 function calls until we increment StackCheck again, but only for this specific test case with the recursive table macro. Small stack allocations are being done in some of the 16 function calls.

I've tried again and increased the limit to 425 (that's as far as I went with my binary search). Considering we have 16 function calls per increment, that's a stack depth 425 * 16 = 6800. If we want, we could consider increasing the stack size at compile-time. I could also add a special case to bind_macro_expression.cpp to increment the stack counter by more?

@Mytherin
Copy link
Collaborator

I think perhaps we need to add more instances of the StackChecker along the path then. Shouldn't the stack depth be incremented multiple times as ExpressionBinder::BindExpression is called several times in the path?

We should likely also add a stack increment to Binder::BindNode.

@Mytherin Mytherin merged commit 66bb94d into duckdb:master Aug 1, 2023
53 checks passed
@Mytherin
Copy link
Collaborator

Mytherin commented Aug 1, 2023

Thanks!

krlmlr added a commit to duckdblabs/duckplyr that referenced this pull request Aug 17, 2023
@lnkuiper lnkuiper deleted the fts_attach branch August 31, 2023 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error when attaching a remote db with FTS
3 participants