Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variable binder in generated column expression refers to table column #5739

Closed
2 tasks done
eaubin opened this issue Dec 19, 2022 · 8 comments
Closed
2 tasks done

Variable binder in generated column expression refers to table column #5739

eaubin opened this issue Dec 19, 2022 · 8 comments
Labels

Comments

@eaubin
Copy link

eaubin commented Dec 19, 2022

What happens?

Lambda in a select works as expected:

> duckdb.connect().execute("SELECT list_filter([0,1,2,3,4], x -> TRUE)").fetchall()
[([0, 1, 2, 3, 4],)]

but within a generated column expression

CREATE TABLE test (
    nums INTEGER[] GENERATED ALWAYS AS (list_filter([0,1,2,3,4], x -> TRUE)) VIRTUAL
);

x is checked as a column reference: BinderException: Binder Error: Column "x" referenced by generated column does not exist.

To Reproduce

CREATE TABLE test (
    nums INTEGER[] GENERATED ALWAYS AS (list_filter([0,1,2,3,4], x -> TRUE)) VIRTUAL
);

OS:

windows,osx,x64

DuckDB Version:

duckdb-0.6.2.dev683-cp37-cp37m-win_amd64

DuckDB Client:

Python, CLI

Full Name:

eaubin

Affiliation:

unaffiliated

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree
@Tishj
Copy link
Contributor

Tishj commented Dec 19, 2022

I misremembered, we do allow creating generated columns that don't refer to columns of the table, but we don't support creating a table only containing a generated column

Error: Binder Error: Creating a table without physical (non-generated) columns is not supported

Though your error seems unrelated to that, your example would produce this error, if not for the current one.

@eaubin
Copy link
Author

eaubin commented Dec 19, 2022

This was the smallest example I could come up with for the error. My actual use case is I have a table with a text column and I want to split it and filter the contents. The query below is closer in spirit

CREATE TABLE test (
    nums INTEGER[],
    evens INTEGER[] GENERATED ALWAYS AS
        (list_filter(nums, x -> x%2=0)) VIRTUAL    
)

@Tishj
Copy link
Contributor

Tishj commented Dec 19, 2022

Ah then yes, we should fix that
I looked into an issue related to lambdas yesterday, might be related to this
Though I don't have a fix for it, the fix I made fixed one thing and broke another 🙃

@eaubin
Copy link
Author

eaubin commented Dec 19, 2022

Another error, probably the same cause though,

CREATE TABLE test (
    nums INTEGER[],
    evens INTEGER[] GENERATED ALWAYS AS
        (list_filter(nums, nums -> TRUE)) VIRTUAL    
)

gives Binder Error: Conflicting column names for column nums!

@Tishj
Copy link
Contributor

Tishj commented Jan 2, 2023

Another error, probably the same cause though,

CREATE TABLE test (
    nums INTEGER[],
    evens INTEGER[] GENERATED ALWAYS AS
        (list_filter(nums, nums -> TRUE)) VIRTUAL    
)

gives Binder Error: Conflicting column names for column nums!

That sounds like a bug, but it would just throw another error when fixed
the second nums -> TRUE is a lambda parameter by the same name as the original, and is thus ambiguous

D create table tbl as select [5,4,3] as a;
D select * from tbl;
┌───────────┐
│     a     │
│  int32[]  │
├───────────┤
│ [5, 4, 3] │
└───────────┘
D select list_filter(a, a -> True) from tbl;
Error: Binder Error: Invalid parameter name 'tbl.a': must be unqualified

@eaubin
Copy link
Author

eaubin commented Jan 2, 2023

The 'lambda' in the documentation might be suggesting unintended things to me, e.g. scheme/haskell lexical scope where binders can be named arbitrary and inner binders scope most tightly.

The binder error Invalid parameter name 'tbl.a': must be unqualified is surprising to me too. I thought the a in a -> True would just be a binder, and not a parameter.

Just FWIW (and feel free to ignore/or convert to discussion), another thing I tried was to use a macro in a list filter:

CREATE OR REPLACE FUNCTION foo_always_true(foo) AS TRUE;
SELECT list_filter(a, foo_always_true) FROM tbl;
Binder Error: Referenced column "foo_always_true" not found in FROM clause!
Candidate bindings: "tbl.a"

This makes sense as it seems macro/functions are really different from lambda functions. Edit: Of course there's no need to pass the function directly, eta expansion works fine: SELECT list_filter(a, x -> foo_always_true(x)) FROM tbl.

CREATE OR REPLACE FUNCTION foo_always_true() AS foo -> TRUE;
Binder Error: Referenced column "foo" not found in FROM clause!

This looks like the same behavior as above.

@github-actions
Copy link

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the stale label Jul 29, 2023
@github-actions
Copy link

This issue was closed because it has been stale for 30 days with no activity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants