Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

returning fails matching specified target columns #9012

Closed
y-wei opened this issue Apr 5, 2023 · 10 comments
Closed

returning fails matching specified target columns #9012

y-wei opened this issue Apr 5, 2023 · 10 comments
Assignees
Labels
component/frontend Protocol, parsing, binder. type/bug Something isn't working

Comments

@y-wei
Copy link
Contributor

y-wei commented Apr 5, 2023

Describe the bug

No response

To Reproduce

dev=> create table tn (v1 int, v2 int);
CREATE_TABLE
dev=> insert into tn (v2) values (1) returning *;
 v1 | v2 
----+----
  1 |   
(1 row)

INSERT 0 1

Expected behavior

dev=> create table tn (v1 int, v2 int);
CREATE_TABLE
dev=> insert into tn (v2) values (1) returning *;
 v1 | v2 
----+----
   | 1  
(1 row)

INSERT 0 1

Additional context

No response

@y-wei y-wei added type/bug Something isn't working component/frontend Protocol, parsing, binder. labels Apr 5, 2023
@github-actions github-actions bot added this to the release-0.19 milestone Apr 5, 2023
@y-wei y-wei self-assigned this Apr 10, 2023
@broccoliSpicy
Copy link
Contributor

broccoliSpicy commented Apr 11, 2023

when binding returing_list: the InputRef's index is derived from the original table's column layout

let (returning_list, fields) = self.bind_returning_list(returning_items)?;

but the inserted values may have a different column layout.
we can rearrange the inserted values to align with the original table's column layout, but this kind of memory shuffle may drag performance.
I suppose we can address this issue by rewire the returning InputRef's index and then disable column pruning for insertion -> projection(or disable all optimization for insertion->projection?)

if we decided to disable column pruning for insertion -> projection, the LogicalProject's logic can be entangled, can we just add a new type of PlanNode called InsertionReturningProject?

@y-wei
Copy link
Contributor Author

y-wei commented Apr 11, 2023

Excellent analysis👏 That said I find introducing a new PlanNode maybe somewhat overkill?
I'm wondering if add one more LogicalProject::with_mapping here above L43 is sufficient?

if returning {
plan = LogicalProject::create(plan, insert.returning_list);
}

@broccoliSpicy
Copy link
Contributor

Excellent analysisclap That said I find introducing a new PlanNode maybe somewhat overkill? I'm wondering if add one more LogicalProject::with_mapping here above L43 is sufficient?

if returning {
plan = LogicalProject::create(plan, insert.returning_list);
}

I will happily try your approach later
but what's frightening me is column pruning, even if we had the correct InputRef.index,
for example,

create table t (v1 int, v2 real);
insert into t (v2) values(1.5) returning v1;

in this query, since we don't need the v2 values in projection, the LogicalProject's optimizer may mess up our insert values, perhaps even lose it, but the real insertion execution haven't begin yet

@y-wei y-wei linked a pull request Apr 11, 2023 that will close this issue
5 tasks
@y-wei
Copy link
Contributor Author

y-wei commented Apr 12, 2023

@broccoliSpicy After reflection I realize that your idea looks better👍. Would you like to fix it?

@broccoliSpicy
Copy link
Contributor

sure, gladly, but we better finalize our approach first. would you mind some comments @st1page

@y-wei y-wei assigned broccoliSpicy and unassigned y-wei Apr 12, 2023
@broccoliSpicy
Copy link
Contributor

I implemented InputRef index rewiring and found some tricky bug.
this approach won't work.

 dev=> create table tn (v1 int, v2 int);                                     
CREATE_TABLE
dev=> insert into tn (v2) values (1) returning v2 + 1;
 ?column? 
----------
        2
(1 row)

INSERT 0 1
dev=> insert into tn (v2) values (1) returning v2+1. v1;
 v1 
----
  2
(1 row)

INSERT 0 1
dev=> insert into tn (v2) values (1) returning v2+1, v1;
 ?column? | v1 
----------+----
        2 |   
(1 row)

INSERT 0 1
dev=> insert into tn (v2) values (1,5) returning v2+1, v1;                  
ERROR:  QueryError: Bind error: INSERT has more expressions than target columns
dev=> insert into tn (v2) values (1.5) returning v2+1, v1;                  
 ?column? | v1 
----------+----
        3 |   
(1 row)

INSERT 0 1
dev=> insert into tn (v2) values (1.5) returning v2+1, v1;
 ?column? | v1 
----------+----
        3 |   
(1 row)

INSERT 0 1
dev=> insert into tn (v2) values (1.5) returning v2 + 1.1;           │ 0:psql │                                                                     │
│  ?column?                                                            └────────┘                                                                     │
│ ----------                                                                                                                                          │3.1                                                                                                                                           │
│ (1 row)   

I didn't check it through but I guess during function binding, we infer the function's input and return type based on the target table's schema, then do some type coercion
I am afraid more and more subtle bug will emerge with this approach...
index rewiring was supposed to be a hack but I guess it requires a lot of delicate patches.

I am thinking about mock a binding context before we bind returning list now

@broccoliSpicy
Copy link
Contributor

ha, my bad, the above behavior is actually in conform with postgres.
I guess both index rewiring and mock binding context would work.
still implemented mock binding context though

@st1page
Copy link
Contributor

st1page commented Apr 25, 2023

I used to be unfamiliar with this grammar and I find it is more powerful than I thought 🥵

dev=# with cte as (insert into t values (1, 20, 400, 5000), (2,60,600,6000) returning x,z) select sum(x) from cte;
 sum 
-----
  80
(1 row)

dev=# explain with cte as (insert into t values (1, 20, 400, 5000), (2,60,600,6000) returning x,z) select sum(x) from cte;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Aggregate  (cost=0.07..0.08 rows=1 width=8)
   CTE cte
     ->  Insert on t  (cost=0.00..0.03 rows=2 width=16)
           ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=16)
   ->  CTE Scan on cte  (cost=0.00..0.04 rows=2 width=4)
(5 rows)

@y-wei y-wei modified the milestones: release-0.19, release-0.20 May 12, 2023
@fuyufjh fuyufjh removed this from the release-1.0 milestone Aug 8, 2023
@fuyufjh
Copy link
Contributor

fuyufjh commented Aug 8, 2023

ha, my bad, the above behavior is actually in conform with postgres.
I guess both index rewiring and mock binding context would work.
still implemented mock binding context though

Hey, may I ask any new updates?

@broccoliSpicy
Copy link
Contributor

this issued is fixed by #11080.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/frontend Protocol, parsing, binder. type/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants