Support RETURNING in modification commands #242

citus-github-bot · 2016-02-04T19:09:00Z

Issue by digi604
Thursday Jul 23, 2015 at 15:23 GMT
Originally opened as citusdata/pg_shard#128

Any query using the RETURNING clause produces the following error:

NotSupportedError: cannot perform distributed planning for the given query
DETAIL:  RETURNING clauses are not supported in distributed queries.

Can you add support for RETURNING clauses?

The text was updated successfully, but these errors were encountered:

citus-github-bot · 2016-02-04T19:09:00Z

Comment by jasonmp85
Monday Jul 27, 2015 at 19:56 GMT

Hey @digi604: What's your use case? We didn't implement RETURNING to keep the feature profile abbreviated during the initial release, but with sufficient demand we could change that. 😃

citus-github-bot · 2016-02-04T19:09:01Z

Comment by rodo
Sunday Aug 09, 2015 at 17:05 GMT

In ORM (I know it's bad, but sometimes we have to work with people who love them) like Django the SQL query include RETURNING to return the object's id to the ORM, maybe @digi604 uses an ORM.
I vote for RETURNING support in pg_shard 👍

citus-github-bot · 2016-02-04T19:09:01Z

Comment by digi604
Tuesday Aug 11, 2015 at 07:52 GMT

i use peewee (orm) in a rather complex worker queue.

ozgune · 2016-03-16T21:31:22Z

This feature came up again. If the query executed on a single shard, how hard would it be to implement? Would we need to keep the returned values from INSERT statements on the master node?

ORMs like to have their INSERT statements to have RETURNING expressions:

returning id -- returning null
FEELINGS_DB.run(Sequel.lit("insert into observation_hash(id, created_at, server_id, data)
  values(?,?,?,?)",
  (feeling_id+i).to_s(16).rjust(32,'0'),
  Time.now,
  (server_id+i).to_s(16).rjust(32,'0'),
  data
))
# vs
FEELINGS_DB[:observation_hash] << {
  id: (feeling_id+i).to_s(16).rjust(32,'0'),
  created_at: Time.now,
  server_id: (server_id+i).to_s(16).rjust(32,'0'),
  data: dat

ozgune · 2016-03-18T00:18:58Z

I just chatted with @anarazel on this. We have two ways to implement this:

Execute the RETURNING statement on the master node: @anarazel notes that this approach is much easier to integrate into the PostgreSQL machinery. Andres, could you note this approach's benefits?
Push the RETURNING statement to the worker node. In this approach, we'll push down the query and store results returned from the worker node in a tuple store. In this approach, we could also cover UPDATE statements that touch a single shard.

I'm also noting #213 and #240 as semi-related issues. At a higher level, we could evaluate expressions on the master node and route these requests, or evaluate them on the workers. This decision may also relate to our replication model.

jasonmp85 · 2016-03-18T02:35:20Z

Hey @ozgune

I don't understand what executing this on the master even means. RETURNING is part of an UPDATE, INSERT, or DELETE command that returns a particular column. It would make the most sense to push this down.

Approach two is what I was going to suggest.

anarazel · 2016-03-18T02:45:36Z

@jasonmp85 The point of not pushing RETURNING down, at the very least for INSERT, is that it a) allows for easy and correct evaluation of now(), nextval(), etc. b) doesn't prevent going to a pipelined execution model c) is actually easier to implement given how RETURNING works in postgres.

jasonmp85 · 2016-03-18T04:07:22Z

Right, but what about UPDATE count = count + 1 RETURNING count? Isn't stuff like that a pretty common pattern, too? Or DELETE FROM accounts WHERE id = 5 RETURNING balance? That is more straightforward if you push the clause down, no?

anarazel · 2016-03-18T04:14:14Z

On 2016-03-17 21:07:23 -0700, Jason Petersen wrote:

Right, but what about UPDATE count = count + 1 RETURNING count?
Isn't stuff like that a pretty common pattern, too? Or DELETE FROM accounts WHERE id = 5 RETURNING balance? That is more
straightforward if you push the clause down, no?

True. But those are much less common. The most common reason to use
RETURNING is for ORMs after an INSERT, to get the generated key et al.

samay-sharma · 2016-04-12T18:24:33Z

Ran into this again during a customer engagement. It was again because their ORM was issuing INSERT queries which had a RETURNING clause.

brettallred · 2016-04-14T05:19:26Z

+1 - We are running into the same issue while evaluating CitusDB

brianbroderick · 2016-04-14T14:11:11Z

Without returning the generated primary key, how do you know what to put in a foreign key without adding the "returning" clause? Yes, I could insert a record then select that record to get the pk value, but that seems inefficient to me.

ozgune · 2016-04-22T22:20:40Z

@anarazel and I talked about this last week; and I'm summarizing our meeting notes. @anarazel -- if I have inaccuracies in this description, could you point them out?

We talked about the architectural implications of executing RETURNING on the master or pushing it to the worker node:

(1) If we implement pipelined write statements on the master node: In this model, the client speaking to the master node supports pipelined execution. We then split into two alternatives:

(1.a) The master node maintains a persistent queue. When the client sends INSERT statements, we persist them in this queue -- we could use logical replication queues for this. We then immediately ACK the INSERT to the client. We then flush this queue to the worker nodes as they fill up.

This pipelined execution model is incompatible with doing INSERT .. RETURNING on the worker nodes.

Two questions that relate to this model are: (i) What happens if the INSERT statement violates a constraint, such as a unique constraint, on the worker node? How do we error out? (ii) How do we handle UPDATE or UPSERT statements in this model?

(1.b) The master node queues up INSERTs and then pushes them out in batches to worker nodes. Once we receive replies from the worker node, we ACK to the client. In this model, INSERT .. RETURNING could be executed on the master or worker node.

(2) In the v5.0 replication model, we have shard replicas living on separate machines. If we push down INSERT .. RETURNING, we need to pay extra effort into pushing it down to only one of the replicas.

Also, if we wanted to implement now(), nextval(), things get even more complicated -- we can't run now() on different replicas.

We're currently evaluating changing Citus' replication model. In this updated model, we will have one primary replica and multiple secondary replicas. If we go with the new replication model, INSERT .. RETURNING could be executed on the master or (primary) worker node.

(3) It's easier to implement these changes on the master node given the way RETURNING is implemented in Postgres. The primary note here was that if we pushed INSERT .. RETURNING to worker nodes, we may need to duplicate code. Two follow-up questions were:

How much code duplication is there for simple INSERT .. RETURNING on worker node?
How much time would it take to do this on the master vs. worker node?

We also talked about -- if we wanted to later support CTEs in RETURNING clauses, this logic would be easier to execute on the master node. I'll also look to upload an architectural diagram that shows these in writing.

(4) We also briefly touched upon UPDATE and DELETE statements. For these statements that have RETURNING clauses, we'd need to push them down to worker nodes.

sumedhpathak · 2016-05-02T21:15:10Z

Removed 5.1 milestone from this issue.

anarazel · 2016-05-03T22:55:20Z

@digi604 @brianbroderick @brettallred @rodo Do you need INSERT RETURNING or also UPDATE / DELETE / INSERT ON CONFLICT ... RETURNING? We're looking at implementing the feature, and your answer would help determine the shape of it.

brianbroderick · 2016-05-04T17:15:40Z

yeah, we use upserts all the time so having the "on conflict" clause is important to me.

rodo · 2016-05-08T16:27:23Z

Mainly INSERT RETURNING for primary key with serial.

francisco1844 · 2016-05-20T19:49:40Z

When is 5.2 Release planned? since this feature was moved to 5.2 Release.

ozgune · 2016-06-01T18:17:43Z

@francisco1844 -- we intend to officially release v5.2 end of July.

We expect this feature to go into the master branch earlier than that; and we'll keep this issue updated.

ozgune · 2016-06-01T18:23:19Z

Two additional quick notes.

(1) ActiveRecord / Ruby only supports simple INSERT statements. RETURNING "id" is automatically added to all INSERT calls.

Apart from custom-written CTEs, RETURNING is not used in UPDATE or DELETE statements. It also looks like you can disable this behavior using the "insert_returning: false" config option. In that case it looks like it'll run a SELECT currval(..) on the sequence after the INSERT (typically within the same transaction).

(2) Sequel just inserts the ON CONFLICT clause if needed in the middle of the query, so it behaves the same as insert and uses RETURNING by default.

anarazel · 2016-06-03T20:32:20Z

@ozgune @marcocitus @sumedhpathak Any opinion on whether it's ok to send the RETURNING to all nodes? My current implementation does so, and just consumes (ignores) the results for RETURNING for all but the first. I tend to think that's ok.

marcocitus · 2016-06-03T20:58:04Z

My current implementation does so, and just consumes (ignores) the results for RETURNING for all but the first.

I think that's fine.

Fixes: #242

citus-github-bot added the feature label Feb 4, 2016

ozgune added the priority:high label Mar 16, 2016

ozgune added this to the 5.1 Release milestone Mar 16, 2016

ozgune added the planned label Mar 16, 2016

sumedhpathak removed this from the 5.1 Release milestone May 2, 2016

ozgune added this to the 5.2 Release milestone May 14, 2016

lithp mentioned this issue May 24, 2016

Django Support #549

Closed

6 tasks

sumedhpathak added ready and removed planned labels Jun 1, 2016

sumedhpathak assigned anarazel Jun 1, 2016

anarazel added a commit that referenced this issue Jun 3, 2016

Support RETURNING for modification commands.

5a0c4cd

Fixes: #242

anarazel mentioned this issue Jun 3, 2016

Support RETURNING #578

Merged

lfittl mentioned this issue Jun 11, 2016

Rails/ActiveRecord Support #595

Closed

11 tasks

sumedhpathak added needs review and removed ready labels Jun 22, 2016

anarazel added a commit that referenced this issue Jun 22, 2016

Support RETURNING for modification commands.

cc942c7

Fixes: #242

anarazel added a commit that referenced this issue Jul 1, 2016

Support RETURNING for modification commands.

cccba66

Fixes: #242

anarazel closed this as completed in #578 Jul 1, 2016

anarazel removed the needs review label Jul 1, 2016

DimCitus pushed a commit that referenced this issue Jan 10, 2018

Support RETURNING for modification commands.

586f738

Fixes: #242

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support RETURNING in modification commands #242

Support RETURNING in modification commands #242

citus-github-bot commented Feb 4, 2016 •

edited by sumedhpathak

Loading

citus-github-bot commented Feb 4, 2016

citus-github-bot commented Feb 4, 2016

citus-github-bot commented Feb 4, 2016

ozgune commented Mar 16, 2016

ozgune commented Mar 18, 2016

jasonmp85 commented Mar 18, 2016

anarazel commented Mar 18, 2016

jasonmp85 commented Mar 18, 2016

anarazel commented Mar 18, 2016

samay-sharma commented Apr 12, 2016

brettallred commented Apr 14, 2016

brianbroderick commented Apr 14, 2016

ozgune commented Apr 22, 2016

sumedhpathak commented May 2, 2016

anarazel commented May 3, 2016

brianbroderick commented May 4, 2016

rodo commented May 8, 2016

francisco1844 commented May 20, 2016

ozgune commented Jun 1, 2016

ozgune commented Jun 1, 2016 •

edited

Loading

anarazel commented Jun 3, 2016

marcocitus commented Jun 3, 2016

Support RETURNING in modification commands #242

Support RETURNING in modification commands #242

Comments

citus-github-bot commented Feb 4, 2016 • edited by sumedhpathak Loading

citus-github-bot commented Feb 4, 2016

citus-github-bot commented Feb 4, 2016

citus-github-bot commented Feb 4, 2016

ozgune commented Mar 16, 2016

ozgune commented Mar 18, 2016

jasonmp85 commented Mar 18, 2016

anarazel commented Mar 18, 2016

jasonmp85 commented Mar 18, 2016

anarazel commented Mar 18, 2016

samay-sharma commented Apr 12, 2016

brettallred commented Apr 14, 2016

brianbroderick commented Apr 14, 2016

ozgune commented Apr 22, 2016

sumedhpathak commented May 2, 2016

anarazel commented May 3, 2016

brianbroderick commented May 4, 2016

rodo commented May 8, 2016

francisco1844 commented May 20, 2016

ozgune commented Jun 1, 2016

ozgune commented Jun 1, 2016 • edited Loading

anarazel commented Jun 3, 2016

marcocitus commented Jun 3, 2016

citus-github-bot commented Feb 4, 2016 •

edited by sumedhpathak

Loading

ozgune commented Jun 1, 2016 •

edited

Loading