Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bulk UPSERT Confusion #1118

Closed
allochi opened this issue May 22, 2018 · 11 comments
Closed

Bulk UPSERT Confusion #1118

allochi opened this issue May 22, 2018 · 11 comments

Comments

@allochi
Copy link

allochi commented May 22, 2018

Hi,

I'm sorry if this is not the place to ask this question, I wouldn't have if there was a section about it in the documentation.

I'm trying to use Bulk UPSERT with a simple table

create table items (
	id serial primary key,
	name text
);

I then insert these records

curl -X "POST" "http://localhost:5200/items" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'[
  {
    "name": "iPad"
  },
  {
    "name": "iPhone"
  },
  {
    "name": "MacBook Pro"
  }
]'

When I try to UPSERT with this

## Upsert Items
curl -X "POST" "http://localhost:5200/items" \
     -H 'Prefer: resolution=merge-duplicates' \
     -H 'Content-Type: application/json' \
     -d $'[
  {
    "id": 1,
    "name": "iPad"
  },
  {
    "id": 2,
    "name": "iPhone"
  },
  {
    "id": 3,
    "name": "Macbook Pro"
  },
  {
    "id": null,
    "name": "PostgREST"
  }
]'
  1. if I set id to null in the last item, I get error 400 null value in column "id" violates not-null constraint
  2. if I don't include id in the last item, I get error 400 "message":"All object keys must match"

The only way I can get this to work is by providing a value for id, which I can't as id is auto-generated from a sequence, isn't the purpose of UPSERT is to INSERT a new record and in case of conflict with the primary key either update or ignore? how can I insert a new record in this case?

Could you please advice on how this suppose to work? Thanks!

@steve-chavez
Copy link
Member

One way to achieve what you want could be to define a trigger on your table that converts the null to the default, like:

CREATE OR REPLACE FUNCTION items_null_id_is_default() RETURNS TRIGGER AS $$
BEGIN
  NEW.id = coalesce(NEW.id, nextval('items_id_seq'));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER items_null_id_is_default BEFORE INSERT ON items FOR EACH ROW EXECUTE PROCEDURE items_null_id_is_default();

Then you could have the { "id": null, "name": "PostgREST"} in UPSERT working, the trigger should have a slight overhead on the INSERT as there's no DML. Another option is to adjust your use case at the application level, UPSERT is better when you have natural keys(must specify all the ids).

@allochi
Copy link
Author

allochi commented May 23, 2018

@steve-chavez Thanks for your reply, it seems that UPSERT will not serve the purpose I want to use it for, so I will just create SQL functions for that.

I would expect UPSERT to check objects with an id for a conflict for UPDATE, and INSERT objects without id, but unfortunately it doesn't seems to be designed this way.

@allochi allochi closed this as completed May 23, 2018
@steve-chavez
Copy link
Member

@allochi It's a limitation for now, internally the UPSERT request is translated to:

INSERT INTO  "public"."items" ("id", "name") 
SELECT "id", "name" FROM json_populate_recordset (null:: "public"."items", 
'[ 
  { "id": 1, "name": "iPad" }, 
  { "id": 2, "name": "iPhone" }, 
  { "id": 3, "name": "Macbook Pro" }, 
  { "name": "PostgREST" } 
]') _  
ON CONFLICT(id) DO UPDATE SET "id" = EXCLUDED."id", "name" = EXCLUDED."name" 
RETURNING "public"."items".*

The undefined id attribute in the json is treated as a null, ideally PostgreSQL would turn that into a DEFAULT and it would use the sequence nextval(as possible when using VALUES) but it doesn't work that way, I can see how that's inconvenient maybe this could be raised on psql-hackers and see if it's possible to do. To fix that on PostgREST side we'd have to do extra json processing on the request body and that would greatly hurt performance, we already have an issue #690 on that.

I'll make sure to add a related note in the docs about this.

@allochi
Copy link
Author

allochi commented May 23, 2018

@steve-chavez thanks for the clarification.

I have already created a function to deal with my use case through RPC. But I would love to see PostgREST addresses this in the future. Thanks again!

@steve-chavez
Copy link
Member

Docs on UPSERT are now available in https://postgrest.org/en/v5.0/api.html#upsert.

@allochi
Copy link
Author

allochi commented May 29, 2018

@steve-chavez Thanks a lot, I would also like to thank the team for a job well done on PostgREST.

@paulovieira
Copy link
Contributor

paulovieira commented Mar 9, 2019

This problem also got me confused! And unfortunately the error message doesn't help much (though now I can understand it).

When the table uses a surrogate key (that is, "id serial primary key") it's natural that people assume that postgrest can make the upsert by omitting the "id" key (in the POST'ed json). That would be the analogy to raw sql.

I'll make a pr to the docs to clarify this issue.

@geohuz
Copy link

geohuz commented Jul 28, 2020

I'm wondering if there is any other field missed in the json source the bulk insert can be successfully done. I recently hit a problem to sync a data from another api, I have carefully created the table on my side to include all the fields from the source, but there are several objects have missed fields other than the key field still leaves me with the error: ""message":"All object keys must match". Is there any work around for this issue?

@steve-chavez
Copy link
Member

@geohuz You can specify columns(http://postgrest.org/en/v7.0.0/api.html#specifying-columns) to bypass the "All object keys must match" validation.

@geohuz
Copy link

geohuz commented Jul 28, 2020

@steve-chavez Thank you so much!

@samirdamle
Copy link

samirdamle commented Apr 27, 2022

As a workaround, I broke up the array of items into two separate arrays - one with null ids and the other with existing ids.
Then I did two separate upserts and it worked.

const itemsToUpdate = items.filter(item => !!item.id )  // items with existing id
const itemsToCreate = items.filter(item => !item.id )   // items with null id

// do 2 separate upserts now for itemsToUpdate and itemsToCreate

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants