Postgres upsert behavior inconsistent with "do nothing" #4242
Comments
Can you share the reproduction as code, please?
Should look something like this. Entities:
Application code:
Populating the relation:
By "a value" you are referring to what exactly? The Maybe we can have a flag to enforce reloading of the entity fully? Maybe it would be even a good default? I really don't know, I was hesitant to add this API because I never needed it myself, so I don't have any first-class experience with the problems it should be solving. From what I understood, people often provide all the values explicitly, and for that it
This method is dynamic by definition, I don't see this as a problem, e.g. when you use upsert based on PK and the entity is already part of the identity map, you just get an
If this part is what your "Populating the relation:" code snippet should be reproducing, it seems to be working fine on my end (with the latest version). I wasn't really sure what you mean by the snippets: is that supposed to be a sequence, so you literally call upsert + flush 3 times, or is that supposed to be a comparison of 3 different versions? I'd appreciate a test case so it's clear. Here is what I compiled from your snippets so far, note that I replaced the
import { Entity, ManyToOne, PrimaryKey, Property, Ref, Unique } from '@mikro-orm/core';
import { MikroORM } from '@mikro-orm/postgresql';

@Entity()
class B {

  @PrimaryKey({ type: 'uuid', defaultRaw: 'uuid_generate_v4()' })
  id!: string;

  @ManyToOne(() => D, { onDelete: 'cascade', ref: true })
  d!: Ref<D>;

  @Property({ unique: true })
  order!: number;

  @Property({ length: 6, defaultRaw: 'now()', onUpdate: () => new Date() })
  updatedAt: Date = new Date();

}

@Entity()
@Unique({ properties: ['tenantWorkflowId'] })
class D {

  @PrimaryKey({ type: 'uuid', defaultRaw: 'uuid_generate_v4()' })
  id!: string;

  @Property()
  tenantWorkflowId!: number;

  @Property({ length: 6, defaultRaw: 'now()', onUpdate: () => new Date() })
  updatedAt: Date = new Date();

}

let orm: MikroORM;

beforeAll(async () => {
  orm = await MikroORM.init({
    entities: [B, D],
    dbName: `4242`,
  });
  await orm.schema.ensureDatabase();
  await orm.schema.execute('CREATE EXTENSION IF NOT EXISTS "uuid-ossp"');
  await orm.schema.refreshDatabase();
});

afterAll(async () => {
  await orm.close(true);
});

beforeEach(async () => {
  await orm.schema.clearDatabase();
});

test('4242 1/2', async () => {
  const loadedDs = await orm.em.upsertMany(D, [
    { tenantWorkflowId: 1 },
  ]);
  console.log(loadedDs);
  orm.em.clear();
  // loadedDs should have an id, tenantWorkflowId, and updatedAt. With debug statements on, the query issued was
  // `insert into "d" ("tenant_workflow_id") values (1) on conflict ("tenant_workflow_id") do nothing returning "id", "updated_at"`

  const loadedDs2 = await orm.em.upsertMany(D, [
    { tenantWorkflowId: 1 },
  ]);
  console.log(loadedDs2);
  orm.em.clear();
  // loadedDs2 only has tenantWorkflowId set. With debug statements on, the queries issued were
  // `insert into "d" ("tenant_workflow_id") values (1) on conflict ("tenant_workflow_id") do nothing returning "id", "updated_at"`
  // `select "w0"."id" from "d" as "w0" where ("w0"."tenant_workflow_id" = 1)`

  const loadedDs3 = await orm.em.upsertMany(D, [
    { tenantWorkflowId: 1, updatedAt: new Date() },
  ]);
  console.log(loadedDs3);
  orm.em.clear();
  // loadedDs3 has id, tenantWorkflowId, and updatedAt set. With debug statements on, the query issued was
  // `insert into "d" ("tenant_workflow_id", "updated_at") values (1, '2023-04-21T22:01:07.995Z') on conflict ("tenant_workflow_id") do update set "updated_at" = excluded."updated_at" returning "id", "updated_at"`
});

test('4242 2/2', async () => {
  // Assuming that tenantWorkflowId 1 already exists in the database
  const loadedDs4 = await orm.em.upsertMany(D, [
    { tenantWorkflowId: 1 },
  ]);
  console.log(loadedDs4);
  await orm.em.flush();

  const b = await orm.em.upsert(B, {
    d: loadedDs4[0],
    order: 0,
    updatedAt: new Date(),
  });
  await orm.em.flush();
  console.log(b);
  await b.d.load(); // this correctly populates the relation; it is not marked as initialized before the load() call
  console.log(b.d);
});
At a high level, the way that I'd want to use upsert is as a "findAndUpdateOrCreate" method. Because the return type is the entity, my assumption is that it should always return a fully initialized entity - otherwise if it isn't a fully loaded entity I'd expect it to be a reference. However, given that the entire point of using upsert is to reduce the number of database calls under the hood, I don't think it makes a ton of sense to return a partial result that then needs to hit the database again to fully initialize. The implication of this then is that:
I think what I was trying to show in the example code was the differing behavior across subsequent requests. On the first call, because nothing exists yet it does a full create and returns the full entity. On the second call the same code returned a partial result and fired an extra query because the entity had already been created. The third call demonstrates how always forcing an update (for example, by always updating The other snippet shows how calling
Thanks for the details, it was very helpful! I completely agree with the points, already working on a PoC. Some interesting reading about the topic can be found here. So I guess it will be better to keep the separate query for reloading the values, in the end, it will be always there for MySQL that does not have returning statements. Already got some promising results (#4370):
Not sure if it makes more sense to whitelist the columns in the returning clause; I guess I will do that in the end, so it's in line with the It will need more tests and polishing, let's see if I can find some time for that tomorrow, need to go AFK now.
Released as 5.7.8, let me know how it works, maybe there are more cases that will need improvements.
What do you think about how the upsert works for already managed entities (loaded in context)? Now it does just the
FYI, the next version will improve the behavior of "do nothing" in Postgres and make things explicitly configurable, see #4669
Describe the bug
This is somewhere between a bug report and a feature request. When performing an upsert call using only the unique key(s) of the entity, the underlying postgres call gets created with an "on conflict do nothing" clause. The problem with this is that, if there is a conflict, postgres will not return anything in the "returning" clause.
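To make the difference concrete, here is a minimal in-memory sketch of the two conflict modes. It is a toy model only: `ToyTable`, `insertDoNothing`, and `insertDoUpdate` are made-up names, not Postgres or MikroORM APIs, and timestamps are plain numbers for simplicity.

```typescript
// Toy in-memory model of a table with one unique column.
// This is NOT Postgres or MikroORM code; all names are made up for illustration.
type Row = { id: number; tenantWorkflowId: number; updatedAt: number };

class ToyTable {
  private rows: Row[] = [];
  private nextId = 1;

  private insert(tenantWorkflowId: number, now: number): Row {
    const row: Row = { id: this.nextId++, tenantWorkflowId, updatedAt: now };
    this.rows.push(row);
    return { ...row };
  }

  // Models: insert ... on conflict (tenant_workflow_id) do nothing returning *
  insertDoNothing(tenantWorkflowId: number, now: number): Row[] {
    const existing = this.rows.find(r => r.tenantWorkflowId === tenantWorkflowId);
    // On conflict the row is skipped, so RETURNING has nothing to report.
    return existing ? [] : [this.insert(tenantWorkflowId, now)];
  }

  // Models: insert ... on conflict (tenant_workflow_id)
  //         do update set updated_at = excluded.updated_at returning *
  insertDoUpdate(tenantWorkflowId: number, now: number): Row[] {
    const existing = this.rows.find(r => r.tenantWorkflowId === tenantWorkflowId);
    if (!existing) return [this.insert(tenantWorkflowId, now)];
    existing.updatedAt = now; // the conflicting row is written, so it is returned
    return [{ ...existing }];
  }
}

const table = new ToyTable();
const first = table.insertDoNothing(1, 100);  // fresh insert: one row back
const second = table.insertDoNothing(1, 200); // conflict: zero rows back
const third = table.insertDoUpdate(1, 300);   // conflict: the updated row back
```

The second call is the case this issue is about: the row exists, but the caller gets no `id` or `updatedAt` back from the statement itself.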
This causes a few issues:
upsert is. If the expectation is that it should always return a value, that expectation is broken.
upsert API. If used with entity properties in addition to the unique key(s), it will always perform an update and always return a result. If only used with the unique key(s), it may not return the full result.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Upsert should always return a full value.
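One way to get that guarantee without changing the insert statement is to reload the row whenever the insert reports nothing. The sketch below is only an illustration of that idea: `upsertAndLoad` and both callbacks are hypothetical names rather than MikroORM APIs, and it is synchronous for brevity where real database calls would be async.

```typescript
// Hypothetical helper; none of these names come from MikroORM.
// insertReturning models "insert ... on conflict do nothing returning ...".
// selectByUniqueKey models a follow-up select on the unique key(s).
function upsertAndLoad<T>(
  insertReturning: () => T[],
  selectByUniqueKey: () => T | undefined,
): T {
  const [inserted] = insertReturning();
  if (inserted !== undefined) return inserted; // fresh row, fully populated
  const existing = selectByUniqueKey();        // conflict: reload the full row
  if (existing === undefined) {
    throw new Error('Row not found after conflict (deleted concurrently?)');
  }
  return existing;
}

// Toy in-memory stand-in for the "d" table from the reproduction.
type DRow = { id: string; tenantWorkflowId: number };
const stored: DRow[] = [];

const loadD = (tenantWorkflowId: number): DRow =>
  upsertAndLoad<DRow>(
    () => {
      if (stored.some(r => r.tenantWorkflowId === tenantWorkflowId)) return [];
      const row: DRow = { id: `d-${stored.length + 1}`, tenantWorkflowId };
      stored.push(row);
      return [row];
    },
    () => stored.find(r => r.tenantWorkflowId === tenantWorkflowId),
  );

const created = loadD(1);  // first call inserts
const reloaded = loadD(1); // second call conflicts and falls back to the select
```

The cost is one extra query on conflict, but both calls hand back a row with every column set.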
Additional context
I imagine the simplest way to achieve this would be to always perform the update (just using the provided unique key(s)). I suppose this could potentially introduce more writes in situations where it wasn't previously, and it could trigger automatic column updates, but that could already be happening if users were specifying non-unique keys in the query.
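As a rough sketch of what "always perform the update" could look like at the SQL level, the builder below emits a `do update` that reassigns the unique key column(s) to themselves, so the returning clause always has a row to report. `buildAlwaysUpdateUpsert` and `quote` are hypothetical helpers written for this illustration, not part of MikroORM's query builder.

```typescript
// Hypothetical SQL builder for illustration; not MikroORM's query builder.
const quote = (name: string): string => `"${name.replace(/"/g, '""')}"`;

function buildAlwaysUpdateUpsert(
  table: string,
  uniqueColumns: string[],
  returning: string[],
): string {
  const cols = uniqueColumns.map(quote);
  const placeholders = cols.map((_, i) => `$${i + 1}`);
  // Reassign each unique column to the value from the rejected insert.
  // The values are identical, but the row is still written on conflict.
  const noOpSet = cols.map(c => `${c} = excluded.${c}`);
  return [
    `insert into ${quote(table)} (${cols.join(', ')})`,
    `values (${placeholders.join(', ')})`,
    `on conflict (${cols.join(', ')})`,
    `do update set ${noOpSet.join(', ')}`,
    `returning ${returning.map(quote).join(', ')}`,
  ].join(' ');
}

const sql = buildAlwaysUpdateUpsert('d', ['tenant_workflow_id'], ['id', 'updated_at']);
console.log(sql);
```

This is where the trade-off mentioned above shows up: every conflicting call now performs a real write and can fire update triggers, in exchange for a single round trip.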
Versions