New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed #31071 -- Disabled insert optimization for primary keys with defaults when loading fixtures. #12209
Conversation
👍 Does the 'How Django knows to UPDATE vs. INSERT' section in the docs also need an update? django/docs/ref/models/instances.txt Lines 466 to 476 in 8f87f42
|
8f87f42
to
e60d3a2
Compare
There is already a warning:
so IMO |
@charettes Thanks for this patch 👍 I updated docs. |
e60d3a2
to
3e8e213
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @charettes, @felixxm.
I'm going to say this here, so you can say, "No" at your leisure. 🙂
I think this might be a mistake...
From the ticket, with the default in place, I think a lot of folks are doing this:
s0 = Sample.objects.create()
s1 = Sample(pk=s0.pk, name='Test 1')
s1.save()
...and so are going to update to broken code.
Yes, they can/could put force_update
in, but the line on that is this:
It should be very rare that you’ll need to use these parameters. Django will almost always do the right thing and trying to override that will lead to errors that are difficult to track down. This feature is for advanced use only.
I think here we're saying "Despite the claim that you'll rarely need them, you're going to have to use these flags" for code that is quite common, has been working correctly since Day 0. I think a lot of folks will bump into that, and that it's no longer true that "Django will almost always do the right thing"
I accept that "it does reduce the number of queries significantly when using primary key defaults" (Simon from ticket) but I think we're putting that performance over correct-by-default.
SO... 😬 I think we should revert 85458e9. We could then reopen ticket-29260 but, Tim's original comment there:
A fix might try to detect if the primary key came from a default...
5 seconds thinking about that, and I'm back to (the preceding):
I'm not sure if the issue can or should be fixed.
— it's just not going to be worth the code. (Maybe there is some way of tracking it that's clean but...)
I'd instead say, "use force_insert
" in this case.
(Why's that better than the "use force_update
" here? Because we're not breaking anything, and the default behaviour, whilst having a redundant query, is still correct.)
Thus I would say ticket-29260 should ultimately be wontfix
.
So, have a think...
If the two of you decide that this (i.e. this PR) is the way forward, then OK.
I have a couple of comments:
- I think the How Django knows to UPDATE vs. INSERT docs are no longer correct. (Probably should have been adjusted for 85458e9.) They'll need at least addition of a second gotcha, but, as I read it, the first "If the object’s primary key attribute is set to a value that evaluates to True..." is no longer true.
- Then in the release notes, I think we should mention
force_update
.update_or_create()
is very good, yes, but it's an entirely different API to the code we've broken, so we should explicitly tell folks the smallest change that fixes it.
(Wincing about this. Loop back to top. I leave it with you.)
Only in the cases where they've defined a primary key with a default value. Most of the cases where an implicit id or explicit I don't have a strong feeling either way even if I think the changes proposed here makes the most sense as I've explained on Trac. It feels like in order to properly implement this optimization we'd need a way to differentiate between a user provided primary key value and default generated one as Tim mentioned. One of the reason I think this is that doing a SELECT id FROM table WHERE id = UUID() There's no row so we'll proceed with an This behavior will have to be changed to perform a complex query of the form SELECT UUID() as generated_id, EXISTS(SELECT 1 FROM table WHERE id = generated_id) Which certainly not what a user would expect a model of the form class Order(models.Model):
id = models.UUIDField(primary_key=True, default=UUID) to do when they perform an We'll have to figure this one out in the near future but shipping an half baked solution is likely not best way forward. |
But (all?) UUID PKs will be, right? (Which is pretty common now, and why we've had a case come in already...) — tl;dr I don't think this is a super-edge-case... Anyhow, as I say, I'm happy to go with yours and Mariusz' best judgement, but I started looking at the docs change and was... "Oh..." -- so I wanted to raise my concern. |
Right but to be fair it was not about using this pattern directly but because fixture loading was broken. Not saying this pattern is not used in the wild but the report was about broken fixture loading. |
Yep. Good point. 😉 |
I know that these three lines s0 = Sample.objects.create()
s1 = Sample(pk=s0.pk, name='Test 1')
s1.save() don't look unusual at first glance, but ... I want to clarify what is broken, so object's primary key is known and we create a new model instance to update this object. Moreover this is broken only for primary keys with defaults. TBH, I'm surprised that we support creating a new model instance to update an existing one, it's quite unnatural (at least for me) and error-prone (a data loss possibility). I would expect to filter an object and update it, e.g.
You can also use |
Yep. Totally agree. But folks but it, and my thought is that it's quite common, combined with that for UUID PKs having a default is the standard... Anyhow... as I say, as long as you and Simon are agreed, for which the double-check here is enough, I'm happy to go forward, but I do think we need the docs to be accurate about the situation. |
@charettes What do you think? IMO we should move this patch forward. Do you have an idea how to clarify bullet points in the "How Django knows to UPDATE vs. INSERT"? |
Happy to help work on the docs tomorrow if you folks like? |
@felixxm I think we should move forward as well.
What about
Any thoughts about that @reupen? |
Looks good for the second bullet point, but the first one will also need an update if I'm not mistaken since the UPDATE will be skipped if the primary key field has a default defined and a new model instance is being saved. It is getting a bit complicated admittedly – maybe the 'primary key field has a default defined' case could be separated from the 'primary key field doesn't have a default defined' case? For example, something like:
|
3e8e213
to
9d88cdc
Compare
@reupen @charettes Thanks 👍 I rebased from master, and tried to explain the current algorithm in the |
The new wording makes sense to me @felixxm, thanks! |
…faults when loading fixtures. Model.save_base() is called directly when loading fixtures and assumes existing rows will be updated. Branching of "raw" allows to maintain the optimization introduced in #29260 while supporting this edge case. Regression in 85458e9. Thanks Reupen Shah for the report.
9d88cdc
to
9e14bc2
Compare
No description provided.