-
-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make non-updating queries use readonly
unless there was a mutation before
#963
Make non-updating queries use readonly
unless there was a mutation before
#963
Conversation
We're potentially losing out on some optimizations. Solution: track use of mutating commands In my unscientific benchmarks, I saw a 4-5% performance increase when doing repetitive SELECT queries as read-only vs. read-write. This is an API breaking change (again) but it would avoid leaving performance on the table for little convenience gain.
However, we always assume them to be non-readonly. Solution: split the API to `open_cursor` and `open_cursor_mut` This way we can ensure we're squeezing out performance in the right place, when feasible.
I just used GitHub's "resolve conflicts" UI. Lets see if things even compile. |
The merge I did through GH's UI didn't compile (totally not surprised). Maybe it's close. It does seem tho that I can push to your branch, so let me try doing that on this one before you do anything else... |
Yeah, scratch that. I don't understand what I see locally. I add your repo as a remote and checkout the |
Please let me merge |
Should be good now. |
It will need another (slightly non-trivial, but I can do it for sure) merge after #885 is in, as they will have a conflict with regard to cursors (nothing major, I'll fix it) |
Solution: fix the signature
Working on conflict resolution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine to merge as-is. EDIT: found one little nit.
I'm just still unsure this even does it "right". The warning from the Postgres doc...
// It is generally unwise to mix read-only and read-write commands within a single function
// using SPI; that could result in very confusing behavior, since the read-only queries
// would not see the results of any database updates done by the read-write queries.
... just isn't very helpful in guiding the user to determining which mode they ought to operate.
I interpret that note as saying "Within a SPI connection you need to decide if all queries are read only or not. Any mixture requires read_only=false."
I looked through SPI usages in the Postgres sources, and there aren't many, but they don't mix/match the read_only flag under the same SPI connection. So like, that doesn't give us much guidance here.
I bet we'll have some bug reports around this in the future. I have no idea how to handle it. The only thing that comes to mind is two different SpiConnections. One that is read_only=false, and one that is read_only=true, and the user has to pick up front which one they want.
So rather than the flag being on SpiClient it'd be on SpiConnection.
I don't like that idea very much but I wonder if it's the only way to best honor the warning from the Postgres sources? Not that the recent merges haven't been, but this idea would be a pretty big impact on our Spi API.
Hmm. Maybe not. Maybe it's just adding a Thoughts? |
I've updated this PR to be based off recent |
I read the warning from PostgreSQL differently (and I may have seen further explanations elsewhere). The way I read it is that as long you haven't done any "read-write" ops, "readonly" is fine. This is what I've implemented. This is what my tests have also confirmed. It is generally unwise to mix read-only and read-write commands within a single function using SPI; that could result in very confusing behavior, since the read-only queries would not see the results of any database updates done by the read-write queries. (emphasis mine) |
I mean, I suppose it could be saying, "once you go read_only=false, don't go back". If that's what they mean that'd be great. Where did you see other discussion around this? I think I can be convinced to interpret it that way because that does make sense. The outcome of "SELECT"s prior to the first "UPDATE" don't matter for the next "SELECT". What matters is that it see the outcome of that "UPDATE". Which yeah, this does. |
It's unnecessary as client already has it (and error-prone) Solution: get it from the client
I'll try to find if there's something else, but I strongly believe this is the case. It makes sense and my reading of |
Do we need a test that tries to execute an UPDATE statement via a read_only=true SpiClient? I mean, we don't need to test that Postgres works, but probably that we're passing in the right flag. I'm guessing Postgres will raise a |
Solution: write a test that shows it
I wrote a test like this and pushed it to this PR. |
Yeah, nice. I can't see anything else here. Can you? I'm now convinced this code interprets the Postgres docs correctly. And we'll have our discussion here in case something terrible pops up in the future that proves us wrong! |
I think this is good but there will be one issue to address in the future. Namely, when you pass the client into PgTryBuilder for update (since mut ref can't cross it), you might want to be able to recover it. Generally speaking, this is the the territory that's most safe with SubTransaction and I've done this recovery mechanism already, and it's tiny and I'll be able to introduce it in #912 or as a followup. I don't think this is an issue we can or should address right now. |
Since we're approaching a series of SPI improvements, I think this is time to add this to the mix.
Problem: not using readonly flag for SPI commands
We're potentially losing out on some optimizations.
Solution: track use of mutating commands
In my unscientific benchmarks, I saw a 4-5% performance increase when doing repetitive SELECT queries as read-only vs. read-write.
This is an API-breaking change (again), but it would avoid leaving performance on the table for little convenience gain.
Note: I've made
get_one/two/three
helpers useupdate
always as there's no distinction between select/update in these. While slightly suboptimal, this works and again shows the potential/subjective shortcomings of these helpers.This PR will cause conflicts with #912 and #885 but they are not very difficult to resolve. Just need to figure out the order.
A message to pgx users: my ability to actively contribute to pgx largely depends on your support. Please consider sponsoring my work