Skip to content

Conversation

mbeccati
Copy link
Contributor

@mbeccati mbeccati commented Aug 27, 2020

For review only, please don't merge.

First try to optimize unneeded param_hook calls by using . A few open items:

  1. This shouldn't require bumping PDO_API_VERSION
  2. As such, it could be seen as generic bugfix/improvement and suitable for 7.x
  3. Requires benchmarking from the OP

Copy link
Contributor

@ruudboon ruudboon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@inoric
Copy link

inoric commented Aug 27, 2020

Hi!

I'm the one that reported this problem on php.internals mailing list so here's my repeated benchmark results.

Before (master branch)

prep: 0.019, exec: 0.175, fetch: 1.980

After (this pull request)

prep: 0.006, exec: 0.145, fetch: 0.006

So it's a lot faster, this indeed fixes the problem. The prep and exec differences are just disk cache, but I confirmed where this problem is already, fetch is now fine and doesn't iterate doing nothing a few million times.

Here's a gist with the script i used to benchmark this while looking for the problem and schema:
https://gist.github.com/inoric/8e8716118d3113521005f56170d8da95

BUT there is a reason I didn't make this pull request myself, please be careful around here:

if (PDO_PARAM_TYPE(param->param_type) == PDO_PARAM_BOOL &&

Wouldn't this break the case of using PARAM_INPUT_OUTPUT bind params? They seem to be used for INOUT parameters in stored procedures. This is my first time digging around the PHP codebase, but it does seem like this path could still be called in case of PDO_PARAM_EVT_FETCH_PRE and PDO_PARAM_EVT_FETCH_POST.

@mbeccati
Copy link
Contributor Author

Thanks @inoric, that's great news!

Nope, as you can see in

case PDO_PARAM_EVT_EXEC_POST:
there's a return 1 for the EVT types I'm skipping.

@inoric
Copy link

inoric commented Aug 27, 2020

@mbeccati

I saw that, it's the reason I thought it was safe to remove, but that break/return is only called if stmt->supports_placeholders == PDO_PLACEHOLDER_NAMED is true (the part I linked is in the else statement). I have apsolutely no idea when that will be false, but just wanted to leave a warning if someone missed that part.

@inoric
Copy link

inoric commented Aug 27, 2020

I found where it's set.
That parameter will be PDO_PLACEHOLDER_NONE when emulating prepares instead of using actual bind parameters:

if (emulate) {

Maybe make skip_param_evt a property of pdo_stmt_t instead of _pdo_dbh_t and dynamically set it in the if/else I just linked?

@mbeccati
Copy link
Contributor Author

@inoric that part is a bit dodgy tbh. It's been a long time since I worked with the innards of pdo_pgsql, but I can't recall why that bit of code would have to ignore the event type. Also the pdo_sqlite/mysql/pgsql tests pass with the PR as-is, which is even more worrying ;-)

@inoric
Copy link

inoric commented Aug 27, 2020

@mbeccati Well, one likely reason they all pass is because not a single test uses PDO::PARAM_INPUT_OUTPUT, at least according to my very sophisticated grep -IRi -e INPUT_OUTPUT -e INOUT.
I also never used INOUT params myself or saw them used anywhere. Maybe someone with more experience in that area than me could write some test using boolean INOUT params for pdo_pgsql? I can try to do it it myself otherwise.

@mbeccati
Copy link
Contributor Author

@inoric A bit in a hurry, but to me (param->param_type & PDO_PARAM_INPUT_OUTPUT) != PDO_PARAM_INPUT_OUTPUT) seems exactly the opposite, i.e. not PDO_PARAM_INPUT_OUTPUT.

@inoric
Copy link

inoric commented Aug 27, 2020

@mbeccati Woops. You're right. Read that wrong initially and it was stuck that way in my head. Should be safe then? We should never modify parameters while fetching rows unless it's an INOUT, which is not done here anyway.

Alternatively if we're gonna be paranoid maybe use my earlier suggestion to move the skip flag from connection to statement level and skip the optimisation if someone for whatever reason uses emulated prepares in postgres.

@mbeccati
Copy link
Contributor Author

@inoric most likely there's actually room for more optimizations here. With EMULATE_PREPARES turned on, the bool -> 't'/'f' conversion is probably being made on all the events rather than just the one that actually needs it.

@inoric
Copy link

inoric commented Aug 27, 2020

I wanted to say that I can't imagine someone using emulated prepares in postgres today, but then I saw this:

if (scrollable) {
if (S->cursor_name) {
efree(S->cursor_name);
}
spprintf(&S->cursor_name, 0, "pdo_crsr_%08x", ++H->stmt_counter);
emulate = 1;

Emulated prepares seem to be automatically enabled when a scrollable cursor is used (not sure why). And scrollable cursors would be hit even worse by this problem since they could fetch each row more than once.

Btw. I had another, more flexible idea of solving this, but it's more complex, and would be a bigger change to interfaces so it didn't seem like a good candidate:
Instead of a bitflag, create another callback/hook drivers can provide. The hook would take a bound parameter and event type as arguments. Each bound parameter would get passed to it once for each event. The hook would return 1 if it wants param_hook called with that specific parameter for that event. Those can than be cached in a list per-event, eliminating the need to iterate over all parameters for each event more than once.
This would enable a lot more specific optimisations around this, but it just seems too invasive.

@mbeccati
Copy link
Contributor Author

@inoric It's been 10+ years, so I can't recall why I had to use emulation to fix cursors. Does anyone really use PDO cursors anyway?

The last commit should optimize emulated prepares a little further though and hopefully remove any doubts you might have. I don't think it's worth adding new hooks at this point.

@inoric
Copy link

inoric commented Aug 27, 2020

@mbeccati As I said, didn't mention it before because it seemed too complex.
Hope this gets merged as a bugfix so I get my bulk imports fast soon :)

@mbeccati
Copy link
Contributor Author

@inoric It would be great if you could create an entry on bugs.php.net, so that it can be referenced in the NEWS files and final commits. I will ask on internals and see if I get permission to push the fix from 7.3.x upwards.

@inoric
Copy link

inoric commented Aug 27, 2020

@mfn
Copy link
Contributor

mfn commented Aug 28, 2020

I can't imagine someone using emulated prepares in postgres today

The reason are proxies like https://www.pgbouncer.org/

The can operate in different models (session, transaction, statement).

One "problem" with prepared statements is that they create "state" on the server which is not compatible with such proxies operating in modes like transaction or statements, where possible every query may use a different connection due to round-robin things, etc.

When you activate emulation, you "solve" this problem.

Also, for some using prepared statements is a overhead (i.e. additional round trips to the server to manage states) they won't to avoid. Surely depends on the traffic to measure this.

@inoric
Copy link

inoric commented Aug 28, 2020

@mfn Thanks for the explanation. I actually planned on using pgbouncer in one setup soon, but didn't look at the details yet.
After reading your comment I added $db->setAttribute(PDO::ATTR_EMULATE_PREPARES, true); to my benchmark to test that case. Also added a bool column.
It works, it's fast and even got my booleans into the table and back through RETURNING.

@mbeccati
Copy link
Contributor Author

However OT, let me emphasize: PDO::ATTR_EMULATE_PREPARES disabled is a much better default. It is good that emulation can be tuned on for specific use cases.

 * pdo_dblib doesn't use param_hook
 * pdo_firebird is messy and seems to even use the NORMALIZE constant in a wrong context
@mbeccati
Copy link
Contributor Author

Manually merged in 44ade0e

@mbeccati mbeccati closed this Aug 31, 2020
@mbeccati mbeccati deleted the optimize_pdo_param_evt branch August 31, 2020 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants