Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support of prepared statements #845

Merged
merged 52 commits into from
Oct 5, 2023
Merged

Conversation

knizhnik
Copy link
Contributor

@knizhnik knizhnik commented May 14, 2023

This PR is actually successor of #757 which seems to be stuck.
It contains the following fixes/improvements:

  1. Fix bug transformation of bind request
  2. Fix bug in calculating size of prepare packet
  3. Prevent possible collisions in prepared statements mapping caused by using query hash
  4. Replace prepared statement eviction algorithm (use LRU instead of sorting by bind_count)
  5. Reject prepare command with parameter types.
  6. Rewrite cache of prepared statement to avoid occasional colliions

Co-Author: @dashorst

@asdm90
Copy link

asdm90 commented May 17, 2023

@knizhnik I think you are missing the header file messages.h for the messages.c file.

@knizhnik
Copy link
Contributor Author

@knizhnik I think you are missing the header file messages.h for the messages.c file.

Sorry and thank you - fixed.

@asdm90
Copy link

asdm90 commented May 23, 2023

The file "common/uthash.h" also seems to be missing in the PR @knizhnik. It is referenced in the file "include/ps.h"

JelteF
JelteF previously requested changes May 25, 2023
Copy link
Member

@JelteF JelteF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for poking #757 again and creating this follow up one with a few fixes. That made me finally take the time to look closer at this solutinon. Because of the large amount of changes I hadn't taken such a closer look before.

I really quite like the overall design of this solution. Thank you @dashorst!

@knizhnik I do agree with the comments you left on #757

I have not tested the PR yet, partially because it's missing common/uthash.h as @asdm90 already mentioned.

I did leave a bunch of comments though (some large, some small). Once you add common/uthash.h and add a at least 1 test, I'll give it a spin and try to add some more tests while doing so.

doc/config.md Outdated Show resolved Hide resolved
include/ps.h Outdated Show resolved Hide resolved
include/bouncer.h Outdated Show resolved Hide resolved
src/messages.c Outdated Show resolved Hide resolved
src/server.c Outdated Show resolved Hide resolved
src/client.c Outdated Show resolved Hide resolved
src/client.c Outdated Show resolved Hide resolved
src/messages.c Outdated Show resolved Hide resolved
src/client.c Outdated
Comment on lines 1071 to 1076
if (client->packet_cb_state.flag == CB_HANDLE_COMPLETE_PACKET) {
/* pkt already read from sbuf with cb */
client->packet_cb_state.flag = CB_NONE;
} else {
sbuf_prepare_skip(sbuf, pkt->len);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand what this is supposed to do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 1071-1076 are new to me, reading from sbuf should already be handled at this point, so not sure either.

About lines 1064 and the switch block on line 1065:
The original code of handle_client_work begins with a switch (pkt->type) to handle different packet types from the client. At this point there is not yet a server connection bound to the client connection. Inside this first switch block some initial processing/checking is done with regards to prepared statement support.
After this first switch block a server connection is acquired to forward the client packet to the server.
Line 1064 and the switch block starting on line 1065 are basically doing the actual packet rewriting to transform the packet received from the client to match the specific server connection acquired (e.g. client -> bind pkt for named ps S_1 -> pgbouncer -> sent bind pkt for named ps B_7 to server).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear I specifically meant 1071-1076. The part above made sense and it took me a bit to figure out why the code was split in two pieces exactly, but in the end it made sense. Your explanation is nice though and would have helped me understand it easier. Let's add something similar as a comment.

src/client.c Outdated Show resolved Hide resolved
@JelteF
Copy link
Member

JelteF commented May 25, 2023

Sidenote regarding uthash.h. I think it's better if it's included as a gitsubmodule. But feel free to skip that for now if simply committing the PR is easier for you. At least the PR will compile then.

@sboschman
Copy link
Contributor

I believe the PR by @dashorst was crafted from this patch repo. Which has had some updates since then.

Fix:
Instead of direct malloc/free calls the Slab memory store from libusual is used. After a recent os update (glibc maybe?) we sometimes see that pgbouncer keeps claiming memory from the os (until OOM). It seems like freed memory is never again reused for a future malloc call, and malloc keeps requesting more and more memory from the os.

Feat:
The admin console has two new commands to show some parse/bind statistics for client and server sockets. Which at least give a starting point to reason about how effect the ps caching is for a workload.

@dashorst
Copy link
Contributor

I believe the PR by @dashorst was crafted from this patch repo. Which has had some updates since then.

Yes. This PR was crafted on the request of the pgbouncer team to make it easier to see the changes. So I cloned the pgbouncer repo and applied the patch in the patch repo, and crafted this PR.

@JelteF
Copy link
Member

JelteF commented May 26, 2023

Fix: Instead of direct malloc/free calls the Slab memory store from libusual is used. After a recent os update (glibc maybe?) we sometimes see that pgbouncer keeps claiming memory from the os (until OOM). It seems like freed memory is never again reused for a future malloc call, and malloc keeps requesting more and more memory from the os.

Feat: The admin console has two new commands to show some parse/bind statistics for client and server sockets. Which at least give a starting point to reason about how effect the ps caching is for a workload.

This sounds like 2 useful changes.

@JelteF
Copy link
Member

JelteF commented May 26, 2023

I tried creating a basic test. Just to play with this a bit:

import psycopg
import pytest

def test_prepared_statement(bouncer):
    bouncer.admin(f"set pool_mode=transaction")
    prepared_query = "SELECT 1"
    with bouncer.cur() as cur1:
        with bouncer.cur() as cur2:
            cur1.execute(prepared_query, prepare=True)
            cur1.execute(prepared_query)
            with cur2.connection.transaction():
                cur2.execute("SELECT 2")
                cur1.execute(prepared_query)

But there's an issue in the response we return (probably what we return to the Parse). Because the prepare=True line fails with:

psycopg.errors.ProtocolViolation: insufficient data left in message

@sboschman
Copy link
Contributor

I tried creating a basic test. Just to play with this a bit:

import psycopg
import pytest

def test_prepared_statement(bouncer):
    bouncer.admin(f"set pool_mode=transaction")
    prepared_query = "SELECT 1"
    with bouncer.cur() as cur1:
        with bouncer.cur() as cur2:
            cur1.execute(prepared_query, prepare=True)
            cur1.execute(prepared_query)
            with cur2.connection.transaction():
                cur2.execute("SELECT 2")
                cur1.execute(prepared_query)

But there's an issue in the response we return (probably what we return to the Parse). Because the prepare=True line fails with:

psycopg.errors.ProtocolViolation: insufficient data left in message

Missing check (server.c, case '1': /* ParseComplete */ requires to check if ps support is enabled/used ), although I assume you are running without the ps code enabled to trigger this. After adding the check I get a expected: psycopg.errors.InvalidSqlStatementName: prepared statement "_pg3_0" does not exist

Running your test with the ps code enabled works, at least with my current 'patch', which is not the code in this pr.

Changes to test.ini:

; pool_mode = statement
pool_mode = transaction
disable_prepared_statement_support = 0

@knizhnik
Copy link
Contributor Author

I tried creating a basic test. Just to play with this a bit:

import psycopg
import pytest

def test_prepared_statement(bouncer):
    bouncer.admin(f"set pool_mode=transaction")
    prepared_query = "SELECT 1"
    with bouncer.cur() as cur1:
        with bouncer.cur() as cur2:
            cur1.execute(prepared_query, prepare=True)
            cur1.execute(prepared_query)
            with cur2.connection.transaction():
                cur2.execute("SELECT 2")
                cur1.execute(prepared_query)

But there's an issue in the response we return (probably what we return to the Parse). Because the prepare=True line fails with:

psycopg.errors.ProtocolViolation: insufficient data left in message

Thank you for creating test - there was really my bug in handle_bind_command. Looks like in prepared statement name generated by pgbench has the same length as one generated by pgbouncer. This is why pgbench with -M prepared worked. Fixed.

By the way, pgbouncer python tests are not working with the latest version of pytest:

PYTHONIOENCODING=utf8 /usr/bin/pytest -n auto
ERROR: usage: pytest [options] [file_or_dir] [file_or_dir] [...]
pytest: error: unrecognized arguments: -n

Any plans to fix it?

@JelteF
Copy link
Member

JelteF commented May 30, 2023

By the way, pgbouncer python tests are not working with the latest version of pytest:

The -n flag comes from the pytest-xdist plugin. Did you run pip install -r requirements.txt. That should install it automatically.

@sboschman
Copy link
Contributor

Have you considered the implications of using LRU vs LFU when replacing the eviction policy? The eviction policy was very crude and might just work together with a certain workload and tuned pgbouncer and application settings. But LRU might not work for everyone...

A good read on cache admission and eviction policies is this paper. With regard to LRU it states:

Yet, under many workloads, LRU requires much larger caches than LFU in order to obtain the same hit-ratio.

Increasing the server side cache size also places a burden on the postgresql server itself, not only on the pgbouncer process. Hence, we are a bit reluctant to run pgbouncer with huge server side prepared statement caches per server connection. All though it is hard to measure the exact effects as so many variables are involved.

One option would be to make the eviction policy configurable, so people can test what works best for their workload. The other option, which is more work, would be to implement e.g. a (minimal) W-TinyLFU policy and not burden users with figuring out what their workload needs. Or at least implement some sort of doorkeeper to minimise the risk of high frequency items being flushed out of the cache by bursts of less frequent items. Keep in mind that e.g. the pgjdbc driver also uses prepared statements to optimise a batch, and after the batch completes the prepared statement is never reused.

@JelteF
Copy link
Member

JelteF commented May 30, 2023

, although I assume you are running without the ps code enabled to trigger this.

Yes you are right I forgot to enable the prepared statement support in the test.

@knizhnik
Copy link
Contributor Author

By the way, pgbouncer python tests are not working with the latest version of pytest:

The -n flag comes from the pytest-xdist plugin. Did you run pip install -r requirements.txt. That should install it automatically.

Sorry, but it still doesn't help.
Now I get error:

PYTHONIOENCODING=utf8 /home/knizhnik/.local/bin/pytest -n auto
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/home/knizhnik/.local/lib/python3.10/site-packages/_pytest/main.py", line 265, in wrap_session
INTERNALERROR>     config._do_configure()
INTERNALERROR>   File "/home/knizhnik/.local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 1046, in _do_configure
INTERNALERROR>     self.hook.pytest_configure.call_historic(kwargs=dict(config=self))
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/hooks.py", line 308, in call_historic
INTERNALERROR>     res = self._hookexec(self, self.get_hookimpls(), kwargs)
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 92, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 83, in <lambda>
INTERNALERROR>     self._inner_hookexec = lambda hook, methods, kwargs: hook.multicall(
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 208, in _multicall
INTERNALERROR>     return outcome.get_result()
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 80, in get_result
INTERNALERROR>     raise ex[1].with_traceback(ex[2])
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 187, in _multicall
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pytest_benchmark/plugin.py", line 442, in pytest_configure
INTERNALERROR>     bs = config._benchmarksession = BenchmarkSession(config)
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pytest_benchmark/session.py", line 32, in __init__
INTERNALERROR>     self.logger = Logger(self.verbose, config)
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pytest_benchmark/logger.py", line 18, in __init__
INTERNALERROR>     self.term = py.io.TerminalWriter(file=sys.stderr)
INTERNALERROR> AttributeError: module 'py' has no attribute 'io'
make: *** [Makefile:150: check] Error 3

@JelteF
Copy link
Member

JelteF commented May 30, 2023

@knizhnik it seems that error is coming from pytest-benchmark. Which isn't needed for pgbouncer tests and is apparently installed globally for you (based on the path in the stacktrace). Some references to this issue:

  1. pytest-benchmark: AttributeError: module 'py' has no attribute 'io' pytest-dev/pytest#10420
  2. Compatibility issue with pytest 7.2: Package py is used but not listed as dependency ionelmc/pytest-benchmark#226

You'll probably want to use a virtualenv to isolate the python environment for running pgbouncer tests from your system setup:

  1. either the one built into python: https://docs.python.org/3/library/venv.html
  2. or the original one: https://virtualenv.pypa.io/en/latest/

@knizhnik
Copy link
Contributor Author

@knizhnik it seems that error is coming from pytest-benchmark. Which apparently is installed globally for you (based on the path in the stacktrace). Some references to this issue:

1. [`pytest-benchmark`: AttributeError: module 'py' has no attribute 'io' pytest-dev/pytest#10420](https://github.com/pytest-dev/pytest/issues/10420)

2. [Compatibility issue with pytest 7.2: Package `py` is used but not listed as dependency ionelmc/pytest-benchmark#226](https://github.com/ionelmc/pytest-benchmark/issues/226)

You'll probably want to use a virtualenv to isolate the python environment for running pgbouncer tests from your system setup:

1. either the one built into python: https://docs.python.org/3/library/venv.html

2. or the original one: https://virtualenv.pypa.io/en/latest/

I am quite sure that it is some problem with my python installation.
Sorry for false alarm.
But I was able to run this test manually so now the problem with "insufficient data left in message" is solved

@knizhnik
Copy link
Contributor Author

implications

Both LRU and LFU have their pros and cons.
I have replaced eviction algorithm in the original PR #757 because it is expensive (does sort of all pepared statements at each eviction) and is obviously wrong because most likely will evict just added prepared statement as it has the smallest access counter. To correctly implement LFU we need to use something like clock algorithm.
But it has its own drawback: when size of cache is large and it is completely filled, then we need to perform many cycles through larger number of items before been able to locate victim for eviction.

Usually clock is prefered to LRU because it doesn't require global lock.
But pgbouncer is single threaded and we do not need synchronization at all.

I think that for prepared statement LRU is really better choice.
Usually number of prepared statement is used to be not so large.
So we do not need to perform eviction at all.
Or do it only for deteriorated prepared statements: right now we do not keep track about all server where prepared statements is registered. So if some session is closed, there still can be prepared statements created by this session at some servers. Sooner or later them should be evicted. And here LRU will do it best.

}
else if (HASH_COUNT(server->server_prepared_statements) != 1) /* do nothing if there is just one prepared statement */
{
/* Maintain LRU double-linke list: move element to the end of the list */
Copy link
Member

@JelteF JelteF May 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll create a thread here to keep the cache eviction mechanism discussion focussed in one place.

@sboschman Regarding the cache algorithm, I agree there's many trade-offs here and supporting different eviction mechanisms would be nice. But I believe for the initial version of this prepared statement feature LRU is definitely good enough. The main case where LRU caches are problematic are when the full size of the cache is scanned, thus completely evicting the current set. This is especially problematic if this happens often.

I think it's pretty unlikely for that to happen for a decently sized (>=100 items) prepared statement cache.

I agree with @knizhnik that LRU is at least better than least-often-used one, because in practice that will never evict existing items.

Copy link
Member

@JelteF JelteF May 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear. I think it would be beneficial to support multiple cache evection algorithms, but I couldn't quickly find a C implementation of W-TinyLFU. Having an implementation of that in PgBouncer/libusual might be desirable if LRU really has some problems for people in real world scenarios. But requiring that for this initial PR seems unnecessary and will only make this PR harder to review.

src/prepared_statement.c Outdated Show resolved Hide resolved
src/prepared_statement.c Outdated Show resolved Hide resolved
@knizhnik
Copy link
Contributor Author

knizhnik commented Jun 1, 2023

I believe the PR by @dashorst was crafted from this patch repo. Which has had some updates since then.

Yes. This PR was crafted on the request of the pgbouncer team to make it easier to see the changes. So I cloned the pgbouncer repo and applied the patch in the patch repo, and crafted this PR.

@dashorst First of all thank you for your work - I started implementation of prepared statements for pgbouncer myself, found ti to be more complicated than I originally estimated and then found your repository.

So there is now your repo: https://github.com/topicusonderwijs/pgbouncer-ps-patch,
your PR #757 and this my PR based on it.

I think it will be nice if we can somehow coordinate our efforts to add prepared statements support to pgbouncer.
I have created my PR just because there were some bugs and problems in your implementation which I want to fix but I can not make changes in your branch. Now because of reviewer suggestions and requests this PRs becomes more and more divergent from each other.

@JelteF
Copy link
Member

JelteF commented Jun 1, 2023

Currently tests in CI are all failing for a obvious reason. Could you fix that one, to see if it the current tests still pass with these changes. The current error is:

prepared_statement_cache_size is missing in doc/config.md

@knizhnik
Copy link
Contributor Author

knizhnik commented Jun 1, 2023

Currently tests in CI are all failing for a obvious reason. Could you fix that one, to see if it the current tests still pass with these changes. The current error is:

prepared_statement_cache_size is missing in doc/config.md

Sorry, fixed

include/messages.h Outdated Show resolved Hide resolved
src/messages.c Outdated Show resolved Hide resolved
src/messages.c Outdated Show resolved Hide resolved
src/messages.c Outdated Show resolved Hide resolved
src/server.c Outdated Show resolved Hide resolved
doc/config.md Outdated Show resolved Hide resolved
Copy link
Member

@JelteF JelteF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm planning to merge and release this tomorrow morning. @petere If you want to take a final look, please do so before then.

.cirrus.yml Show resolved Hide resolved
eulerto and others added 3 commits October 5, 2023 14:14
Fix a few typos. Rewrite some sentences. Replace the future tense with
the present tense in a few sentences.

Remove a reference to an in-progress patch for PostgreSQL that will
avoid an issue with this feature. We should add a reference to it when
such feature is merged in Postgres.
@JelteF JelteF merged commit 6070802 into pgbouncer:master Oct 5, 2023
5 of 7 checks passed
@dashorst
Copy link
Contributor

Congratulations on landing the PR! Looking forward to using it in our systems!

@NikolayS
Copy link

NikolayS commented Oct 13, 2023

This is big – thanks everyone involved (@knizhnik particularly)

@JelteF
Copy link
Member

JelteF commented Oct 16, 2023

Yes, thanks everyone who helped getting this PR in. Not just the people that wrote code, but also the ones that did reviews and tried it out with various clients.

I released v1.21.0 today, which includes this feature! 🎉

JelteF added a commit that referenced this pull request Feb 22, 2024
In #845 full tracking of expected responses was implemented. This turned
out to have bugs when sending `COPY FROM STDIN` queries using the
extended protocol. [The protocol explicitly allows sending `Sync`
messages after sending an `Execute` message for a `COPY FROM STDIN`
command][1]. Such Sync messages are ignored by the server until a
CopyDone or CopyFail message is received. The new command queue tracking
did not take this into account, and was incorrectly expecting more
ReadyForQuery messages from the server.

This could lead to two problems for the server connection on which the
`COPY FROM STDIN` was sent:
1. The server connection would never be unassigned from the client.
2. Memory usage for this server connection would continue to increase
until the client disconnects.
3. If `max_prepared_statements` was non-zero, then clients might receive
ParseComplete and CloseComplete responses in the wrong order. This
wouldn't result in wrong behaviour by the server, but the client might
get confused and close the connection.

This PR fixes these issues. It also removes the rfq_count tracking,
since it's now only keeping track of a strict subset of what the
outstanding request queue was tracking.

[1]: https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-COPY

Fixes #1023
Fixes #1024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet