Skip to content

feat(tests): add integration tests and CI for proxy and plugins#11

Closed
nik-localstack wants to merge 1 commit into
unc-472-fix-ssl-copy-stallfrom
unc-472-postgresql-proxy-drain-nonblocking-ssl-reads-to-prevent-copy
Closed

feat(tests): add integration tests and CI for proxy and plugins#11
nik-localstack wants to merge 1 commit into
unc-472-fix-ssl-copy-stallfrom
unc-472-postgresql-proxy-drain-nonblocking-ssl-reads-to-prevent-copy

Conversation

@nik-localstack
Copy link
Copy Markdown

@nik-localstack nik-localstack commented May 7, 2026

Motivation

Stacked on #12. Adds test infrastructure and CI so proxy changes can be validated without relying solely on downstream projects.

Changes

  • Add integration test suite (tests/test_proxy.py, tests/test_plugins.py) covering proxy behaviour and plugin interception, including test_psql_ssl_file_batch_stress_no_hang for the UNC-472 SSL COPY scenario
  • Add conftest.py with shared fixtures (disposable PostgreSQL container)
  • Add GitHub Actions workflow to run tests on Python 3.13
  • Add make test and make test-act targets for local runs
  • Add tests/README.md with testing instructions
  • Remove old ad-hoc test files (plugins/tableau_hll/test.py, tests/test_plugin.py)

@nik-localstack nik-localstack self-assigned this May 7, 2026
@nik-localstack nik-localstack force-pushed the unc-472-postgresql-proxy-drain-nonblocking-ssl-reads-to-prevent-copy branch 11 times, most recently from e7d5df8 to 1c51600 Compare May 8, 2026 12:00
@nik-localstack nik-localstack marked this pull request as ready for review May 8, 2026 12:15
Copy link
Copy Markdown
Member

@pinzon pinzon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically, the SSL handshake is now done incrementally in the event loop instead of blocking on accept, and once it's complete we keep draining the SSL buffer until it's empty so data doesn't get stranded. Right?

Thank you for also adding a test workflow, now we can test changes here without relying solely on LS. I'll let the final say to @bentsku or @cloutierMat

Copy link
Copy Markdown
Member

@cloutierMat cloutierMat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR does a lot of things that it is hard to review properly. I think the issue is much simpler that this and there is a lot of rewrite of ssl/socket functionality that is most likely handle properly by their respective libraries.

Maybe it is also tackling many other issues I don't know of, but a bit of research revealed this stack overflow post.

My understanding here is that self.listen calls selector.select but as there might still be decrypted data left in the socket, it would be lost and unseen by select. There might be a smarter way to solve but using the sample in the post I could fix the ssl hanging issue by changing the service_connection to handle pending bytes in sock.pending

                if recv_data:
                    LOG.debug('%s received data:\n%s', conn.name, recv_data)
                    conn.received(recv_data)
                    # excerpt from https://docs.python.org/3/library/ssl.html#ssl-nonblocking
                    # Conversely, since the SSL layer has its own framing, a SSL socket may still have data available
                    # for reading without select() being aware of it. Therefore, you should first call SSLSocket.recv()
                    # to drain any potentially available data, and then only block on a select() call if still necessary.
                    while isinstance(sock, ssl.SSLSocket) and sock.pending() > 0:
                        extra = sock.recv(min(4096, sock.pending()))
                        if extra:
                            LOG.debug('%s received pending SSL data:\n%s', conn.name, extra)
                            conn.received(extra)

There might still be something cleaner we can do by calling sock.recv in listen but I think this PR is kinda overboard and makes a lot of changes that would be hard to measure the impact.

I love the addition of a pipeline and cleaner testing, but I would also argue that it would be easier to review and comment, if we would seperate it in multiple PRs

@nik-localstack nik-localstack force-pushed the unc-472-postgresql-proxy-drain-nonblocking-ssl-reads-to-prevent-copy branch from 1c51600 to 46aa6f6 Compare May 11, 2026 10:12
@nik-localstack
Copy link
Copy Markdown
Author

Thanks for the comment @cloutierMat and for the critical thinking.
Indeed this way is quite simpler and it solved the problem. I reworked the fix to match your suggestion.

While reverting, I also caught a related pre-existing bug: on macOS, accepted sockets inherit O_NONBLOCK from the listening socket which caused _handle_ssl_negotiation to raise BlockingIOError on the MSG_PEEK recv ~1/10 times. Fixed with an explicit setblocking(True) before SSL negotiation.

Total change to proxy.py is now 12 lines. Happy to split it if you still prefer or keep this as a single PR given the scope, let me know what do you think.

Copy link
Copy Markdown
Member

@cloutierMat cloutierMat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we would like this change in the monthly release next week and there are a few points to review on both the new testing and the new ci, I think we would benefit from splitting this pr.

An easier to merge approach would be to only change the proxy.py along with the README changelog and setup.py version update in a first PR. We can then push to get the new version release while we review the testing and CI in stacked PRs.

We can still run the tests from the stacked PR or by locally installing the proxy in downstream project for the meantime and validates it fixes the issue at hand.

Comment thread postgresql_proxy/proxy.py Outdated
# for reading without select() being aware of it. Therefore, you should first call SSLSocket.recv()
# to drain any potentially available data, and then only block on a select() call if still necessary.
while isinstance(sock, ssl.SSLSocket) and sock.pending() > 0:
extra = sock.recv(min(4096, sock.pending()))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I know this is from the code I shared last week, but now that I read it, I don't think the min comparison is really necessary since the recv method already works that it grabs what is present until value provided in arg. We can probably safely achieve the same result with extra = sock.recv(4096).

@nik-localstack nik-localstack force-pushed the unc-472-postgresql-proxy-drain-nonblocking-ssl-reads-to-prevent-copy branch from 46aa6f6 to 436410b Compare May 12, 2026 08:50
Co-authored-by: GitHub Copilot <copilot@github.com>
@nik-localstack nik-localstack changed the base branch from master to unc-472-fix-ssl-copy-stall May 12, 2026 08:54
@nik-localstack nik-localstack force-pushed the unc-472-postgresql-proxy-drain-nonblocking-ssl-reads-to-prevent-copy branch from 436410b to 9c2bf4a Compare May 12, 2026 08:54
@nik-localstack nik-localstack changed the title fix(postgresql-proxy): prevent SSL COPY stalls by draining nonblocking reads feat(tests): add integration tests and CI for proxy and plugins May 12, 2026
@nik-localstack
Copy link
Copy Markdown
Author

Superseded by the stacked PRs #13 (local tests) and #14 (CI), both stacked on #12 (proxy fix).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants