Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make data node calls non-blocking and interruptible #5058

Closed
wants to merge 11 commits into from

Conversation

erimatnor
Copy link
Contributor

@erimatnor erimatnor commented Dec 7, 2022

Refactor the use of libpq calls to data nodes so that they honor PostgreSQL interrupt handling (e.g., ctrl-c or statement_timeout) and don't block unnecessarily.

To implement this behavior, data node connections are made non-blocking by default and all libpq functions are wrapped to integrate with PostgreSQL's signal handling (via WaitEventSets) when waiting for read or write readiness.

A change is also made to the life-cycle management of libpq objects, including connections, and remote query results. Instead of tying these to transactions, they are now tied to the life-cycle of memory contexts using a callback. In most cases, the memory context a connection is allocated on has the same lifetime as transactions, but not always. For example, the connection cache lives across connections and is using a longer lived memory context. Previously that was handled as a special case where connections were marked to not auto close on transaction end.

Closes #4958
Closes #2757

@erimatnor erimatnor self-assigned this Dec 7, 2022
@erimatnor erimatnor force-pushed the non-blocking-connections branch 11 times, most recently from ec5f276 to 31b8339 Compare December 13, 2022 12:49
@codecov
Copy link

codecov bot commented Dec 13, 2022

Codecov Report

Merging #5058 (d1c41c4) into main (043092a) will decrease coverage by 0.22%.
The diff coverage is 74.28%.

❗ Current head d1c41c4 differs from pull request most recent head 1852950. Consider uploading reports for the commit 1852950 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5058      +/-   ##
==========================================
- Coverage   89.05%   88.84%   -0.22%     
==========================================
  Files         225      225              
  Lines       51853    52052     +199     
==========================================
+ Hits        46180    46246      +66     
- Misses       5673     5806     +133     
Impacted Files Coverage Δ
tsl/src/remote/dist_copy.c 81.38% <45.45%> (-6.82%) ⬇️
tsl/test/src/remote/txn_resolve.c 96.82% <50.00%> (-3.18%) ⬇️
tsl/src/remote/connection.c 83.47% <75.20%> (-1.56%) ⬇️
tsl/src/remote/txn.c 87.57% <76.00%> (-1.70%) ⬇️
tsl/src/remote/copy_fetcher.c 87.33% <81.25%> (+2.49%) ⬆️
tsl/src/data_node.c 95.84% <81.81%> (-0.10%) ⬇️
tsl/src/remote/async.c 78.37% <84.09%> (-0.06%) ⬇️
tsl/src/nodes/data_node_dispatch.c 92.97% <100.00%> (+0.14%) ⬆️
tsl/src/remote/connection_cache.c 94.28% <100.00%> (+0.03%) ⬆️
tsl/src/remote/dist_txn.c 91.90% <100.00%> (+0.03%) ⬆️
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 043092a...1852950. Read the comment docs.

@erimatnor erimatnor force-pushed the non-blocking-connections branch 11 times, most recently from 1a9b13f to 677e75a Compare December 16, 2022 15:17
@erimatnor erimatnor added tech-debt Needs refactoring and improvement tasks related to the source code and its architecture. enhancement An enhancement to an existing feature for functionality labels Dec 16, 2022
Copy link
Contributor

@nikkhils nikkhils left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had one comment. Rest looks good otherwise to me.

Copy link
Contributor

@pmwkaa pmwkaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, I've left some comments

ResetLatch(MyLatch);
CHECK_FOR_INTERRUPTS();
}
if (event.events & WL_SOCKET_READABLE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should check exactly then the read event came from the connection socket here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is only the one possibility, but I can assert it.


do
{
ret = PQisBusy(conn->pg_conn);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part looks good, I was mostly curious about it

tsl/src/remote/connection.c Show resolved Hide resolved
tsl/src/remote/connection.c Show resolved Hide resolved
tsl/src/remote/connection.c Show resolved Hide resolved
tsl/src/remote/connection.c Show resolved Hide resolved
tsl/src/remote/connection.c Show resolved Hide resolved
@@ -1809,13 +2024,14 @@ remote_connection_drain(TSConnection *conn, TimestampTz endtime, PGresult **resu
/* In what follows, do not leak any PGresults on an error. */
PG_TRY();
{
ModifyWaitEvent(conn->wes, conn->sockeventpos, WL_SOCKET_READABLE, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this? all the data could be in the buffer already

tsl/src/remote/connection.c Show resolved Hide resolved
tsl/src/remote/connection.c Show resolved Hide resolved
@erimatnor erimatnor force-pushed the non-blocking-connections branch 2 times, most recently from 7f1ad87 to 37d6056 Compare January 19, 2023 12:56
@erimatnor erimatnor changed the title Make data node calls non-blocking and interruptable Make data node calls non-blocking and interruptible Jan 19, 2023
@erimatnor erimatnor force-pushed the non-blocking-connections branch 7 times, most recently from fb156c6 to b6e0802 Compare January 25, 2023 12:52
Broken code caused the async connection module to never send queries
using prepared statements. Instead, queries were sent using the
parameterized query statement instead.

Fix this so that prepared statements are used when created.
Refactor the use of libpq calls to data nodes so that they honor
PostgreSQL interrupt handling (e.g., ctrl-c or `statement_timeout`)
and don't block unnecessarily.

To implement this behavior, data node connections are made
non-blocking by default and all `libpq` functions are wrapped to
integrate with PostgreSQL's signal handling (via `WaitEventSets`) when
waiting for read or write readiness.

A change is also made to the life-cycle management of `libpq` objects,
including connections, and remote query results. Instead of tying
these to transactions, they are now tied to the life-cycle of memory
contexts using a callback. In most cases, the memory context a
connection is allocated on has the same lifetime as transactions, but
not always. For example, the connection cache lives across connections
and is using a longer lived memory context. Previously that was
handled as a special case where connections were marked to not auto
close on transaction end.

Closes timescale#4958
Closes timescale#2757
@erimatnor
Copy link
Contributor Author

Splitting this up into smaller PRs.

@erimatnor erimatnor closed this Jan 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An enhancement to an existing feature for functionality multinode tech-debt Needs refactoring and improvement tasks related to the source code and its architecture.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Graceful command exit using CTRL+C on MN Make data node connection establishment non-blocking
4 participants