Plug connection leaks found during profiling #582
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, February 5, 2021 11:37 AM, Dimitri Fontaine ***@***.***> wrote:
Hey Georgios,
You're making good points here, and the analysis is solid, but as you saw our goal has shifted and the code has not been maintained well enough to show our current intentions. Sorry about that.
It happens to all software :)
So the real problem is leaking connections and connection structs. Without this patch,
several tens of KB were leaked (definitely lost) every few loops under some test scenarios.
This can become a substantial memory leak in a long-running process. That is what this PR
is plugging.
> It seems that pgsql_execute_with_params() has been altered inconsistently
> over its lifetime. The latest version notes in the comments that the
> connection is not persistent, to facilitate error handling. However, that was not
> entirely true, and several parts of the code assumed it not to be true. Others
> assumed it to be true and failed to release the connection once used.
If you run `pg_autoctl` in DEBUG mode (using `-vv` for very verbose) you will see connections and disconnections made in the log messages. The idea is that we should refrain from a series of connect, disconnect, connect, disconnect, connect, disconnect within the same work unit. That's not the best way to use Postgres; we should be smart enough to re-use a connection and manage the client-side libpq clean-up that is necessary.
Absolutely. You should not leak memory though. My rough read is that in order
to achieve that, a small redesign of the interface will be needed. IMHO, it might
make sense to prevent the memleaks while the redesign is taking place which can
end up being a while. Again, only an opinion :)
> For the sake of clarity, the function will now explicitly close the connection
> that it has used, regardless of whether it is a new or existing connection. That
> simplifies most of the code and plugs the connection leaks.
We started from that much simpler implementation yes, and then made it more complex to be able to re-use a previously established connection.
> It also uncovers an inconsistency in the connections used for notification. The
> code mixed the connection it was using to listen to events from the monitor
> with other connections. A new PGconn member has been added in the monitor struct to
> distinguish between the two distinct cases.
What's wrong with running queries in the same connection when you LISTEN for changes?
Nothing, provided you are not leaking. If you try to plug the leaks with the current
interface, you will stop listening to events in parts of the code where it is not desired
to stop listening to events. Or that has been my understanding.
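The separation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeConn`, `fake_connect`, `fake_finish`, `Client`, `MonitorSketch`, `run_query_once` and `start_listening` are all hypothetical stand-ins (for libpq's `PGconn` and for the project's real structs and functions), not the actual pg_auto_failover API. The point is that closing the query connection after each use leaves the listening connection alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a libpq PGconn. */
typedef struct FakeConn
{
	bool listening;
} FakeConn;

/* Number of connections currently open; a leak shows up as a non-zero rest. */
static int liveConnections = 0;

static FakeConn *
fake_connect(void)
{
	FakeConn *conn = calloc(1, sizeof(FakeConn));

	if (conn != NULL)
	{
		++liveConnections;
	}
	return conn;
}

static void
fake_finish(FakeConn *conn)
{
	if (conn != NULL)
	{
		--liveConnections;
		free(conn);
	}
}

typedef struct Client
{
	FakeConn *connection;
} Client;

/* Hypothetical monitor struct holding two distinct clients. */
typedef struct MonitorSketch
{
	Client queryClient;         /* short-lived: closed after each query */
	Client notificationClient;  /* long-lived: keeps the LISTEN session */
} MonitorSketch;

/* Run one query: open a connection if needed, then always close it. */
static bool
run_query_once(Client *client)
{
	if (client->connection == NULL)
	{
		client->connection = fake_connect();
		if (client->connection == NULL)
		{
			return false;
		}
	}

	/* ... the query would be sent and its result consumed here ... */

	fake_finish(client->connection);
	client->connection = NULL;
	return true;
}

/* Open the notification connection if needed and mark it as listening. */
static bool
start_listening(Client *client)
{
	if (client->connection == NULL)
	{
		client->connection = fake_connect();
		if (client->connection == NULL)
		{
			return false;
		}
	}
	client->connection->listening = true;
	return true;
}
```

With this split, any number of `run_query_once` calls on `queryClient` leaves exactly one live connection, the one owned by `notificationClient`, until that one is closed explicitly.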
Okay, between your message and a chat with @JelteF I am now convinced that we should 1. plug the leak and 2. find a principled way to be smart about re-using connections when that's better.
Yeah that's a good point too.
Please rebase to current master and let's see about the CI before merging!
Thinking more about it, I think this breaks some of our retry loops, such as the one in keeper_register_and_init
in https://github.com/citusdata/pg_auto_failover/blob/master/src/bin/pg_autoctl/keeper.c#L1231
src/bin/pg_autoctl/keeper_pg_init.c
Outdated
@@ -546,6 +546,7 @@ wait_until_primary_is_ready(Keeper *keeper,
	KeeperStateData *keeperState = &(keeper->state);
	int timeoutMs = PG_AUTOCTL_KEEPER_SLEEP_TIME * 1000;

	(void) pgsql_listen(&(monitor->notificationClient), &((char *){ 0 }));
We need at least a comment that explains why it's okay to send an empty list of channels here, and then I would rather avoid this advanced notation and just use a `const char *channels[] = { 0 };` variable.
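For reference, the compound literal `&((char *){ 0 })` yields the address of a single NULL `char *`, which is the same shape as a one-element array holding only the terminator. A minimal sketch of the more explicit form follows. `count_channels` is a hypothetical helper written for this illustration (it is not part of pg_auto_failover), and it assumes the listen function walks a NULL-terminated list.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: counts the entries of a NULL-terminated channel
 * list, the kind of loop a listen function would presumably run.
 */
static int
count_channels(const char *const channels[])
{
	int count = 0;

	for (int i = 0; channels[i] != NULL; i++)
	{
		++count;
	}
	return count;
}

/*
 * An explicitly named empty list: it holds only the NULL terminator, so
 * no LISTEN command would be issued for it.
 */
static const char *const emptyChannelList[] = { NULL };

/* A regular list with one channel, for comparison. */
static const char *const stateChannels[] = { "state", NULL };
```

Naming the variable makes the intent (open a connection, listen to nothing yet) visible at the call site, which is what the review comment asks for.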
Yeah, that's clearly a hack. I only added it as a quick way to open a connection as there was no proper interface available.
Instead of hacking pgsql_listen(), how about exposing a lower level interface like pgsql_open_connection()?
Yeah we don't want to expose the lower-level interface. The whole point is that ensuring libpq-level clean-up and “lifetime management” is pretty hard, so we want to have that all sit in the same place and remain kind of opaque to the higher levels.
Also I'm not sure why we now have an explicit call to `pgsql_listen` that we didn't have before. Maybe that's where the new API could happen?
src/bin/pg_autoctl/service_keeper.c
Outdated
 * Finally make establish a connection for notifications in case it had
 * closed before
 */
(void) pgsql_listen(&(keeper->monitor.notificationClient), &((char *){ 0 }));
Same as before, I'm not happy with this notation, let's make it a whole lot more explicit please?
I took time to review the changes and the incompatibility with our manual transaction handling that we do in places. Do you want to update the PR?
src/bin/pg_autoctl/pgsql.c
Outdated
PQfinish(pgsql->connection);
pgsql->connection = NULL;
In a couple of places in the code we are handling transactions to sync local state file creation with transaction commit on the monitor. When the local activity fails, we ROLLBACK the transaction on the monitor. I think we need to track if an explicit transaction is being used in our PGSQL object and provide `pgsql_begin`, `pgsql_commit`, and `pgsql_rollback` functions, and then have the `PQfinish` call in `pgsql_execute_with_params` depend on whether an explicit transaction is in flight or not.
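The idea in this comment can be sketched as below. It is a minimal illustration under stated assumptions, not the code that was merged: `FakeConn` stands in for libpq's `PGconn`, and `PgsqlSketch` with its `sketch_*` functions are hypothetical names loosely modelled on the `pgsql_begin` / `pgsql_commit` / `pgsql_rollback` functions proposed above. The connection is closed after each statement unless an explicit transaction is in flight, and a nested begin is refused as a caller bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a libpq PGconn. */
typedef struct FakeConn
{
	int unused;
} FakeConn;

typedef struct PgsqlSketch
{
	FakeConn *connection;
	bool inTransaction;     /* true between begin and commit/rollback */
	int finishCount;        /* how many times the connection was closed */
} PgsqlSketch;

/* Open the connection when there is none yet; re-use it otherwise. */
static bool
sketch_open(PgsqlSketch *pgsql)
{
	if (pgsql->connection == NULL)
	{
		pgsql->connection = calloc(1, sizeof(FakeConn));
	}
	return pgsql->connection != NULL;
}

/* Close the connection, playing the role of PQfinish() plus the reset. */
static void
sketch_finish(PgsqlSketch *pgsql)
{
	if (pgsql->connection != NULL)
	{
		free(pgsql->connection);
		pgsql->connection = NULL;
		++pgsql->finishCount;
	}
}

/* Execute one statement; only close when no explicit transaction is open. */
static bool
sketch_execute(PgsqlSketch *pgsql)
{
	if (!sketch_open(pgsql))
	{
		return false;
	}

	/* ... the statement would be sent here ... */

	if (!pgsql->inTransaction)
	{
		sketch_finish(pgsql);
	}
	return true;
}

/* Start an explicit transaction; a nested begin is reported as a bug. */
static bool
sketch_begin(PgsqlSketch *pgsql)
{
	if (pgsql->inTransaction)
	{
		return false;
	}
	if (!sketch_open(pgsql))
	{
		return false;
	}

	/* ... BEGIN would be sent here ... */
	pgsql->inTransaction = true;
	return true;
}

/* End the transaction (commit or rollback), then release the connection. */
static bool
sketch_end(PgsqlSketch *pgsql, bool commit)
{
	if (!pgsql->inTransaction)
	{
		return false;
	}

	/* ... COMMIT would be sent when commit is true, ROLLBACK otherwise ... */
	(void) commit;

	pgsql->inTransaction = false;
	sketch_finish(pgsql);
	return true;
}
```

A client-side flag is used here for simplicity. libpq also offers `PQtransactionStatus()`, which could presumably serve the same purpose by asking the connection itself.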
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, March 15, 2021 1:17 PM, Dimitri Fontaine ***@***.***> wrote:
@DimCitus requested changes on this pull request.
I took time to review the changes and the incompatibility with our manual transaction handling that we do in places. Do you want to update the PR?
Thank you. Let me try to get a fresh look at it.
---------------------------------------------------------------
In [src/bin/pg_autoctl/pgsql.c](#582 (comment)):
> + PQfinish(pgsql->connection);
+ pgsql->connection = NULL;
In a couple of places in the code we are handling transactions to sync local state file creation with transaction commit on the monitor. When the local activity fails, we ROLLBACK the transaction on the monitor. I think we need to track if an explicit transaction is being used in our PGSQL object and provide pgsql_begin, pgsql_commit, and pgsql_rollback functions, and then have the PQfinish call in pgsql_execute_with_params depend on whether an explicit transaction is in flight or not.
Yeah, it seems that the naive, 'use once' approach will not cut it if a transaction has to be open.
Let me try some more targeted valgrind runs and see if I can catch the actual offender(s) before
attempting a re-write of the API.
In case that's needed, I think most call sites would remain exactly the same as today. Only those where we issue manual BEGIN/ROLLBACK/COMMIT instructions would have to change. That's not too many, I think I can only find one...
78c1e76
to
8b61e50
Compare
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, March 15, 2021 5:58 PM, Dimitri Fontaine ***@***.***> wrote:
> Yeah, it seems that the naive, 'use once' approach will not cut it if a transaction has to be open. Let me try some more targeted valgrind runs and see if I can catch the actual offender(s) before attempting a re-write of the API.
In case that's needed, I think most call sites would remain exactly the same as today. Only those where we issue manual BEGIN/ROLLBACK/COMMIT instructions would have to change. That's not too many, I think I can only find one...
Thank you for looking and apologies for the delay.
I rebased the current branch (I know it is public, but I am assuming no one has used it :)) and tried to address the comments.
Valgrind seems happy. However I am getting some flakiness in the test_022_detect_network_partition test. Let the infra run
the tests and see what it says.
//Georgios
8b61e50
to
ccec708
Compare
@DimCitus Now that was a bit embarrassing. I had previously pushed only the rebase and not the changes. Please find a fresh push with a new rebase and the requested changes. Valgrind is still happy and the tests do pass locally. Let us wait for the CI to conclude.
Thanks @gkokolatos ! Your approach/API looks better than my own attempt yesterday. I think it'd be good to rename and improve the `emptyChannels` to `char *emptyChannelsList = { NULL };` but that's a minor issue.
Some of the failures I see seem related to a merge error, typically:
And then you need to run `make indent` again apparently.
And another one is the Travis infamous one, where your PR is running some processes in DEBUG mode:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, March 24, 2021 11:40 AM, Dimitri Fontaine ***@***.***> wrote:
> @DimCitus Now that was a bit embarrassing. I had previously pushed only the rebase and not the changes.
> Please find a fresh push with a new rebase and the requested changes.
Thanks @gkokolatos ! Your approach/API looks better than my own attempt yesterday. I think it'd be good to rename and improve the `emptyChannels` to `char *emptyChannelsList = { NULL };` but that's a minor issue.
Thank you! Fixed.
> Valgrind is still happy and the tests do pass locally. Let us wait for the CI to conclude.
Some of the failures I see seem related to a merge error, typically:
======================================================================
FAIL: test_extension_update.test_001_update_extension
----------------------------------------------------------------------
Traceback (most recent call last):
File "/opt/python/3.7.6/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
self.test(*self.arg)
File "/home/travis/build/citusdata/pg_auto_failover/tests/test_extension_update.py", line 41, in test_001_update_extension
eq_(results, [("dummy",)])
AssertionError: [('1.5.0.2',)] != [('dummy',)]
----------------------------------------------------------------------
Ran 34 tests in 74.179s
Let me have a look at that.
And then you need to run `make indent` again apparently.
Too true. Fixed
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, March 24, 2021 12:51 PM, ***@***.***> wrote:
Let me have a look at that.
There exists some flakiness for sure.
What really caught my eye was test_multi_ifdown.test_011_prepare_candidate_priorities,
which had many successful runs (e.g. 2887.1 and 2887.2), but in 2887.3 a
deadlock was detected.
Log follows which can also be seen here (https://travis-ci.com/github/citusdata/pg_auto_failover/jobs/493356759):
12:01:24 28140 ERROR Monitor ERROR: deadlock detected
12:01:24 28140 ERROR Monitor DETAIL: Process 28157 waits for ShareLock on transaction 3012; blocked by process 28156.
12:01:24 28140 ERROR Monitor Process 28156 waits for ExclusiveLock on advisory lock [16385,822708183,0,11]; blocked by process 28157.
12:01:24 28140 ERROR Monitor HINT: See server log for query details.
12:01:24 28140 ERROR Monitor CONTEXT: while updating tuple (0,17) in relation "node"
12:01:24 28140 ERROR Monitor SQL statement "UPDATE pgautofailover.node SET candidatepriority = $1, replicationquorum = $2 WHERE nodeid = $3 and nodehost = $4 AND nodeport = $5"
12:01:24 28140 ERROR Failed to update node candidate priority on node "node_2"in formation "default" for candidate_priority: "100"
12:01:24 28140 ERROR Failed to set "candidate-priority" to "100".
I am not certain if and how this is related to the current PR,
but it seems serious enough to demand some scrutiny. I will be
taking a look but feel free to weigh in, if you want.
I did another round of review, focusing on some details that seem to require more attention.
src/bin/pg_autoctl/keeper_pg_init.c
Outdated
char *emptyChannelList[] = { NULL };

(void) pgsql_listen(&(monitor->notificationClient), emptyChannelList);
By the way, how does it work now? We are listening to no channel at all, how do we expect to get a notification that some state change happened in our formation and group, or even for our own node?
pgsql->connectionStatementType = PGSQL_CONNECTION_MULTI_STATEMENT;
connection = pgsql_open_connection(pgsql);
if (connection == NULL)
Could we check for "transaction already in progress" errors and report it as a BUG, possibly forcing an exit?
src/bin/pg_autoctl/service_keeper.c
Outdated
monitor->notificationClient.connectionStatementType ==
PGSQL_CONNECTION_SINGLE_STATEMENT)
That's always false, so that we never close the connection, right?
src/bin/pg_autoctl/service_keeper.c
Outdated
/* Finally establish a connection for notifications if none present */
(void) pgsql_listen(&(keeper->monitor.notificationClient), emptyChannelList);
I believe we don't need that here, the `monitor_wait_for_state_change` in the beginning of the main loop reconnects if needed. We have a now spurious "Lost connection" warning that we should probably get rid of in `monitor_wait_for_state_change`; that said, I believe it's now expected to have to establish a connection every once in a while from that point.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, March 24, 2021 3:29 PM, Dimitri Fontaine ***@***.***> wrote:
@DimCitus requested changes on this pull request.
I did another round of review, focusing on some details that seem to require more attention.
Excellent, thank you!
In src/bin/pg_autoctl/keeper_pg_init.c:
> + char *emptyChannelList[] = { NULL };
+ (void) pgsql_listen(&(monitor->notificationClient), emptyChannelList);
By the way, how does it work now? We are listening to no channel at all, how do we expect to get a notification that some state change happened in our formation and group, or even for our own node?
It works as it did before.
The following call is to monitor_wait_for_state_change(), which does this right at the top:
```
PGconn *connection = monitor->notificationClient.connection;
WaitForStateChangeNotificationContext context = {
	(char *) formation,
	groupId,
	nodeId,
	false /* stateHasChanged */
};
char *channels[] = { "state", NULL };

if (connection == NULL)
{
	log_warn("Lost connection.");
	return false;
}
```
which means that it demands to have a connection open.
The call to pgsql_listen() with an empty list, does exactly that, opens a connection.
It is needed because the lower level function pgsql_open_connection() is not exposed.
Of course we could expose that one, but you had objected when I suggested it.
In src/bin/pg_autoctl/pgsql.c:
> + pgsql->connectionStatementType = PGSQL_CONNECTION_MULTI_STATEMENT;
+ connection = pgsql_open_connection(pgsql);
+ if (connection == NULL)
Could we check for "transaction already in progress" errors and report it as a BUG, possibly forcing an exit?
I think that pgsql_open_connection() should take care of that. Or doesn't it?
In src/bin/pg_autoctl/service_keeper.c:
> + monitor->notificationClient.connectionStatementType ==
+ PGSQL_CONNECTION_SINGLE_STATEMENT)
That's always false, so that we never close the connection, right?
It is set as single statement from the listen call earlier. So it is always true.
It is added in the code in order to enforce expectations.
In src/bin/pg_autoctl/service_keeper.c:
> + /* Finally establish a connection for notifications if none present */
+ (void) pgsql_listen(&(keeper->monitor.notificationClient), emptyChannelList);
+
I believe we don't need that here, the `monitor_wait_for_state_change` in the beginning of the main loop reconnects if needed. We have a now spurious "Lost connection" warning that we should probably get rid of in `monitor_wait_for_state_change` that said, I believe it's now expected to have to establish a connection every once in a while from that point.
That was necessary in the first iteration of the code. Let me recheck it. Thanks.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, March 24, 2021 3:42 PM, ***@***.***> wrote:
That was necessary in the first iteration of the code. Let me recheck it. Thanks.
I did recheck, and it is needed because keeper_node_active_loop calls monitor_wait_for_state_change, which itself expects an open connection.
Prior to the PR, the connection would be closed and then reopened (and intentionally leaked) in keeper_node_active. The call was added there, instead of in
the tight loop before monitor_wait_for_state_change(), in order to closely resemble the previous location where connections were opened.
Now I am a bit curious: did you successfully manage to run a node without that specific pgsql_listen() call? I failed every single time in every
scenario, but if you did manage, then it would be very useful to try to reproduce.
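The dependency described here, a wait function that refuses to run without an open connection, can be sketched as follows. It is a rough illustration only: `FakeConn`, `NotificationClient`, `wait_for_state_change` and `wait_with_connection` are hypothetical names, and the wrapper shows one possible shape for a tight open / wait / close sequence, not the code in the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a libpq PGconn. */
typedef struct FakeConn
{
	int unused;
} FakeConn;

typedef struct NotificationClient
{
	FakeConn *connection;
	int opened;     /* number of connections opened so far */
	int closed;     /* number of connections closed so far */
} NotificationClient;

/*
 * Mirrors the guard quoted in the thread: without an open connection the
 * function gives up (the real code logs "Lost connection." at this point).
 */
static bool
wait_for_state_change(NotificationClient *client)
{
	if (client->connection == NULL)
	{
		return false;
	}

	/* ... LISTEN and the wait for notifications would happen here ... */
	return true;
}

/* Tight sequence: make sure a connection exists, wait, then close it. */
static bool
wait_with_connection(NotificationClient *client)
{
	if (client->connection == NULL)
	{
		client->connection = calloc(1, sizeof(FakeConn));
		if (client->connection == NULL)
		{
			return false;
		}
		++client->opened;
	}

	bool stateChanged = wait_for_state_change(client);

	/* Always release what was used, so that nothing is left to leak. */
	free(client->connection);
	client->connection = NULL;
	++client->closed;

	return stateChanged;
}
```

Keeping the open and the close in one wrapper means every caller pairs them the same way, so the number of opened and closed connections stays equal across iterations of the loop.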
I think this PR needs to be rebased/merged on top of the current master branch again. Recent changes in the release numbers (expected files for the monitor extension should mention extension version 1.5 now) and the Pyroute2 integration in the test framework are the main changes.
It seems that pgsql_execute_with_params() has been altered inconsistently over its lifetime. The latest version notes in the comments that the connection is not persistent, to facilitate error handling. However, that was not entirely true, and several parts of the code assumed it not to be true. Others assumed it to be true and failed to release the connection once used.
For the sake of clarity, the function will now explicitly close the connection that it has used, regardless of whether it is a new or existing connection. That simplifies most of the code and plugs the connection leaks.
It also uncovers an inconsistency in the connections used for notification. The code mixed the connection it was using to listen to events from the monitor with other connections. A new PGconn member has been added in the monitor struct to distinguish between the two distinct cases.
… close that connection when appropriate
* Remove confusing pgsql_listen calls in favour of a new user friendly call
* Use the same tight conn loop around monitor_wait_for_state_change everywhere
* Correctly close the connections after it.
9462288
to
0ed0cd0
Compare
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, March 30, 2021 10:56 AM, Dimitri Fontaine ***@***.***> wrote:
I think this PR needs to be rebased/merged on-top on current master's branch again. Recent changes in the release numbers (expected files for the monitor extension should mention extension version 1.5 now) and the Pyroute2 integration in the test framework are the main changes.
Sure. Force pushed a rebased version.
Thanks! I'm not sure why we still have the following error:
Can you reproduce it locally and get more logs maybe? Will have a look later, is it possible that the changes in this PR are somehow preventing the automated ALTER EXTENSION UPDATE mechanics at startup of the pg_autoctl of the monitor? To repro, with a docker environment available, simply do:
|
On Tue, Mar 30, 2021 at 13:53, Dimitri Fontaine ***@***.***> wrote:
Thanks! I'm not sure why we still have the following error:
======================================================================
FAIL: test_extension_update.test_001_update_extension
----------------------------------------------------------------------
Traceback (most recent call last):
File "/opt/python/3.7.6/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
self.test(*self.arg)
File "/home/travis/build/citusdata/pg_auto_failover/tests/test_extension_update.py", line 41, in test_001_update_extension
eq_(results, [("dummy",)])
AssertionError: [('1.5',)] != [('dummy',)]
----------------------------------------------------------------------
Can you reproduce it locally and get more logs maybe? Will have a look later, is it possible that the changes in this PR are somehow preventing the automated ALTER EXTENSION UPDATE mechanics at startup of the pg_autoctl of the monitor?
I saw that. I am currently looking at it. I will give an update by eod if I am still in doubt.
Addresses CI failure of test_extension_update case.
The latest commit seems to address the test_extension_update.test_001_update_extension failure. There seems to be one more failure as per https://travis-ci.com/github/citusdata/pg_auto_failover/jobs/494941418
It seems Travis is confused, can you push a meaningless commit to trigger another build?
Merged! Thanks for your contribution!
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, March 30, 2021 7:19 PM, Dimitri Fontaine ***@***.***> wrote:
Merged! Thanks for your contribution!
Awesome! Thank you for carrying it across the line.