-
Notifications
You must be signed in to change notification settings - Fork 128
Remove logging of monitor connection string with password #512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this PR! We have some implementation details to change, as per my comments. The most important is that you removed the connection string entirely from some error messages, leaving our users in a position where they won't be able to understand and fix their situations.
Other than that, we don't need too free stack allocated memory buffers, those buffer initial value should be { 0 }, and I think we can do better with the “scrubbing” API. I think for the sake of user sanity, showing either stars or unicode bullets (•) would be best, so that they know we did parse and use their password and just don't want to expose their secrets in the clear in the logs.
About unicode bullets, unfortunately I am not sure if we can use that freely. Many systems didn't get the memo from 1991 yet. Same with IPv6. Let's not get started on SQL:2016.
| { | ||
| log_fatal("Failed to determine monitor hostname when parsing " | ||
| "Postgres URI \"%s\"", config->monitor_pguri); | ||
| "Postgres URI"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we remove the connection string in the error message? How is the user going to figure out what's happening there in a way they can fix it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it here because I assumed that it was un-parsable. Do you think in the cases of un-parsable connection strings we should log it, even if it may reveal private information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we failed to parse the monitor Postgres URI, there is nothing that can work in the deployment, and we want the user to fix that. How can they fix the situation without us showing what's wrong?
So while I have some level of sympathy to hiding private information in the connection string, my personal opinion is that allowing easy understanding of what's wrong in order to fix it takes priority. We don't know if the URI contains sensitive information: we failed to parse it. Let's at least show what we failed to parse, hopefully making it obvious what is to be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a possible compromise be to print the mangled URI if and only if the logging level is debug or verbose? Most folks don't run in that configuration in production, and that's the place we feel that inapprioriately logging credentials is most sensitive. This way, if someone really needs to see what pg_auto_failover is tripping over, they can re-run in verbose mode, pick up the error, fix any mangling and also rotate credentials at that time?
I feel that this may be the right way to go, in that the biggest concern is credentials being silently printed to logs without the operator realizing. This way, we only print sensitive information to the log when we know that someone is explicitly choosing to look at the log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good first step would be a version of this PR that omits the controversial points, and then we can open a new PR for what's left in your audit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take a look and make the changes, I've been in the process of moving, so I haven't been very available lately. :)
src/bin/pg_autoctl/cli_do_tmux.c
Outdated
| char pg_autoctl_argv0[MAXPGPATH]; | ||
| char pg_autoctl_program[MAXPGPATH]; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that with producing a new binary file we might need to move things in a shared space. Why cli_do_tmux.c though? This at least requires some comments about it, and I guess a better choice of a common module...
| "The monitor SSL setup is ready and your current " | ||
| "connection string is \"%s\", you might need to update it", | ||
| config->monitor_pguri); | ||
| "The monitor SSL setup is ready, you might need to update it"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, why are we removing the connection string entirely here? If the user has to update the monitor URI in the configuration file, which URI should they write in there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add it back for this one :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah wait, this one is actually kind of surprising to me -- were we not able to parse the string? What are we actually doing here (I would expect it to return false).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parse_pguri_info_key_vals might have other bugs than a failure in PQconninfoParse...
| } | ||
|
|
||
|
|
||
| char scrubbedConnectionString[MAXCONNINFO]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have a default value of { 0 }.
char scrubbedConnectionString[MAXCONNINFO] = { 0 };|
|
||
|
|
||
| char scrubbedConnectionString[MAXCONNINFO]; | ||
| parse_and_scrub_connection_string(newPgURI, scrubbedConnectionString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API returns a boolean to signal if something went wrong, let's preferably use it, or if you believe we can bypass checking the result, add a (void) prefix before the function call to make that intend clear to readers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case where we can't parse/scrub the connection string, what do you think would be the best option? Maybe print "Un-parsable connection string" instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either the user is impacted by the error and we need to help them in their journey to setup our software correctly, or they are not impacted and should not be distracted by our internal limits. Are the users impacted by this failure here? How badly?
src/bin/pg_autoctl/keeper_pg_init.c
Outdated
| log_debug("pg exists: %s", postgresInstanceExists ? "yes" : "no"); | ||
| log_debug("pg is primary: %s", postgresInstanceIsPrimary ? "yes" : "no"); | ||
|
|
||
| PQfreemem(scrubbedConnectionString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, no need.
| char pg_autoctl_argv0[MAXPGPATH]; | ||
| char pg_autoctl_program[MAXPGPATH]; | ||
| int pgconnect_timeout = 2; /* see also POSTGRES_CONNECT_TIMEOUT */ | ||
|
|
||
| char *ps_buffer; /* will point to argv area */ | ||
| size_t ps_buffer_size; /* space determined at run time */ | ||
| size_t last_status_len; /* use to minimize length of clobber */ | ||
|
|
||
| Semaphore log_semaphore; /* allows inter-process locking */ | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way to solve that problem would be to copy/paste the same definitions in the other main file that you introduce, in order to minimise changes?
src/bin/pg_autoctl/parsing.c
Outdated
| if (conninfo == NULL) | ||
| { | ||
| log_error("Failed to parse pguri \"%s\": %s", pguri, errmsg); | ||
| log_error("Failed to parse pguri: %s", errmsg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we failed to parse a connection string. Which connection string? How is the user going to be able to fix their setup now, if we refrain from showing them any clue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Along with my comment above -- do you think it's worth printing it here, even if it may reveal a secret to the logs? I was torn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How will the user fix the Postgres URI that we fail to parse if they don't know which URI it is? The only caller to that function is in cli_enable_disable in update_monitor_connection_string, which means the configuration file might be “corrupted” somehow. Yes, I think it's worth printing what we got.
src/bin/pg_autoctl/parsing.c
Outdated
| /* | ||
| * remove_password_from_parameters is a non-exported function that will remove the keyword | ||
| * and value if the keyword matches "password" for scrubbing a connection string in logs. | ||
| */ | ||
| void | ||
| remove_password_from_parameters(KeyVal *parameters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a better way to approach this: what about introducing a KeyVal *secrets to the buildPostgresURIfromPieces API? When building the connection string it is then easy to replace secret parameters with a sequence of stars, or even a unicode bullet • (see https://en.wikipedia.org/wiki/Bullet_(typography)).
The KeyVal *secrets in buildPostgresURIfromPieces would take as input the keyword names that are secrets and implement the computation of their associated “bullets” value in the secrets->values[index] and use that in the final URI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm maybe we can use the existing overrides option in parse_pguri_info_key_vals as well -- let me try it.
Are there other secret parameters other than password?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm looks like overrides get set even if the original parameters didn't have the value -- I'll try your idea :)
Failing test:
test_parse_and_scrub_connection_string_with_no_password_in_it:0: Assertion 'scrubbedPguri == "postgres://admin@monitor.local:9999/testdb?"' failed: scrubbedPguri == "postgres://admin@monitor.local:9999/testdb?password=****", "postgres://admin@monitor.local:9999/testdb?" == "postgres://admin@monitor.local:9999/testdb?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like you did publish the new code that is failing the test, and I don't know if the test should be updated or the code fixed, here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We implemented an update that uses overrides.
src/bin/pg_autoctl/parsing.c
Outdated
| parseUri = parse_pguri_info_key_vals(pguri, &overrides, &uriParams); | ||
| if (!parseUri) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style issue: we would usually just test the return value from calling the function:
if (!parse_pguri_info_key_vals(pguri, &overrides, &uriParams))
{
log_error(...);
return false;
}For debugging purposes, it might be better to have the return value in a variable that we can inspect, in that case the variable name that we usually would prefer is success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We updated to the inlined style.
6b697e0 to
31fdcc0
Compare
|
@DimCitus Hi Dimitri! I think we've addressed all of your concerns for this PR, but I am getting some flakiness on one of the python tests.
Do you know of connections that might be left open? The test passes when I run |
a3762ce to
95af913
Compare
|
Hi folks, @rheaton, I just did a review. There are still some small problems, that we could fix very easily (log messages, variable default assignment styles, etc). That said I would like to first review/finish/merge the password editing parts, and then in a separate PR think about the added unit tests. Can you please split the work there in two? |
|
@DimCitus I will split it into another PR -- based off this branch. We did use the unit-tests to TDD some of the code and only added the missing ones for characterization tests to understand how things were working. I did not attempt to backfill all unit-tests at once. |
- Tests added for hapostgres#512 - Some Characterization tests added
- Tests added for hapostgres#512 - Some Characterization tests added
|
@DimCitus I've pulled out the unit tests (and I'll deal with that PR once we are through this one 🙂). I'm still getting the nose test failure from test 019. Same issue as before. Any idea what could have been added that would introduce the "too many connections" issue? #512 (comment) Additionally, let's get those other small changes in! Thank you, |
|
@DimCitus I've made updates to this PR to make it as slim as possible. There is some more work to be done to completely close out the issue, but implementation probably should be discussed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good to me, I think with a pass of make indent and the Semaphore declaration clean-up we should be good to merge!
src/bin/pg_autoctl/lock_utils.h
Outdated
| int semId; | ||
| } Semaphore; | ||
|
|
||
| Semaphore log_semaphore; /* allows inter-process locking */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this change part of this PR?
| Semaphore log_semaphore; /* allows inter-process locking */ | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, I don't think this PR should change that.
|
@DimCitus I'm not seeing any changes with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of whitespace changes that I am not comfortable with, still. The most important are around the static SQL string definitions, but also white lines that have no reason to be added or removed. It might be just a setup or tooling problem? you might have to use a different editor to finish up the PR unfortunately...
src/bin/pg_autoctl/keeper_pg_init.c
Outdated
| if (!parse_and_scrub_connection_string(config->monitor_pguri, | ||
| scrubbedConnectionString)) | ||
| { | ||
| log_error("Monitor connection string is unparseable"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better written in the same style as the other errors messages, such as: "Failed to parse the monitor connection string".
src/bin/pg_autoctl/parsing.c
Outdated
| /* */ | ||
| /* Look for a populated password connection parameter */ | ||
| /* */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange formatting.
src/bin/pg_autoctl/parsing.c
Outdated
| if ( | ||
| strcmp(option->keyword, "password") == 0 && | ||
| option->val != NULL && | ||
| strcmp(option->val, "") != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should use IS_EMPTY_STRING_BUFFER here?
src/bin/pg_autoctl/pgsql.c
Outdated
| #define ERRCODE_OBJECT_IN_USE "55006" | ||
|
|
||
| static char * ConnectionTypeToString(ConnectionType connectionType); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this PR changes the white lines here, and in the following lines in this file?
src/bin/pg_autoctl/pgsql.c
Outdated
| char *sqlTemplate = | ||
| /* | ||
| * We could simplify the writing of this query, but we prefer that it | ||
| * looks as much as possible like the query used in | ||
| * pgsql_replication_slot_maintain() so that we can maintain both | ||
| * easily. | ||
| */ | ||
| "WITH nodes(slot_name, lsn) as (" | ||
| " SELECT '" REPLICATION_SLOT_NAME_DEFAULT "_' || id, lsn" | ||
| " FROM (%s) as sb(id, lsn) " | ||
| "), \n" | ||
| "dropped as (" | ||
| " SELECT slot_name, pg_drop_replication_slot(slot_name) " | ||
| " FROM pg_replication_slots pgrs LEFT JOIN nodes USING(slot_name) " | ||
| " WHERE nodes.slot_name IS NULL " | ||
| " AND ( slot_name ~ '" REPLICATION_SLOT_NAME_PATTERN "' " | ||
| " OR slot_name ~ '" REPLICATION_SLOT_NAME_DEFAULT "' )" | ||
| " AND not active" | ||
| " AND slot_type = 'physical'" | ||
| "), \n" | ||
| "created as (" | ||
| "SELECT c.slot_name, c.lsn " | ||
| " FROM nodes LEFT JOIN pg_replication_slots pgrs USING(slot_name), " | ||
| " LATERAL pg_create_physical_replication_slot(slot_name, true) c" | ||
| " WHERE pgrs.slot_name IS NULL " | ||
| ") \n" | ||
| "SELECT 'create', slot_name, lsn FROM created " | ||
| " union all " | ||
| "SELECT 'drop', slot_name, NULL::pg_lsn FROM dropped"; | ||
| char *sqlTemplate = | ||
| /* | ||
| * We could simplify the writing of this query, but we prefer that it | ||
| * looks as much as possible like the query used in | ||
| * pgsql_replication_slot_maintain() so that we can maintain both | ||
| * easily. | ||
| */ | ||
| "WITH nodes(slot_name, lsn) as (" | ||
| " SELECT '" REPLICATION_SLOT_NAME_DEFAULT "_' || id, lsn" | ||
| " FROM (%s) as sb(id, lsn) " | ||
| "), \n" | ||
| "dropped as (" | ||
| " SELECT slot_name, pg_drop_replication_slot(slot_name) " | ||
| " FROM pg_replication_slots pgrs LEFT JOIN nodes USING(slot_name) " | ||
| " WHERE nodes.slot_name IS NULL " | ||
| " AND ( slot_name ~ '" REPLICATION_SLOT_NAME_PATTERN "' " | ||
| " OR slot_name ~ '" REPLICATION_SLOT_NAME_DEFAULT "' )" | ||
| " AND not active" | ||
| " AND slot_type = 'physical'" | ||
| "), \n" | ||
| "created as (" | ||
| "SELECT c.slot_name, c.lsn " | ||
| " FROM nodes LEFT JOIN pg_replication_slots pgrs USING(slot_name), " | ||
| " LATERAL pg_create_physical_replication_slot(slot_name, true) c" | ||
| " WHERE pgrs.slot_name IS NULL " | ||
| ") \n" | ||
| "SELECT 'create', slot_name, lsn FROM created " | ||
| " union all " | ||
| "SELECT 'drop', slot_name, NULL::pg_lsn FROM dropped"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a whitespace change that we should not have in this PR, please fix.
src/bin/pg_autoctl/pgsql.c
Outdated
| char *sqlTemplate = | ||
| "WITH nodes(slot_name, lsn) as (" | ||
| " SELECT '" REPLICATION_SLOT_NAME_DEFAULT "_' || id, lsn" | ||
| " FROM (%s) as sb(id, lsn) " | ||
| "), \n" | ||
| "dropped as (" | ||
| " SELECT slot_name, pg_drop_replication_slot(slot_name) " | ||
| " FROM pg_replication_slots pgrs LEFT JOIN nodes USING(slot_name) " | ||
| " WHERE nodes.slot_name IS NULL " | ||
| " AND slot_name ~ '" REPLICATION_SLOT_NAME_PATTERN "' " | ||
| " AND not active" | ||
| " AND slot_type = 'physical'" | ||
| "), \n" | ||
| "advanced as (" | ||
| "SELECT a.slot_name, a.end_lsn" | ||
| " FROM pg_replication_slots s JOIN nodes USING(slot_name), " | ||
| " LATERAL pg_replication_slot_advance(slot_name, lsn) a" | ||
| " WHERE nodes.lsn <> '0/0' and nodes.lsn >= s.restart_lsn " | ||
| "), \n" | ||
| "created as (" | ||
| "SELECT c.slot_name, c.lsn " | ||
| " FROM nodes LEFT JOIN pg_replication_slots pgrs USING(slot_name), " | ||
| " LATERAL pg_create_physical_replication_slot(slot_name, true) c" | ||
| " WHERE pgrs.slot_name IS NULL " | ||
| ") \n" | ||
| "SELECT 'create', slot_name, lsn FROM created " | ||
| " union all " | ||
| "SELECT 'drop', slot_name, NULL::pg_lsn FROM dropped " | ||
| " union all " | ||
| "SELECT 'advance', slot_name, end_lsn FROM advanced "; | ||
| char *sqlTemplate = | ||
| "WITH nodes(slot_name, lsn) as (" | ||
| " SELECT '" REPLICATION_SLOT_NAME_DEFAULT "_' || id, lsn" | ||
| " FROM (%s) as sb(id, lsn) " | ||
| "), \n" | ||
| "dropped as (" | ||
| " SELECT slot_name, pg_drop_replication_slot(slot_name) " | ||
| " FROM pg_replication_slots pgrs LEFT JOIN nodes USING(slot_name) " | ||
| " WHERE nodes.slot_name IS NULL " | ||
| " AND slot_name ~ '" REPLICATION_SLOT_NAME_PATTERN "' " | ||
| " AND not active" | ||
| " AND slot_type = 'physical'" | ||
| "), \n" | ||
| "advanced as (" | ||
| "SELECT a.slot_name, a.end_lsn" | ||
| " FROM pg_replication_slots s JOIN nodes USING(slot_name), " | ||
| " LATERAL pg_replication_slot_advance(slot_name, lsn) a" | ||
| " WHERE nodes.lsn <> '0/0' and nodes.lsn >= s.restart_lsn " | ||
| "), \n" | ||
| "created as (" | ||
| "SELECT c.slot_name, c.lsn " | ||
| " FROM nodes LEFT JOIN pg_replication_slots pgrs USING(slot_name), " | ||
| " LATERAL pg_create_physical_replication_slot(slot_name, true) c" | ||
| " WHERE pgrs.slot_name IS NULL " | ||
| ") \n" | ||
| "SELECT 'create', slot_name, lsn FROM created " | ||
| " union all " | ||
| "SELECT 'drop', slot_name, NULL::pg_lsn FROM dropped " | ||
| " union all " | ||
| "SELECT 'advance', slot_name, end_lsn FROM advanced "; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a whitespace change that we should not have in this PR, please fix.
src/bin/pg_autoctl/pgsql.c
Outdated
| char *sql = | ||
| /* | ||
| * Make it so that we still have the current WAL LSN even in the case | ||
| * where there's no replication slot in use by any standby. | ||
| * | ||
| * When on the primary, we might have multiple standby nodes connected. | ||
| * We're good when at least one of them is either 'sync' or 'quorum'. | ||
| * We don't check individual replication slots, we take the "best" one | ||
| * and report that. | ||
| */ | ||
| "select pg_is_in_recovery()," | ||
| " coalesce(rep.sync_state, '') as sync_state," | ||
| " case when pg_is_in_recovery()" | ||
| " then coalesce(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn())" | ||
| " else pg_current_wal_flush_lsn()" | ||
| " end as current_lsn," | ||
| " pg_control_version, catalog_version_no, system_identifier" | ||
| " from (values(1)) as dummy" | ||
| " full outer join" | ||
| " (select pg_control_version, catalog_version_no, system_identifier " | ||
| " from pg_control_system()" | ||
| " )" | ||
| " as control on true" | ||
| " full outer join" | ||
| " (" | ||
| " select sync_state" | ||
| " from pg_replication_slots slot" | ||
| " join pg_stat_replication rep" | ||
| " on rep.pid = slot.active_pid" | ||
| " where slot_name ~ '" REPLICATION_SLOT_NAME_PATTERN "' " | ||
| " or slot_name = '" REPLICATION_SLOT_NAME_DEFAULT "' " | ||
| "order by case sync_state " | ||
| " when 'quorum' then 4 " | ||
| " when 'sync' then 3 " | ||
| " when 'potential' then 2 " | ||
| " when 'async' then 1 " | ||
| " else 0 end " | ||
| " desc limit 1" | ||
| " ) " | ||
| "as rep on true"; | ||
| char *sql = | ||
| /* | ||
| * Make it so that we still have the current WAL LSN even in the case | ||
| * where there's no replication slot in use by any standby. | ||
| * | ||
| * When on the primary, we might have multiple standby nodes connected. | ||
| * We're good when at least one of them is either 'sync' or 'quorum'. | ||
| * We don't check individual replication slots, we take the "best" one | ||
| * and report that. | ||
| */ | ||
| "select pg_is_in_recovery()," | ||
| " coalesce(rep.sync_state, '') as sync_state," | ||
| " case when pg_is_in_recovery()" | ||
| " then coalesce(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn())" | ||
| " else pg_current_wal_flush_lsn()" | ||
| " end as current_lsn," | ||
| " pg_control_version, catalog_version_no, system_identifier" | ||
| " from (values(1)) as dummy" | ||
| " full outer join" | ||
| " (select pg_control_version, catalog_version_no, system_identifier " | ||
| " from pg_control_system()" | ||
| " )" | ||
| " as control on true" | ||
| " full outer join" | ||
| " (" | ||
| " select sync_state" | ||
| " from pg_replication_slots slot" | ||
| " join pg_stat_replication rep" | ||
| " on rep.pid = slot.active_pid" | ||
| " where slot_name ~ '" REPLICATION_SLOT_NAME_PATTERN "' " | ||
| " or slot_name = '" REPLICATION_SLOT_NAME_DEFAULT "' " | ||
| "order by case sync_state " | ||
| " when 'quorum' then 4 " | ||
| " when 'sync' then 3 " | ||
| " when 'potential' then 2 " | ||
| " when 'async' then 1 " | ||
| " else 0 end " | ||
| " desc limit 1" | ||
| " ) " | ||
| "as rep on true"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a whitespace change that we should not have in this PR, please fix.
src/bin/pg_autoctl/pgsql.c
Outdated
| char *sql = | ||
| " select $1::pg_lsn <= flush_lsn, flush_lsn " | ||
| " from pg_replication_slots slot" | ||
| " join pg_stat_replication rep" | ||
| " on rep.pid = slot.active_pid" | ||
| " where ( slot_name ~ '" REPLICATION_SLOT_NAME_PATTERN "' " | ||
| " or slot_name = '" REPLICATION_SLOT_NAME_DEFAULT "') " | ||
| " and sync_state in ('sync', 'quorum') " | ||
| "order by flush_lsn desc limit 1"; | ||
| char *sql = | ||
| " select $1::pg_lsn <= flush_lsn, flush_lsn " | ||
| " from pg_replication_slots slot" | ||
| " join pg_stat_replication rep" | ||
| " on rep.pid = slot.active_pid" | ||
| " where ( slot_name ~ '" REPLICATION_SLOT_NAME_PATTERN "' " | ||
| " or slot_name = '" REPLICATION_SLOT_NAME_DEFAULT "') " | ||
| " and sync_state in ('sync', 'quorum') " | ||
| "order by flush_lsn desc limit 1"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a whitespace change that we should not have in this PR, please fix.
| static void reload_configuration(Monitor *monitor); | ||
| static bool monitor_ensure_configuration(Monitor *monitor); | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a whitespace change that we should not have in this PR, please fix.
e7b7cb7 to
6a1260a
Compare
|
Pushed up fixes. My indent was doing strange things. Appears to be working alright now! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple very small changes and we can merge!
| } | ||
| else | ||
| { | ||
| log_info( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now notice we return false in that case. I suppose we should then either log_error or maybe log_warn if appropriate (not sure it would be).
src/bin/pg_autoctl/keeper_pg_init.c
Outdated
| if (!parse_and_scrub_connection_string(config->monitor_pguri, | ||
| scrubbedConnectionString)) | ||
| { | ||
| log_error("Monitor connection string is unparseable"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another error message that would better read "Failed to parse the monitor connection string".
- Exceptions are unparsable monitor connection strings, and connection strings that are too long
|
Updated |
|
Thanks for your contribution @rheaton ; and all the fixes you accepted to go through for this PR to get merged! I appreciate your work here! |
In the cases when the connection string is parsable, we scrub and remove the password prior to logging in.
In the cases where the connection string is un-parsable, we have removed it from the log messages.
Additionally, this PR adds (C Unit tests)[https://libcheck.github.io/check/] that can be run via
make test_unitIf we decide that we like these unit tests, we should integrate them into the travis builds.
Fixes #466