New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hide client command arguments #11747
base: unstable
Are you sure you want to change the base?
Conversation
This is from AWS, and I'll add some more context about the intention. We wanted to make sure that user data wasn't getting stored into log files, which has use cases around data sovereignty and security. The idea is that we should be able to produce a clean log file that can be stored and moved and only contain server information. Some thoughts about this:
|
I agree. We should start by coming up with a list of those places and what could fall under this definition (e.g. client names, file names, |
Just adding on ACL users and any command arguments or partial query information (we dump that in a couple of places) since we can't know their content. @oranagra Do you have any thoughts about providing some way to sanitize user data out of log files? |
src/networking.c
Outdated
@@ -2762,7 +2769,7 @@ sds catClientInfoString(sds s, client *client) { | |||
getClientPeerId(client), | |||
getClientSockname(client), | |||
connGetInfo(client->conn, conninfo, sizeof(conninfo)), | |||
client->name ? (char*)client->name->ptr : "", | |||
(client->name && !shouldHideClientInfo()) ? (char*)client->name->ptr : "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should we hide the client name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
client name can hold sensitive information about the client purpose/ what the client does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regarding client-list, I hide clients names because it is printed inside crash reports.
The client names will not be hidden when calling client-list directly.
ok, now i understand that change better (had difficulties wrapping my head around the code).
i think the solution of using isBugReportStart
and shouldHideClientInfo
is ugly, and instead, it would be much nicer to just add a hide_user_info
argument to catClientInfoString
and getAllClientsInfoString
.
the problem with that is that catClientInfoString
is widely used, and we don't wanna modify all the calls.
the solution to that is to add an alias function, i.e. catClientInfoStringGeneric
will take the argument and the existing catClientInfoString
will just be a wrapper.
I agree regarding the extended effort needed in this case. I think we should provide some sort of mask-pii mode for redis so it will avoid reporting internal data (key names, values, client names, addresses etc... as they might hold some info regarding customer application) but we will also need to provide the customer way to get that data in case he needs it (like turning off the pii masking per connection?) |
Customer will be able to watch those details by disabling Independent parser to run over logs and omit client parts is a complex alternative. It gets strings as input while in the config case we have the full logical context. |
you mean that admin would be able to do multi; config set hide-client-info no; client list; config set hide-client-info yes; exec? |
Not a fan of the idea, but It could be an acl user flag. But I still don't understand why we want to filter client list. Unlike the crash log, it's just a command, like KEYS or GET, and can be blocked by acl |
It is true that it can be blocked, but in cases were Redis is provided as a service sometimes clients would like to prevent operators from accessing private data while the operators would still need some tools in order to debug/manage the service .commands like slowlog, info client list etc... can be used by both operator and clients while having this flag can help reduce the potential pii leakage |
@ranshid I think we need to consider the use case you describe as a separate feature. Preventing PII from getting into log files is one thing. Removing PII from the output of administrative commands is another thing - it takes us to multi-level admin privileges, which is a deeper rabbit hole. |
I agree that some deeper thought should be placed to this, I was only pointing out some use cases which we encountered in AWS. As said it is possible to use some external tool to remove PII from logs and command outputs, but in such cases customers cannot realy validate the efficiency of such tool so having a redis built-in mechanism can help build trust with external users. |
src/config.c
Outdated
@@ -3061,6 +3061,7 @@ standardConfig static_configs[] = { | |||
createBoolConfig("latency-tracking", NULL, MODIFIABLE_CONFIG, server.latency_tracking_enabled, 1, NULL, NULL), | |||
createBoolConfig("aof-disable-auto-gc", NULL, MODIFIABLE_CONFIG, server.aof_disable_auto_gc, 0, NULL, updateAofAutoGCEnabled), | |||
createBoolConfig("replica-ignore-disk-write-errors", NULL, MODIFIABLE_CONFIG, server.repl_ignore_disk_write_error, 0, NULL, NULL), | |||
createBoolConfig("hide-client-info", NULL, MODIFIABLE_CONFIG, server.hide_client_info, 0, NULL, NULL), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the name of this config is not clear enough.
if it keeps it's current purpose, it should say "log".
alternatively we need to design where this is going before making any changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
p.s. documentation in redis.conf is missing.
src/debug.c
Outdated
@@ -1022,6 +1022,11 @@ void _serverAssert(const char *estr, const char *file, int line) { | |||
bugReportEnd(0, 0); | |||
} | |||
|
|||
/* Returns the amount of client's command arguments we allow logging */ | |||
int clientArgsToShow(const client *c) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe rename "Show" to "Log"
src/networking.c
Outdated
@@ -2712,7 +2712,7 @@ char *getClientSockname(client *c) { | |||
|
|||
/* Concatenate a string representing the state of a client in a human | |||
* readable format, into the sds string 's'. */ | |||
sds catClientInfoString(sds s, client *client) { | |||
sds catClientInfoStringGeneric(sds s, client *client, bool hide_user_info) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hide_user_info
is misleading, it could mean that we hide the ACL user name.
actually, i'll argue that again that i don't see why CLIENT SETNAME is PII, and if it is, then the ACL user name could be too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, client name will not be redacted
src/networking.c
Outdated
@@ -2762,7 +2762,7 @@ sds catClientInfoString(sds s, client *client) { | |||
getClientPeerId(client), | |||
getClientSockname(client), | |||
connGetInfo(client->conn, conninfo, sizeof(conninfo)), | |||
client->name ? (char*)client->name->ptr : "", | |||
(client->name && !hide_user_info) ? (char*)client->name->ptr : "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it should say --redacted--
instead of remain empty?
src/debug.c
Outdated
serverLog(LL_WARNING|LL_RAW,"%s\n", client); | ||
sdsfree(client); | ||
for (j = 0; j < cc->argc; j++) { | ||
for (j = 0; j < clientArgsToShow(cc); j++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we hide the arguments, let's at least print the argc
instead, and maybe a bold "redacted" message.
we discussed this in a core-team meeting and concluded that we would like to proceed only after we prepare some detailed list of everything we could want to redact (maybe host names and other things). |
Ack, I will create a draft list for discussion |
Those are the points I found which we might spill to logs some PII.
Please let me know if I missed something |
I don't think SLOWLOG is part of it, if it did then MONITOR and CLIENT LIST are too. |
I agree regarding |
it doesn't log the stack "trace". it logs the stack contents (could contain variables) |
I would be inclined to say function names count and file names do not. Typically file names are administrator, and we can ask them to set it to something not related to user data. |
Updated list
|
Hide replication backlog & stack frame and fix comments |
e848b0c
to
2617412
Compare
@naglera Are you abandoning this effort? |
No, horrible git accident, I started this commit from unstable, and it closed automatically when I pulled from upstream. By the way, @madolson, are you also approving the list? #11747 (comment) |
@naglera Yeah, that looked good to me :) |
@naglera What about this test code? since Lines 352 to 360 in 5fdaa53
|
fix spaces Co-authored-by: debing.sun <debing.sun@redis.com>
fix spaces Co-authored-by: debing.sun <debing.sun@redis.com>
fix log Co-authored-by: debing.sun <debing.sun@redis.com>
Hi @sundb, I'm not sure if we actually use |
if (j >= clientArgsToLog(c)){ | ||
serverLog(LL_WARNING|LL_RAW,"client->argv[%d]: *redacted*\n", j); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (j >= clientArgsToLog(c)){ | |
serverLog(LL_WARNING|LL_RAW,"client->argv[%d]: *redacted*\n", j); | |
if (j >= clientArgsToLog(c)) { | |
serverLog(LL_WARNING|LL_RAW,"client->argv[%d]: *redacted*", j); |
@@ -1917,6 +1930,10 @@ void logCurrentClient(client *cc, const char *title) { | |||
sdsfree(client); | |||
serverLog(LL_WARNING|LL_RAW,"argc: '%d'\n", cc->argc); | |||
for (j = 0; j < cc->argc; j++) { | |||
if (j >= clientArgsToLog(cc)){ | |||
serverLog(LL_WARNING|LL_RAW,"argv[%d]: *redacted*\n", j); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serverLog(LL_WARNING|LL_RAW,"argv[%d]: *redacted*\n", j); | |
serverLog(LL_WARNING|LL_RAW,"argv[%d]: *redacted*", j); |
|
In the event of an assertion failure, hide command arguments from the operator.
In some cases, private client information can be voluntarily exposed when a redis instance crashes due to an assertion failure.
This commit prevent וnintentional client info exposure.
Operators can still access the hidden data, but they must actively request it.
Any of the client info commands remains the unchanged.