Documented --redact-logs flag
#8708
Reviewed 1 of 2 files at r1, 1 of 1 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @Amruta-Ranade)
v20.2/cockroach-debug-merge-logs.md, line 34 at r2 (raw file):
`--from` | Start time for the time range filter. `--to` | End time for the time range filter. `--redact-logs` | Redact sensitive PII data from the log files. Note that this flag removes sensitive information only from the log files. The other items (listed above) collected by the `debug zip` command may still contain sensitive information.
- I think we want to remove the abbreviation "PII". Back when we implemented this we did not fully understand that customers only care about "sensitive" data. Whether PII is sensitive or not depends on the application, and it's not a relevant distinction from CRL's perspective.
- In any case, the phrase "sensitive data" is not an industry standard and thus needs to be defined. I would recommend that you create a callout include with the definition, and include it from here and the other place.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @knz)
v20.2/cockroach-debug-merge-logs.md, line 34 at r2 (raw file):
Previously, knz (kena) wrote…
I think we want to remove the abbreviation "PII". Back when we implemented this we did not fully understand that customers only care about "sensitive" data. Whether PII is sensitive or not depends on the application, and it's not a relevant distinction from CRL's perspective.
in any case the phrase "sensitive data" is not an industry standard and thus needs to be defined. I would recommend that you create a callout include with the definition, and include it from here and the other place.
I agree. How do we define sensitive data? Do we have a writeup about it somewhere?
@Amruta-Ranade, I think we need to do more here than just document the debug flags. If people know to look at those commands, this is great, but how can we expose this functionality to users who might not know it exists? Please think about this. There are a few places where we describe logging options. We may also want to mention this ability in the context of security in a place or two.
We say that information is "sensitive" (also called "unsafe") when it does not fit clearly into the definition of “safe for reporting”. ("reporting" as in "can be reported automatically via telemetry to CRL") So the right way to explain this is to start by defining what we consider “safe” then explain that everything else is considered unsafe/sensitive. What is safe:
Except for these specific items which are designated as safe, everything else is assumed to be unsafe/sensitive. This includes (but not exclusively):
a45e6cf to 91e7eb0
I don't think this feature affects any other docs. It redacts sensitive information only from the output of |
@Amruta-Ranade, that makes sense. But do you think we should mention this feature in places like https://www.cockroachlabs.com/docs/v20.1/debug-and-error-logs.html? Does it apply to SQL logs (https://www.cockroachlabs.com/docs/v20.1/sql-faqs.html#how-do-i-log-sql-queries)?
Yes, I updated the debug-and-error-logs doc. SQL logs and hence the SQL FAQs doc are unaffected by this change.
Reviewed 1 of 3 files at r3, 1 of 2 files at r4, 1 of 1 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @Amruta-Ranade and @knz)
v20.2/cockroach-debug-merge-logs.md, line 12 at r5 (raw file):
{{site.data.alerts.callout_danger}} The file produced by `cockroach debug zip` can contain highly [sensitive, unanonymized information](debug-and-error-logs.html#redacted-logs), such as usernames, hashed passwords, and possibly your table's data. You can use the [`redact`](#example) flag to redact the sensitive data out of log files and crash reports before sharing them with Cockroach Labs.
I believe the common word for "unanonymized" is "identifiable" or "nominal".
v20.2/cockroach-debug-merge-logs.md, line 34 at r5 (raw file):
`--from` | Start time for the time range filter. `--to` | End time for the time range filter. `--redact` | Redact [sensitive data](debug-and-error-logs.html#redacted-logs) from the log files. Note that this flag removes sensitive information only from the log files. The other items (listed above) collected by the `debug zip` command may still contain sensitive information.
In the command `cockroach debug merge-logs`, there is no need to refer to "only log files" or `debug zip`. It's a separate command.
v20.2/debug-and-error-logs.md, line 117 at r5 (raw file):
## Redacted logs If you contact CockroachDB Support for troubleshooting help, you might be asked to run [`cockroach debug zip`](cockroach-debug-zip.html) and share the resulting file with the CockroachDB team. The log files created by `cockroach debug zip` may contain highly sensitive, unanonymized information, such as usernames, hashed passwords, and possibly your table's data.
see my comment above about the word "unanonymized"
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @knz)
v20.2/cockroach-debug-merge-logs.md, line 34 at r2 (raw file):
Previously, Amruta-Ranade (Amruta Ranade) wrote…
I agree. How do we define sensitive data? Do we have a writeup about it somewhere?
Done.
v20.2/cockroach-debug-merge-logs.md, line 12 at r5 (raw file):
Previously, knz (kena) wrote…
I believe the common word for "unanonymized" is "identifiable" or "nominal".
Done.
v20.2/cockroach-debug-merge-logs.md, line 34 at r5 (raw file):
Previously, knz (kena) wrote…
In the command `cockroach debug merge-logs`, there is no need to refer to "only log files" or `debug zip`. It's a separate command.
Done.
v20.2/debug-and-error-logs.md, line 117 at r5 (raw file):
Previously, knz (kena) wrote…
see my comment above about the word "unanonymized"
Done.
LGTM with a couple suggestions!
v20.2/cockroach-debug-merge-logs.md
{{site.data.alerts.callout_danger}}
The file produced by `cockroach debug merge-log` can contain highly sensitive, unanonymized information, such as usernames, passwords, and possibly your table's data. You should share this data only with Cockroach Labs developers and only after determining the most secure method of delivery.
The file produced by `cockroach debug zip` can contain highly [sensitive, identifiable information](debug-and-error-logs.html#redacted-logs), such as usernames, hashed passwords, and possibly your table's data. You can use the [`redact`](#example) flag to redact the sensitive data out of log files and crash reports before sharing them with Cockroach Labs.
Suggest calling this "the --redact flag"
Done
v20.2/cockroach-debug-zip.md
{{site.data.alerts.callout_danger}}
The file produced by `cockroach debug zip` can contain highly sensitive, unanonymized information, such as usernames, hashed passwords, and possibly your table's data. You should share this data only with Cockroach Labs developers and only after determining the most secure method of delivery.
The file produced by `cockroach debug zip` can contain highly [sensitive, identifiable information](debug-and-error-logs.html#redacted-logs), such as usernames, hashed passwords, and possibly your table's data. You can use the [`redact-logs`](#redact-sensitive-information-from-the-logs) flag to redact the sensitive data out of log files and crash reports before sharing them with Cockroach Labs.
Suggest calling this "the --redact-logs flag" (easier to search for)
Done
v20.2/debug-and-error-logs.md
## Redacted logs

If you contact CockroachDB Support for troubleshooting help, you might be asked to run [`cockroach debug zip`](cockroach-debug-zip.html) and share the resulting file with the CockroachDB team. The log files created by `cockroach debug zip` may contain highly sensitive, unanonymized information, such as usernames, hashed passwords, and possibly your table's data.
Suggest working in the phrase "personally identifiable information (PII)" since that is a standard term that will also be searched for
Raphael recommended not using the term (PII) and saying "sensitive, unanonymized information" instead. I agree with you though -- IMO, PII is a standard term that should feature somewhere in the doc. cc @knz for approval.
<span class="version-tag">New in v20.2</span> You can run `cockroach debug zip` with the [`redact-logs` flag](cockroach-debug-zip.html#redact-sensitive-information-from-the-logs) to redact the sensitive data out of log files and crash reports before sharing them with Cockroach Labs. Redactable sensitive data includes but is not limited to:

- Stored values
Better yet, the PII term could go here
Reviewed 2 of 2 files at r7.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Amruta-Ranade and @rmloveland)
v20.2/debug-and-error-logs.md, line 117 at r6 (raw file):
Previously, Amruta-Ranade (Amruta Ranade) wrote…
Raphael recommended not using the term (PII) and saying "sensitive, unanonymized information" instead. I agree with you though -- IMO, PII is a standard term that should feature somewhere in the doc. cc @knz for approval.
- I wrote in my comments above that the word "unanonymized" is a strange word. For one, I can't find it in any online dictionary. I would encourage you to use a word that actually exists. I made a few suggestions earlier; maybe you have a better opinion.
- Regarding the phrase "PII" or "personally identifiable information": this is only one kind of sensitive data. For example, our customer's IP address is often not PII, but it is most definitely sensitive.
So you may wish to write "sensitive information, including but not limited to PII" or something to that effect.
Hmm.. I did change it to "sensitive, identifiable information" in the other files, but forgot to fix it in the debug-and-error-logs file. Fixed.
Closes #7490 and #8395
Documented the `--redact-logs` flag for `debug-zip` and `debug-merge-logs` commands. Did not document the `redactable-logs` flag for `cockroach start`, `cockroach start-single-node`, and `cockroach demo` because the flag is now enabled by default and no longer needs user input.

Added an example to redact logs and the sample output. Not sure if we need to elaborate on the logging format - IMO, it needs to be a follow-up docs project that's bigger than just redacted logs.
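A minimal sketch of how the two flags discussed in this PR might be invoked, assuming a v20.2 `cockroach` binary on the `PATH`. The wrapper function names, output path, and log paths are hypothetical. The flag names are the ones used in the review thread: `--redact-logs` for `debug zip` and `--redact` for `debug merge-logs`. Connection flags are passed through unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; only the cockroach subcommands and the
# --redact-logs / --redact flags come from this PR.

# Collect a debug zip with sensitive data redacted from the log files.
# $1 = output file; remaining args = connection flags (e.g. --insecure).
redacted_debug_zip() {
  local out="$1"
  shift
  cockroach debug zip "$out" --redact-logs "$@"
}

# Merge log files into a single stream with sensitive data redacted.
# All args = log file paths or globs.
redacted_merge_logs() {
  cockroach debug merge-logs "$@" --redact
}
```

For example, `redacted_debug_zip ./debug.zip --insecure` would run `cockroach debug zip ./debug.zip --redact-logs --insecure`. As noted in the review, the flag redacts only the log files in the zip; the other collected items may still contain sensitive information.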