2 changes: 1 addition & 1 deletion content/editions/features.md
@@ -472,7 +472,7 @@ message Msg {

This feature determines how generated code should treat string fields. This
replaces the `ctype` option from proto2 and proto3, and offers a new
-`string_view` feature. In Edition 2023, either `ctype` or `string_view` may be
+`string_view` feature. In Edition 2023, either `ctype` or `string_type` may be
specified on a field, but not both.

**Values available:**
14 changes: 7 additions & 7 deletions content/includes/version-tables.css
@@ -95,36 +95,36 @@ table.version-chart td.active {
/* latest release column */
/*
* How to advance the selection of the latest release:
- * Replace class 'y24q3' in the following selectors with 'y24q4' (the class
+ * Replace class 'y24q4' in the following selectors with 'y25q1' (the class
* referring to the quarter of the next release). Please also update this
* instruction as a courtesy to the next maintainer.
*/

/* visually replace 'yyQq' heading with string 'Latest' */
-table.version-chart th.y24q3 span {
+table.version-chart th.y24q4 span {
display: none;
}
-table.version-chart th.y24q3::after {
+table.version-chart th.y24q4::after {
content: "Latest"
}

/* draw a focus rectangle around the latest release column */
-table.version-chart th.y24q3 {
+table.version-chart th.y24q4 {
border-top: 2px solid #e06666 !important;
border-left: 2px solid #e06666 !important;
border-right: 2px solid #e06666 !important;
}
-table.version-chart td.y24q3 {
+table.version-chart td.y24q4 {
font-weight: bold;
border-left: 2px solid #e06666 !important;
border-right: 2px solid #e06666 !important;
}
-table.version-chart tr:last-child td.y24q3 {
+table.version-chart tr:last-child td.y24q4 {
border-bottom: 2px solid #e06666 !important;
}

/* future release columns */
-table.version-chart td:not(:has(~ .y24q3)):not(.y24q3) {
+table.version-chart td:not(:has(~ .y24q4)):not(.y24q4) {
font-style: italic;
}

77 changes: 77 additions & 0 deletions content/news/2024-12-04.md
@@ -0,0 +1,77 @@
+++
title = "Changes announced December 4, 2024"
linkTitle = "December 4, 2024"
toc_hide = "true"
description = "Changes announced for Protocol Buffers on December 4, 2024."
type = "docs"
+++

We are planning to modify the Protobuf debug APIs (including Protobuf
AbslStringify, `proto2::ShortFormat`, `proto2::Utf8Format`,
`Message::DebugString`, `Message::ShortDebugString`, `Message::Utf8DebugString`)
in v30 to redact sensitive fields annotated by `debug_redact`; the outputs of
these APIs will contain a per-process randomized prefix, and so will no longer
be parseable by Protobuf TextFormat Parsers.

## Motivation

Currently Protobuf debug APIs print every field in a proto into human-readable
formats. This may lead to privacy incidents where developers accidentally log
Protobuf debug outputs containing sensitive fields.

## How to Annotate Sensitive Fields

There are two ways to mark fields sensitive:

* Mark a field directly with the field option `debug_redact = true`.

```proto
message Foo {
  optional string secret = 1 [debug_redact = true];
}
```

* If you have already defined a field annotation of type Enum by extending
`proto2.FieldOptions`, and certain values of this annotation are used to
annotate fields you would like to redact, then you can annotate these values
with `debug_redact = true`. All the fields that have been annotated with
such values will be redacted.

```proto
package my.package;

extend proto2.FieldOptions {
  // The existing field annotation
  optional ContentType content_type = 1234567;
};

enum ContentType {
  PUBLIC = 0;
  SECRET = 1 [debug_redact = true];
};

message Foo {
  // will not be redacted
  optional string public_info = 1 [
    (my.package.content_type) = PUBLIC
  ];
  // will be redacted
  optional string secret = 2 [
    (my.package.content_type) = SECRET
  ];
}
```

## New Debug Format

Compared to the existing debug format, the new debug format has two major
differences:

* The sensitive fields annotated with `debug_redact` are redacted
automatically in the output formats.
* The output formats will contain a per-process randomized prefix, which
makes them no longer parsable by TextFormat parsers.

Note that the second change applies regardless of whether the proto contains
sensitive fields, which ensures that debug output can never be deserialized,
whatever the proto content.
2 changes: 2 additions & 0 deletions content/news/_index.md
@@ -20,6 +20,8 @@ New news topics will also be published to the
The following news topics provide information in the reverse order in which it
was released.

* [December 4, 2024](/news/2024-12-04) - `DebugString`
replaced
* [November 7, 2024](/news/2024-11-07) - More breaking
changes in the upcoming 30.x release
* [October 2, 2024](/news/2024-10-02) - Breaking
12 changes: 12 additions & 0 deletions content/news/v30.md
@@ -43,6 +43,18 @@ dependencies.
If you use installed packages, you won't be affected. It could break some CMake
workflows.

### Modify Debug APIs to Redact Sensitive Fields {#debug-redaction}

We are planning to modify the Protobuf debug APIs (including Protobuf
AbslStringify, `proto2::ShortFormat`, `proto2::Utf8Format`,
`Message::DebugString`, `Message::ShortDebugString`, `Message::Utf8DebugString`)
in v30 to redact sensitive fields annotated by `debug_redact`; the outputs of
these APIs will contain a per-process randomized prefix, and so will no longer
be parseable by Protobuf TextFormat Parsers.

Read more about this in the
[news article released December 4, 2024](/news/2024-12-04).

### Remove Deprecated APIs {#remove-deprecated}

v30 will remove the following public runtime APIs, which have been marked
120 changes: 120 additions & 0 deletions content/programming-guides/deserialize-debug.md
@@ -0,0 +1,120 @@
+++
title = "Deserializing Debug Proto Representations"
weight = 89
description = "How to log debugging information in Protocol Buffers."
type = "docs"
+++

From version 29.x, `DebugString` APIs (`proto2::DebugString`,
`proto2::ShortDebugString`, `proto2::Utf8DebugString`) are deprecated.
DebugString users should migrate to Abseil string functions (such as
`absl::StrCat`, `absl::StrFormat`, `absl::StrAppend`, and `absl::Substitute`),
the Abseil logging API, and some Protobuf APIs (`proto2::ShortFormat`,
`proto2::Utf8Format`), which automatically convert proto arguments into a new
debugging format.

Unlike the Protobuf DebugString output format, the new debugging format
automatically redacts sensitive fields by replacing their values with the string
"[REDACTED]" (without the quotation marks). In
addition, to ensure that this new output format cannot be deserialized by
Protobuf TextFormat parsers, regardless of whether the underlying proto contains
SPII fields, we add a set of randomized links pointing to this article
and a randomized-length whitespace sequence. The new debugging format looks as
follows:

```none
go/nodeserialize
spii_field: [REDACTED]
normal_field: "value"
```

Note that the new debugging format differs from the DebugString output format
in only two ways:

* The URL prefix
* The values of SPII fields are replaced by
"[REDACTED]" (without the quotes)

The new debugging format never removes any field names; it only replaces the
value with
"[REDACTED]" if the field is considered sensitive.
**If you don't see certain fields in the output, it is because those fields are
not set in the proto.**

**Tip:** If you see only the URL and nothing else, your proto is empty!
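
To make the two formatting rules above concrete, here is a minimal, self-contained C++ sketch. It is a toy model, not Protobuf's implementation: `ToyField` and `ToyDebugFormat` are hypothetical names invented for illustration, and the fixed `go/nodeserialize` first line only mirrors the sample output above (the real prefix is randomized).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one *set* field of a message; not a Protobuf type.
struct ToyField {
  std::string name;
  std::string value;
  bool debug_redact;  // mirrors the `debug_redact` field option
};

// Renders the set fields in a debug-style layout. Field names are always
// printed; only the values of sensitive fields are replaced.
std::string ToyDebugFormat(const std::vector<ToyField>& fields) {
  std::string out = "go/nodeserialize\n";  // marker line, as in the sample
  for (const ToyField& field : fields) {
    assert(!field.name.empty());  // every set field has a name
    out += field.name + ": ";
    if (field.debug_redact) {
      out += "[REDACTED]";
    } else {
      out += "\"" + field.value + "\"";
    }
    out += "\n";
  }
  return out;
}
```

Calling `ToyDebugFormat` with no fields returns only the marker line, which corresponds to the empty-proto case described in the tip.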

## Why is this URL here?

We want to make sure nobody deserializes human-readable representations of a
protobuf message intended for humans debugging a system. Historically,
`.DebugString()` and `TextFormat` were interchangeable, and existing systems use
DebugString to transport and store data.

We want to make sure sensitive data does not accidentally end up in logs.
Therefore, we are transparently redacting some field values from protobuf
messages before turning them into a string
("[REDACTED]"). This reduces the security & privacy
risk of accidental logging, but risks data loss if other systems deserialize
your message. To address this risk, we are intentionally splitting the
machine-readable TextFormat from the human-readable debug format to be used in
log messages.

### Why are there links in my web page? Why is my code producing this new "debug representation"?

This is intentional, to make the "debug representation" of your protos
(produced, for example, by logging) incompatible with TextFormat. We want to
prevent anyone from depending on debugging mechanisms to transport data between
programs. Historically, the debug format (generated by the DebugString APIs) and
TextFormat have been incorrectly used in an interchangeable fashion. We hope this
intentional effort will prevent that going forward.

We intentionally picked a link over less visible format changes to get an
opportunity to provide context. This might stand out in UIs, such as if you
display status information on a table in a webpage. You may use
`TextFormat::PrintToString` instead, which will not redact any information and
preserves formatting. However, use this API cautiously: there are no built-in
protections. As a rule of thumb, if you are writing data to debug logs, or
producing status messages, you should continue to use the Debug Format with the
link. Even if you are currently not handling sensitive data, keep in mind that
systems can change and code gets re-used.

### I tried converting this message into TextFormat, but I noticed the format changes every time my process restarts.

This is intentional. Don't attempt to parse the output of this debug format. We
reserve the right to change the syntax without notice. The debug format syntax
randomly changes per process to prevent inadvertent dependencies. If a syntactic
change in the debug format would break your system, chances are you shouldn't
use the debug representation of a proto.
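
The per-process behavior can be pictured with the following toy C++ sketch. It is not how Protobuf implements the randomization: `ToyProcessMarker` and `ToyDebugLine` are hypothetical names, and the choice of one to three trailing spaces is an assumption made purely for illustration. The point is that the marker is chosen once per process, so output is stable within a run but may differ after a restart.

```cpp
#include <cassert>
#include <random>
#include <string>

// Returns a marker that is chosen once per process and then reused.
// The 1-3 space range is an illustrative assumption, not Protobuf's scheme.
const std::string& ToyProcessMarker() {
  static const std::string marker = [] {
    std::random_device rd;
    std::uniform_int_distribution<int> length(1, 3);
    return "go/nodeserialize" + std::string(length(rd), ' ');
  }();
  return marker;
}

// Formats a single field behind the per-process marker.
std::string ToyDebugLine(const std::string& name, const std::string& value) {
  assert(!name.empty());
  return ToyProcessMarker() + name + ": \"" + value + "\"";
}
```

Any code that matches on the exact bytes of such output would pass in one run and could fail in the next, which is the kind of inadvertent dependency the randomization is meant to surface early.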

## FAQ

### Can I Just Use TextFormat Everywhere?

Don't use TextFormat for producing log messages. This will bypass all built-in
protections, and you risk accidentally logging sensitive information. Even if
your systems are currently not handling any sensitive data, this can change in
the future.

Distinguish logs from information that's meant for further processing by other
systems by using either the debug representation or TextFormat as appropriate.

### I Want to Write Configuration Files That Need to Be Both Human-Readable And Machine-Readable

For this use case, you can use TextFormat explicitly. You are responsible for
making sure your configuration files don't contain any PII.

### I Am Writing a Unit Test, and Want to Compare DebugString in a Test Assertion

If you want to compare protobuf values, use `MessageDifferencer`, as in the
following example:

```cpp
using google::protobuf::util::MessageDifferencer;
...
MessageDifferencer diff;
...
diff.Compare(foo, bar);
```

Besides ignoring formatting and field-order differences, `MessageDifferencer`
also gives you better error messages.
2 changes: 1 addition & 1 deletion content/programming-guides/dos-donts.md
@@ -183,7 +183,7 @@ from repeated to scalar will result in the last deserialized value "winning."

Going from scalar to repeated is OK in proto2 and in proto3 with
`[packed=false]` because for binary serialization the scalar value becomes a
-one-element list .
+one-element list.

<a id="do-follow-the-style-guide-for-generated-code"></a>
