@@ -29,13 +29,15 @@ discuss aspects of the wire format.
 The Protoscope tool can also dump encoded protocol buffers as text. See
 https://github.com/protocolbuffers/protoscope/tree/main/testdata for examples.
 
+All examples in this topic assume that you are using Edition 2023 or later.
+
 ## A Simple Message {#simple}
 
 Let's say you have the following very simple message definition:
 
 ```proto
 message Test1 {
-  optional int32 a = 1;
+  int32 a = 1;
 }
 ```
 
@@ -241,7 +243,7 @@ Consider this message schema:
 
 ```proto
 message Test2 {
-  optional string b = 2;
+  string b = 2;
 }
 ```
 
@@ -275,7 +277,7 @@ an embedded message of our original example message, `Test1`:
 
 ```proto
 message Test3 {
-  optional Test1 c = 3;
+  Test1 c = 3;
 }
 ```
 
@@ -293,36 +295,49 @@ and a length of 3, exactly the same way as strings are encoded.
 In Protoscope, submessages are quite succinct. ` ``1a03089601`` ` can be written
 as `3: {1: 150}`.
 
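To make the nesting concrete, here is a minimal Python sketch that reads the bytes `1a03089601` by hand. It is not the protobuf library's parser: `read_varint` and `read_records` are hypothetical helper names, and the sketch handles only the `VARINT` and `LEN` wire types that this example needs.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # continuation bit clear: last byte of the varint
            return result, pos
        shift += 7


def read_records(buf):
    """Yield (field_number, wire_type, value) for VARINT (0) and LEN (2) records."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # VARINT
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # LEN: a length prefix, then that many payload bytes
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        yield field_number, wire_type, value


outer = list(read_records(bytes.fromhex("1a03089601")))
# The LEN payload of field 3 is itself an encoded Test1 message.
inner = list(read_records(outer[0][2]))
print(outer)  # [(3, 2, b'\x08\x96\x01')]
print(inner)  # [(1, 0, 150)]
```

Nothing on the wire marks the payload of field 3 as a message rather than a string; the reader has to know from the schema that it should be parsed again.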
-## Optional and Repeated Elements {#optional}
+## Missing Elements {#optional}
 
-Missing `optional` fields are easy to encode: we just leave out the record if
+Missing fields are easy to encode: we just leave out the record if
 it's not present. This means that "huge" protos with only a few fields set are
 quite sparse.
 
-`repeated` fields are a bit more complicated. Ordinary (not [packed](#packed))
-repeated fields emit one record for every element of the field. Thus, if we have
+<span id="packed"></span>
+
+## Repeated Elements {#repeated}
+
+Starting in Edition 2023, `repeated` fields of a primitive type
+(any [scalar type](/programming-guides/proto2#scalar)
+that is not `string` or `bytes`) are ["packed"](/editions/features#repeated_field_encoding) by default.
+
+Packed `repeated` fields, instead of being encoded as one
+record per entry, are encoded as a single `LEN` record that contains each
+element concatenated. To decode, elements are decoded from the `LEN` record one
+by one until the payload is exhausted. The start of the next element is
+determined by the length of the previous, which itself depends on the type of
+the field. Thus, if we have:
 
 ```proto
 message Test4 {
-  optional string d = 4;
-  repeated int32 e = 5;
+  string d = 4;
+  repeated int32 e = 6;
 }
 ```
 
 and we construct a `Test4` message with `d` set to `"hello"`, and `e` set to
-`1`, `2`, and `3`, this *could* be encoded as `` `220568656c6c6f280128022803`
-``, or written out as Protoscope,
+`3`, `270`, and `86942`, this *could* be encoded as `` `220568656c6c6f3206038e029ea705` ``, or
+written out as Protoscope,
 
 ```proto
 4: {"hello"}
-5: 1
-5: 2
-5: 3
+6: {3 270 86942}
 ```
 
-However, records for `e` do not need to appear consecutively, and can be
-interleaved with other fields; only the order of records for the same field with
-respect to each other is preserved. Thus, this could also have been encoded as
+However, if the repeated field is set to expanded (overriding the default packed
+state) or is not packable (strings and messages) then an entry for each
+individual value is encoded. Also, records for `e` do not need to appear
+consecutively, and can be interleaved with other fields; only the order of
+records for the same field with respect to each other is preserved. Thus, this
+could look like the following:
 
 ```proto
 5: 1
@@ -331,6 +346,24 @@ respect to each other is preserved. Thus, this could also have been encoded as
 5: 3
 ```
 
+Only repeated fields of primitive numeric types can be declared "packed". These
+are types that would normally use the `VARINT`, `I32`, or `I64` wire types.
+
+Note that although there's usually no reason to encode more than one key-value
+pair for a packed repeated field, parsers must be prepared to accept multiple
+key-value pairs. In this case, the payloads should be concatenated. Each pair
+must contain a whole number of elements. The following is a valid encoding of
+the same message above that parsers must accept:
+
+```proto
+6: {3 270}
+6: {86942}
+```
+
+Protocol buffer parsers must be able to parse repeated fields that were compiled
+as `packed` as if they were not packed, and vice versa. This permits adding
+`[packed=true]` to existing fields in a forward- and backward-compatible way.
+
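The two parser requirements above (concatenating several packed payloads, and accepting packed and expanded encodings interchangeably) can be sketched in Python as follows. This is an illustrative reader under stated assumptions, not the protobuf library's implementation: `read_varint` and `read_repeated_varints` are hypothetical names, only the `VARINT` and `LEN` wire types are handled, and values are assumed to be non-negative.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_repeated_varints(buf, field_number):
    """Collect the elements of a repeated varint field, packed or expanded."""
    values = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # expanded: one VARINT record per element
            value, pos = read_varint(buf, pos)
            if number == field_number:
                values.append(value)
        elif wire_type == 2:  # packed: one LEN record holding many elements
            length, pos = read_varint(buf, pos)
            end = pos + length
            if number == field_number:
                while pos < end:
                    value, pos = read_varint(buf, pos)
                    values.append(value)
            pos = end  # also skips LEN records that belong to other fields
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
    return values


single = bytes.fromhex("3206038e029ea705")      # 6: {3 270 86942}
split = bytes.fromhex("3203038e0232039ea705")   # 6: {3 270} then 6: {86942}
expanded = bytes.fromhex("3003308e02309ea705")  # 6: 3, 6: 270, 6: 86942

for encoded in (single, split, expanded):
    print(read_repeated_varints(encoded, 6))  # [3, 270, 86942] each time
```

Because the elements are appended in the order the records arrive, the three encodings decode to the same list.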
 ### Oneofs {#oneofs}
 
 [`Oneof` fields](/programming-guides/proto2#oneof) are
@@ -368,53 +401,6 @@ message.MergeFrom(message2);
 This property is occasionally useful, as it allows you to merge two messages (by
 concatenation) even if you do not know their types.
 
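A small Python sketch of merging by concatenation, assuming a message whose fields are all non-repeated varints, so that a later record for the same field number replaces an earlier one. The helper names `read_varint` and `parse_scalar_varints` are hypothetical, and the parser here is hand-written for illustration rather than taken from the protobuf library.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_scalar_varints(buf):
    """Return {field_number: value}, keeping the last value seen for each field."""
    fields = {}
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        if tag & 0x07 != 0:
            raise ValueError("only VARINT records are handled in this sketch")
        value, pos = read_varint(buf, pos)
        fields[tag >> 3] = value  # a later record overwrites an earlier one
    return fields


message1 = bytes.fromhex("089601")    # 1: 150
message2 = bytes.fromhex("08011005")  # 1: 1, 2: 5

# Concatenating the two byte strings needs no knowledge of the schema.
merged = parse_scalar_varints(message1 + message2)
print(merged)  # {1: 1, 2: 5}
```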
-### Packed Repeated Fields {#packed}
-
-Starting in v2.1.0, `repeated` fields of a primitive type
-(any [scalar type](/programming-guides/proto2#scalar)
-that is not `string` or `bytes`) can be declared as "packed". In proto2 this is
-done using the field option `[packed=true]`. In proto3 it is the default.
-
-Instead of being encoded as one record per entry, they are encoded as a single
-`LEN` record that contains each element concatenated. To decode, elements are
-decoded from the `LEN` record one by one until the payload is exhausted. The
-start of the next element is determined by the length of the previous, which
-itself depends on the type of the field.
-
-For example, imagine you have the message type:
-
-```proto
-message Test5 {
-  repeated int32 f = 6 [packed=true];
-}
-```
-
-Now let's say you construct a `Test5`, providing the values 3, 270, and 86942
-for the repeated field `f`. Encoded, this gives us `` `3206038e029ea705` ``, or
-as Protoscope text,
-
-```proto
-6: {3 270 86942}
-```
-
-Only repeated fields of primitive numeric types can be declared "packed". These
-are types that would normally use the `VARINT`, `I32`, or `I64` wire types.
-
-Note that although there's usually no reason to encode more than one key-value
-pair for a packed repeated field, parsers must be prepared to accept multiple
-key-value pairs. In this case, the payloads should be concatenated. Each pair
-must contain a whole number of elements. The following is a valid encoding of
-the same message above that parsers must accept:
-
-```proto
-6: {3 270}
-6: {86942}
-```
-
-Protocol buffer parsers must be able to parse repeated fields that were compiled
-as `packed` as if they were not packed, and vice versa. This permits adding
-`[packed=true]` to existing fields in a forward- and backward-compatible way.
-
 ### Maps {#maps}
 
 Map fields are just a shorthand for a special kind of repeated field. If we have
@@ -430,8 +416,8 @@ this is actually the same as
 ```proto
 message Test6 {
   message g_Entry {
-    optional string key = 1;
-    optional int32 value = 2;
+    string key = 1;
+    int32 value = 2;
   }
   repeated g_Entry g = 7;
 }