Use cases of UnmarshalFirst? #483

Closed
MastaP opened this issue Jan 31, 2024 · 2 comments


MastaP commented Jan 31, 2024

Hello,

I'm new to CBOR and trying to wrap my head around the functionality that UnmarshalFirst() provides.
The documentation says:

// UnmarshalFirst decodes first CBOR data item and returns remaining bytes.

It seems that with this lib, the top-level data item is always a CBOR array or a map. Thus, UnmarshalFirst() always returns the whole item, that is, the array or map with all of its contents.

The only way I was able to make use of UnmarshalFirst() was to create what looks like malformed CBOR by concatenating two (or more) encoded structures into a single byte slice.

func TestCBOR_concat_UnmarshalFirst(t *testing.T) {
	type MySubStruct struct {
		B byte
	}
	data := []*MySubStruct{{B: 1}, {B: 2}}

	// Encode each struct back to back into a single buffer.
	buf := &bytes.Buffer{}
	enc := cbor.NewEncoder(buf)
	for _, d := range data {
		require.NoError(t, enc.Encode(d))
	}
	rest := buf.Bytes() // named rest so it does not shadow the bytes package
	//require.NoError(t, cbor.Wellformed(rest)) // fails
	fmt.Printf("%X\n", rest)

	// Decode one data item at a time until no bytes remain.
	result := make([]*MySubStruct, 0)
	for len(rest) > 0 {
		df, r, err := cbor.DiagnoseFirst(rest)
		require.NoError(t, err)
		fmt.Printf("first: %s\n", df)
		fmt.Printf("rest: %X\n", r)
		d := &MySubStruct{}
		rest, err = cbor.UnmarshalFirst(rest, d)
		require.NoError(t, err)
		result = append(result, d)
	}
	require.Equal(t, data, result)
}

Is it possible to construct a well-formed CBOR encoding and still iterate over multiple data items at the top level?

Thanks.

p.s. in my use-case I'd like to version my data structures and be able to use UnmarshalFirst() to first read the version and then decode the rest of the payload accordingly.

@fxamacker
Owner

Hi @MastaP 👋

This library supports both CBOR (RFC 8949) and CBOR Sequences (RFC 8742). A CBOR Sequence is simply a concatenation of zero or more CBOR data items.
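To show why plain concatenation needs no extra framing, here is a minimal stdlib-only sketch that splits a CBOR Sequence by reading each item's head. The helpers `itemLen` and `splitSequence` are hypothetical names for illustration and are not part of this library; the sketch assumes definite-length items only and does no validity checking.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// itemLen returns the encoded length in bytes of the first definite-length
// CBOR data item in b. Hypothetical helper: it rejects indefinite-length
// items and performs no validity checks (UTF-8, duplicate map keys, etc.).
func itemLen(b []byte) (int, error) {
	if len(b) == 0 {
		return 0, errors.New("empty input")
	}
	major, info := b[0]>>5, b[0]&0x1f
	head, arg := 1, uint64(info)
	switch {
	case info < 24:
		// The argument is stored in the initial byte itself.
	case info <= 27:
		n := 1 << (info - 24) // 1, 2, 4, or 8 argument bytes follow
		if len(b) < 1+n {
			return 0, errors.New("truncated head")
		}
		buf := make([]byte, 8)
		copy(buf[8-n:], b[1:1+n])
		arg = binary.BigEndian.Uint64(buf)
		head += n
	default:
		return 0, errors.New("reserved or indefinite-length item: not handled in this sketch")
	}
	switch major {
	case 0, 1, 7: // integers, simple values, floats: the head is the whole item
		return head, nil
	case 2, 3: // byte string or text string: head plus arg payload bytes
		if uint64(len(b)-head) < arg {
			return 0, errors.New("truncated string")
		}
		return head + int(arg), nil
	}
	// Arrays (4), maps (5), and tags (6) are a head followed by nested items.
	count := arg
	if major == 5 {
		count = 2 * arg // each map entry is a key plus a value
	}
	if major == 6 {
		count = 1 // a tag wraps exactly one item
	}
	total := head
	for i := uint64(0); i < count; i++ {
		n, err := itemLen(b[total:])
		if err != nil {
			return 0, err
		}
		total += n
	}
	return total, nil
}

// splitSequence cuts a CBOR Sequence into its individual encoded items.
func splitSequence(seq []byte) ([][]byte, error) {
	var items [][]byte
	for len(seq) > 0 {
		n, err := itemLen(seq)
		if err != nil {
			return nil, err
		}
		items = append(items, seq[:n])
		seq = seq[n:]
	}
	return items, nil
}

func main() {
	// Hand-encoded sequence: 1, "hi", true, [1, 2].
	seq := []byte{0x01, 0x62, 'h', 'i', 0xf5, 0x82, 0x01, 0x02}
	items, err := splitSequence(seq)
	if err != nil {
		panic(err)
	}
	for _, it := range items {
		fmt.Printf("%x\n", it)
	}
}
```

Each item carries its own length information in its head, so a reader always knows where the next item starts. In real code, UnmarshalFirst() does this for you and also validates the item.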

Unmarshal() requires one CBOR data item and it must not have any trailing bytes. Otherwise, the data is rejected for being malformed.

UnmarshalFirst() allows trailing bytes, so it supports more use-cases than Unmarshal().

Use-cases for UnmarshalFirst() include:

  • decoding first CBOR data item (RFC 8949) in a CBOR Sequence (RFC 8742)

  • decoding first CBOR data item in a mixed encoding (e.g. CBOR data item as header, followed by non-CBOR payload)

UnmarshalFirst() checks the first CBOR data item for well-formedness and validity but allows trailing bytes. For performance, it does not try to decode or check any of the trailing bytes.

> p.s. in my use-case I'd like to version my data structures and be able to use UnmarshalFirst() to first read the version and then decode the rest of the payload accordingly.

For your use-case, you can encode to a CBOR Sequence:

  • first CBOR data item for metadata such as version, count of top-level data items, etc.
  • remaining CBOR data item(s) for the data.

At a glance, your code snippet looks like it is using CBOR Sequence.

Both Unmarshal() and UnmarshalFirst() check for well-formedness and validity (with some differences) so you don't need to manually check before decoding.

> It seems that with this lib, the top-level data item is always a CBOR array or a map.

Actually, this library supports all CBOR data items as top-level data item, such as CBOR integers, bool, array, map, etc.

For your use-case, these RFCs may be of interest:

  • RFC 8949, CBOR (defines CBOR data item)
  • RFC 8742, CBOR Sequences (concatenated CBOR data items)
  • RFC 7049, CBOR (obsoleted by RFC 8949, referenced by RFC 8742, RFC 8610, etc.)
  • Maybe also RFC 8610, CDDL (if you need to specify very complex CBOR-based data formats)

MastaP commented Feb 7, 2024

Thanks a lot for the comprehensive reply, @fxamacker .

MastaP closed this as completed Feb 7, 2024