feature: add support for handling CBOR enumerated alternative data items #508

Open
agaffney opened this issue Mar 18, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@agaffney
Contributor

Is your feature request related to a problem? Please describe.

There doesn't seem to be a reasonable way to implement support for CBOR alternatives (as described in section 9.1 of https://datatracker.ietf.org/doc/draft-bormann-cbor-notable-tags/). There are 129 tag numbers that can be represented by 3 different types, but there's no way to use TagSet to map multiple tag numbers to the same type in a way that the tag number can be preserved for later reference.
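For readers unfamiliar with the draft, the mapping between tag numbers and alternative numbers can be sketched as follows. This is a minimal illustration based on one reading of section 9.1 (tags 121 to 127 for alternatives 0 to 6, tags 1280 to 1400 for alternatives 7 to 127, and tag 101 as a general form that carries the alternative number in its content); the function name `alternativeFromTag` is hypothetical and not part of any library.

```go
package main

import "fmt"

// alternativeFromTag maps a CBOR tag number to an alternative number.
// The ranges are an assumption taken from section 9.1 of
// draft-bormann-cbor-notable-tags; check the draft before relying on them.
// Tag 101 is not handled here because its alternative number lives in the
// tag content rather than in the tag number.
func alternativeFromTag(tag uint64) (alt uint64, ok bool) {
	switch {
	case tag >= 121 && tag <= 127:
		// Compact form for alternatives 0..6.
		return tag - 121, true
	case tag >= 1280 && tag <= 1400:
		// Extended form for alternatives 7..127.
		return tag - 1280 + 7, true
	default:
		return 0, false
	}
}

func main() {
	for _, tag := range []uint64{121, 122, 127, 1280, 1400, 101} {
		alt, ok := alternativeFromTag(tag)
		fmt.Printf("tag=%d alt=%d ok=%v\n", tag, alt, ok)
	}
}
```

The point of the sketch is that the alternative number is derived from the tag number, so a decoder that discards the tag number loses information.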

Describe the solution you'd like

I need to be able to support arbitrary CBOR alternative numbers without creating a separate type for each and every tag number. I don't really have a good suggestion for how to accomplish this without adding support directly into the library.

Describe alternatives you've considered

N/A

Additional context

N/A

@agaffney
Contributor Author

agaffney commented Mar 18, 2024

A reasonable way to approach this might be a flag in TagOptions that will pass the entire tag CBOR to the custom type's Unmarshal function rather than just the tag content CBOR. This would allow the custom type to do a generic decode of the tag to extract the tag number without breaking any existing interfaces.

EDIT: my understanding of how this works was wrong

@fxamacker
Owner

@agaffney thanks for opening this issue and suggesting an approach!

To confirm my understanding, the suggested approach doesn't change the API or default behavior. It adds a new opt-in flag that makes the codec provide both the CBOR tag number and the content (instead of just the content) to the custom type's Unmarshal function.

This sounds promising. I'll try to take a look this month at the existing tag-related code to confirm whether this is feasible.

@fxamacker fxamacker changed the title feature: method for handling CBOR alternatives feature: add support for handling CBOR enumerated alternative data items Mar 20, 2024
@fxamacker fxamacker added the enhancement New feature or request label Mar 20, 2024
@fxamacker
Owner

@agaffney Thanks again for opening this issue! I took a closer look at the current tag-related code.

Currently, both the CBOR tag number and its content are passed to the custom type's UnmarshalCBOR() if the custom type was registered with TagSet. However, each type can be registered with a TagSet only once, because TagSet is also used for encoding and we can't encode the same type with different tag numbers.

I'd like more info. Can you share some example code for your use case? For example,

  • does your custom type implement Unmarshaler?
  • are you decoding to interface{}?
  • etc.

@agaffney
Contributor Author

@fxamacker I'm decoding into interface{} and my custom type implements Unmarshaler.

You say that both the CBOR tag number and its content are passed to the custom type's UnmarshalCBOR() function, but that's not my experience in practice. In that regard, decoding seems asymmetrical with encoding when custom marshal/unmarshal functions are used with TagSet. I've got an example custom tag type implemented here, and I have to ignore the tag part on unmarshal and explicitly add it on marshal.

https://github.com/blinklabs-io/gouroboros/blob/ef3e13e5b5e2fc84bbf47305ab6e2a25b556b0a6/cbor/tags.go#L92-L110

@fxamacker
Owner

@agaffney

> I'm decoding into interface{} and my custom type implements Unmarshaler.

I see, thanks!

> You say that both the CBOR tag number and its content are passed to the custom type's UnmarshalCBOR() function, but that's not my experience in practice. In that regard, decoding seems asymmetrical with encoding when custom marshal/unmarshal functions are used with TagSet. I've got an example custom tag type implemented here, and I have to ignore the tag part on unmarshal and explicitly add it on marshal.

Hmm, in the following code example, UnmarshalCBOR() receives the entire encoded CBOR tag (both number and content).

https://go.dev/play/p/-zSz6NcTiWL

package main

import (
	"encoding/hex"
	"fmt"
	"reflect"

	"github.com/fxamacker/cbor/v2"
)

// enum keeps a copy of the raw bytes passed to UnmarshalCBOR.
type enum struct {
	data []byte
}

func (e *enum) UnmarshalCBOR(data []byte) error {
	// 👉 data contains entire CBOR tag (both number and content)
	e.data = append(e.data[:0], data...)
	return nil
}

func main() {
	data, _ := hex.DecodeString("d87a42ff00") // 122(h'ff00')

	// Register enum for tag number 122.
	tags := cbor.NewTagSet()
	_ = tags.Add(
		cbor.TagOptions{EncTag: cbor.EncTagRequired, DecTag: cbor.DecTagRequired},
		reflect.TypeOf(enum{}),
		122,
	)

	dm, _ := cbor.DecOptions{}.DecModeWithTags(tags)

	var v interface{}
	err := dm.Unmarshal(data, &v)
	if err != nil {
		fmt.Printf("failed to unmarshal data 0x%x: %s\n", data, err)
	} else {
		fmt.Printf("enum.data: %x\n", v.(enum).data)
	}

	// Output:
	// enum.data: d87a42ff00
}

@agaffney
Contributor Author

@fxamacker it seems there may be more magic than I had thought. In your example, I can take data in enum.UnmarshalCBOR() and unmarshal it into both cbor.Tag and []byte without any special handling. I guess I had originally implemented it to only decode the tag content and then assumed that it must be that way because otherwise how could it be working?
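To make the "magic" less opaque, here is a rough, standard-library-only sketch of how a tag number can be read from the bytes that UnmarshalCBOR() receives. It follows the generic CBOR head layout (major type 6 in the top three bits, argument length in the low five bits) and is not how the library itself does it; the helper name `splitTag` is hypothetical.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// splitTag reads the head of an encoded CBOR tag and returns the tag number
// together with the remaining bytes (the still-encoded tag content).
// Hypothetical helper for illustration only.
func splitTag(data []byte) (uint64, []byte, error) {
	if len(data) == 0 {
		return 0, nil, errors.New("empty input")
	}
	if data[0]>>5 != 6 {
		return 0, nil, errors.New("not a CBOR tag (major type 6)")
	}
	info := data[0] & 0x1f
	rest := data[1:]
	if info < 24 {
		// Tag number is stored directly in the initial byte.
		return uint64(info), rest, nil
	}
	if info > 27 {
		return 0, nil, errors.New("invalid additional information for a tag")
	}
	n := 1 << (info - 24) // 1, 2, 4, or 8 argument bytes follow
	if len(rest) < n {
		return 0, nil, errors.New("truncated tag head")
	}
	var num uint64
	for _, b := range rest[:n] {
		num = num<<8 | uint64(b) // big-endian argument
	}
	return num, rest[n:], nil
}

func main() {
	data, _ := hex.DecodeString("d87a42ff00") // 122(h'ff00')
	num, content, err := splitTag(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("tag=%d content=%x\n", num, content)
	// Output:
	// tag=122 content=42ff00
}
```

In practice, unmarshaling the same bytes into cbor.Tag, as described above, is the simpler route; the sketch only shows why the tag number is recoverable from the raw data.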

@agaffney
Contributor Author

I can now confirm that I can capture the tag number. What I can't do is register multiple tag numbers for the same type.
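To illustrate what "one type, many tag numbers" would mean on the encode side, here is a minimal standard-library sketch. It assumes the tag ranges from section 9.1 of the draft (121 to 127 for alternatives 0 to 6, 1280 to 1400 for alternatives 7 to 127); the `Alternative` type and its methods are hypothetical, and this is not something TagSet supports today.

```go
package main

import (
	"errors"
	"fmt"
)

// Alternative is a hypothetical single type covering every enumerated
// alternative. The tag number is chosen from Index at encode time.
type Alternative struct {
	Index   uint64 // alternative number
	Content []byte // already-encoded CBOR content
}

// tagNumber picks the tag number for the compact forms. Ranges are an
// assumption based on draft-bormann-cbor-notable-tags section 9.1.
func (a Alternative) tagNumber() (uint64, error) {
	switch {
	case a.Index <= 6:
		return 121 + a.Index, nil
	case a.Index <= 127:
		return 1280 + (a.Index - 7), nil
	default:
		return 0, errors.New("alternative needs the general form (tag 101)")
	}
}

// encode prepends the CBOR tag head to the content.
func (a Alternative) encode() ([]byte, error) {
	tag, err := a.tagNumber()
	if err != nil {
		return nil, err
	}
	var head []byte
	if tag <= 0xff {
		head = []byte{0xd8, byte(tag)} // major type 6, 1-byte argument
	} else {
		head = []byte{0xd9, byte(tag >> 8), byte(tag)} // 2-byte argument
	}
	return append(head, a.Content...), nil
}

func main() {
	content := []byte{0x42, 0xff, 0x00} // h'ff00'
	for _, idx := range []uint64{1, 7} {
		b, err := Alternative{Index: idx, Content: content}.encode()
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("alt=%d encoded=%x\n", idx, b)
	}
	// Output:
	// alt=1 encoded=d87a42ff00
	// alt=7 encoded=d9050042ff00
}
```

Because the tag number depends on the value rather than the type, a one-type-to-one-tag registration cannot express this, which is the gap described in this issue.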
