Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graphsync 2.0 refactors: Bind node part 1 - message #301

Closed
hannahhoward opened this issue Feb 18, 2022 · 4 comments
Closed

Graphsync 2.0 refactors: Bind node part 1 - message #301

hannahhoward opened this issue Feb 18, 2022 · 4 comments
Assignees

Comments

@hannahhoward
Copy link
Collaborator

Goals

#300 integrates the Graphsync 2.0 branch and passes tests.

However, it uses a pretty serious hack to make the GraphSync extension work. Graphsync extensions now MUST be a go-ipld-prime datamodel.Node. In order to make this work with data transfer messages, which are currently a go struct with cbor-gen encoding/decoding, we employ a pretty inefficient trick to coerce the data into a datamodel.Node -- we simply serialize to CBOR and deserialize using basicnode.Any.

A better strategy is to use bindnode -- this is just more work and may push on features of bindnode that are still experimental.

So, here's what is needed:

  • define a schema for data transfer messages - transferMessage, transferRequest, transferResponse -- they may may need renames to play nicely with schema (shouldn't affect encoding).
    • my read of pointer type encoding in cbor-gen is the fields should be nullable but not optional
    • use map representation
  • change any cbg.Deferred types to datamodel.Node and in the schema use ANY.
  • handle encoding of vouchers to datamodel.Node -- here again we will employ a hack for now, similar to the one we use with messages currently -- I think we'll need to make a new method in encoding -- EncodeToNode(encoding.Encodable) (datamodel.Node, error) -- and a new method for the Decoder interface -- DecodeFromNode(datamodel.Node) (encoding.Encodable, error)
  • update message constructors
  • update ToIPLD function -- I believe this simply becomes bindnode.Wrap with schema, followed by calling Representation() (so it serializes properly in Graphsync)
  • update FromIPLD function -- I believe this looks like bindnode.Prototoype and then calling Representation().NewBuidler().AssignNode()then bindnode.Unwrap on the built node.

Potential issues:

  • we have a datatransfer.ChannelID in one of the message types -- probably need a schema for that, which should probably live at the root of the package where it's defined -- we may need a way to load two seperate schema files into a single schema type system
  • unsigned int usage -- hopefully fixed by node/bindnode: add support for unsigned integers ipld/go-ipld-prime#363
  • we need to maintain byte to byte CBOR-gen serialization compatibility. Unknown if this plays out in practice. We should probably copy the old types somewhere and run some decode/encode tests.
  • does Representation().NewBuilder.AssignNode() actually work?

Sidebar improvement: We can use schema rename functionality to fix the field names in the go types -- they're arbitrarily short as cbor-gen didn't have rename functionality.

recommend collaboration with @mvdan where needed

@rvagg
Copy link
Member

rvagg commented Feb 18, 2022

byte to byte CBOR-gen serialization compatibility

How precise is this statement? The main issue I'm aware of is map key sorting between cbor-gen and everything else, including ipld-prime. So maps may produce different bytes, but the decodes will still work on both ends regardless. Is there any reason that this would matter? i.e. different bytes from encode by decode compatibility between implementations for those different bytes forms.

@hannahhoward
Copy link
Collaborator Author

Sorry "byte to byte" does not need to mean "the exact same bytes". Instead, it means "roundtrip serialization / deserialization works in both directions".

@mvdan
Copy link

mvdan commented Feb 18, 2022

Sorry "byte to byte" does not need to mean "the exact same bytes". Instead, it means "roundtrip serialization / deserialization works in both directions".

That seems much easier to reach than "exact same bytes" level of compatibility :)

@hannahhoward
Copy link
Collaborator Author

@mvdan indeed -- basically if an old client can talk to a new client, I don't care.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

3 participants