Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better document that representation-level prototypes build type-level nodes #362

Closed
hannahhoward opened this issue Feb 17, 2022 · 4 comments · Fixed by #391
Closed

better document that representation-level prototypes build type-level nodes #362

hannahhoward opened this issue Feb 17, 2022 · 4 comments · Fixed by #391
Assignees

Comments

@hannahhoward
Copy link
Collaborator

hannahhoward commented Feb 17, 2022

Let's say I have a node n which satisfies schema.TypedNode. How do I encode this to CBOR and decode it back to the same kind of schema.TypedNode

Let's say I do:

buf := new(bytes.Buffer)
dagcbor.Encode(n, buf)

I believe this will not encode n as expected -- it will encode according to the in memory node interface presented by n, rather than it's representation, which according to the spec describes how it maps to the IPLD data model and hence how its serialized.

Ok so instead I do:

buf := new(bytes.Buffer)
dagcbor.Encode(n.Representation(), buf)

I believe I now have the right data serialized.

But how to get get it back out and into another instance the same kind of node as n. I can get the node prototype by doing:

np := n.Prototype()

But if I do:

nb := np.NewBuilder()
dagcbor.Decode(nb, buf)

Well this doesn't work as np.NewBuilder returns a builder for the in memory representation presumably, which also presumably will fail to deserialize should I use say representation tuple on a struct.

Ok, so I can get the representation prototype and decode with it as so:

tnp := np.(schema.TypedPrototype)
rp := tnp.Representation()
rnb := rp.NewBuilder()
dagcbor.Decode(rnb, buf)

But if I do:

n2 := rnb.Build()

I get a representation node and now I am stuck, cause I have no method to get back from the representation to schema.TypedNode

I wonder if need to make the following changes to the schema types:

type TypedNode interface {
	// schema.TypedNode acts just like a regular Node for almost all purposes;
	// which datamodel.Kind it acts as is determined by the TypeKind.
	// (Note that the representation strategy of the type does *not* affect
	// the Kind of schema.TypedNode -- rather, the representation strategy
	// affects the `.Representation().Kind()`.)
	//
	// For example: if the `.Type().TypeKind()` of this node is "struct",
	// it will act like Kind() == "map"
	// (even if Type().(Struct).ReprStrategy() is "tuple").
	datamodel.Node

	// Type returns a reference to the reified schema.Type value.
	Type() Type

	// Representation returns a datamodel.Node which sees the data in this node
	// in its representation form.
	//
	// For example: if the `.Type().TypeKind()` of this node is "struct",
	// `.Representation().TypeKind()` may vary based on its representation strategy:
	// if the representation strategy is "map", then it will be Kind=="map";
	// if the streatgy is "tuple", then it will be Kind=="list".
	Representation() RepresentationNode
}

type RepresentationNode interface {
      datamodel.Node
      
      // Typed returns a datamodel.Node which sees the data in this node
      // in its typed schema form
      Typed() TypedNode
}
// ....

// TypedPrototype is a superset of the datamodel.Nodeprototype interface, and has
// additional behaviors, much like TypedNode for datamodel.Node.
type TypedPrototype interface {
	datamodel.NodePrototype

	// Type returns a reference to the reified schema.Type value.
	Type() Type

	// Representation returns a datamodel.NodePrototype for the representation
	// form of the prototype.
	Representation() RepresentationPrototype
}

type RepresentationPrototype interface {
      datamodel.NodePrototype
      
      	// Representation returns a datamodel.NodePrototype for the typed
	// form of the prototype.
      Typed() TypedPrototype
@hannahhoward hannahhoward changed the title How do I cbor round trip a schema.TypedNode? How do I CBOR round trip a schema.TypedNode? Feb 17, 2022
@hannahhoward hannahhoward changed the title How do I CBOR round trip a schema.TypedNode? Discussion: How do I CBOR round trip a schema.TypedNode? Feb 17, 2022
@mvdan
Copy link
Contributor

mvdan commented Feb 17, 2022

I get a representation node and now I am stuck, cause I have no method to get back from the representation to schema.TypedNode

Are you sure that the repr builder gives you a repr node? At least bindnode does not:

// TODO: returning a repr node here is probably good, but there's a gotcha: one
// can go from a typed node to a repr node via the Representation method, but
// not the other way. That's probably why codegen returns a typed node here.
// The solution might be to add a way to go from the repr node to its parent
// typed node.
func (w *_builderRepr) Build() datamodel.Node {
// TODO: see the notes above.
// return &_nodeRepr{schemaType: w.schemaType, val: w.val}
return &_node{schemaType: w.schemaType, val: w.val}
}

See my TODO there; I wrote down exactly the same concerns that you have :) I will admit that this sharp edge of the builder interface isn't well documented, which is why I wrote the TODO while implementing bindnode.

@warpfork's codegen also behaves the same way:

https://github.com/ipld/go-codec-dagpb/blob/fa6d623cbc0da39a4940da01d7506d05df80c7ad/ipldsch_satisfaction.go#L2898-L2899

Note how the repr assembler/builder produce a *_PBNode rather than a *_PBNode__Repr.

So I think we have three options:

  1. Keep the current semantics, which support everything that we need, and properly document that repr builders produce typed nodes.

  2. Keep the current semantics and document them, but also add a method to go from a repr node or prototype to its typed version. The Typed method might be a nice to have from the point of view of UX, but I also don't think it's strictly necessary.

  3. Change the semantics as you describe, adding the Typed method so that we still allow building a typed node via its repr form.

I personally think option 1 is the easiest and best option in the short term. If we have strong evidence that a Typed method would be a nice addition, then we could also do option 2 at some point. I am wary about option 3, because any changes to the datamodel interfaces will cause quite a bit of pain for both implementers and consumers.

@mvdan
Copy link
Contributor

mvdan commented Feb 17, 2022

One more thought: option 2 adding Typed methods is still a breaking change, as any Node/Prototype implementations will need to add the method. But that feels much safer than option 3, changing an existing method, because consumers should never be broken by adding a method.

@hannahhoward
Copy link
Collaborator Author

hannahhoward commented Feb 18, 2022

@mvdan I agree that option 1 is the best, but we should definitely document it. It's sufficiently non-obvious that I never even considered when I wrote this up that Representation builders would actually return the typed node, not the representation node. But actually it makes sense -- when would I ever really want to build a representation from scratch? (I mean, who knows but also, I can get back to a representation node if I really need it, as you say)

@mvdan mvdan self-assigned this Feb 18, 2022
@mvdan mvdan changed the title Discussion: How do I CBOR round trip a schema.TypedNode? better document that representation-level prototypes build type-level nodes Feb 22, 2022
@warpfork
Copy link
Collaborator

warpfork commented Mar 8, 2022

FWIW, I really have a lot of questions in hindsight if my initial choices about this were sensible. (If one draws the data phase flow on a whiteboard, most things have nice closed cycles with arrows back and forth... except the build method on reprs jumping directly into the type phase. It's glaringly asymmetric on the graph.)

I think if we did it again, today, with no code inertia, I'd probably choose option 3.

Of course, we have code inertia. :) And I agree that in this scenario it's probably pretty sizable.

Documenting the current behavior more loudly seems like the practical choice.

I'd note that what we do here is a golang, and specifically go-ipld-prime choice. If we made other IPLD libraries (and those people somehow see this issue, heh), I might recommend them to consider different choice. If we make externally facing serial APIs for operations like "take this object and match it against a schema", I might also at this point suggest such an API have a separate, explicit boolean field for whether to shift from repr level to type, and vice versa. But again, both of those are different scenarios than what this issue discusses -- I just say this for completeness, if this topic is visited again in the future from either of those angles.

mvdan added a commit that referenced this issue Mar 24, 2022
Alongside a short example on how to get to the repr-level node.

Fixes #362.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🎉 Done
Development

Successfully merging a pull request may close this issue.

3 participants