codegen: make error info available when tuples process data that is too long. #99

warpfork · 2020-10-20T17:43:52Z

Make error info available when tuples process data that is too long.

(Should fix #97.)

This requires introducing an error-carrying NodeAssembler,
because the AssembleValue methods don't have the ability to return errors themselves.

AssembleValue methods have not needed to return errors before!
Most lists don't have any reason to error: out of our whole system,
it's only struct-with-tuple-representation that can have errors here,
due to tuples having length limits.
AssembleValue for maps doesn't have a similar challenge either,
because key invalidity can always be indicated by errors returned from
the key assembly process.

I'm not a big fan of this diff -- error-carrying thunks like this are
ugly to write and they're also pretty ugly to use -- but I'm not sure
what would be better. ListAssembler.AssembleValue returning an error?
Turning ListAssembler into a two-phase thing, e.g. with an Advance
method that fills some of the same role as AssembleKey does for maps,
and gives us a place to return errors?


willscott

This seems preferable to the other two approaches you mention in terms of code complexity 👍

mvdan · 2020-10-20T21:43:42Z

I agree that this solution feels a bit hacky, but I don't think it's terrible. The other options you suggest would make the interface more complex, and this is an edge case that the vast majority of users shouldn't run into anyway.

What about returning a nil assembler (like before), and we document that the caller should do a nil check? We can document that a nil can be returned in specific invalid cases like this one here. That doesn't work if we want to surface the ErrNoSuchField error, though.

Another option, perhaps a bit cleaner, would be to return a NodeAssembler which is entirely a no-op, and have the Finish call return the error. This could result in the caller doing more work as they would think the NodeAssembler is working normally, but the positive is that we'd report the error in a better place.

How about a mix of both:

assembler := x.AssembleValue()
if assembler == nil {
    // Something went wrong. Fetch the error via Finish.
    err := x.Finish()
    // handle error
}
// continue assembling

warpfork · 2020-10-21T18:56:51Z

Not a fan of nils in general. Getting a nil dereference panic is generally my least favorite error to debug. Even if the line numbers are enough to find it, it's... not a joy. You can't write documentation that people will find when they put the error message string into their search engine of choice unless you control what the error message text is.

(Returning nil from methods like MapIterator() is less bothersome, because the only cases where that can appear are those where one already should've switched on the Kind(). Although also, now that I think of this... if we update the Node interface to have many fewer error returns and panic instead, we should probably make the MapIterator and ListIterator methods panic in those cases too, rather than just return nil. But I digress.)

A no-op'ing NodeAssembler that holds onto the error is an interesting idea. I think the caller doing more work is going to be too big a cost in practice though: if we tried to hold onto that logic recursively, that could be a lot of extra work. (Also, it would be unclear what we'd do for yielding further child assemblers. Yield more no-op'ers recursively? Possible but not feeling good.)

warpfork · 2020-10-21T19:13:45Z

(Okay, this comment is going to be a no-op, because I talked myself back out of it by the end, but still, to make the reasoning available...)

You got me thinking about the exact position of the error a bit more. We should think about whether this gives us the right "line numbers" if we imagine this thing paired with a serial decoding function that can give line numbers and offsets in the stream when an error occurs.

If we do what this diff currently does -- channeling the error through a dummy NodeAssembler and raising it on the next opportunity -- we will get the error roughly one step too late.

In other words, if we have a schema type Foo struct { a Int, b Int, c Int } representation tuple, and we parse json [1, 2, 3, 4]...

Hm.

Nope, on second thought: this is fine, because our internal reasoning typically elides separator tokens that are codec details anyway. The distinction between "the comma before the four" and "the four" is so minute we just don't regard it: none of our codecs pass up a distinct event for "the wire format indicated an element coming"; they just pass up the element, get the child assembler, and call its assign method in the same breath. So this is not a concern: the code as is works, never mind.

warpfork · 2020-10-21T19:28:21Z

Think I'm gonna give this a merge then; thanks for the additional review!

willscott approved these changes Oct 20, 2020


warpfork mentioned this pull request Oct 21, 2020

Panics possible when using dagcbor to unmarshal data that doesn't fit in in a schema #97

Closed

warpfork merged commit e0aac3b into master Oct 21, 2020

warpfork deleted the codegen-error-from-tuple-overshoot branch October 21, 2020 19:28

frrist mentioned this pull request Dec 1, 2020

feat: add watch and walk commands to index chain during traversal filecoin-project/lily#249

Merged

9 tasks

aschmahmann mentioned this pull request Feb 18, 2021

Release v0.8.0 ipfs/kubo#7707

Closed

73 tasks
