add support for null-padded carv1 payloads #140

mvdan · 2021-07-05T16:57:37Z

Filecoin has been using a fork of the carv1 library with commit 11b6074, which allows decoding carv1 files that end with null padding. They use this because their crypto proofs require power-of-two payload lengths.

Spec-wise, this seems valid. A CARv1 is a series of "frames", in the form of:

cid+block length as a varint
cid
block

So, if such a frame begins with the zero byte, then we should simply skip to the next byte, because we have a varint of value 0 for the cid+block length, and thus we have nothing else to do. It's reasonable to simply ignore the frame entirely, because there's no CID and no block data.

There's only one place in the carv2 library where we decode CARv1 frames: index.Generate. Right now this padding makes it error, because it unconditionally reads a CID, even if the frame size is zero. In that way, it incorrectly implements the spec; it shouldn't attempt to read CID bytes when it knows a frame is of zero length. So we definitely have something to fix in carv2.

The CARv1 spec doesn't explicitly allow this kind of empty CID plus empty block, but it's not explicitly disallowed either. It does contain a section on padding, however:

The CAR format contains no specified means of padding to achieve specific total archive sizes or internal byte offset alignment of block data. Because it is not a requirement that each block be part of a coherent DAG under one of the roots of the CAR, dummy block entries may be used to achieve padding. Such padding should also account for the size of the length-prefix varint and the CID for a section. All sections must be valid and dummy entries should still decode to valid IPLD blocks.

I think the ship has sailed in terms of forbidding this kind of padding in the CARv1 spec, given that Filecoin has been using this for some time. We could tell them to "migrate" all of those null-padded CARv1 files, so that they also store them alongside a "padding offset" and cut the payload when using the carv2 library, effectively producing an EOF error when the library reaches the padding.

However, this would come at a relatively high cost to them (migrations are work), whereas on our side it's fairly trivial to amend one line to handle zero-length frames in a better way. What I propose is to skip the zero-length frames, effectively meaning that index.Generate would skip the entire null padding one byte at a time.

That would be pretty slow, but it would work. Later on, we have two options:

A) In the CARv1 spec, specify that a decoder is allowed to treat a frame beginning with a null byte as an EOF, discarding the rest of the input. This would allow us to quickly exit and stop reading bytes.

B) In the carv2 implementation, when we reach a null-byte frame, we read the rest of the input in big chunks, which would allow us to quickly reach EOF as long as there are no non-zero bytes. If there are any non-zero bytes, we would go back to the slow path of decoding each frame starting with a varint. Purely an optimization for the common case.

I lean towards making the fix above today, and later on amending the CARv1 spec as per option A.

The text was updated successfully, but these errors were encountered:

mvdan · 2021-07-05T16:58:27Z

cc @magik6k since you originally authored the go-car commit to skip the padding.
cc @willscott @raulk if you have thoughts

willscott · 2021-07-05T17:23:44Z

your proposal of a lenient index.Generate sounds good to me. I think we should be lenient in what we accept there.

mvdan · 2021-07-05T21:45:16Z

A) In the CARv1 spec, specify that a decoder is allowed to treat a frame beginning with a null byte as an EOF, discarding the rest of the input. This would allow us to quickly exit and stop reading bytes.

This is the behavior that Filecoin relies on, and it seems like we all find it reasonable (at least from the people I've talked to, including the IPLD sync call). Plus, it would be weird for the CARv1 to allow interleaving null bytes in between valid block frames.

I'm making the implementation treat the first null byte as EOF. We can change the CARv1 spec shortly after, unless @rvagg disagrees.

rvagg · 2021-07-06T03:47:12Z

ooof, I thought Filecoin was keeping track of payload length so it could just reverse fr32, make a limitreader and read the car to its extent.

I think the main hangup to making this "OK" is that we lose any determinism that might be afforded by packing into a CARv1, i.e. you can have an arbitrary number of (potentially random) bytes trailing it when at rest. I don't think this is a very serious concern as a system can always work work just the bytes it cares about (like Filecoin does). It may also open up some interesting avenues for low-grade attacks (DoS or similar) when transferring CARs between untrusted parties.

But, what we could do is simply make it an optional mode in the spec, something like:

By default a CAR[v1] should not allow the null (0x00) byte at the beginning of a section; each section should begin with a valid unsigned varint. However, a system using the CAR format may optionally choose to allow the null byte to signal an end-of-stream for a CAR. Code libraries that implement the CAR format should treat this mode as an option, and not the default for either reading or creation of the CAR format. Without this option enabled, a reader should treat the null byte at the beginning of a section as a malformed CAR, and a writer should not write additional bytes after the final section.

mvdan · 2021-07-06T05:31:34Z

LimitReader was also my first response, but I was told they don't keep the padding offset.

Do you have any suggestion in mind in terms of having them detect the limit/offset efficieny without a full migration? Even when reading from the end of the file, it seems relatively slow to me.

You're right that the null bytes are effectively part of the file so they affect determinism. And you're right that it opens up infinite ways to represent almost duplicate CARv1s.

I'm not sure I agree with this behavior being on by default is problematic. Certainly safer to make it opt-in, but that also makes the API bigger - note that right now we don't have any index.Generate options.

rvagg · 2021-07-06T06:02:02Z

note that right now we don't have any index.Generate options

I'm suggesting the opt-in for CARv1. CARv2 already has padding all over the place, very little determinism to be found in the format outside of the CARv1 payload, and it's not as useful as a transport anyway so I reckon CARv2 could use CARv1 with the "null byte == EOF" option enabled, no need to add a way to turn it off.

The only question is then of extending CARv1 NewCarReader to add an option (which CARv2 can always turn on), is there a nice way to achieve that?

mvdan · 2021-07-06T07:19:59Z

I'm suggesting the opt-in for CARv1.

Gotcha, I had missed that nuance. I'm happy to merge the PR as soon as someone does a final review, perhaps from @masih or @aarshkshah1992.

The only question is then of extending CARv1 NewCarReader to add an option (which CARv2 can always turn on), is there a nice way to achieve that?

We won't have a very nice way per se, because retrofitting options into a frozen API is pretty constrained. But luckily we can retrofit an option by adding a method to the reader, such as:

func (*CarReader) StopAtNullFrame()

Then the opt-in would look like:

cr, err := carv1.NewCarReader(...)
if err != nil { ... }
cr.StopAtNull()
for {
    bl, err := cr.Next()
    ...
}

rvagg · 2021-07-06T08:09:07Z

cr.StopAtNull()

not awesome but workable, let's do that

masih · 2021-07-06T08:26:57Z

I reckon CARv2 could use CARv1 with the "null byte == EOF" option enabled, no need to add a way to turn it off.

The only question is then of extending CARv1 NewCarReader to add an option (which CARv2 can always turn on), is there a nice way to achieve that?

OK this makes sense, though I would have loved the spec to be as strict as possible. I think the ship has sailed on this and we have to flex the spec to to not break existing usage.

Fixes #140.

mvdan · 2021-07-06T08:32:55Z

Fixed by #141, but let's keep this open as a reminder to update the carv1 spec to mention null padding.

rvagg · 2021-07-06T08:33:39Z

though I would have loved the spec to be as strict as possible

❤️

sadly there's lots of regrets in CARv1! but we shall cope, and imagine a beautiful CARv3 ...

mvdan · 2021-07-06T10:31:14Z

not awesome but workable, let's do that

Filed #142, and since the PR here is merged and the spec PR is open, I think we can close this in favor of the v1 issue and the spec PR.

raulk · 2021-07-07T11:38:09Z

Late to the party in this issue, but I would advise against including this mode in the spec as an allowed mode of operation. I would just put this under an "implementers' note", merely targeted at dissipating doubts on why this hack exists, for folks reading go-car (as the reference implementation).

Rationale:

There is no such thing as an "optionally supported modes" in data format specifications. Data format specs are written to achieve interoperability. If the spec allows for certain forms of a payload to exist, but they make the efficient handling of those payloads "optional" (throwing their implementation into a single-byte read busyloop), this becomes a footgun, and actually mandatory.
It is not possible for CAR implementations to emit payloads like this.
The usage of this mechanism is entirely private within Lotus, with the caveat that it lives in the go-car lib because that's the efficient place to put it. These payloads do not escape Lotus AFAIK. There is no interoperability need.

The right way to fix this is to record the real payload size in Lotus/go-fil-markets, but unfortunately that's a very costly thing to backfill now, because it would require us to unseal every deal to find the 0x00... boundaries. With the current PoRep, that's unfeasible.

Another option is wrapping the read stream Lotus-side in a utility that would interrupt the stream with an EOF when the boundary was detected. This could work by using "lookahead buffers", but unfortunately there's no way of making it reliable without leaking implementation details.

I think our best bet to get rid of this workaround is to wait for fast unsealing.

rvagg · 2021-07-08T01:51:54Z

It sounds like we're mostly all leaning in the direction of this being an unfortunate, but internal concern for Lotus rather than something we want to be able to live with long term, which is great because we can leave the spec unmolested and keep the hack .. somewhat hidden. 👍

* Implement an option for read-write blockstore, that if enabled the write can resume from where the writer left off. For resumption to work the `WithResumption` option needs to be set explicitly. Otherwise, if path to an existing file is passed, the blockstore construction will return an error. The resumption requires the roots passed to constructor as well as padding options to be identical with roots in file. Resumption only works on paths where at least V2 pragma and CAR v1 header was successfully written onto the file. Otherwise an error is returned. * Implement resumption test that verifies files resumed from match expected header, data and index. * Implement a CAR v1 equals function to check if two given headers are identical. This implementation requires exact ordering of root elements. A TODO is left to relax the exact ordering requirement. * Implement Seeker in internal offset writer in order to forward offset of CAR v1 writer within a resumed read-write blockstore after resumption. The offset of the writer needs to be set to the latest written frame in order for consecutive writes to be at the right offset. * Fix bug in offset read seeker in internal IO, where seek and returned position was not normalized by the base. Reflect the fix in read-only blockstore AllKeysChan where reader was twisted to work. We now read the header to get its size, then seek past it to then iterate over blocks to populate the channel. * Add TODOs in places to make treating zero-length frames as EOF optional; See #140 for context. * Run `gofumpt -l -w .` on everything to maintain consistent formatting.

Fixes #140.

* Implement an option for read-write blockstore, that if enabled the write can resume from where the writer left off. For resumption to work the `WithResumption` option needs to be set explicitly. Otherwise, if path to an existing file is passed, the blockstore construction will return an error. The resumption requires the roots passed to constructor as well as padding options to be identical with roots in file. Resumption only works on paths where at least V2 pragma and CAR v1 header was successfully written onto the file. Otherwise an error is returned. * Implement resumption test that verifies files resumed from match expected header, data and index. * Implement a CAR v1 equals function to check if two given headers are identical. This implementation requires exact ordering of root elements. A TODO is left to relax the exact ordering requirement. * Implement Seeker in internal offset writer in order to forward offset of CAR v1 writer within a resumed read-write blockstore after resumption. The offset of the writer needs to be set to the latest written frame in order for consecutive writes to be at the right offset. * Fix bug in offset read seeker in internal IO, where seek and returned position was not normalized by the base. Reflect the fix in read-only blockstore AllKeysChan where reader was twisted to work. We now read the header to get its size, then seek past it to then iterate over blocks to populate the channel. * Add TODOs in places to make treating zero-length frames as EOF optional; See #140 for context. * Run `gofumpt -l -w .` on everything to maintain consistent formatting. Address review comments * Implement equality check for CAR v1 headers where roots in different order are considered to be equal. * Improve resumption docs to clarify what matching roots mean.

mvdan added the P1 High: Likely tackled by core team if no one steps up label Jul 5, 2021

mvdan added this to the CAR v2 milestone Jul 5, 2021

rvagg mentioned this issue Jul 6, 2021

index: add support for null paddings in Generate #141

Merged

masih linked a pull request Jul 6, 2021 that will close this issue

index: add support for null paddings in Generate #141

Merged

masih pushed a commit that referenced this issue Jul 6, 2021

index: add support for null paddings in Generate

aa61149

Fixes #140.

masih mentioned this issue Jul 6, 2021

CARv1: allow optional mode of (length=0x00)==EOF ipld/ipld#113

Closed

mvdan mentioned this issue Jul 6, 2021

add opt-in support for treating null padding as EOF in v2 #142

Closed

mvdan closed this as completed Jul 6, 2021

masih mentioned this issue Jul 8, 2021

Implement ReadWrite blockstore resumption #147

Merged

mvdan mentioned this issue Jul 16, 2021

add opt-in support for treating null padding as EOF in v0 (CARv1) #171

Closed

mvdan added a commit that referenced this issue Jul 16, 2021

index: add support for null paddings in Generate

6765c84

Fixes #140.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add support for null-padded carv1 payloads #140

add support for null-padded carv1 payloads #140

mvdan commented Jul 5, 2021

mvdan commented Jul 5, 2021

willscott commented Jul 5, 2021

mvdan commented Jul 5, 2021 •

edited

Loading

rvagg commented Jul 6, 2021

mvdan commented Jul 6, 2021

rvagg commented Jul 6, 2021

mvdan commented Jul 6, 2021 •

edited

Loading

rvagg commented Jul 6, 2021

masih commented Jul 6, 2021 •

edited

Loading

mvdan commented Jul 6, 2021

rvagg commented Jul 6, 2021

mvdan commented Jul 6, 2021

raulk commented Jul 7, 2021

rvagg commented Jul 8, 2021

add support for null-padded carv1 payloads #140

add support for null-padded carv1 payloads #140

Comments

mvdan commented Jul 5, 2021

mvdan commented Jul 5, 2021

willscott commented Jul 5, 2021

mvdan commented Jul 5, 2021 • edited Loading

rvagg commented Jul 6, 2021

mvdan commented Jul 6, 2021

rvagg commented Jul 6, 2021

mvdan commented Jul 6, 2021 • edited Loading

rvagg commented Jul 6, 2021

masih commented Jul 6, 2021 • edited Loading

mvdan commented Jul 6, 2021

rvagg commented Jul 6, 2021

mvdan commented Jul 6, 2021

raulk commented Jul 7, 2021

rvagg commented Jul 8, 2021

mvdan commented Jul 5, 2021 •

edited

Loading

mvdan commented Jul 6, 2021 •

edited

Loading

masih commented Jul 6, 2021 •

edited

Loading