This repository has been archived by the owner on Sep 12, 2018. It is now read-only.

NG: StorageDriver.WriteStream shouldn't require size argument #781

Closed
stevvooe opened this issue Nov 24, 2014 · 18 comments

@stevvooe
Contributor

The current storagedriver.WriteStream call is as follows:

WriteStream(path string, offset, size uint64, readCloser io.ReadCloser) error

This has a few problems:

  1. io.ReadCloser is unnecessary. The lifecycle of reader should be dictated by the caller.
  2. The use of size and io.EOF presents several, overlapping errors. Only one should dictate the course of the write.
  3. There is no way to detect partial writes on failure.

Generally, in Go, it's more idiomatic to have a method such as this take a reader and write until it returns io.EOF. A function signature as follows would solve the problems above and make the call more idiomatic:

// WriteStream writes the chunk to the specified path until reader returns io.EOF. The number
// of bytes written along with a possible error are returned.
WriteStream(path string, offset int64, reader io.Reader) (n int64, err error)
@stevvooe stevvooe added this to the GO-beta milestone Nov 24, 2014
@stevvooe stevvooe self-assigned this Nov 24, 2014
@wking
Contributor

wking commented Nov 25, 2014

On Mon, Nov 24, 2014 at 01:22:30PM -0800, Stephen Day wrote:

The current storagedriver.WriteStream call is as follows:

WriteStream(path string, offset, size uint64, readCloser io.ReadCloser) error

This has a few problems:

  1. io.ReadCloser is unnecessary. The lifecycle of reader should be
    dictated by the caller.

+1

  1. The use of size and io.EOF presents several, overlapping
    errors. Only one should dictate the course of the write.

+1

  1. There is no way to detect partial writes on failure.

Why not? Can't the storage driver check how much it received and
return an error if it's < size?

Generally, in Go, it's more idiomatic to have a method such as this
take a reader and write until it returns io.EOF.

That's fine inside Go, but it would make it hard if the storage driver
is connecting to a storage backend that expects the size ahead of time
(e.g. if the storage backend uses HTTP for pushes, or if its on the
local filesystem and the driver wants to allocate the whole block of
disk space ahead of time to avoid fragmentation). Since we have the
size from the client↔registry POST, we might as well pass it on for
the storage driver to use (or not) as it sees fit.

@stevvooe
Contributor Author

stevvooe commented Dec 3, 2014

Why not? Can't the storage driver check how much it received and
return an error if it's < size?

There is no way for the caller of WriteStream to detect the partial write, which is why we add the number of bytes written to the return values.

That's fine inside Go, but it would make it hard if the storage driver
is connecting to a storage backend that expects the size ahead of time
(e.g. if the storage backend uses HTTP for pushes, or if its on the
local filesystem and the driver wants to allocate the whole block of
disk space ahead of time to avoid fragmentation). Since we have the
size from the client↔registry POST, we might as well pass it on for
the storage driver to use (or not) as it sees fit.

Generally, we won't actually know the size beforehand. This function may be passed an io.Reader directly from an http request. While the "Content-Length" header may provide a hint, the actual read may be shorter or longer. From a security perspective, it can't really be trusted. If the driver does actually require the size, to support block allocation or sized http pushes, it has the option to buffer the data in memory or on local disk. The new proposed interface effectively passes on the lack of knowledge from incoming http requests. The returned write size can be checked against the incoming "Content-Length" to validate the write.

@wking
Contributor

wking commented Dec 3, 2014

On Tue, Dec 02, 2014 at 08:16:55PM -0800, Stephen Day wrote:

Why not? Can't the storage driver check how much it received and
return an error if it's < size?

There is no way for the caller of WriteStream to detect the
partial write, which is why we add the number of bytes written to the
return values.

The storage driver isn't calling WriteStream; the registry is
calling WriteStream and the storage driver is implementing it. All
the caller should care about is whether the call succeeded or not, and
a ‘partial write’ error returned by the driver should cover you there.

That's fine inside Go, but it would make it hard if the storage
driver is connecting to a storage backend that expects the size
ahead of time…

Generally, we won't actually know the size beforehand. This function
may be passed an io.Reader directly from an http request. While the
"Content-Length" header may provide a hint, the actual read may be
shorter or longer.

I'd consider a read in either case (shorter or longer) an error.

In the event of a short read, I'd return the ‘partial write’ error,
and record the actually-read size as part of that error object. Then
the registry could pass that on to the client instead of requiring the
client to send a new request to figure out how much was written.

In the event of a long read, I'd just drop the whole thing, and
consider ignoring the client for a cool-off period.

From a security perspective, it can't really be trusted.

Why is this a trust issue? If the client says “I'm uploading 100
bytes”, and I only get 25, I expect a later API call to try and push
the remaining 75. If I get 200 bytes, I say, “this client is crazy”,
and I ignore them until they start talking sense. In no case am I
trusting the client's Content-Length blindly.

If the driver does actually require the size, to support block
allocation or sized http pushes, it has the option to buffer the
data in memory or local disk. The new proposed interface effectively
passes on the lack of knowledge from incoming http requests. The
returned write size can be checked against incoming "Content-Length"
to validate the write.

Sure, but that seems wasteful to me. Why not push the nominal size
and associated validation check into the storage driver, and let it
skip the local buffering? I don't think you gain anything by moving
the check to the registry core, and you'd want some benefit to justify
the cost of buffering in the storage driver.

@stevvooe
Contributor Author

stevvooe commented Dec 3, 2014

None of the driver implementations use the size parameter, or they use it inconsistently. It's not needed and it makes error handling more complex. It's also confusing in places where it's assumed to be the total size of the target content at path rather than the write chunk size.

@wking
Contributor

wking commented Dec 3, 2014

On Wed, Dec 03, 2014 at 12:21:42AM -0800, Stephen Day wrote:

None of the driver implementations use the size parameter, or they
use it inconsistently.

That's just a documentation/testing issue.

It's not needed and it makes error handling more complex.

It doesn't seem simpler to me to handle some errors in the driver, but
keep invalid-size error handling in the core.

It's also confusing in places where its assumed to be the total size
of the target content at path rather than the write chunk size.

That's also a documentation/testing issue. It should be the size of
data that the call to WriteStream is expected to deliver.

For chunked transfer encoding (where Content-Length isn't set), I'm
not sure how you distinguish between “EOF, next chunk size is zero,
successfully received” and “EOF, connection lost, incomplete
transfer”. However you were planning on doing that in the core, you
should be able to take the same approach in the drivers.

@stevvooe
Contributor Author

stevvooe commented Dec 3, 2014

I think the confusion here is that WriteStream will not be called multiple times for http Chunked Transfer Encoding. The http library exposes Request.Body, which is an io.ReadCloser. The transfer encoding will be transparent to the request handler. WriteStream implementations will make a best effort to read the incoming io.Reader until io.EOF. The io.Reader object is effectively "sized" by io.EOF (hence size being redundant). If the caller wants to write less data, they can wrap the io.Reader in an io.LimitedReader, which restricts reads to size.

This puts buffering and data acceptance into the hands of the driver, which has the best chance of correct and performant handling. It also separates accepted bytes from the error type, allowing the driver to control whether partial, full or chunked writes are accepted, transparent to the caller.

I'll be posting PRs for this over the next couple of days. You'll see that it simplifies the write flow and makes error handling a lot cleaner. I'd recommend you read up on the [io package](http://golang.org/pkg/io/#LimitedReader). That knowledge will be helpful for understanding how some of these decisions are being made.

@wking
Contributor

wking commented Dec 3, 2014

On Wed, Dec 03, 2014 at 10:44:28AM -0800, Stephen Day wrote:

I think the confusion here is that WriteStream will not be called
multiple times for http Chunked Transfer Encoding. The http library
exposes Request.Body, which is an io.ReadCloser. The transfer
encoding will be transparent to the request handler.

I'm still not clear on how you distinguish between successfully
reaching the end of chunked data and a broken connection that leads to
an incomplete read. The docs for NewChunkedReader only describe the
handling of a successful read (returning io.EOF 1). In any case,
handling partial chunked uploads is going to be the same regardless of
whether you check for this error in the driver or in the core.

If the caller wants to write less data, they can wrap io.Reader in
an io.LimitedReader, which can be restricted to size.

Why would you want to do that?

This puts buffering and data acceptance into the hands of the
driver, which has the best chance of correct and performant
handling.

If you pass an explicit expected size, the driver can still decide if
it wants to buffer or accept the data. This is just giving the driver
less information to use when making that decision.

It also separates accepted bytes from the error type, allowing the
driver to control whether partial, full or chunked writes are
accepted, transparent to the caller.

Is that a decision we want to make on a per-driver level?

I'll be posting PRs for this over the next couple of days. You'll
see that it simplifies the write flow and makes error handling a lot
cleaner.

Ok. Getting something more concrete to talk about will probably help
;).

@wking
Contributor

wking commented Dec 5, 2014

On Wed, Dec 03, 2014 at 10:44:28AM -0800, Stephen Day wrote:

I'll be posting PRs for this over the next couple of days. You'll
see that it simplifies the write flow and makes error handling a lot
cleaner.

Looking at #820, it seems like the test suite now has (with 2037b1d,
Update testsuite with storagedriver interface changes, 2014-12-03,
1):

nn, err := suite.StorageDriver.WriteStream(filename, 0, bytes.NewReader(contentsChunk1))
c.Assert(err, check.IsNil)
c.Assert(nn, check.Equals, int64(len(contentsChunk1)))

instead of:

err := suite.StorageDriver.WriteStream(filename, 0, 3*chunkSize, ioutil.NopCloser(bytes.NewReader(contentsChunk1)))
c.Assert(err, check.IsNil)

When I'd prefer:

err := suite.StorageDriver.WriteStream(filename, 0, chunkSize, bytes.NewReader(contentsChunk1))
c.Assert(err, check.IsNil)

The approach in 2037b1d doesn't seem a lot cleaner to me. Maybe the
cleanliness savings are coming somewhere else?

Looking at ab9570f (Migrate filesystem driver to new storagedriver
calls, 2014-12-03, 2) and 2ebc373 (Refactor inmemory driver for
Stat and WriteStream methods, 2014-12-04), I see that the filesystem
and inmemory drivers never used the ‘size’ argument for ‘WriteStream’
at all. Maybe that's what we should be fixing, by adding size-written
checks to the storage drivers?

@stevvooe
Contributor Author

stevvooe commented Dec 5, 2014

@wking You're missing the point: none of the drivers used the size argument correctly. It was confusing, incorrect and redundant. Please see #814 for details.

I'm also confused why you are doubling down on this matter. What are you trying to achieve?

@stevvooe
Contributor Author

stevvooe commented Dec 5, 2014

Furthermore, these changes aren't about preference. They are about consistency. It's inconsistent to pass in both an io.Reader that will be read in full and an argument indicating its size. It adds unnecessary complexity to calls to the WriteStream function and to the errors it will produce.

@wking
Contributor

wking commented Dec 5, 2014

On Fri, Dec 05, 2014 at 09:48:16AM -0800, Stephen Day wrote:

@wking You're missing the point: none of the drivers used the size
argument correctly. It was confusing, incorrect and
redundant. Please see #814 for details.

The point of ‘size’ is that sometimes you do know ahead of time how
much data the client is intending to send. Passing that information
along to the storage driver allows it to:

  • Pass that along to the backend (if the backend cares),
  • Detect overly large payloads (and reject them as corrupt), and
  • Detect small payloads, likely due to a dropped socket.

If all you're looking at is io.EOF, you can't do any of that.

I'm also confused why you are doubling down on this matter. What are
you trying to achieve?

I'm just trying to keep the storage-driver API clean, and the returned
error code already seems like a good way to tell the registry if a
write was shorter or longer than expected. Putting this check in the
storage driver code means you don't have to remember to repeat it when
you call the storage driver from the core. And having tests in the
storage driver test suite to exercise invalid sizes means you don't
have to remember to add the test when you write a storage driver
(you'll just fail the tests until you add it). Of course, I'm not
implementing code on either side of this API yet, and I'm not a
maintainer, so if my arguments don't make sense to you, you're
obviously free to ignore me ;).

@stevvooe
Contributor Author

stevvooe commented Dec 5, 2014

@wking WriteStream will really only be used in a few places, whereas common size checks would have to be implemented in each bespoke driver, resulting in divergent behavior. Large and small payload detection must be handled in the storage layer, not the drivers, in one place.

The issue with size is that this is an incoming stream. The size is not known by definition. For the use case of the registry, the size is not known by a trusted source beforehand. Typically, the driver doesn't have enough information to do proper size checks that don't result in an invalid io.Reader. There also needs to be very careful error handling in the case of a short read/write that cannot be handled in the driver (check out my TODO in layerupload.go).

We have the same goals. API cleanliness is very important. Your feedback is detailed, but if there is a specific action you'd like me to take based on it, please say so. While I do have the best visibility into this problem at this time, ignoring your feedback would be unacceptable.

With this change and additions described in #814, based on what I understand you are after, we really are getting the right behavior while avoiding confusion.

@stevvooe
Contributor Author

stevvooe commented Dec 5, 2014

instead of:

err := suite.StorageDriver.WriteStream(filename, 0, 3*chunkSize, ioutil.NopCloser(bytes.NewReader(contentsChunk1)))
c.Assert(err, check.IsNil)

When I'd prefer:

err := suite.StorageDriver.WriteStream(filename, 0, chunkSize, bytes.NewReader(contentsChunk1))
c.Assert(err, check.IsNil)

Also, I'm not sure if you noticed when quoting this, but the chunkSize argument changed from 3*chunkSize, which would be the total size, to chunkSize. This is part of the confusion we're trying to avoid with these changes.

@wking
Contributor

wking commented Dec 5, 2014

On Fri, Dec 05, 2014 at 11:24:57AM -0800, Stephen Day wrote:

@wking WriteStream will really only be used in a few places,
whereas common size checks will have to be implemented in each
bespoke driver, …

But the driver may have more information on why the sizes don't match,
which you lose by squeezing the error down to a single number.

… resulting in divergent behavior.

Not if you're enforcing consistent behaviour in the test suite.

Large and small payload detection must be handled in the storage
layer and not the drivers, in one place.

This is not a complicated check, and I've expressed my preference for
libraries to share code for stuff that many drivers would want before
1. Maybe we'll just have to agree to disagree here ;).

The issue with size is that this is an incoming stream. The size
is not known by definition.

I'm still not clear on how anyone detects broken chunked uploads in Go
2. For the other cases, you will know the client's Content-Length.
It's not “trusted”, but the idea is that you error out of WriteStream
if you don't match the Content-Length.

Typically, the driver doesn't have enough information to do proper
size checks that don't result in an invalid io.Reader.

Then how are they returning the number of bytes written? I'm just
suggesting the:

c.Assert(nn, check.Equals, size)

check happen in the driver instead of the core. It's not a huge
difference ;).

There also needs to be very careful error handling in the case of a
short read/write that cannot be handled in the driver (check out my
TODO in layerupload.go).

That's 3? That looks like an issue with the previous length of the
file, not an issue with the size of the request body being written.

We have the same goals. API cleanliness is very important. Your
feedback is detailed but if there is specific action you'd like me
to take based on your feedback, please say so. While I do have the
best visibility into this problem at this time, ignoring your
feedback would be unacceptable.

My feedback is to keep ‘size’ in the API and move the short/long write
check into the driver.

Fri, Dec 05, 2014 at 11:26:52AM -0800, Stephen Day:

Also, I'm not sure if you noticed when quoting this, but the
chunkSize argument changed from 3*chunkSize, which would be total
size to chunkSize. This is part of the confusion we're trying to
avoid with these changes.

Yeah, that was intentional 4. And confusion is easily avoided
through documentation and testing.

@wking
Contributor

wking commented Dec 5, 2014

@stevvooe reached out to me on IRC and we kicked this around some
more. The disagreement (to pass size to the driver and check there
vs. getting back the size written and checking sizes in the registry
core) boiled down to whether or not the driver should trust data sent from
the registry. I'd prefer a skeptical driver that gets both the
expected size and byte stream, so it can verify the read length
itself. @stevvooe prefers (for now) a trusting driver that expects
the registry to handle the error detection.

We'll want to revisit this if we find a backend that needs the
Content-Length information up-front and can't handle chunked transfer
encoding from the driver, or if we decide that drivers should be able
to detect broken transfers on their own, or if we want to implement
some other efficiencies based on the nominal Content-Length. Maybe
none of those will ever happen ;). In any case, I'm comfortable
agreeing to disagree here.

@stevvooe
Contributor Author

stevvooe commented Dec 5, 2014

@wking Thank you! We'll proceed with this for now and definitely revisit later.

@stevvooe
Contributor Author

stevvooe commented Dec 9, 2014

Addressed in #820.

@stevvooe stevvooe closed this as completed Dec 9, 2014