io: add an Err field to LimitedReader #51115

Open
DeedleFake opened this issue Feb 9, 2022 · 59 comments
Labels
NeedsFix The path to resolution is known, but the work has not been done. Proposal Proposal-Accepted
Milestone

Comments

@DeedleFake

DeedleFake commented Feb 9, 2022

New Proposal

As per @ianlancetaylor's suggestion, this proposal is now to add an Err field to io.LimitedReader. If not nil, this field's value will be returned when trying to read past the limit imposed by the reader instead of the default io.EOF.

Original Proposal

For various reasons, I had a need to use MaxBytesReader() in a situation where a ResponseWriter wasn't easily available, so I looked through the code to see if it was safe to pass nil and found that all it does is check a type assertion and then call an unexported method on it:

type requestTooLarger interface {
	requestTooLarge()
}
if res, ok := l.w.(requestTooLarger); ok {
	res.requestTooLarge()
}

While it's true that because of this it's safe to pass it a nil ResponseWriter, it feels quite odd to rely on undocumented behavior of this kind.

Proposal

Several ways to clean this situation up come to mind:

  1. Add an alternative in the http package that doesn't need a ResponseWriter. Because of how it works, it could actually just return the same implementation, but the function will be more obvious in its usage.
  2. Document how the implementation uses the ResponseWriter and explicitly state that a nil ResponseWriter is valid.
  3. Add a reimplementation of MaxBytesReader() in io that doesn't use a ResponseWriter at all and isn't tied to http behavior, but includes the other differences. This could actually be done, for the most part, by just adding an Err error field to io.LimitedReader that, if non-nil, is the error returned when the limit is reached instead of io.EOF.

In whichever case, the documentation for MaxBytesReader() should probably be filled in a bit. It is quite odd that something called Reader requires a type that is primarily an io.Writer implementation for non-obvious reasons and doesn't mention it anywhere in the documentation.
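For illustration, here is a small sketch of the usage described above: calling `http.MaxBytesReader` with a nil `ResponseWriter`. It relies on the behavior observed in the implementation quoted earlier, which should be treated as an assumption rather than a documented guarantee. The helper name `readLimited` is hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// readLimited reads body through http.MaxBytesReader without a ResponseWriter.
// Passing nil works here only because the type assertion on the writer fails
// harmlessly, as described in the issue text.
func readLimited(body string, limit int64) ([]byte, error) {
	rc := http.MaxBytesReader(nil, io.NopCloser(strings.NewReader(body)), limit)
	defer rc.Close()
	return io.ReadAll(rc)
}

func main() {
	data, err := readLimited("hello world", 5)
	// The error is non-nil because the body is longer than the limit.
	fmt.Printf("%q %v\n", data, err)
}
```

Unlike `io.LimitedReader`, this reader reports a non-EOF error when the body exceeds the limit, which is the difference the proposal wants to make available outside net/http.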

@gopherbot gopherbot added this to the Proposal milestone Feb 9, 2022
@jimmyfrasche
Member

I have forked io.LimitedReader to return a custom error at least three times.

@ianlancetaylor
Contributor

ianlancetaylor commented Feb 9, 2022

If the proposed function doesn't use http.ResponseWriter, then I can't see any reason that it should live in the net/http package. That seems to reduce this to

type MyLimitedReader struct {
    r   io.LimitedReader
    err error
}

func (mlr *MyLimitedReader) Read(p []byte) (int, error) {
    // Delegate to the wrapped LimitedReader; calling mlr.Read here would recurse forever.
    n, err := mlr.r.Read(p)
    if err == io.EOF {
        err = mlr.err
    }
    return n, err
}

Is that correct?

If so, perhaps this proposal should be rewritten to add an Err field to io.LimitedReader.

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Feb 9, 2022
@DeedleFake DeedleFake changed the title proposal: net/http: add an alternative to MaxBytesReader that doesn't require a ResponseWriter proposal: io: add an Err field to LimitedReader Feb 10, 2022
@earthboundkid
Contributor

I like the new proposal. Returning io.EOF makes io.LimitedReader pretty unhelpful. But I think it would also be good to document that http.MaxBytesReader accepts nil ResponseWriter.

@bcmills
Contributor

bcmills commented Feb 11, 2022

I have forked io.LimitedReader to return a custom error at least three times.

FWIW, my moreio.LimitedWriter has an Err field too:

@earthboundkid
Contributor

Can someone tag this proposal as under review?

@xeoncross

As an alternative to defining your own error, I just created a package and export the error so the caller can check against it.

// ErrSizeExceeded is returned once limit bytes have been read from the reader.
var ErrSizeExceeded = errors.New("stream size exceeded")

type MaxBytesReader struct {
	io.ReadCloser       // reader object
	N             int64 // max bytes remaining.
}

func NewMaxBytesReader(r io.ReadCloser, limit int64) *MaxBytesReader {
	return &MaxBytesReader{r, limit}
}

func (b *MaxBytesReader) Read(p []byte) (n int, err error) {
	if b.N <= 0 {
		return 0, ErrSizeExceeded
	}

	if int64(len(p)) > b.N {
		p = p[0:b.N]
	}

	n, err = b.ReadCloser.Read(p)
	b.N -= int64(n)
	return
}

@earthboundkid
Contributor

earthboundkid commented Mar 26, 2022

That would break current users of io.LimitReader/LimitedReader.

I misunderstood the proposal.

@ianlancetaylor
Contributor

@carlmjohnson It is already so tagged. We'll get to it soon, I think.

@gopherbot

Change https://go.dev/cl/396215 mentions this issue: io: add an Err field to LimitedReader

@rsc
Contributor

rsc commented Mar 30, 2022

This seems reasonable. Thanks for the proposal.

@rsc
Contributor

rsc commented Mar 30, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) Mar 30, 2022
@rsc
Contributor

rsc commented Apr 6, 2022

Does anyone object to adding this?

@rsc rsc moved this from Active to Likely Accept in Proposals (old) Apr 13, 2022
@rsc
Contributor

rsc commented Apr 13, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc rsc moved this from Likely Accept to Accepted in Proposals (old) May 4, 2022
@rsc
Contributor

rsc commented May 4, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: io: add an Err field to LimitedReader io: add an Err field to LimitedReader May 4, 2022
@rsc rsc modified the milestones: Proposal, Backlog May 4, 2022
@dmitshur dmitshur modified the milestones: Backlog, Go1.19 May 4, 2022
@ianlancetaylor
Contributor

@rogpeppe Does https://go.dev/cl/410035 act as you expected? Thanks.

@rogpeppe
Contributor

rogpeppe commented Jun 2, 2022

@ianlancetaylor Technically yes, but I think it would be preferable if we could avoid the extra Read call in most cases by adding an extra byte to the previous call.

For example, with a limit of 8 bytes, if the first Read call asked for 20 bytes, we could ask for 9 bytes from the underlying reader rather than 8 bytes and then another single byte.

I have an implementation that does that, but I haven't yet been able to make it as simple as I'd like.

Then again, maybe that could be considered to be premature optimisation. What do you think?

@ianlancetaylor
Contributor

@rogpeppe I changed the CL as you suggest. It's a fair bit more complicated.

@ianlancetaylor
Contributor

Note that with the new implementation in CL 410035 confusing things can happen if users change the R or N fields of the LimitedReader. That seems hard to avoid.

@earthboundkid
Contributor

I find the changes sort of confusing and have a slight preference that LimitedReader is reverted to the 1.18 status quo ante. With so many caveats, it’s hard to know when I would use LimitedReader.Err.

@bcmills
Contributor

bcmills commented Jun 3, 2022

Yeah, I think my main takeaway is that a LimitedWriter seems like an ultimately cleaner approach than trying to tune the LimitedReader semantics to always do something intuitive.

@rogpeppe
Contributor

rogpeppe commented Jun 3, 2022

@bcmills In my use case at least, using a LimitedWriter was not an option - we needed to limit the amount of data incoming to a streaming parser. There was no Writer involved.

FWIW, this was as far as I got with my version of the code (I don't think it's quite there yet):

func (l *LimitedReader) Read(p []byte) (int, error) {
	if int64(len(p)) < l.N {
		// Common case: not near the limit.
		n, err := l.R.Read(p)
		if n <= 0 {
			return n, err
		}
		l.N -= int64(n)
		return n, err
	}
	if l.Err == nil || l.Err == EOF {
		// No limit-exceeded special case.
		if l.N <= 0 {
			return 0, EOF
		}
		n, err := l.R.Read(p[0:l.N])
		l.N -= int64(n)
		return n, err
	}
	// Limit-exceeded special case: need to read one more byte to
	// determine if limit was actually exceeded. We only know that
	// if we've read more than the limit (i.e. l.N is negative)
	if l.N < 0 {
		return 0, l.Err
	}
	if int64(len(p)) > l.N+1 {
		p = p[:l.N+1]
	}
	n, err := l.R.Read(p)
	l.N -= int64(n)
	if l.N < 0 {
		// Don't return the extra byte that was requested.
		n--
		err = l.Err
	}
	return n, err
}

I went with not adding any new fields to LimitedReader and using a negative N as a signifier that we have actually read past the end of input (which is arguably somewhat intuitive - we will actually have read one more than the total requested number of bytes).

@rsc
Contributor

rsc commented Jun 3, 2022

Why are we worrying so much about the extra Read?
Isn't the point of using a limited reader that you don't expect to reach the limit?
Does the "actually hit the limit" case really need to optimize away the final read?

@rsc
Contributor

rsc commented Jun 3, 2022

In any event, the beta is upon us, @ianlancetaylor is away, and this isn't critical -
anyone who needs a custom err from LimitedReader can easily use a copy of the code.
I'm going to send a revert CL, and we can figure out the right implementation for next cycle.

@gopherbot

Change https://go.dev/cl/410133 mentions this issue: io: revert: add an Err field to LimitedReader

@rogpeppe
Contributor

rogpeppe commented Jun 3, 2022

Does the "actually hit the limit" case really need to optimize away the final read?

I tend to agree that this is indeed premature optimisation. I thought there might be an elegant way of doing it without significantly increasing code complexity, in which case it might be worth doing, but that doesn't appear to be the case.

@earthboundkid
Contributor

Isn't the point of using a limited reader that you don't expect to reach the limit?

There are two cases for LimitedReader. One is that you don't want to process more than X amount of data, and then hitting the limit is unusual, and it's pretty much fine to have a small stray read. But another case is you're parsing some data directly out of a reader but you want it chunked into fixed-size units, and then overreading makes you lose your place in the stream. For the second case, I think you'd mostly want to use io.EOF anyway, but it's weird if you get different behavior by setting Err.

@gopherbot

Change https://go.dev/cl/410357 mentions this issue: doc/go1.19: remove TODO about LimitedReader

gopherbot pushed a commit that referenced this issue Jun 4, 2022
We are having a hard time deciding the exact semantics
of the Err field, and we need to ship the beta.
So revert the Err field change; it can wait for Go 1.20.

For #51115.

This reverts CL 396215.

Change-Id: I7719386567d3da10a614058a11f19dbccf304b4d
Reviewed-on: https://go-review.googlesource.com/c/go/+/410133
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
@rsc
Contributor

rsc commented Jun 4, 2022

Rolled back the doc TODO as well. Moving to Go 1.20.

@rsc rsc modified the milestones: Go1.19, Go1.20 Jun 4, 2022
gopherbot pushed a commit that referenced this issue Jun 4, 2022
Rolled back in CL 410133.

For #51115.

Change-Id: I009c557acf98a98a9e5648fa82d998d41974ae60
Reviewed-on: https://go-review.googlesource.com/c/go/+/410357
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot gopherbot modified the milestones: Go1.20, Go1.21 Feb 1, 2023
@gopherbot gopherbot modified the milestones: Go1.21, Go1.22 Aug 8, 2023
@odeke-em odeke-em modified the milestones: Go1.22, Go1.23 Dec 12, 2023
Projects
Status: Accepted