Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffer.add_channel is unusable #6719

Closed
vicuna opened this Issue Dec 16, 2014 · 8 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link
Collaborator

vicuna commented Dec 16, 2014

Original bug ID: 6719
Reporter: @c-cube
Assigned to: @gasche
Status: closed (set by @xavierleroy on 2016-12-07T10:47:32Z)
Resolution: fixed
Priority: normal
Severity: feature
Fixed in version: 4.03.0+dev / +beta1
Category: standard library
Tags: junior_job
Has duplicate: #7136
Monitored by: @hcarty

Bug description

Buffer.add_channel behavior is flawed. In case the channel contains less than [n] bytes (where [n] is the last parameter to [Buffer.add_channel buf chan n]), some data is consumed but not read, and lost forever if the channel isn't seekable.

This makes it almost unusable in my opinion (and it looks like it's not used in trunk, as "grep -r Buffer.add_channel" shows).

Steps to reproduce

1/ echo "hello" > /tmp/some_file

2/ utop

let buf = Buffer.create 256 in
let ic = open_in "/tmp/some_file" in
(try Buffer.add_channel buf ic 256 with End_of_file -> ());
Buffer.contents buf

We obtain an empty string. In practice, unless we need how much bytes to read exactly beforehand, we just consume data from the in_channel and discard it. It makes writing a simple procedure to read all data from a channel into a Buffer impossible (except by reading char by char).

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Dec 17, 2014

Comment author: gerd

I'd say Buffer.add_channel isn't flawed because it raises an exception for a short read. There are numerous protocols where the number of bytes are prepended to payload data, and add_channel works fine for these. A short read is always an error in this case.

The problem reported by c-cube sounds more like a feature request for a second channel reading function that also appends short reads to the buffer.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Dec 17, 2014

Comment author: @damiendoligez

I'd say it's a good feature request. Discarding data is almost never the right thing to do.

Here is what I propose: change the documentation to say "reads at least [n] characters" instead of "reads exactly [n] characters", and change the implementation to put the bytes in the buffer instead of discarding them. But still raise End_of_file if there aren't enough bytes.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Dec 17, 2014

Comment author: @c-cube

sounds ok. I'll submit a PR.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Dec 18, 2014

Comment author: @hcarty

How would you identify the bytes making up the partial request? It may be necessary to specially handle an incomplete read.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Dec 18, 2014

Comment author: @gasche

Another option would be to add

add_channel_full

or

add_full_channel

that does what c-cube wants in the first place.

It can be done instead of changing add_channel, or in conjunction with it.

hcarty: if you know you need this, it's possible to workaround the fact that add_channel cannot return the number of read bytes by storing the length of the buffer before and after the request, and computing the difference.

What I like with Damien's suggestion is that it strikes a subtle balance between preserving backward-compatibility (the behavior is almost unchanged) while satisfying a new use-case. I agree (with hcarty) that if we were doing it from scratch, we would probably have a different interface¹, but there is an art in evolving APIs that is admirable here.

¹: namely, "add_channel" that returns the number of bytes read, and raises End_of_file only if it knows the channel ends, and "add_channel_really" with the current interface.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Dec 18, 2014

Comment author: @hcarty

gasche: That's true - I was hoping for something cleaner, but as you said there is an art to API evolution.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Dec 18, 2014

Comment author: @damiendoligez

FTR I wrote "at least" when I meant "at most". You did it right in the GPR (#129).

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Apr 12, 2015

Comment author: @gasche

Fixed using Damien's proposed behavior, thanks to c-cube's patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.