Original bug ID: 6719 Reporter:@c-cube Assigned to:@gasche Status: closed (set by @xavierleroy on 2016-12-07T10:47:32Z) Resolution: fixed Priority: normal Severity: feature Fixed in version: 4.03.0+dev / +beta1 Category: standard library Tags: junior_job Has duplicate:#7136 Monitored by:@hcarty
Buffer.add_channel behavior is flawed. In case the channel contains less than [n] bytes (where [n] is the last parameter to [Buffer.add_channel buf chan n]), some data is consumed but not read, and lost forever if the channel isn't seekable.
This makes it almost unusable in my opinion (and it looks like it's not used in trunk, as "grep -r Buffer.add_channel" shows).
Steps to reproduce
1/ echo "hello" > /tmp/some_file
let buf = Buffer.create 256 in
let ic = open_in "/tmp/some_file" in
(try Buffer.add_channel buf ic 256 with End_of_file -> ());
We obtain an empty string. In practice, unless we need how much bytes to read exactly beforehand, we just consume data from the in_channel and discard it. It makes writing a simple procedure to read all data from a channel into a Buffer impossible (except by reading char by char).
The text was updated successfully, but these errors were encountered:
I'd say Buffer.add_channel isn't flawed because it raises an exception for a short read. There are numerous protocols where the number of bytes are prepended to payload data, and add_channel works fine for these. A short read is always an error in this case.
The problem reported by c-cube sounds more like a feature request for a second channel reading function that also appends short reads to the buffer.
I'd say it's a good feature request. Discarding data is almost never the right thing to do.
Here is what I propose: change the documentation to say "reads at least [n] characters" instead of "reads exactly [n] characters", and change the implementation to put the bytes in the buffer instead of discarding them. But still raise End_of_file if there aren't enough bytes.
It can be done instead of changing add_channel, or in conjunction with it.
hcarty: if you know you need this, it's possible to workaround the fact that add_channel cannot return the number of read bytes by storing the length of the buffer before and after the request, and computing the difference.
What I like with Damien's suggestion is that it strikes a subtle balance between preserving backward-compatibility (the behavior is almost unchanged) while satisfying a new use-case. I agree (with hcarty) that if we were doing it from scratch, we would probably have a different interface¹, but there is an art in evolving APIs that is admirable here.
¹: namely, "add_channel" that returns the number of bytes read, and raises End_of_file only if it knows the channel ends, and "add_channel_really" with the current interface.