Follow-up on #520, with my remark about buffering being very inefficient. In this case, and often when one needs to read bytes from a pipe, socket, etc. an efficient buffer is needed. So here is my proposal (to be reviewed, probably not yet ready for merge), a circular buffer that should be efficient for adding and removing big chunks of bytes, and also allows to iterate, pop next char, etc.
new module, batBufferIO: a circular, efficient buffer of bytes
Looks good to me. But this code seems very similar to BatSubstring. Is there a way to cut the duplication a little?
EDIT: saying the code is similar is the wrong word. More like the representations are similar
EDIT2: here's what I mean: https://github.com/ocaml-batteries-team/batteries-included/blob/master/src/batSubstring.ml#L24
Substring.t ref is basically BufferIO.t
I'm not sure, the use-case is pretty different (the exposed API, too). Here the goal is to be able to use the buffer to store the data read by IO, and to consume it at the other end, which means indices have to "wrap" in some cases. Otherwise the usual Buffer module would be enough, the problem with it is that is doesn't allow to remove bytes at the beginning efficiently.
I actually agree with you in this specific case and would like to add that this module is nicely isolated for the great batteries split up. In principle however, : It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures. :)
A couple of remarks:
You're right, of course. I think that in the big split up, it should go in the "IO" sub-package. I think it should actually be used for IO in many places of the library (sharing buffers is important). The usual Buffer structure, on the other hand, should be subsumed by this one (problem is that it's not the same type).
About tests, yeah, I just added a few but it's a bit tedious to get a buffer into the "wrapped" state to test half the cases. That's WIP.
@rgrinberg actually there's a difference between bufferIO and Substring.t ref, because the former wraps around (so it would be a pair of substrings in some cases). However it's probably not useful enough to be included now.