arbitrary connection support #16
Comments
Sounds like this is going to be a lot of work, and we'll need to implement quite different strategies for mmap files and connections. I think this means that connections should be at least a 0.2 feature.
Sure. I'll focus on mmap for now while keeping extensibility in mind. mmap is somewhat orthogonal to the processing. mmap is just a way to get a But anyway, if we move this to a 0.2 version, then perhaps this gives me time to worry about connections upstream in Rcpp implementations.
Sounds good to me. I wonder if there's any support for mmapped gz/bz2 files. That's going to be a pretty common use case in my experience.
I don't think so, which is one reason why I'd want to be able to consume a connection through R's internal API.
Apart from correct handling of the connection push back (#19), I've leveraged what I need from the connection API.
We need to leverage this: https://gist.github.com/romainfrancois/6119995
so that we can read a stream from an arbitrary connection. We would still use mmap for speed when it makes sense, but otherwise we can process the stream from a buffered connection.
In a threaded world, we could imagine separating the work into:
so that the thread(s) processing the data would not have to wait to process.
But before threads, we can retrieve data from the connection sequentially and process it as it arrives.
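A minimal sketch of that sequential approach, in plain C++ rather than Rcpp. It assumes a `std::istream` as a stand-in for an R connection; `process_stream`, its chunk size, and the callback are hypothetical names for illustration and are not part of the package or of the gist linked above. It reads fixed-size chunks and hands complete lines to a callback, carrying any partial line over to the next chunk.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `in` in chunks of `chunk_size` bytes and calls
// `on_line` once per complete line. A std::istream stands in for an R
// connection here; this is an assumption, not the real connection API.
template <typename Callback>
std::size_t process_stream(std::istream& in, std::size_t chunk_size, Callback on_line) {
    std::vector<char> chunk(chunk_size);
    std::string pending;      // bytes read so far that do not yet end in '\n'
    std::size_t n_lines = 0;

    while (in) {
        in.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        std::streamsize got = in.gcount();
        if (got <= 0) break;  // nothing more to read
        pending.append(chunk.data(), static_cast<std::size_t>(got));

        // Emit every complete line currently held in the buffer.
        std::size_t start = 0;
        std::size_t nl;
        while ((nl = pending.find('\n', start)) != std::string::npos) {
            on_line(pending.substr(start, nl - start));
            ++n_lines;
            start = nl + 1;
        }
        pending.erase(0, start);  // keep only the unfinished tail
    }

    if (!pending.empty()) {   // last line had no trailing newline
        on_line(pending);
        ++n_lines;
    }
    return n_lines;
}
```

Called with a `std::istringstream` and a lambda that pushes each line into a vector, this visits every line exactly once and never seeks backwards, which is the property a read-once connection would need.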
The problem, I suppose, is that some connections can only be read once, which would make a two-step algorithm difficult, such as counting the number of lines and then allocating the data ...
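One possible way around the read-once constraint is to skip the counting pass and let the output grow as lines arrive. The sketch below is only an illustration under stated assumptions: a `std::istream` stands in for the connection, the input is one integer per line, and `read_int_column` is a hypothetical name, not something from the package.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical single-pass reader for a stream that can only be consumed once.
// Assumes one integer per line; std::stoi throws on malformed input.
std::vector<int> read_int_column(std::istream& in) {
    std::vector<int> column;
    column.reserve(1024);     // initial guess; the vector grows geometrically
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;   // skip blank lines
        column.push_back(std::stoi(line));
    }
    column.shrink_to_fit();   // non-binding request to release unused capacity
    return column;
}
```

The trade-off is some reallocation and copying while the vector grows, in exchange for never needing the line count up front. Buffering the whole stream in memory or in a temporary file and then running the two-step algorithm on that copy would be another option.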