[RFC] Wrapper/Proxy for Stream #55
Comments
IIUC, |
Yea, I believe they should, and they should still work as intended given that |
Another thing I want to mention is that a unified API would help us implement downstream DataPipes:
These DataPipes take a file handle (stream) as input and yield either a filename or a new file stream. A unified API would help us implement a function to read data from streams. |
Doesn't For |
You are right with

    urls = IterableWrapper([URL])
    fd = urls.open_url()  # File handles
    data = fd.map(fn=lambda x: x.read(), input_col=1)  # <- This one is referred to as Downloader in my mind
    file = data.save_to_disk()

We can let users apply any map function to download data from a file handle. But if we are going to implement a DataPipe that does the same thing, we need to make sure every file handle (stream) sent to this DataPipe can be read. The reason I prefer a read method is that the stream type varies:
|
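To illustrate why varying stream types matter to a downstream DataPipe, here is a minimal sketch of the per-type branching a reader would need without a unified API. `read_all` and `FakeStreamingResponse` are hypothetical names introduced only for this example. The latter merely mimics the `iter_content` method of a `requests.Response` opened with `stream=True`, so the sketch runs without a network connection.

```python
import io


class FakeStreamingResponse:
    """Hypothetical stand-in for a requests.Response opened with stream=True."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def iter_content(self, chunk_size=1):
        # Yield the payload in fixed-size chunks, like a streamed HTTP body.
        for start in range(0, len(self._payload), chunk_size):
            yield self._payload[start:start + chunk_size]


def read_all(stream, chunk_size=4):
    """Read every byte from `stream`, branching on what the object supports."""
    if hasattr(stream, "iter_content"):
        # Chunk-only source: there is no read(), so join the chunks.
        return b"".join(stream.iter_content(chunk_size))
    if hasattr(stream, "read"):
        # File-like source: a single read() returns everything.
        return stream.read()
    raise TypeError(f"unsupported stream type: {type(stream).__name__}")


local_data = read_all(io.BytesIO(b"local bytes"))
remote_data = read_all(FakeStreamingResponse(b"remote bytes"))
```

Every DataPipe that consumes streams would have to repeat this kind of branching, which is the motivation for pushing it into one place.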
In the case of |
Closing this Issue as the Wrapper has landed. |
🚀 Feature
Discussed with @NivekT about the wrapper class for all streams:
Pros:
- `__del__` method to close the file stream automatically when ref count becomes 0 for wrapper. It would eliminate all warnings.
- (... `OnDiskCache`, I would prefer a unified API to read stream, otherwise I have to handle all different cases)
- `read()` to read everything into memory
- `stream=True` for large file, the `requests.Response` doesn't support `read`. It only supports `iter_content` or `__iter__` to read chunk by chunk.

Cons:
Reference: #35 (comment), #65 (comment)
cc: @VitalyFedyunin
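
The two ideas in this proposal, closing the stream from `__del__` and exposing one `read` method for every stream type, could be sketched roughly as follows. This is not the wrapper that landed in the library: `StreamWrapperSketch` and `ChunkOnlySource` are hypothetical names, and the buffering strategy is an assumption made for illustration. `ChunkOnlySource` stands in for an object that, like a streamed `requests.Response`, can only be iterated chunk by chunk.

```python
import io


class ChunkOnlySource:
    """Hypothetical source that supports only __iter__, not read()."""

    def __init__(self, chunks):
        self._chunks = chunks

    def __iter__(self):
        return iter(self._chunks)


class StreamWrapperSketch:
    """Hypothetical wrapper giving every stream a read() and auto-close."""

    def __init__(self, stream):
        self._stream = stream
        self._chunks = None   # lazily created iterator for chunk-only sources
        self._buffer = b""    # bytes pulled from chunks but not yet returned

    def read(self, size=-1):
        if hasattr(self._stream, "read"):
            return self._stream.read(size)
        # Chunk-only source: pull chunks until `size` bytes are buffered,
        # or until the source is exhausted when size is negative.
        if self._chunks is None:
            self._chunks = iter(self._stream)
        while size < 0 or len(self._buffer) < size:
            chunk = next(self._chunks, None)
            if chunk is None:
                break
            self._buffer += chunk
        if size < 0:
            data, self._buffer = self._buffer, b""
        else:
            data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

    def close(self):
        closer = getattr(getattr(self, "_stream", None), "close", None)
        if callable(closer):
            closer()

    def __del__(self):
        # Runs when the wrapper is garbage collected.
        self.close()


raw = io.BytesIO(b"hello world")
wrapped = StreamWrapperSketch(raw)
first = wrapped.read(5)
rest = wrapped.read()
del wrapped  # in CPython the ref count drops to 0 here, so `raw` is closed

chunked = StreamWrapperSketch(ChunkOnlySource([b"ab", b"cd", b"ef"]))
part = chunked.read(3)
tail = chunked.read()
```

Relying on `__del__` ties cleanup to reference counting, which is immediate in CPython but not guaranteed by other interpreters, so an explicit `close` is kept alongside it.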