Is there currently a way to act on parts of a text file as they are downloaded? #141
Comments
|
I'm no expert here, but it really comes down to what you expect from a stream. There are examples out there of streaming video over IPFS via HLS. That's basically just chunking files up into component parts and then downloading them in sequence. I'm pretty sure IPFS doesn't support "streaming" as such, because it has to validate that the content matches its address, and I'm pretty sure HLS-style video streaming is not what you're trying to do. If you have the ability to chunk your file up like HLS, then problem solved: chunk your file up, create your own inventory of chunks, then download and analyze them that way. If you don't, then you're going to have to analyze the chunks out of order, which means reconstructing the text document yourself. You'll have to manually reconstruct the links and put them in order, and deal with any weird word boundary issues that might come up as a result of Unicode. If I'm insane someone let me know. |
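The word boundary problem mentioned above can be illustrated with a minimal sketch. It assumes the text is UTF-8 and that the chunks are already in file order; `decode_chunks` is a hypothetical helper name, not part of any IPFS tooling. A multi-byte character split across two chunks is held back until its remaining bytes arrive.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Yield text from an in-order iterable of byte chunks.

    An incremental decoder buffers a partial multi-byte character at the
    end of one chunk and completes it with the start of the next chunk.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush the decoder; this raises UnicodeDecodeError if the stream
    # ended in the middle of a character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

This only covers chunks processed in sequence. Chunks analyzed out of order would still need their neighbours before a split character could be decoded.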
|
You can also use the files API:
ipfs files cp /ipfs/QM...AAA /myfile.txt
ipfs files read --offset=N --count=M /myfile.txt
Then you will read M bytes at offset N. |
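As a rough sketch, the `ipfs files read --offset=N --count=M` command above could be driven in a loop to walk a file window by window. The helper names `ipfs_files_read` and `read_windows` are hypothetical, the `ipfs` binary is assumed to be on the PATH with a running node, and the loop assumes that a read at the end of the file returns no bytes rather than an error.

```python
import subprocess


def ipfs_files_read(path, offset, count):
    """Read `count` bytes at `offset` from an MFS path via the ipfs CLI."""
    result = subprocess.run(
        ["ipfs", "files", "read", f"--offset={offset}", f"--count={count}", path],
        check=True,
        capture_output=True,
    )
    return result.stdout


def read_windows(read_at, window_size):
    """Yield (offset, data) pairs by calling read_at(offset, count) repeatedly.

    Stops on an empty or short read, which is taken to mean end of file.
    """
    offset = 0
    while True:
        data = read_at(offset, window_size)
        if not data:
            break
        yield offset, data
        offset += len(data)
        if len(data) < window_size:
            break
```

A call might look like `read_windows(lambda o, n: ipfs_files_read("/myfile.txt", o, n), 1 << 20)`. Passing the reader in as a function keeps the windowing logic testable without a node.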
|
@Ghoughpteighbteau Thanks, especially for the tip on HLS. I didn't realize that video streaming wasn't inherently baked into IPFS. @Kubuxu This only works after you have downloaded the file though, yes? Is there a way to do the same thing while the download is going? |
|
The problem, I think, is that the way IPFS downloads is sorta like BitTorrent. It chunks files up and constructs an acyclic graph of them. The data could theoretically be requested in order, and you could do that by manually traversing the graph, but that's on you. IPFS, like BitTorrent, is just going to grab any chunk whenever it's available. I think you have to write your own stuff that works with IPFS's plumbing if you want it in sequence. Now that I've written all that, I guess it is possible. |
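The manual traversal described above might look roughly like the following sketch. It makes simplifying assumptions: each node is either an interior node with an ordered list of child links or a leaf carrying data, and `fetch_node` is a hypothetical callable standing in for whatever plumbing call retrieves a node. It is not an actual IPFS API.

```python
def iter_leaves_in_order(root, fetch_node):
    """Yield leaf data in file order using an iterative depth-first walk.

    fetch_node(node_id) is assumed to return (links, data), where links is
    an ordered list of child ids (empty for a leaf) and data is bytes.
    """
    stack = [root]
    while stack:
        node_id = stack.pop()
        links, data = fetch_node(node_id)
        if links:
            # Push children reversed so the first child is popped first.
            stack.extend(reversed(links))
        else:
            yield data
```

Because each leaf is yielded as soon as it is fetched, analysis can start on the first chunk without waiting for the rest of the graph.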
|
@whereswaldon it will not download the whole file, only the chunks you access with |
|
@Ghoughpteighbteau I actually don't care about the order of the chunks for my particular use case. I just wanted to act on them as they arrived. I'm potentially working with gigabytes of text, and I'd like to start processing chunks as soon as they arrive. @Kubuxu Oh, okay. I'm not sure that this fits my particular use case, since doing that would require many sequential requests, but it's an excellent example of how to stream data where order matters. Since I don't care about order very much, I think the other approach is somewhat more promising. Thanks for bringing this up though. I wonder now whether you couldn't build a streaming service just on top of that functionality. |
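For the order-insensitive case, one possible shape is to request chunks concurrently and analyze each one as soon as it finishes. This is only a sketch: `fetch_chunk` and `analyze` are hypothetical callables supplied by the caller, and nothing here is an IPFS API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def analyze_as_they_arrive(chunk_ids, fetch_chunk, analyze, max_workers=4):
    """Yield (chunk_id, analysis) pairs in completion order, not file order.

    fetch_chunk(chunk_id) is assumed to return that chunk's bytes, and
    analyze(data) computes whatever per-chunk result is needed.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_chunk, cid): cid for cid in chunk_ids}
        for future in as_completed(futures):
            yield futures[future], analyze(future.result())
```

Results come back keyed by chunk id, so they can be reordered afterwards if the analysis turns out to need it.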
|
We are going to have a rudimentary pub-sub mechanism soon that will allow 'live' streaming of data. Most of the code to do this is there, but we're not solid on the API yet, so it hasn't been merged. |
whereswaldon commented Jun 30, 2016
I'm interested in streaming large text files and performing analysis on the text chunks as they come in. How would I go about doing this via IPFS?