Implement storage in browser clients #86
Comments
Feedback from Peersm: for now IndexedDB is enough and probably the best approach (not that there is really a choice...), even with large files, despite some drawbacks/bugs as I wrote in #39 (and using something on top of it like filer can make things easier), assuming you store the files in chunks (Blobs), or use chunks as an intermediate step (please see this thread: http://lists.w3.org/Archives/Public/public-webapps/2013OctDec/0657.html). Unfortunately, handling partial data as described in http://lists.w3.org/Archives/Public/public-webapps/2014AprJun/0171.html and http://lists.w3.org/Archives/Public/public-webapps/2014JulSep/0332.html does not seem to be coming any time soon.
I am doing some experiments to persist data in the browser and reduce memory consumption. It seems to work so far; I will look at performance/stability in the next few days.
Managed to get fs-chunk-store working in the browser over browserify-fs instead of fs. I had to modify:
The only remaining problem is that fs-chunk-store needs the temp path to exist in order to run. Some questions:
There are some other fs replacements for the browser (some with sync methods included), but those look too big and complex.
Got fs-chunk-store working without reloading. Even with this, it's still impossible to build webtorrent. Going to work on it tomorrow.
@feross @Ayms @yciabaud @santiagogil
My storage implementation uses IndexedDB when available and seems to work pretty well. Maybe I should send a PR to replace memory-chunk-store in the browser.
What's the news? It still takes a lot of memory in the browser.
I've been doing a lot of research on this over the last few days.
Some background:
Sorry about so much commenting with no PR. I have time on my phone to read and think, but no keyboard time :(
```js
this.mem = new Array(this.storage.size % this.storage.chunkSize + 1)
```
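For context on that arithmetic: `size % chunkSize` gives the length of the final partial chunk, while the total number of chunks in a store is normally `Math.ceil(size / chunkSize)`. A small worked example (hypothetical helper names, not from the snippet above):

```javascript
// The usual chunk-count arithmetic for a fixed-chunk-size store.
function chunkCount (size, chunkSize) {
  // Divide and round up: a partial tail still needs its own chunk
  return Math.ceil(size / chunkSize)
}

function lastChunkLength (size, chunkSize) {
  // The remainder, or a full chunk when size divides evenly
  return size % chunkSize || chunkSize
}

// e.g. a 1000-byte store with 300-byte chunks:
const n = chunkCount(1000, 300)          // 4 chunks
const tail = lastChunkLength(1000, 300)  // last chunk holds 100 bytes
```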
It might be a good idea. Need to check how performant that would be, though.
I'm not sure, but could my lib be of any interest? StreamSaver.js writes data to the hard drive directly (asynchronously). It's not possible to seek. It uses a service worker to fake a response and doesn't use any local storage, so you wouldn't need to worry about the "20% of the user's available disk space" limit, since it writes directly to the hard drive. I was able to write 15 GB of data generated on the client side without any prompts (except for choosing where to save). I did a hack so it works for HTTP sites as well.
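The streaming-write model this describes can be sketched generically. StreamSaver.js hands the page a `WritableStream` backed by its service worker; in this sketch (an assumption-laden stand-in, not StreamSaver's actual code) a byte-counting sink takes that stream's place, so only one generated chunk is ever held in memory at a time.

```javascript
// Sketch of writing generated data through a WritableStream sink.
// Runs anywhere Web Streams exist (browsers, Node 18+).
let bytesWritten = 0

const sink = new WritableStream({
  write (chunk) {
    // A real sink (e.g. StreamSaver's stream) would push this to disk
    bytesWritten += chunk.byteLength
  }
})

async function produce (writable, chunkCount, chunkSize) {
  const writer = writable.getWriter()
  for (let i = 0; i < chunkCount; i++) {
    // Generate data on the fly; nothing accumulates in memory
    await writer.write(new Uint8Array(chunkSize))
  }
  await writer.close()
}

const done = produce(sink, 10, 1024) // 10 KiB total, written incrementally
```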
I would assume that if the user is downloading a torrent in the browser, quits the browser, then starts the download again, the download should resume where it left off, right? If we have some persistent storage in the browser, what would the implications be for running multiple webtorrent instances that share a common persistent store? With a persistent store there are weird conditions where multiple instances act on the same store; for example, what if a user has multiple tabs open to the same webpage, with the same torrent downloading in each? I think making an indexdb-chunk-store wouldn't be too hard, but I don't know how this would work when there are multiple webtorrent instances using the same underlying storage. Does anyone know how these cases are handled in the desktop version, or how they should be handled in the web version?
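One possible answer to the multiple-instances question, sketched here as an assumption rather than anything webtorrent does today: within a single page, operations against a shared store can be serialized through a promise queue so writes from different instances never interleave; coordination across tabs would need a separate mechanism such as the Web Locks API. All names below are hypothetical.

```javascript
// Minimal promise-queue serializer: tasks run strictly one after another.
function makeSerializer () {
  let tail = Promise.resolve()
  return function runExclusive (task) {
    const result = tail.then(task)
    // Keep the chain alive even if a task rejects
    tail = result.catch(() => {})
    return result
  }
}

const runExclusive = makeSerializer()
const log = []

// Two "instances" touching the store concurrently still execute in order
const a = runExclusive(async () => { log.push('a-start'); log.push('a-end') })
const b = runExclusive(async () => { log.push('b-start'); log.push('b-end') })
const settled = Promise.all([a, b])
```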
I have written both an idb storage and a filesystem storage. Sure, blobs can be made out of just memory, but there is a way to move a blob out of memory and onto the HDD: write it to some web storage and replace it with a blob that is just a pointer to somewhere on the HDD. A blob isn't made of data... look at this simplified chart of how Chrome describes it in a document:

```js
// So if you have one 100MB blob in memory
let blob = new Blob([new ArrayBuffer(1.049e+8)])
// then you write this to the idb or the sandboxed filesystem
await storeBlob(blob)
// after that you replace the `blob` variable with the one you get from the storage
blob = await getBlob(index)
```

The result is that you have now replaced the in-memory blob with one holding the same data. The only difference now is that the blob is a pointer to a place on the HDD. So what I'm proposing is that any abstract-chunk-store can (if it wants to) return a blob instead of a buffer:

```js
get (index, opts, cb) {
  cb(null, blob)
}
```

This will result in faster startup time when webtorrent checks whether all pieces exist. The indexeddb layer won't read the whole buffer; instead it just gives back a blob with a pointer, resulting in faster lookups. So instead of the (default: 200 * 1000 * 1000 bytes) maxBlobLength, where you probably do something like this:

```js
blob = new Blob([buffer_1, buffer_2, buffer_x])
```

you could loosen up the maxBlobLength by A LOT if you did it with blobs instead of buffers:

```js
blob = new Blob([blob_1, blob_2, blob_x])
```

since the second approach just combines blob pointers with offsets and sizes. I hope this is an easy fix, since blobs and buffers work in similar enough ways that either can be assembled into a final blob, and blobs as well as buffers can also use
I implemented this in a branch of idbkv-chunk-store if anyone wants to experiment (check the readme for usage): https://github.com/KayleePop/idbkv-chunk-store/tree/blobs It looks like you're right about indexedDB returning a pointer.
@KayleePop it doesn't matter much whether you store it as a blob or a buffer if you must convert it to a buffer afterwards. Webtorrent needs to be able to accept blob chunks in order to assemble them all into one large final blob.
You pass an option to get() to return it as a blob:

```js
chunkStore.get(0, { returnBlob: true }, (err, blob) => console.log(blob instanceof Blob)) // outputs true
```

It also means that if you get a partial chunk like this:

```js
chunkStore.get(0, { length: 20, offset: 100 }, (err, buffer) => console.log(buffer))
```

it will only read the actual data that's returned, because the slice happens on the blob before it's converted into a buffer.
If it returned blobs by default, then it would break the abstract-chunk-store tests.
I think it's unfortunate that there isn't any
What is a real bottleneck is:
If using the indexedDB layer, it would be better to open a cursor and concatenate all the blobs in one go, and then return the blob or an object URL. This would be far more effective.
See comment above #86 (comment)
A bit amazing that this is still an issue for webtorrent... This is what the Peersm project has been doing for years (with the additional complexity of encryption, hashing, video conversion, etc). You can try it: slicing into chunks, storing them the same way, streaming the chunks, reconstituting the blob, etc, all with flow control. The Peersm/node-Tor code will become open source as soon as someone funds this work.
What about this problem? The issue is 5 years old and it seems that no further discussion has occurred for more than a year. Is there anything that prevents the implementation of a storage system? Has anyone managed to implement this for webtorrent? If I understand correctly, in the current state it is not possible to download a torrent larger than the available memory. For a tool that is supposed to be the torrent client for the browser, this seems like a priority.
See #1767. This project has been storing files/torrents in chunks using indexedDB for years (you can try http://peersm.com/peersm2). The code is now open source in the clear, so you might use it or take inspiration from it to implement webtorrent browser storage.
I'm using this with my own store. https://web.dev/native-file-system/ |

Currently, when webtorrent is used in the browser, it stores files entirely in memory. We need a solution that uses disk whenever possible, to prevent excessive memory usage and tab crashes.
Possible approaches:
Are there other ways to do browser storage? Are there other approaches to consider? Are there better storage modules to use?
Feedback welcome!
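For reference, the current in-memory behavior corresponds roughly to a memory-backed chunk store. Below is a reduced sketch of the abstract-chunk-store interface (put/get with callbacks) that any browser storage backend discussed above would implement; the real spec also defines close()/destroy() and allows a shorter final chunk, which this sketch omits for brevity.

```javascript
// Minimal in-memory chunk store following the abstract-chunk-store shape.
class MemoryChunkStore {
  constructor (chunkLength) {
    this.chunkLength = chunkLength
    this.chunks = []
  }

  put (index, buf, cb) {
    // Simplification: the real spec permits a shorter final chunk
    if (buf.length !== this.chunkLength) {
      return cb(new Error('chunk length must be ' + this.chunkLength))
    }
    this.chunks[index] = buf
    cb(null)
  }

  get (index, opts, cb) {
    if (typeof opts === 'function') { cb = opts; opts = {} }
    const buf = this.chunks[index]
    if (!buf) return cb(new Error('chunk not stored: ' + index))
    const offset = opts.offset || 0
    const length = opts.length ?? buf.length - offset
    cb(null, buf.slice(offset, offset + length))
  }
}
```

A disk-backed implementation (IndexedDB, sandboxed filesystem, Native File System API) would keep this exact interface and change only where the bytes live.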