Support downloading torrents larger than available memory without swapping #112

Closed

feross opened this issue Sep 21, 2014 · 17 comments

feross commented Sep 21, 2014

Issue by feross
Sunday May 18, 2014 at 07:09 GMT
Originally opened as https://github.com/feross/bittorrent-client/issues/16


We'll need to come up with an eviction solution so you can download torrents which are larger than the available memory on the system without thrashing to disk!

feross commented Sep 21, 2014

Comment by vikstrous
Wednesday Jun 11, 2014 at 19:08 GMT


I need this feature, but I only care about one specific case. I would like to download just one file from the torrent, and I know that file will fit in memory. If I could specify in client.add() which file I want to download, I wouldn't need a fancy custom eviction / manual swapping solution.
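
Something like this hypothetical API is what I have in mind (a sketch only; no "files" option exists in client.add() today):

// Hypothetical "files" option (name invented for illustration): restrict
// the download to one named file inside the torrent, so only that file's
// pieces are ever fetched and buffered.
client.add(magnetUri, { files: ['path/inside/torrent/video.mp4'] }, function (torrent) {
  torrent.files[0].createReadStream().pipe(process.stdout);
});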

feross commented Sep 21, 2014

Comment by vikstrous
Thursday Jun 12, 2014 at 15:17 GMT


Looks like for my use case just setting opts.nobuffer in Storage() in lib/torrent.js:507 is enough.
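
For anyone else reading, the change is a one-liner along these lines (a sketch; nobuffer appears to be an undocumented option, and the call site may have moved since):

// In lib/torrent.js (around line 507 at the time of writing), pass the
// nobuffer option through so consumed pieces aren't retained in memory.
// The surrounding code is paraphrased, not the actual source.
this.storage = new Storage(parsedTorrent, { nobuffer: true });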

feross commented Sep 21, 2014

Comment by ollym
Monday Aug 25, 2014 at 23:01 GMT


I'm seeing this problem too. "Native" non-V8-heap objects are taking up most of the space:

  { what: 'Native',
    size_bytes: 290145924,
    size: '276.7 mb',
    '+': 56,
    '-': 0 },

I'm using read streams and I've tried:

var stream = file.createReadStream(opts);
stream.pipe(socket);

stream.on('data', function () {
  // Try to evict the piece that was just streamed out.
  var index = stream._piece - 1;
  file.storage.bitfield.set(index, false);
  file.pieces[index]._reset();
});

...but no luck!

feross commented Sep 21, 2014

Comment by feross
Tuesday Aug 26, 2014 at 06:26 GMT


The reason that memory isn't getting freed is that we need to keep pieces around to answer future piece requests from peers. As we obtain new pieces, we send out HAVE messages, and peers can then expect us to have those pieces. So, I see three options:

  • offer an option to throw data away as soon as the stream has consumed it, to support this use case (see the sketch below)
  • detect a low-memory condition and throw data away only then (too magical, imo)
  • add file system support to this module: save files to a temp (or permanent, if the user wants it) location so pieces will be retrievable later without being in RAM. I'm hesitant to add fs support since this module also works in the browser (with WebRTC!) and it's not clear how to make fs work there, but we should investigate this more

In the meantime, you can try using the "webtorrent" module, which does write the file to disk, and see if that works better for you!


feross commented Sep 21, 2014

Comment by andrewrk
Tuesday Aug 26, 2014 at 07:14 GMT


"Save files to disk in temp (or permanent, if user wants it) location so pieces will be retrievable later, without being in ram."

maybe relevant: https://github.com/andrewrk/node-fd-slicer

It lets you create read and write streams from an open fd, avoiding the problem of write/read streams clobbering each other.

It doesn't solve the issue of working in the browser, though. IMO a BitTorrent implementation in node.js will work better in node.js than in the browser, and the codebase shouldn't be shared, except maybe for some modules that can work in both places.
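
For example (a minimal sketch following the fd-slicer README; pieceStream, pieceOffset, and socket are placeholders):

var fs = require('fs');
var fdSlicer = require('fd-slicer');

fs.open('downloads/file.bin', 'r+', function (err, fd) {
  if (err) throw err;
  var slicer = fdSlicer.createFromFd(fd);

  // Write an incoming piece at its offset while serving a read from
  // another region of the same fd; fd-slicer coordinates access so the
  // two streams don't clobber each other.
  pieceStream.pipe(slicer.createWriteStream({ start: pieceOffset }));
  slicer.createReadStream({ start: 0, end: 16384 }).pipe(socket);
});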

feross commented Sep 21, 2014

Comment by ollym
Tuesday Aug 26, 2014 at 08:22 GMT


@feross Neither the FS option nor the magic one is going to work for me; my use case is fairly specific, I guess.

I have this installed on a Raspberry Pi, which only has 512 MB of RAM, and writing to the FS is not an option because the SD card has a limited lifespan, made worse by any I/O performed on it. Everything should remain in memory (with swap disabled).

Also, I'm streaming content live to an HTTP server which acts as a seekable interface to the torrent stream beneath it (using Content-Range and partial "206" responses), as sketched below.

I can then download this file at lightning speeds from a computer on the same local network using a standard web browser. And if it's media content, I can actually skip forward / back.
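
The range handling amounts to roughly this (a sketch; it assumes file.length and a createReadStream({ start, end }) that only fetches the pieces covering that byte range, and it skips multi-range and validation details):

var http = require('http');

http.createServer(function (req, res) {
  var total = file.length;
  var range = req.headers.range; // e.g. "bytes=1000-" or "bytes=0-1023"

  if (!range) {
    res.writeHead(200, { 'Content-Length': total, 'Accept-Ranges': 'bytes' });
    return file.createReadStream().pipe(res);
  }

  var parts = range.replace(/bytes=/, '').split('-');
  var start = parseInt(parts[0], 10);
  var end = parts[1] ? parseInt(parts[1], 10) : total - 1;

  res.writeHead(206, {
    'Content-Range': 'bytes ' + start + '-' + end + '/' + total,
    'Accept-Ranges': 'bytes',
    'Content-Length': end - start + 1
  });
  file.createReadStream({ start: start, end: end }).pipe(res);
}).listen(8000);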

However this leaves me with a unique set of problems:

  1. After downloading a piece and having it sent out to the HTTP server, I need it removed. If it's requested again, it can be re-downloaded. It's assumed that my internet connection will always be faster than the rate the content is being consumed.
  2. The download speed of the torrent needs to be throttled by how fast the pieces are being read from the readable stream. This should ideally be controlled using the highWaterMark of the readable stream interface, which bounds how much data is buffered internally (see the sketch below).
  3. If a readable stream is destroyed (because the receiving HTTP request closed the connection prematurely), it should clean up all buffered data and return to the original state.

This basically means I'll only be seeding pieces that remain in the readable stream buffer waiting to be consumed by the HTTP service, which won't be many and won't be for long. Will this have any implications for the way the BitTorrent protocol works?
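
For point 2, pipe()'s backpressure already gives most of this, provided the piece requester pauses when the stream's internal buffer fills; a sketch (it assumes createReadStream would forward a highWaterMark option and expose destroy(), which it may not today):

// Bound internal buffering to ~1 MB; pipe() pauses the source when the
// HTTP client reads slowly, which should in turn throttle piece requests.
var stream = file.createReadStream({ highWaterMark: 1024 * 1024 });
stream.pipe(res);

// Point 3: if the HTTP client disconnects, tear everything down.
res.on('close', function () {
  stream.destroy();
});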

feross commented Sep 21, 2014

Comment by andrewrk
Wednesday Aug 27, 2014 at 22:40 GMT


This issue is a pretty big deal. This and #39 are my current blockers for depending on this module.

The module should not try to use all of the system's memory before swapping to disk. It should use as little memory as possible and use the disk to store data. If caching could improve performance, then an option should be exposed to set the maximum cache size. It's probably not necessary, though, since using the disk will rely on the OS's file system cache, and the kernel developers have spent many years writing a good caching system.

feross commented Sep 21, 2014

Comment by ollym
Thursday Aug 28, 2014 at 09:36 GMT


@andrewrk if you want to write to disk, I suggest looking into creating your own storage engine. This is a good starting point:

https://github.com/mafintosh/torrent-stream/blob/master/lib/storage.js
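
A minimal disk-backed engine might look like this (a sketch only; the readBlock/writeBlock names follow the fs-storage code mentioned later in this thread, and the real interface involves more than this):

var fs = require('fs');

function DiskStorage (filePath, pieceLength) {
  this.fd = fs.openSync(filePath, 'w+');
  this.pieceLength = pieceLength;
}

// Write a verified block straight to disk instead of keeping it in RAM.
DiskStorage.prototype.writeBlock = function (index, offset, buf, cb) {
  fs.write(this.fd, buf, 0, buf.length, index * this.pieceLength + offset, cb);
};

// Read it back for peers or read streams; the OS page cache handles caching.
DiskStorage.prototype.readBlock = function (index, offset, length, cb) {
  var buf = new Buffer(length);
  fs.read(this.fd, buf, 0, length, index * this.pieceLength + offset, function (err) {
    cb(err, buf);
  });
};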

Ayms commented Sep 23, 2014

Why don't you use indexedDB as explained in #86? Despite the caveats I highlighted in the links provided, it works well.

Chrome recently implemented Blob support. Peersm currently stores Blob pieces and then assembles the whole file from all the pieces (that will probably change with the WebRTC implementation); I would suggest storing the Blob pieces only.
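
To illustrate (a minimal sketch; the database and store names are made up):

// Store each verified piece as a Blob keyed by its index.
var open = indexedDB.open('torrent-store', 1);

open.onupgradeneeded = function () {
  open.result.createObjectStore('pieces');
};

open.onsuccess = function () {
  var db = open.result;

  function putPiece (index, arrayBuffer) {
    var tx = db.transaction('pieces', 'readwrite');
    tx.objectStore('pieces').put(new Blob([arrayBuffer]), index);
  }
};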

@Belphemur

Now that bittorrent-client has been merged into WebTorrent, I think this issue should be treated as high priority, especially if you intend to make it a real BitTorrent client and not only a BitTorrent streamer.

For one of my applications I need a torrent client. I was first using torrent-stream, which did quite a good job and kept memory use quite stable. I wanted to switch to your solution because I like the idea of supporting WebRTC, and this project is more active.

But after some testing, I don't see how it can be used in production if you intend to download files that are more than a couple of megs. I understand that as an "in-browser torrent client" it's quite difficult to find a place to store data (#86), but as you said, you also want to create a Node-Webkit application. I don't think you'll find one solution "to rule them all".

Couldn't you keep a buffer of a couple of megs for each torrent where you'd put the rarest (or most requested) pieces, and just empty/populate it on demand (see the sketch below)? There is a whole discussion on this for Transmission: https://trac.transmissionbt.com/ticket/1521
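
A bounded piece cache along these lines, for instance (a sketch; picking what to keep, rarest vs. most requested, is the policy question):

// Keep at most maxBytes of pieces in RAM, evicting the least recently
// requested piece when full. A miss means the caller re-downloads the
// piece (or reads it from wherever else it lives).
function PieceCache (maxBytes) {
  this.maxBytes = maxBytes;
  this.bytes = 0;
  this.map = {};   // piece index -> Buffer
  this.order = []; // piece indices, least recently used first
}

PieceCache.prototype.get = function (index) {
  var buf = this.map[index];
  if (buf) {
    this.order.splice(this.order.indexOf(index), 1);
    this.order.push(index); // mark as most recently used
  }
  return buf;
};

PieceCache.prototype.put = function (index, buf) {
  if (this.map[index]) return;
  while (this.bytes + buf.length > this.maxBytes && this.order.length) {
    var evicted = this.order.shift();
    this.bytes -= this.map[evicted].length;
    delete this.map[evicted];
  }
  this.map[index] = buf;
  this.order.push(index);
  this.bytes += buf.length;
};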

Ayms commented Sep 29, 2014

I understand that as an "in Browser torrent client" it's quite difficult to find a place where to store

I don't understand what the issue is here; storing pieces using indexedDB is easy and works for GB-sized files. If the file has to be seeded you keep the pieces in indexedDB; if not, you discard them. As I wrote, it's better to keep the pieces than to merge them into a GB-sized file as one Blob that you will have to slice anyway to use.

I have tried some other alternatives, still with indexedDB, like storing pieces as ArrayBuffers while waiting for Chrome to implement Blob storage. That does not work very well and the browser will crash for large files, but Blob storage is OK.

See https://code.google.com/p/chromium/issues/detail?id=108012, which gives other ideas too, but I would not recommend them; the start of that thread is old. indexedDB should be used, since it was designed for this purpose.
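
Reading a stored Blob back is just as straightforward (a sketch, with the same made-up store name as above):

// Fetch a piece Blob by index and hand its bytes to the caller.
function getPiece (db, index, cb) {
  var req = db.transaction('pieces').objectStore('pieces').get(index);
  req.onsuccess = function () {
    var blob = req.result;
    if (!blob) return cb(new Error('piece ' + index + ' not stored'));
    var reader = new FileReader();
    reader.onload = function () {
      cb(null, reader.result); // an ArrayBuffer
    };
    reader.readAsArrayBuffer(blob);
  };
  req.onerror = function () {
    cb(req.error);
  };
}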

I don't think you'll find a solution "to rule them all".

Same thing; I don't see the issue. Browsers will use indexedDB, other clients will use the disk, and the two don't have to interoperate.

tobice commented Oct 25, 2014

Regarding this problem, what is fs-storage for? Of course it saves the files being downloaded to the file system, but it doesn't really behave like a replacement for memory storage: judging by memory consumption, the torrented files are still stored entirely in memory. On the other hand, I skimmed through the fs-storage code and noticed that the readBlock method does check whether the given block is available in memory and, if not, falls back to the fs. But it does not seem to be working properly.
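
For reference, the fallback described there amounts to roughly this (a paraphrase of the intended behaviour, not the actual fs-storage code; readFromDisk is a hypothetical helper):

Storage.prototype.readBlock = function (index, offset, length, cb) {
  var piece = this.pieces[index];
  if (piece && piece.buffer) {
    // Still buffered in memory: serve it directly.
    return cb(null, piece.buffer.slice(offset, offset + length));
  }
  // Otherwise fall back to the copy on disk.
  this.readFromDisk(index, offset, length, cb);
};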

feross commented Oct 25, 2014

fs-storage definitely still has a few bugs.

@manvir-singh

I ran into this problem too; I can't write any large files to disk. Does anyone know of a workaround?
This is how I'm trying to write a GB-sized file to disk, and it fails after some time: file.createReadStream().pipe(outFile);
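
For context, the full snippet is along these lines (outFile is just a plain fs write stream; the path is an example):

var fs = require('fs');

var outFile = fs.createWriteStream('/tmp/output.bin');
file.createReadStream().pipe(outFile);

outFile.on('finish', function () {
  console.log('done writing');
});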

ghost commented Mar 20, 2015

Bump. The issue is still present and is a huge problem for older computers.

feross commented Jul 10, 2015

Duplicate of #248

feross closed this as completed Jul 10, 2015
@carlopires

Any improvements on this? I would like to implement IndexedDB for my browser clients if this is an option supported by the official interface.

lock bot locked as resolved and limited conversation to collaborators May 6, 2018