Support downloading torrents larger than available memory without swapping #112

Closed

feross opened this issue Sep 21, 2014 · 17 comments

feross commented Sep 21, 2014

Issue by feross
Sunday May 18, 2014 at 07:09 GMT
Originally opened as https://github.com/feross/bittorrent-client/issues/16


We'll need to come up with an eviction solution so you can download torrents which are larger than the available memory on the system without thrashing to disk!

feross commented Sep 21, 2014

Comment by vikstrous
Wednesday Jun 11, 2014 at 19:08 GMT


I need this feature, but I only care about one specific case. I would like to download just one file from the torrent, and I know that file will fit in memory. If I could specify in client.add() which file I want to download, I wouldn't need a fancy custom eviction / manual swapping solution.
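
Something like this hypothetical API is what I have in mind (a sketch only; no "files" option exists in client.add() today):

// Hypothetical "files" option (name invented for illustration): restrict
// the download to one named file inside the torrent, so only that file's
// pieces are ever fetched and buffered.
client.add(magnetUri, { files: ['path/inside/torrent/video.mp4'] }, function (torrent) {
  torrent.files[0].createReadStream().pipe(process.stdout);
});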

feross commented Sep 21, 2014

Comment by vikstrous
Thursday Jun 12, 2014 at 15:17 GMT


Looks like for my use case just setting opts.nobuffer in Storage() in lib/torrent.js:507 is enough.
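
For anyone else reading, the change is a one-liner along these lines (a sketch; nobuffer appears to be an undocumented option, and the call site may have moved since):

// In lib/torrent.js (around line 507 at the time of writing), pass the
// nobuffer option through so consumed pieces aren't retained in memory.
// The surrounding code is paraphrased, not the actual source.
this.storage = new Storage(parsedTorrent, { nobuffer: true });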

feross commented Sep 21, 2014

Comment by ollym
Monday Aug 25, 2014 at 23:01 GMT


I'm seeing this problem too. "Native" non-V8-heap objects are taking up most of the space:

  { what: 'Native',
    size_bytes: 290145924,
    size: '276.7 mb',
    '+': 56,
    '-': 0 },

I'm using read streams and I've tried:

var stream = file.createReadStream(opts);
stream.pipe(socket);

stream.on('data', function () {
  // Try to evict the piece that was just streamed out.
  var index = stream._piece - 1;
  file.storage.bitfield.set(index, false);
  file.pieces[index]._reset();
});

...but no luck!

feross commented Sep 21, 2014

Comment by feross
Tuesday Aug 26, 2014 at 06:26 GMT


The reason that memory isn't getting freed is that we need to keep pieces around to answer future piece requests from peers. As we obtain new pieces, we send out HAVE messages, and peers can then expect us to have those pieces. So, I see three options:

  • offer an option to throw data away as soon as the stream has consumed it, to support this use case (see the sketch below)
  • detect a low-memory condition and throw data away only then (too magical, imo)
  • add file system support to this module: save files to a temp (or permanent, if the user wants it) location so pieces will be retrievable later without being in RAM. I'm hesitant to add fs support since this module also works in the browser (with WebRTC!) and it's not clear how to make fs work there, but we should investigate this more

In the meantime, you can try using the "webtorrent" module, which does write the file to disk, and see if that works better for you!


feross commented Sep 21, 2014

Comment by andrewrk
Tuesday Aug 26, 2014 at 07:14 GMT


"Save files to disk in temp (or permanent, if user wants it) location so pieces will be retrievable later, without being in ram."

maybe relevant: https://github.com/andrewrk/node-fd-slicer

It lets you create read and write streams from an open fd, avoiding the problem of write/read streams clobbering each other.

It doesn't solve the issue of working in the browser, though. IMO a BitTorrent implementation in node.js will work better in node.js than in the browser, and the codebase shouldn't be shared, except maybe for some modules that can work in both places.
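
For example (a minimal sketch following the fd-slicer README; pieceStream, pieceOffset, and socket are placeholders):

var fs = require('fs');
var fdSlicer = require('fd-slicer');

fs.open('downloads/file.bin', 'r+', function (err, fd) {
  if (err) throw err;
  var slicer = fdSlicer.createFromFd(fd);

  // Write an incoming piece at its offset while serving a read from
  // another region of the same fd; fd-slicer coordinates access so the
  // two streams don't clobber each other.
  pieceStream.pipe(slicer.createWriteStream({ start: pieceOffset }));
  slicer.createReadStream({ start: 0, end: 16384 }).pipe(socket);
});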

feross commented Sep 21, 2014

Comment by ollym
Tuesday Aug 26, 2014 at 08:22 GMT


@feross Neither the FS option nor the magic one is going to work for me; my use case is fairly specific, I guess.

I have this installed on a Raspberry Pi, which only has 512 MB of RAM, and writing to the FS is not an option because the SD card has a limited lifespan, made worse by any I/O performed on it. Everything should remain in memory (with swap disabled).

Also, I'm streaming content live to an HTTP server which acts as a seekable interface to the torrent stream beneath it (using Content-Range and partial "206" responses), as sketched below.

I can then download this file at lightning speeds from a computer on the same local network using a standard web browser. And if it's media content, I can actually skip forward / back.
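
The range handling amounts to roughly this (a sketch; it assumes file.length and a createReadStream({ start, end }) that only fetches the pieces covering that byte range, and it skips multi-range and validation details):

var http = require('http');

http.createServer(function (req, res) {
  var total = file.length;
  var range = req.headers.range; // e.g. "bytes=1000-" or "bytes=0-1023"

  if (!range) {
    res.writeHead(200, { 'Content-Length': total, 'Accept-Ranges': 'bytes' });
    return file.createReadStream().pipe(res);
  }

  var parts = range.replace(/bytes=/, '').split('-');
  var start = parseInt(parts[0], 10);
  var end = parts[1] ? parseInt(parts[1], 10) : total - 1;

  res.writeHead(206, {
    'Content-Range': 'bytes ' + start + '-' + end + '/' + total,
    'Accept-Ranges': 'bytes',
    'Content-Length': end - start + 1
  });
  file.createReadStream({ start: start, end: end }).pipe(res);
}).listen(8000);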

However this leaves me with a unique set of problems:

  1. After downloading a piece and having it sent out to the HTTP server, I need it removed. If it's requested again, it can be re-downloaded. It's assumed that my internet connection will always be faster than the rate the content is being consumed.
  2. The download speed of the torrent needs to be throttled by how fast the pieces are being read from the readable stream. This should ideally be controlled using the highWaterMark of the readable stream interface, which bounds how much data is buffered internally (see the sketch below).
  3. If a readable stream is destroyed (because the receiving HTTP request closed the connection prematurely), it should clean up all buffered data and return to the original state.

This basically means I'll only be seeding pieces that remain in the readable stream buffer waiting to be consumed by the HTTP service, which won't be many and won't be for long. Will this have any implications for the way the BitTorrent protocol works?
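
For point 2, pipe()'s backpressure already gives most of this, provided the piece requester pauses when the stream's internal buffer fills; a sketch (it assumes createReadStream would forward a highWaterMark option and expose destroy(), which it may not today):

// Bound internal buffering to ~1 MB; pipe() pauses the source when the
// HTTP client reads slowly, which should in turn throttle piece requests.
var stream = file.createReadStream({ highWaterMark: 1024 * 1024 });
stream.pipe(res);

// Point 3: if the HTTP client disconnects, tear everything down.
res.on('close', function () {
  stream.destroy();
});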

feross commented Sep 21, 2014

Comment by andrewrk
Wednesday Aug 27, 2014 at 22:40 GMT


This issue is a pretty big deal. This and #39 are my current blockers for depending on this module.

The module should not try to use all of the system's memory before swapping to disk. It should use as little memory as possible and use the disk to store data. If caching could improve performance, then an option should be exposed to set the maximum cache size. It's probably not necessary, though, since using the disk will rely on the OS's file system cache, and the kernel developers have spent many years writing a good caching system.

feross commented Sep 21, 2014

Comment by ollym
Thursday Aug 28, 2014 at 09:36 GMT


@andrewrk if you want to write to disk, I suggest looking into creating your own storage engine. This is a good starting point:

https://github.com/mafintosh/torrent-stream/blob/master/lib/storage.js
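
A minimal disk-backed engine might look like this (a sketch only; the readBlock/writeBlock names follow the fs-storage code mentioned later in this thread, and the real interface involves more than this):

var fs = require('fs');

function DiskStorage (filePath, pieceLength) {
  this.fd = fs.openSync(filePath, 'w+');
  this.pieceLength = pieceLength;
}

// Write a verified block straight to disk instead of keeping it in RAM.
DiskStorage.prototype.writeBlock = function (index, offset, buf, cb) {
  fs.write(this.fd, buf, 0, buf.length, index * this.pieceLength + offset, cb);
};

// Read it back for peers or read streams; the OS page cache handles caching.
DiskStorage.prototype.readBlock = function (index, offset, length, cb) {
  var buf = new Buffer(length);
  fs.read(this.fd, buf, 0, length, index * this.pieceLength + offset, function (err) {
    cb(err, buf);
  });
};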

Ayms commented Sep 23, 2014

Why don't you use indexedDB as explained in #86? Despite the caveats I highlighted in the links provided, it works well.

Chrome recently implemented Blob support. Peersm currently stores Blob pieces and then assembles the whole file from all the pieces (that will probably change with the WebRTC implementation); I would suggest storing the Blob pieces only.
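
To illustrate (a minimal sketch; the database and store names are made up):

// Store each verified piece as a Blob keyed by its index.
var open = indexedDB.open('torrent-store', 1);

open.onupgradeneeded = function () {
  open.result.createObjectStore('pieces');
};

open.onsuccess = function () {
  var db = open.result;

  function putPiece (index, arrayBuffer) {
    var tx = db.transaction('pieces', 'readwrite');
    tx.objectStore('pieces').put(new Blob([arrayBuffer]), index);
  }
};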

@Belphemur

Now that bittorrent-client has been merged into WebTorrent, I think this issue should be treated as high priority, especially if you intend to make it a real BitTorrent client and not only a BitTorrent streamer.

For one of my applications I need a torrent client. I was first using torrent-stream, which did quite a good job and kept memory use quite stable. I wanted to switch to your solution because I like the idea of supporting WebRTC, and this project is more active.

But after some testing, I don't see how it can be used in production if you intend to download files that are more than a couple of megs. I understand that as an "in-browser torrent client" it's quite difficult to find a place to store data (#86), but as you said, you also want to create a Node-Webkit application. I don't think you'll find one solution "to rule them all".

Couldn't you keep a buffer of a couple of megs for each torrent where you'd put the rarest (or most requested) pieces, and just empty/populate it on demand (see the sketch below)? There is a whole discussion on this for Transmission: https://trac.transmissionbt.com/ticket/1521
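
A bounded piece cache along these lines, for instance (a sketch; picking what to keep, rarest vs. most requested, is the policy question):

// Keep at most maxBytes of pieces in RAM, evicting the least recently
// requested piece when full. A miss means the caller re-downloads the
// piece (or reads it from wherever else it lives).
function PieceCache (maxBytes) {
  this.maxBytes = maxBytes;
  this.bytes = 0;
  this.map = {};   // piece index -> Buffer
  this.order = []; // piece indices, least recently used first
}

PieceCache.prototype.get = function (index) {
  var buf = this.map[index];
  if (buf) {
    this.order.splice(this.order.indexOf(index), 1);
    this.order.push(index); // mark as most recently used
  }
  return buf;
};

PieceCache.prototype.put = function (index, buf) {
  if (this.map[index]) return;
  while (this.bytes + buf.length > this.maxBytes && this.order.length) {
    var evicted = this.order.shift();
    this.bytes -= this.map[evicted].length;
    delete this.map[evicted];
  }
  this.map[index] = buf;
  this.order.push(index);
  this.bytes += buf.length;
};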

Ayms commented Sep 29, 2014

I understand that as an "in Browser torrent client" it's quite difficult to find a place where to store

I don't understand what the issue is here; storing pieces using indexedDB is easy and works for GB-sized files. If the file has to be seeded you keep the pieces in indexedDB; if not, you discard them. As I wrote, it's better to keep the pieces than to merge them into a GB-sized file as one Blob that you will have to slice anyway to use.

I have tried some other alternatives, still with indexedDB, like storing pieces as ArrayBuffers while waiting for Chrome to implement Blob storage. That does not work very well and the browser will crash for large files, but Blob storage is OK.

See https://code.google.com/p/chromium/issues/detail?id=108012, which gives other ideas too, but I would not recommend them; the start of that thread is old. indexedDB should be used, since it was designed for this purpose.
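
Reading a stored Blob back is just as straightforward (a sketch, with the same made-up store name as above):

// Fetch a piece Blob by index and hand its bytes to the caller.
function getPiece (db, index, cb) {
  var req = db.transaction('pieces').objectStore('pieces').get(index);
  req.onsuccess = function () {
    var blob = req.result;
    if (!blob) return cb(new Error('piece ' + index + ' not stored'));
    var reader = new FileReader();
    reader.onload = function () {
      cb(null, reader.result); // an ArrayBuffer
    };
    reader.readAsArrayBuffer(blob);
  };
  req.onerror = function () {
    cb(req.error);
  };
}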

I don't think you'll find a solution "to rule them all".

Same thing; I don't see the issue. Browsers will use indexedDB, other clients will use the disk, and the two don't have to interoperate.

tobice commented Oct 25, 2014

Regarding this problem, what is fs-storage for? Of course it saves the files being downloaded to the file system, but it doesn't really behave like a replacement for memory storage: judging by memory consumption, the torrented files are still stored entirely in memory. On the other hand, I skimmed through the fs-storage code and noticed that the readBlock method does check whether the given block is available in memory and, if not, falls back to the fs. But it does not seem to be working properly.
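
For reference, the fallback described there amounts to roughly this (a paraphrase of the intended behaviour, not the actual fs-storage code; readFromDisk is a hypothetical helper):

Storage.prototype.readBlock = function (index, offset, length, cb) {
  var piece = this.pieces[index];
  if (piece && piece.buffer) {
    // Still buffered in memory: serve it directly.
    return cb(null, piece.buffer.slice(offset, offset + length));
  }
  // Otherwise fall back to the copy on disk.
  this.readFromDisk(index, offset, length, cb);
};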

feross commented Oct 25, 2014

fs-storage definitely still has a few bugs.

@manvir-singh

I ran into this problem too; I can't write any large files to disk. Does anyone know of a workaround?
This is how I'm trying to write a GB-sized file to disk, and it fails after some time: file.createReadStream().pipe(outFile);
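
For context, the full snippet is along these lines (outFile is just a plain fs write stream; the path is an example):

var fs = require('fs');

var outFile = fs.createWriteStream('/tmp/output.bin');
file.createReadStream().pipe(outFile);

outFile.on('finish', function () {
  console.log('done writing');
});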

ghost commented Mar 20, 2015

Bump. The issue is still present and is a huge problem for older computers.

feross commented Jul 10, 2015

Duplicate of #248

feross closed this as completed Jul 10, 2015
@carlopires

Any improvements on this? I would like to implement IndexedDB for my browser clients if this is an option supported by the official interface.

lock bot locked as resolved and limited conversation to collaborators May 6, 2018