Mutable torrents (BEP46) #886
Comments
Nice! The great thing about implementing this here instead of in your fork of
When a mutable torrent gets updated, the info hash of the torrent will change. That will violate a lot of assumptions about how Updating all the places where the info hash is used in a live torrent will be hard. It's used in lots of places and this sounds hard to get right. Maybe the simplest way to implement this is in stages:
And some questions:
Sounds good. We should also consider BEP39 (a centralized version of BEP46), where torrent updates happen through an HTTP server, so these APIs should be the same. So it makes sense to have I'm a little bit ignorant about re-using the data. I think there's BEP47 which is about this, but I don't understand it yet.
Supporting BEP39 as well sounds good to me.
If you specify the same path as the old torrent, and the new torrent has changed the filenames, then the old files will not be touched. This could confuse the user. For now, let's just leave this up to the user. They can delete the files if they want when they call
I was thinking that most of the time publishers will apply updates to the underlying data they're sharing, so re-downloading the whole data set would be bad. For instance, imagine Archive.org publishing data dumps. As a user I wouldn't want to delete and replace the whole dump every time. Does webtorrent have the capability to re-use data if I only appended data to my torrent? Archive.org could then share append-only data structures (CSV?) and the old pieces could be reused, right?
I'm not sure how the We should ensure that it handles this situation gracefully. If the file name is the same, but it's just larger now, then the file size should be increased on disk without deleting the file. Then, the normal piece verification process can take care of figuring out whether the data is valid or not.

But as I mentioned before, if the user specifies the same path as the old torrent but the new torrent has changed the filenames, then the old files will not be touched. That will leave files from the old torrent and new one side-by-side. The only way to make data re-use work nicely for the user is if we handle this in WebTorrent. We need to add say a If you want to tackle this in the same PR, you can give it a go. But I think it would be easier to just punt on this for now and add it in a future PR. As an initial PR, I would just fire an event and allow the user to call
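To make the data re-use idea concrete, here is a minimal sketch of how one might work out which pieces survive an update. `reusablePieces` is a hypothetical helper, not something WebTorrent provides, and it assumes both torrent versions use the same piece length and keep file offsets unchanged (as with append-only data).

```javascript
// Hypothetical helper: given the per-piece SHA-1 hashes (hex strings) of the
// old and new torrent, return the piece indices whose data is unchanged.
// Assumption: same piece length in both versions and unchanged file offsets.
function reusablePieces (oldHashes, newHashes) {
  const reusable = []
  const count = Math.min(oldHashes.length, newHashes.length)
  for (let i = 0; i < count; i++) {
    // A matching hash at the same index means the bytes on disk are still valid
    if (oldHashes[i] === newHashes[i]) reusable.push(i)
  }
  return reusable
}

module.exports = { reusablePieces }
```

Note that when data is appended, the last piece of the old torrent usually changes as well, because it was only partially filled before; the normal piece verification would simply mark it as missing.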
@feross so the first step is to parse the new magnet URI structure. I've nailed it down to this part of the code: https://github.com/feross/webtorrent/blob/master/lib/torrent.js#L235

    parseTorrent.remote(torrentId, function (err, parsedTorrent) {
      if (self.destroyed) return
      if (err) return self._destroy(err)
      self._onParsedTorrent(parsedTorrent)
    })

So this method should do a

    var parsed = new ParseTorrent(opts)
    parsed.once('done', (parsedTorrent) => self._onParsedTorrent(parsedTorrent))

Otherwise, without breaking the API, I could send the
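For reference, a rough sketch of what parsing the BEP46 magnet form could look like. It assumes the `magnet:?xs=urn:btpk:<hex public key>&s=<hex salt>` layout from BEP46 and the BEP44 rule that mutable items live under `sha1(publicKey + salt)`. `parseMutableMagnet` is a made-up name for illustration, not an existing parse-torrent function.

```javascript
const crypto = require('crypto')

// Hypothetical helper (not part of parse-torrent): extract the BEP46 fields
// from a magnet URI of the form magnet:?xs=urn:btpk:<hex key>&s=<hex salt>
function parseMutableMagnet (uri) {
  if (!uri.startsWith('magnet:?')) return null
  const params = new URLSearchParams(uri.slice('magnet:?'.length))
  const xs = params.get('xs') || ''
  // ed25519 public keys are 32 bytes, i.e. 64 hex characters
  const match = /^urn:btpk:([0-9a-fA-F]{64})$/.exec(xs)
  if (!match) return null // not a mutable-torrent magnet link

  const publicKey = Buffer.from(match[1], 'hex')
  const salt = Buffer.from(params.get('s') || '', 'hex') // salt is optional

  // BEP44: mutable items are stored under sha1(publicKey + salt)
  const target = crypto.createHash('sha1')
    .update(Buffer.concat([publicKey, salt]))
    .digest('hex')

  return {
    publicKey: publicKey.toString('hex'),
    salt: salt.toString('hex'),
    target
  }
}

module.exports = { parseMutableMagnet }
```

The returned `target` is what would be passed to a DHT lookup; a regular `xt=urn:btih:` magnet link returns `null` here and can go through the existing code path.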
Ok so I've added a pull request with the syntax Travis tests seem to break because I'm using
I believe this is more likely a Travis env issue. The default compiler must be too old to handle C++11, as noted in: https://travis-ci.org/feross/parse-torrent/builds/153981472#L165 To make it work in this repo we need to add the g++ package to the Travis image like @substack did: https://github.com/substack/ed25519-supercop/blob/master/.travis.yml#L10
@yciabaud I'm really bad with Travis (in fact I've never used it before). What line should I edit exactly to make it work? Feel free to submit a PR on my fork.
To get things going again @feross, there's also this PR on bittorrent-dht awaiting review: webtorrent/bittorrent-dht#136
Any updates? :-)
Seriously guys, any updates on this?
This will be very useful :)
I've been thinking for a while that the ideal approach might be to stay connected to the old swarm for the chunks that are known to hold the same unchanged data. This way the swarm would not divide its efforts. If the publisher makes changes that you do not want to keep, or you want to access an old version of the torrent for whatever reason, you could do so without the old swarm being abandoned by all the peers who update or newly join, even though most of the shared data is the same.

This would also allow someone to "fork" any torrent and add or change some data to improve the content without having to start from scratch with a new swarm and new peers for the whole content, which would split the community and discourage anyone willing to improve a torrent's content. Considering how important the popularity of content is for speed and stability in P2P data sharing, and how easily less popular content or variations die, it would be best to maximize reusability and get better redundancy by relying on more than one swarm for the same data when possible.

It would be great if there was a BEP that allowed something like that. It could result in behavior similar to how Git works, where you can make a commit on top of another and keep that commit pointing to the old one. Unlike Git, though, there would be no need to store the commit history: as soon as a new version of the torrent changes a chunk, all references for that chunk to other swarms could be removed. BEP46 brings a convenient "autoupdate", but I would not mind updating manually and having control over when I update.

The real problem with versioning in torrents is not that people don't update. The root of the problem is that when people don't update, they do not contribute to the new swarm, and peers/seeders may be lost after every update when people do not keep up. I'm of course not opposed to BEP46; I think it's a convenient feature when you do want to be kept up to date. But the real leap, to me, will be when torrents can specify, for particular chunks of data, that peers should come from other swarms (and only for the data of those chunks). Only then will the real obstacle to torrent "mutation" be removed. You could mutate a torrent without affecting anyone else and still be a contributor to the swarm.
One use case that really excites me about this is the ability to load static websites from a browser. Sort of like what IPFS is doing, but using existing technology instead of building a new tech stack. Since webtorrent is being integrated with Brave, it's a perfect avenue for building fully decentralized web apps without needing to invest in any specific cryptocurrency. With this in place, people could create and publish p2p websites and have them update in a fully decentralized system without needing any hosting services for their content (other than peers). This would essentially be a competitor to the Beaker Browser and the Dat protocol ecosystem.
I recently published a library, mutable-webtorrent, which wraps the WebTorrent API and adds support for mutable torrents in magnet links, as well as some helper functions for creating and updating mutable torrents. I want to revive this effort if possible and get my changes merged into the main webtorrent repo.
Hi guys, so I was thinking of implementing BEP46 in WebTorrent. I already implemented a simple reference implementation on top of webtorrent-cli (https://github.com/lmatteis/dmt), but I think this needs to go deeper in the modules themselves. So after a couple of hours of studying the code, I think the best place to put this is in torrent-discovery. It already has an interval (using recursive timeouts) which announces to the DHT (the regular announcing), so it makes sense to also have an interval that gets()/puts() if it's a mutable torrent.

Higher up in the hierarchy, the .add() method of WebTorrent creates a new Torrent() EventEmitter, which emits _infoHash. This could instead emit the same event every time a new torrent is found (hence it should simply change from .once to .on). The problem is whether we want to add a new torrent to the list, or update the current one when an update is found.

Then inside torrent.js (always WebTorrent) the new Discovery() EventEmitter should probably emit an event when a new torrent is found in the DHT. Something like .on('dhtTorrent') or something else.

These seem like most of the parts I'd have to touch. Thoughts @feross @substack @mafintosh ?
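The polling idea above could be sketched roughly like this. It is not torrent-discovery's actual code: the class name `MutableDiscovery`, the `check`/`start` methods, and the simplified `dht.get(target, cb)` shape are assumptions for illustration. The only things taken from the thread and BEP46 are the recursive-timeout pattern, the `dhtTorrent` event name, and the info hash living under `v.ih` of the mutable item.

```javascript
const EventEmitter = require('events')

// Sketch only. `dht` is any object with get(target, cb) that calls back with
// (err, res), where res.v.ih is the current info hash as a Buffer (the value
// layout BEP46 describes). Class and method names are hypothetical.
class MutableDiscovery extends EventEmitter {
  constructor (dht, target, intervalMs) {
    super()
    this.dht = dht
    this.target = target
    this.intervalMs = intervalMs
    this.infoHash = null
    this.timer = null
    this.destroyed = false
  }

  // Look up the mutable item once; emit 'dhtTorrent' if the info hash changed.
  check (cb) {
    this.dht.get(this.target, (err, res) => {
      if (this.destroyed) return
      if (!err && res && res.v && res.v.ih) {
        const infoHash = res.v.ih.toString('hex')
        if (infoHash !== this.infoHash) {
          this.infoHash = infoHash
          this.emit('dhtTorrent', infoHash)
        }
      }
      if (cb) cb()
    })
  }

  // Recursive timeout instead of setInterval, so lookups never overlap.
  start () {
    this.check(() => {
      this.timer = setTimeout(() => this.start(), this.intervalMs)
    })
  }

  destroy () {
    this.destroyed = true
    clearTimeout(this.timer)
  }
}

module.exports = { MutableDiscovery }
```

A consumer would listen with `.on('dhtTorrent', ...)` rather than `.once`, since the event can fire again each time the publisher points the key at a new info hash. Whether the listener replaces the current torrent or adds a new one is left open, as in the question above.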