WebRTC DHT #288
Comments
Here's something worth looking at for anyone planning to implement this: https://software.intel.com/en-us/blogs/2015/03/18/meshcentral-experimental-webrtc-mesh cc @jhiesey
Perhaps this is related: https://github.com/diasdavid/webrtc-explorer
I was looking for a [Google] Chrome browser extension for handling BitTorrent metainfo files and magnet URNs, and I stumbled across this project, which seems really cool! If I understand correctly, a DHT is required to handle magnet links that don't include tracker URLs, correct? If so, does that mean webtorrent doesn't work in the browser for magnet URNs without webrtc-dht?
@smarts When we detect a magnet link without a tracker on https://instant.io, we just add the tracker wss://tracker.webtorrent.io so that there's a place to get peers from. You can do the same in your code. :-)
Thanks @feross!
Chrome only. For webtorrent/webtorrent#327
Any progress on this? I really need this feature (in combination with mutable keys, but one thing at a time...)
No progress. If someone wants to work on this, the WebTorrent project would greatly benefit!
Same here, I'm very excited about this. I added $35 to the bounty... it isn't much, but I hope it helps in some way.
I've been researching this for a couple of hours, and I don't think the DHT algorithm is a good fit for WebRTC: it seems that WebRTC connections are very expensive for DHT requests.
I actually spent some time thinking about this problem a few weeks ago and I think a WebRTC DHT may be feasible. I'll add my thoughts here soon.
@feross what would be the necessary requirements for a DHT to be interesting enough to use in the WebTorrent project? I'm asking this with the following thoughts in mind:
Happy to work on these things for the WebTorrent project :) Please feel free to share any notes, ideas, or other info you've thought about regarding this.
After some more thinking, I believe the signaling server is the best place for doing DHT queries. WebTorrent should have a few open signaling servers exactly for that case. Please take a look at PeerJS. We can change the server implementation and add support for DHT queries between those servers.
I was thinking we could adapt the Kademlia algorithm. Normally the bucket size is K=8 or K=16, but we only need that level of redundancy because there's no guarantee, when you add a DHT node to the routing table, that it will be online when you need to contact it later. With WebRTC, we have to keep the connection active. This means we can reduce the number of nodes in each bucket to something lower, like K=2. In traditional Kademlia (K=8), for a network of 1M nodes (evenly distributed across the id space), each node will have 144 entries in its routing table (check my math). For K=2 and a network of 1M nodes, we only need 40 entries in the routing table, a much more reasonable number of data channels. :)

Another difference caused by WebRTC is that when other people have you in their routing table, you have to keep an open connection to them so they can contact you. These are "wasted" connections for you, since they aren't necessary for your own routing table. Nonetheless, they need to be kept open for the other node's sake. If a node is equally likely to be in every other node's routing table, that would add 40 additional connections. But over time Kademlia actually prefers long-running nodes, since the client never evicts a responsive node from a bucket. This means that long-running nodes will end up in lots of clients' routing tables, and with the WebRTC model that means the long-running nodes will end up needing to keep tons of connections open to support other people's routing tables. To solve this, I think we could change the part of Kademlia where we keep longer-running nodes in the table, so that every node becomes equally likely to be added to the routing table.

Of course, we could ditch Kademlia entirely and come up with something better from scratch, but then it would be harder for existing torrent clients to add support. Something that's basically Kademlia with a few changes would be easier for them to adopt.

For WebTorrent, it doesn't matter. We can add anything that works. If there were a reliable general-purpose WebRTC DHT not based on Kademlia, WebTorrent would start using it immediately. Even if desktop torrent clients don't support it, it's still useful for web peers to find each other. And if a WebRTC DHT that is closer to Kademlia came along, we could always switch to it later :) "Store and forward" might be a good change for WebRTC since it eliminates the STUN/ICE connection overhead that would happen for each connection. It'd be awesome to hear your thoughts on these ideas, @diasdavid!
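The routing table estimate above can be checked with a quick back-of-the-envelope model (my own sketch, assuming node ids are uniformly distributed over a 160-bit space): bucket i covers roughly n / 2^(i+1) nodes, and each bucket holds at most K of them.

```javascript
// Estimate the expected routing table size for a Kademlia network of
// `n` uniformly distributed nodes with bucket size `k` over 160-bit ids.
// Bucket i covers roughly n / 2^(i+1) nodes; each holds at most k entries.
function expectedRoutingTableSize (n, k) {
  let total = 0
  for (let i = 0; i < 160; i++) {
    total += Math.min(k, n / Math.pow(2, i + 1))
  }
  return total
}

console.log(Math.round(expectedRoutingTableSize(1e6, 8))) // 143
console.log(Math.round(expectedRoutingTableSize(1e6, 2))) // 40
```

This roughly matches the numbers above: about 143 entries for K=8 and about 40 for K=2 in a million-node network.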
@moshest Putting that logic into a central server would mean that it's no longer distributed, the D in DHT. At that point, it's basically a tracker server, and we already have that part working :) What we really want is a DHT so we can eliminate the central point of failure that is a tracker server.
Thanks @feross, lots of insights! That was exactly what I was looking for to kick things off :) I think a K bucket of 2 is a good bet for a WebRTC DHT, or I would even say a K bucket of 1 should suffice. If I'm not missing anything, K buckets mainly offer redundancy for greater availability: having more than one peer at a specific distance means we always have more than one shot at contacting someone on that branch/bucket, which is great for non-browser DHTs, since peer representations are IP:port pairs. In a browser DHT, since we have to keep data channels open to avoid the whole STUN/ICE dance each time, we know the precise moment a peer goes away, giving us the chance to react and quickly do the handshake with a new peer. Since we need to limit the number of "xor-metric distances" (going with "xmd" for short) used, so that we avoid having too many data channels open, we can consider some strategies for finger distribution, e.g.:
The first case is interesting for things like PAST (or DynamoDB), because peers store data (and not pointers), which enables data replication to closer peers. But since this is WebTorrent and all we need to do is know who is 'providing' what, the second might make more sense. Dunno, ideas :D One thing to keep in mind, though, is that with an xmd cap of 20 and K=2 we only set the minimum number of data channels needed; there might still be cases where a peer is far enough away not to be picked for another peer's k-bucket, but since it has to know some peer on that branch, it will open a connection to the peer that didn't pick it, creating a kind of one-direction channel.
Independently of what the DHT is based on, there will always have to be something doing the signalling, but in WebTorrent's case you already have that. Would it be interesting if, instead of starting two signalling servers, we had a way to mux between signalling for peers that will do file exchange and peers that will do the handshake for the DHT? Like a protocol muxer on top of a WebSockets connection (or even a WebRTC DataChannel directly with the server)? Please do tell me how I could better expose the webrtc-explorer DHT in a way that would let us run that experiment :)
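As a point of reference, the "xmd" bucketing discussed above reduces to finding the highest bit at which two node ids differ. A minimal sketch (my own illustration, assuming ids are equal-length byte arrays):

```javascript
// Index of the highest differing bit between two equal-length node ids
// (Uint8Array), counting from the least significant bit. This is the
// k-bucket the peer falls into; -1 means the ids are identical.
function bucketIndex (a, b) {
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ^ b[i]
    if (x !== 0) {
      const bitInByte = 31 - Math.clz32(x) // highest set bit within this byte
      return (a.length - 1 - i) * 8 + bitInByte
    }
  }
  return -1
}

console.log(bucketIndex(Uint8Array.of(0x00, 0x01), Uint8Array.of(0x00, 0x00))) // 0
console.log(bucketIndex(Uint8Array.of(0x80, 0x00), Uint8Array.of(0x00, 0x00))) // 15
```

Capping the xmd range then just means refusing to open data channels for peers whose bucket index falls outside the chosen window.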
A couple random thoughts:
The idea here is that once you've holepunched once (see 3.4 here for background), and you have a cone NAT, your public ip:port should stay the same for all future connections. Let's say you have peers A, B, C. A and B successfully holepunch to each other through a STUN server. A and B now know each other's external ip:port. Let's say C comes along and connects to B, also through STUN. B and C now know each other's external ip:port.
To get A and C connected, we could do STUN again, but that's expensive. We can use B to swap SDPs between A and C and have them initiate a holepunching dance without STUN. This would bypass the ICE candidate collection phase (since we are effectively caching it here) and also avoid extra round trips to the STUN server.
According to someone on Twitter, if you explicitly set the ICE/STUN servers to empty arrays, you can skip most external ICE checks (there's no way to skip the internal ones, though).
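If anyone wants to experiment with that, the configuration in question would be something like this (browser-only sketch, untested; whether this actually skips the external checks is exactly the claim being discussed):

```js
// No STUN/TURN servers configured: only host candidates are gathered,
// so the round trips to external ICE servers are skipped.
const pc = new RTCPeerConnection({ iceServers: [] })
```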
Update: you can't skip internal ICE, because you need to find out the internal port that was created for your new SCTP session so you can relay this port to the other peer. Since peer connections will always have different ports (they won't do port multiplexing, since it's an entirely encrypted protocol), my scheme proposed above, where you reuse the SDP from an earlier peer, won't work: you won't know their new port without directly asking them, which ends up being the same number of round trips as the normal ICE candidate + signaling flow :(
@maxogden Although that flow is the same number of round trips, it seems like a better option because we end up with "signaling peers" rather than servers, keeping the overlay network decentralised. @feross Do you think a WebRTC-backed dgram module would be a good start toward making bittorrent-dht usable in the browser, or should we start this from scratch? I've been working on some small prototypes for this and would be interested to hear thoughts.
I think a dgram overlay network is the most exciting way to tackle this problem because it means we can use bittorrent-dht as-is, and we can also implement all sorts of other distributed protocols that were written under the assumption of udp/tcp primitives.

Unfortunately, an overlay network with forwarding means we'll need to implement routing algorithms. From what I gather, mesh routing algorithms don't scale very well (perhaps up to hundreds of nodes or maybe low thousands), so internet-scale routing systems are split into interconnected autonomous systems (AS) with bridge nodes that advertise the autonomous system prefixes on their local network. I haven't found any research papers so far that bring all of these ideas together into a comprehensive technique, so we might need to experiment with how to glue them together. Here's a good overview of how these routing systems interconnect: http://www.cc.gatech.edu/~traynor/cs3251/f13/slides/lecture13-routing.pdf

As for routing protocols, babel seems like a good candidate. I've already started to implement babel with some simulations to check the results. Each AS could maintain a minimum number of connections to other AS networks so that a disconnected graph is very unlikely. Which AS you join on start could be a product of the node ID, and an AS could split when it gets too big. Here's what I have so far for the babel routing protocol implementation: https://github.com/substack/babel-routing-protocol
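The heart of a Babel-style distance-vector update is simple enough to sketch. This toy version is my own illustration, not code from the linked repo: each node keeps, per destination, the neighbor offering the lowest total metric (advertised metric plus link cost).

```javascript
// Toy distance-vector routing table update (Babel-like, no timers/seqnos).
// `table` maps destination id -> { neighbor, metric }.
function updateRoute (table, dest, neighbor, advertised, linkCost) {
  const metric = advertised + linkCost
  const current = table.get(dest)
  // Adopt the route if it's new, cheaper, or a refresh from the neighbor
  // we already route through (so metric increases propagate too).
  if (!current || metric < current.metric || current.neighbor === neighbor) {
    table.set(dest, { neighbor, metric })
  }
}

const table = new Map()
updateRoute(table, 'nodeX', 'peerA', 3, 1) // via peerA, total metric 4
updateRoute(table, 'nodeX', 'peerB', 1, 1) // via peerB, total metric 2
console.log(table.get('nodeX')) // { neighbor: 'peerB', metric: 2 }
```

A real Babel implementation adds sequence numbers, route expiry, and the feasibility condition to avoid loops; this only shows the basic metric comparison that everything else is built around.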
It looks like IPFS might have a similar approach, separating peer routing and content routing to cope with NAT traversal: ipfs/specs#1
^ This would be a great place to get to.
This is an issue, but maybe not one we need to address immediately. If I understand Kademlia correctly, the number of open connections we'd need to support the DHT would be 160 (or whatever number of bits we use for node IDs); the browser's maximum is 256 peer connections, so this seems like it could be an option.
What concerns me about a dgram overlay network is how we bootstrap it. Ideally it would function like the other browserify alternatives to core node modules, a drop-in replacement with the same API. Would we have browser-only methods for doing this?
Having looked at nazar-pc's code, I have two criticisms: 1. Why isn't it written in TypeScript? (When I look at the webtorrent code and the DHT, the latter is a LiveScript one.) 2. The signaling should be done through the KRPC protocol, sending the RTCIceCandidates and storing them in the k-bucket instead of the IP:port addresses. I have investigated the technologies in the last few days, and that's what I learned (or got as an idea), if I understood the whole thing correctly, which is honestly very hard when processing thousands of lines of JavaScript (webtorrent depends on many packages).
There is no need to have millions of connections at the same time, and even then you can just throw a load balancer at it and scale as much as you need. I'd expect WebRTC to be the bottleneck here rather than WebSocket. I think it would be more appropriate to discuss implementation details in the mentioned repo, to avoid distracting the many people subscribed to this issue; feel free to open issues there. I'll be using that implementation for one of my future projects that is not WebTorrent-related, so any feedback is definitely appreciated.
Sadly you can't reuse SDP or ICE; you can only use them once per peer connection, so storing them for later won't make sense. I think that signaling over KRPC when possible would be better than always using WebSockets.
In my implementation, WebSocket is only used briefly during bootstrap (since we can't yet connect to other nodes with WebRTC signaling). As soon as a WebRTC connection is established, the rest of the signaling is sent over established WebRTC connections with other peers. https://github.com/nazar-pc/webtorrent-dht/blob/master/bep.rst should explain how this works in detail. Storing SDP for future connections is unfortunately not possible.
Has anyone ever thought about using Web Push?
The bad parts:
navigator.serviceWorker.register('sw.js').then(function(reg) {
return reg.pushManager.subscribe({userVisibleOnly: false})
})
.catch(function(err) {
// Chrome currently only supports the Push API for subscriptions that
// will result in user-visible messages. You can indicate this by calling
// pushManager.subscribe({userVisibleOnly: true}) instead.
// See https://goo.gl/yqv4Q4 for more details.
});
I was actually thinking about this a while ago. I think the problem I was seeing was that you can't send a push notification to another user from within a browser due to CORS restrictions. Unless the push notification service sends the proper headers, the browser won't allow sending POST requests to it.
Yeah, I noticed that too. I wonder why they didn't enable CORS; you're forced to use a backend/proxy server. Edit: Mozilla's endpoint responds with CORS headers.
Some sort of back end is required to tell Peer A where Peer B is (and vice versa). This is the function of a STUN/TURN server for NAT traversal in P2P networks, and it can be very lightweight. Would something else be required to meet the CORS requirements?
On Nov 30, 2018, at 8:05 AM, Jimmy Wärting wrote:
If I want to build a peer-to-peer chat between two people who have friended each other and exchanged tokens, then I think they should be able to ping each other without having to go through a backend.
A STUN/TURN server doesn't have to do any CORS preflight stuff; as a client you never send any AJAX requests to those servers. It's pretty much handled for you in the background.
I should have updated this a long time ago. I had an informal conversation about 4 months ago with some standards/browser folks who work in this area. From their point of view, a bunch of p2p people have been asking for raw socket access and the ability to open ports in order to resolve this, which is just never going to happen in the browser. Once we talked through it, I landed on language that was really helpful: what we need is a re-usable signal for WebRTC connections. For a variety of security reasons the browser can't scrap the signal exchange flow, but we could potentially create a signal that is reusable, could be added to a DHT, and could be used by other peers for a longer period of time.
Are you talking about ORTC? I have heard about ORTC but never investigated it.
Edit: Disregard that, I wasn't being constructive.
I haven't spent enough time with ORTC to know if they solved this or not. In general, it's best to be a little less specific with browser vendors on feature asks like this. Describing the ask in existing WebRTC terms, changing the security model only in the one specific way you actually need, is the best way to work through a variety of possible solutions.
This isn't a given. I have my own thoughts on why the current performance is terrible, but performance is a solvable problem. Nothing about WebRTC makes it inherently less performant; my current view is that the implementations are just old and nobody is really touching them, since the browser vendors are just binding to a bunch of RTP libraries. People complain about the performance a lot, but according to Mozilla nobody has ever logged a usable issue on the subject. What aspect of performance is bad? Where's a reusable test case? For the most part, people who write browsers are not also writing web applications, so you can't assume they are aware of things we are aware of unless someone has done the work of properly communicating it to them. Yes, they don't make it easy; they all have obtuse bug-tracking processes used by literally no other project. But that's the situation we're in.
If you had a re-usable signal that lasted 24 hours, you'd be fine.
Seems like a worthy goal, but to implement re-usable signals wouldn't you need to build a DHT or central routing service anyway? We'd have to assume users are roaming around to different networks and would need to locate them as their addresses change.
A WebRTC-based DHT is definitely possible, and while performance is a concern, the potential number of users in this case is, I think, several orders of magnitude higher. I've actually studied DHTs and various approaches to their construction for a while, and created another WebRTC-based DHT called Detox DHT (and ES-DHT, used as a generic independent framework under the hood) that I use instead of the previously mentioned version. It is built from scratch using an alternative design that takes some major ideas from Kademlia and other papers, but doesn't replicate Mainline DHT with its inherent incompatibilities with WebRTC. It has a somewhat different focus, but still might be of interest to others. The Detox DHT repository contains the source code and tests. By following the link to ES-DHT, which it is based on, you can find the framework's source code, tests, design document, specification, and references to papers.
You have this problem no matter what. DHTs that store IP addresses suffer from the exact same issue.
@mikeal Right, just trying to point out that getting browsers to allow re-usable SDP just pushes the problem of implementing a DHT onto them (which maybe is what we want).
Assuming you had reusable signals, how would connections work? Would it be something like: Receiver side:
Initiator side:
You could potentially have a DHT that bootstrapped by talking to an initial centralized node, and then using this reusable SDP from then on, but you're still constrained by having to rely on centralized signalling servers.
I'm 100% into this. A few months ago I was trying to convince the folks behind the Beaker Browser to provide an interface to the DHT used for Dat in order to enable different types of p2p applications. Since it'd be cheaper to have a single DHT set up per browser rather than per browser tab or origin, having a DHT API provided at the browser level helps performance, and having a standard high-level API simplifies p2p application development. This is currently being replaced by a higher-level PeerSocket API that goes a step further by providing an interface that opens sockets to peers for a given channel and abstracts away the DHT or MDNS systems that are behind the scenes.
What could make some progress on this is an elaborate 10-page proposal about what needs to change in WebRTC to make this P2P scenario possible. The suggested changes need to be simple, and they should also bring some business value. The proposal could make a point about the emerging market of P2P apps, or maybe about eliminating the ICE step from video calls and thus making them connect faster. Then this proposal needs to be presented to the decision makers, and if they are convinced, we'll likely see in a year how the WebRTC team rolls out the new API. I would push for an option where a previously established RTCPeerConnection could be saved and then restored by id: WebRTC would send a UDP message to the saved ipv4:port, and if the response is correct, we assume it's safe to restore the connection and skip all the SDP handshakes. Another option is to let the desktop nodes of webtorrent act as signalling servers.
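The save/restore option might look roughly like this. To be clear, this is an entirely hypothetical API sketch; nothing like `save()`/`restore()` exists in WebRTC today, and the STUN server URL is just an example:

```js
// Hypothetical API sketch for a save/restore flow (does not exist today).
const pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.example.com' }] })
// ...connect normally once, then:
const token = await pc.save()                    // hypothetical: opaque, expiring id
// later, possibly after a page reload:
const restored = await RTCPeerConnection.restore(token)
// hypothetical: the browser pings the saved ipv4:port with a single UDP
// round trip and skips the SDP/ICE handshakes if the response checks out
```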
Discussion of any developments with the WebRTC DHT.