Add end-to-end crypto? #1813
[Originally from https://github.com/mumble-voip/mumble-iphoneos/issues/87]
A mumble server is able to wiretap every mumble client. It would be nice if the content of audio (and chat text per channel) was not available to a passive collection process. It should be possible for every client to do at least a pairwise ZRTP, if not a group ZRTP (pairwise, each pair?) as I think Silent Circle does for group calls.
SilentCircle has this group calling feature but it isn't Free Software. I've heard (but not used) that Jitsi has this with ZRTP and it is Free Software.
This would be an amazing feature and it would mean that a server would essentially just be a relay for encrypted data with minimal metadata (eg: user name, ip address, channel joined).
As a pretty heavy Mumble user I'd just like to express some support for this idea. End-to-end crypto in Mumble would bring it up to speed with OTR, adding vital private voice comms to the landscape which at the moment are pretty scarce and certainly none of them have the feature set of Mumble.
At the moment verifying identities w/ Mumble is pretty difficult since the certificate handling is pretty shoddy, and if there really is no end-to-end encryption then its utility as a safe utility for private communication is severely limited.
Mumble's probably in the best position of any VoIP stack to provide an effective privacy suite.
The text was updated successfully, but these errors were encountered:
I think piggybacking on ZRTP could work nicely.
My main concerns would be bandwidth requirements for group conversations, where, if we pairwise ZRTP between all parties, it could quickly grow into something unreasonable. But, if it's how Silent Circle does it, and it's expected that E2E VoIP group chat sends voice streams to each individual in the conversation, then I suppose it's not too bad...
I'm curious how you expect this to work from a user's and UI perspective. Would E2E mode be something you go into explicitly (I'm assuming it has to be, because of SAS auth), and your current channel is then in E2E mode? Also, what happens when new people join?
ZRTP is obviously more suited for a phone-like structure, where you explicitly make a call to a person or group of people.
So maybe entering E2E mode is equivalent to selecting people currently connected to the server, and launching a separate "E2E" channel for those people. Then it'd be completely separate from the channel-based communication that Mumble currently uses, and more like a "call". For example, it could open the E2E-enabled conversation in a new tab, so that it is obvious to users that it's a separate "call".
But then again, it also feels weird to have this be something you opt into explicitly...
How to implement in a usable way into the current scheme Mumble uses seems to be one of the bigger hurdles to overcome. I'm curious if you guys have any ideas in mind already...
I really dislike having to introduce "calls" semantics into what is by nature a group-chat software just so we can fit ZRTPs underlying assumptions. Here's how I see it:
In any case implementing such a system is a lot of work. Making sure it is safe and provides all the guarantees we think it does is even more work. Such features will also have impact on client complexity, backwards compatibility as well as what server capabilities we can provide in the future (e.g. what about positional audio).
Imho: There's already specialized software you can use if you want e2e encrypted communication whose sole design can focus on that sole purpose. Bolting it on to mumble - even though not quite a frankenstein - might really be a bit out of scope for use. However interesting it is to think about this stuff...
I have been toying with the idea of making mumble into a prototype Matrix client.
Ephemeral Matrix rooms are already created (that I know of) upon placing a call, so channels could just be hidden as well, and made to contain the necessary metadata.
The idea would be to piggy-back on the E2E ecosystem that Matrix already creates (including verification, extensible profiles, support for multiple devices -- soon with cross-signing), while adding everything that's nice about voice on Mumble. I bet a proof-of-concept wouldn't be that hard, since the Matrix Client-server protocol is quite simple.
I have been thinking about this for a while, but haven't found enough time to start implementing this. If you want, I could elaborate a bit more.
I think it could easily be a step into wrong direction, assuming you mean Matrix.org. While 1:1 Riot calls are end-to-end-encrypted, conference calls depend on Jitsi Meet.
Matrix.org is currently also bad for privacy, while Mumble currently doesn't store chat logs, the reference homeserver implementation for Matrix, Synapse stores everything forever matrix-org/matrix-doc#447 including deleted messages. Multiple other privacy issues are listed at https://github.com/privacytoolsIO/privacytools.io/issues/1049.
I don't know what you would do to Murmur, but I have understood Synapse to be very heavy especially if you have bigger rooms, while Murmur seems to run anywhere and there is µMurmur for even more limited systems.
That's indeed the protocol documented on matrix.org I am talking about (though not matrix.org's Matrix server instance, to disambiguate).
This is indeed something that has to be cleared out. And the privacy points you mentioned are indeed concerning. Synapse should get better at this itself, but this is a lesser concern if the participants are on the same homeserver, moreso if E2E is enabled.
I am not sure we really should be discussing the merits of Matrix here, as I wouldn't want to completely derail the thread. I am not advocating for completely replacing Murmur here, but to be able to connect to Matrix servers with Matrix identifiers, and optionally use that as a backend for what Murmur is currently used for.
Actually, I was more thinking of a proof-of concept where a matrix room would just contain a Mumble server address, with an authentication mechanism for connecting (publish the public key in the room, for instance).
As I see it, it could be something completely bolted on the Matrix protocol, with little modification (for starters, at least) to either Mumble or Murmur.
Maybe the proper way to build what I describe here is a "hard" (could always be merged back) fork, after all, if all of this doesn't fall in the scope for Mumble. But it could then also be argued the same about E2E ...
This is stating the obvious but just in case newcomers forget, mumble is already encrypted, so it's not a problem at all if you completely trust your host or your host is one of the parties of conversation.
Please don't misinterpret my comment as saying this idea is already in mumble, just stating for the record it already has encryption, but not E2E encryption, which absolutely has its place and should be considered.
On another note
I think the extra bandwidth could be offset by server owners getting a separate config option for the bandwidth of E2E calls, so they can set it lower than normal.
This would be a per channel solution, because mumble is a channel- and server-based software.
We create a group-key (details below) that is used for encryption for all participants in the channel.
So the client encrypts voice & chat with the group-key and sends it to the server, the server then sends it to the other clients and they decrypt it.
For chat logs (in case of server-side chat logs), the server will add the encrypted chat messages into a file (specific for that group).
Now the only remaining problem is the distribution of the key, because that would normally be send over the server, so a potencial man-in-the-middle-attack by the server is possible.
For 1 (encryption) we have the following solution:
For 2 (verification of sender/receivers) we have these solutions:
I understand that doing this "nicely", with proper, seamless, key exchange would be a lot of work. However, I think there would be a lot of value in doing it with key exchange out of band:
This could be slightly improved at very little cost by having some UI like "Yanmaani would like to enable encryption; key ID =
Lots of other small UX improvements you could make like that, without actually implementing any "big" protocol.
And then if this is implemented and works, someone could maybe start looking at key exchange more properly, or going the Unix way and having that done by a separate, external daemon. But for me, just having this simple though ugly system of pre-sharing a key would be extremely useful for my personal needs.