Add end-to-end crypto? #1813

mkrautz opened this issue Sep 16, 2015 · 13 comments


@mkrautz mkrautz commented Sep 16, 2015

[Originally from]

@ioerror says:

A mumble server is able to wiretap every mumble client. It would be nice if the content of audio (and chat text per channel) was not available to a passive collection process. It should be possible for every client to do at least a pairwise ZRTP, if not a group ZRTP (pairwise, each pair?) as I think Silent Circle does for group calls.

SilentCircle has this group calling feature but it isn't Free Software. I've heard (but not used) that Jitsi has this with ZRTP and it is Free Software.

This would be an amazing feature and it would mean that a server would essentially just be a relay for encrypted data with minimal metadata (eg: user name, ip address, channel joined).

@Tea23 says:

As a pretty heavy Mumble user I'd just like to express some support for this idea. End-to-end crypto in Mumble would bring it up to speed with OTR, adding vital private voice comms to the landscape which at the moment are pretty scarce and certainly none of them have the feature set of Mumble.

At the moment verifying identities w/ Mumble is pretty difficult since the certificate handling is pretty shoddy, and if there really is no end-to-end encryption then its usefulness as a safe tool for private communication is severely limited.

Mumble's probably in the best position of any VoIP stack to provide an effective privacy suite.

Member Author

@mkrautz mkrautz commented Sep 16, 2015

I think piggybacking on ZRTP could work nicely.

My main concerns would be bandwidth requirements for group conversations, where, if we pairwise ZRTP between all parties, it could quickly grow into something unreasonable. But, if it's how Silent Circle does it, and it's expected that E2E VoIP group chat sends voice streams to each individual in the conversation, then I suppose it's not too bad...
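To put rough numbers on the bandwidth concern, here is a minimal sketch comparing how many encrypted copies of a voice stream a speaker would have to upload under pairwise keys versus a single shared key. The function names and the 40 kbit/s per-stream figure are assumptions for illustration, not Mumble or ZRTP values.

```python
def pairwise_uploads_per_speaker(participants: int) -> int:
    """Pairwise keys: the speaker encrypts one copy per listener."""
    return max(participants - 1, 0)


def shared_key_uploads_per_speaker(participants: int) -> int:
    """One shared key: a single copy that the server fans out unchanged."""
    return 1 if participants > 1 else 0


def upload_kbps(copies: int, kbps_per_stream: int = 40) -> int:
    """Upstream bandwidth one speaker needs; 40 kbit/s is an assumed figure."""
    return copies * kbps_per_stream


for n in (2, 5, 10, 25):
    pairwise = upload_kbps(pairwise_uploads_per_speaker(n))
    shared = upload_kbps(shared_key_uploads_per_speaker(n))
    print(f"{n:>2} participants: pairwise {pairwise} kbit/s, shared key {shared} kbit/s")
```

Under these assumptions the pairwise cost grows linearly per speaker with the size of the group, while the shared-key cost stays constant.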

I'm curious how you expect this to work from a user's and UI perspective. Would E2E mode be something you go into explicitly (I'm assuming it has to be, because of SAS auth), and your current channel is then in E2E mode? Also, what happens when new people join?

ZRTP is obviously more suited for a phone-like structure, where you explicitly make a call to a person or group of people.

So maybe entering E2E mode is equivalent to selecting people currently connected to the server, and launching a separate "E2E" channel for those people. Then it'd be completely separate from the channel-based communication that Mumble currently uses, and more like a "call". For example, it could open the E2E-enabled conversation in a new tab, so that it is obvious to users that it's a separate "call".

But then again, it also feels weird to have this be something you opt into explicitly...

How to implement this in a usable way within the scheme Mumble currently uses seems to be one of the bigger hurdles to overcome. I'm curious if you guys have any ideas in mind already...


@hacst hacst commented Sep 17, 2015

I really dislike having to introduce "calls" semantics into what is by nature a group-chat software just so we can fit ZRTP's underlying assumptions. Here's how I see it:


  • We already have public key material in the form of our certificates which gives us some straight-forward ways to encrypt whispers in a way that isn't susceptible to passive eavesdropping.
  • In general to protect against MITM simply relying on our existing key material seems like the most logical approach. It would be a natural fit with our friends system which already uses certificate hashes and works cross-server.


  • A basic property of group-chat is that you have to trust all participants anyways as any of them can leak the whole conversation. As such multiple streams are a waste of bandwidth.
  • Unless I'm missing something a shared secret is clearly the way to go there. How that shared secret is managed and negotiated is a more difficult issue.
  • However we wouldn't have to solve that immediately, as allowing groups to set an out-of-band exchanged secret is a natural first stage of such a feature anyway. You could even have that be part of proper passworded channels with the server enforcing entry restrictions without actually knowing the password.
  • To have groups negotiate a shared secret or enforce regular re-keying (mumblers like to idle) is a bit trickier. You could do something simple with a kind of mini web-of-trust derived from friend relationships. The user would have to request the secret from a friend/someone already in said channel.


  • All of this gets more complicated if you want PFS (which we would likely want at some point or another if we bother to do E2E crypto in the first place). In that case we actually need the users to do a DH handshake for each session (aka call :( . At least that handshake can be protected by the cert...).
  • I'm not sure if we can benefit from the key continuity idea. A mumble account isn't bound to one client so how would we make sure we have the cached key material available?
  • I guess doing the SAS for the first ever DH could be beneficial usability wise compared to having to compare certificate hashes (though the latter would be stronger from what I understand).
  • Relying on the certificates also allows things like "trust everyone signed by the same CA as me" or "CA x,y and z" which might be interesting for groups signing their own certificates. They could be automatically secured without any additional configuration / interaction.
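
As a rough illustration of the "rely on existing certificate hashes" idea in the list above, the following sketch checks a peer's presented certificate against a locally stored friends list. The helper names, the choice of SHA-256 and the placeholder certificate bytes are all assumptions; this is not Mumble's actual friends-list code.

```python
import hashlib
import hmac


def cert_hash(cert_der: bytes) -> str:
    """Hex digest of a DER-encoded certificate (digest choice is an assumption)."""
    return hashlib.sha256(cert_der).hexdigest()


def is_trusted_peer(presented_cert_der: bytes, friend_hashes) -> bool:
    """True if the presented certificate matches a hash we already stored."""
    digest = cert_hash(presented_cert_der)
    # compare_digest avoids leaking how many leading characters matched.
    return any(hmac.compare_digest(digest, known) for known in friend_hashes)


# Placeholder bytes standing in for real DER-encoded certificates.
alice_cert = b"placeholder certificate bytes for alice"
mallory_cert = b"placeholder certificate bytes for mallory"

friends = {cert_hash(alice_cert)}
print(is_trusted_peer(alice_cert, friends))    # certificate we already know
print(is_trusted_peer(mallory_cert, friends))  # certificate substituted in transit
```

The point of the sketch is that the comparison happens on the client against hashes it stored earlier, so a server that hands out a substituted certificate would fail the check.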

In any case implementing such a system is a lot of work. Making sure it is safe and provides all the guarantees we think it does is even more work. Such features will also have impact on client complexity, backwards compatibility as well as what server capabilities we can provide in the future (e.g. what about positional audio).

Imho: There's already specialized software you can use if you want e2e encrypted communication, whose design can focus on that sole purpose. Bolting it on to mumble - even though not quite a frankenstein - might really be a bit out of scope for us. However interesting it is to think about this stuff...


@Zorlin Zorlin commented Dec 21, 2017

This is a very important feature for some of my team chats, +1


@ezi0o ezi0o commented Dec 19, 2018

in 2018 i think this is a major concern and should be implemented


@damajor damajor commented Aug 3, 2019

Over time it became necessary to get this feature. A trade-off may be using more bandwidth but nowadays this is a minor issue.


@ranomier ranomier commented Aug 25, 2019

Just to add something to the List to consider.

Signal messenger added a new protocol called RingRTC.

I think it's not yet open source, but I'm sure it will be soon.


@MayeulC MayeulC commented Aug 26, 2019

I have been toying with the idea of making mumble into a prototype Matrix client.
Since rooms can contain arbitrary data (and are just a persistent, distributed graph database), they could be made to store the channel tree. Channels could be shared between multiple rooms, etc.

Ephemeral Matrix rooms are already created (that I know of) upon placing a call, so channels could just be hidden as well, and made to contain the necessary metadata.
I think it could be worthwhile to pursue that goal, although it could be more suitable to open an issue for the feature instead of discussing it here.

The idea would be to piggy-back on the E2E ecosystem that Matrix already creates (including verification, extensible profiles, support for multiple devices -- soon with cross-signing), while adding everything that's nice about voice on Mumble. I bet a proof-of-concept wouldn't be that hard, since the Matrix Client-server protocol is quite simple.

I have been thinking about this for a while, but haven't found enough time to start implementing this. If you want, I could elaborate a bit more.


@Mikaela Mikaela commented Aug 26, 2019

I have been toying with the idea of making mumble into a prototype Matrix client.

I think it could easily be a step into wrong direction, assuming you mean While 1:1 Riot calls are end-to-end-encrypted, conference calls depend on Jitsi Meet. is currently also bad for privacy, while Mumble currently doesn't store chat logs, the reference homeserver implementation for Matrix, Synapse stores everything forever matrix-org/matrix-doc#447 including deleted messages. Multiple other privacy issues are listed at

I don't know what you would do to Murmur, but I have understood Synapse to be very heavy especially if you have bigger rooms, while Murmur seems to run anywhere and there is µMurmur for even more limited systems.


@MayeulC MayeulC commented Aug 26, 2019

assuming you mean

That's indeed the protocol documented on I am talking about (though not's Matrix server instance, to disambiguate).

I think it could easily be a step into wrong direction,

This is indeed something that has to be cleared up. And the privacy points you mentioned are indeed concerning. Synapse should get better at this itself, but this is a lesser concern if the participants are on the same homeserver, more so if E2E is enabled.

I am not sure we really should be discussing the merits of Matrix here, as I wouldn't want to completely derail the thread. I am not advocating for completely replacing Murmur here, but to be able to connect to Matrix servers with Matrix identifiers, and optionally use that as a backend for what Murmur is currently used for.
We do not have to use jitsi. There is no requirement of interoperability with the call functionality, I rather see it as being orthogonal. However, a complete p2p implementation would be a requirement if we want to avoid changing synapse (the server's TURN and STUN config can be reused), while not depending on Murmur.

Actually, I was more thinking of a proof-of-concept where a matrix room would just contain a Mumble server address, with an authentication mechanism for connecting (publish the public key in the room, for instance).

As I see it, it could be something completely bolted on the Matrix protocol, with little modification (for starters, at least) to either Mumble or Murmur.


  • Streamlined onboarding? (join a Matrix room)
  • Support for more clients (if a Matrix client "just" has to implement some parts of Mumble), at least for text
  • Possibility of a Murmur-less environment if Mumble is adjusted to do so (and to facilitate the above)
  • Build on the progress being made in Matrix ecosystem: persistent chat history with asynchronous messages, bridges, and later E2E crypto with cross-signing, distributed identities, etc.
  • Do not reinvent the wheel with the above!


  • Adjustments needed for Mumble (long term, websockets support would be nice for web compatibility)
  • I feel a bit bad about Murmur, but Matrix is a superset for a lot of chat protocols, so it could always be "the" lightweight implementation.

Maybe the proper way to build what I describe here is a "hard" (could always be merged back) fork, after all, if all of this doesn't fall in the scope for Mumble. But it could then also be argued the same about E2E ...
I'm just trying to collect some feedback here, and sharing my thoughts :)


@nifker nifker commented Sep 1, 2019

I just wanted to add that it needs to be possible for clients to verify their conference partners' fingerprints, in case the server is giving out wrong public keys to the user.


@JJRcop JJRcop commented Oct 19, 2019

This is stating the obvious but just in case newcomers forget, mumble is already encrypted, so it's not a problem at all if you completely trust your host or your host is one of the parties to the conversation.

Please don't misinterpret my comment as saying this idea is already in mumble, just stating for the record it already has encryption, but not E2E encryption, which absolutely has its place and should be considered.

On another note

I think the extra bandwidth could be offset by server owners getting a separate config option for the bandwidth of E2E calls, so they can set it lower than normal.


@toby63 toby63 commented Apr 30, 2020

I would like to add a concept for this (also considering the discussion about chat logs #2560).
Some of these ideas were of course already mentioned (especially by @hacst (see comment 2 )).

This would be a per channel solution, because mumble is a channel- and server-based software.

We create a group-key (details below) that is used for encryption for all participants in the channel.

So the client encrypts voice & chat with the group-key and sends it to the server, the server then sends it to the other clients and they decrypt it.

For chat logs (in case of server-side chat logs), the server will add the encrypted chat messages into a file (specific for that group).
Access to the chatlog could be managed with the already implemented client/user certificate (only members of the group get access).

group-key creation:
One of the most important factors is that the server is not involved in the creation.
I think the easiest solution is if one of the users (maybe the first user in the channel) creates the key (automatically).

Now the only remaining problem is the distribution of the key, because that would normally be sent over the server, so a potential man-in-the-middle attack by the server is possible.
We need two solutions for that:

  1. encrypt the group-key for transportation (so it is not readable by the server).
  2. verify the certificate of the sender (and receivers) (for man-in-the-middle-protection).

For 1 (encryption) we have the following solution:
We use (maybe separate/additional) user certificates (let's call them: friends-certificates) to encrypt the group-key for transmission.
(Note: The reason for additional certificates is that we maybe want to separate between user-certificates (that would only be known between server and user) and friends-certificates (which would be known between friends, but sent via the server).)

For 2 (verification of sender/receivers) we have these solutions:

  1. friends- or contact-trust:
    The mumble client could create another certificate (let's call it: trust-friends-certificate), which the user then sends to someone else via a communication channel other than mumble (e.g. via email).
    This person can then import the certificate.
    The idea would be to make it as easy as possible, so the certificate could simply be a long string of characters, that the user can copy into a special text field in mumble, to import it.
    The group-key (or the friends-certificate mentioned above) is then signed with this trust-friends-certificate and so the other user(s) can trust it.
    This is also a long-term solution, because friends will be able to reuse this trust-system, if they don't change their identity etc.

  2. implement other information-directory-services or key-servers etc.:
    The idea is of course to have a third-party that we can trust instead of the mumble server.

  3. show a (rather short) code in the client, that users could compare by hand.
    Rather insecure. The idea was e.g. that a user reads this to other users in the voice chat and that they compare it then.
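
For the third option (a short code that users compare by voice), a minimal sketch could derive the code from both parties' public keys so that each side computes the same digits. The function name, the 6-digit length and the placeholder key bytes are assumptions for illustration.

```python
import hashlib


def short_code(key_a: bytes, key_b: bytes, digits: int = 6) -> str:
    """Derive the same short decimal code on both sides from the two public keys."""
    # Sort so that both participants hash the keys in the same order.
    first, second = sorted((key_a, key_b))
    # Length-prefix the first key so the boundary between the keys is unambiguous.
    material = len(first).to_bytes(4, "big") + first + second
    digest = hashlib.sha256(material).digest()
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return f"{number:0{digits}d}"


# Placeholder bytes standing in for real public keys.
alice_key = b"alice public key bytes"
bob_key = b"bob public key bytes"

print(short_code(alice_key, bob_key))
print(short_code(bob_key, alice_key))  # same code regardless of argument order
```

This fits the "rather insecure" caveat: a 6-digit code only carries about 20 bits, so an attacker who can generate many candidate keys could search for one that produces a matching code. ZRTP avoids this by having one side commit to its key exchange value before the code is derived, which this sketch does not do.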


  • persistent keys:
    The same group could of course keep the same key (also look at Perfect-forward-secrecy below).
    This way we would not always have to create and share new keys.

  • Perfect-forward-secrecy:
    For PFS we could create session-keys derived from the group-key.

  • New participants:
    If a new participant joins the channel, a new key could/should be created.
    This way the chat log could stay private among the past participants.
    Another idea would be an option to add a new user to the group (so he/she can get chat log access), but only with permission of all group members (probably difficult to implement).
    Update: Actually not that difficult, because we have the server as a second protection of the chatlogs, so the server would ask the participants for permission and then he adds the new user to the access list.

  • How can we identify the users?
    For understanding: the problem is that if we use the same key for every participant, no one can be really sure who a message came from.
    (Even though the server would of course send it with the "correct" username)
    Now we should consider one thing:
    Maybe we don't want that, at least not in a totally secure way, because of plausible deniability.
    That said, if you still want identification, the solution would be to sign the messages with a separate user certificate.
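
The "session-keys derived from the group-key" bullet above could be sketched with HKDF (RFC 5869) built from the standard library. The `session_key` helper, the info label and the idea of using a public per-session identifier as salt are assumptions, not part of the proposal.

```python
import hashlib
import hmac
import os


def hkdf_sha256(key_material: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF with SHA-256: extract a pseudorandom key, then expand it."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA256")
    if not salt:
        salt = bytes(32)
    prk = hmac.new(salt, key_material, hashlib.sha256).digest()
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def session_key(group_key: bytes, session_id: bytes) -> bytes:
    """Per-session key; session_id is a hypothetical public value agreed per session."""
    return hkdf_sha256(group_key, salt=session_id, info=b"example e2e session key")


group_key = os.urandom(32)
key_monday = session_key(group_key, b"session-0001")
key_tuesday = session_key(group_key, b"session-0002")
print(key_monday != key_tuesday)
```

One caveat on this sketch: anyone who later obtains the group-key and the session identifier can recompute every session key, so derivation alone does not give forward secrecy. That would need something like the per-session DH handshake mentioned earlier in the thread.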


@yanmaani yanmaani commented Nov 1, 2021

I understand that doing this "nicely", with proper, seamless, key exchange would be a lot of work. However, I think there would be a lot of value in doing it with key exchange out of band:

  1. I join a room with my mates, Bob and Alice
  2. Out-of-band, e.g. in a Matrix room, we agree to use the encryption key dab427091518b7fa7ee9a18c408cd7ff068bb16b0bbfc39596fe4e4b0e7967f2
  3. I press the button "enable encryption" and enter the key
  4. My client now encrypts all outgoing voice packets with the key. To whoever doesn't have my key, I am just sending them garbage, so I am muted.
  5. Everyone I'm trying to receive voice from who isn't encrypting with my key also gets discarded
  6. If Alice and Bob entered the same key, we can talk
  7. We have PFS because we're negotiating a new session key each time
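
Steps 4 to 6 above (encrypt outgoing packets, discard anything that does not verify under the shared key) could be sketched as follows. To stay standard-library only, this uses an HMAC-SHA256 keystream with encrypt-then-MAC as a stand-in; a real client would presumably use a vetted AEAD such as AES-GCM or ChaCha20-Poly1305. The packet layout and function names are assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional

NONCE_LEN = 12
TAG_LEN = 32


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive separate encryption and MAC keys from the pre-shared key."""
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(enc_key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC over nonce and block counter (stand-in for a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(enc_key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def seal(key: bytes, voice_frame: bytes) -> bytes:
    """Encrypt one voice frame as nonce + ciphertext + tag."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(voice_frame))
    ciphertext = bytes(a ^ b for a, b in zip(voice_frame, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_or_discard(key: bytes, packet: bytes) -> Optional[bytes]:
    """Return the frame, or None if the packet was not sealed with our key."""
    if len(packet) < NONCE_LEN + TAG_LEN:
        return None
    nonce = packet[:NONCE_LEN]
    ciphertext = packet[NONCE_LEN:-TAG_LEN]
    tag = packet[-TAG_LEN:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # sender is effectively muted for us
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


# The example key from step 2, and a different key for someone who did not enter it.
shared = bytes.fromhex("dab427091518b7fa7ee9a18c408cd7ff068bb16b0bbfc39596fe4e4b0e7967f2")
outsider = os.urandom(32)

packet = seal(shared, b"opus frame bytes")
print(open_or_discard(shared, packet))
print(open_or_discard(outsider, packet))
```

The server only ever relays `packet`, and a participant without the key gets `None` back, which matches the "I am muted" behaviour described in step 4.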

This could be slightly improved at very little cost by having some UI like "Yanmaani would like to enable encryption; key ID = a22943da1aba66c40dc1c6dcc8a29b3d" (where that's a truncated/salted hash of the session key), and only actually beginning to transmit encrypted when all the other room members have turned on the encryption properly.
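
A minimal sketch of that key ID and the "wait until everyone is ready" check, assuming the salt is a public random value sent along with the request and that clients announce their key IDs through the server. All names here are hypothetical.

```python
import hashlib
import os


def key_id(session_key: bytes, salt: bytes, hex_chars: int = 32) -> str:
    """Truncated, salted hash of the key; safe to show in the UI or send via the server."""
    return hashlib.sha256(salt + session_key).hexdigest()[:hex_chars]


def everyone_ready(my_key_id: str, announced: dict, members: list) -> bool:
    """True once every room member has announced the same key ID as ours."""
    return all(announced.get(member) == my_key_id for member in members)


salt = os.urandom(16)           # public, chosen by whoever requests encryption
session_key = os.urandom(32)    # the secret agreed out of band
mine = key_id(session_key, salt)

announced = {"Alice": mine, "Bob": key_id(os.urandom(32), salt)}
print(everyone_ready(mine, announced, ["Alice", "Bob"]))  # Bob entered a different key

announced["Bob"] = mine
print(everyone_ready(mine, announced, ["Alice", "Bob"]))  # now safe to start transmitting
```

This relies on the key being a long random value, as in the example above. If the shared secret were a short passphrase, a published hash of it could be guessed offline.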

Lots of other small UX improvements you could make like that, without actually implementing any "big" protocol.

And then if this is implemented and works, someone could maybe start looking at key exchange more properly, or going the Unix way and having that done by a separate, external daemon. But for me, just having this simple though ugly system of pre-sharing a key would be extremely useful for my personal needs.

