Add end-to-end crypto? #1813

mkrautz opened this issue Sep 16, 2015 · 10 comments



commented Sep 16, 2015

[Originally from]

@ioerror says:

A mumble server is able to wiretap every mumble client. It would be nice if the content of audio (and chat text per channel) was not available to a passive collection process. It should be possible for every client to do at least a pairwise ZRTP, if not a group ZRTP (pairwise, each pair?) as I think Silent Circle does for group calls.

SilentCircle has this group calling feature but it isn't Free Software. I've heard (but not used) that Jitsi has this with ZRTP and it is Free Software.

This would be an amazing feature and it would mean that a server would essentially just be a relay for encrypted data with minimal metadata (eg: user name, ip address, channel joined).

@Tea23 says:

As a pretty heavy Mumble user I'd just like to express some support for this idea. End-to-end crypto in Mumble would bring it up to speed with OTR, adding vital private voice comms to the landscape which at the moment are pretty scarce and certainly none of them have the feature set of Mumble.

At the moment, verifying identities with Mumble is pretty difficult since the certificate handling is pretty shoddy, and if there really is no end-to-end encryption then its usefulness as a safe tool for private communication is severely limited.

Mumble's probably in the best position of any VoIP stack to provide an effective privacy suite.


Member Author

commented Sep 16, 2015

I think piggybacking on ZRTP could work nicely.

My main concerns would be bandwidth requirements for group conversations, where, if we pairwise ZRTP between all parties, it could quickly grow into something unreasonable. But, if it's how Silent Circle does it, and it's expected that E2E VoIP group chat sends voice streams to each individual in the conversation, then I suppose it's not too bad...
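
To put rough numbers on the bandwidth concern: with pairwise encryption a speaker has to upload one separately encrypted copy of the stream per listener, whereas with a key shared by the whole group a single copy can be relayed by the server. A back-of-the-envelope sketch follows; the 40 kbit/s per-stream bitrate and the function names are assumptions for illustration, not measured Mumble values or existing APIs.

```python
def uplink_streams(participants: int, pairwise: bool) -> int:
    """Number of encrypted copies a single speaker has to upload."""
    if participants < 2:
        return 0  # nobody to talk to
    # Pairwise: one copy per other participant. Shared key: one copy in total.
    return participants - 1 if pairwise else 1


def uplink_kbits(participants: int, pairwise: bool, stream_kbits: float = 40.0) -> float:
    """Speaker uplink in kbit/s; stream_kbits is an assumed per-stream bitrate."""
    return uplink_streams(participants, pairwise) * stream_kbits


if __name__ == "__main__":
    for n in (2, 5, 10, 25):
        print(n, uplink_kbits(n, pairwise=True), uplink_kbits(n, pairwise=False))
```

The pairwise cost grows linearly with channel size for each speaker, which is what makes it uncomfortable for large, idle-heavy channels.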

I'm curious how you expect this to work from a user's and UI perspective. Would E2E mode be something you go into explicitly (I'm assuming it has to be, because of SAS auth), and your current channel is then in E2E mode? Also, what happens when new people join?

ZRTP is obviously more suited for a phone-like structure, where you explicitly make a call to a person or group of people.

So maybe entering E2E mode is equivalent to selecting people currently connected to the server, and launching a separate "E2E" channel for those people. Then it'd be completely separate from the channel-based communication that Mumble currently uses, and more like a "call". For example, it could open the E2E-enabled conversation in a new tab, so that it is obvious to users that it's a separate "call".

But then again, it also feels weird to have this be something you opt into explicitly...

How to implement this in a usable way within the scheme Mumble currently uses seems to be one of the bigger hurdles to overcome. I'm curious if you guys have any ideas in mind already...



commented Sep 17, 2015

I really dislike having to introduce "calls" semantics into what is by nature group-chat software just so we can fit ZRTP's underlying assumptions. Here's how I see it:


  • We already have public key material in the form of our certificates, which gives us some straightforward ways to encrypt whispers in a way that isn't susceptible to passive eavesdropping.
  • In general, to protect against MITM, simply relying on our existing key material seems like the most logical approach. It would be a natural fit with our friends system, which already uses certificate hashes and works cross-server.
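
As a minimal sketch of what relying on the existing key material could look like: pin a hash of each friend's certificate and check whatever certificate is presented against those pins, independent of which server relayed it. The function names, the use of SHA-256, and the placeholder certificate bytes are assumptions for illustration; this is not Mumble's actual friends implementation.

```python
import hashlib
import hmac
from typing import Dict, Optional


def cert_hash(der_bytes: bytes) -> str:
    """Hex digest of a DER-encoded certificate (SHA-256 is an assumption here)."""
    return hashlib.sha256(der_bytes).hexdigest()


def lookup_friend(friends: Dict[str, str], presented_der: bytes) -> Optional[str]:
    """Return the friend's name if the presented certificate matches a pinned hash."""
    presented = cert_hash(presented_der)
    for pinned, name in friends.items():
        # Constant-time comparison of the two hex digests.
        if hmac.compare_digest(pinned, presented):
            return name
    return None


if __name__ == "__main__":
    alice_cert = b"placeholder DER bytes for Alice's certificate"  # hypothetical data
    friends = {cert_hash(alice_cert): "alice"}
    print(lookup_friend(friends, alice_cert))
    print(lookup_friend(friends, b"certificate substituted by a server"))
```

A key that is only used after such a check succeeds cannot be silently swapped by the server, which is the MITM property the list above is after.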


  • A basic property of group chat is that you have to trust all participants anyway, as any of them can leak the whole conversation. As such, multiple streams are a waste of bandwidth.
  • Unless I'm missing something, a shared secret is clearly the way to go there. How that shared secret is managed and negotiated is a more difficult issue.
  • However, we wouldn't have to solve that immediately, as allowing groups to set a secret exchanged out-of-band is a natural first stage of such a feature anyway. You could even have that be part of proper passworded channels, with the server enforcing entry restrictions without actually knowing the password.
  • To have groups negotiate a shared secret or enforce regular re-keying (mumblers like to idle) is a bit trickier. You could do something simple with a kind of mini web-of-trust derived from friend relationships. The user would have to request the secret from a friend or someone already in said channel.
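
One way the passworded-channel idea could work, sketched under assumptions: stretch the out-of-band password into a master secret, then derive two independent values from it, an audio key that stays on the clients and an entry token that is the only thing the server ever sees. The function name, labels, iteration count, and the use of the channel name as salt are all hypothetical choices, not an existing Mumble mechanism.

```python
import hashlib
import hmac
from typing import Tuple


def derive_channel_secrets(password: str, channel_id: str) -> Tuple[bytes, bytes]:
    """Derive (audio_key, entry_token) from an out-of-band channel password."""
    # Stretch the password; the channel identifier doubles as the salt (an assumption).
    master = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), channel_id.encode("utf-8"), 100_000
    )
    # Different labels give independent outputs: knowing one does not reveal the other.
    audio_key = hmac.new(master, b"audio-key", hashlib.sha256).digest()
    entry_token = hmac.new(master, b"entry-token", hashlib.sha256).digest()
    return audio_key, entry_token


if __name__ == "__main__":
    audio_key, entry_token = derive_channel_secrets("correct horse", "Root/Private")
    # The client would send only entry_token to the server when joining.
    print(entry_token.hex())
```

The server can compare the token to enforce entry without holding the audio key. It could still try to guess a weak password offline from the token, so this only helps with reasonably strong secrets.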


  • All of this gets more complicated if you want PFS (which we would likely want at some point or another if we bother to do E2E crypto in the first place). In that case we actually need the users to do a DH handshake for each session (aka a call :( ). At least that handshake can be protected by the cert...
  • I'm not sure if we can benefit from the key continuity idea. A Mumble account isn't bound to one client, so how would we make sure we have the cached key material available?
  • I guess doing the SAS for the first ever DH could be beneficial usability-wise compared to having to compare certificate hashes (though the latter would be stronger from what I understand).
  • Relying on the certificates also allows things like "trust everyone signed by the same CA as me" or "CA x, y and z", which might be interesting for groups signing their own certificates. They could be automatically secured without any additional configuration or interaction.
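
For readers unfamiliar with the DH-plus-SAS flow mentioned above, here is a toy sketch: each side generates a fresh key pair per session, both compute the same shared secret, and a short authentication string derived from it is read aloud to detect a man in the middle. The group parameters are deliberately tiny and insecure, and the SAS derivation is a simplification that omits the hash commitment ZRTP uses; all names are illustrative assumptions.

```python
import hashlib
import secrets
from typing import Tuple

# Toy parameters for illustration only: a 127-bit prime is far too small for real use.
P = 2**127 - 1
G = 3


def keypair() -> Tuple[int, int]:
    """Fresh (private, public) pair; a new pair per session is what provides PFS."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    return private, pow(G, private, P)


def shared_secret(own_private: int, peer_public: int) -> int:
    """Both sides arrive at G^(a*b) mod P."""
    return pow(peer_public, own_private, P)


def sas(secret: int, pub_a: int, pub_b: int, digits: int = 4) -> str:
    """Short authentication string bound to the secret and both public values."""
    low, high = sorted((pub_a, pub_b))  # order-independent, so both sides agree
    material = b"|".join(str(v).encode("ascii") for v in (secret, low, high))
    value = int.from_bytes(hashlib.sha256(material).digest()[:4], "big") % 10**digits
    return f"{value:0{digits}d}"


if __name__ == "__main__":
    a_priv, a_pub = keypair()
    b_priv, b_pub = keypair()
    print(sas(shared_secret(a_priv, b_pub), a_pub, b_pub))
    print(sas(shared_secret(b_priv, a_pub), b_pub, a_pub))  # same digits on both ends
```

If an attacker substitutes their own public values in transit, the two ends derive different secrets and the spoken digits will almost certainly differ.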

In any case, implementing such a system is a lot of work. Making sure it is safe and provides all the guarantees we think it does is even more work. Such features will also have an impact on client complexity, backwards compatibility, as well as what server capabilities we can provide in the future (e.g. what about positional audio?).

Imho: There's already specialized software you can use if you want E2E-encrypted communication, whose design can focus on that sole purpose. Bolting it onto Mumble, even though not quite a Frankenstein, might really be a bit out of scope for us. However interesting it is to think about this stuff...



commented Dec 21, 2017

This is a very important feature for some of my team chats, +1



commented Dec 19, 2018

In 2018 I think this is a major concern and should be implemented.



commented Aug 3, 2019

Over time it became necessary to get this feature. A trade-off may be using more bandwidth but nowadays this is a minor issue.

@Mikaela Mikaela referenced this issue Aug 11, 2019


commented Aug 25, 2019

Just to add something to the list to consider.

Signal messenger added a new protocol called RingRTC.

I think it's not yet open source, but I'm sure it will be soon.



commented Aug 26, 2019

I have been toying with the idea of making Mumble into a prototype Matrix client.
Since rooms can contain arbitrary data (and are just a persistent, distributed graph database), they could be made to store the channel tree. Channels could be shared between multiple rooms, etc.

Ephemeral Matrix rooms are already created (that I know of) upon placing a call, so channels could just be hidden as well, and made to contain the necessary metadata.
I think it could be worthwhile to pursue that goal, although it could be more suitable to open a separate issue for the feature instead of discussing it here.

The idea would be to piggy-back on the E2E ecosystem that Matrix already creates (including verification, extensible profiles, support for multiple devices -- soon with cross-signing), while adding everything that's nice about voice on Mumble. I bet a proof-of-concept wouldn't be that hard, since the Matrix Client-server protocol is quite simple.

I have been thinking about this for a while, but haven't found enough time to start implementing this. If you want, I could elaborate a bit more.



commented Aug 26, 2019

I have been toying with the idea of making mumble into a prototype Matrix client.

I think it could easily be a step in the wrong direction, assuming you mean While 1:1 Riot calls are end-to-end encrypted, conference calls depend on Jitsi Meet. is currently also bad for privacy: while Mumble currently doesn't store chat logs, the reference homeserver implementation for Matrix, Synapse, stores everything forever (matrix-org/matrix-doc#447), including deleted messages. Multiple other privacy issues are listed at privacytoolsIO/

I don't know what you would do to Murmur, but I have understood Synapse to be very heavy especially if you have bigger rooms, while Murmur seems to run anywhere and there is µMurmur for even more limited systems.



commented Aug 26, 2019

assuming you mean

That's indeed the protocol documented on I am talking about (though not's Matrix server instance, to disambiguate).

I think it could easily be a step in the wrong direction,

This is indeed something that has to be cleared up. And the privacy points you mentioned are indeed concerning. Synapse should get better at this itself, but this is a lesser concern if the participants are on the same homeserver, more so if E2E is enabled.

I am not sure we really should be discussing the merits of Matrix here, as I wouldn't want to completely derail the thread. I am not advocating for completely replacing Murmur here, but to be able to connect to Matrix servers with Matrix identifiers, and optionally use that as a backend for what Murmur is currently used for.
We do not have to use Jitsi. There is no requirement of interoperability with the call functionality; I rather see it as being orthogonal. However, a complete P2P implementation would be a requirement if we want to avoid changing Synapse (the server's TURN and STUN config can be reused) while not depending on Murmur.

Actually, I was thinking more of a proof-of-concept where a Matrix room would just contain a Mumble server address, with an authentication mechanism for connecting (publish the public key in the room, for instance).
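
To make the proof-of-concept idea a bit more concrete, here is a rough sketch of what such a room entry might look like as a Matrix state event (which carries a `type`, a `state_key`, and a `content` object). The event type `org.example.mumble.server`, the field names, and the host are made up for illustration; nothing like this is defined by Matrix or Mumble.

```python
import json
from typing import Tuple

EVENT_TYPE = "org.example.mumble.server"  # hypothetical custom event type


def make_server_event(host: str, port: int, cert_sha256: str) -> dict:
    """Build a state event advertising a Mumble server and its certificate hash."""
    return {
        "type": EVENT_TYPE,
        "state_key": "",
        "content": {"host": host, "port": port, "cert_sha256": cert_sha256},
    }


def parse_server_event(raw: str) -> Tuple[str, int, str]:
    """Extract (host, port, cert_sha256) from a JSON-encoded event of that type."""
    event = json.loads(raw)
    if event.get("type") != EVENT_TYPE:
        raise ValueError("not a Mumble server event")
    content = event["content"]
    return content["host"], int(content["port"]), content["cert_sha256"]


if __name__ == "__main__":
    # 64738 is Mumble's default port; the host and hash are placeholders.
    raw = json.dumps(make_server_event("voice.example.org", 64738, "ab" * 32))
    print(parse_server_event(raw))
```

A client joining the room could read this event, connect to the advertised server, and refuse to proceed if the server's certificate does not match the published hash.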

As I see it, it could be something completely bolted on the Matrix protocol, with little modification (for starters, at least) to either Mumble or Murmur.


  • Streamlined onboarding? (join a Matrix room)
  • Support for more clients (if a Matrix client "just" has to implement some parts of Mumble), at least for text
  • Possibility of a Murmur-less environment if Mumble is adjusted to do so (and to facilitate the above)
  • Build on the progress being made in Matrix ecosystem: persistent chat history with asynchronous messages, bridges, and later E2E crypto with cross-signing, distributed identities, etc.
  • Do not reinvent the wheel with the above!


  • Adjustments needed for Mumble (long term, websockets support would be nice for web compatibility)
  • I feel a bit bad about Murmur, but Matrix is a superset for a lot of chat protocols, so it could always be "the" lightweight implementation.

Maybe the proper way to build what I describe here is a "hard" fork (which could always be merged back), after all, if all of this doesn't fall within the scope of Mumble. But then the same could be argued about E2E...
I'm just trying to collect some feedback here, and sharing my thoughts :)



commented Sep 1, 2019

I just wanted to add that it needs to be possible for clients to verify their conference partners' fingerprints, in case the server is handing out wrong public keys to the user.
