Key distribution is the hardest problem in end-to-end encryption: how can Alice fetch the keys she needs to encrypt messages that only Bob can read? Traditionally, OpenPGP deals with this problem by means of a decentralized key distribution system called the Web of Trust, in which users trust their friends' friends. This requires a significant amount of work from the user, and is a hard concept for average users to grasp.
During the past couple of decades, other key distribution and verification systems became wildly popular, such as the one used by browsers for TLS. These have a fatal flaw, however: they require the user to trust a large number of Certificate Authorities from all around the world, and we've seen some of these Certificate Authorities compromised by attackers. As a result, alternative systems were created on top of the PKI model with the goal of mitigating this problem, such as Convergence and Certificate Transparency.
For End-To-End, our current approach to key distribution is to use a model similar to Certificate Transparency, and to use the email messages themselves as a gossip protocol, which allows the users themselves to keep the centralized authorities honest. With this approach users don't have to know about keys, but can still make sure that the servers involved aren't doing anything malicious behind their backs.
To allow the system to be easily distributed (across multiple identity providers), key servers can authenticate the user via existing federated identity protocols (with OpenID Connect, for example). The model of a key server with a transparency backend is based on the premise that a user is willing to trust the security of a centralized service as long as it is subject to public scrutiny and any compromise can be easily discovered (so it is still possible to compromise the user's account, but the user will be able to find out as soon as possible).
It's worth noting that End-to-End is still under active development, and we might change our approach to key distribution if we find weaknesses in this model, or if we find something else that is as easy to use, and as likely to work. Part of the reason we release this document is to seek early feedback from the community, and adapt as needed.
We also want to point out that we will do our very best to continue to support existing OpenPGP users who want to manage and verify keys and fingerprints manually, as we understand that system has been around for a long time and has been more battle-tested than what we are proposing.
For the purpose of this description, we assume Alice wants to send an encrypted message to Bob.
As a first step, Bob needs to register with a Key Directory, which could be operated by Bob’s email provider. Google will also operate a Key Directory. This registration is analogous to Bob listing himself in a phone directory. The way it works is that his email provider (which has to support some federated identity protocol) automatically sends the Key Directory a signature that identifies him.
Once that's done, the Key Directory publishes Bob's email address and a Public Key (which is what anyone can use to send Bob encrypted emails). The special thing about this Key Directory is that whatever is written in it can never be modified: it's impossible to change anything there without either bringing the service down or revealing the change to everyone who is watching.
One more thing that happens when Bob registers is that independent third parties, which we call Monitors, inspect all entries in the Key Directory to make sure they are valid. The Key Directories themselves are supposed to check that they only add valid data to the directory, but in case a Key Directory has a bug or is compromised, the Monitors double-check just to be sure (trust, but verify ;).
Now that Bob has registered with his Key Directory, Alice will try to email him. To do so, Alice contacts her local Key Directory and obtains Bob's public key. She also obtains a proof from the Key Directory that demonstrates that the key is in the permanent append-only log, and then simply encrypts her message to that key.
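The proof Alice checks can be sketched as a Merkle audit path in the style used by Certificate Transparency. The following is only an illustration, assuming SHA-256 and the leaf/node prefix bytes defined in RFC 6962. The function names (`merkle_root`, `audit_path`, `verify_inclusion`) and the sample entries are hypothetical and are not taken from the End-To-End code.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # RFC 6962 prefixes leaves with 0x00 so a leaf can never be confused with a node.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Interior nodes are prefixed with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly less than n (for n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: list) -> bytes:
    """Root hash over a non-empty list of log entries (what an STH commits to)."""
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def audit_path(index: int, leaves: list) -> list:
    """Sibling hashes from the leaf at `index` up to the root (directory side)."""
    if len(leaves) == 1:
        return []
    k = split_point(len(leaves))
    if index < k:
        return audit_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return audit_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def root_from_path(index: int, size: int, leaf_h: bytes, path: list) -> bytes:
    """Recompute the root from a leaf hash and its audit path (client side)."""
    if size == 1:
        if path:
            raise ValueError("proof too long")
        return leaf_h
    if not path:
        raise ValueError("proof too short")
    k = split_point(size)
    if index < k:
        return node_hash(root_from_path(index, k, leaf_h, path[:-1]), path[-1])
    return node_hash(path[-1], root_from_path(index - k, size - k, leaf_h, path[:-1]))


def verify_inclusion(entry: bytes, index: int, size: int, path: list, root: bytes) -> bool:
    """True if `entry` is the leaf at `index` in a log of `size` entries with this root."""
    try:
        return root_from_path(index, size, leaf_hash(entry), path) == root
    except ValueError:
        return False


# Hypothetical directory contents: "email:key" records, oldest first.
log = [b"alice@example.com:keyA", b"bob@example.com:keyB", b"carol@example.com:keyC"]
root = merkle_root(log)
proof = audit_path(1, log)
print(verify_inclusion(log[1], 1, len(log), proof, root))
```

In this sketch Alice only needs Bob's entry, its position, the log size, and the short list of sibling hashes. She never downloads the whole directory, which is what keeps the check cheap for a mail client.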
Within the message to Bob, Alice includes a super-compressed version of the Key Directory that fits in 140 characters (called an STH, which stands for Signed Tree Head). Anyone can use this super-compressed version later on to confirm that the version of the Key Directory that Alice saw is the same one they see (she will actually include an STH for every Key Directory she knows about).
Key Directories that are managed by Identity Providers can synchronize with each other and exchange the entries each one is missing; this is called Peering. It is also possible for a Key Directory to fetch a key for a given email address from some authoritative source and import it into the directory during a lookup operation.
When Bob receives the message from Alice, he decrypts it and verifies Alice's signature (Bob might need to check with the Key Directory to verify the authenticity of Alice's message). Bob also checks that the super-compressed version of the directory that Alice sent is the same as, or at least an older version of, the directory that Bob sees today.
If Bob is unable to contact the Key Directory, then on every message he sends from that day on, he will include the super-compressed directory Alice sent him, so that eventually, someone that can contact the key directory can verify it's valid and consistent.
Bob will also communicate the super-compressed directory to third parties (Monitors), and to other clients (for example, a browser or a chat client) that can then gossip with HTTPS servers or other chat users. This makes it so hard to hide a compromised key directory that doing so would almost require shutting down all internet access to the targeted victim.
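One way to picture what gossip buys us: any party that collects STHs from several sources can look for two heads that claim the same directory and the same size but commit to different contents. The sketch below covers only that simplest case. It assumes signatures have already been verified, and it does not attempt the consistency check needed to compare heads of different sizes. The `SignedTreeHead` fields and `find_conflicts` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedTreeHead:
    directory: str    # which Key Directory issued this head
    tree_size: int    # number of entries in the log when it was signed
    root_hash: bytes  # Merkle root over those entries
    # A real STH also carries a timestamp and a signature; omitted in this sketch.


def find_conflicts(sths):
    """Return pairs of STHs with the same directory and size but different roots.

    Any such pair is evidence that the directory showed different contents
    to different users.
    """
    first_seen = {}
    conflicts = []
    for sth in sths:
        key = (sth.directory, sth.tree_size)
        earlier = first_seen.setdefault(key, sth)
        if earlier.root_hash != sth.root_hash:
            conflicts.append((earlier, sth))
    return conflicts


# Hypothetical heads gathered from Alice's email, a Monitor, and a chat client.
from_alice = SignedTreeHead("keys.example.com", 1024, b"\x11" * 32)
from_monitor = SignedTreeHead("keys.example.com", 1024, b"\x11" * 32)
from_chat = SignedTreeHead("keys.example.com", 1024, b"\x22" * 32)
print(len(find_conflicts([from_alice, from_monitor, from_chat])))
```

Because each head is signed by the directory, a conflicting pair can be handed to anyone else as proof of misbehavior, which is why the more channels carry STHs, the harder a split view is to sustain.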
Revocation is the act of marking a key as invalid (either because it was stolen or because it was lost). In the Certificate Transparency model, all your keys (even old revoked keys) stay in the key directory (remember, it's append-only!). The Key Directory proves that you have the latest version with what we call a Verifiable Map, which simply allows it to demonstrate that the latest key for Bob is actually the one returned by the Key Directory.
In addition, all entries in the key directory for a given user are linked to each other, so every new key must point to the most recent key before it. This allows the user to quickly enumerate all his keys (even old keys) in the log with a simple query, and allows the monitors to verify that the keys the user thinks exist are all the keys that actually exist.
In this way, a user will always know which key is his most recent, and which keys his account has ever had since the beginning of time. Or at least, since he started using End-To-End.
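The linking of a user's entries can be sketched as a simple hash chain, in which each entry records the hash of the one before it. This is an illustration under stated assumptions, not the directory's actual record format: `KeyEntry`, `entry_hash`, `append_key`, and `verify_chain` are hypothetical names, and the field layout is invented for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeyEntry:
    email: str
    public_key: bytes
    prev_hash: Optional[bytes]  # hash of the user's previous entry; None for the first key


def entry_hash(entry: KeyEntry) -> bytes:
    """Hash an entry with length-prefixed fields so the encoding is unambiguous."""
    h = hashlib.sha256()
    for field in (entry.email.encode("utf-8"), entry.public_key, entry.prev_hash or b""):
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.digest()


def append_key(entries: list, email: str, public_key: bytes) -> list:
    """Return a new history with one more key linked to the current latest key."""
    prev = entry_hash(entries[-1]) if entries else None
    return entries + [KeyEntry(email, public_key, prev)]


def verify_chain(entries: list) -> bool:
    """Check a user's history (oldest first): same email, each entry links to its predecessor."""
    expected_prev = None
    for entry in entries:
        if entry.email != entries[0].email or entry.prev_hash != expected_prev:
            return False
        expected_prev = entry_hash(entry)
    return True


# Hypothetical history for Bob: an initial key and two later rotations.
history = []
for key in (b"bob-key-2014", b"bob-key-2015", b"bob-key-2016"):
    history = append_key(history, "bob@example.com", key)
print(verify_chain(history))
```

With a chain like this, a Monitor that walks back from the latest entry either reaches the first key or finds a broken link, so an entry cannot be silently dropped from or spliced into the middle of a user's history.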
It is possible that, because of a bug or a compromise, a key directory might store bad entries. In that situation, the Monitors and the users would not be able to use that key directory anymore (because it is defective, malicious, or compromised). Independent third-party monitors will have to be prepared to handle those situations carefully and to work with the key directory maintainers to understand what happened. Key directory maintainers need to be as open and forthcoming as possible.
It is also possible that a key is rotated without the user's knowledge. This could indicate a compromise of the user's account (via phishing, or malware for example), or a compromise of the Identity Provider. To protect against this problem, a Monitor could notify the user whenever a key is rotated, and the user should be given a chance to revoke the key, and rotate it once again.
When we talk about the system being transparent, we mean that it is infeasible for a Key Directory to behave differently for two users. That is, everything that is put in the Key Directory has to be the same for everyone. In this model, an account compromise doesn't compromise old encrypted emails (just potentially new emails), and the user can be notified when it happens and can quickly undo it.
The model envisioned in this document still relies on users being able to keep their account secure (against phishing for example) and their devices secure (against malware for example), and simply provides an easy-to-use key discovery mechanism on top of the user's existing account. For users with special security needs, we simply recommend they verify fingerprints manually, and we might make that easier in the future (with video or chat for example).
The model we presented above seems simple, as the user doesn't have to do anything or know about key discovery, but it has some important caveats.
- The user's key directory might be offline. A client must be able to talk to multiple key directories in case one of them is offline. If all of them are offline, the user wouldn't be able to verify any STHs and wouldn't be able to send emails to users he hasn't contacted before (and could end up encrypting to revoked or expired keys). As such, keeping the service online and having multiple points of redundancy is important.
- A user could be given an out-of-date key. To solve this problem we intend to let the users gossip the key queries (email, key, STH). In the long term, we would just use verifiable maps (that provide a proof that a given key was the most current version at a given time) which are simpler.
- A user could be attacked by the key directory or his identity provider. The key directory could insert a key into the log that doesn't belong to the user. To prevent that, the user or his monitors will have to ask the key directory for his most up-to-date key as often as possible, and the client will warn the user of new or replaced keys. The warning can be omitted if the user generated the new keys (which can happen if the user enrolls a new device). It's worth noting that if the user isn't an active user of the encryption client, and hasn't signed up for notifications with monitors yet, then he wouldn't be able to detect this. To prevent this, a user could choose to register with the key directory in a way that doesn't allow unauthenticated key rotations (every new key must be signed by its previous key), but this has the inconvenience of not being able to recover after losing all copies and backups of a key.
- A user could be fooled into thinking everything is consistent. This is known as the split view problem, and is essentially an attack in which a user's entire internet access is filtered and controlled, so that all STHs and assertions provided by his contacts are malicious or controlled by a powerful adversary. This risk is real, but exploiting it is extremely hard and unlikely even for the most paranoid users, as the adversary would need to be able to MITM all gossip channels, which might include SSL sites, chat conversations, all email traffic, and so on.
- The software could have a vulnerability, the maintainer could be compromised, or some other weakness might exist. Unfortunately, this is true for all software, and even hardware. While we see some potential in using HTML5's Encrypted Media Extensions and similar technologies (even putting the plaintext out of reach of our code), and believe that the transparency model has a lot of potential for operating systems and package managers, there will always be a user trust factor no matter what code or hardware is used. In any case, we will work to minimize the trust required from users, and will always provide open source versions of all code we produce.
- Someone could pollute the key server with fake emails. This would make it significantly harder to monitor, as well as to keep the service up and running. To prevent that, common anti-abuse mechanisms could be employed by the Key Directories. Additionally, the key directory, with some support from the clients, could always "restart from scratch" if needed, as long as there are other key directories the user can use.
- Someone could harvest email addresses from the key directory. Many parts of the system require the user to include their email address. For example, OpenPGP Key Blocks need to include a self-signature that covers the user id and other metadata; if we removed the email, we would have to split the self-signature in two (one for metadata and one for identity). We might do this, but haven't committed to it yet. We would still need to store a checksum of the email, and the benefits of storing a checksum instead of the email itself are still unclear at the moment: while the checksum can be made expensive to brute-force, recovering the address would still be possible for most users.
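The trade-off in the last caveat can be illustrated with a short sketch. It assumes the checksum is a deliberately slow hash (PBKDF2 here, purely as an example) with a fixed public salt, since anyone looking up an address must be able to compute the same value. The salt, the iteration count, the case-folding of addresses, and the function names are all assumptions made for this illustration and are not decisions taken by the project.

```python
import hashlib

# Assumption: the salt is public and fixed, because Alice must be able to
# derive the same checksum from Bob's address in order to look him up.
PUBLIC_SALT = b"key-directory-demo"


def email_checksum(email: str, iterations: int = 10_000) -> bytes:
    """Slow checksum of an email address (raising iterations raises the attacker's cost)."""
    # Assumption: addresses are trimmed and lower-cased before hashing.
    normalized = email.strip().lower().encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", normalized, PUBLIC_SALT, iterations)


def recover_email(checksum: bytes, candidates, iterations: int = 10_000):
    """Dictionary attack: hash each guessed address until one matches."""
    for candidate in candidates:
        if email_checksum(candidate, iterations) == checksum:
            return candidate
    return None


# A harvester who sees only the checksum, but can guess plausible addresses.
stored = email_checksum("bob@example.com")
guesses = ["alice@example.com", "bob@example.com", "carol@example.com"]
print(recover_email(stored, guesses))
```

The slow hash only multiplies the cost of each guess. Since most addresses follow guessable patterns (common names at common domains), a harvester with a good candidate list would still recover many of them, which is the reason the benefit of storing a checksum remains unclear.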