New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add end-to-end encryption for push notifications #15229
Conversation
So, this is a fine start to the encryption part, but we first need to change the data model, likely in a separate PR, to create a |
67c36dd
to
102d311
Compare
5c18f5c
to
51e343f
Compare
bc499a6
to
6172f47
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @hashirsarwar I'm leaving a few change requests for the first few commits right now so we can keep this project moving. I'll add some follow up reviews on the second half of commits, the tests, etc. soon but separately since they will be a bit heavier (I don't want to suggest too many changes at once).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left just a few small change requests - nothing major. What I think we should do now is:
- Finish discussing the details of the refactor I mentioned (in PMs)
Actually perform the refactor.(if you're not comfortable with it then fine, stick to whatever is personally comfortable)- Look at the failing test and get that fixed.
4.Give this PR one more round of review and look more carefully at the tests you've added. - Get Tim Abbott and Greg Price involved for some rounds of final review.
- Get this merged.
d69ac6d
to
830d4cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few small change requests but otherwise this is pretty much ready for review by others, so I'm going to go ahead and approve it. There's still the issue of deciding if the discussed refactor needs to be made or not. But that's not going to get resolved until after the upcoming release AFAIK. So we'll wait until the release is over then get Tim Abbott and Greg Price to review this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hashirsarwar . Discussion in chat, and a few comments below.
f2ddba6
to
0c1515a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments below. Most importantly: this actually leaves much of the data unencrypted.
# AESGCM.encrypt() expects data in bytes and returns ciphertext | ||
# bytes with the 16 byte tag appended. | ||
encrypted_data = aesgcm.encrypt(nonce, data.encode('utf-8'), None) | ||
return bytes_to_b64(encrypted_data), bytes_to_b64(nonce) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly -- why base64 here?
I'd prefer to handle things in their natural form, and encode as base64 or whatever only where needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to send this information i.e. encrypted_data
and nonce
to other devices via bouncer. Sending these fields as bytes will cause the following exception when we try to covert encrypted_payloads
to JSON format.
TypeError: Type is not JSON serializable: bytes
Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. At the moment when we're sending them over the network, they need to be encoded somehow, and base64 is a fine choice there. But the best point to do that encoding is just before we send them over the network, so that inside our code they can have their natural form, which is a plain bytes
.
zerver/lib/encryption.py
Outdated
# Make sure the generated key doesn't collide with an existing key. | ||
if PushDeviceToken.objects.filter(notification_encryption_key=key_str).count() == 0: | ||
return key_str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation for this bit?
If we generate the same key twice... something is very badly wrong already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that AESGCM.generate_key
can generate the same keys twice though the chances are very little. We are not generating the encryption key through our code, instead we are relying on AESGCM.generate_key
to do so.
This change was suggested by @Hypro999 in a resolved conversation -- so he might want to add something here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really have anything to add here. This was just added as extremely defensive code. We can remove this if we want to, since it involves a DB query for a highly unlikely event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let's take this out.
In the absence of a bug in how we generate the key (including in the system's random-number generator which we'll be relying on), the probability of a collision is far smaller than the probability of, say, the system going completely wild due to a hardware failure. Even if there are a billion records here, the probability of a collision is less than 1/2^68. The probability that the machine will take a direct hit from a meteor is several orders of magnitude bigger than that.
If OTOH there is a bug in the key generation... this would not be an effective way to detect that (and the Zulip app in general isn't the right layer to be attempting to defend against such a thing anyway.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
def send_android_encrypted_push_notification(encrypted_data: EncryptedData, | ||
options: Dict[str, Any], remote: bool=False) -> None: | ||
for device, payload in encrypted_data: | ||
send_android_push_notification([device], payload, options, remote) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why a loop here? We discussed this in chat in a previous round. A single request with an encrypted_payloads
field would make the code simpler. It would also make it easier to manage the case where a Zulip server for some reason ends up with lots of device records for a given user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that approach would end up having us send a single large payload, rather than several small ones, in that buggy case? Probably better since there's a high minimum latency, assuming we don't risk hitting a low limit for how large a payload can be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, this isn't in the path leading to the bouncer and is only used when actually talking to FCM. I was thinking of the request the app server makes to the bouncer.
When actually talking to FCM or APNs, we have to make a separate request per distinct notification-message payload anyway, which means one per device when they're encrypted. So the loop here is appropriate -- it should just be inlined rather than being its own function, as mentioned in another comment.
def send_android_encrypted_push_notification(encrypted_data: EncryptedData, | ||
options: Dict[str, Any], remote: bool=False) -> None: | ||
for device, payload in encrypted_data: | ||
send_android_push_notification([device], payload, options, remote) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a matter of style and code clarity, if we do have this loop, the code would be easier to read if it's simply inlined at the one place this function is used -- this function doesn't really add anything. (But as mentioned in a separate comment, we should arrange the internal API so that we don't need this loop at all.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
@hashirsarwar can you also respond individually to the comments I posted? It wasn't clear to me whether your last push is expected to address them. |
The last push only addressed one of the comments. I am working on the rest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The discussion here above caused me to look closer at the bouncer side of this and at the device-registration side. Some comments on those below. In particular there are some issues at registration time that are essential to fix.
zerver/lib/push_notifications.py
Outdated
def prepare_encrypted_payload(data: EncryptedData) -> EncryptedPayload: | ||
payload = [] | ||
for device, content in data: | ||
payload.append((device.token, content)) | ||
|
||
return payload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be much simpler and clearer as a list comprehension, inlined at its one call site.
zerver/lib/push_notifications.py
Outdated
apns_payload, | ||
encrypt_apns_payload) | ||
|
||
encrypted_payloads = prepare_encrypted_payload(encrypted_gcm_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is completely missing the data for Apple devices.
def prepare_encrypted_data(devices: List[RemotePushDeviceToken], | ||
payload: EncryptedPayload) -> Tuple[EncryptedData, EncryptedData]: | ||
device_map = {d.token: d for d in devices} | ||
encrypted_android_data = [] | ||
encrypted_apple_data = [] | ||
for token, data in payload: | ||
if token in device_map: | ||
if device_map[token].kind == RemotePushDeviceToken.GCM: | ||
encrypted_android_data.append((device_map[token], data)) | ||
else: | ||
encrypted_apple_data.append((device_map[token], data)) | ||
|
||
return encrypted_android_data, encrypted_apple_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code can be made a lot simpler by taking out the if device_map[token].kind == …
conditional and producing just a single list. Then it can just be a list comprehension, which can just be inlined at the one place it's used.
Then instead of having separate functions send_android_push_notifications
and send_apple_push_notifications
which are just simple loops that are very similar to each other, those two can be a single loop, with the if device.kind == …
conditional inside that loop.
zilencer/views.py
Outdated
encrypted_android_data, encrypted_apple_data = prepare_encrypted_data(android_devices + apple_devices, | ||
encrypted_payloads) | ||
|
||
android_devices = [d for d in android_devices if not d.encrypted] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new value has a quite different meaning from the old value called android_devices
. It should therefore have a different name which expresses that different meaning.
(Same for apple_devices
below.)
Alternatively, this list comprehension could go directly in the argument to the function call, rather than give it a name at all.
zilencer/models.py
Outdated
@@ -32,6 +32,7 @@ class RemotePushDeviceToken(AbstractPushDeviceToken): | |||
server: RemoteZulipServer = models.ForeignKey(RemoteZulipServer, on_delete=models.CASCADE) | |||
# The user id on the remote server for this device device this is | |||
user_id: int = models.BigIntegerField(db_index=True) | |||
encrypted: bool = models.BooleanField(default=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this name; it sounds like it's saying the device is encrypted (whatever that would mean).
One good name could be encrypt_notifications
-- same as in the request the client makes when registering its token.
Another could be require_encryption
-- reflecting the effect of the flag, which is to say "don't send plaintext messages here."
6412287
to
5d5df13
Compare
@@ -32,6 +32,7 @@ class RemotePushDeviceToken(AbstractPushDeviceToken): | |||
server: RemoteZulipServer = models.ForeignKey(RemoteZulipServer, on_delete=models.CASCADE) | |||
# The user id on the remote server for this device device this is | |||
user_id: int = models.BigIntegerField(db_index=True) | |||
encrypt_notifications: bool = models.BooleanField(default=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is confusing -- this field represents whether notifications ARE encrypted; the push bouncer shouldn't care. We should add a comment explaining it and also pick a better name. (I'm not sure actually what this is needed for?)
5d5df13
to
7a8b247
Compare
Heads up @hashirsarwar, we just merged some commits that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the |
7a8b247
to
4879f71
Compare
Heads up @hashirsarwar, we just merged some commits that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the |
4ec3636
to
88b200c
Compare
Closing in favor of the more current #26262 for this issue. |
Closes #6954.