Synchronization between multiple devices #99

Open
jfrederickson opened this issue Sep 20, 2014 · 22 comments

@jfrederickson

One of the things that seems to be missing in many serverless chat applications is a way of syncing messages/chat state between devices.

For example, I have a few laptops and a Nexus 4 running Sailfish. Ideally these would sync their message history when both of them are online, and any messages sent to me and replies I send back would be delivered to all devices.

The current Ricochet ID format is ricochet:ID. To take multiple devices into account, you could expand this by doing something like:

ricochet:device1,device2,device3

This would of course make the IDs longer, but as they're not intended to be human-readable I'm not sure that's a major issue.

For existing contacts, you would then need to send them the new ID the next time you communicate with them so that messages get sent to each device.
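As a rough illustration of the proposed `ricochet:device1,device2,device3` format, here is a minimal Python sketch of parsing and re-serializing such an ID. The function names are hypothetical and nothing here reflects Ricochet's actual code; it only shows that a comma-separated list is easy to handle.

```python
from __future__ import annotations


def parse_contact_id(contact_id: str) -> list[str]:
    """Split 'ricochet:dev1,dev2' into an ordered list of device addresses."""
    prefix = "ricochet:"
    if not contact_id.startswith(prefix):
        raise ValueError("expected an ID starting with 'ricochet:'")
    devices = [d.strip() for d in contact_id[len(prefix):].split(",")]
    if not all(devices):
        raise ValueError("empty device address in ID")
    # Keep the original order but drop repeated addresses.
    return list(dict.fromkeys(devices))


def format_contact_id(devices: list[str]) -> str:
    """Inverse of parse_contact_id (hypothetical helper)."""
    return "ricochet:" + ",".join(devices)
```

A single-device ID such as `ricochet:device1` parses to a one-element list, so existing IDs would remain valid under this scheme.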

I'm not sure what the best UI flow for adding new devices would be. Something involving a BIG DAMN WARNING that all messages sent from this device would be readable by the others.

@dhowe

dhowe commented Sep 24, 2014

This seems pretty critical in terms of usability... Am I correct that in the current implementation one would have to create/publish a new Ricochet-ID on a per-device basis?

@special
Member

special commented Sep 24, 2014

Current situation

You can (manually) synchronize your client between multiple devices by syncing the 'config' folder, maybe putting it in some kind of (encrypted!) cloud storage. You can't run the same client from more than one place at a time - behavior is undefined and likely broken if you do that.

And, of course, you can use one address for each of your devices, and add all of your contacts on each one.

Scope

There are a few different things we could call "synchronization":

  • You can sign on from any of your devices, but only one at a time
  • You can sign on from multiple devices, but messages only go to the active one (XMPP-style)
  • You can sign on from multiple devices, and all messages go to all online devices

Another interesting design question is whether it's okay for your peers to know how many devices you have, and which ones are online.

Using multiple public-facing addresses

This is what @jfrederickson proposed: you have a list of addresses, and your contacts will try to reach you at all of them. We'd need something like ricochet:device1,device2 for contact requests, and then the clients would keep their contacts up to date with the current list of addresses.

Your devices have to keep their contacts lists synchronized, along with the full list of devices in the group. You will always expose which devices you're using, and they could be monitored individually for connectivity.

Using one public-facing address

We could instead give every device the same private key, and let the first device become a "master". At startup, each device would try to connect to its own address. If that succeeded, the device would synchronize state from the master. If the master disappeared, one of the other devices would take over and start publishing the service.

We'd have to be able to relay messages from the master to the others. It would be quite complicated, and probably fragile. Maybe unreliable. There would also be no effective way to revoke a device, since they all have the master private key.

A variant of this would be to only allow one device online at a time, but have them hand off that position automatically; when your desktop goes offline, mobile comes online.

Crazy ideas

If we picked up some kind of PIR scheme for discovering contact presence (instead of polling hidden services), we can attach the current .onion name to that data. Contacts can discover your address(-es), even when they've changed.

We could get sign-on-like behavior (instead of having to sync credentials from other systems) by putting an encrypted bundle of the keys and configuration somewhere. That sounds slightly frightening, and centralized.

Thoughts? What have I missed?

@dhowe

dhowe commented Sep 25, 2014

From a UX perspective synchronization of the 3rd type would be ideal: 'sign on from multiple devices, all messages go to all online devices.' I've talked to people who use LINE (in spite of its security) just b/c of how smoothly it handles switching between devices. I haven't thought through the implications of this, but what about (optional) integration with something like https://remotestorage.io/ via Tor for configs?

@jfrederickson
Author

I would have to agree with @dhowe on this one - the third scenario is best. As a long-time XMPP user, no method I've seen to determine which device is currently the "active" one has worked quite right.

The first one would be the easiest to accomplish technically, but personally I don't want to have to worry about logging in and out on my phone whenever I'm not at my computer, and vice-versa.

I don't mind exposing my connectivity on individual devices to my direct contacts; I already do on XMPP, after all. If anyone's concerned about this, they can sync their user directory and only use one at a time. Designating one device as the "master" brings up a lot of unnecessary issues - for example, what if my "master" device is a mobile phone with a poor connection? And, of course, the revocation issue.

GNUnet has GNS, which is a DNS-like system... but that requires GNUnet, which is unfortunately rather non-portable at this point.

I don't like the idea of storing my keys in any kind of centralized service - they might be encrypted, but that would require that they be protected only by a passphrase, else you need a key to decrypt them anyway.

Ultimately what I would like to see as well would be history synchronization between devices, but before that could happen Ricochet would need a concept of multiple devices belonging to the same user.

@special special added the idea label Mar 5, 2015
@dhowe

dhowe commented Mar 31, 2015

Has there been further discussion of this issue elsewhere?

@hkparker

I've been thinking about this one recently. What if ricochet clients store contacts as "People", that contain multiple "Devices" (currently ricochet IDs). When you want to contact a person, you send a message to all Devices you have saved for that contact. If you receive a message from any Device that belongs to a Person, that message goes into the chat for that Person. There could also be an option that clients could set to request a notification, so if the message in a conversation was received recently, only the "active" peer device gets a notification, while the others just receive the chat stream silently.

This syncs messages to a contact across all the contact's devices that are currently online, but more is needed to sync your messages with your devices. The ricochet clients could also maintain a list of other Devices the user owns and wishes to sync chat history with. When a user sends a message to anyone, a copy of that message is sent to all the other Devices owned by the user. The messages are received silently and placed in the appropriate chat history.

Using the above technique, chats should be synced in real time across all devices that are online. If a device comes online later it can start keeping up with the conversation from that point on, but won't be able to access history, as there is no data storage on the network.
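The "People containing Devices" idea can be sketched in Python as below. This is only an illustration under assumed names (`AddressBook`, `outgoing_targets`); it shows an outgoing message fanning out to the contact's online devices plus the sender's own paired devices, and an incoming message from any device being mapped back to its Person.

```python
class AddressBook:
    """Hypothetical address book: each Person maps to a set of Devices."""

    def __init__(self, own_devices=()):
        self.own_devices = set(own_devices)  # my other paired devices
        self.people = {}  # person name -> set of device addresses

    def add_device(self, person, device):
        self.people.setdefault(person, set()).add(device)

    def person_for(self, device):
        """Find which Person an incoming message's device belongs to."""
        for person, devices in self.people.items():
            if device in devices:
                return person
        return None

    def outgoing_targets(self, person, online):
        """Devices that should receive a message sent to `person`:
        the contact's devices plus my own paired devices, limited to
        those currently online."""
        targets = self.people.get(person, set()) | self.own_devices
        return sorted(targets & set(online))
```

Devices that are offline are simply skipped here, which matches the point above that a device coming online later misses earlier history.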

This should be pretty easy to implement, and avoids the concept of a master device, but has some problems. It would force the user to one-time pair all their devices, which isn't that big of a deal, but it might force the user to add a contact on every device they own, which could get confusing. Also, frequently attempting to connect to devices that are offline could get noisy on the network.

Some more thought needs to go into contact management and sharing, but this could be a starting point.

@rodneyrod

@hkparker when you say

Using the above technique, chats should be synced in real time across all devices that are online. If a device comes online later it can start keeping up with the conversation from that point on, but won't be able to access history, as there is no data storage on the network.

It would be possible to sync logs between devices if all of them are online at the same time. This could be made optional or default depending on the behaviour the user wants.

@special
Member

special commented Apr 13, 2015

@hkparker

I agree. At the moment, that seems like the most feasible option; it's a lot less complex and breakable.

Some people in the Tor community are thinking about ways to do load balancing for HTTP hidden services, and that has overlap with this feature. If they come up with a reliable way to have multiple clients sharing one address, we might be able to use it.

What if ricochet clients store contacts as "People", that contain multiple "Devices" (currently ricochet IDs). When you want to contact a person, you send a message to all Devices you have saved for that contact.

This means we decide that we're not trying to hide "how many devices I have" or "which devices are currently online" or "which device I'm using right now" from our contacts. That is probably reasonable, considering we don't hide presence either, but it's worth noting.

it might force the user to add a contact on every device they own, which could get confusing

This is a tricky part. There's no obvious way to sync the contact list, other than having both online at once. Maybe manageable through clever UI?

Also, this makes contact requests more interesting: do you give out all of your addresses, or just some of them? How does that look?

@dhowe

dhowe commented Apr 13, 2015

So the "active" peer device gets a notification, and other devices just receive the chat stream... Perhaps contact list syncing/merging could happen here (when multiple devices for a person are online)? That is, whenever a non-notifying chat update happens, there might be a check whether that client has 'recently' synced...

Also, when an "active" peer device goes offline, is another peer (assuming one or more are online) selected to be active? Or does this next peer only become active when accessed by the user?

@hkparker

@special

I've briefly read about the coming changes to hidden services, and while they seem potentially far out, I'd support any way we can offload features onto the underlying functionality of Tor (#155 is another example of this).

I think disclosing our devices' statuses to contacts is fine, though absolutely worth noting. To me, metadata collection on a large scale is the big threat, while the people I plan on chatting with I'll likely know and trust. I could see how others might have different values here, however.

@special

There's no obvious way to sync the contact list, other than having both online at once.

If two devices are online at once and "paired", and a contact is added to device1, it can send a message to device2 informing it of the new contact. But if device2 is offline and missed that notification, future chat syncing messages from device1 might not make sense (this was sent to who?).

What if each device kept a queue of messages for each "paired" device. In the above situation, device1 would see device2 was offline and save the "new contact" message in the queue. When device2 checks in as online, device1 can send the queued messages in chronological order (informing the device of the new contact, then of any messages with that new contact).

This same mechanism can be used to send many events between devices, like adding new contacts, adding new devices to sync with, exchanging message history, etc. This puts a lot of trust in the other paired devices however, as each client will have to accept everything as valid.
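A minimal Python sketch of the per-device queue described above, assuming a caller-supplied `send(device, event)` function and an in-memory queue (both hypothetical). Events for an offline paired device are held and then delivered in the order they were created once it checks in.

```python
from collections import deque


class PairedDeviceQueues:
    """Hypothetical per-paired-device event queues."""

    def __init__(self, paired_devices):
        self.queues = {device: deque() for device in paired_devices}
        self.online = set()

    def publish(self, event, send):
        """Deliver an event to every paired device, queueing for offline ones."""
        for device, queue in self.queues.items():
            if device in self.online:
                send(device, event)
            else:
                queue.append(event)

    def device_online(self, device, send):
        """Flush the backlog in chronological (FIFO) order."""
        self.online.add(device)
        queue = self.queues[device]
        while queue:
            send(device, queue.popleft())

    def device_offline(self, device):
        self.online.discard(device)
```

Because the backlog is FIFO, a "new contact" event is always replayed before any messages exchanged with that contact, which is the ordering the scenario above depends on.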

This all makes sense with two devices, but I can see how problems could arise with three or more devices, so there's a lot more to work out here.

Also, this makes contact requests more interesting: do you give out all of your addresses, or just some of them? How does that look?

At first being able to selectively share devices seemed like a potentially nice feature, but I couldn't think of an easy way to manage that. I also doubt it is what the average user wants, so with adoption in mind I'm currently thinking sharing all devices with everyone is best. Again though, I could see why someone else might have more use for that feature; this is partially just how I see myself using it.

@dhowe

So the "active" peer device gets a notification, and other devices just receive the chat stream... Perhaps contact list syncing/merging could happen here (when multiple devices for a person are online)?

I've been thinking about this, but I can't think of a way to do it without sending all the destination addresses of a message to your devices in the chat-syncing messages. Perhaps that's necessary, however, so that we don't need to inform other clients of new contacts for synced messages to make sense. And it would be a very small amount of data. The mechanism for keeping track of identities in the shared address book could maybe happen here in a way that simplifies catching up clients who went offline.

Also, when an "active" peer device goes offline, is another peer (assuming one or more are online) selected to be active? Or does this next peer only become active when accessed by the user?

Here's how I envisioned it: two people have clients, person1 has one device and person2 has three devices. Person1 messages person2, asking for a notification on all three devices that belong to person2. Person2 responds from device2, so any messages person1 sends to person2 (in the next n seconds) are only going to request a notification on device2. Then device2 goes offline, and messages go back to notifying on all devices. Person2 responds from device1 next, and just like before now only device1 notifies (for n seconds, before we revert to notify all). Now in "advanced settings", a user could decide if they like this system or want to ignore these requests and always notify.
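The notification rule in this walkthrough can be sketched in Python as follows. The class name, the explicit `now` timestamps, and the `window_seconds` parameter (the "n seconds" above) are assumptions made for illustration.

```python
class NotifyPolicy:
    """Hypothetical sender-side rule for which devices get a notification."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.last_active = {}  # person -> (device, time of last reply)

    def message_received(self, person, device, now):
        """Record that `person` just replied from `device`."""
        self.last_active[person] = (device, now)

    def notify_targets(self, person, devices, online, now):
        """Request a notification only on the recently active device if it
        is still online; otherwise fall back to notifying every device."""
        active = self.last_active.get(person)
        if active is not None:
            device, seen = active
            if device in online and now - seen <= self.window:
                return {device}
        return set(devices)
```

The receiving client would still be free to ignore the request and always notify, as the "advanced settings" option suggests.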

Identity

Something that has come up while thinking about this is how we define a "Person" here. Sure, a person is a collection of onion addresses, but these can be dynamic over time. For example I should be able to start using a client on a device, add a second device, then later remove that first device without a problem. When sharing an address book, we've talked about a few concepts ("new contact" notifications, or contact information embedded in chat-syncing messages, or both). I think an ideal situation is a shared address book (same names and addresses across all devices), and what we are looking for is a reliable way for clients to come and go from this network and pick up on changes any other client has made while away. But if we want to utilize other features (like selectively sharing devices), maybe not.

@JuLx64

JuLx64 commented May 25, 2015

I think that not having a user-friendly way for the user to share his ID among different devices is a design flaw in itself. I understand the complexity of the sync issue between devices that are synchronously online, but there should at least be an easy way to "export" your user ID to another device. Right now the only way to do this is to manually copy the "config" directory, which is not only undocumented but also not user-friendly.

And why do I think it's a design flaw?
Imagine Alice and Bob want to chat anonymously. The tricky part here is for at least one of them to know the other person's Ricochet ID. At least one of them has to make his ID somewhat public for the other one to use it. Anyway, let's imagine that Alice somehow has access to Bob's ID; she can send him a message, where she introduces herself as Alice, and they establish a conversation. From then on, Bob trusts Alice's ID for what it's worth.
Then Alice has to use another device, but she didn't export her "config" folder (because nowhere in the UI is she encouraged to do so, and she didn't think it through). She only has Bob's ID. And she wants to contact Bob again. So what does she do? She has no other choice but to create a new Ricochet ID, and send a message to Bob: "Hi, I'm Alice, from our previous messages, do you trust me?".
Now of course you can see why it's a flaw: Bob has absolutely no way to tell if it's the same Alice or not. Of course he could try to verify it by asking questions about their previous conversations, but not only is it not foolproof, it can also be very damaging if either Bob's ID has been compromised, or Alice is not who she claims to be.

I think the UI MUST present a function to easily export the user ID, and document an easy way to use this SAME ID from another device, asynchronously. And of course saving the "config" data on a cloud service through an encrypted file would probably be the most practical solution, but even a simple local "Export ID" feature would do the trick in the meantime.

@hkparker

Ideally there would be no concept of "exporting" your identity. Here's how I imagine this happening:

Every device maintains an address book containing all the devices that belong to a contact. Alice and Bob both install the client and create brand new identities with empty address books. Bob has device b1 and Alice has device a1. Bob and Alice want to talk to each other, so they initiate a friend request, either by messaging and accepting, or scanning a QR code if they are next to each other. Bob's address book now has Alice:a1, and Alice's address book now has Bob:b1.

Now Alice wants to add a new device, a2. Alice wants to be able to talk to Bob on this new device as well as receive new incoming connections from Bob, while Bob's client transparently adds the new device. To accomplish this Bob needs his address book to contain Alice as Alice:a1,a2, and both of Alice's devices must have Bob:b1.

To install this new device, Alice needs access to a device where the client is already installed, a1. Once a2 has the client installed, Alice chooses to join an existing pool of devices and pairs with a1. During the pairing process, a2 receives a1's address book. This is how both of Alice's devices have Bob:b1 in their address books. At this time a2 also receives a copy of its public key signed by a1, so that a2 can prove that it did sync with a1, and a1 queues a message to all friends informing them a2 is now a device that should be associated with their Alice record.

So the only thing remaining at this point is for Bob's device b1 to update its Alice record from Alice:a1 to Alice:a1,a2. There are a few ways this can happen. If b1 and a1 are online when a1 and a2 pair, b1 will receive a message from a1 as soon as the pairing is complete instructing it to update the Alice record. b1 could be offline, but the next time both a1 and b1 come online at the same time they flush their message queues, informing b1 of the change.

If however b1 is never able to communicate with a1 before Alice wants to use a2 to talk to Bob, Alice can still authenticate a2 by sending Bob the signed public key. This asserts that a2 did in fact interact with a1, and the owner of a1 must also own a2. Bob's device will see this and update Alice to Alice:a1,a2. The messages that b1 will receive about a2 from a1 when a1 finally connects will be redundant and ignored.
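Bob's side of this update can be sketched in Python as below. The signature check is deliberately left as a caller-supplied `verify(endorser, new_device, signature)` callable, since the thread does not specify a signature scheme; the function and parameter names are hypothetical.

```python
def apply_endorsement(address_book, endorser, new_device, signature, verify):
    """Add `new_device` to whichever contact already owns `endorser`.

    address_book: dict mapping person name -> set of device addresses.
    verify: assumed callable returning True when `signature` is a valid
            signature by `endorser` over `new_device`'s public key.
    Returns True if the record changed, False if nothing needed doing.
    """
    for devices in address_book.values():
        if endorser in devices:
            if new_device in devices:
                return False  # redundant announcement, ignored
            if not verify(endorser, new_device, signature):
                raise ValueError("signature does not match endorsing device")
            devices.add(new_device)
            return True
    return False  # endorser is not a known device; nothing to update
```

The same function covers both paths described above: the announcement a1 queues for b1, and the signed key a2 presents directly. Whichever arrives second finds a2 already present and is ignored.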

The only problem here is that if b1 only finds out about a2 because a2 initiated a conversation, messages from Bob to Alice before Alice's first a2->b1 message will not be queued from b1 to a2. This probably isn't a big deal, but one option is to have a2 poll for b1, so that when b1 comes online it receives a message from a2 relatively soon and starts forwarding chat.

@JuLx64

JuLx64 commented May 25, 2015

I think the concept you have in mind is probably intended for "normal" users who use several devices synchronously, but it would be no solution for:

  • A person whose device is lost or stolen. In this case all previous conversations / trusted relations would be lost as well, which could be very damaging
  • A person who has to travel but is not able to carry his device, nor any physical storage (for example if this person has to cross the border of a very intrusive country, and can't bear the risk of carrying such a suspicious item as an encrypted USB key)

In these two examples, devices a1 and a2 can never be paired. ID Export seems to be the most practical solution to me, and also the easiest to implement (it's just UI features, really).

@hkparker

A lost or stolen device is tricky regardless of how we sync chat messages. I would think you would need the ability to revoke one of your devices, which just tells all of your contacts to remove that device as trusted, and tells the revoked device to destroy its database. Mobile clients would likely want to encrypt their databases and require an unlock code when opened as well to reduce the threat from theft.

I think what you are suggesting is 100% compatible with the proposed method of syncing chat messages. I don't see why it isn't possible to export a1 by saving an encrypted copy of its keys and address book, installing the client on a new device, then importing the encrypted backup of a1. In that way device a1 has moved from one piece of hardware to another without pairing. Problems might arise if someone attempts to launch two versions of the client with the same keys, however. If you are getting rid of an old device it's easy to wipe the old one after the migration, but if the device was stolen it would probably be better to revoke the stolen device and pair the replacement device.

Perhaps this can be supported as a "Transfer Device" feature. The client would save the encrypted data then destroy its own copy. The file could be imported to a new installation to "resume" that client.

@carepack

maybe combine two good things? For synchronizing config, logs etc with something like:
https://github.com/syncthing/syncthing
If it's similar to btsync (haven't tried syncthing) then it's possible to remove a device from a shared folder. Of course sync over multiple devices is only possible if they will be online. But no centralized server.

I really don't know if this impacts security risks. Only an idea ;)

@hkparker

Syncthing is accomplishing pretty different goals but is approaching a similar problem (a bunch of devices trying to communicate and converge on a set of data). Something I like about syncthing is if one installation is on all the time, it can act as an always up to date repository without changing the protocol. It would be nice to be able to run one chat client 24/7 that can inform other clients of what they missed. I think this would be pretty easy to do like this:

Device d1 receives a message from a contact c1
d1 knows that paired devices d2 and d3 are online, but d4 is currently offline
d1 can assume the message was delivered to d2 and d3, but not d4
d1 caches this message to send to d4 as soon as d4 is online

d4 comes online, and connects to its paired devices and contact devices
If c1 is online, d4 gets the message when they connect
d4 can also get the message from d1 if c1 is offline
(in reality, d4 would get two copies eventually and ignore the second)
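The last step, where d4 receives the same message from both c1 and d1 and ignores the second copy, could look roughly like this in Python. Deriving a message ID by hashing sender, timestamp, and body is an assumption made for the sketch; a real protocol might carry an explicit ID instead.

```python
import hashlib
import json


class Inbox:
    """Hypothetical inbox that drops duplicate deliveries."""

    def __init__(self):
        self.seen = set()
        self.history = []

    def receive(self, sender, timestamp, body):
        """Store a message once; return False for a repeated copy."""
        # Assumed ID scheme: hash of the message's identifying fields.
        payload = json.dumps([sender, timestamp, body])
        msg_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if msg_id in self.seen:
            return False
        self.seen.add(msg_id)
        self.history.append((timestamp, sender, body))
        return True
```

With this in place it does not matter whether the copy relayed by d1 or the one resent by c1 arrives first; the chat history ends up with a single entry.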

@thebeline

So I have been thinking about this, and while I agree, the lack of syncing between clients is probably going to put some off, I think the inconsistencies and weak points it introduces outweigh the few people it will put off.

If I am chatting on device A, then pick up device B and continue, there will likely be no notification of the change to the receiving party, which is insecure.

If I am doing the same and suddenly device B drops out, all messages will be cached on A. This is insecure. I do not care that they are encrypted, the NSA stored encrypted data all the time waiting for a breach to come, this is not secure, there should be no caching.

If I have linked two devices, and one gets stolen, device B will cache all messages for device A waiting for it to come back online, at which point it will send them, to a potentially compromised key.

Caching would likely cache outgoing messages as well, so you have all of your conversations cached with a potentially compromised key.

All of this could happen to the other user as well, without you knowing.

How will we notify the user of multiple devices under the same person sending messages?

We could make all of this configurable, HOWEVER, that is for our end, and the other end may be compromised. Further, I do not like the idea of the application even having the OPTION of distributing messages in the background, or caching. Even if these things are encrypted, this introduces too many moving parts that can fail, and a secure system must be careful to be as secure as possible, from the ground up. Introducing a half-dozen points of failure is unacceptable.

Further, let us be realistic here: Yes, many of these features can be written so that they can be disabled, however, we want these features so that the application appeals to more people, and the people this would make it appeal to are likely going to be the people who will not know enough to disable this device on all devices (if stolen) or detect if they are talking to the wrong person, etc.

Because of this, having these options will make it easy to use at first, but will lower the overall security of the application and its use, and will also jeopardize the NAME of the application as a "Secure Messenger" when this invariably leads to breaches of security.

Also, this jeopardizes the foundation of the Tor network as an anonymizer, as it will store edge-connections between Tor fingerprints. Do you really want to build a feature that, if a user's system is compromised, doesn't just leak their information and messages, but also impacts the security of the Tor network as a whole (for a subset of users)?

I don't think you really want to do that...

Tech savvy users will make it work. Tech un-savvy users will eventually figure it out.

Be sure to keep your priorities in mind.

To put it in perspective: This conversation ~= introducing huge holes in the system. I am not going to point fingers, because I do not know any of you, but "suggestions that introduce huge holes in otherwise secure systems" sounds awfully familiar...

@hkparker

One of my favourite things about open source development, and discussing these protocol ideas in the open, is that we are all free to take what we like into our own projects and exclude what we don't like. And the only thing that can come from more open conversations is better projects. I'm taking these concepts and attempting to create a chat application on I2P because I think multi-device support is important and having projects on both anonymity networks can't hurt.

If I am chatting on device A, then pick up device B and continue, there will likely be no notification of the change to the receiving party, which is insecure.

I don't yet see the security concern here. The receiving party will know the device has changed, in the sense that the application would receive the messages from a different connection, the app just wouldn't visually notify the user, as I don't see the need. No reason it wouldn’t be possible to notify the user however.

I do not care that they are encrypted, the NSA stored encrypted data all the time waiting for a breach to come, this is not secure, there should be no caching.

I think that's a little overly simplistic. Yes, encrypted data is more scrutinized, but I really don't see why that affects anything. It's not like we should not use encryption, or limit the amount of encrypted communication. I can only think of positive results from more encrypted communication considering the increased scrutiny.

If I have linked two devices, and one gets stolen, device B will cache all messages for device A waiting for it to come back online, at which point it will send them, to a potentially compromised key.

A stolen device is always going to be a risk unfortunately. But, once you know a device has been stolen, and revoke it from another device, cached messages won't be sent even if it comes back online. Once a device is revoked nothing will be sent to it by any of the clients that know it is revoked. The only thing that should be sent would be "you've been revoked, destroy your database and cease communication please".
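A small Python sketch of that revocation behaviour, with hypothetical names throughout: revoking a device discards anything cached for it, refuses new messages, and leaves only the revocation notice to be sent if it reconnects.

```python
REVOKED_NOTICE = "you've been revoked"  # placeholder for the revocation message


class Outbox:
    """Hypothetical cache of messages waiting for offline paired devices."""

    def __init__(self):
        self.pending = {}  # device -> list of cached messages
        self.revoked = set()

    def queue(self, device, message):
        """Cache a message unless the device has been revoked."""
        if device in self.revoked:
            return False
        self.pending.setdefault(device, []).append(message)
        return True

    def revoke(self, device):
        """Mark the device revoked and drop everything cached for it."""
        self.revoked.add(device)
        self.pending.pop(device, None)

    def flush(self, device):
        """Messages to send when `device` reconnects."""
        if device in self.revoked:
            return [REVOKED_NOTICE]
        return self.pending.pop(device, [])
```

This only protects messages cached by clients that have already learned of the revocation; a client that has not yet heard about it would still deliver its backlog.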

All of this could happen to the other user as well, without you knowing.

Sure, but this is no different than current applications. I don't really know if one of my friend's Telegram devices has been stolen until they tell me, or they just revoke it. Someone stealing your device most likely doesn't care about your personal conversations, they want to wipe and sell the device. If someone is stealing your device specifically to go after your communications then what can really be done?

Because of this, having these options will make it easy to use at first, but will lower the overall security of the application and its use, and will also jeopardize the NAME of the application as a "Secure Messenger" when this invariably leads to breaches of security.

I really don't see how multi device support invariably leads to breaches in security. And compared to centralized services, or services that are peer to peer but do not protect metadata, of course this would still be a secure messaging application.

Do you really want to build a feature that, if a user's system is compromised, doesn't just leak their information and messages, but also impacts the security of the Tor network as a whole (for a subset of users)?

If a device gets stolen, the thief will know the addresses of the other devices that person owns; this is a very good point. This could aid in future attacks on the network against you. But again this would be an extremely targeted attack, not someone mugging you. If the NSA is out to get you they are going to get you; let's focus on creating systems where passive mass surveillance isn't practical, and force agencies to need to use targeted attacks.

Tech savvy users will make it work. Tech un-savvy users will eventually figure it out.

To be honest I do not think we need more tools for technical people to talk securely with each other. Those with technical skill already have some good options. I worry about those who don't understand the importance of privacy and simply use the most convenient app. I want them to have a very convenient option that also respects privacy.

Be sure to keep your priorities in mind.

Personally my priority is creating an application with similar functionality to telegram/whatsapp (including multi device support) to encourage adoption, but on top of an anonymity network to protect metadata from passive surveillance. My priority is not to create an un-hackable application, that would be impossible. I want to raise the bar so that collecting everyone's messages isn't as easy as bullying some company.

To put it in perspective: This conversation ~= introducing huge holes in the system. I am not going to point fingers, because I do not know any of you, but "suggestions that introduce huge holes in otherwise secure systems" sounds awfully familiar...

Let's not go there. One could easily make the opposite (and ridiculous) claim that by suggesting multi-device chat over anonymity networks is impractical, you are discouraging the creation of an adoptable application that would significantly improve the privacy of a lot of people. We all want more secure chat here; let's talk about how to do that in a way that is accessible.

@thebeline

Hey, thank you for the thoughtful response. I couldn't agree with you more with regard to open conversation and discussion. It is one of the things that makes OSS great, and also an opportunity to make OSS secure. It is because of that that I feel compelled to speak up here, and attempt to avoid fiascos down the road like the ones we dealt with in OpenSSL et al. this past spring. If the discussion leads to better security, and a decent product, then it is worth it.

Last things first: I sort of went point-by-point below, got to the bottom, and then realized that yes, accessibility is paramount, and I should probably start on a positive note first.

A "Can Do": I do agree that what is inaccessible about secure, Tor-based messengers is the contact management portion. Being able to name contacts is a good first step. Being able to group contacts is a good second step, and indeed, being able to assign some sort of note or other to a contact is a good third step.

All of these things add metadata to a contact, which is generally a "bad thing." So, a secure way to do this would be to keep the contact metadata stateless: nothing is saved about a contact but the contact's Tor hash (or whatever it is called), a verified time, and a verified hash. When the contact comes online, it sends whatever two bits of metadata it wants to: a Contact Name and a Device Name. These two pieces of information are held only in memory and are determined by the device on the other end. This ensures that what is disclosed is up to the person wanting the security. Until the information is verified, the contact is shown with an alert icon. To verify it, the user selects "Verify", which calculates a hash based on the Tor fingerprint, the User Name, the Device Name, and the Verified Date. When the contact connects again, the provided User and Device names are checked against that hash using the fingerprint and the verified time. The verified time also acts as a check for the user themselves (I know I verified this person on this day, but it changed recently, etc.).
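A minimal sketch of the "Verify" step described above, in Python. The choice of SHA-256, the length-prefixed field encoding, and the function names are assumptions made for illustration; none of this is Ricochet code. Only the onion ID, the verified date, and the resulting hash would be kept on disk, while the two names arrive from the peer on each connection.

```python
import hashlib
import hmac


def verification_hash(onion_id, user_name, device_name, verified_date):
    """Hash the four fields named above.

    Each field is length-prefixed so that ("ab", "c") and ("a", "bc")
    cannot produce the same input to the hash.
    """
    digest = hashlib.sha256()
    for field in (onion_id, user_name, device_name, verified_date):
        data = field.encode("utf-8")
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


def is_verified(stored_hash, onion_id, user_name, device_name, verified_date):
    """Check the names a peer just sent against the hash saved at verify time."""
    candidate = verification_hash(onion_id, user_name, device_name, verified_date)
    # Constant-time comparison; both values are ASCII hex strings.
    return hmac.compare_digest(candidate, stored_hash)
```

If the peer later announces a different device name, `is_verified` returns `False` and the client would show the alert icon again until the user re-verifies.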

This is a secure option because it puts it into the hands of the remote user to identify themselves. Also, no metadata is stored locally, in the event that the machine gets compromised. This metadata should only be sent to users that have accepted connection requests from each other (a friend requests a connection and sends the metadata, but the person being requested does not send metadata back unless they accept the request). This prevents people from just scanning IDs for information.

Perhaps the remote user is also able to provide a grouping ID. This ID would be randomly generated on one device, and entered in the others. This ID would provide grouping of their devices on their friends' contact lists. Again, ONLY when the various devices are connected.
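A rough Python sketch of how a client might group a friend's devices by such an ID. The function names and the 128-bit ID length are assumptions for illustration, and the sketch does not authenticate the claim: it simply trusts whatever grouping ID each connected, already-accepted contact announces.

```python
import secrets
from collections import defaultdict


def new_group_id():
    """Generate a random 128-bit grouping ID on one device (hex encoded)."""
    return secrets.token_hex(16)


def group_online_devices(announcements):
    """Group currently connected devices by the grouping ID they announced.

    announcements: iterable of (onion_id, device_name, group_id) tuples.
    Devices that announce no grouping ID are listed on their own.
    Nothing here is written to disk; the grouping exists only while
    the devices are connected.
    """
    groups = defaultdict(list)
    for onion_id, device_name, group_id in announcements:
        key = group_id if group_id else onion_id
        groups[key].append((onion_id, device_name))
    return dict(groups)
```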

As for syncing, see below. I do not feel syncing offers any more benefits, and only opens holes. If we can group contacts, and have device names, and user names, that can be verified, I think that will make it accessible enough.

Local history is fine, as it is locally managed. You know exactly where it is, and it does not leak any information on other devices or about device interconnectedness. Also, if you need to delete it, it is much less complicated.

The LONG bit below really revolves around syncing, caching, and the local storage of metadata. If the above seems like a good compromise, bringing enough data to the table to make the service usable and attractive, while also ensuring security, you may not need to read further.

======RANT(ish)======

So I guess we're going to do the point/counter-point/counter-counter-point thing here, which is great. I look forward to being learned some.

Most all of my points revolve around the concept of eroding the fragile security and anonymity benefits of Tor, so let's talk about that.

The foundation of Tor is that it is impossible to be subjected to MITM attacks (well, prohibitively computationally expensive at this time), and due to the routing nature and multiple layers of encryption, nearly impossible to determine the end-to-end route of an individual packet. This may be an overly simplistic view, but these two features are what make Tor as secure as it can be. Emphasis on the "as it can be." There are many more ways that the security and anonymity of a user can be breached, and my concern is particularly that the syncing of accounts will introduce attack vectors or undermine the perceived security of the system.

Transparent device switching: I understand that the application itself will have verified the nodes under a particular username, and that all transmissions would be encrypted, so the over-the-wire security will not be affected by such a feature. My concern here is not that it erodes the security of "who am I talking to" but more, "what am I talking to." With the way Tor works, you are assured you are speaking to a very specific node, but without notification of the switch, you won't have that assurance. Let alone the fact that you aren't speaking to a specific node, but to all nodes attached to that user. This is not strictly a breach of the cryptographic security, but more of the individual security.

Stolen devices: Stolen was not exactly the best example, but it is the most likely situation for day-to-day users. The real issue here is when an attacker (or unassuming voyeur) gets physical access to one of your devices. If I am chatting using my phone, the fact that someone (coworker, spouse, evil-doer, do-gooder) could sit at some other device and watch the conversation is a "no-go" for me. Further, the fact that you think a message kindly asking the possessor of the device to delete the DB and "cease communication, please" would have any effect at all means you are not regarding the situation in the right manner.

Someone else mucks up: Again, not necessarily about stolen devices, but physical access, now I don't have to worry about the 3 points of failure (laptop, phone, tablet, for example) on my end, but also the 3 on the other end as well. This could be a stolen device, someone sitting at a computer, malware, or even "hey, you don't have to do anything, just add us to your synced accounts and we'll take it from there" (which is why having it even available as an option smacks of a terrible idea).

These types of breaches WILL happen. If they can happen, they will. Like I said, it will make it attractive, which will bring on many users, which adds many variables, which will cause an issue, which will turn off users, etc, etc... A PR event like that is something that is difficult to recover from, which means wasted development time, etc.

"More secure" is not the same as "as secure as it can be." More secure than centralized messaging services? Certainly. Protects metadata... eh... But you are building a messaging platform on top of one of the most secure communications protocols available, so of course over the wire will be secure. The thing WE have to secure is the application and the user experience.

You are correct about targeted attacks, but it may or may not be the NSA; there are a lot of people out to get your information, or you. By allowing there to be multiple attack vectors (devices, etc.) you are doing your users a disservice. If it is the NSA (or other state-sponsored entity) you are right, they will get your information. However, making it easier is not really an option in my mind. A feature such as this makes it more cost-effective to attack an individual (state-sponsored or otherwise). You have a larger attack surface (multiple devices) and you stand to gain more information from a single successful attack (all attached devices). This is akin to using un-salted password hashes, or a single salt for one database, both of which could get you fired these days.

As for the encrypted bit, I am aware it is overly simplistic, but it isn't just about the encryption, it is about data integrity, data duplication, storage leaks, and a lot of other things too. You have a device that is rarely on, and all messages get cached on all of your other devices, taking up space. This is woefully inefficient, duplicated data is bad, 'm kay? Also, just storing data for a service that is supposed to be private/secure just feels uncomfortable.

There are so many edge-cases that this would erode security and privacy with, I just keep thinking of more: You are having a conversation with two devices on (A and B), device C is off, so both A and B cache the data to send to C when it comes back on. The conversation gets weird. After some time you realize you want that conversation to go away, so you clear history on device A, how do you handle this? Clear the local cache and send a clear message to B? What if B is off when you clear it? Then, when B turns on, A happens to be off, and B sends all the data to C. What if the same happens with a stolen or compromised device in the same manner?
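To make the ordering concrete, here is a small Python simulation of that A/B/C scenario. It assumes a deliberately naive design in which every online device caches messages for every offline sibling, and a "clear" only reaches devices that are online at that moment. The `Device` class and its methods are hypothetical and are not taken from any proposed implementation.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.history = []      # messages this device has seen
        self.cached_for = {}   # offline device name -> messages held for it

    def receive(self, message, offline_peers):
        """Record a message and cache a copy for each offline sibling."""
        self.history.append(message)
        for peer in offline_peers:
            self.cached_for.setdefault(peer.name, []).append(message)

    def wipe(self):
        self.history.clear()
        self.cached_for.clear()

    def clear_everywhere(self, peers):
        """Clear locally and on siblings, but only those reachable right now."""
        self.wipe()
        for peer in peers:
            if peer.online:
                peer.wipe()

    def flush_to(self, peer):
        """Hand over everything cached for peer when both are online."""
        if self.online and peer.online:
            peer.history.extend(self.cached_for.pop(peer.name, []))


def replay_scenario():
    a, b, c = Device("A"), Device("B"), Device("C")
    c.online = False
    for message in ["hello", "this got weird"]:
        a.receive(message, [c])
        b.receive(message, [c])
    b.online = False             # B drops offline before the clear
    a.clear_everywhere([b, c])   # only A is actually wiped
    a.online = False
    b.online = True
    c.online = True
    b.flush_to(c)                # B never heard about the clear
    return a.history, c.history
```

Calling `replay_scenario()` leaves A empty while C ends up holding the full conversation, which is exactly the leak described above.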

With synchronization there are too many moving parts, too many new situations that can erode the foundations of the application. I have come up with a few, and I haven't even spent much time on it. People with more brain power than I could likely come up with many more situations.

My stance remains, however: I know it is a nice-to-have, but there is a reason it does not exist yet, and that is because it is insecure. Security is our priority, and as such, I do not think cross-device synchronization (or even the code to make it happen) should be a part of this application.

@hkparker

Thank you for the long and thoughtful response. I agree that increasing complexity is inherently dangerous, both in feature design and in code structure; OpenSSL is a good example. The main reason I'm not trying to dive right into code, and instead am trying to get as much feedback as possible on this idea, is so that I can create a formal specification that can be analysed and audited separately from and before implementation.

Before responding to more specific points I want to say I think the fundamental difference in our thinking is that my main concern is over the wire metadata protection (without attempting to improve application security relative to current chat applications), while you place more emphasis on strong application security.

I want to clarify how we are referring to metadata. The basic component of an identity is the address/public key of the device, and I believe you are (correctly) referring to additional descriptions of the contact (such as their name or picture) as metadata. I had been thinking of metadata only as descriptions of chat activity (who was talking to who, on which devices, when) at the network level. Minimizing identity metadata would be important in protecting people if their friend's device was compromised, but I think that has so far been out of my threat model. There's only so much that can be done about a compromised device; my focus is on protecting users from actors who have massive network surveillance power, but who have not compromised (everyone's) devices.

I was planning on having the users choose to disclose their own name and image however they wish, but letting the recipient store that information in a local and encrypted database once they were friends, if for no other reason than to save on bandwidth. Compromising the device would then inform an attacker of the network identities of all of their friends, but if the contact information was sent every time, an attacker could collect the same information over time, assuming persistent access. Without persistent access, though, I see the huge advantage of sending the contact's information every time and verifying. I'll have to consider that in more depth, but I think that, or some variation, could be a great addition.

Your description of a grouping ID (and storing everything in memory) would prevent one from being able to cache messages for offline devices; however, sending only to online devices might be accessible enough for some (not me, however). I see how you no longer have the assurance that you are talking to a specific node with transparent device switching, but I believe that you still have a strong assurance of who you are talking to, which I see as the real goal.

I also see how the possibility of other devices watching a chat in real time is concerning, but I think that's an inherent feature of multi-device support. The same thing would be possible on WhatsApp, Telegram, iMessage, etc. Asking a compromised device to clear itself isn't meant to be a perfect solution. It won't protect you from a technical actor with access to the device, of course; it's meant more for a stolen device.

I think in general another reason why application security has seemed like less of a priority is that compromising a device at worst can only impact the compromised user's messages and messages with their contacts, as well as their friends' network identities. This might seem like a lot, but this is in contrast to centralized services where compromising their back end (or legally pressuring them) could impact the entire userbase.

You have a device that is rarely on, and all messages get cached on all of your other devices, taking up space. This is woefully inefficient, duplicated data is bad, 'm kay?

No reason you need to store a copy of each message for each cached contact... one could just keep a list of contacts that haven't received that message yet, and only store the message once.
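As a rough illustration of that idea in Python: each message body is stored once, alongside the set of devices that still need it, and the body is dropped when that set becomes empty. The class and method names are hypothetical, chosen only for this sketch.

```python
class OfflineCache:
    """Store each message body once and track who still needs it."""

    def __init__(self):
        self._bodies = {}    # message_id -> body (one copy per message)
        self._pending = {}   # message_id -> set of device ids still waiting

    def add(self, message_id, body, offline_devices):
        """Cache a message only if at least one device missed it."""
        pending = set(offline_devices)
        if pending:
            self._bodies[message_id] = body
            self._pending[message_id] = pending

    def deliver_to(self, device_id):
        """Return the bodies owed to device_id, oldest first.

        A message is deleted from the cache as soon as nobody else
        is waiting for it.
        """
        delivered = []
        for message_id in list(self._pending):
            waiting = self._pending[message_id]
            if device_id in waiting:
                delivered.append(self._bodies[message_id])
                waiting.discard(device_id)
                if not waiting:
                    del self._pending[message_id]
                    del self._bodies[message_id]
        return delivered

    def stored_count(self):
        return len(self._bodies)
```

With two offline devices, the cache holds one copy of the message until both have collected it, rather than one copy per device.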

The A-B-C forwarding situation you pointed out actually wouldn't be possible with how I would implement message forwarding, but I haven't really expressed that well at all yet. I'm working on getting something more formal out as soon as I can, which should help explain where forwarding would take place.

Again, thank you for your input. I believe I have a lot I could improve about application security in my design, and taking the time to share your analysis is incredibly helpful.

@hkparker

hkparker commented Feb 1, 2016

So it's been a while. I've been putting some work into this, largely based on ideas here. Also this is getting started now.

@schattenphoenix

I did not actually read all the messages, as there are just too many.
I just wanted to leave an idea that could help with the issue at hand, but I'm not sure whether it is suited for the task.

Would it be an option to just have multiple accounts and "connect" them?
When you receive messages the client automatically bounces them to the aforementioned connected devices.
This way you could include some kind of formatting in the bounced messages that allows them to be put in the right conversation.
Using the normal sending functionality would bring fewer complications with it, I suppose, as you would not have to worry about distribution infrastructure (which would not be in the spirit of this protocol?).

Just something to think about.
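A rough sketch of what that bounce formatting could look like, in Python. The JSON envelope, its field names, and the `send` callback are all assumptions made for illustration; they are not part of the Ricochet protocol. A receiving device only honors the envelope when it arrives from one of its own linked accounts, so that a stranger cannot inject messages into someone else's conversation.

```python
import json


def wrap_for_bounce(original_sender, body, received_at):
    """Wrap a received message so a linked device can file it correctly."""
    return json.dumps({
        "type": "bounce",
        "conversation": original_sender,
        "body": body,
        "received_at": received_at,
    })


def bounce(original_sender, body, received_at, linked_accounts, send):
    """Forward one received message to every linked account.

    send(account, payload) stands in for the normal message-sending path.
    """
    payload = wrap_for_bounce(original_sender, body, received_at)
    for account in linked_accounts:
        send(account, payload)


def file_incoming(sender, payload, linked_accounts, conversations):
    """Append an incoming message to the conversation it belongs to."""
    conversation, body = sender, payload
    if sender in linked_accounts:
        try:
            envelope = json.loads(payload)
        except ValueError:
            envelope = None
        if (isinstance(envelope, dict)
                and envelope.get("type") == "bounce"
                and "conversation" in envelope
                and "body" in envelope):
            conversation, body = envelope["conversation"], envelope["body"]
    conversations.setdefault(conversation, []).append(body)
```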
