Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

E2E encryption does not handle unicode properly #1719

Closed
dluciv opened this issue Jul 5, 2016 · 7 comments
Closed

E2E encryption does not handle unicode properly #1719

dluciv opened this issue Jul 5, 2016 · 7 comments
Assignees

Comments

@dluciv
Copy link

dluciv commented Jul 5, 2016

!!! VERY PROBABLY SECURITY ISSUE, SEE EXPLANATION BELOW !!!

Problem itself

Vector fails to receive unicode messages. For example, we can send cyrillic Достоевский and then receiving side will show nothing. Receiving side browser console output for this message will be:

Caught /sync error SyntaxError: Unexpected token � in JSON at position 164
    at Object.parse (native)
    at _decryptMessage (https://vector.im/develop/bundle.js:45792:26)
    at mapper (https://vector.im/develop/bundle.js:48012:26)
    at https://vector.im/develop/bundle.js:86429:17
    at Array.map (native)
    at SyncApi._mapSyncEventsFormat (https://vector.im/develop/bundle.js:86425:24)
    at https://vector.im/develop/bundle.js:86166:36
    at Array.forEach (native)
    at SyncApi._processSyncResponse (https://vector.im/develop/bundle.js:86163:16)
    at https://vector.im/develop/bundle.js:86055:19

Technical Analysis

This problem very likely comes from Olm library itself. So I nearly know it is Olm, not Vector bug. But I see no better place to log it.

When trying to test recent olm.js from https://matrix.org/packages/npm/olm/olm-0.1.0.tgz against https://matrix.org/git/olm/tree/javascript/demo/one_to_one_demo.html, unicode is decrypted incorrectly.

When I encrtypt Byron, Olm decrypts it correctly then.
When I encrypt Достоевский, Olm decrypts it as:

Достоевский









�

So having 10 '\n' (exactly '\n') and then '\x03'.

Literally, when we debug line var plaintext = to_session.decrypt(message.type, message.body); in HTML file, we can get "Достоевский\n\n\n\n\n\n\n\n\n\n\u0003" -- just as above.

Not sure this issue alone leads to vector errors in unicode E2E, but is definitely should not be so.

SECURITY

This issue very likely comes from incorrectly considering string length-in-bytes as length-in-codepoints. For ASCII those lengths are similar (if we use UTF-8 as internal Emscripten encoding, for example), so ASCII texts are (en|de)crypted correctly. But when it comes no non-ASCII symbols, e.g. Cyrillic, those lengths are different, and this likely causes such a problem.

Other examples also show different garbage after result string -- not only '\n' and '\x03', but other bytes too. Very probably it sends an amount of Emscripten memory located after the end of the string. This garbage can contain important data (e.g. keys) and it can leak this way. So fix should be applied as close to Emscripten part as possible or maybe even in C++ code itself.

@richvdh richvdh self-assigned this Jul 5, 2016
@richvdh
Copy link
Member

richvdh commented Jul 5, 2016

Thanks for the report. I've repeated the issue and prepared a fix, which we'll land once we've reviewed it.

@dluciv
Copy link
Author

dluciv commented Jul 6, 2016

Thank you very much!
So finally, was it Olm bug? And could it be security issue as I was afraid of?
Can I see the fix too? Just intrigued what was wrong.

@ara4n
Copy link
Member

ara4n commented Jul 6, 2016

the fix looks to be pr #11

ignore me; i'm confused. the fix will be somewhere in the matrix.org/git/olm repo shortly

@richvdh
Copy link
Member

richvdh commented Jul 6, 2016

@dluciv: it is a bug in the javascript wrappers for olm.

You can see the fix on the rav/fix_decrypt_utf8 git branch. And yes, I suspect under certain circumstances it will expose random crap from the stack. The underlying Olm library goes to some effort to check that sensitive data isn't left lying around in stack memory, so I think you'd struggle to exploit the issue, but certainly we appreciate the report.

@richvdh richvdh closed this as completed Jul 6, 2016
@richvdh
Copy link
Member

richvdh commented Jul 6, 2016

The fix is now landed in olm-master, but since vector is still configured to use the broken (release) version of olm, I'm keeing this bug open for now.

@richvdh richvdh reopened this Jul 6, 2016
@dluciv
Copy link
Author

dluciv commented Jul 9, 2016

Also when releasing please be so kind to update https://matrix.org/packages/npm/olm/olm-0.1.0.tgz to https://matrix.org/packages/npm/olm/olm-0.1.1.tgz (or some other version, not sure about this) and don't forget to update everything dependent on it. I know two places (but I guess there are more) where it is mentioned:

  • README.md in master branch with Olm instruction
  • package.json in develop branch with optional dependency in olm-....tgz

Hope to see updated version of Olm at least for develop branch soon! Thank you!

richvdh added a commit that referenced this issue Jul 11, 2016
@richvdh
Copy link
Member

richvdh commented Jul 11, 2016

I have now released olm 1.0.0, and updated the develop branch of vector to use it; accordingly considering this fixed. Thanks again for the report.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants