End-To-End Encryption: Custom Encryption Library #21144

Closed
robertjchen opened this issue Jun 20, 2023 · 53 comments
Labels: NewFeature, Task, Weekly KSv2

Comments

@robertjchen
Contributor

robertjchen commented Jun 20, 2023

cc: Margelo

Please implement a custom encryption library to be used as part of the new End to End Encryption feature in the App.

Namely, it will provide symmetric (AES) and asymmetric (RSA4096 + Kyber1024) encryption functions to be used by the App as well as in the backend.

Please refer to the planning doc for additional context!

Considerations

  • The native (TurboModule?) part should be portable enough (C/C++) such that we can use the encryption library/functionality internally for backend purposes at Expensify.
  • For Web, it should also cleanly compile to a wasm binary for direct use in the Web client.
  • The repository will be hosted under the Expensify org as we'll be using the functionality internally ✨

Proposed Interface

// synchronous mockup, but final solution may be asynchronous as well 👍

  • KEMGenKeys() - returns a JSON object in the format of:
    {
        "kyber1024": {
            "pubkey": "<base64 public key>",
            "privkey": "<base64 private key>"
        },
        "rsa4096": {
            "pubkey": "<base64 public key>",
            "privkey": "<base64 private key>"
        }
    }

For the following functions, the pubKeys and privKeys arguments should be provided in JSON format:

    privKeys : {
        "kyber1024" : {
            "privkey": "<base64 private key>"
        },
        "rsa4096" : {
            "privkey": "<base64 private key>"
        }
    }

// ---

    pubKeys : {
        "kyber1024" : {
            "pubkey": "<base64 public key>"
        },
        "rsa4096" : {
            "pubkey": "<base64 public key>"
        }
    }
  • KEMEncrypt(pubKeys, dataString) - encrypts a given string as RSA4096_Encrypt(Kyber1024_Encrypt(dataString)) using the pubKey set (the input string should be padded behind the scenes if necessary, etc.). The pubKeyHash should be a hash of the two public keys combined. The result is the raw encrypted string in base64 format (note that this is directly encrypted by RSA4096 + Kyber1024, not AES!)

     <base64 data>
    
  • KEMDecrypt(privKeys, dataString) - decrypts a given string given the privKey set (input string should be padded behind the scenes if necessary, etc.)

    <base64 data>
    
  • KEMSign(privKeys, dataString) - signs a given string given the privKey set (see doc for additional implementation notes)

    <base64 data>
    
  • KEMVerify(pubKeys, dataString) - verifies the signature of a given data string given the pubKey set

    <base64 data>
    
  • AESDecrypt(iv, key, data) // simple symmetric decryption w/ AES-GCM

    <base64 data> // contains iv + encrypted data + etc.
    
  • AESEncrypt(iv, key, data) // simple symmetric encryption w/ AES-GCM

    <base64 data> // contains iv + encrypted data + etc.
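For reference, the key formats above could be typed roughly like this on the JS side. This is only a sketch: the type names (`KeyPair`, `KEMKeyPairs`, `KEMPublicKeys`, `KEMPrivateKeys`) and the `splitKeys` helper are hypothetical and not part of the proposed interface; only the JSON field names come from the mockup.

```typescript
// Hypothetical type names; only the JSON field names come from the mockup above.
type Base64 = string;

interface KeyPair {
    pubkey: Base64;
    privkey: Base64;
}

// Shape returned by KEMGenKeys().
interface KEMKeyPairs {
    kyber1024: KeyPair;
    rsa4096: KeyPair;
}

// Shape of the pubKeys argument (KEMEncrypt, KEMVerify).
interface KEMPublicKeys {
    kyber1024: Pick<KeyPair, "pubkey">;
    rsa4096: Pick<KeyPair, "pubkey">;
}

// Shape of the privKeys argument (KEMDecrypt, KEMSign).
interface KEMPrivateKeys {
    kyber1024: Pick<KeyPair, "privkey">;
    rsa4096: Pick<KeyPair, "privkey">;
}

// Hypothetical helper: splits generated key pairs into the two argument shapes,
// so that private keys never travel along with the public set.
function splitKeys(keys: KEMKeyPairs): { pubKeys: KEMPublicKeys; privKeys: KEMPrivateKeys } {
    return {
        pubKeys: {
            kyber1024: { pubkey: keys.kyber1024.pubkey },
            rsa4096: { pubkey: keys.rsa4096.pubkey },
        },
        privKeys: {
            kyber1024: { privkey: keys.kyber1024.privkey },
            rsa4096: { privkey: keys.rsa4096.privkey },
        },
    };
}
```

Keeping the public and private sets as separate types makes it harder to hand a private key to a function that only needs the public one.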
    
@robertjchen robertjchen added Daily KSv2 NewFeature Something to build that is a new item. Task labels Jun 20, 2023
@melvin-bot melvin-bot bot added Weekly KSv2 and removed Daily KSv2 labels Jun 20, 2023
@Expensify Expensify deleted a comment from melvin-bot bot Jun 20, 2023
@robertjchen robertjchen changed the title [WIP] End-To-End Encryption: Post-Quantum Kyber + Traditional RSA [WIP] End-To-End Encryption: Custom Encryption Library Jun 20, 2023
@melvin-bot melvin-bot bot added the Overdue label Jun 28, 2023
@robertjchen robertjchen changed the title [WIP] End-To-End Encryption: Custom Encryption Library End-To-End Encryption: Custom Encryption Library Jul 5, 2023
@robertjchen
Contributor Author

High Level spec completed, opening for evaluation/discussion/implementation

@mrousavy
Contributor

mrousavy commented Jul 5, 2023

Hey! We @margelo are excited to get our hands dirty with this!! 💪

Are we assuming synchronous functions for everything, or should there also be async options? E.g. KEMEncrypt(..) is blocking/synchronous, and KEMEncryptAsync(..) is asynchronous/Promise based?

@robertjchen
Contributor Author

Thanks!! I think async might actually be the "default" behavior for all of the functions on the JS side, but I'll leave it up to you all to decide what would work better with our current App design 🙏

@mrousavy
Contributor

mrousavy commented Jul 6, 2023

I think we should add both options, but maybe async should be default and the sync one should be suffixed with Sync? It depends on how fast those funcs are. If they're slow, people should not really use them synchronously. If they're really really fast, we can add sync options.
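A rough sketch of that naming convention, for illustration only. `KEMEncryptSync` here is a hypothetical placeholder for a blocking native/WASM binding, and `toAsync` is a made-up helper; neither is the library's actual API.

```typescript
// Hypothetical stand-in for a blocking native/WASM binding; not the real library.
function KEMEncryptSync(pubKeys: string, dataString: string): string {
    return `encrypted(${pubKeys}, ${dataString})`;
}

// Wraps a blocking function so that the unsuffixed name is Promise-based.
// Note: this only defers the call to a later microtask; it does NOT move the
// work off the JS thread. A real async variant would do the work natively
// on a background thread and resolve the Promise afterwards.
function toAsync<Args extends unknown[], Result>(
    fn: (...args: Args) => Result,
): (...args: Args) => Promise<Result> {
    return (...args: Args) => Promise.resolve().then(() => fn(...args));
}

// Async is the default name; the blocking variant keeps the Sync suffix.
const KEMEncrypt = toAsync(KEMEncryptSync);
```

With this shape, callers write `await KEMEncrypt(pubKeys, data)` by default and opt into `KEMEncryptSync` only where blocking is known to be cheap.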

@robertjchen
Contributor Author

Makes sense! It would be interesting to benchmark things to compare- I'd imagine some of the functions would be much, much slower when compiled to asm.js (for Web) while a native module would be much faster 🤔

@roryabraham
Contributor

Just porting over these comments from slack ... it sounds like we've agreed that we should use WebAssembly for web instead of asm.js

So that means we'll use C/C++ and JSI on native and C/C++ and WebAssembly on web

@roryabraham
Contributor

roryabraham commented Jul 8, 2023

Also, have we agreed on which Kyber implementation we'll use? Are we using the reference implementation?

Interestingly, there's a Rust implementation that says:

Compiles to WASM using wasm-bindgen and has a ready-to-use binary published on NPM

We might be able to use that w/ bindings from Rust -> C -> JSI (example), but it would probably be easier to compile another C/C++ based library for wasm

@melvin-bot melvin-bot bot added the Overdue label Jul 14, 2023
@robertjchen
Contributor Author

Also, have we agreed on which Kyber implementation we'll use?

The reference implementation works, but it's done in a way to allow for easy testing/NIST evaluation. There's also https://github.com/PQClean/PQClean, which organizes Kyber and other encryption algorithms to have a standardized interface (for testing, library making, etc.), so it would be good to refer to that implementation as well.

@melvin-bot melvin-bot bot removed the Overdue label Jul 17, 2023
@roryabraham
Contributor

Oh yeah, we've definitely talked about PQClean before in previous POCs. This seems like something we should predesign in slack. @mrousavy @margelo do you want to do some research and lead that predesign?

@mrousavy
Contributor

@roryabraham Sure, lemme talk to my team! :)

@chrispader
Contributor

@robertjchen @roryabraham i'm gonna start working on implementing the encryption library now!

Gonna keep u updated here and on Slack! 👍🚀

@melvin-bot melvin-bot bot added the Overdue label Jul 25, 2023
@robertjchen
Contributor Author

Awesome, please keep us posted on the details!

@melvin-bot melvin-bot bot removed the Overdue label Jul 27, 2023
@robertjchen
Contributor Author

robertjchen commented Sep 20, 2023

Yes, exactly! 👍

Great! That should work for our purposes, thanks 🙌

Also not sure if it's necessary to sign and verify the rsaEncryptedKyberCipherText, but instead mostly sign and verify the data we send.

Also as far as my research went, we can't use Kyber for signatures and verification. We'll have to use either Dilithium (by the same creators), Falcon or SPHINCS+.

That makes sense to me- however, the only pitfall I can think of is that you don't know if the person sending you the ciphertext is really the person that they claim to be, since anyone who knows your public RSA and Kyber keys can arbitrarily establish a shared secret with you without having to prove who they are 🤔

Without introducing a new algo, I think what we can do is add on another piece of data to prove rsaEncryptedKyberCipherText is coming from the correct sender by just a few more steps.

The sender would use their own private Kyber key to generate a new verificationCipherText and verificationSharedSecret. They would then RSA encrypt a known string or hash of the message to be sent, using their own private key, yielding rsaEncHash.

Using verificationSharedSecret, they would then AES-encrypt rsaEncHash as encryptedRSAEncHash and then send those along to the receiver, which would serve as the signature.

The receiver would get a signature composed of the following:

Signature = verificationCipherText || encryptedRSAEncHash

  • The receiver would first look up the public Kyber key of the sender/person they intend on receiving messages from, and generate the shared secret from verificationCipherText, and see if they can decrypt encryptedRSAEncHash using the shared secret.
  • If they can, that means they know the sender has the Kyber private key. ✅
  • Once decrypted, encryptedRSAEncHash yields rsaEncHash, then they can try to decrypt this using the RSA public key of the sender/person they intend on receiving messages from.
  • If they're able to do so, then they know the sender has the RSA private key. ✅

That should be enough to verify message authenticity and serve as a signature mechanism. If either RSA or Kyber were broken, a forger would still fail the check for the other algorithm, ensuring safety.

What are your thoughts on that approach? 🙏

@chrispader
Contributor

chrispader commented Sep 21, 2023

The sender would use their own private Kyber key to generate a new verificationCipherText and verificationSharedSecret. They would then RSA encrypt a known string or hash of the message to be sent, using their own private key, yielding rsaEncHash.

The problem here is that we can't use the Kyber private key to encapsulate a key. Encapsulation only works in the opposite direction, with the recipient's public key. That's why Kyber is fundamentally not designed for signing and verifying; that's what, say, Dilithium is for.

The proposal seems very logical, but i don't think we can achieve this with Kyber...
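To illustrate the direction of a KEM, here is a toy, non-cryptographic sketch. All names in it are hypothetical, and it is neither Kyber nor the library's API. The point is only the shape of the operations: encapsulation takes the recipient's public key, decapsulation takes the recipient's private key, and there is no operation where a sender uses their own private key to produce something others can check with the sender's public key.

```typescript
// Toy, NON-cryptographic stand-in used only to show the direction of a KEM.
// All names are hypothetical; this is not Kyber and offers no security.
interface KemPublicKey { readonly kind: "public"; readonly id: number; }
interface KemPrivateKey { readonly kind: "private"; readonly id: number; }

interface Encapsulation {
    cipherText: string;   // sent to the key owner
    sharedSecret: string; // kept by the sender
}

let nextKeyId = 0;

function generateKeyPair(): { publicKey: KemPublicKey; privateKey: KemPrivateKey } {
    const id = nextKeyId++;
    return { publicKey: { kind: "public", id }, privateKey: { kind: "private", id } };
}

// Encapsulation only accepts a PUBLIC key: anyone can create a secret for the key owner.
function encapsulate(recipient: KemPublicKey): Encapsulation {
    const sharedSecret = `secret-${Math.random().toString(36).slice(2)}`;
    // A real KEM hides the secret inside the ciphertext; this toy does not.
    return { cipherText: `${recipient.id}:${sharedSecret}`, sharedSecret };
}

// Decapsulation only accepts the PRIVATE key: only the key owner recovers the secret.
function decapsulate(own: KemPrivateKey, cipherText: string): string {
    const separator = cipherText.indexOf(":");
    if (cipherText.slice(0, separator) !== String(own.id)) {
        throw new Error("ciphertext was made for a different key");
    }
    return cipherText.slice(separator + 1);
}
```

In this sketch, `encapsulate(privateKey)` does not even type-check, which mirrors the limitation described above: the sender's private key has no role in creating a ciphertext.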

@robertjchen
Contributor Author

robertjchen commented Sep 22, 2023

Ah that makes sense! It looks like it's a one-way operation and only for key exchange 😞

@robertjchen
Contributor Author

In that case, let's move forward with just plain encryption for now without signatures. 👍

(The server would be the one to ensure proper identity- certain classes of attacks would be feasible but could be mitigated by user awareness and other protections elsewhere in the stack. It's always a tradeoff between complexity and security and we've already made those choices for the sake of usability so I don't think we'll miss this aspect too much 😅 )

@chrispader
Contributor

Got it! 👍

So all of the encryption functionality is already working and we should be ready to either publish it to some (private) package registry or use it directly through git.

The only thing i'm currently still working on is using the .wasm binary in web. It basically compiles already, but i haven't made it work in Expensify/App yet.

@robertjchen
Contributor Author

Awesome, can't wait to see the .wasm part working as well! 🙌

@chrispader
Contributor

Giving a brief update on the WebAssembly journey here...

The WebAssembly process on web principally consists of 3 parts:

  1. Compiling OpenSSL into a WASM-compatible library ⚠️
  2. Compiling the encryption lib (using OpenSSL) into a .wasm-binary ✅
  3. Creating a JS-wrapper to have consistent function signatures (with native) and simplify WASM code ✅

Steps 2 and 3 are already done and working. I created compilation scripts for compiling the library as well as compiling a local openssl git submodule. These scripts are (almost) cross-platform already, so it should be possible to compile the library on any OS. I guess, we'll automate this process in a pipeline anyway, but this might be good for development and debugging in the future.

Since WebAssembly relies on C++ code being compiled with the Emscripten toolchain, we have to compile OpenSSL ourselves, since there are no working pre-compiled binaries. Also, we want to ensure that we're using the original library for security reasons.

I've had some major breakthroughs in this process yesterday and i'm currently fixing some memory allocation problems, when it comes to using RSA from the OpenSSL library. Other than that, everything should be done 👍

I'm hoping to finish this by the end of the week.

@robertjchen
Contributor Author

@chrispader Great work! Let us know how it goes (if OpenSSL's complexity proves a bit too much, maybe we could consider just using the basic raw reference C implementations for RSA/AES, especially since we're just using a small part of OpenSSL?)

@melvin-bot melvin-bot bot added the Overdue label Oct 19, 2023
@roryabraham
Contributor

@chrispader any update here?

@chrispader
Contributor

chrispader commented Oct 22, 2023

Yes! Just made the library completely work with WebAssembly for the first time 🥳🥳🥳

There are some quirks and some weird stuff going on when compiling WebAssembly that I'd like to investigate and then document, so the whole build process is clear 👍

cc @roryabraham @robertjchen

@chrispader
Contributor

I added a PR/branch for testing the WebAssembly library: #30146

This PR also introduces a useEncryptify hook, because WebAssembly is effectively async and has to be loaded at app start
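As a minimal sketch of the "load once at app start" idea (this is not the hook from the PR): `createLazyLoader`, `EncryptionModule`, and `getEncryptionModule` are hypothetical names, and the loader body is a placeholder for the real WebAssembly instantiation.

```typescript
// Hypothetical helper: memoizes an async loader so the module is initialized only once,
// no matter how many callers ask for it.
function createLazyLoader<T>(load: () => Promise<T>): () => Promise<T> {
    let pending: Promise<T> | undefined;
    return () => {
        if (pending === undefined) {
            pending = load();
        }
        return pending;
    };
}

// Placeholder for whatever the instantiated encryption module exposes.
interface EncryptionModule {
    ready: true;
}

let loadCount = 0;

// Placeholder loader; a real one would fetch and instantiate the .wasm binary here.
const getEncryptionModule = createLazyLoader<EncryptionModule>(async () => {
    loadCount += 1;
    return { ready: true };
});
```

Every caller awaits the same Promise, so the module is instantiated a single time. One caveat of this simple version is that a failed load stays cached; a production loader would probably reset `pending` on rejection.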

@robertjchen
Contributor Author

Awesome work! Can't wait to try this out locally and see what the numbers are like: #30341

@melvin-bot melvin-bot bot removed the Overdue label Oct 31, 2023
@robertjchen
Contributor Author

Update: Ongoing work in #30146

@melvin-bot melvin-bot bot added the Overdue label Nov 14, 2023
@robertjchen
Contributor Author

Looking forward to hardware benchmarks, next steps discussion/planning in progress.

@melvin-bot melvin-bot bot removed the Overdue label Nov 15, 2023
@melvin-bot melvin-bot bot added the Overdue label Nov 24, 2023
@robertjchen
Contributor Author

Hardware benchmarks posted. Ongoing discussion on next steps!

@melvin-bot melvin-bot bot removed the Overdue label Nov 29, 2023
@robertjchen
Contributor Author

Given that the library itself is completed, I think we can close this out for now! 🎉🎉 We'll sync on the main issue on next steps and create new issues from there. Great job @chrispader ! 🙇

@chrispader
Contributor

Thanks and wuhuu! 🚀🎉

@chrispader
Contributor

Are we still thinking about adding signing and verifying functionality with Dilithium?

@robertjchen
Contributor Author

Great q, I think we'll still need it down the line 🤔 We'll see how things pan out in planning now that the priorities are clear


5 participants