Use Argon2 for encrypted vaults #3502
Conversation
- add required packages
- configure webpack to work with WebAssembly
- add necessary config to `content_security_policy` to allow WebAssembly
Let's set the length to 32 bytes to match the length expected by AES-GCM.
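For context, AES-256-GCM expects exactly 32 bytes (256 bits) of key material, which is why the Argon2 output length has to be 32. A minimal sketch of the constraint; the guard function and the placeholder derivation are hypothetical, not the extension's actual code:

```typescript
// AES-256-GCM requires a 256-bit (32-byte) key, so Argon2's output
// (hash) length must be configured as 32 bytes.
const AES_GCM_KEY_BYTES = 32

// Hypothetical guard: verify derived key material before handing it
// to the AES-GCM layer.
function assertAesGcmKeyLength(keyMaterial: Uint8Array): Uint8Array {
  if (keyMaterial.length !== AES_GCM_KEY_BYTES) {
    throw new Error(
      `Expected a ${AES_GCM_KEY_BYTES}-byte key, got ${keyMaterial.length} bytes`
    )
  }
  return keyMaterial
}

// Placeholder standing in for the real Argon2 derivation, which would be
// configured with a 32-byte hash length.
const derivedKey = new Uint8Array(AES_GCM_KEY_BYTES)
assertAesGcmKeyLength(derivedKey) // a 16- or 64-byte output would throw
```

A key of any other length would be rejected by Web Crypto's AES-GCM key import anyway; checking at derivation time just surfaces the mismatch earlier.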
Catch errors if vault migration to Argon2 fails and allow continuing with old vaults encrypted with PBKDF2. Log an analytics event when vaults are successfully migrated.
To fix the tests we need to update Jest to the newest version, as it apparently added support for WebAssembly some time ago. There are a bunch of errors after the upgrade.
@@ -265,21 +273,29 @@ export default class InternalSignerService extends BaseService<Events> {
    return true
  }

  const { vaults, version } = await migrateVaultsToArgon(password)
Might be nitpicking here, but I think we could improve our approach if we kept the concern for migrating vaults separate from retrieving them. So rather than returning vaults here, I propose we just return a boolean indicating success or failure and continue using `getEncryptedVaults`.
This should be true internally in the encryption layer, but for the caller it's the opposite: the encryption layer should be completely in charge of making sure that the data is in the latest format, just like the db objects migrate before returning data.
Vault version is basically a constant across a given run of the extension. All retrievals of encrypted data should automatically reencrypt to the latest version. All encryptions should use the latest version. That fact should be transparent to almost everyone. Migrations should not be even a little bit optional (as skipping a function call might make them). This is how we ensure calling code doesn't make weird decisions about not wanting to upgrade, or weird mistakes about sometimes not doing so, etc: we give no one a choice or a hook.
To put this in terms of separation of concerns: the encryption layer is concerned with encrypting and decrypting data, and ensuring to the extent possible that all encrypted data is in the latest format.
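The "migrate transparently on read" pattern described here can be sketched in a few lines. The names (`getEncryptedVaults`, `reencrypt`, the stand-in store) mirror the discussion but the implementation is purely illustrative:

```typescript
// Illustrative sketch of transparent migration inside the encryption layer.
enum VaultVersion {
  PBKDF2 = 1,
  Argon2 = 2,
}
const LATEST_VERSION = VaultVersion.Argon2

type StoredVault = { version: VaultVersion; cipherText: string }

// Stand-in persistent store containing a legacy vault.
let store: StoredVault = { version: VaultVersion.PBKDF2, cipherText: "legacy" }

// Stand-in for decrypt-with-old-scheme + encrypt-with-new-scheme.
function reencrypt(vault: StoredVault): StoredVault {
  return { version: LATEST_VERSION, cipherText: `argon2(${vault.cipherText})` }
}

// Every read upgrades stale vaults before returning them, so callers never
// see (or get to choose) an encryption version.
function getEncryptedVaults(): StoredVault {
  if (store.version !== LATEST_VERSION) {
    store = reencrypt(store)
  }
  return store
}
```

After a single call to `getEncryptedVaults()` the stored vault is on the latest version; because the upgrade happens inside the read path, calling code is given no hook to skip it, which is exactly the point of the comment above.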
Thought about this some more. The gap between this and how the internal signer was already implemented is not huge but it's gnarly. I don't think we should deal with it right now, unfortunately.
The way this would manifest is that we would have `getEncryptedVaults` transparently handle vault migration, and `writeLatestEncryptedVault` would be redone as a `writeLatestVault` that transparently handles encryption and fallbacks.
Again, I think we can push this to later.
Left a few comments. Pondering how we can push more of the migration out of the service core and into the encryption layer... Right now the two are very entwined (really an old decision, nothing to do with the Argon2 stuff other than that it adds complexity) and it's leading to a gnarly `unlock` method that's starting to coordinate too much.
Not sure if it's worth refactoring further right now or not, will try to give more thoughts later today.
@@ -121,6 +126,7 @@ interface Events extends ServiceLifecycleEvents {
  // TODO message was signed
  signedTx: SignedTransaction
  signedData: string
  migratedToArgon2: never
Argon2 is an internal storage detail: it should not escape the service IMO. If we ever want to do a migration that isn't transparent to the user (e.g. because we want to encourage users to upgrade), we can revisit, but right now we should treat it as an implementation detail.
background/main.ts
@@ -1151,6 +1151,12 @@ export default class Main extends BaseService<never> {
  }
})

this.internalSignerService.emitter.on("migratedToArgon2", async () => {
  this.analyticsService.sendOneTimeAnalyticsEvent(
This is one of the few cases where we should just call this directly on the service IMO. Piping analytics through events makes sense when we're observing an existing action, but less so when we want to track an action that is internal to a service.
@@ -149,6 +155,8 @@ const isKeyring = (
export default class InternalSignerService extends BaseService<Events> {
  #cachedKey: SaltedKey | null = null

  #cachedVaultVersion: VaultVersion = VaultVersion.PBKDF2
This should never change, right? All vaults should be argon2 by the time the internal signer service sees them.
Well, initially for our existing users they are `VaultVersion.PBKDF2`; once they are successfully migrated to Argon2 then yeah, this should never change again. But if the migration fails then I want to be able to use the old vaults so these users are not stuck with locked keys; that's why I wanted to keep that info here.
Ah, I see the thinking here. Here's how I think we handle that possibility:
- Every time we unlock, we should try to migrate. The migration should check that once an Argon2 vault is encrypted, it can also be decrypted and produce the same data as the PBKDF2 vault had, before writing the migration to `localStorage`.
- If the migration fails, we should log an analytics event. The unlocking process should continue normally (which will use the vault's current version, PBKDF2 if there was an error with the Argon2 encryption).
- The migration process should tell us if it succeeded, i.e. we should be able to know “the internal signer data is all on the latest version of encryption”—this is just a boolean.
- When we save the vault, we should use the strongest encryption we have (i.e., Argon2). If that fails, we should log an analytics event. At this point, if all vaults are on the latest version of encryption, we should fail the same way we would have failed if PBKDF2 blew up. If all vaults are not on the latest version, we should fall back to PBKDF2. In practice, this means we never want to end up with a newer vault on an older version of encryption than we have used until now—we should treat that as a full system failure.
- After a set amount of time (maybe 6 months?) without errors in analytics, we should consider dropping the PBKDF2 fallbacks for encryption (but never for the migration for decryption).
This is more complex, but it never winds back security, and it doesn't leak encryption implementation details (vault version) outside of the encryption code.
Thoughts?
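The plan above (encrypt with Argon2, verify the round trip reproduces the PBKDF2 plaintext, persist only on success, and report a plain boolean) could look roughly like this; every name and the toy callbacks are assumptions for illustration, not the actual vault code:

```typescript
// Illustrative migration step: only persist the Argon2 vault after proving
// it decrypts back to exactly the data the PBKDF2 vault held.
type MigrationResult = { success: boolean }

function migrateVault(
  plaintext: string,
  encryptArgon2: (data: string) => string,
  decryptArgon2: (cipher: string) => string,
  persist: (cipher: string) => void
): MigrationResult {
  try {
    const cipher = encryptArgon2(plaintext)
    // Round-trip check before anything is written to storage.
    if (decryptArgon2(cipher) !== plaintext) {
      return { success: false }
    }
    persist(cipher)
    return { success: true }
  } catch {
    // On any failure the PBKDF2 vault stays in place and unlocking
    // continues normally; the caller can log an analytics event.
    return { success: false }
  }
}

// Toy reversible "cipher" for demonstration only.
const saved: string[] = []
const ok = migrateVault(
  "vault-data",
  (d) => `enc:${d}`,
  (c) => c.slice(4),
  (c) => saved.push(c)
)
```

The important property is ordering: `persist` runs only after the decrypt check, so a failed or corrupted Argon2 round trip can never clobber the working PBKDF2 vault.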
Sounds good, let me try to implement it and see how it will go.
Implemented:
- on every unlock we try to migrate
- test if the vault encrypted with Argon produces the same data as before the migration
- log analytics event on migration fail
- migration process should return a boolean if it was successful
What is not clear to me:
> When we save the vault, we should use the strongest encryption we have (i.e., Argon2)....
So this point, if I understand correctly what you want to achieve here, is something I would rather avoid putting into code, because it should never be the case that we are writing a new vault encrypted with Argon2 while old vaults are on PBKDF2.
That is because to save a new vault we have to have the service unlocked, and we try to migrate during unlocking. If the migration fails then most likely writing a new vault with Argon2 will fail as well. I would probably go with saving the current encryption algorithm in the service, as is implemented right now, and then we can use it to add new vaults without adding more complexity 🤔
A couple of notes on the reasoning, but ultimately I think we're good here:
> it should never be the case that we are writing a new vault encrypted with Argon2 while old vaults are on PBKDF2.
I agree this isn't ideal, but it is strictly better than having all vaults on PBKDF2. The goal of having the version alongside the vault is that we can have mixed versions if absolutely necessary.
> If the migration fails then most likely writing a new vault with Argon2 will fail as well.
I don't know why Argon2 would fail, so there's nothing to indicate to me whether such a failure would be transient or not. It could fail during migration due to issues loading wasm, but it could also run into transient resource limitations; again, it's unclear to me.
All that said: you are right! The piece I forgot about here is that we (correctly) don't keep the password around after the initial `unlock`. This means if the initial `unlock` fails to use Argon2 to derive the key, and we want to try to encrypt with an Argon2-derived key later… we would have to keep the password around/cached instead of just the key. A definite no! So let's put this one to rest.
- allowed Jest to fetch WebAssembly files
- moved `crypto.subtle` mock to global setup
- for Jest to work with WebAssembly we need to update to the next major version
- to support dependencies for the new Jest version we need to bump TypeScript as well
- let's fix problems found by the new TypeScript version
Allow destructuring objects to remove unwanted fields from the objects. This is a pattern we use often across the codebase.
@@ -10,6 +11,11 @@ export type EncryptedVault = {
  cipherText: string
}

export enum VaultVersion {
Re: my note about not leaking vault version information: that basically means we should try to be able to not export this.
I see, but if we need to know and save (in the service) which version we are on, then we need this exported, right?
Ultimately discussed in #3502 (comment), since what I was pushing for is precisely not saving in the service which version we're on—but the need for that has been clarified.
- return `success` boolean
- make sure decrypted vaults match
- send event on migration fail
Handful of final tidbits here:
- Let's rename to `migrateVaultsToLatestVersion`.
- Let's return the error message from the exception if there is a failure.
- Let's bubble the error message to the migration failure analytics event.
- Let's update the analytics event to `VAULT_MIGRATION` alongside the above. `MIGRATION_FAILED` shouldn't be a one-time event; we should see it every time a migration fails. It'll help us better understand whether we're seeing transient or permanent failures.
- Last but not least: I think the `success` flag should actually be a `migrated` flag, which should be `true` if the vaults (a) needed migration and (b) succeeded at migrating. I think the error message should be the true indicator that there was an error. So we would send the migration event `if (migrated)`, and send the failure event `if (error !== undefined)`. Thoughts?

Pushed the above changes to accelerate a little; we can always walk the commit back if necessary. The main thing I noticed while adding a test for the error messages is that (and this was true before my changes) if we track migration failures, we track migration failures due to incorrect passwords as well as ones that might be due to an internal technical issue. There are a few ways we could mitigate this, but my instinctive reaction is that we roll it out as-is and see how it goes.
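The `migrated`/`error` contract proposed in the last bullet fits in a few lines. The event names come from the comment above, while the outcome type and the `eventsFor` helper are hypothetical:

```typescript
// Illustrative outcome type: `migrated` is true only when vaults both needed
// migration and succeeded; `error` carries the failure message, if any.
type MigrationOutcome = { migrated: boolean; error?: string }

// Decide which analytics events to send for a given outcome.
function eventsFor(outcome: MigrationOutcome): string[] {
  const events: string[] = []
  if (outcome.migrated) {
    events.push("VAULT_MIGRATION")
  }
  if (outcome.error !== undefined) {
    events.push("MIGRATION_FAILED")
  }
  return events
}
```

Under this contract, `eventsFor({ migrated: true })` yields only the migration event, `eventsFor({ migrated: false, error: "..." })` yields only the (repeatable) failure event, and already-migrated vaults (`{ migrated: false }`) produce no events at all.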
Vault migration is no longer tracked as Argon2 specifically, but generically for all migrations. Already-migrated vaults are not tracked, and the migration function return value reflects that no migration was performed. Additionally, error messages are bubbled out of the migration function and reported up to the caller. The main outcome here is that PostHog migration events include the migrated-to version, and PostHog migration failure events include the error message. This will leave us open to future migrations, and will let us know if there are certain failures that are happening broadly that we may be able to do something about. Notably, wrong passwords will be tracked as migration errors if a wrong password is typed with an older vault version in the mix. Mitigating this may or may not be a good idea.
Force-pushed from cfa620e to d0e2908.
ugh 😞 so what exactly do we want to measure with these events? I think info on how many users are migrated is the most important one anyway, so we are good here.
All righty, good to go here. Since I pushed some code, going to let @jagodarybacka do a final sanity check and then merge.
Secondarily, I think it'll be useful to know what we need to look at if users aren't being migrated, or at least how many migration failures we're seeing. I'm comfortable shipping without being sure we can get that info, though.
QA went fine, let's 🚢
## What's Changed
* Add private key onboarding flow by @jagodarybacka in #3119
* Private key JSON import by @jagodarybacka in #3177
* Allow export of private keys and mnemonics by @jagodarybacka in #3248
* Export private key form by @jagodarybacka in #3255
* Unlock screen for the account backup by @kkosiorowska in #3257
* Show mnemonic menu by @jagodarybacka in #3259
* Fix background blur issue by @jagodarybacka in #3265
* Account backup UI fixes by @jagodarybacka in #3270
* Fix unhiding removed accounts by @jagodarybacka in #3282
* New error for incorrectly decrypted JSON file by @jagodarybacka in #3293
* Export private keys from HD wallet addresses by @jagodarybacka in #3253
* Refactor keyring redux slice to remove `importing` field by @jagodarybacka in #3309
* 📚 Accounts backup by @kkosiorowska in #3252
* Catch Enter keypress on Unlock screen by @jagodarybacka in #3355
* Rename `keyring` to `internal signer` and other improvements by @jagodarybacka in #3331
* 🗝 QA - Accounts backup and private key import by @jagodarybacka in #3266
* Remove private key signers if they are replaced by accounts from HD wallet by @jagodarybacka in #3377
* RFB 4: One-Off Keyring Design by @Shadowfiend in #3372
* Copy to clipboard warning by @kkosiorowska in #3488
* Allow setting custom auto-lock timer by @hyphenized in #3477
* Use Argon2 for encrypted vaults by @jagodarybacka in #3502
* 👑 Private keys import and accounts backup by @jagodarybacka in #3089
* Untrusted assets should not block the addition of custom tokens by @kkosiorowska in #3491
* Flip updated dApp connections flag by @Shadowfiend in #3492
* v0.41.0 by @Shadowfiend in #3531
* Switch to a given network if adding a network that is already added by @0xDaedalus in #3154
* Remove waiting for Loading Doggo component in E2E tests by @jagodarybacka in #3541
* Squeeze content to better fit on Swaps page by @jagodarybacka in #3542
* Refactor of terms for verified/unverified assets by @kkosiorowska in #3528
* Fix ChainList styling by @fulldecent in #3547
* Update release checklist by @jagodarybacka in #3548
* Fix custom asset price fetching by @hyphenized in #3508
* Sticky Defaults: Make Taho-as-default replace MetaMask in almost all cases by @Shadowfiend in #3546

## New Contributors
* @fulldecent made their first contribution in #3547

**Full Changelog**: v0.41.0...v0.42.0

Latest build: [extension-builds-3549](https://github.com/tahowallet/extension/suites/14268975651/artifacts/801826435) (as of Thu, 13 Jul 2023 09:51:56 GMT).
Resolves #3470
What
Let's use Argon2 instead of PBKDF2 🔑
What was already done:
- config added to `content_security_policy` to allow WebAssembly - without `'wasm-eval'` we are not able to use the Argon2 implementation in the extension

Testing

On `main`, add some HD wallets; checkout this branch, reload and unlock the wallet; make sure you don't see the error about a failed migration in the background console; check if the analytics event has been emitted; lock and unlock more than one time.

Latest build: extension-builds-3502 (as of Sun, 02 Jul 2023 21:28:25 GMT).