I've set out to understand how Bitwarden keeps secrets, with an end goal of decrypting them. I am working with my secrets, and I obviously know my passphrase - this is not a demonstration of a weakness or vulnerability. Additionally, I am not a cryptographic expert, so this should not be considered a review for strength or integrity.
You can get hold of the encrypted secrets with suitable access to the MSSQL server, and the clients hold the secrets in the same state. For this reason, I am reviewing the command line interface sources: decryption happens in the client, so the server sources have nothing to offer on this front.
First of all, we have to understand what a Bitwarden secret looks like. Bitwarden refers to one as a "CipherString".
Fundamentally, a `CipherString` contains the following information:
- `encType` - the encryption type used for this secret
  - A numeric ASCII representation of the `EncryptionType` enum
- `ciphertext` - the encrypted payload
- `iv` - the initialization vector (optional)
- `mac` - the message authentication code (optional)
The `ciphertext`, `iv` and `mac` fields are encoded using base64.
All of my secrets have all of these fields present, though I imagine that secrets produced by an older client may not.
The `encType` field is separated from the rest of the string using a period (`.`), while the others are separated from each other using a pipe (`|`).
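The splitting described above can be sketched in a few lines of Python. This is a minimal illustration rather than Bitwarden's own parser: the `CipherString` tuple and `parse_cipher_string` names are mine, and the field order `iv|ciphertext|mac` is an assumption based on what I see in my own data.

```python
import base64
from typing import NamedTuple, Optional


class CipherString(NamedTuple):
    enc_type: int
    iv: Optional[bytes]
    ciphertext: bytes
    mac: Optional[bytes]


def parse_cipher_string(text: str) -> CipherString:
    """Split a CipherString into its decoded fields."""
    # encType is everything before the first period.
    enc_type, _, rest = text.partition('.')
    parts = [base64.b64decode(part) for part in rest.split('|')]
    # Assumption: fields appear as iv|ciphertext|mac, and a lone field
    # is the ciphertext on its own.
    if len(parts) == 1:
        return CipherString(int(enc_type), None, parts[0], None)
    mac = parts[2] if len(parts) > 2 else None
    return CipherString(int(enc_type), parts[0], parts[1], mac)
```

Calling `parse_cipher_string(...)` on a value taken from `data.json` should give a 16-byte `iv` and a 32-byte `mac` if the assumed order holds for your data.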
To get an example `CipherString`, run:
jq -r '."ciphers_\(.userId)" | to_entries | .[0].value.name' < "${HOME}/.config/Bitwarden CLI/data.json"
The user's encryption and message authentication keys are also stored in this format:
jq -r '.encKey' < "${HOME}/.config/Bitwarden CLI/data.json"
The command line client also stores sensitive run-time data in the JSON datastore using a `__PROTECTED__` prefix (for example `__PROTECTED__key`).
jq -r '.__PROTECTED__key' < "${HOME}/.config/Bitwarden CLI/data.json"
This data is the same as the secrets described above, but for some reason has been stored as a base64-encoded blob with the fields shuffled.
Sensitive data stored in this format can be decrypted using keys provided in the `${BW_SESSION}` variable.
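To make the "shuffled" blob concrete, here is a minimal sketch of unpacking it. The layout used (a 1-byte `encType`, then a 16-byte IV, a 32-byte MAC, and finally the ciphertext) is an assumption from my reading of the client and my own data, and `unpack_protected` is a name I made up for illustration.

```python
import base64


def unpack_protected(blob_b64: str):
    """Split a __PROTECTED__ blob into (enc_type, iv, mac, ciphertext)."""
    raw = base64.b64decode(blob_b64)
    # Assumed layout: 1-byte encType | 16-byte iv | 32-byte mac | ciphertext
    if len(raw) < 1 + 16 + 32 + 16:
        raise ValueError('blob is too short for the assumed layout')
    enc_type = raw[0]
    iv = raw[1:17]
    mac = raw[17:49]
    ciphertext = raw[49:]
    return enc_type, iv, mac, ciphertext
```

A quick sanity check on real data: if the assumed layout is right, the ciphertext length should be a multiple of 16 (the AES block size).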
Before we can go any further, we must produce two keys - the "Encryption Key", and the "Message Authentication Key".
!!! bug At this point, I'd like to voice my opinion that the Bitwarden sources are quite tangled and overly complex. This isn't a problem in itself, though it does open up the possibility of mishandling data when passing things around the application. It took me quite some time to produce a functional model of the procedure, and this wasn't helped by the naming scheme... There are a number of "keys" coming up, and they do not have clear, unambiguous names in the Bitwarden sources. My first attempt at this a few months ago failed, largely because of this. I'll try to keep them clear here.
This write-up doesn't cover generating and encrypting the keys to begin with, but does outline the steps required to reproduce the keys.
There are two methods to produce the "Source Key", which is used to derive the keys that provide access to the "Encryption Key" and "Message Authentication Key":
- Using `__PROTECTED__key` (see above) and the `${BW_SESSION}` environment variable
  - These are set up / provided when you run `bw unlock`, and removed when you run `bw lock`
- Using the master password, the user's email, and the appropriate Key Derivation Function (KDF)
The following diagram omits verification for brevity.
!!! tip Before attempting to follow this, you must run `bw unlock`, and have both the `__PROTECTED__key` entry and the `${BW_SESSION}` variable available.
The `${BW_SESSION}` variable is a 64-byte value, encoded using base64. It holds the intermediate keys that allow access to the data in `__PROTECTED__key`.
The first 32 bytes are the encryption key, and the last 32 bytes are the message authentication key.
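Splitting the session value is straightforward. A minimal sketch, where `split_session` is a hypothetical helper name of my own:

```python
import base64


def split_session(bw_session: str):
    """Return (encryption_key, mac_key) from a base64-encoded BW_SESSION value."""
    raw = base64.b64decode(bw_session)
    if len(raw) != 64:
        raise ValueError('expected 64 bytes, got %d' % len(raw))
    # First half encrypts, second half authenticates.
    return raw[:32], raw[32:]
```

In practice the argument would come from the environment, for example `split_session(os.environ['BW_SESSION'])`.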
To verify the session, concatenate the `iv` and `ciphertext` (in that order), and feed the result through HMAC-SHA-256, keyed with the message authentication key.
The output should match the 32-byte message authentication code held in `__PROTECTED__key`.
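The check above can be sketched with the Python standard library alone. The function name `mac_is_valid` is mine; the comparison uses `hmac.compare_digest`, which is a constant-time comparison and a sensible habit when comparing MACs.

```python
import hashlib
import hmac


def mac_is_valid(mac_key: bytes, iv: bytes, ciphertext: bytes,
                 expected_mac: bytes) -> bool:
    """Recompute HMAC-SHA-256 over iv + ciphertext and compare it to the stored MAC."""
    computed = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return hmac.compare_digest(computed, expected_mac)
```

If this returns `False`, either the session key does not belong to this `__PROTECTED__key`, or the fields have been sliced incorrectly.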
To decrypt the "Source Key", take the `iv` and `ciphertext` from `__PROTECTED__key`, and feed them into AES-256 in CBC mode, using the encryption key from `${BW_SESSION}`.
Don't forget to discard the padding (PKCS#7) from the end of the decrypted output.
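A minimal sketch of the decryption step follows. It assumes the padding is PKCS#7 and that the third-party `cryptography` package (version 3.1 or later, where the backend argument is optional) is installed for the AES part; the helper names are my own. The padding helper is written by hand so that the "discard the padding" step is explicit.

```python
def strip_pkcs7(padded: bytes, block_size: int = 16) -> bytes:
    """Remove PKCS#7 padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % block_size:
        raise ValueError('data must be a non-empty multiple of the block size')
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError('invalid PKCS#7 padding length')
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError('invalid PKCS#7 padding bytes')
    return padded[:-pad_len]


def decrypt_aes_cbc(enc_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """AES-256-CBC decrypt, then strip the padding."""
    # Imported lazily so strip_pkcs7 stays usable without the third-party package.
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    decryptor = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).decryptor()
    return strip_pkcs7(decryptor.update(ciphertext) + decryptor.finalize())
```

Only decrypt after the MAC has been verified; a padding error on unauthenticated data tells you nothing useful.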
If you have access to the user's master password and email instead, you can produce the "Source Key" from those using PBKDF2 with SHA-256.
In this situation, the user's email address is used as the salt, and their master password is used as the passphrase. The iteration count is stored in the user's data (`kdfIterations`); it defaults to 100,000, but is configurable.
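The derivation can be sketched with `hashlib.pbkdf2_hmac` from the standard library. Two details here are assumptions of mine rather than something stated above: that the email is trimmed and lower-cased before being used as the salt, and that the output length is 32 bytes. `derive_source_key` is a hypothetical name.

```python
import hashlib


def derive_source_key(master_password: str, email: str,
                      iterations: int = 100_000) -> bytes:
    """Derive the "Source Key" using PBKDF2 with HMAC-SHA-256."""
    # Assumption: the email is normalised (trimmed, lower-cased) before use.
    salt = email.strip().lower().encode('utf-8')
    password = master_password.encode('utf-8')
    # Assumption: a 32-byte (256-bit) key is requested.
    return hashlib.pbkdf2_hmac('sha256', password, salt, iterations, dklen=32)
```

Pass the `kdfIterations` value from your own `data.json` rather than relying on the default.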
Once you've got the "Source Key", you must derive the intermediate encryption and message authentication keys. Both are derived using HKDF-Expand with SHA-256 and an output length of 32 bytes.
- Intermediate encryption key - using an "info" input of `b'enc'`
- Intermediate message authentication key - using an "info" input of `b'mac'`
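HKDF-Expand is small enough to write out by hand with the standard library, which also makes it clear that only the Expand step (as defined in RFC 5869) is involved here. This is a sketch; `hkdf_expand` and `stretch_source_key` are names of my own choosing.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand (RFC 5869) using HMAC-SHA-256."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError('requested length is too large')
    okm, block, counter = b'', b'', 1
    while len(okm) < length:
        # T(n) = HMAC(PRK, T(n-1) | info | n)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def stretch_source_key(source_key: bytes):
    """Return the intermediate (encryption_key, mac_key) pair."""
    return hkdf_expand(source_key, b'enc'), hkdf_expand(source_key, b'mac')
```

With a 32-byte output and SHA-256, each key is a single HMAC invocation over the "info" value followed by the byte `0x01`.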
Using the intermediate keys produced above, it is possible to decrypt the user's actual keys - both for encryption and message authentication.
The secret (a standard `CipherString`) is stored in the `.encKey` entry of `~/.config/Bitwarden CLI/data.json`.
The procedure is exactly the same as for all other secrets, but uses the intermediate keys instead of the user's final keys.
The decrypted value should be 64 bytes in length: a 32-byte encryption key, followed by a 32-byte message authentication key.
To verify a secret, use the appropriate message authentication key, along with the other elements of the `CipherString`.
As before, the IV and ciphertext should be concatenated.
Finally, to decrypt a secret, use the appropriate encryption key, along with the other elements of the `CipherString`.
I have produced a Python library to aid in decrypting Bitwarden secrets, available on GitHub.
An example utility to decrypt the name of the first entry is given below:
from getpass import getpass

from bitwarden.crypto_engine import CryptoEngine
from bitwarden.user_key import UserKey
from bitwarden.util import load_user_data

user_data = load_user_data()

# gather the inputs needed to derive the keys
email = user_data['userEmail'].encode('utf-8')
kdf = user_data['kdf']
kdf_iterations = user_data['kdfIterations']
master_password = getpass().encode('utf-8')

# grab a secret: the encrypted name of the first entry
ciphers_key = 'ciphers_%s' % user_data['userId']
first_cipher = next(iter(user_data[ciphers_key].values()))
encrypted_name = first_cipher['name']

# produce the encryption key
uk = UserKey(email, master_password, kdf, kdf_iterations)
encryption_key = uk.user_key

# decrypt the secret
ce = CryptoEngine(encryption_key, user_data['encKey'])
plaintext = ce.decrypt(encrypted_name)
print(plaintext.decode('utf-8'))