The encryption key is derived from the user password and salt using the PBKDF2 (Password-Based Key Derivation Function 2) algorithm. Here's how it works:
-
Password and Salt: The user's password and a randomly generated salt are input to the PBKDF2HMAC function.
- The salt is a random string of bytes, ensuring that even if two users have the same password, their keys will still be different.
-
Key Derivation:
- The PBKDF2 algorithm is used to generate a cryptographic key based on the password and salt.
- SHA-256 is used as the hashing algorithm within PBKDF2.
- The function applies multiple iterations (default is 100,000) of the hash function to make it more computationally expensive and harder to brute-force.
-
Generated Key: The output of the PBKDF2HMAC function is the derived cryptographic key, which is used for encryption and decryption operations (in this case, AES encryption).
This key is not the same as the user password, but it is derived from it, and it will be used to encrypt and decrypt data securely.
kdf = PBKDF2HMAC(
algorithm=hashes.SHA256(), # SHA-256 hash function
length=32, # The length of the derived key (256 bits for AES-256)
salt=salt, # Salt used to make the key derivation unique
iterations=iterations, # Number of iterations to make key derivation slow and secure
backend=default_backend() # Cryptographic backend to use for operations
)
# Derive the key (this is a 256-bit AES key)
derived_key = kdf.derive(password.encode()) # The password is encoded to bytes
In this code:
- The salt ensures that the same password will result in a different key each time it's used.
- The iterations make it computationally expensive to generate the key, protecting against brute-force attacks.
The derived key will then be used with AES encryption (AES-GCM in this case) for encrypting and decrypting data securely.
Even though the salt is random each time, you store it alongside the encrypted data. That way, when the user wants to decrypt something, you:
- Retrieve the stored salt (which was used during encryption),
- Use the userβs password + that specific salt to re-derive the same key,
- Decrypt the data with that key.
For each message, you're doing:
salt = os.urandom(16)
key = derive_key(password, salt)
ciphertext = encrypt(data, key)
store(salt, ciphertext)
Then during decryption:
salt = load_stored_salt()
key = derive_key(password, salt)
data = decrypt(ciphertext, key)
You donβt reuse the salt β you just store it per message or file. That way, even if every piece of data has a different salt (and thus different keys), each one can still be decrypted because its salt is known.
- Salt = public
- Password = secret
- Key = derived from both
- Salt must be saved with the data for decryption
Yes, you're right in observing that a new key is generated each time you use a new salt.
Let me clarify how it works:
Every time you do this:
key = PBKDF2HMAC(..., salt=salt, ...).derive(password)
The key is derived from both the password and the salt.
- The salt prevents precomputed attacks (like rainbow tables).
- Even if two users use the same password, different salts β different keys.
- It makes brute-force attacks much harder.
β οΈ But yes β if you use a different salt next time, the same password will produce a different key.
- When you create a key, generate a random salt.
- Then store the salt securely alongside the encrypted data (e.g., in a database, or even prepended to the ciphertext).
- Later, when decrypting, retrieve the same salt and use it again with the same password to regenerate the same key.
That way:
same password + same salt β same key β successful decryption β
But:
same password + different salt β different key β decryption fails β
- Yes, using a different salt each time results in a different key, even with the same password.
- This is by design β and why the salt must be stored if you want to reproduce the key later.
Would you like a clear code example that shows encrypting with a salt and saving it, then decrypting using the saved salt?
Great catch β and you're absolutely right to be curious. Let's walk through this carefully:
There are two types of AES encryption modes to consider here:
AES Mode | Requires Manual Padding? | Authenticated (Integrity-Checked)? |
---|---|---|
ECB / CBC | β Yes (must pad manually) | β No |
GCM / CCM / EAX | β No padding needed | β Yes |
You're using AES-GCM (AESGCM
class from cryptography.hazmat.primitives.ciphers.aead
). This is a stream cipher mode, and it does not require padding. Here's why:
AES-GCM handles plaintext of any length, because it encrypts in chunks internally β similar to a stream cipher. It does not expect the input to be block-aligned (like 16, 32, 48 bytes etc.).
So this line works just fine:
ciphertext = aesgcm.encrypt(iv, plaintext.encode(), None)
Even if plaintext.encode()
is 7 bytes, 23 bytes, or 75 bytes β no padding needed.
If you had used CBC mode like:
cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
Then you must pad manually, otherwise youβll get:
ValueError: The length of the provided data is not a multiple of the block length.
- Your original code uses AES-GCM, which doesn't require padding.
- That's why you could encrypt any-length string and it still worked.
- If you switched to ECB or CBC, you'd need to manually pad using
Crypto.Util.Padding
or similar.
Great question.
When we say an encryption mode is integrity-checked, we mean:
It verifies during decryption that the data hasnβt been modified, corrupted, or tampered with.
This is a crucial security feature, especially if encrypted data is sent over a network or stored where someone might try to alter it.
- If you just encrypt data (like with AES-CBC), it protects confidentiality (the attacker canβt read it).
- But if someone modifies even 1 byte of the ciphertext (or IV), CBC will decrypt garbage β and you wonβt know it.
- With AES-GCM, decryption fails completely if the ciphertext or IV is tampered with. It detects and rejects corrupted data.
This is what makes AES-GCM an "authenticated encryption mode" (AEAD).
Letβs say we encrypt "The eagle has landed."
with AES-GCM and later someone modifies the encrypted text.
# When decrypting with AES-GCM
plaintext = aesgcm.decrypt(iv, tampered_ciphertext, None)
You'll get:
cryptography.exceptions.InvalidTag: Authentication tag mismatch
This error tells you something was altered β and the system refuses to give you any decrypted data.
Mode | Keeps Data Secret? | Detects Tampering? |
---|---|---|
AES-CBC | β Yes | β No |
AES-GCM | β Yes | β Yes |
If you're storing user conversations or sensitive data, always prefer AES-GCM or another AEAD (authenticated encryption with associated data) mode.
Install the required library:
pip install cryptography