Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
1 Skipped Deployment
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughCentralizes encryption by adding an encryption_utils module (DEK cache, get_encryption_key, encrypted_type, cache reset), and refactors many ORM models and utilities to use encrypted_type()/get_encryption_key() instead of inline StringEncryptedType/AesGcmEngine/CONFIG usage. Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(200,200,255,0.5)
participant App as Application / ORM usage
end
rect rgba(200,255,200,0.5)
participant Model as ORM Model / Column
end
rect rgba(255,200,200,0.5)
participant EncUtil as encryption_utils.encrypted_type
end
rect rgba(255,255,200,0.5)
participant KeyProv as encryption_utils.get_encryption_key
end
App->>Model: read/write encrypted column
Model->>EncUtil: delegate (encrypted_type) for encrypt/decrypt
EncUtil->>KeyProv: call to obtain DEK (callable key)
KeyProv-->>EncUtil: return DEK
EncUtil-->>Model: perform AES-GCM encrypt/decrypt using DEK
Model-->>App: return plaintext / persist ciphertext
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
Greptile SummaryThis PR is a clean, mechanical refactor that centralises all Key observations:
Confidence Score: 4/5
Important Files Changed
Last reviewed commit: 2270927 |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/api/db/test_encryption_utils.py (1)
71-74: Consider adding a test for padding configuration.The
encrypted_type()factory setspadding="pkcs5". For completeness, you might want to verify this configuration is correctly applied, similar to the engine test.💡 Optional test for padding
def test_uses_pkcs5_padding(self): """Padding should be pkcs5.""" col_type = encrypted_type() assert col_type.padding == "pkcs5"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/api/db/test_encryption_utils.py` around lines 71 - 74, Add an assertion to verify the padding configuration on the produced ColumnType: call encrypted_type() in the test (same place as test_uses_aesgcm_engine) and assert that the returned object's padding attribute equals "pkcs5" (e.g., within a new test_uses_pkcs5_padding or by extending test_uses_aesgcm_engine to include this check) so the factory's padding setting is explicitly validated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fides/api/db/encryption_utils.py`:
- Around line 12-28: The current implementation freezes the DEK by caching
CONFIG.security.app_encryption_key in the module-level _cached_dek inside
get_encryption_key(), causing mid-process rotations to be ignored; remove the
global caching so get_encryption_key() always resolves and returns
CONFIG.security.app_encryption_key at call time (and delete or repurpose
_cached_dek and _reset_encryption_key_cache), or alternatively move caching
responsibility into the ORM type factory only (so only that factory uses a local
cached value) while keeping get_encryption_key() uncached for direct callers
like aes_gcm_encryption_util.py and webhook.py.
---
Nitpick comments:
In `@tests/api/db/test_encryption_utils.py`:
- Around line 71-74: Add an assertion to verify the padding configuration on the
produced ColumnType: call encrypted_type() in the test (same place as
test_uses_aesgcm_engine) and assert that the returned object's padding attribute
equals "pkcs5" (e.g., within a new test_uses_pkcs5_padding or by extending
test_uses_aesgcm_engine to include this check) so the factory's padding setting
is explicitly validated.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0b4e9c9a-1c2b-4fd9-b6fd-831ea1ae0247
📒 Files selected for processing (23)
changelog/7588-encrypted-type-factory.yamlsrc/fides/api/db/encryption_utils.pysrc/fides/api/db/util.pysrc/fides/api/models/application_config.pysrc/fides/api/models/chat_config.pysrc/fides/api/models/connection_oauth_credentials.pysrc/fides/api/models/connectionconfig.pysrc/fides/api/models/fides_user.pysrc/fides/api/models/identity_salt.pysrc/fides/api/models/masking_secret.pysrc/fides/api/models/messaging.pysrc/fides/api/models/openid_provider.pysrc/fides/api/models/policy/policy.pysrc/fides/api/models/privacy_preference.pysrc/fides/api/models/privacy_request/privacy_request.pysrc/fides/api/models/privacy_request/provided_identity.pysrc/fides/api/models/privacy_request/request_task.pysrc/fides/api/models/privacy_request/webhook.pysrc/fides/api/models/sql_models.pysrc/fides/api/models/storage.pysrc/fides/api/util/consent_encryption_migration.pysrc/fides/api/util/encryption/aes_gcm_encryption_util.pytests/api/db/test_encryption_utils.py
| _cached_dek: str | None = None | ||
|
|
||
|
|
||
| def get_encryption_key() -> str: | ||
| """Return the DEK. In legacy mode, this is CONFIG.security.app_encryption_key. | ||
| Cached for the lifetime of the process (called on every encrypt/decrypt).""" | ||
| global _cached_dek | ||
| if _cached_dek is not None: | ||
| return _cached_dek | ||
| _cached_dek = CONFIG.security.app_encryption_key | ||
| return _cached_dek | ||
|
|
||
|
|
||
| def _reset_encryption_key_cache() -> None: | ||
| """Reset the cached DEK. For testing only.""" | ||
| global _cached_dek | ||
| _cached_dek = None |
There was a problem hiding this comment.
Don't freeze the encryption key after first use.
get_encryption_key() caches CONFIG.security.app_encryption_key in _cached_dek, so after the first encrypt/decrypt in a worker the key is no longer resolved at use time. That undermines the callable-key goal of this refactor and also changes the new direct callers in src/fides/api/util/encryption/aes_gcm_encryption_util.py and src/fides/api/models/privacy_request/webhook.py: a mid-process key override/rotation will keep using the stale value until restart. Please either remove the cache here or scope the cached path to the ORM type factory only.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/fides/api/db/encryption_utils.py` around lines 12 - 28, The current
implementation freezes the DEK by caching CONFIG.security.app_encryption_key in
the module-level _cached_dek inside get_encryption_key(), causing mid-process
rotations to be ignored; remove the global caching so get_encryption_key()
always resolves and returns CONFIG.security.app_encryption_key at call time (and
delete or repurpose _cached_dek and _reset_encryption_key_cache), or
alternatively move caching responsibility into the ORM type factory only (so
only that factory uses a local cached value) while keeping get_encryption_key()
uncached for direct callers like aes_gcm_encryption_util.py and webhook.py.
JadeCara
left a comment
There was a problem hiding this comment.
🔑 One Key At A Time
thirty-six scattered
keys converge to one factory
callable, not called
Overall looks really solid! The only test gap I saw was maybe on get_encryption_key if config key is empty. The cache guard if _cached_dek is not None means an empty string "" would be cached and reused. This is probably fine (CONFIG validation should prevent it), but a test asserting behavior with an empty/blank key would make the contract explicit — especially since this is the single source of truth for encryption keys going forward.
| Return the encryption key set in CONFIG.security.app_encryption_key. | ||
| Cached for the lifetime of the process (called on every encrypt/decrypt). | ||
|
|
||
| TODO: implement support for other key managers like AWS KMS |
There was a problem hiding this comment.
Everytime I see a TODO (sorry) - I have to ask, do we have this ticketed?
There was a problem hiding this comment.
yes we do! we have a whole epic for the encryption changes :)
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fides/api/db/encryption_utils.py`:
- Around line 23-33: The caching logic in encryption_utils.py incorrectly
conflates "not cached yet" with a cached value of None/empty; change the
sentinel for _cached_dek to a unique object (e.g., _UNSET = object()) and
initialize _cached_dek = _UNSET, then in the getter check "if _cached_dek is not
_UNSET: return _cached_dek", set _cached_dek =
CONFIG.security.app_encryption_key (keeping the global declaration), and update
the warning to trigger when the cached value is falsy (e.g., if _cached_dek in
(None, "") or if not _cached_dek) so a None config is cached once and the
warning is emitted.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 434a4b37-adce-4608-a742-5db203569019
📒 Files selected for processing (1)
src/fides/api/db/encryption_utils.py
| global _cached_dek | ||
| if _cached_dek is not None: | ||
| return _cached_dek | ||
|
|
||
| _cached_dek = CONFIG.security.app_encryption_key | ||
| if _cached_dek == "": | ||
| logger.warning( | ||
| "App encryption key is empty, this may lead to unexpected issues" | ||
| ) | ||
|
|
||
| return _cached_dek |
There was a problem hiding this comment.
Cache is ineffective and warning is skipped when encryption key is None.
If CONFIG.security.app_encryption_key returns None:
- The
is not Nonecheck on line 24 passes (because_cached_dekstarts asNone) _cached_dekgets assignedNonefrom config- The warning check
== ""doesn't matchNone - On subsequent calls, line 24's check fails again since
None is not NoneisFalse
This causes repeated config reads and no warning for a None key. Consider using a sentinel value to distinguish "not yet cached" from "cached as None/empty".
🔧 Proposed fix using a sentinel value
+_NOT_SET = object() # sentinel to distinguish "not cached" from "cached as None"
-_cached_dek: str | None = None
+_cached_dek: str | None | object = _NOT_SET
def get_encryption_key() -> str:
"""
Return the encryption key set in CONFIG.security.app_encryption_key.
Cached for the lifetime of the process (called on every encrypt/decrypt).
TODO: implement support for other key managers like AWS KMS
"""
global _cached_dek
- if _cached_dek is not None:
+ if _cached_dek is not _NOT_SET:
return _cached_dek
_cached_dek = CONFIG.security.app_encryption_key
- if _cached_dek == "":
+ if not _cached_dek: # catches both None and empty string
logger.warning(
"App encryption key is empty, this may lead to unexpected issues"
)
return _cached_dek🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/fides/api/db/encryption_utils.py` around lines 23 - 33, The caching logic
in encryption_utils.py incorrectly conflates "not cached yet" with a cached
value of None/empty; change the sentinel for _cached_dek to a unique object
(e.g., _UNSET = object()) and initialize _cached_dek = _UNSET, then in the
getter check "if _cached_dek is not _UNSET: return _cached_dek", set _cached_dek
= CONFIG.security.app_encryption_key (keeping the global declaration), and
update the warning to trigger when the cached value is falsy (e.g., if
_cached_dek in (None, "") or if not _cached_dek) so a None config is cached once
and the warning is emitted.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/fides/api/db/encryption_utils.py (1)
13-33:⚠️ Potential issue | 🟡 MinorCache is ineffective when encryption key is
None.The caching logic cannot distinguish between "not cached yet" and "cached as
None". IfCONFIG.security.app_encryption_keyreturnsNone:
_cached_dek is not None(line 24) evaluates toFalse_cached_dekgets assignedNonefrom config- On subsequent calls, line 24's check still fails → repeated config reads
Additionally, the return type annotation
-> stris incorrect since the function can returnNone.🔧 Proposed fix using a sentinel value
+_NOT_SET = object() # sentinel to distinguish "not cached" from "cached as None" -_cached_dek: str | None = None +_cached_dek: str | None | object = _NOT_SET def get_encryption_key() -> str | None: """ Return the encryption key set in CONFIG.security.app_encryption_key. Cached for the lifetime of the process (called on every encrypt/decrypt). TODO: implement support for other key managers like AWS KMS """ global _cached_dek - if _cached_dek is not None: + if _cached_dek is not _NOT_SET: return _cached_dek _cached_dek = CONFIG.security.app_encryption_key if not _cached_dek: logger.warning( "App encryption key is empty, this may lead to unexpected issues" ) return _cached_dek🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fides/api/db/encryption_utils.py` around lines 13 - 33, The cache logic in get_encryption_key incorrectly treats None as "not cached", causing repeated CONFIG.security.app_encryption_key reads and the wrong return type; change the function to use a sentinel (e.g. a unique object or a separate boolean) or make _cached_dek an Optional and a separate flag so that None can be cached, update the function signature to -> str | None (or Optional[str]) to reflect possible None, and ensure subsequent calls return the cached sentinel/None without re-reading CONFIG; reference symbols: _cached_dek, get_encryption_key, and CONFIG.security.app_encryption_key.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/fides/api/db/encryption_utils.py`:
- Around line 13-33: The cache logic in get_encryption_key incorrectly treats
None as "not cached", causing repeated CONFIG.security.app_encryption_key reads
and the wrong return type; change the function to use a sentinel (e.g. a unique
object or a separate boolean) or make _cached_dek an Optional and a separate
flag so that None can be cached, update the function signature to -> str | None
(or Optional[str]) to reflect possible None, and ensure subsequent calls return
the cached sentinel/None without re-reading CONFIG; reference symbols:
_cached_dek, get_encryption_key, and CONFIG.security.app_encryption_key.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0c87e6be-1e52-40f2-8aa2-c5cd26ed4671
📒 Files selected for processing (1)
src/fides/api/db/encryption_utils.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fides/api/models/privacy_request/webhook.py`:
- Line 12: The JWE helper code in webhook.py is using the process-wide cached
get_encryption_key which can return a stale DEK; update the JWE helper functions
(the token issuance code in webhook.py) to read the current key at runtime
instead of the cached getter—either call CONFIG.security.app_encryption_key
directly or add a new non-caching accessor (e.g., get_current_encryption_key)
and use that in place of get_encryption_key in the token creation/verification
functions so key rotation and test patches to CONFIG take effect immediately.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a4b7ada7-8fe5-4d83-8335-45b52fb8a4bf
📒 Files selected for processing (1)
src/fides/api/models/privacy_request/webhook.py
|
|
||
| from pydantic import BaseModel, ConfigDict | ||
|
|
||
| from fides.api.db.encryption_utils import get_encryption_key |
There was a problem hiding this comment.
Avoid the process-wide DEK cache in these JWE helpers.
src/fides/api/db/encryption_utils.py caches CONFIG.security.app_encryption_key for the life of the worker. These functions used to read CONFIG at token-issuance time, so this change introduces a stale-key path for runtime key overrides/rotation and tests that patch CONFIG: new callback/download tokens can keep being minted with the old key until the cache is reset or the process restarts. That semantic change is broader than the ORM StringEncryptedType use case this PR is targeting.
🔐 Minimal fix
-from fides.api.db.encryption_utils import get_encryption_key
@@
return generate_jwe(
json.dumps(jwe.model_dump(mode="json")),
- get_encryption_key(),
+ CONFIG.security.app_encryption_key,
)
@@
return generate_jwe(
json.dumps(jwe.model_dump(mode="json")),
- get_encryption_key(),
+ CONFIG.security.app_encryption_key,
)
@@
return generate_jwe(
json.dumps(jwe.model_dump(mode="json")),
- get_encryption_key(),
+ CONFIG.security.app_encryption_key,
)
@@
return generate_jwe(
json.dumps(jwe.model_dump(mode="json")),
- get_encryption_key(),
+ CONFIG.security.app_encryption_key,
)Also applies to: 68-71, 83-86, 99-102, 126-129
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/fides/api/models/privacy_request/webhook.py` at line 12, The JWE helper
code in webhook.py is using the process-wide cached get_encryption_key which can
return a stale DEK; update the JWE helper functions (the token issuance code in
webhook.py) to read the current key at runtime instead of the cached
getter—either call CONFIG.security.app_encryption_key directly or add a new
non-caching accessor (e.g., get_current_encryption_key) and use that in place of
get_encryption_key in the token creation/verification functions so key rotation
and test patches to CONFIG take effect immediately.
Ticket ENG-2857
Description Of Changes
This is a pure refactor (zero behavior change) that centralizes all
StringEncryptedTypecolumn definitions behind a singleencrypted_type()factory function and switches from a string key captured at import time to a callable key resolved at query time. This will enable future changes as part of our migration to an envelope encryption strategy.Previously, ~36
StringEncryptedType(...)calls across 16 model files each independently referencedCONFIG.security.app_encryption_keyas a string literal at import time. This PR:fides/api/db/encryption_utils.pywith:get_encryption_key()— returnsCONFIG.security.app_encryption_key(cached for process lifetime)encrypted_type(type_in=...)— builds aStringEncryptedTypeusingget_encryption_keyas a callable keyStringEncryptedType(...)constructions in model files withencrypted_type(...)optionally_encrypted_type()indb/util.pyto delegate toencrypted_type()aes_gcm_encryption_util.pyto useget_encryption_key()instead of direct CONFIG accessconsent_encryption_migration.pyto useencrypted_type()for its encryptorsqlalchemy-utilsnatively supports callable keys — ifkeyis callable, it invokes it on every encrypt/decrypt via_update_key(). This means encryption/decryption behavior is identical; the only difference is when the key is resolved (query time vs import time).registration.pyis excluded as it will be removed entirely in a follow-up.Code Changes
src/fides/api/db/encryption_utils.py—get_encryption_key()andencrypted_type()factoryencrypted_type()instead of inlineStringEncryptedType(...)db/util.py—optionally_encrypted_type()now delegates toencrypted_type()aes_gcm_encryption_util.py— usesget_encryption_key()for key resolutionconsent_encryption_migration.py— usesencrypted_type()for encryptor constructionprivacy_request/webhook.py— usesget_encryption_key()for JWE generationtests/api/db/test_encryption_utils.py— unit tests for the new modulechangelog/7588-encrypted-type-factory.yamlSteps to Confirm
Pre-Merge Checklist
CHANGELOG.mdupdatedmaindowngrade()migration is correct and worksSummary by CodeRabbit