ENG-2852 Update encryption mechanism for Organization columns#7554

Merged
erosselli merged 9 commits into main from erosselli/ENG-2852 on Mar 6, 2026

Conversation

@erosselli erosselli commented Mar 3, 2026

Ticket ENG-2852

Description Of Changes

This PR migrates the Organization model's 3 encrypted columns (controller, data_protection_officer, and representative) from the legacy PGEncryptedString (PostgreSQL pgcrypto) to StringEncryptedType(AesGcmEngine), the encryption mechanism used by the rest of the app.

It includes an Alembic migration that decrypts existing data with pgp_sym_decrypt and re-encrypts with AES-GCM, correctly converting pgcrypto-encrypted JSON "null" values to true SQL NULL. The PR also removes the now-unused PGEncryptedString class.
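
The JSON-null conversion described above can be illustrated with a minimal sketch. The PR's own helper is not shown in this thread, so the function names and bodies below are hypothetical; the sketch only assumes that the decrypted column value is a JSON-encoded string.

```python
import json
from typing import Optional


def is_json_null(plaintext: Optional[str]) -> bool:
    """Return True when the decrypted text is the JSON literal null."""
    if plaintext is None:
        # SQL NULL is not JSON null; the caller handles it separately.
        return False
    try:
        return json.loads(plaintext) is None
    except json.JSONDecodeError:
        # Not valid JSON at all, so it cannot be JSON null.
        return False


def normalize_decrypted(plaintext: Optional[str]) -> Optional[str]:
    """Map both SQL NULL and JSON null to None; pass other values through."""
    if plaintext is None or is_json_null(plaintext):
        return None
    return plaintext
```

Under this sketch, a row whose pgcrypto payload decrypts to the four characters `null` would end up as SQL NULL in the new column rather than as an encrypted `"null"` string.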

Steps to Confirm

  1. Create an organization if you don't have one already
  2. Pull down this branch and run the migration
  3. Ensure you can access the organization in the Organizations Admin UI view
  4. Run the downgrade migration
  5. Run the query SELECT pgp_sym_decrypt(controller, 'test_encryption_key') FROM ctl_organizations; to make sure the value was re-encrypted properly

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration label to the entry if your change includes a DB migration
    • Add a high-risk label to the entry if your change includes a high-risk change (i.e. potential for performance impact or unexpected regression) that should be flagged
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • All UX related changes have been reviewed by a designer
    • No UX review needed
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required

Summary by CodeRabbit

  • Documentation

    • Added encryption overview documentation covering current encryption algorithms, key management, and configuration
    • Added encryption implementation plan outlining multi-phase architecture improvements with roadmap
  • Chores

    • Migrated Organization sensitive contact fields to enhanced encryption standards with reversible database migration
    • Added tests to verify encrypted data integrity and null value handling

vercel bot commented Mar 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: fides-plus-nightly | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Mar 6, 2026 3:21pm
1 Skipped Deployment
Project: fides-privacy-center | Deployment: Ignored | Updated (UTC): Mar 6, 2026 3:21pm

Contributor Author

I've checked this one and the implementation plan into version control since there's quite a number of tickets for the encryption epic

@erosselli erosselli marked this pull request as ready for review March 3, 2026 21:27
@erosselli erosselli requested a review from a team as a code owner March 3, 2026 21:27
@erosselli erosselli requested review from johnewart and removed request for a team March 3, 2026 21:27
greptile-apps bot commented Mar 3, 2026

Greptile Summary

This PR migrates the Organization model's three encrypted columns (controller, data_protection_officer, representative) from the legacy PGEncryptedString (PostgreSQL pgcrypto) to StringEncryptedType(AesGcmEngine), unifying them with the rest of the application's encryption approach. It also removes the now-unused PGEncryptedString class and adds two new round-trip tests.

Key changes:

  • Alembic migration (upgrade): Uses an expand-contract pattern — adds _new Text columns, decrypts existing pgcrypto data row-by-row using pgp_sym_decrypt, correctly converts pgcrypto-encoded JSON "null" values to true SQL NULL via _is_json_null, re-encrypts with AesGcmEngine, then drops the old BYTEA columns and renames the new ones.
  • Alembic migration (downgrade): Symmetric reverse — adds _old BYTEA columns, decrypts AES-GCM values and re-encrypts with pgp_sym_encrypt, then swaps column names back.
  • Model update: Organization columns now use StringEncryptedType(JSONTypeOverride, CONFIG.security.app_encryption_key, AesGcmEngine, "pkcs5") matching the rest of the codebase.
  • Documentation: Adds two new internal docs (overview.md and implementation-plan.md) describing the current encryption state and a roadmap toward envelope encryption. However, overview.md still describes Organization as using PGEncryptedString in three separate sections, which will be immediately stale once this PR merges.
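
The per-row step of the upgrade described above (decrypted plaintext in, value for the temporary `_new` column out) might look roughly like the following sketch. It is not the migration's actual code: `reencrypt_row` and `fake_encrypt` are hypothetical names, the encrypt function is injected so the logic can run without a database, and `fake_encrypt` is plain base64 standing in for AES-GCM. It also assumes every non-null plaintext is valid JSON.

```python
import base64
import json
from typing import Callable, Dict, Optional

COLUMNS = ("controller", "data_protection_officer", "representative")


def reencrypt_row(
    row: Dict[str, Optional[str]],
    encrypt: Callable[[str], str],
) -> Dict[str, Optional[str]]:
    """Build the values for the temporary *_new columns of one row.

    `row` maps each column name to its already-decrypted plaintext, or None.
    """
    new_values: Dict[str, Optional[str]] = {}
    for col in COLUMNS:
        plaintext = row.get(col)
        # SQL NULL and JSON null both become SQL NULL in the new column.
        if plaintext is None or json.loads(plaintext) is None:
            new_values[f"{col}_new"] = None
        else:
            new_values[f"{col}_new"] = encrypt(plaintext)
    return new_values


def fake_encrypt(plaintext: str) -> str:
    """Stand-in for the AES-GCM engine: base64 only, NOT encryption."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")
```

Injecting the encrypt callable keeps the null-handling rule testable on its own; in the real migration that role is played by the `StringEncryptedType` encryptor.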

Confidence Score: 4/5

  • Safe to merge — migration logic is correct and the expand-contract pattern prevents data loss. Only non-critical issues found.
  • The migration strategy is well-designed and the logic is correct in both directions. The JSON-null edge case is properly handled. The only issues are a single-character loop variable (c) in the migration generator expression and the newly-added overview.md being immediately stale regarding PGEncryptedString usage for Organization — neither is a runtime correctness concern.
  • docs/fides/docs/encryption/overview.md — three sections still describe Organization as using PGEncryptedString, which is removed by this PR.
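
As an illustration of the naming nit, the SELECT list could be built with `col` as the loop variable. This is a sketch based on the snippet quoted later in this thread; `build_decrypt_select` is a hypothetical wrapper, not a function in the PR. Column and table names are interpolated into the SQL, so the sketch assumes they come from a fixed constant and never from user input; the key stays a bind parameter.

```python
from typing import Iterable

COLUMNS = ("controller", "data_protection_officer", "representative")


def build_decrypt_select(
    columns: Iterable[str], table: str = "ctl_organizations"
) -> str:
    """Return a SELECT that decrypts each pgcrypto column, preserving NULLs."""
    select_list = ", ".join(
        f"CASE WHEN {col} IS NOT NULL "
        f"THEN pgp_sym_decrypt({col}, :key) "
        f"ELSE NULL END AS {col}"
        for col in columns
    )
    return f"SELECT fides_key, {select_list} FROM {table}"
```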

Important Files Changed

Filename and overview:

  • src/fides/api/alembic/migrations/versions/xx_2026_03_03_1400_bf12f05ef8eb_migrate_organization_to_aesgcm.py: Expand-contract migration correctly adds temporary _new/_old columns, decrypts pgcrypto data with pgp_sym_decrypt, handles the JSON-null edge case, re-encrypts with AES-GCM, then drops and renames columns. Downgrade path is symmetric. One minor style issue: the generator expression uses a single-character variable c instead of col.
  • src/fides/api/models/sql_models.py: Removes PGEncryptedString and replaces the three Organization column definitions with StringEncryptedType(JSONTypeOverride, CONFIG.security.app_encryption_key, AesGcmEngine, "pkcs5"), correctly unifying them with the rest of the codebase. Unused imports (TypeDecorator, cast, type_coerce, BYTEA) also cleaned up.
  • tests/ctl/models/test_sql_models.py: Adds two round-trip tests for the new AES-GCM encryption on Organization columns: one for non-null data and one for null values. Uses db.expire() to force a real DB reload. Clean and appropriate coverage.
  • docs/fides/docs/encryption/overview.md: New "current state" reference document, but it still describes PGEncryptedString as the active mechanism for Organization columns in three separate sections. These will be immediately outdated once this PR merges and need updating to reflect the AES-GCM migration.

Last reviewed commit: 9ad728c

@erosselli
Contributor Author

@greptile re-review

coderabbitai bot commented Mar 4, 2026

Caution

Review failed

The head commit changed during the review from 9aff78c to b17c43f.

Walkthrough

This PR migrates Organization's encrypted columns (controller, data_protection_officer, representative) from PostgreSQL pgcrypto to AES-GCM using StringEncryptedType. The change includes documentation, a database migration script, updated model definitions, and validation tests.

Changes

Cohort (files): summary

  • Documentation (docs/fides/docs/encryption/overview.md, docs/fides/docs/encryption/implementation-plan.md): New docs covering current encryption state (algorithms, key management, limitations) and a multi-PR roadmap for envelope encryption architecture with KEK/DEK hierarchy and KeyProvider abstraction.
  • Database Migration (src/fides/api/alembic/migrations/versions/xx_2026_03_03_1400_bf12f05ef8eb_migrate_organization_to_aesgcm.py): Alembic migration script that decrypts pgcrypto values from Organization columns and re-encrypts them using AES-GCM, with reversible downgrade path using temporary columns.
  • Model Updates (src/fides/api/models/sql_models.py): Removed PGEncryptedString TypeDecorator; migrated Organization's controller, data_protection_officer, and representative fields to StringEncryptedType with AesGcmEngine; updated imports.
  • Tests & Changelog (tests/ctl/models/test_sql_models.py, changelog/7554-migrate-organization-encryption-to-aesgcm.yaml): Added TestOrganizationEncryptedFields with round-trip and null-handling tests; added changelog entry documenting the db-migration.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Hopping through columns with care and delight,
From pgcrypto's grasp to AES-GCM's bright light,
We encrypt and decrypt with elegant grace,
A migration so smooth, not a data out of place!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically identifies the main change: migrating Organization column encryption from PGEncryptedString to AES-GCM.
Description check ✅ Passed The description covers the main changes, includes an Alembic migration explanation, provides clear verification steps, and addresses most pre-merge checklist items including changelog and migration downgrade considerations.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
src/fides/api/models/sql_models.py (1)

476-503: Reduce repeated encryption type declarations in Organization.

These three column definitions duplicate the same StringEncryptedType(...) configuration and can drift. Extract a small helper that returns the type and reuse it.

♻️ Proposed refactor
+def _organization_contact_encrypted_type() -> StringEncryptedType:
+    return StringEncryptedType(
+        JSONTypeOverride,
+        CONFIG.security.app_encryption_key,
+        AesGcmEngine,
+        "pkcs5",
+    )
+
 class Organization(Base, FidesBase):
@@
-    controller = Column(
-        StringEncryptedType(
-            JSONTypeOverride,
-            CONFIG.security.app_encryption_key,
-            AesGcmEngine,
-            "pkcs5",
-        ),
-        nullable=True,
-    )
+    controller = Column(_organization_contact_encrypted_type(), nullable=True)
@@
-    data_protection_officer = Column(
-        StringEncryptedType(
-            JSONTypeOverride,
-            CONFIG.security.app_encryption_key,
-            AesGcmEngine,
-            "pkcs5",
-        ),
-        nullable=True,
-    )
+    data_protection_officer = Column(
+        _organization_contact_encrypted_type(), nullable=True
+    )
@@
-    representative = Column(
-        StringEncryptedType(
-            JSONTypeOverride,
-            CONFIG.security.app_encryption_key,
-            AesGcmEngine,
-            "pkcs5",
-        ),
-        nullable=True,
-    )
+    representative = Column(_organization_contact_encrypted_type(), nullable=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fides/api/models/sql_models.py` around lines 476 - 503, The three
Organization column definitions (controller, data_protection_officer,
representative) duplicate the same StringEncryptedType(...) config; create a
small helper function (e.g., _encrypted_json_type or get_encrypted_json_type) in
the same module that returns StringEncryptedType(JSONTypeOverride,
CONFIG.security.app_encryption_key, AesGcmEngine, "pkcs5") and then replace the
repeated constructors with Column(_encrypted_json_type(), nullable=True) for
each of those fields so the encryption config is centralized and reusable.
src/fides/api/alembic/migrations/versions/xx_2026_03_03_1400_bf12f05ef8eb_migrate_organization_to_aesgcm.py (1)

73-77: Avoid fetchall() to reduce plaintext residency in memory.

Iterate directly over the cursor result instead of materializing all rows at once.

♻️ Proposed change
     rows = bind.execute(
         text(f"SELECT fides_key, {null_coalesce} FROM ctl_organizations"),
         {"key": CONFIG.user.encryption_key},
-    ).fetchall()
+    )
@@
     rows = bind.execute(
         text(
             "SELECT fides_key, controller, data_protection_officer, representative "
             "FROM ctl_organizations"
         )
-    ).fetchall()
+    )

Also applies to: 121-127

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/fides/api/alembic/migrations/versions/xx_2026_03_03_1400_bf12f05ef8eb_migrate_organization_to_aesgcm.py`
around lines 73 - 77, The code currently calls bind.execute(...).fetchall() and
assigns to rows, which materializes all plaintext rows in memory; change to
iterate the execute() result directly (e.g., for fides_key, col in
bind.execute(text(...), {"key": CONFIG.user.encryption_key}): ...) instead of
using .fetchall(), and update the analogous block referenced around lines
121-127 to also iterate the execute() cursor/result iterator rather than calling
fetchall(), ensuring processing is done row-by-row and no large list of
plaintext rows is retained in memory.
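
The difference between materializing a result and iterating it can be seen in a small, self-contained sketch. It uses the standard-library `sqlite3` module as a stand-in for the migration's connection, so it is an analogy rather than the PR's code, and the table contents are made up. Whether rows are actually streamed from PostgreSQL also depends on the driver and cursor type: a default client-side cursor may still buffer the whole result, in which case the benefit is limited to not holding an extra Python list.

```python
import sqlite3
from typing import Iterator, Tuple


def iter_rows(conn: sqlite3.Connection) -> Iterator[Tuple[str, str]]:
    """Yield rows one at a time instead of calling fetchall()."""
    cursor = conn.execute(
        "SELECT fides_key, controller FROM ctl_organizations ORDER BY fides_key"
    )
    for fides_key, controller in cursor:
        yield fides_key, controller


# Illustrative in-memory table with placeholder values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ctl_organizations (fides_key TEXT PRIMARY KEY, controller TEXT)"
)
conn.executemany(
    "INSERT INTO ctl_organizations VALUES (?, ?)",
    [("org_a", "a"), ("org_b", "b")],
)

# Only the keys are kept; each row's values can be dropped after processing.
seen = [fides_key for fides_key, _ in iter_rows(conn)]
```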
tests/ctl/models/test_sql_models.py (1)

52-70: Strengthen this test to verify encryption at rest, not only round-trip behavior.

The current assertions can still pass if values are stored plaintext and simply deserialized correctly. Add a raw SQL read of the stored columns and assert plaintext markers are not present.

🧪 Proposed test hardening
+import json
 import pytest
+from sqlalchemy import text
@@
     def test_encrypted_fields_round_trip(self, db, contact_details):
@@
         reloaded = (
             db.query(Organization).filter_by(fides_key="test_encryption_org").first()
         )
 
         assert reloaded.controller == contact_details
         assert reloaded.data_protection_officer == contact_details
         assert reloaded.representative == contact_details
+
+        raw = db.execute(
+            text(
+                "SELECT controller, data_protection_officer, representative "
+                "FROM ctl_organizations WHERE fides_key = :fides_key"
+            ),
+            {"fides_key": "test_encryption_org"},
+        ).mappings().one()
+
+        # quick at-rest sanity checks (raw ciphertext should not include plaintext JSON tokens)
+        assert '"name": "Jane Controller"' not in raw["controller"]
+        assert '"name": "Jane Controller"' not in raw["data_protection_officer"]
+        assert '"name": "Jane Controller"' not in raw["representative"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ctl/models/test_sql_models.py` around lines 52 - 70, The
test_encrypted_fields_round_trip currently only verifies deserialization, so add
a raw SQL read of the stored columns to confirm values are encrypted at rest:
after creating Organization (Organization.create) and reloading, run a raw
SELECT on the organizations table for the row with fides_key
"test_encryption_org" to fetch the raw persisted values of controller,
data_protection_officer, and representative (using the same db/session used in
the test), then assert those raw stored values do not contain plaintext markers
from contact_details (e.g., contact_details["email"] or contact_details["name"])
and are not equal to the JSON/text representation of contact_details; keep the
existing round-trip assertions as well.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d223b098-d550-4fbc-b77c-0ca7293889cf

📥 Commits

Reviewing files that changed from the base of the PR and between a8084fc and 9366cb6.

📒 Files selected for processing (6)
  • changelog/7554-migrate-organization-encryption-to-aesgcm.yaml
  • docs/fides/docs/encryption/implementation-plan.md
  • docs/fides/docs/encryption/overview.md
  • src/fides/api/alembic/migrations/versions/xx_2026_03_03_1400_bf12f05ef8eb_migrate_organization_to_aesgcm.py
  • src/fides/api/models/sql_models.py
  • tests/ctl/models/test_sql_models.py

Comment on lines +25 to +46
```
┌──────────────────────────────────────────────────────────────┐
│ KEK (Key Encryption Key) │
│ ───────────────────────── │
│ Owned by the operator. Lives in an env var (local provider) │
│ or never leaves the KMS (AWS KMS provider). │
│ │
│ Wraps / unwraps the DEK. │
└──────────────────────┬───────────────────────────────────────┘
│ wraps
┌──────────────────────────────────────────────────────────────┐
│ DEK (Data Encryption Key) │
│ ───────────────────────── │
│ = the current app_encryption_key value │
│ Stored encrypted (wrapped) by the provider. │
│ Unwrapped at runtime, cached in memory. │
│ │
│ Used by all StringEncryptedType columns, AES-GCM utils, │
│ JWE tokens, etc. — the entire existing encryption surface. │
└──────────────────────────────────────────────────────────────┘
```

⚠️ Potential issue | 🟡 Minor

Add language identifiers to fenced blocks to satisfy markdownlint (MD040).

The diagram/code fences should declare a language (e.g., text) so lint checks pass consistently.

🧹 Proposed lint fix
-```
+```text
 ┌──────────────────────────────────────────────────────────────┐
 ...
-```
+```

-```
+```text
                      ┌──────────────────┐
 ...
-```
+```

-```
+```text
 PR 0a (migrate Organization)    PR 0b (migrate UserRegistration)
 ...
-```
+```

Also applies to: 50-82, 352-368

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 25-25: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/fides/docs/encryption/implementation-plan.md` around lines 25 - 46, The
fenced diagram blocks in the markdown use plain triple-backticks without a
language tag (e.g., the KEK/DEK diagram and the other fenced sections noted
around lines 50-82 and 352-368); update each opening fence from ``` to ```text
so the blocks declare a language and satisfy MD040 (leave the closing ```
unchanged); search for the triple-backtick fences in
docs/fides/docs/encryption/implementation-plan.md (the diagram blocks and the
other listed fenced areas) and add the language identifier "text" to each
opening fence.

Comment on lines +16 to +17
| **PGEncryptedString** | PGP symmetric | PostgreSQL `pgcrypto` | Organization model fields |
| **AesEngine** | AES-CBC | `sqlalchemy-utils` | Legacy (UserRegistration only) |

⚠️ Potential issue | 🟡 Minor

This overview still presents PGEncryptedString as active current-state behavior.

After this PR, Organization columns are migrated to StringEncryptedType(AesGcmEngine). Please mark PGEncryptedString as legacy/historical (or migration-only) to avoid stale operational guidance.

📝 Suggested doc adjustment
-| **PGEncryptedString** | PGP symmetric | PostgreSQL `pgcrypto` | Organization model fields |
+| **PGEncryptedString** | PGP symmetric | PostgreSQL `pgcrypto` | Legacy mechanism (historical / migration downgrade path) |
@@
-### PostgreSQL pgcrypto Encryption via `PGEncryptedString`
+### PostgreSQL pgcrypto Encryption via `PGEncryptedString` (Legacy)
@@
-| `Organization` | `controller`, `data_protection_officer`, `representative` |
+| _No active model columns_ | Legacy historical usage only |
@@
-Additionally, `CONFIG.user.encryption_key` (from `UserSettings`) is used exclusively for `PGEncryptedString`.
+Additionally, `CONFIG.user.encryption_key` (from `UserSettings`) remains relevant for legacy pgcrypto compatibility/migration paths.

Also applies to: 94-100, 146-147

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/fides/docs/encryption/overview.md` around lines 16 - 17, Update the
overview to mark PGEncryptedString as legacy/migration-only and reflect that
Organization columns now use StringEncryptedType with AesGcmEngine;
specifically, change the table entry referencing PGEncryptedString and the rows
mentioning Organization model fields to indicate "Legacy / Migration-only" (or
similar) and add a note that current behavior uses
StringEncryptedType(AesGcmEngine) for Organization columns; search for
occurrences of PGEncryptedString, AesEngine, AesGcmEngine, and
StringEncryptedType to update lines 16-17 and the other locations referenced
(around lines 94-100 and 146-147).

Comment on lines +51 to +76
def upgrade() -> None:
    for col in COLUMNS:
        op.add_column(
            "ctl_organizations",
            sa.Column(f"{col}_new", sa.Text(), nullable=True),
        )

    bind = op.get_bind()

    encryptor = StringEncryptedType(
        type_in=sa.Text(),
        key=CONFIG.security.app_encryption_key,
        engine=AesGcmEngine,
        padding="pkcs5",
    )

    null_coalesce = ", ".join(
        f"CASE WHEN {c} IS NOT NULL "
        f"THEN pgp_sym_decrypt({c}, :key) "
        f"ELSE NULL END AS {c}"
        for c in COLUMNS
    )
    rows = bind.execute(
        text(f"SELECT fides_key, {null_coalesce} FROM ctl_organizations"),
        {"key": CONFIG.user.encryption_key},
    ).fetchall()

⚠️ Potential issue | 🟠 Major

Fail fast on missing encryption keys before rewriting data.

This migration rewrites encrypted data but does not explicitly validate key presence first. Add explicit guards so the migration aborts early with a clear error before touching rows.

🛡️ Proposed safeguard
 COLUMNS = ("controller", "data_protection_officer", "representative")
 
 
+def _validate_required_keys() -> None:
+    if not CONFIG.user.encryption_key:
+        raise RuntimeError(
+            "Missing CONFIG.user.encryption_key required for pgcrypto decrypt/encrypt."
+        )
+    if not CONFIG.security.app_encryption_key:
+        raise RuntimeError(
+            "Missing CONFIG.security.app_encryption_key required for AES-GCM encryption."
+        )
+
+
 def _is_json_null(value: str) -> bool:
@@
 def upgrade() -> None:
+    _validate_required_keys()
     for col in COLUMNS:
@@
 def downgrade() -> None:
+    _validate_required_keys()
     for col in COLUMNS:

Also applies to: 105-126

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/fides/api/alembic/migrations/versions/xx_2026_03_03_1400_bf12f05ef8eb_migrate_organization_to_aesgcm.py`
around lines 51 - 76, In upgrade(), add explicit pre-checks for the required
encryption keys (CONFIG.security.app_encryption_key and
CONFIG.user.encryption_key) before adding columns or reading rows: validate both
keys are present/non-empty and raise a clear exception (e.g., RuntimeError with
a descriptive message) to abort the migration if either is missing; do this
before creating encryptor, before executing bind.execute/select, and reference
the symbols upgrade(), CONFIG.security.app_encryption_key,
CONFIG.user.encryption_key, encryptor, and bind so the guard runs early and
prevents any data rewrite when keys are absent.
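
The fail-fast idea can also be exercised in isolation. The sketch below is a variant of the proposed safeguard, not the code that was merged: `FakeKeys` is a hypothetical stand-in for the two `CONFIG` values named in the review, and unlike the diff above it reports every missing key in a single error.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeKeys:
    """Hypothetical stand-in for the two CONFIG values named in the review."""

    user_encryption_key: Optional[str]
    app_encryption_key: Optional[str]


def validate_required_keys(keys: FakeKeys) -> None:
    """Raise before any schema or data change if a key is missing or empty."""
    candidates = (
        ("user_encryption_key", keys.user_encryption_key),
        ("app_encryption_key", keys.app_encryption_key),
    )
    missing = [name for name, value in candidates if not value]
    if missing:
        raise RuntimeError("Missing encryption key(s): " + ", ".join(missing))
```

Calling a guard like this as the first statement of both `upgrade()` and `downgrade()` would mean no column is added and no row is read when a key is absent.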

Collaborator

@johnewart johnewart left a comment

Seems good - the question of other tables needing to be migrated like this was answered offline and is "no"

@erosselli erosselli added this pull request to the merge queue Mar 6, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 6, 2026
@erosselli erosselli enabled auto-merge March 6, 2026 15:19
@erosselli erosselli added this pull request to the merge queue Mar 6, 2026
Merged via the queue into main with commit 503bdfe Mar 6, 2026
56 of 57 checks passed
@erosselli erosselli deleted the erosselli/ENG-2852 branch March 6, 2026 15:54