
ENG-3630: Consent identity enrichment via DB graph traversal #8039

Merged
galvana merged 9 commits into main from ENG-3630/consent-identity-enrichment on Apr 29, 2026

Conversation

@galvana
Contributor

@galvana galvana commented Apr 28, 2026

Ticket ENG-3630

Description Of Changes

Adds consent identity enrichment that resolves missing email or external_id before consent propagation. Gated behind FIDES__CONSENT__IDENTITY_ENRICHMENT config flag (default: disabled).

When a consent request arrives with only a subset of required identities (e.g., external_id from a Bloomreach webhook), Fides cannot propagate consent to connectors that need a different identity (e.g., Iterable needs email). This change adds a two-tier enrichment strategy:

  1. Fast path: Look up the missing identity from CurrentPrivacyPreference records using indexed hashed identity columns (single Postgres query)
  2. Fallback: Query consent-enabled DB integrations (e.g., BigQuery, Postgres users table) via the existing connector graph traversal
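
The two tiers above can be sketched as an ordered chain of lookups. This is a minimal illustration, not the code in this PR: `enrich_identity`, `lookup_preferences`, and `lookup_db_connectors` are hypothetical names, and identities are modelled as plain dictionaries. It assumes each tier returns whatever identity values it found, and that a failing tier is skipped so the original identity data is still used.

```python
from typing import Callable, Dict, Optional

Identity = Dict[str, str]
Lookup = Callable[[Identity], Optional[Identity]]


def _has_both(identity: Identity) -> bool:
    """True when both identities needed for propagation are present."""
    return bool(identity.get("email")) and bool(identity.get("external_id"))


def enrich_identity(
    identity: Identity,
    enabled: bool,
    lookup_preferences: Lookup,    # hypothetical fast path (local preference records)
    lookup_db_connectors: Lookup,  # hypothetical fallback (consent-enabled DB integrations)
) -> Identity:
    """Return a copy of `identity` with missing values filled in.

    Existing values are never overwritten, and any lookup failure leaves
    the original identity data untouched.
    """
    enriched = dict(identity)
    if not enabled or _has_both(enriched):
        return enriched

    # Try the cheap local lookup first, then the DB connectors.
    for lookup in (lookup_preferences, lookup_db_connectors):
        try:
            found = lookup(enriched) or {}
        except Exception:
            # Graceful fallback: a failing tier contributes nothing.
            found = {}
        for key, value in found.items():
            if value and not enriched.get(key):
                enriched[key] = value
        if _has_both(enriched):
            break  # no need to query the slower tier
    return enriched
```

With this shape, the fallback tier is only reached when the fast path leaves an identity unresolved.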

Key design decisions:

  • Feature gated behind FIDES__CONSENT__IDENTITY_ENRICHMENT (default False) -- no behavior change until explicitly enabled
  • Fast path uses CurrentPrivacyPreference.hashed_email / hashed_external_id (indexed), so the lookup is an index probe rather than a table scan
  • Falls back to DB connector queries only when no local match exists
  • Opt-in only: ConnectionConfig.enabled_actions must explicitly include ActionType.consent on non-SaaS DB integrations
  • SaaS connectors excluded from enrichment (they don't have useful read endpoints for identity resolution)
  • Uses each connector's own QueryConfig for correct SQL generation (Postgres quoting, BigQuery schema prefix, etc.)
  • Graceful fallback: if enrichment fails, proceeds with original identity data
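
To make the fast-path decision above concrete, here is a rough sketch of a lookup keyed on a hashed identity column. It uses an in-memory SQLite table rather than Postgres, and everything in it is an assumption for illustration: the table name `current_preference`, the plain SHA-256 `hash_value` helper (the real hashing scheme is not described in this PR), and the plaintext `email` column standing in for however the real record stores the recoverable value.

```python
import hashlib
import sqlite3
from typing import Optional


def hash_value(value: str) -> str:
    # Assumption: plain SHA-256. The actual hashing scheme may differ.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a stand-in preference table with indexed hashed identity columns."""
    conn.execute(
        "CREATE TABLE current_preference ("
        "hashed_email TEXT, hashed_external_id TEXT, "
        "email TEXT, external_id TEXT)"
    )
    conn.execute(
        "CREATE INDEX ix_pref_hashed_email ON current_preference (hashed_email)"
    )
    conn.execute(
        "CREATE INDEX ix_pref_hashed_external_id "
        "ON current_preference (hashed_external_id)"
    )


def find_email_by_external_id(
    conn: sqlite3.Connection, external_id: str
) -> Optional[str]:
    """Resolve a missing email from a known external_id in a single query."""
    row = conn.execute(
        "SELECT email FROM current_preference "
        "WHERE hashed_external_id = ? AND email IS NOT NULL LIMIT 1",
        (hash_value(external_id),),
    ).fetchone()
    return row[0] if row else None
```

Because the filter is an equality match on an indexed column, the database can answer it without scanning the table, which is what keeps this tier cheap compared with querying external DB integrations.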

Code Changes

  • Added FIDES__CONSENT__IDENTITY_ENRICHMENT config flag to ConsentSettings
  • Added _enrich_from_current_preferences() -- fast path lookup via CurrentPrivacyPreference hashed identity columns
  • Refactored enrich_identities_for_consent() -- config gate, preference fast path, DB connector fallback
  • Extracted _cache_and_log_enrichment() and _enrich_from_db_connectors() helpers
  • build_consent_identity_enrichment_graph() in graph_task.py -- filters to non-SaaS connectors with consent in enabled_actions
  • Unit tests with enable_identity_enrichment fixture and real DB session
  • Integration tests for preference-based enrichment with real CurrentPrivacyPreference records
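
The filtering rule applied by build_consent_identity_enrichment_graph() (non-SaaS connectors with consent in enabled_actions) might look roughly like the sketch below. `ActionTypeSketch` and `ConnectionConfigSketch` are simplified, hypothetical stand-ins for the real models, and the sketch assumes that an unset enabled_actions does not count as opting in, since the description says consent must be included explicitly.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ActionTypeSketch(str, Enum):
    """Hypothetical stand-in for ActionType."""
    access = "access"
    consent = "consent"
    erasure = "erasure"


@dataclass
class ConnectionConfigSketch:
    """Hypothetical stand-in for ConnectionConfig."""
    key: str
    connection_type: str
    enabled_actions: Optional[List[ActionTypeSketch]] = None


def eligible_for_enrichment(
    configs: List[ConnectionConfigSketch],
) -> List[ConnectionConfigSketch]:
    """Keep only non-SaaS connections that explicitly opted in to consent."""
    return [
        config
        for config in configs
        if config.connection_type != "saas"
        # Assumption: None means "not explicitly opted in" for enrichment.
        and config.enabled_actions is not None
        and ActionTypeSketch.consent in config.enabled_actions
    ]
```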

Steps to Confirm

  1. Run unit tests: pytest --no-cov tests/ops/task/test_consent_identity_enrichment.py
  2. Run integration tests: pytest --no-cov tests/ops/integration_tests/test_consent_identity_enrichment.py
  3. E2E testing, seed script, and manual test plan in companion PR: ethyca/fidesplus#3472

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration label to the entry if your change includes a DB migration
    • Add a high-risk label to the entry if your change includes a high-risk change
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • No UX review needed
  • Followup issues:
    • No followup issues
  • Database migrations:
    • No migrations
  • Documentation:
    • Dev docs included in PR
    • No public documentation updates required

…aversal

Before consent propagation, resolve missing user identities by querying
consent-enabled database integrations. When a consent request arrives with
only a subset of required identities (e.g., external_id from a webhook),
enrichment queries a DB users table to discover the missing identity
(e.g., email), enabling consent to propagate to all configured connectors.

Enrichment is opt-in: ConnectionConfig.enabled_actions must explicitly
include ActionType.consent for non-SaaS DB integrations. Uses each
connector's own QueryConfig for correct SQL generation across DB types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel
Contributor

vercel Bot commented Apr 28, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Actions Updated (UTC)
fides-plus-nightly Ignored Ignored Preview Apr 29, 2026 2:34am
fides-privacy-center Ignored Ignored Apr 29, 2026 2:34am


@@ -0,0 +1,283 @@
from typing import Any, Dict, Generator
Comment thread tests/ops/task/test_consent_identity_enrichment.py Fixed
@codecov

codecov Bot commented Apr 28, 2026

Codecov Report

❌ Patch coverage is 83.12500% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.96%. Comparing base (3556ef7) to head (fcbe180).
⚠️ Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
src/fides/api/task/consent_identity_enrichment.py 81.50% 18 Missing and 9 partials ⚠️

❌ Your patch status has failed because the patch coverage (83.12%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (84.96%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff            @@
##             main    #8039    +/-   ##
========================================
  Coverage   84.96%   84.96%            
========================================
  Files         632      633     +1     
  Lines       41472    41632   +160     
  Branches     4836     4869    +33     
========================================
+ Hits        35237    35374   +137     
- Misses       5136     5149    +13     
- Partials     1099     1109    +10     


- Remove _determine_missing_identities check (always query reachable
  collections for robustness with bidirectional consent integrations)
- Fix cache_identity to write directly to DSR cache store instead of
  going through Identity Pydantic model validation
- Update doc to reference correct fidesplus system connection endpoint
  for setting enabled_actions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread tests/ops/task/test_consent_identity_enrichment.py Fixed
@galvana galvana marked this pull request as ready for review April 28, 2026 04:13
@galvana galvana requested a review from a team as a code owner April 28, 2026 04:13
@galvana galvana requested review from Linker44 and nreyes-dev and removed request for a team and nreyes-dev April 28, 2026 04:13
galvana and others added 2 commits April 27, 2026 21:38
- Document which QueryConfig attributes _EnrichmentNode must satisfy
- Use connector.cursor_result_to_rows() instead of static call so
  MariaDB and other connectors use their own row conversion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
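
The reason for calling cursor_result_to_rows() on the connector instance, as the commit above describes, can be illustrated with ordinary method dispatch. The classes and the method signature below are hypothetical (the real signature is not shown in this PR); the bytes-decoding subclass is only an invented example of a connector that needs its own row conversion.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


class BaseSQLConnectorSketch:
    """Hypothetical base connector with a default row conversion."""

    def cursor_result_to_rows(
        self, columns: Sequence[str], rows: Sequence[Tuple[Any, ...]]
    ) -> List[Row]:
        return [dict(zip(columns, row)) for row in rows]


class BytesDecodingConnectorSketch(BaseSQLConnectorSketch):
    """Hypothetical connector whose driver returns bytes for text columns."""

    def cursor_result_to_rows(
        self, columns: Sequence[str], rows: Sequence[Tuple[Any, ...]]
    ) -> List[Row]:
        decoded = [
            tuple(v.decode("utf-8") if isinstance(v, bytes) else v for v in row)
            for row in rows
        ]
        return super().cursor_result_to_rows(columns, decoded)


def fetch_identity_rows(
    connector: BaseSQLConnectorSketch,
    columns: Sequence[str],
    rows: Sequence[Tuple[Any, ...]],
) -> List[Row]:
    # Calling through the instance picks up the subclass override.
    # Calling the base class method directly would bypass it.
    return connector.cursor_result_to_rows(columns, rows)
```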
- Add cursor_result_to_rows to connector mock (instance method now)
- Mock get_dsr_cache_store via autouse fixture instead of asserting
  on cache_identity (no longer called directly)
- Remove test_updates_privacy_request_cache (tested implementation detail)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread tests/ops/task/test_consent_identity_enrichment.py Fixed
galvana and others added 2 commits April 28, 2026 07:52
Writes to the activity timeline so enrichment is visible in the
privacy request UI. Logs whether identities were discovered or not.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fast path resolves missing email/external_id from
CurrentPrivacyPreference before falling back to DB connector
queries. Gated behind FIDES__CONSENT__IDENTITY_ENRICHMENT.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mock_namespace_meta.return_value = None
mock_connector = create_autospec(SQLConnector, instance=True)
mock_connector.client.side_effect = Exception("Connection refused")
mock_connector.query_config = lambda node: SQLQueryConfig(node)
Contributor

@Linker44 Linker44 left a comment


ship it

galvana and others added 2 commits April 28, 2026 19:11
The early return guard skips enrichment when both email and external_id
are present, so phone_number was never populated from the DB. Pass all
three identities in the input to actually test the no-overwrite behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@galvana galvana enabled auto-merge April 29, 2026 02:34
@galvana galvana added this pull request to the merge queue Apr 29, 2026
Merged via the queue into main with commit 5378b1c Apr 29, 2026
67 of 71 checks passed
@galvana galvana deleted the ENG-3630/consent-identity-enrichment branch April 29, 2026 02:52


2 participants