
Eng 2503 add auditing dataseets #7964

Merged
Vagoasdf merged 30 commits into main from ENG-2503-add-auditing-dataseets
Apr 22, 2026

Conversation

@Vagoasdf
Contributor

@Vagoasdf Vagoasdf commented Apr 20, 2026

Ticket 2503

Description Of Changes

Adds audit events for SaaS datasets. Now whenever we create, update, or delete a SaaS dataset, we register a new audit event.

Code Changes

  • New event audit objects for datasets
  • Added a generate_dataset_audit_event_details() function to the event audit utils
  • Implemented a _create_dataset_audit_event() function on the dataset service
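The helper pair above can be sketched roughly as below. This is a hypothetical illustration of the pattern, not the exact implementation in this PR; the event type values match those named in the review, but the payload fields are assumptions.

```python
# Sketch of the dataset audit helpers described above. The EventAuditType
# values come from the review comments; the details payload is illustrative.
from enum import Enum
from typing import Any, Dict, Optional


class EventAuditType(str, Enum):
    dataset_created = "dataset.created"
    dataset_updated = "dataset.updated"
    dataset_deleted = "dataset.deleted"


def generate_dataset_audit_event_details(
    event_type: EventAuditType,
    dataset: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the serializable details payload stored on the event_audit row."""
    details: Dict[str, Any] = {"event_type": event_type.value}
    if dataset is not None:
        details["fides_key"] = dataset.get("fides_key")
        details["name"] = dataset.get("name")
        details["collection_count"] = len(dataset.get("collections", []))
    return details
```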

Steps to Confirm

  1. Create a SaaS dataset via PATCH /api/v1/connection/{saas_connection_key}/datasets → expect dataset.created row in event_audit
  2. Update it (re-send PATCH with a change) → expect dataset.updated
  3. Delete it via DELETE /api/v1/connection/{saas_connection_key}/dataset/{key} → expect dataset.deleted
  4. Implicit bulk delete via PUT /api/v1/connection/{saas_connection_key}/dataset-configs with an empty/shorter list → expect one dataset.deleted per removed config (this was the bug fixed in this branch)
  5. Non-SaaS connection — create/delete on a Postgres connection → expect no rows in event_audit (SaaS-only guard)
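The expected outcomes of the five steps can be summarized with a toy in-memory simulation. This does not call the real fides endpoints; the names are illustrative, and only the SaaS-only guard and the one-event-per-operation expectation are taken from the steps above.

```python
# Toy simulation of the audit expectations above; illustrative only, does not
# touch the real fides service layer or database.
audit_log = []

def record(event_type: str, fides_key: str, saas: bool = True) -> None:
    # SaaS-only guard: non-SaaS connections produce no event_audit rows.
    if saas:
        audit_log.append({"event_type": event_type, "fides_key": fides_key})

record("dataset.created", "stripe")                    # step 1
record("dataset.updated", "stripe")                    # step 2
record("dataset.deleted", "stripe")                    # step 3
record("dataset.deleted", "stripe_backup")             # step 4: one event per removed config
record("dataset.created", "postgres_ds", saas=False)   # step 5: no row expected
```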

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration label to the entry if your change includes a DB migration
    • Add a high-risk label to the entry if your change includes a high-risk change (i.e. potential for performance impact or unexpected regression) that should be flagged
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • All UX related changes have been reviewed by a designer
    • No UX review needed
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required

@vercel
Contributor

vercel Bot commented Apr 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Actions Updated (UTC)
fides-plus-nightly Ignored Ignored Preview Apr 21, 2026 9:20pm
fides-privacy-center Ignored Ignored Apr 21, 2026 9:20pm


@codecov

codecov Bot commented Apr 20, 2026

Codecov Report

❌ Patch coverage is 93.75000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.95%. Comparing base (a52d50a) to head (22d483a).
⚠️ Report is 18 commits behind head on main.

Files with missing lines Patch % Lines
...rc/fides/service/dataset/dataset_config_service.py 89.18% 4 Missing ⚠️

❌ Your patch status has failed because the patch coverage (93.75%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (84.95%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7964      +/-   ##
==========================================
- Coverage   85.04%   84.95%   -0.09%     
==========================================
  Files         631      630       -1     
  Lines       41217    41139      -78     
  Branches     4807     4775      -32     
==========================================
- Hits        35053    34951     -102     
- Misses       5070     5106      +36     
+ Partials     1094     1082      -12     

☔ View full report in Codecov by Sentry.

@Vagoasdf Vagoasdf marked this pull request as ready for review April 20, 2026 20:14
@Vagoasdf Vagoasdf requested a review from a team as a code owner April 20, 2026 20:14
@Vagoasdf Vagoasdf requested review from vcruces and removed request for a team April 20, 2026 20:14
@Vagoasdf
Contributor Author

/code-review

Contributor

@claude claude Bot left a comment


Dataset Auditing for SaaS Integrations — Code Review

Overall this is a clean, well-structured addition. The pattern mirrors existing connection/taxonomy audit event work closely, test coverage is thorough, and the failure-isolation (audit errors never bubble up to callers) is handled correctly. A few things to address before merging:

Bug

  • Wrong PR number in changelog (changelog/7964-add-dataset-auditng.yaml:3): pr: 1964 should be pr: 7964. The filename also has a typo: auditng should be auditing (cosmetic, but worth fixing).

Minor Issues

  • Unused import (tests/util/test_event_audit_util.py:4): FideslangDataset is imported but never referenced in the new test class or anywhere else in the file.

  • TOCTOU in create_or_update_dataset_config (dataset_config_service.py:135-151): The "is this a create or update?" pre-check is a separate DB round-trip from the actual create_or_update, so a concurrent insert between the two could cause a dataset_created event to be emitted for what is functionally an update. Low probability in practice, but worth at minimum a comment explaining the known race. A cleaner fix would be to infer create-vs-update from the result of create_or_update rather than a preflight query.

  • Lost atomicity in put_dataset_configs (dataset_config_endpoints.py:175-179): The original single-SQL batch DELETE was atomic; the new per-key loop commits each deletion independently. An unexpected mid-loop failure (not DatasetNotFoundException) would leave partial state. The DatasetNotFoundException swallowing for concurrent deletes is correct, but it's worth noting the changed semantics in a comment.
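The cleaner fix suggested for the TOCTOU issue (derive create-vs-update from the persistence call itself, not a preflight query) can be sketched with an in-memory stand-in for DatasetConfig. The names here are hypothetical; the real model method would need to return a created/updated flag as the review suggests.

```python
# Sketch: infer create-vs-update from the upsert call itself, closing the
# TOCTOU window. The dict stands in for the real DatasetConfig table.
from typing import Any, Dict, Tuple

_store: Dict[str, Dict[str, Any]] = {}

def create_or_update(data: Dict[str, Any]) -> Tuple[Dict[str, Any], bool]:
    """Upsert by fides_key; return (record, created) from a single operation."""
    key = data["fides_key"]
    created = key not in _store
    _store[key] = {**_store.get(key, {}), **data}
    return _store[key], created

_record, was_created = create_or_update({"fides_key": "stripe", "name": "Stripe"})
event_type = "dataset.created" if was_created else "dataset.updated"
```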

What looks good

  • EventAuditType values (dataset.created, dataset.updated, dataset.deleted) follow the existing naming convention.
  • _create_dataset_audit_event correctly no-ops for non-SaaS connections and when no audit service is injected.
  • delete_dataset_config emits the audit event after the delete succeeds, so no phantom events for failed deletes.
  • get_dataset_config_service dependency ordering in deps.py is correct.
  • Test coverage is comprehensive: create, update, delete, non-SaaS no-op, broken audit service, missing audit service, concurrent delete, and non-existent key all have dedicated test cases.
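The failure-isolation property praised above (audit errors never bubble up to callers) usually amounts to a deliberately broad try/except around the emit call. A minimal sketch, assuming the real method logs and swallows; the function names are illustrative, not the fides API:

```python
# Sketch of best-effort audit emission: failures are logged, never re-raised,
# so the surrounding dataset operation always succeeds or fails on its own merits.
import logging

logger = logging.getLogger(__name__)

def safe_emit_audit_event(emit, *args, **kwargs) -> bool:
    """Call the audit emitter; return False instead of raising on failure."""
    try:
        emit(*args, **kwargs)
        return True
    except Exception:  # deliberately broad: auditing is best-effort
        logger.exception("Failed to write dataset audit event")
        return False

def broken_emitter(*_args, **_kwargs):
    raise RuntimeError("audit backend down")

# The caller proceeds even though the audit write failed.
emitted = safe_emit_audit_event(broken_emitter, "dataset.created", fides_key="stripe")
```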


Comment thread changelog/7964-add-dataset-auditng.yaml Outdated
@@ -0,0 +1,4 @@
type: Added
description: Adds Dataset Auditing to Saas integrations
pr: 1964
Contributor


changelog/7964-add-dataset-auditng.yaml:3

Wrong PR number: pr: 1964 should be pr: 7964.

Also a minor nit: the filename itself has a typo — auditng instead of auditing.

Comment thread tests/util/test_event_audit_util.py Outdated
"""Tests for event_audit_util.py functions."""

import pytest
from fideslang.models import Dataset as FideslangDataset
Contributor


tests/util/test_event_audit_util.py:4

FideslangDataset is imported here but never used anywhere in the new TestGenerateDatasetAuditEventDetails tests (or elsewhere in the file). This import can be removed.

Comment on lines 135 to 151
# Determine create vs. update before persisting
fides_key = data_dict["fides_key"]
existing = DatasetConfig.filter(
db=self.db,
conditions=(
(DatasetConfig.connection_config_id == connection_config.id)
& (DatasetConfig.fides_key == fides_key)
),
).first()
event_type = (
EventAuditType.dataset_updated
if existing
else EventAuditType.dataset_created
)

# Create or update using unified method
dataset_config = DatasetConfig.create_or_update(self.db, data=data_dict)
Contributor


src/fides/service/dataset/dataset_config_service.py:135-151

There's a TOCTOU (time-of-check / time-of-use) window here: the existing pre-check and the actual create_or_update are two separate DB operations, so a concurrent request that inserts the record in between would cause this request to emit dataset_created for what is effectively an update.

In practice this is a low-frequency edge case, but it's worth noting. A cleaner alternative would be to inspect the result of DatasetConfig.create_or_update (e.g. check if the returned object has a freshly-generated created_at equal to updated_at, or have the model method return a created/updated flag) instead of the pre-flight query. Even adding a comment explaining the known race would help set expectations for future readers.

Contributor Author


I can see how that could happen. I'll include it.

Comment on lines +175 to +179
for key in config_keys_to_remove:
try:
dataset_config_service.delete_dataset_config(connection_config, key)
except DatasetNotFoundException:
pass # already deleted; desired state reached
Contributor


src/fides/api/v1/endpoints/dataset_config_endpoints.py:175-179

The original code deleted all config_keys_to_remove in a single SQL DELETE + one db.commit(), which was atomic. The new loop calls delete_dataset_config once per key, each of which commits independently. If an unexpected exception (i.e. something other than DatasetNotFoundException) occurs partway through the loop, the earlier keys will already be deleted while later ones are not, leaving partial state.

This is unlikely to matter in normal operation (each delete is independent and the DatasetNotFoundException swallowing handles the concurrent-delete case correctly), but it's a subtle behavioral change worth documenting in a comment. If strict all-or-nothing semantics are required, consider wrapping the loop in a single transaction or reverting to a bulk delete when no audit events are needed.
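The all-or-nothing alternative mentioned here (single transaction, while still keeping one audit event per key) can be sketched with a toy session stand-in. This is not the real SQLAlchemy usage in fides; FakeSession and the exception type are illustrative.

```python
# Sketch: delete each key and queue its audit event, but commit only once,
# restoring atomicity while keeping one audit event per removed config.
class FakeSession:
    def __init__(self, rows):
        self.rows = dict(rows)       # committed state
        self._pending = dict(rows)   # uncommitted working copy

    def delete(self, key):
        if key not in self._pending:
            raise KeyError(key)      # stands in for DatasetNotFoundException
        del self._pending[key]

    def commit(self):
        self.rows = dict(self._pending)

    def rollback(self):
        self._pending = dict(self.rows)

def delete_configs_atomically(db, keys):
    events = []
    try:
        for key in keys:
            try:
                db.delete(key)
            except KeyError:
                continue  # already deleted; desired state reached
            events.append(("dataset.deleted", key))
        db.commit()  # single commit: all deletions land together or not at all
    except Exception:
        db.rollback()
        raise
    return events
```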

Contributor Author


The plan was to separate them so we could properly log an audit event for each delete; otherwise these would be drowned out in the noise and we could only log one dataset erasure event.
I'm not sure this comment is strictly needed, but it's something to take into consideration, yes. I'll consider an update if the reviewer agrees.

@Linker44
Contributor

Linker44 commented Apr 21, 2026

Base implementation looks good, but I worry about the usefulness of the logging. Users will be able to see that something changed but not exactly what changed. This differs from how we audit connection secrets and SaaS configs, where we log exactly what changed and mask any sensitive values.

I'm inclined to think this needs a refactor to include that information.
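The change-level logging described here (log exactly what changed, mask sensitive values) could be sketched as a shallow diff over the dataset dict. The sensitive field names and mask string are purely illustrative:

```python
# Sketch: shallow field-level diff for an updated dataset, masking sensitive
# keys the way connection-secret auditing is described as doing.
SENSITIVE_FIELDS = {"api_key", "password", "token"}  # illustrative list
MASK = "**********"

def diff_dataset(old: dict, new: dict) -> dict:
    """Return {field: {"old": ..., "new": ...}} for changed top-level fields."""
    changes = {}
    for field in old.keys() | new.keys():
        if old.get(field) != new.get(field):
            if field in SENSITIVE_FIELDS:
                changes[field] = {"old": MASK, "new": MASK}
            else:
                changes[field] = {"old": old.get(field), "new": new.get(field)}
    return changes
```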

Contributor

@Linker44 Linker44 left a comment


LGTM. I would rather that on update we log only what changed on the dataset instead of all of it, but for now this is good; we can improve upon it later.

@Vagoasdf Vagoasdf added this pull request to the merge queue Apr 22, 2026
Merged via the queue into main with commit 2350cce Apr 22, 2026
66 of 69 checks passed
@Vagoasdf Vagoasdf deleted the ENG-2503-add-auditing-dataseets branch April 22, 2026 14:55
