GDPR Compliance
DataVigil has built-in support for GDPR at the field level. You pick which properties on which entities need protection, and the library handles the transformations automatically. There are two independent layers - storage policies and retrieval policies - plus a right-to-erasure mechanism for Article 17 requests.
Storage policies transform data before it reaches the database. The original value is never persisted, so these transformations are permanent and irreversible.
```csharp
builder.Services.AddAuditTrail(options =>
{
    options.Gdpr.ForEntity<Customer>(e =>
    {
        e.ExcludeOnStorage(c => c.CreditCard);
        e.MaskOnStorage(c => c.Email);
        e.AnonymizeOnStorage(c => c.FullName);
        e.HashOnStorage(c => c.Ssn);
        e.TransformOnStorage(c => c.Phone, val => $"+***-***-{val[^4..]}");
    });
});
```

| Action | Method | What happens | Example |
|---|---|---|---|
| Exclude | `ExcludeOnStorage(field)` | Property removed from the audit entry entirely | `CreditCard` property is gone |
| Mask | `MaskOnStorage(field)` | First char + `***` + last char | `"alice@mail.com"` -> `"a***m"` |
| Anonymize | `AnonymizeOnStorage(field)` | Replaced with a fixed marker | `"Alice Smith"` -> `"[ANONYMIZED]"` |
| Hash | `HashOnStorage(field)` | SHA-256 hex digest (64 characters) | `"123-45-6789"` -> `"a1b2c3...f6"` |
| Custom | `TransformOnStorage(field, func)` | Your own `Func<string, string>` | Whatever you write |

Null values are never transformed. If the field value is null going in, it stays null.
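
To make the table concrete, here is a minimal sketch of what the Mask, Anonymize, and Hash actions could look like. These are illustrative re-implementations, not DataVigil's own code: the function names, the UTF-8 encoding, the lowercase hex output, and the empty-string handling are all assumptions.

```csharp
#nullable enable
using System;
using System.Security.Cryptography;
using System.Text;

// Illustrative stand-ins only; DataVigil's internal implementation may differ.

// Mask: first char + "***" + last char. Null passes through untouched.
string? Mask(string? value)
{
    if (value is null) return null;
    if (value.Length == 0) return value; // assumption: nothing to mask
    return $"{value[0]}***{value[^1]}";
}

// Anonymize: any non-null value becomes the fixed marker.
string? Anonymize(string? value) => value is null ? null : "[ANONYMIZED]";

// Hash: SHA-256 over the UTF-8 bytes, rendered as 64 lowercase hex characters.
// (Encoding and casing are assumptions; the docs only promise a 64-char hex digest.)
string? Hash(string? value)
{
    if (value is null) return null;
    using var sha = SHA256.Create();
    var digest = sha.ComputeHash(Encoding.UTF8.GetBytes(value));
    return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
}

Console.WriteLine(Mask("alice@mail.com"));       // a***m
Console.WriteLine(Anonymize("Alice Smith"));     // [ANONYMIZED]
Console.WriteLine(Hash("123-45-6789")!.Length);  // 64
Console.WriteLine(Mask(null) is null);           // True
```

Note that a mask of this shape is idempotent: masking `"a***m"` again yields `"a***m"`.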
GdprProcessor.ApplyStoragePolicies(AuditEntry) iterates through each property on the entry, checks if there's a matching storage rule in the GdprPolicyRegistry, and applies the transformation. For Exclude, the property is removed from the collection entirely. For everything else, the OldValue and NewValue are both transformed.
The method returns a tuple: (AuditEntry entry, bool applied, bool fullyAnonymized). The fullyAnonymized flag is true only when every applied rule is either Anonymize or Exclude. If any Mask, Hash, or Custom rule was used, it's false. This is how the pipeline decides whether to set GdprStorageState.FullyAnonymized vs PartiallyProcessed.
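
The flag logic described above can be sketched as follows. This is a hypothetical helper, not the library's code: rule kinds are plain strings here, and returning `(false, false)` when no rule matched is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: given the kinds of rules that were applied to one entry,
// derive the (applied, fullyAnonymized) pair. Strings stand in for whatever
// type the library uses to represent rule kinds.
(bool applied, bool fullyAnonymized) Summarize(IReadOnlyCollection<string> appliedRules)
{
    var applied = appliedRules.Count > 0;
    // True only if at least one rule ran and every one was Anonymize or Exclude.
    var fully = applied && appliedRules.All(r => r == "Anonymize" || r == "Exclude");
    return (applied, fully);
}

Console.WriteLine(Summarize(new[] { "Anonymize", "Exclude" })); // (True, True)
Console.WriteLine(Summarize(new[] { "Anonymize", "Hash" }));    // (True, False)
Console.WriteLine(Summarize(Array.Empty<string>()));            // (False, False)
```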
Retrieval policies don't change stored data. They're applied when querying audit records, and they control what the caller sees based on their roles and claims.
```csharp
options.Gdpr.ForEntity<Order>(e =>
{
    e.MaskOnRetrieval(o => o.CustomerEmail, access => access
        .AllowRoles("Admin", "Auditor"));
    e.AnonymizeOnRetrieval(o => o.CustomerPhone, access => access
        .AllowClaim("gdpr", "full"));
});
```

When a query comes in with a GdprRetrievalContext, the processor checks each retrieval rule:
- Does the caller have any of the `AllowedRoles`? -> show raw value
- Does the caller have any of the `AllowedClaims`? -> show raw value
- Neither? -> apply mask or anonymize
The logic is OR-based. One matching role or one matching claim is enough to grant access. You don't need both.
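
As a rough illustration of that OR check, the sketch below decides whether a caller sees the raw value. `CanSeeRaw` and its parameter shapes are hypothetical, and exact, case-sensitive matching is an assumption; the real processor may compare differently.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical check: one matching role OR one matching claim grants raw access.
// A missing context (null roles and claims) never grants access.
bool CanSeeRaw(
    string[] allowedRoles,
    (string Type, string Value)[] allowedClaims,
    string[]? userRoles,
    Dictionary<string, string>? userClaims)
{
    // Any single matching role is enough.
    if (userRoles != null && userRoles.Intersect(allowedRoles).Any()) return true;

    // Otherwise, any single claim matching on both type and value is enough.
    return userClaims != null && allowedClaims.Any(c =>
        userClaims.TryGetValue(c.Type, out var v) && v == c.Value);
}

var roles = new[] { "Admin", "Auditor" };
var claims = new[] { (Type: "gdpr", Value: "full") };
var fullClaim = new Dictionary<string, string> { ["gdpr"] = "full" };

Console.WriteLine(CanSeeRaw(roles, claims, new[] { "Admin" }, null));        // True (role match)
Console.WriteLine(CanSeeRaw(roles, claims, new[] { "Support" }, fullClaim)); // True (claim match)
Console.WriteLine(CanSeeRaw(roles, claims, null, null));                     // False (no context)
```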
When calling IAuditStore.QueryAsync(), pass a context describing the current user:
```csharp
var context = new GdprRetrievalContext
{
    UserRoles = new[] { "Admin" },
    UserClaims = new Dictionary<string, string> { ["gdpr"] = "full" }
};

var result = await store.QueryAsync(
    new AuditTransactionQuery { Skip = 0, Take = 50 },
    gdprRetrievalContext: context);
```

If you pass null (or don't provide a context), every retrieval policy applies and all sensitive fields come back masked or anonymized. This is the safe default.
In an ASP.NET Core controller, you'd typically build the context from the current user's claims:
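```csharp
// Requires: using System.Linq; using System.Security.Claims;
var context = new GdprRetrievalContext
{
    UserRoles = User.Claims
        .Where(c => c.Type == ClaimTypes.Role)
        .Select(c => c.Value)
        .ToArray(),
    // A user can hold several claims of the same type (multiple roles, for
    // example), and ToDictionary throws on duplicate keys, so this keeps the
    // first value per claim type.
    UserClaims = User.Claims
        .GroupBy(c => c.Type)
        .ToDictionary(g => g.Key, g => g.First().Value)
};
```

You can apply storage and retrieval policies to the same field for layered protection: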
```csharp
options.Gdpr.ForEntity<Order>(e =>
{
    // Layer 1: mask before writing
    e.MaskOnStorage(o => o.CustomerEmail);
    // Layer 2: mask again on read (unless Admin)
    e.MaskOnRetrieval(o => o.CustomerEmail, a => a.AllowRoles("Admin"));
});
```

With this configuration:
- The raw email is never stored (only the masked version `"a***m"`)
- When an Admin queries, they see `"a***m"` (the stored masked value)
- When a non-Admin queries, they see `"a***m"` masked again -> `"a***m"` (double-masked is the same as single-masked for this pattern)
This matters more with different action types. For example, you might hash on storage but anonymize on retrieval for non-privileged users - so they see [ANONYMIZED] instead of the hash.
Every AuditTransaction has a GdprState field that tracks what happened to its data:
| Value | Meaning |
|---|---|
| `Original` (0) | No GDPR policies were applied. Data is stored as-is. |
| `PartiallyProcessed` (1) | Some fields were transformed, but at least one used Mask, Hash, or Custom (not fully anonymized). |
| `FullyAnonymized` (2) | Every field with a GDPR rule was either Anonymized or Excluded. No identifying data remains. |
| `Erased` (3) | Right-to-erasure was invoked. User identity fields have been replaced with `[ERASED]`. |

The pipeline sets this automatically based on what GdprProcessor reports. If multiple entries are in a single transaction, all of them must be fully anonymized for the transaction to get the FullyAnonymized state. One partially-processed entry downgrades the whole transaction to PartiallyProcessed.
Entries without any GDPR rules don't affect the determination - they're considered "clean" and don't prevent a FullyAnonymized state.
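
The aggregation rule can be sketched like this. `TransactionState` is a hypothetical helper that takes the per-entry `(applied, fullyAnonymized)` pairs reported by the processor; state names are returned as strings for brevity, where the library uses its own state type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical roll-up of per-entry results into a transaction-level state.
string TransactionState(IEnumerable<(bool applied, bool fullyAnonymized)> entries)
{
    // Entries with no GDPR rules are "clean" and do not take part in the decision.
    var processed = entries.Where(e => e.applied).ToList();
    if (processed.Count == 0) return "Original";

    // A single partially processed entry downgrades the whole transaction.
    return processed.All(e => e.fullyAnonymized) ? "FullyAnonymized" : "PartiallyProcessed";
}

var clean = (applied: false, fullyAnonymized: false);
var full = (applied: true, fullyAnonymized: true);
var partial = (applied: true, fullyAnonymized: false);

Console.WriteLine(TransactionState(new[] { clean }));         // Original
Console.WriteLine(TransactionState(new[] { full, clean }));   // FullyAnonymized
Console.WriteLine(TransactionState(new[] { full, partial })); // PartiallyProcessed
```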
When a user invokes their right to be forgotten, call:
```csharp
var store = serviceProvider.GetRequiredService<IAuditStore>();
await store.AnonymizeByUserAsync("user-123");
```

This walks through every AuditTransaction where UserId == "user-123" and replaces:
| Field | Before | After |
|---|---|---|
| `UserId` | `"user-123"` | `"[ERASED]"` |
| `UserName` | `"Alice Smith"` | `"[ERASED]"` |
| `IpAddress` | `"192.168.1.42"` | `"[ERASED]"` |
| `GdprState` | whatever it was | `Erased` |

The audit entries themselves (entity changes, timestamps, actions) remain intact. You keep the structural audit trail for compliance, but the person behind the actions can no longer be identified.
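
To show the effect on stored data, here is a small in-memory sketch of the erasure pass. It is not how `AnonymizeByUserAsync` is implemented: transactions are modelled as plain dictionaries, and `EraseUser` and the `Action` key are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory version of the erasure pass. Only identity fields and
// the state are rewritten; everything else on the row is left untouched.
int EraseUser(List<Dictionary<string, string>> transactions, string userId)
{
    var erased = 0;
    foreach (var tx in transactions)
    {
        if (tx["UserId"] != userId) continue;
        tx["UserId"] = "[ERASED]";
        tx["UserName"] = "[ERASED]";
        tx["IpAddress"] = "[ERASED]";
        tx["GdprState"] = "Erased";
        erased++;
    }
    return erased;
}

var rows = new List<Dictionary<string, string>>
{
    new() { ["UserId"] = "user-123", ["UserName"] = "Alice Smith", ["IpAddress"] = "192.168.1.42",
            ["GdprState"] = "Original", ["Action"] = "Update" },
    new() { ["UserId"] = "user-456", ["UserName"] = "Bob Jones", ["IpAddress"] = "10.0.0.7",
            ["GdprState"] = "Original", ["Action"] = "Insert" },
};

Console.WriteLine(EraseUser(rows, "user-123")); // 1
Console.WriteLine(rows[0]["UserName"]);         // [ERASED]
Console.WriteLine(rows[0]["Action"]);           // Update
Console.WriteLine(rows[1]["UserName"]);         // Bob Jones
```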
Policies are stored in GdprPolicyRegistry, a singleton that gets populated during startup from your GdprOptions.ForEntity<T>() calls. The registry supports lookup by both Type and entity name (string), which is important because EF Core interceptors sometimes only know the entity name from SQL metadata, not the CLR type.
Each entity's policy (EntityGdprPolicy) contains two separate lists of FieldGdprRule - one for storage, one for retrieval. The GdprProcessor consults these lists at the appropriate moment.
RzR.DataVigil · Source Code · NuGet Packages · Built with .NET Standard 2.1