Matteo Galacci edited this page Jan 28, 2022 · 46 revisions

Introduction

The idea behind this project is to make a CQRS + ES system, specifically one implemented with the Broadway library, compliant with the General Data Protection Regulation (GDPR), and in particular with the right to be forgotten.

Basic concepts

CQRS

CQRS (Command Query Responsibility Segregation) is a pattern that separates the responsibilities for commands (writes) and queries (reads) on a given model. In practice, this leads to different concrete objects, split into write models and read models, and it can also lead to different tables or data stores for the write side and the read side. Separation aside, the latest state of the model is still persisted, as happens in traditional CRUD systems.

Event Sourcing

ES (Event Sourcing), used together with CQRS, "transforms" the write side of the CQRS models into a succession of events that are persisted in an Event Store, a specific table or data store that acts as a chronological and immutable register of events. The idea is that commands executed on a model lead to events being issued, which are then stored in the Event Store. This table holds all events issued by a model, each with an incremental index, sometimes called the playhead, that represents the order in which the events were issued; for each new model (Aggregate), its events start at playhead = 0. The Event Store is therefore the system of record for all events: re-applying them to the model in the order they were generated brings it to its latest state, and replaying only part of the history shows a previous state of the model. Listeners that subscribe to these events can then project views (Read Models) or generate new commands (Processors). The views are the models used by the read queries, persisted in tables other than the Event Store or even in a different data store.
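As an illustration of the playhead and replay ideas above, here is a minimal, language-agnostic sketch in Python. It is not Broadway's API: `InMemoryEventStore`, `StoredEvent`, `replay` and the `EmailChanged` event are hypothetical names introduced only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class StoredEvent:
    aggregate_id: str
    playhead: int            # position of the event within its aggregate's stream
    name: str
    payload: dict


class InMemoryEventStore:
    """Append-only store: events are added, never updated or deleted."""

    def __init__(self) -> None:
        self._events: list = []

    def append(self, aggregate_id: str, name: str, payload: dict) -> StoredEvent:
        # The next playhead equals the number of events already stored for this
        # aggregate, so the first event of every aggregate gets playhead 0.
        playhead = sum(1 for e in self._events if e.aggregate_id == aggregate_id)
        event = StoredEvent(aggregate_id, playhead, name, dict(payload))
        self._events.append(event)
        return event

    def load(self, aggregate_id: str, up_to_playhead: Optional[int] = None) -> list:
        stream = sorted(
            (e for e in self._events if e.aggregate_id == aggregate_id),
            key=lambda e: e.playhead,
        )
        if up_to_playhead is not None:
            stream = [e for e in stream if e.playhead <= up_to_playhead]
        return stream


def replay(events: list) -> dict:
    """Rebuild a model's state by applying its events in playhead order."""
    state: dict = {}
    for event in events:
        if event.name == "UserRegistered":
            state = dict(event.payload)
        elif event.name == "EmailChanged":   # hypothetical second event
            state["email"] = event.payload["email"]
    return state


store = InMemoryEventStore()
store.append("user-1", "UserRegistered", {"name": "Matteo", "email": "old@example.com"})
store.append("user-1", "EmailChanged", {"email": "new@example.com"})

current_state = replay(store.load("user-1"))                      # full history
previous_state = replay(store.load("user-1", up_to_playhead=0))   # earlier state
```

Replaying the full stream yields the latest state, while stopping at an earlier playhead yields the state the model had at that point.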

CQRS+ES diagram

Event Store immutability

So, using the whole CQRS+ES pattern, we have an Event Store in which all events are written in chronological order and grouped per model by aggregate id. The Event Store is immutable by nature: once an event has been written, it can never change. If necessary, compensation events are issued to compensate for previous events. Imagine a bank account and its list of transactions, and think of a compensation event as a reversal.
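The bank account analogy can be sketched as follows. This is only an illustration in Python; the event names (`Deposited`, `Withdrawn`, `WithdrawalReversed`) and the `balance` helper are assumptions made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    playhead: int
    kind: str      # "Deposited", "Withdrawn" or "WithdrawalReversed"
    amount: int    # amount in cents


def balance(history: tuple) -> int:
    """Fold the whole history into the current balance."""
    total = 0
    for tx in history:
        if tx.kind in ("Deposited", "WithdrawalReversed"):
            total += tx.amount
        elif tx.kind == "Withdrawn":
            total -= tx.amount
    return total


history = (
    Transaction(0, "Deposited", 10_000),
    Transaction(1, "Withdrawn", 2_500),   # recorded by mistake
)

# The mistaken withdrawal is neither edited nor deleted:
# a new event is appended that reverses its effect.
corrected = history + (Transaction(2, "WithdrawalReversed", 2_500),)
```

The original events stay in place, so the mistake and its reversal both remain visible in the history.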

Projections

In a CQRS + ES system there are usually projections. If the Event Store is the chronological register of all the write operations that took place on a specific Aggregate, then a projection is a specific view of that data; for a single Aggregate we can have as many views as we need. So, after an event is issued, an event listener can listen for that event in order to project a view of it. Multiple event listeners can listen for the same event to project different representations of the same data set.
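A minimal sketch of two listeners projecting different views from the same event might look like this. It is written in Python for illustration and does not use Broadway's classes; `EventBus`, `UserListProjector` and `EmailDomainCounter` are hypothetical names, and the email address is a placeholder.

```python
from __future__ import annotations


class UserListProjector:
    """Read model 1: one row per user with a display name."""

    def __init__(self) -> None:
        self.rows: dict = {}

    def handle(self, event: dict) -> None:
        if event["type"] == "UserRegistered":
            p = event["payload"]
            self.rows[p["id"]] = {"full_name": f'{p["name"]} {p["surname"]}'}


class EmailDomainCounter:
    """Read model 2: number of registered users per email domain."""

    def __init__(self) -> None:
        self.counts: dict = {}

    def handle(self, event: dict) -> None:
        if event["type"] == "UserRegistered":
            domain = event["payload"]["email"].split("@", 1)[1]
            self.counts[domain] = self.counts.get(domain, 0) + 1


class EventBus:
    """Delivers every published event to all subscribed listeners."""

    def __init__(self) -> None:
        self._listeners: list = []

    def subscribe(self, listener) -> None:
        self._listeners.append(listener)

    def publish(self, event: dict) -> None:
        for listener in self._listeners:
            listener.handle(event)


user_list = UserListProjector()
domains = EmailDomainCounter()
bus = EventBus()
bus.subscribe(user_list)
bus.subscribe(domains)

bus.publish({
    "type": "UserRegistered",
    "payload": {
        "id": "b0fce205-d816-46ac-886f-06de19236750",
        "name": "Matteo",
        "surname": "Galacci",
        "email": "matteo@example.com",
    },
})
```

Both read models are fed by the same event, yet each stores only the representation it needs.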

Art. 17 GDPR - Right to be forgotten

The law says: "The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay..." Read the complete legislation

CQRS + ES + GDPR

We have said that in the CQRS+ES pattern the Event Store is immutable, and we have also said that, to be compliant with the GDPR, a user must be able to request the erasure of their data. Put that way it seems a paradox, right? Deleting a user's data in a CQRS+ES system would mean either deleting events from the Event Store or modifying existing events, and we can do neither. Compensation events are not useful in this case because, by going back in history, we could always recover the user's data.

The proposal

This library proposes a solution to the problem above.

Event Store

Instead of thinking in terms of deleting or modifying events, the idea is to persist, from the very beginning of the history, events whose payload (user information, or in general any sensitive data the event contains) is encrypted with an encryption key specific to each Aggregate. As long as the key is present, the data can be encrypted and decrypted. When the key is deleted (following a user request), the events remain in the Event Store, but the encrypted payload can no longer be decrypted. Thus, the history remains unchanged, but the data is no longer readable.
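The per-Aggregate key idea can be sketched as follows. This is a Python illustration under loud assumptions, not the library's implementation: `KeyStore`, `sensitize` and `desensitize` are hypothetical names, the `#-#ciphertext:nonce` layout only imitates the look of the sample payload on this page, and the cipher is a deliberately simple hash-based XOR stream used as a stand-in for a real, vetted cipher such as AES. Do not use it to protect real data.

```python
from __future__ import annotations

import base64
import hashlib
import secrets
from typing import Optional

MARKER = "#-#"  # assumed marker for sensitized values, imitating the sample payload


class KeyStore:
    """Holds one encryption key per aggregate id."""

    def __init__(self) -> None:
        self._keys: dict = {}

    def create(self, aggregate_id: str) -> bytes:
        key = secrets.token_bytes(32)
        self._keys[aggregate_id] = key
        return key

    def get(self, aggregate_id: str) -> Optional[bytes]:
        return self._keys.get(aggregate_id)

    def delete(self, aggregate_id: str) -> None:
        self._keys.pop(aggregate_id, None)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # TOY keystream (SHA-256 in counter mode). Stand-in for a real cipher only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def sensitize(value: str, key: bytes) -> str:
    nonce = secrets.token_bytes(16)
    data = value.encode("utf-8")
    cipher = bytes(a ^ b for a, b in zip(data, _keystream(key, nonce, len(data))))
    return MARKER + base64.b64encode(cipher).decode() + ":" + base64.b64encode(nonce).decode()


def desensitize(value: str, key: Optional[bytes]) -> str:
    # Without a key (or for a plain value) the stored string is returned untouched.
    if key is None or not value.startswith(MARKER):
        return value
    cipher_b64, nonce_b64 = value[len(MARKER):].split(":", 1)
    cipher = base64.b64decode(cipher_b64)
    nonce = base64.b64decode(nonce_b64)
    plain = bytes(a ^ b for a, b in zip(cipher, _keystream(key, nonce, len(cipher))))
    return plain.decode("utf-8")


keys = KeyStore()
key = keys.create("user-1")
stored = sensitize("Galacci", key)                   # what the Event Store would hold

before = desensitize(stored, keys.get("user-1"))     # key present: readable
keys.delete("user-1")                                # the user asks to be forgotten
after = desensitize(stored, keys.get("user-1"))      # key gone: stays sensitized
```

The stored event value is identical before and after the key is deleted; only the ability to read it changes.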

Normal payload

{
    "class": "SensitiveUser\\User\\Domain\\Event\\UserRegistered",
    "payload": {
        "id": "b0fce205-d816-46ac-886f-06de19236750",
        "name": "Matteo",
        "surname": "Galacci",
        "email": "m.galacci@gmail.com",
        "occurred_at": "2022-01-08T14:22:38.065+00:00"
    }
}

Sensitized payload

{
    "class": "SensitiveUser\\User\\Domain\\Event\\UserRegistered",
    "payload": {
        "id": "b0fce205-d816-46ac-886f-06de19236750",
        "name": "Matteo",
        "surname": "#-#2Iuofg4NKKPLAG2kdJrbmQ==:bxQo+zXfjUgrD0jHuht0mQ==", // Sensitized
        "email": "#-#OFLfN9XDKtWrmCmUb6mhY0Iz2V6wtam0pcqs6vDJFRU=:bxQo+zXfjUgrD0jHuht0mQ==", // Sensitized
        "occurred_at": "2022-01-08T14:22:38.065+00:00"
    }
}
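Judging only from the sample above, sensitized values appear to start with `#-#`. Treating that prefix as a marker is an assumption, not something this page states. Under that assumption, a small Python helper (the name `sensitized_fields` is hypothetical) could tell which fields of a payload are sensitized:

```python
from __future__ import annotations

SENSITIZED_PREFIX = "#-#"  # assumption: inferred from the sample payload


def sensitized_fields(payload: dict) -> list:
    """Return the names of the payload fields whose value looks sensitized."""
    return [
        name
        for name, value in payload.items()
        if isinstance(value, str) and value.startswith(SENSITIZED_PREFIX)
    ]


payload = {
    "id": "b0fce205-d816-46ac-886f-06de19236750",
    "name": "Matteo",
    "surname": "#-#2Iuofg4NKKPLAG2kdJrbmQ==:bxQo+zXfjUgrD0jHuht0mQ==",
    "email": "#-#OFLfN9XDKtWrmCmUb6mhY0Iz2V6wtam0pcqs6vDJFRU=:bxQo+zXfjUgrD0jHuht0mQ==",
    "occurred_at": "2022-01-08T14:22:38.065+00:00",
}

fields = sensitized_fields(payload)
```

A check of this kind is one way a Value Object or a projector could recognise data that can no longer be decrypted.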

Projections

The sensitization operation is performed at a different time from the event projection, so the views hold decrypted data and read operations work correctly. When a user exercises the right to be forgotten, you should do three things:

  1. Delete their encryption key
  2. Delete the views that contain their data
  3. Reproject the events to regenerate the views with encrypted data. (This is easy: since the encryption key for that Aggregate no longer exists, the Aggregate read from the Event Store is hydrated with the sensitized data. This obviously requires specific checks in the Value Objects or in the Aggregate itself.)
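The three steps above can be sketched as follows. This Python example uses in-memory dictionaries for the key store and the views, and every name in it (`project`, `forget`, `read_value`) is hypothetical. The XOR "cipher" is only a placeholder so the example runs without third-party packages; a real system would use a proper cipher.

```python
from __future__ import annotations

import base64
from itertools import cycle
from typing import Optional

PREFIX = "#-#"  # assumed marker for sensitized values


def _xor(data: bytes, key: bytes) -> bytes:
    # Placeholder cipher, NOT secure: stands in for real encryption.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def sensitize(value: str, key: bytes) -> str:
    return PREFIX + base64.b64encode(_xor(value.encode("utf-8"), key)).decode()


def read_value(value: str, key: Optional[bytes]) -> str:
    # Without a key the sensitized string is passed through unchanged.
    if key is None or not value.startswith(PREFIX):
        return value
    return _xor(base64.b64decode(value[len(PREFIX):]), key).decode("utf-8")


def project(events: list, keys: dict) -> dict:
    """Build one view per aggregate, decrypting fields when a key exists."""
    views: dict = {}
    for event in events:
        key = keys.get(event["aggregate_id"])
        views[event["aggregate_id"]] = {
            field: read_value(value, key) for field, value in event["payload"].items()
        }
    return views


def forget(aggregate_id: str, keys: dict, views: dict, events: list) -> None:
    keys.pop(aggregate_id, None)    # 1. delete the encryption key
    views.pop(aggregate_id, None)   # 2. delete the views containing the data
    own_events = [e for e in events if e["aggregate_id"] == aggregate_id]
    views.update(project(own_events, keys))   # 3. reproject from the Event Store


keys = {"user-1": b"key-of-user-1", "user-2": b"key-of-user-2"}
events = [
    {"aggregate_id": "user-1",
     "payload": {"name": "Matteo", "email": sensitize("matteo@example.com", keys["user-1"])}},
    {"aggregate_id": "user-2",
     "payload": {"name": "Anna", "email": sensitize("anna@example.com", keys["user-2"])}},
]

views = project(events, keys)
email_before = views["user-1"]["email"]

forget("user-1", keys, views, events)
email_after = views["user-1"]["email"]
```

After `forget`, the events are untouched and the view of the forgotten user holds only the sensitized string, while other users' views are unaffected.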

Event Store diagram

Important note

It's important to understand that the idea behind this project is not about general security or data leaks. The idea behind this implementation is rather to make a CQRS + ES system compliant with the user's right to ask to be forgotten at any time, while keeping the system consistent.