Why Event Sourcing?

David Brower edited this page Apr 13, 2016 · 23 revisions

The tldr

While subject to the same trade-offs involved in the adoption of any new model or tool, event sourcing can help us avoid the pitfalls of prematurely modelling the state of a software system by focusing instead on the domain events that mutate that state. By capturing these events we gain the flexibility to transform data into meaningful views in response to current and future business needs.

Introduction

The what of event sourcing is not hard to find with a cursory search of the internet. An event in this context can be understood as an action or an occurrence that results in a mutation of system state. In a sentence, event sourcing is a model whereby events are regarded as the first-class citizens of a software system and whatever stateful view is required of the data is generated by applying some transformation to those events. As Greg Young has commented, "current State is a left fold of previous facts".
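To make the "left fold" idea concrete, here is a minimal sketch in Python. The shopping-cart domain and the names `ItemAdded`, `ItemRemoved` and `apply` are assumptions invented for illustration; they do not come from any particular event sourcing library.

```python
from dataclasses import dataclass
from functools import reduce


# Hypothetical domain events for a shopping cart (illustrative only).
@dataclass(frozen=True)
class ItemAdded:
    sku: str
    quantity: int


@dataclass(frozen=True)
class ItemRemoved:
    sku: str
    quantity: int


def apply(state: dict, event) -> dict:
    """Fold step: return a new state with a single event applied."""
    if isinstance(event, ItemAdded):
        return {**state, event.sku: state.get(event.sku, 0) + event.quantity}
    if isinstance(event, ItemRemoved):
        remaining = max(state.get(event.sku, 0) - event.quantity, 0)
        return {**state, event.sku: remaining}
    return state  # event types this view does not know about are ignored


events = [ItemAdded("apple", 3), ItemAdded("pear", 1), ItemRemoved("apple", 1)]

# Current state is the left fold of the events over an empty initial state.
current_state = reduce(apply, events, {})
print(current_state)
```

Nothing is stored except the events themselves; the dictionary of quantities is derived on demand and could be replaced by a different fold at any time.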

The why of event sourcing is perhaps harder to grasp, because it marks a departure from a habit of thinking reinforced by decades of line-of-business (LOB) systems. With the notable exception of financial systems, these have focused on the current state of the data and treated how the system got to that state as relatively unimportant.

With the greater proportion of new LOB applications still being modelled in the 'traditional' state-first way, event sourcing has yet to make the kind of impact on the experience of most developers that Object-Oriented Programming did in the mid-90s. That being said, the potential benefits of event sourcing warrant consideration by any software team faced with developing a new non-trivial system or refactoring an existing one.

It should also be emphasised that adopting event sourcing is by no means a trivial endeavour, and the stakeholders in the development or refactoring of a system where event sourcing is being considered need to be aware of the inevitable trade-offs associated with it.

To understand the potential advantages of event sourcing, it helps to look at the disadvantages that software teams encounter with systems built in the state-first way.

Premature Data Modelling

When we start to develop a non-event-sourced system we often find that we are pushed to prematurely model the state of the system in our database schema. In the early stages of development we do not yet have a clear understanding of what data the users will find useful or important once the system has gone live. And yet we usually find that we need to commit to some form of fixed schema sooner rather than later in the project, and changes to that schema become ever more expensive in terms of rework.

Domain-Driven Design has led many developers to rethink their approach to the relationship between software and the data that it works with. Over the last 15 years we have seen a greater move towards a Model-First approach over a Data-First approach. But even with DDD we find that we must commit at some point to a database schema that probably does not best represent what users will find most important or useful once they start working with it.

Event Sourcing, while by no means a panacea, allows us to avoid this kind of premature modelling by delaying the creation of the schema of the data that users will see until the last responsible moment. Instead, event sourcing focuses on modelling state changes at a much more granular level, that is, at the level of individual events. The model of the data that is presented to the user (the schema, i.e. the read model) can be developed separately from the model of the events which mutate the state (i.e. the write model).

As you will see in the next section, because events are the first-class citizens of an event-sourced system, we can provide a stateful view of the system based on frequently changing requirements and not pay the price of having built the system around one (most likely only temporary) view of the data.

You might be thinking that we have gotten into the territory of Command Query Responsibility Segregation and you would be right! CQRS and Event Sourcing work readily together in building a system where the read model and the write model are designed and developed separately.
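The separation of write model and read model can be sketched as follows. This is an illustration under assumed names (`OrderPlaced`, `OrderCancelled`, `open_orders`, `orders_placed_per_customer`), with a plain Python list standing in for an event store; a real system would use a persistent, append-only store.

```python
from collections import Counter
from dataclasses import dataclass


# Write model: the only thing we commit to up front is the shape of each event.
@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    customer: str
    total: int  # minor currency units


@dataclass(frozen=True)
class OrderCancelled:
    order_id: str


# A list stands in for the append-only event store (assumption).
event_log = [
    OrderPlaced("o1", "alice", 4000),
    OrderPlaced("o2", "bob", 1500),
    OrderPlaced("o3", "alice", 2500),
    OrderCancelled("o2"),
]


def open_orders(events):
    """Read model 1: orders that have been placed and not cancelled."""
    orders = {}
    for event in events:
        if isinstance(event, OrderPlaced):
            orders[event.order_id] = event
        elif isinstance(event, OrderCancelled):
            orders.pop(event.order_id, None)
    return orders


def orders_placed_per_customer(events):
    """Read model 2: added later, built from the same events, no migration."""
    return Counter(e.customer for e in events if isinstance(e, OrderPlaced))


print(sorted(open_orders(event_log)))
print(orders_placed_per_customer(event_log))
```

The second read model could be introduced long after the first without touching the stored events, which is the flexibility that deferring the schema is meant to buy.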

Loss of Data in Non-Event-Sourced Systems

One of the consequences of committing to a schema into which the state of the data must be projected is that the system, by necessity, suffers a massive and ongoing loss of data. This loss is two-fold:

  • Data that does not fit the schema. Most systems are capable of capturing many more types of data than are in fact captured. If we have a fixed database schema we tend to throw away data that does not fit that schema. The problem is that if we later decide this data matters, we shall only be able to provide readings from the point at which the software update that captures it was rolled out, but not the relevant data that was emitted before the update.

    With event sourcing, streams are cheap: we can simply publish as many events to the event store as we want and worry about how to represent them later.

  • The facts that have led to the current state. With the notable exception of financial systems, non-event-sourced systems tend to store only the current snapshot of the system data. The state-mutating events that brought about the current state are effectively discarded in favour of a record that represents the accumulation of these events. As previously mentioned, we can understand current state as a transformation on events, a projection. In the case of the balance of a bank account, we can understand that the current balance is the result of events that successively credited or debited the balance:

    [Figure: State]

    Of course, in banking it would be unthinkable that these events would be discarded, for the same reasons that accountants don't use erasers. Outside of financial systems, however, the events that brought about the current state tend to be thrown away. This means that we lose the ability to use those events to project the data in a different way in the future.

    With event sourcing, the state-mutating events that brought about the current state of the system are preserved, meaning that we can effectively replay the history of a certain business entity, like a bank account, and get the state of that entity at a given point in time. Further, we can create read models that show the state of the system at a certain point in the past.
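The bank account case can be sketched as a replay that stops at a chosen date. The event types `Credited` and `Debited`, the function `balance_as_of` and the sample history are all hypothetical, chosen only to illustrate the idea; amounts are held as integers in minor units to avoid rounding issues.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical account events; amounts are in minor units (e.g. cents).
@dataclass(frozen=True)
class Credited:
    on: date
    amount: int


@dataclass(frozen=True)
class Debited:
    on: date
    amount: int


def balance_as_of(events, as_of: date) -> int:
    """Replay the account history, ignoring events dated after `as_of`."""
    balance = 0
    for event in events:
        if event.on > as_of:
            continue  # this event had not happened yet at the requested date
        if isinstance(event, Credited):
            balance += event.amount
        elif isinstance(event, Debited):
            balance -= event.amount
    return balance


# Illustrative history only; not real data.
history = [
    Credited(date(2016, 1, 5), 10_000),
    Debited(date(2016, 2, 1), 2_500),
    Credited(date(2016, 3, 10), 4_000),
]

print(balance_as_of(history, date(2016, 2, 15)))   # balance in mid-February
print(balance_as_of(history, date(2016, 12, 31)))  # balance at year end
```

A state-first system that stored only the latest balance could answer the second question but not the first, because the facts needed to reconstruct the earlier balance would already have been overwritten.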

Two-Phase Commit

(Pending)