Why Event Sourcing?
While subject to the same trade-offs involved in the adoption of any new model or tool, event sourcing can help us avoid the pitfalls of prematurely modelling the state of a software system by, instead, focusing on the domain events that mutate that state. By capturing these events we gain the flexibility to transform data into meaningful views in response to current and future business needs.
The what of event sourcing is not hard to find with a cursory search of the internet. An event in this context can be understood as an action or an occurrence - "something that happens" - that results in a mutation of system state. In a sentence, event sourcing is a model whereby events are regarded as the first-class citizens of a software system and whatever stateful view is required of the data is generated by applying some transformation to those events. Hence the koan of Greg Young: "current State is a left fold of previous facts".
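To make the "left fold" idea concrete, here is a minimal sketch in Python. The shopping-cart domain, the event names (ItemAdded, ItemRemoved) and the apply function are illustrative assumptions, not part of any particular event-sourcing library.

```python
from functools import reduce

# Hypothetical events for a shopping cart, as (event_type, item) tuples.
events = [
    ("ItemAdded", "book"),
    ("ItemAdded", "pen"),
    ("ItemAdded", "book"),
    ("ItemRemoved", "pen"),
]


def apply(state, event):
    """Return the new state produced by applying one event to the old state."""
    kind, item = event
    new_state = dict(state)  # copy, so the previous state is never mutated
    if kind == "ItemAdded":
        new_state[item] = new_state.get(item, 0) + 1
    elif kind == "ItemRemoved":
        remaining = new_state.get(item, 0) - 1
        if remaining > 0:
            new_state[item] = remaining
        else:
            new_state.pop(item, None)
    return new_state


# Current state is a left fold of the events, starting from an empty cart.
current_state = reduce(apply, events, {})
print(current_state)
```

The only thing stored is the list of events; the cart contents are derived on demand by folding apply over them.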
The why of event sourcing is perhaps harder to grasp, because it departs from a habit of thinking that has been reinforced by decades of line-of-business (LOB) systems. With the notable exception of financial systems, these have focused on the current state of the data and dismissed how the system got to that state as relatively unimportant.
With the greater proportion of new LOB applications still being modelled in the 'traditional' state-first way, event sourcing has yet to reach the day-to-day experience of most developers in the way that Object Oriented Programming did in the mid-90s. That being said, the potential benefits of event sourcing warrant consideration by any software team faced with developing a new non-trivial system or refactoring an existing one.
Nota bene: It should also be emphasised that adopting event sourcing is by no means a trivial endeavour, and the stakeholders in the development or refactoring of a system where event sourcing is being considered should be aware of the inevitable trade-offs associated with it.
To understand the potential advantages of event sourcing, it helps to look at the disadvantages that software teams encounter with systems built in the state-first way.
Premature Data Modelling
When we start to develop a non-event-sourced system we often find that we are pushed to prematurely model the state of the system in our database schema. In the early stages of development we do not - as yet - have a clear understanding of what data the users will find useful and/or important once the system has gone live. And yet, we usually find that we need to commit to some form of fixed schema sooner rather than later in the project, changes to which become ever more expensive in terms of rework.
Domain-Driven Design has led many developers to rethink their approach to the relationship between software and the data that it works with. Over the last 15 years we have seen a move towards a Model-First approach over a Data-First approach. But even with DDD, where we do not yet commit to a database schema, we nevertheless find ourselves committing to a domain model consisting of entities and value objects that will almost inevitably be translated directly into a database schema.
The weakness of both of these approaches - Model-First and Data-First - is that near the beginning of the project we commit to a set of abstract concepts (be it a database table or a domain entity) that we believe accurately represents the reality of the domain, at exactly the time when our understanding of the domain is still relatively immature and prone to assumptions. As we continue to work with the domain, our mental model begins to shift and encompass the (sometimes messy) reality of the actual domain. However, as we approach delivery of the project we find that we have built our system around a set of domain entities (or tables) that more closely reflects our earlier assumptions about the domain than our later, deeper understanding of it. And, thus, we find we have built a software system that only partially corresponds to the real requirements of the business.
Event Sourcing, while by no means a panacea, allows us to avoid this kind of premature modelling by delaying the creation of the schema of the data that users will see until the last responsible moment. Event sourcing focuses on modelling state changes at a much more granular level, that is, at the level of individual events. The model of the data that is presented to the user (the schema), i.e. the read model, can be developed separately from the modelling of the events which mutate the state, i.e. the write model. We effectively decouple the capturing of data from design decisions about how we will represent that data.
As you will see in the next section, because events are the first-class citizens of an event sourced system, we can provide a stateful view of the system based on frequently changing requirements and not pay the price of having built the system around one (most likely only temporary) view of the data.
You might be thinking that we have gotten into the territory of Command Query Responsibility Segregation and you would be right. CQRS and Event Sourcing work readily together in building a system where the read model and the write model can be designed and developed separately.
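The separation of read and write models can be sketched as follows. This is a minimal illustration under stated assumptions: the OrderPlaced event, the in-memory event_log list, and both projection functions are hypothetical names invented for the example, standing in for a real event store and real read models.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical write-side event: a customer ordered a product."""
    customer: str
    product: str
    amount: float


# Write model: an append-only log of events (a plain list stands in for an event store).
event_log = [
    OrderPlaced("alice", "keyboard", 50.0),
    OrderPlaced("bob", "monitor", 200.0),
    OrderPlaced("alice", "monitor", 200.0),
]


def spend_per_customer(events):
    """Read model 1: total amount spent by each customer."""
    totals = defaultdict(float)
    for event in events:
        totals[event.customer] += event.amount
    return dict(totals)


def units_per_product(events):
    """Read model 2: number of orders for each product."""
    return dict(Counter(event.product for event in events))


print(spend_per_customer(event_log))
print(units_per_product(event_log))
```

Both views are derived from the same log. A third view could be added later without changing how events are captured, which is the decoupling described above.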
Loss of Data in Non-Event-Sourced Systems
One consequence of committing to a schema into which the state of the data must be projected is that the system, by necessity, suffers a massive and ongoing loss of data. This loss is two-fold:
Data that does not fit the schema. Most systems are capable of capturing far more kinds of data than are in fact captured. If we have a fixed database schema we effectively ignore data that does not fit that schema. If, then, at a later point we are asked to change how the state is modelled we need to start capturing the relevant data. The problem is that we shall only be able to provide the new model of the state from the point at which a new software update was rolled out - we cannot retroactively show data from the past according to the new model.
With event sourcing, streams are cheap: we can simply publish as many events to the event store as we want and worry about how to represent them later.
The facts that have led to the current state. With the notable exception of financial systems, non-event-sourced systems tend to store only the current snapshot of the system data. The state-mutating events that brought about the current state are effectively discarded in favour of a record that represents the accumulation of these events. As previously mentioned, we can understand current state as a transformation on events, a projection. In the case of a bank account, the current balance is the result of the events that successively credited or debited the account.
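The bank account case can be sketched in a few lines of Python. The Credited and Debited event classes, the apply function and the amounts are assumptions made for illustration only.

```python
from dataclasses import dataclass
from decimal import Decimal
from functools import reduce


@dataclass(frozen=True)
class Credited:
    """Hypothetical event: money was paid into the account."""
    amount: Decimal


@dataclass(frozen=True)
class Debited:
    """Hypothetical event: money was taken out of the account."""
    amount: Decimal


def apply(balance, event):
    """Return the balance after applying a single event."""
    if isinstance(event, Credited):
        return balance + event.amount
    if isinstance(event, Debited):
        return balance - event.amount
    raise TypeError(f"unknown event type: {type(event).__name__}")


# The facts, in the order in which they happened.
events = [
    Credited(Decimal("100.00")),
    Debited(Decimal("30.00")),
    Credited(Decimal("15.50")),
]

# The current balance is a projection: a fold over every credit and debit.
balance = reduce(apply, events, Decimal("0"))
print(balance)
```

A state-first system would keep only the final figure; here the figure is recomputed from the three facts that produced it.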
Of course, in banking it would be unthinkable that these events would be discarded, for the same reasons that accountants don't use erasers. However, outside of financial systems, the events that brought about the current state tend to be thrown away. This means that we lose the ability to use those events to project the data in a different way in the future.
With event sourcing, the state-mutating events that brought about the current state of the system are preserved, meaning that we can effectively replay the history of a certain business entity - like a bank account - and get the state of that entity at a certain point in time. Further, we can create read models that show the state of the system at any point in the past that we are interested in.
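Replaying history up to a chosen moment might look like the sketch below. It assumes each event carries a timestamp; the AccountEvent class, the balance_as_of function, the dates and the amounts (in minor units such as pence) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccountEvent:
    """Hypothetical timestamped event on a bank account."""
    occurred_at: datetime
    kind: str    # "credited" or "debited"
    amount: int  # minor units, e.g. pence


def balance_as_of(events, moment):
    """Replay events in time order and stop once they are later than `moment`."""
    balance = 0
    for event in sorted(events, key=lambda e: e.occurred_at):
        if event.occurred_at > moment:
            break
        if event.kind == "credited":
            balance += event.amount
        elif event.kind == "debited":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return balance


history = [
    AccountEvent(datetime(2018, 1, 5), "credited", 10_000),
    AccountEvent(datetime(2018, 2, 1), "debited", 2_500),
    AccountEvent(datetime(2018, 3, 10), "credited", 4_000),
]

print(balance_as_of(history, datetime(2018, 2, 15)))
print(balance_as_of(history, datetime(2018, 12, 31)))
```

The same history answers both questions: what the balance is now, and what it was on any earlier date. A system that stores only the current balance cannot answer the second.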
Greg Young's article on event sourcing