Add Deciders, multiple stream partitions FAQs #299

Merged
merged 25 commits into from
Nov 12, 2021

Collaborator

@bartelink bartelink commented Nov 10, 2021

Transplants an edited version of a response to an excellent series of questions posed by @rmaziarka on https://github.com/ddd-cqrs-es/slack-community
(The term and concept of a Decider is thanks to @thinkbeforecoding)

@bartelink bartelink changed the title Add FAQ re multiple streams in a CosmosStore partition Add Deciders, multiple stream partitions FAQs Nov 11, 2021
@bartelink bartelink marked this pull request as ready for review November 11, 2021 14:27
bartelink and others added 12 commits November 11, 2021 18:41
Co-authored-by: Rambert Yan <ragiano215@gmail.com>
@bartelink bartelink merged commit 92b1b37 into master Nov 12, 2021
@bartelink bartelink deleted the substreams-in-partitions branch November 12, 2021 16:17
@bartelink
Collaborator Author

Thanks for all the feedback and suggestions received, both here on the PR and out of band on the DDD-CQRS-ES Slack.


#### In general

While the concept of a Decider plays well with Event Sourcing and many different types of Stores, it's important to note that neither storage nor event sourcing is a prerequisite. A lot of the value of the concept is that you can, and should be able to, talk about and implement one without reference to any specific store implementation, or even without it ever being stored at all (it can also be used to manage in-memory structures such as UI trees).

COMMENT: I like that paragraph. Having a decider may also be helpful outside of Event Sourcing: EDA in general for sure, but I also see a lot of similarities to e.g. Redux (the difference there is that it's upside down, as it's more like command sourcing).

Collaborator Author

Yes - @thinkbeforecoding calls this out in his talk. Definitely open to any wording improvements to this paragraph as I don't know enough about Redux type things to use the correct terms.

You can even use it for load/save state 😄

Collaborator Author

Nice observation. I added this and the other point in a5100f4; I hope it more or less represents what you mean.

<a name="what-is-a-decider"></a>
### What is a Decider? How does the Equinox `type Decider` relate to Jérémie's concept of one? :pray: [@rmaziarka](https://github.com/rmaziarka)

The single best treatment of the concept of a Decider that's online at present is [this 2h45m video](https://www.youtube.com/watch?v=kgYGMVDHQHs) on [Event Driven Information Systems](https://www.youtube.com/channel/UCSoUh4ikepF3LkMchruSSaQ) with [Jérémie Chassaing, @thinkb4coding](https://twitter.com/thinkb4coding). As teased in that video, there will hopefully be a body of work that will describe the concept in more detail ... eventually (but hopefully not [inevitably](https://github.com/ylorph/The-Inevitable-Event-Centric-Book))...

SUGGESTION: It might be worth adding an "elevator pitch" here: a TL;DR for lazy readers covering what you get, why, and where (and/or briefly why the methods are named Transact).

Collaborator Author

That would require me re-watching it so as not to misrepresent it tho - I'd love if you or someone would contribute it - all of this is/was written from memory ;)

Collaborator Author

I also think that it's better if this is kept abstract (if someone wants to follow the link and summarize, great!)

Collaborator Author

But after all my protests, this prompted me to write a summary of Transact and Query which I think will be good...


_When applying the concept of a Decider to event sourcing, the consistency requirement means [there's more to the exercise than emitting events into a thing whose marketing centers on Events](https://domaincentric.net/blog/eventstoredb-vs-kafka)_. The overall processing of a decision needs a way to manage a concurrency conflict: take the state that superseded the one you based the original decision on (the _origin state_), and re-run the decision based on the reality of that conflicting _actual state_. The resync operation that needs to take place in that instance can be managed by reloading from events, reloading from a snapshot, or by taking the events written since your local state and `fold`ing them on top of it.
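
The resync loop described above can be sketched roughly as follows. This is a minimal illustration in Python, not Equinox's implementation: `InMemoryStream`, `try_append`, `transact` and the `("Added", n)` event shape are hypothetical names invented for the example, and the version check is assumed to be a simple event count. It uses the third resync option, `fold`ing only the events written since the local state.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Event = Tuple[str, int]   # hypothetical event shape, e.g. ("Added", 5)
State = int               # hypothetical state: a running total


def fold(state: State, events: List[Event]) -> State:
    """Apply events on top of a state to produce the next state."""
    for kind, amount in events:
        if kind == "Added":
            state += amount
    return state


@dataclass
class InMemoryStream:
    """Hypothetical stand-in for a store with version-checked appends."""
    events: List[Event] = field(default_factory=list)

    def read_from(self, version: int) -> List[Event]:
        return self.events[version:]

    def try_append(self, expected_version: int, new_events: List[Event]) -> bool:
        if expected_version != len(self.events):
            return False  # another writer got there first: conflict
        self.events.extend(new_events)
        return True


def transact(stream: InMemoryStream,
             decide: Callable[[State], List[Event]],
             max_attempts: int = 3) -> State:
    """Run a decision, re-running it against the actual state on conflict."""
    loaded = stream.read_from(0)
    state, version = fold(0, loaded), len(loaded)  # the origin state
    for _ in range(max_attempts):
        events = decide(state)
        if not events:
            return state  # nothing to write, so nothing can conflict
        if stream.try_append(version, events):
            return fold(state, events)
        # Conflict: fold the events written since our local state on top of it.
        newer = stream.read_from(version)
        state, version = fold(state, newer), version + len(newer)
    raise RuntimeError("gave up after repeated conflicts")
```

Note that the decision is re-run from scratch against the resynced state, so a decision that was valid against the origin state may legitimately yield different events, or none, the second time around.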

#### The ingredients

SUGGESTION: It might be worth adding some real-world examples of each ingredient, e.g. what the state may be (current balance, items in the shopping cart, etc.) and what decide does (deciding whether you can withdraw money, remove a product item from the shopping cart, etc.).
My observation is that it's easier to stay focused when learning a new concept if you see things you can relate to. Of course, care is needed not to suggest too much, but it's still worth doing.

Collaborator Author

Decent idea, did something


Finally, I'd say that a key thing the Decider concept brings is a richer way of looking at event sourcing than the typical event sourcing 101 examples you might see:
- de-emphasizing one-way things that map commands to events without deduplicating and/or yielding a result (not saying that you shouldn't do the simplest thing -- you absolutely should)
- de-emphasizing all projection handlers only ever just sitting there looking for `MyThingCreated` and doing an `INSERT` with a try/catch for duplicate inserts as a way to make it idempotent (and every stream design requiring a Created event as a way to enable that)

SMALL SUGGESTION: I think breaking a long sentence like that into shorter sentences would help the newbie get the point.

Collaborator Author

Would appreciate a suggestion as I have no idea what to say! You can supply a suggested edit to go with the PR when I post it ;)

The missing part beyond that basic anemic stuff is where the value lies:
- any interesting system *makes _decisions_*:
- a decision can yield a result alongside the events that are needed to manifest the state change
- any decision process can and should consider [idempotency](https://en.wikipedia.org/wiki/Idempotence) - if you initiate a process/request something, a retry can't be a special case you don't worry about; considering it is a fundamental baseline requirement and thinking tool

QUESTION: Does the decider pattern have something specific around idempotency?

Collaborator Author

I will add something. I would say a Decider doesn't intrinsically bring it into the picture, much like how conflicts and storage are external. But it does control the state and the decision, and dealing with things idempotently is something you should be looking to do in a way you can test and reason about easily; the logic of a decider can be that place (but in some cases things like the ESDB idempotent write dedup facility can address this concern in other ways. Sometimes that's better; mostly it's pretty subjective, I would say)

Collaborator Author

now:|

any decision process can and should consider idempotency - if you initiate a process/request something, a retry can't be a special case you don't worry about. Taking correct handling of such retry and/or replay scenarios into consideration should not be an afterthought, but instead be a concern on your day to day checklist when writing a decision function. Of course idempotency can be handled in many ways - sometimes before processing gets to the Decider, sometimes internally (e.g. a decision can yield the unique id generated the first time the request was triggered on every subsequent invocation), sometimes it can be handled externally (e.g. one might not maintain the state that would be necessary to fully deduplicate triggerings and rely on the EventStoreDB and SqlStreamStore idempotent write deduplication mechanism to deduplicate the writes just in time)
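
To make the "handled internally" case concrete, here is a small sketch in Python of a decision that yields a result alongside its events, and returns the id generated on the first triggering on every subsequent invocation. The reservation scenario and the names `decide_reserve`, `request_key` and `candidate_id` are hypothetical, chosen purely for illustration.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str, str]   # hypothetical shape: ("Reserved", request_key, reservation_id)
State = Dict[str, str]         # request_key -> reservation_id issued for it


def evolve(state: State, event: Event) -> State:
    kind, request_key, reservation_id = event
    if kind == "Reserved":
        return {**state, request_key: reservation_id}
    return state


def fold(state: State, events: List[Event]) -> State:
    for event in events:
        state = evolve(state, event)
    return state


def decide_reserve(request_key: str, candidate_id: str,
                   state: State) -> Tuple[str, List[Event]]:
    """Return (result, events). A retry gets the original id and no new events."""
    existing = state.get(request_key)
    if existing is not None:
        return existing, []   # already handled: same answer, nothing to write
    return candidate_id, [("Reserved", request_key, candidate_id)]
```

A retry with a freshly generated `candidate_id` still receives the id issued the first time, so the caller does not need to know whether it was the first attempt.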

You can add idempotency on a decider. You can make a generic function
`D<Cmd, Event, State> -> D<(Id * Cmd), Event, (State * Id Set)>` where `Id` is a command identifier.
In the decide function, it checks whether the id is in the set. In the evolve function, it adds the id to the set.
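
A rough Python sketch of such a wrapper follows. The `Decider` record and `with_idempotency` are hypothetical names, not Equinox types. One assumption differs from the signature as written: so that evolve can see which command produced an event, this sketch tags each emitted event with the command id. A command that yields no events is therefore not recorded in the set.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

C = TypeVar("C")
E = TypeVar("E")
S = TypeVar("S")


@dataclass(frozen=True)
class Decider(Generic[C, E, S]):
    """Hypothetical minimal decider: initial state, decide and evolve."""
    initial: S
    decide: Callable[[C, S], List[E]]
    evolve: Callable[[S, E], S]


def with_idempotency(inner: Decider) -> Decider:
    """Wrap a decider so that a repeated command id yields no events."""

    def decide(id_cmd, state):
        cmd_id, cmd = id_cmd
        inner_state, seen = state
        if cmd_id in seen:
            return []  # already processed: the retry is a no-op
        # Assumption: tag each event with the command id so evolve can record it.
        return [(cmd_id, event) for event in inner.decide(cmd, inner_state)]

    def evolve(state, tagged_event):
        cmd_id, event = tagged_event
        inner_state, seen = state
        return inner.evolve(inner_state, event), seen | {cmd_id}

    return Decider(initial=(inner.initial, frozenset()), decide=decide, evolve=evolve)


def run(decider: Decider, commands):
    """Feed commands through a decider in order, returning the final state."""
    state = decider.initial
    for command in commands:
        for event in decider.decide(command, state):
            state = decider.evolve(state, event)
    return state
```

For example, wrapping a counter whose commands are positive increments and running `("a", 5)`, `("a", 5)`, `("b", 2)` applies the increment for `"a"` only once.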

Collaborator Author

@bartelink bartelink Nov 13, 2021

FYI there is a #301 PR open with some updates. That's a good point; will add...

- it can let you drive a set of reactions in a fault-tolerant and scalable (both perf/system size, and management/separation of complexity) manner

Quite frequently, a Decider will also internally be a Process Manager, encapsulating a state machine.

QUESTION: Could you elaborate on that part? Do you mean that grouping multiple decisions will effectively create a Process Manager?

Collaborator Author

Elaborated, will push later. No, this is more an extension of not having anemic aggregates: there needs to be a reason for a thing to exist, i.e. invariants being enforced. Not everything needs that, but something does. Regarding the overall question, you're right that there is a conflict/concern: if you have 2 deciders with 2 sets of rules, those can be independent, but if e.g. you do conflict checking based on expected version only, they conflict and trigger retries for no reason. Not sure where that would be good to mention in the flow, and this is all very long already.



### With `Equinox.CosmosStore`, it seems it should be possible to handle saving multiple events from multiple streams as long as they share the same partition key in Cosmos DB. But it does not seem to be possible via `Equinox.Decider.Transact` ? :pray: [@rmaziarka](https://github.com/rmaziarka)

QUESTION: Are all the strategies for the different stores always really transactional? Also, it would be clearer to say what you mean by Transactional. Is it really a transaction, or more atomicity?

Collaborator Author

Hopefully the rewording improved it. This was definitely lazy.

Taking particular solution patterns off the table from the off is definitely something you need to be careful to avoid.
As an analogy: Having lots of classes in a system can make a mess. But collapsing it all into as few as possible can result in ISP and SRP violations and actually make for a system that's hard to navigate and grok, despite there being fewer files and fewer lines of code (aka complexity). Coupling things can sometimes keep things Simple, but can also sometimes simply couple things.
In my personal experience
1) Sagas and PMs can be scary, and there are not many good examples out there in an event sourcing context

QUESTION: I think that we have already discussed that before, but where do you draw the line between Saga and PM?

Collaborator Author

For me a saga is stateless and typically has semantics that just ensure related things happen.
A PM has state and often a richer processing workflow
Here the intent is to convey "the stuff you are avoiding to stay on the KISS side"

Collaborator Author

Reworded to convey that intent better


>In the case above we could assume that data inside a single city will be so small, that even with the long usage it won't complete the whole CosmosDB partition. So we could use it to handle saving 2 events in the same time.

For avoidance of doubt: being able to write two events at the same time is a pretty valid thing to want and need to do (Equinox itself, and any credible Event Store supports it)

NOTE: ESDB only for the particular stream.

Collaborator Author

Clarifying this to make it less ambiguous

- You need to do etag-checked reads and writes or go home
- Each Item/Document costs you about 500 bytes in headers and space in all the indexes so you need to strongly consider >1 event/document
- Queries are way more costly than point reads. It's called a Document database because the single thing it does best on this planet is read or update ONE document. Read/write cost rule of thumb is per KB, but has logarithmic properties too, i.e. 50K is not always 50x the cost of 1K
- Keep streams as small as possible but no smaller. 20GB max in Cosmos, but in practice, the latency to read that much stuff is preposterous
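
The first two points in the list above (etag-checked reads and writes, and more than one event per document) can be illustrated with a purely in-memory sketch in Python. Nothing here is the Cosmos DB SDK or Equinox: `InMemoryContainer`, `replace`, `if_match`, `PreconditionFailed` and `append_events` are hypothetical names that only mimic the shape of an optimistic, etag-conditional write.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


class PreconditionFailed(Exception):
    """Raised when the caller's etag no longer matches the stored document."""


@dataclass
class StoredDocument:
    etag: str
    events: List[str]   # several events batched into a single document


class InMemoryContainer:
    """Hypothetical stand-in for a document container with etag checks."""

    def __init__(self) -> None:
        self._docs: Dict[str, StoredDocument] = {}

    def read(self, doc_id: str) -> Optional[Tuple[str, List[str]]]:
        doc = self._docs.get(doc_id)
        return None if doc is None else (doc.etag, list(doc.events))

    def replace(self, doc_id: str, events: List[str],
                if_match: Optional[str]) -> str:
        current = self._docs.get(doc_id)
        current_etag = current.etag if current is not None else None
        if current_etag != if_match:
            raise PreconditionFailed(doc_id)   # stale read: caller must re-read
        new_etag = uuid.uuid4().hex
        self._docs[doc_id] = StoredDocument(new_etag, list(events))
        return new_etag


def append_events(container: InMemoryContainer, doc_id: str,
                  new_events: List[str]) -> str:
    """Point-read one document, then write it back guarded by its etag."""
    found = container.read(doc_id)
    etag, events = found if found is not None else (None, [])
    return container.replace(doc_id, events + new_events, if_match=etag)
```

A writer holding a stale etag gets `PreconditionFailed` instead of silently overwriting a competing write, which is the point at which the resync-and-retry handling would kick in.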

QUESTION: Is 20GB the limit for a document?

Collaborator Author

20GB is total size of content and indexes for a single logical partition. Writes are refused when you hit that. I believe some other section mentions max doc size is 2MB but in some contexts things are limited to 1MB

bartelink added a commit that referenced this pull request Nov 17, 2021
4 participants