
Question related to the allocate model function #314

Open
cgabard opened this issue Oct 12, 2020 · 8 comments

@cgabard

cgabard commented Oct 12, 2020

Hello.

Sorry if my question seems naive or if I missed something from the book, but I have difficulty understanding how the allocation is actually committed at the db level. It is the only thing I have had trouble understanding in your book, which is otherwise very interesting.

If I use the latest version of the code (on master):

This code should persist the added OrderLine, so in order to commit this change, I guess it is SQLAlchemy's mapping magic that ensures the persistence of this change? Is that correct?

If so, I have another question. This behavior relies on some SQLAlchemy black magic. If I wanted to avoid it and not use the mapper, so as to have an explicit commit line, I am not sure how I would implement it. Where should the updates generated by the domain model be added? The domain model has no direct link to the underlying repository.

Basically, if I used raw SQL or the basic SQLAlchemy Core API (without the mapper), where in the code base should the model-update SQL statements go?

Thank you!

@hjwp

hjwp commented Oct 13, 2020

> I guess it is SQLAlchemy's mapping magic that ensures the persistence of this change? Is that correct?

yes that's right. to reassure myself that this works, I might add a test in test_repository.py. I think earlier in the book we have an explicit test_orm.py for checking that fiddly details of sqlalchemy magic work (example from chapter 6), but i'd have them in the repository integration tests in "real life" i think.
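To make that session magic concrete, here is a self-contained sketch (not the book's actual test file) of the kind of test being described, using SQLAlchemy's imperative (classical) mapping with invented `Batch`/`batches` names:

```python
# Sketch only: prove that mutating a mapped object is persisted on commit,
# with no explicit UPDATE statement anywhere in our code.
from sqlalchemy import create_engine, Column, Integer, String, Table, MetaData
from sqlalchemy.orm import registry, Session

metadata = MetaData()
batches = Table(
    "batches", metadata,
    Column("id", Integer, primary_key=True),
    Column("sku", String(255)),
    Column("qty", Integer),
)

class Batch:  # stand-in for the book's domain model, not its real class
    def allocate(self, line_qty):
        self.qty -= line_qty

mapper_registry = registry()
mapper_registry.map_imperatively(Batch, batches)  # classical mapping

engine = create_engine("sqlite://")
metadata.create_all(engine)

with Session(engine) as session:
    session.execute(batches.insert().values(id=1, sku="LAMP", qty=20))
    batch = session.get(Batch, 1)
    batch.allocate(3)      # pure domain-level mutation
    session.commit()       # SQLAlchemy's unit of work emits the UPDATE

with Session(engine) as session:
    assert session.get(Batch, 1).qty == 17
```

The assertion at the end is the whole point of the test: the change survived a fresh session, so the ORM really did track and flush it.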

if you want to avoid black magic (an idea i thoroughly endorse!) then you have two choices: either you do the orm "track-changes" work yourself/manually, or you add an extra explicit call to some sort of update() method. as always, neither is a perfect solution and it's a trade-off. if you track changes manually, maybe by saving a snapshot of each model as it's loaded by the repository, or by using some sort of python descriptor-protocol voodoo, then you're back in magic territory. if you have an explicit call to .update() then you've broken the elegance of the repository pattern, and you'll need to figure out how to unit test for it, to make sure you don't forget...
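The second option (an explicit `update()` call) might look like the following sketch. All names here are hypothetical, and a dict stands in for a real table:

```python
# Hypothetical sketch of a repository with an explicit update() method.
# deepcopy simulates the fact that a real DB hands back detached rows.
import copy

class Product:
    def __init__(self, sku, qty):
        self.sku, self.qty = sku, qty

    def allocate(self, line_qty):
        self.qty -= line_qty

class ExplicitRepository:
    """No change-tracking: callers must call update() themselves."""
    def __init__(self):
        self._rows = {}  # stands in for a real table

    def add(self, product):
        self._rows[product.sku] = copy.deepcopy(product)

    def get(self, sku):
        return copy.deepcopy(self._rows[sku])

    def update(self, product):  # the extra, easy-to-forget call
        self._rows[product.sku] = copy.deepcopy(product)

repo = ExplicitRepository()
repo.add(Product("LAMP", 20))

p = repo.get("LAMP")
p.allocate(3)
repo.update(p)  # omit this line and the change is silently lost

assert repo.get("LAMP").qty == 17
```

The "silently lost" failure mode is exactly why this design needs its own tests: nothing forces the service layer to remember the `update()` call.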

@cgabard

cgabard commented Oct 14, 2020

Thanks for the details!

Indeed, that's what I'm afraid of: it would be complicated to get rid of the ORM voodoo. Maybe a solution would be to give the model objects a list of "DB events to be committed" that is populated by the model methods, so the repository or UoW can collect them before committing. A bit like the "sync directory" example: provide a list of actions that can be easily checked in tests and collected/committed at a higher level.
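That idea could be sketched roughly like this (all names are hypothetical, with sqlite3 standing in for the real database):

```python
# Sketch: the model appends plain "pending change" records, and the unit of
# work drains them into SQL at commit time. Nothing here is the book's API.
import sqlite3

class Batch:
    def __init__(self, ref, qty):
        self.ref, self.qty = ref, qty
        self.pending_changes = []  # the "DB events to be committed"

    def allocate(self, line_qty):
        self.qty -= line_qty
        self.pending_changes.append(
            ("UPDATE batches SET qty = ? WHERE ref = ?", (self.qty, self.ref))
        )

class UnitOfWork:
    def __init__(self, connection):
        self.conn = connection
        self.seen = set()  # models touched during this unit of work

    def commit(self):
        for model in self.seen:
            for sql, params in model.pending_changes:
                self.conn.execute(sql, params)
            model.pending_changes.clear()
        self.conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE batches (ref TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO batches VALUES ('b1', 20)")

uow = UnitOfWork(conn)
batch = Batch("b1", 20)
uow.seen.add(batch)
batch.allocate(3)
uow.commit()

assert conn.execute("SELECT qty FROM batches WHERE ref='b1'").fetchone() == (17,)
```

A nice property of this shape is testability: a unit test can assert on `batch.pending_changes` directly, without a database.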

In any case, thanks for the feedback.

@hjwp

hjwp commented Oct 15, 2020

one possible solution that occurs to me is to make your domain models immutable. if you ever need to make changes, you actually create a new object (or the models can do this themselves). then the semantics of repo.get() and repo.add(), or perhaps repo.add_or_update(), make sense, and you can probably find a reasonably sane way of building a FakeRepository that keeps track of what has been added/updated separately from what was there before the test starts... idk, i haven't thought it through, but it sounds worth trying.

maybe other people out there have tried? functional programming is popular with DDD people, so maybe in one of those functional-domain-modelling books. or here's what i found with a quick search https://stackoverflow.com/questions/751437/repository-and-immutable-objects
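The immutable-model idea could be sketched as follows. The `add_or_update()` method and the frozen dataclass are assumptions for illustration, not the book's API:

```python
# Sketch: mutations return new objects, and a fake repository records what
# was added/updated so a test can inspect it.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Batch:
    ref: str
    qty: int

    def allocate(self, line_qty):
        # a frozen dataclass cannot be mutated, so we return a new object
        return replace(self, qty=self.qty - line_qty)

class FakeRepository:
    def __init__(self, initial=()):
        self._store = {b.ref: b for b in initial}
        self.updated = []  # what the test can inspect afterwards

    def get(self, ref):
        return self._store[ref]

    def add_or_update(self, batch):
        if batch.ref in self._store:
            self.updated.append(batch.ref)
        self._store[batch.ref] = batch

repo = FakeRepository([Batch("b1", 20)])
new = repo.get("b1").allocate(3)
repo.add_or_update(new)  # explicit, but hard to forget: get() returns a frozen object

assert repo.get("b1").qty == 17
assert repo.updated == ["b1"]
```

Because `get()` hands back a frozen object, forgetting the `add_or_update()` call fails loudly in tests rather than silently losing data.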

@stefanondisponibile

I think @cgabard quite nails the real "problem" of this fantastic book. But I'd love to know your opinion on this: doesn't CQRS solve this problem? In a way, everything "happening" to a model/aggregate is published in the form of events. So basically, what are the real drawbacks of treating db persistence as eventual too? Why commit before publishing events? It's more a question of publishing and command-processing order, from this perspective.

@hjwp

hjwp commented Dec 14, 2020

a) thanks very much for the word "fantastic" in "fantastic book", i see it and i very much appreciate it! 😊

b) do you mean the "problem" is that model changes are persisted "magically" when we call uow.commit(), thanks to sqlalchemy's session magic?

c) you ask if CQRS would solve the problem, but did you mean eventsourcing? the terms are often used together; in the book we talk about CQRS, but we do not talk about eventsourcing.

...which is not to say that i think eventsourcing is a bad idea! i've never fully committed to it but every time i've dipped my toes in it it's made a lot of sense to me... but i can't say i've been able to rigorously think thru all the persistence + transaction semantics that might ensue.

maybe check out @johnbywater's eventsourcing book and github project? I haven't had time to take a proper look yet, and I gather it's very opinionated, but I know John has spent a lot more time thinking about it than I have!

@johnbywater

johnbywater commented Dec 14, 2020

Thanks for the mention @hjwp!

Firstly, in case anybody needs links, the Python eventsourcing project is here and the book Event Sourcing in Python is here.

Regarding the original question, the book that covers these concerns quite well is Martin Fowler's Patterns of Enterprise Application Architecture. About half of that book is about mapping a domain model made of "domain objects" to relational tables. The trouble ORMs run into when they try to take this approach to the level of a general framework is the "object-relational impedance mismatch". It's been called the Vietnam of computer science, because that's where programmers go to die.... SQLAlchemy does a remarkably good job, and the Django ORM is also good but has some inconsistencies, which is understandable given the difficulty of reaching a general solution. I spent a few years trying to develop the Python domainmodel library, which, in attempting to reach a truly general solution, eventually exhausted me, and I backed off to try to figure out what was actually happening.

The problems of doing things in general don't arise when you just try to write a mapper for the model you actually have. And that's the value of the patterns in Patterns of Enterprise Application Architecture. But then each time you change the domain model, you have to revisit the mappers, which produces the desire to reach a general solution, and then you start to die trying....

There's a good reason for the difficulties that are encountered, which wasn't obvious to me until I had some experience of event sourcing. The reason is that what's important about a domain model is that it makes decisions. These decisions are events. And event sourcing became a "thing" for these reasons, which (I understand) were intuited rather than arrived at with the retrospective analysis I am describing now.

The point about "decisions" is that once a decision is made, it will forever have been made, even if a subsequent decision effectively reverses the consequences. That's where the "immutable" thing comes from, and also the "append" thing. It turns out to be a deep metaphysical fact about the universe, a fact that is missed when we build a model out of "domain objects" or "aggregates" that endure and change. Change is really a contrast between subsequent decisions.

And that's the problem with non-event-sourced domain models: the decisions are not coded explicitly, and so it's "hard" to figure out what has changed when the domain objects or aggregates are changed. Especially when you have a cluster of objects that are all somehow changed, and these "changes" need to be recorded together in an atomic database transaction. There's simply no end to the search for an ultimate generality that doesn't conclude in coding the decisions explicitly as immutable domain model event objects.
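The decisions-as-events point can be made concrete with a minimal sketch (hypothetical names): the aggregate's current state is just a fold over its immutable events, so "what changed" is never a question — the new events are the change:

```python
# Sketch: an event-sourced aggregate whose state is rebuilt by replaying
# its immutable decisions.
from dataclasses import dataclass

@dataclass(frozen=True)
class QuantityAllocated:  # an immutable, append-only decision
    ref: str
    qty: int

class Batch:
    def __init__(self, ref, starting_qty):
        self.ref, self.qty = ref, starting_qty
        self.new_events = []  # decisions made since the last save

    def allocate(self, qty):
        event = QuantityAllocated(self.ref, qty)
        self._apply(event)
        self.new_events.append(event)

    def _apply(self, event):
        self.qty -= event.qty

    @classmethod
    def replay(cls, ref, starting_qty, history):
        batch = cls(ref, starting_qty)
        for event in history:
            batch._apply(event)  # rebuild state; nothing goes in new_events
        return batch

b = Batch("b1", 20)
b.allocate(3)
b.allocate(5)
saved = list(b.new_events)  # this is exactly what gets persisted

rebuilt = Batch.replay("b1", 20, saved)
assert rebuilt.qty == 12
```

Persisting `saved` is an append of two immutable records; there is no diffing of object graphs at commit time.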

The problems of "CRUD" models don't show up for simple models that aren't involved in a distributed system. That's why Django works quite well in lots of cases. But as soon as you need to commit changes from several objects together, or you need to propagate the state of the application reliably, the deficiency of the non-event-sourced approach starts to show up. The DDD aggregate is an attempt to solve the problem of making changes to lots of domain objects, by introducing the "consistency boundary". But then you are still at risk of "dual writing" when you try to update your DB and then update your search engine, and you can't solve this problem with any kind of "signals" or by writing messages to a message queue. The solution for propagating state is to write the notifications into the same database in the same atomic database transaction.
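The "same atomic transaction" point is often implemented as a transactional outbox. A minimal sketch with sqlite3 and an invented schema:

```python
# Sketch of the transactional outbox: the state change and the outgoing
# notification either both commit or both roll back, so no "dual writing".
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE batches (ref TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("INSERT INTO batches VALUES ('b1', 20)")
conn.commit()

def allocate(conn, ref, qty):
    with conn:  # one atomic transaction for both writes
        conn.execute("UPDATE batches SET qty = qty - ? WHERE ref = ?", (qty, ref))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "Allocated", "ref": ref, "qty": qty}),),
        )

allocate(conn, "b1", 3)

# A separate relay process would read the outbox and publish to the search
# engine / message broker, deleting rows only after successful delivery.
assert conn.execute("SELECT qty FROM batches").fetchone() == (17,)
assert conn.execute("SELECT COUNT(*) FROM outbox").fetchone() == (1,)
```

If the transaction rolls back, the notification never existed; if it commits, delivery is the relay's retriable problem, not the domain's.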

So... following those two results brings us to the design of event stores such as EventStoreDB and AxonDB: they write domain events into two sequences: the sequence of the aggregate, so that the events for an aggregate can be retrieved, and the sequence of the domain model as a whole, giving a total ordering of the domain events so that they can be propagated beyond the domain model. It's possible to degrade away from this position, but it's often just not worth it, and the tendency is to fall back to the situation that "CRUD" domain models lead to (implicit coding of decisions, making identification of the "changes" a difficult problem, and "dual writing").
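The two-sequence design can be sketched in memory (real stores such as EventStoreDB do this durably; the API here is invented):

```python
# Sketch: every event is appended both to its aggregate's stream (for
# retrieval/replay) and to one global log (for a total ordering that
# downstream consumers can follow).
from collections import defaultdict

class EventStore:
    def __init__(self):
        self.streams = defaultdict(list)  # aggregate_id -> [events]
        self.log = []                     # total order across the whole model

    def append(self, aggregate_id, event):
        position = len(self.log)          # global sequence number
        self.streams[aggregate_id].append(event)
        self.log.append((position, aggregate_id, event))
        return position

    def read_stream(self, aggregate_id):
        return list(self.streams[aggregate_id])

    def read_log(self, after=-1):
        """What a projection or follower polls to stay up to date."""
        return [entry for entry in self.log if entry[0] > after]

store = EventStore()
store.append("order-1", "Created")
store.append("order-2", "Created")
store.append("order-1", "LineAdded")

assert store.read_stream("order-1") == ["Created", "LineAdded"]
assert [pos for pos, _, _ in store.read_log()] == [0, 1, 2]
```

A consumer that remembers the last position it processed can resume from `read_log(after=last_position)` and never miss or reorder an event — the reliability property the total ordering buys.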

To do CQRS in a reliable way really depends on having the total ordering of the events. That's why people get into difficulty with CQRS. It's put forward as something you can do without having a total ordering of the decisions in the model. And you can do it, but you can't make a reliable system without that. And if you don't have a clear conceptual understanding of this stuff, you can die trying to make it reliable....

All domains are inherently susceptible to an analysis of their events. Event storming has never resulted in the conclusion that this or that domain has no events. As Whitehead wrote in Process and Reality, the actual world is built up from "actual occasions" (decisions) and everything that has any existence is derived from "actual occasions". This means that event sourcing is always at least applicable, and my feeling is that given any complexity and given we are normally working with a distributed system of some kind, it makes sense to do event sourcing by default. There was an old recommendation that you only want to do event sourcing in special cases, but I think the main reason why you don't want to do event sourcing is because of the skills in a team.

It would have been better if object-oriented analysis and design had taken note of Whitehead's scheme from the outset, but it seems to have been running on pre-modern philosophic categories (rather than the modern process philosophy of Whitehead's scheme). A second opportunity was missed when Christopher Alexander's pattern language scheme was taken up into software development, since pattern language was effectively an application of Whitehead's scheme. Pattern language was originally intended to describe events, and that's why Ward Cunningham called his pattern language for software development "Episodes". Episodes was a precursor to Extreme Programming, and is cited in Kent Beck's original book as such. The problem seems to have been that those who followed on, such as Fowler in writing PoEAA and Evans in writing DDD, failed to register the influence of Whitehead's scheme on Alexander's work, despite the great efforts made by them and others, such as Richard Gabriel and apparently all the great software developers who shaped our field, to pay close attention to Alexander's work. So we should do what Alexander actually did, and apply Whitehead's scheme directly. This leads us to an event-oriented approach to analysis and design, and an enhanced approach to event sourcing, to DDD, to software development, and to each other.

I say "software development" because the incremental and iterative approaches were all largely event-oriented, and this was totally not described by the manifesto for agile software development. I say "each other" because Whitehead's scheme was also applied by Carl Rogers in fashioning his person-centred psychology, which is very mainstream now, and effectively gives an approach to psychology which is congruent with Alexander's pattern language scheme and also event sourcing.

So by taking note of Whitehead's scheme, we can adopt Alexander's scheme for describing design, Rogers' scheme for conditioning human relationships with ourself and others, and the event storming and event sourcing approach to analysis and design of software, and have a unified approach that is internally congruent and coherent and logical and adequate and applicable.

Anyway, that's basically what around one third of my book has turned out to be about... It was supposed to be just a distillation of the Python eventsourcing library. But then I started writing about these surrounding issues, to put the whole thing in a broader context, and now it's around 350 pages in a PDF.... The distilled code will be released soon as the next major version of the Python eventsourcing library, but at the moment it's not open source. The book isn't quite finished, but it's getting there. And of course I would love to encourage a broader discussion about these ideas, and improve the book. I've chatted with enough people to give me a lot of confidence in my understanding of the history and the conceptions we have inherited and have been working with. But it's all new to me too, so it would be great to discuss this stuff with more people, somehow.

@stefanondisponibile

stefanondisponibile commented Dec 24, 2020

Sorry for the late reply @hjwp!

a) You're welcome! It ain't that easy finding good Python resources about DDD, and the book is quite valuable for starters (like me).
b) Yes, I mean that the "magic" is performed at the ORM level, which ties together the Domain/Model/Entity and the infrastructure; I want to avoid that. (A tip: I haven't read it carefully, but it looks like the mapping style described in the book is going to be deprecated.)
c) Well, I mean both, indeed. I dispatch a command like DoThis(foo="foo"), the model receives this message and raises back the event that changes it, of course writing it to some kind of event store. Then that event is propagated through the system, and some handler can update the views accordingly (that's why I'm saying CQRS solves it). Like others, I'm sketching a library around this, yet I have some doubts, and just like @johnbywater says, it would be nice to have a way of discussing these issues, brainstorming ideas and so on.

The use I'm thinking for this library is around this:

```python
# ...
unit_of_work = SomeUnitOfWork(SomeRepo(SomeStore()))
some_command = DoThisCommand(foo="bar")
ddd.messagebus.handle(some_command, unit_of_work)
```

```
DEBUG: Store initialized.       [ InMemoryEventStore ]
DEBUG: Repo initialized.        [ InMemoryRepository ]
DEBUG: Unit Of Work initialized.
DEBUG: Command initialized.     [    CreatePerson    ]
DEBUG: Getting new person.
DEBUG: Playing the event (CreatePerson(stream_id='f5cdcd68-151e-418b-a0b8-e115a35fd506')).
DEBUG: person => Person(_stream_id='f5cdcd68-151e-418b-a0b8-e115a35fd506' _history=[PersonCreated(stream_id='f5cdcd68-151e-418b-a0b8-e115a35fd506')] _version=2 created=True)
INFO: We'll send an email to <NAME-NOT-SET-YET> (f5cdcd68-151e-418b-a0b8-e115a35fd506).
DEBUG: Results => [{'PersonCreated': 'f5cdcd68-151e-418b-a0b8-e115a35fd506'}]
INFO: Nice new name John!
DEBUG: Getting new person.
DEBUG: Playing the event (CreatePerson(stream_id='71b3fd11-5743-4aeb-81f8-0ce6df534ffd')).
DEBUG: person => Person(_stream_id='71b3fd11-5743-4aeb-81f8-0ce6df534ffd' _history=[PersonCreated(stream_id='71b3fd11-5743-4aeb-81f8-0ce6df534ffd')] _version=2 created=True)
INFO: We'll send an email to <NAME-NOT-SET-YET> (71b3fd11-5743-4aeb-81f8-0ce6df534ffd).
INFO: Nice new name Sarah!
INFO: Sarah has a new friend: John
INFO: John has a new friend: Sarah
DEBUG: Results => []
defaultdict(<class 'list'>,
            {'71b3fd11-5743-4aeb-81f8-0ce6df534ffd': [{'_type': 'PersonCreated',
                                                       'stream_id': '71b3fd11-5743-4aeb-81f8-0ce6df534ffd'},
                                                      {'_type': 'NameChanged',
                                                       'new_name': 'Sarah',
                                                       'person_id': '71b3fd11-5743-4aeb-81f8-0ce6df534ffd'},
                                                      {'_type': 'FriendAdded',
                                                       'friend_id': 'f5cdcd68-151e-418b-a0b8-e115a35fd506',
                                                       'person_id': '71b3fd11-5743-4aeb-81f8-0ce6df534ffd'}],
             'f5cdcd68-151e-418b-a0b8-e115a35fd506': [{'_type': 'PersonCreated',
                                                       'stream_id': 'f5cdcd68-151e-418b-a0b8-e115a35fd506'},
                                                      {'_type': 'NameChanged',
                                                       'new_name': 'John',
                                                       'person_id': 'f5cdcd68-151e-418b-a0b8-e115a35fd506'},
                                                      {'_type': 'FriendAdded',
                                                       'friend_id': '71b3fd11-5743-4aeb-81f8-0ce6df534ffd',
                                                       'person_id': 'f5cdcd68-151e-418b-a0b8-e115a35fd506'}]})
DEBUG: Person(_stream_id='71b3fd11-5743-4aeb-81f8-0ce6df534ffd' _history=[] _version=3 created=True name='Sarah' friends=['f5cdcd68-151e-418b-a0b8-e115a35fd506'])
```

@johnbywater

Looks interesting! Well done for getting something working.

Regarding discussion, there's a nice and welcoming community around the Python eventsourcing library that would be willing to discuss this with you. There's a link to join the Slack workspace here:
https://eventsourcing.readthedocs.io/en/stable/topics/support.html#community-support

I'm not sure what putting the command on a message bus has to do with event sourcing as such. It is often useful to make the command execution synchronous, so that the client gets to know whether the command succeeded or failed. The command can also be implemented quite simply as a command method that is called. But it can sometimes be useful to do it asynchronously, and add code to support passing "command messages". This concern applies to any model, though, whether or not it is event sourced. Regarding CQRS and event sourcing, the important thing is to avoid "dual writing" on both sides, and to establish a "total order" of the domain events in the model. Doing this independently of particular infrastructure isn't tremendously difficult, but it does take a little bit of time to figure out what's needed and make it work nicely. My library (and book) covers this quite well, I think.

The new code that I mentioned in my post above is now available as a "feature branch" in the project's repo; I pushed it last week. It was developed whilst writing my little book (https://leanpub.com/eventsourcinginpython), which started out as slides from a talk and eventually came to be a nice distillation of the Python eventsourcing library, without all the clutter that accumulated whilst I was finding my way with this topic.
https://github.com/johnbywater/eventsourcing/tree/feature/book_version/eventsourcing

I'm going to update the docs to cover the new code in the style of the Python Standard Library. The book will then complement this open source library, and function as a more in depth explanation and guide to the event sourcing mechanisms, presented as a coherent and interrelated set of design patterns, with usage and examples described in more detail, along with the historical and philosophical and psychological foundations of event-orientated analysis and design.

The "simple story" of event sourcing in general circulation isn't really the full story; it isn't adequate for making a reliable system. I've tried to boil down to an absolute minimum what is needed to make an event-sourced application and a system of event-sourced applications. But I'm sure it isn't perfect. So if you wanted to suggest how it could be improved, I'm sure many people would appreciate that. I certainly would :)
