Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backend: remove or replace event subsystem #19812

Closed
camsaul opened this issue Jan 20, 2022 · 5 comments · Fixed by #33392
Closed

Backend: remove or replace event subsystem #19812

camsaul opened this issue Jan 20, 2022 · 5 comments · Fixed by #33392
Assignees
Labels

Comments

@camsaul
Copy link
Member

camsaul commented Jan 20, 2022

Way back in #956 we added the metabase.events subsystem which is a basic pub/sub message queue for different parts of the app. There are a few reasons I think we should remove it:

  • The Metabase server is a single service, so we don't need an async message queue between different parts of the process
  • Because all the events are sent and handled asynchronously it makes debugging, handling errors and writing tests needlessly complicated
  • Because it's async it means DML stuff can't always be done atomically inside a transaction inside PUT/POST API endpoints (if the event handler is doing its own DML stuff)
  • It adds a needless layer of indirection that makes it hard to reason about the code or figure out where or why things happen (example: grep the codebase and try to tell me where driver/notify-database-updated gets triggered. I'll wait)
  • We use it in only a handful of places
  • It requires loading up a bunch of namespaces at start time which means we need to initialize that stuff in order to run certain tests. If we forget to use test fixtures to initialize it then some test namespaces might break if they are ran in isolation
  • Loading a bunch of namespaces at launch makes potential ideas like Lazy loading REST API; decrease Metabase startup time by 33% [note: waiting on switch to Toucan 2] #15429 harder to implement/less effective
  • Extra cognitive overhead of having the extra subsystem

I'm not against the idea of decoupling stuff with events and recipients entirely, if it's done tastefully. NSNotificationCenter in Cocoa(Touch) or hooks in Emacs Lisp are some examples. I think the main change we'd want to make is to make event dispatch/handling synchronous instead of async. That would fix 90% of my complaints. We'd still have the extra level of indirection tho, and the subsystem would still be used in only a handful of places, which is not enough to justify having it if you ask me.

I think we should do one of the following

  1. Remove events subsystem entirely and just do things directly e.g. call driver/notify-database-updated directly inside the PUT /api/database/:id REST API endpoint
  2. Make the events subsystem synchronous and go all in with it and actually use it in a lot more than the current handful of places
@dpsutton
Copy link
Contributor

dpsutton commented Jan 20, 2022

In principal, I'm happy with removing lots of things out of the events system, but it seems some notion of asynchronous tasks must still exist. A concrete example is the database sync events #{:database-create :database-trigger-sync}. Some notion of asynchronous syncing must happen. I suppose changing that to just dispatch the action to sync on the threadpool for that is sufficient.

I do think the revision system is tasteful in the event revision system though. It is a bit action-at-a-distance but just dispatching "card-update" and "card-create" and letting some other code care about revisions works pretty well in my opinion. The activity system feels similar.

Ironically, your culprit of driver/notify-database-updated doesn't seem to be using the event system. That's a driver defmethod that goes directly to invalidate the pooled connection. Would that actually be easier to find if all "call" sites were (events/dispatch :event/db-updated old-db new-db) or something similar?

Another thing I'm remembering, resyncing schemas used to be syncronous. And then we had to fix a bug where unhiding 15 tables would completely lock up a metabase instance deadlocking on db connections. Just something to remember as we balance these tradeoffs. #15916

@camsaul
Copy link
Member Author

camsaul commented Jan 20, 2022

One of the places notify-database-updated gets triggered is the handler metabase.events.driver-notifications whose events is sent by things like the PUT /api/database/:id 💀

@camsaul
Copy link
Member Author

camsaul commented Jan 20, 2022

I think things like saving Revisions should be done synchronously since it adds little overhead when we're already doing DML anyway and it makes that stuff 100x easier to test. I think triggering sync is the only event we'd want to do asynchronously, and in that case it's easy enough to have the event handler that triggers the sync trigger it inside a thread pool or something.

I really do like the idea of replacing the unqualified keywords with something namespaced e.g. qualified keywords.

I think basically what I want (if we do keep it) it something like the Emacs Lisp hook system. We could actually use the do method combination in Methodical to define a multimethod that will run all matching implementations sequentially. e.g.

;; define a new event handler
(derive ::save-database-revision :metabase.models.database/on-created)

(m/defmulti events/dispatch-event ::save-database-revision
  [{:keys [database]}]
  ...)
;; invoke the event handler; executes all matching methods for `events/dispatch-event` sequentially synchronously
(events/dispatch-event :metabase.models.database/on-created {:database db})

That's one way we might do it. IDK, just spitballing

@camsaul
Copy link
Member Author

camsaul commented Apr 14, 2023

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants