Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal - use asyncio for events management (Supporting Event Chaining) #23

Closed
ndopj opened this issue Mar 10, 2022 · 5 comments
Closed

Comments

@ndopj
Copy link

ndopj commented Mar 10, 2022

fastapi-events currently uses ASGI middle-ware for the events management. This middle-ware creates event store for each specific request which is filled by events dispatched while computing the request and then used to execute collected events after the response was sent to a client. While this might be sufficient solution, this architecture has some disadvantages and there might be even more simplistic solution.

  • Initial problem
    First of all I am really thankful for this library and great work put into it. One of the limitations of currently used architecture and also reason why I had to stick with custom solution was the fact that currently its not possible to dispatch event from a registered handler (not possible to chain handlers by event). Since dispatch() function called from registered handler is called after the response was already sent there is no middle-ware which would execute additional events.

  • Custom simplistic solution
    It took me some time to find out how would I create custom self-executing event queue which would execute events no matter when and where dispatched as long as dispatch() function has access to event queue itself. Then I got an idea that if FastAPI is built on top of asyncio it should definitely be possible to dispatch tasks/events to the event loop so it will be queued together with other FastAPI tasks (mainly request handlers?). Following is very simple code change that allows to dispatch events into asyncio event loop and therefore there is not any requirement for event store, middle-ware executor and handling more than one task at a time.

def _dispatch(event_name: Union[str, Enum], payload: Optional[Any] = None) -> None:
    async def task():
          await local_handler.handle((event_name, payload))
    asyncio.create_task(task()) # we don't await for task execution, only register it

Differences between task management architectures

Property Middle-ware Event-loop (asyncio)
Executes after the response Yes Depends on usage
Doesn't block response Yes Yes
Dispatch must be called from a request context Yes Yes (anywhere from async context, so asyncio is accesible)
Dispatch can be used within registered handler No Yes

There are some key points to consider from the table above. While both strategies don't block the response, strategy with asyncio event loop can execute event sooner then the response is sent to client. This might happen when we do dispatch(event) with consecutive await promise. The event is dispatched to the event loop but since there is await after event has been dispatched the event loop might start executing dispatched event. From user perspective I would say this is acceptable/preferable behavior - I have already dispatched event but still awaiting for other resource and therefore other tasks can be executed in mean time. If dispatch is called and there is no consecutive await its guaranteed that it will be executed after the current event(request) finishes its execution.

While this change in architecture might break behavior users are used to I would say that strategy of detaching events execution to the asyncio event pool is more preferred and stable for the future. Instead of executing event right after the request/response where it was dispatched, event is sent to a queue and scheduled for execution with every other request/response and events that Fastapi creates. New architecture still allows old behavior to be used in case anyone needs to schedule event execution after the response. Moreover this architecture allows users to define preferred behavior instead of forcing them with strict rules.

@ndopj ndopj changed the title Proposal - use asyncio for task management Proposal - use asyncio for events management Mar 10, 2022
@melvinkcx
Copy link
Owner

@ndopj Thank you very much for the proposition and your analysis.

I totally agree with you when it comes to the limitation of the current implementation. Event chaining was something I was thinking of when I started the project but haven't spent too much effort into it as it hasn't bit me yet. While your proposal seems feasible, it is definitely a breaking change as you described.

If we were to introduce a new architecture (which I think we should), here's what I think we can do:

  1. Add an optional keyword argument to the dispatch() to allow users to specify the architecture.
    If the argument is not specified, we default to the default middleware architecture.
    Eg:

    • dispatch("event_name", {}, to_="loop")
    • dispatch("event_name", {}, to_="middleware"), which should be the same as dispatch("event_name", {}) if (2) below is not set

    Users can optionally simplify the above by using functools.partial:

    • dispatch_loop = functools.partial(dispatch, to_="loop")
  2. We introduce a global variable to allow users to specify the default architecture, either:

    • middleware, or
    • asyncio event loop

    So, if the users write dispatch("event_name", {}), the architecture used will depend on if the global variable is set to a value fastapi-events recognises, or it shall fallback to "middleware".

Please let me know your thoughts. Any proposals to the API are also largely welcome 🙏

@melvinkcx
Copy link
Owner

@ndopj I believe there is another issue we need to solve if we want to support an architecture without middleware.

Today, handlers are specified as arguments when adding the middleware:

# With Starlette

app = Starlette(middleware=[
    Middleware(EventHandlerASGIMiddleware,
               handlers=[local_handler])  # registering handlers
])

or

# With FastAPI

app = FastAPI()
app.add_middleware(EventHandlerASGIMiddleware, 
                   handlers=[local_handler])   # registering handler(s)

Without the middleware, we need another solution to allow users to specify the handlers they use. So that all handlers work properly with the asyncio event loop architecture:

def _dispatch(event_name: Union[str, Enum], payload: Optional[Any] = None) -> None:
    async def task():
          # FIXME get a list of handlers registered and invoke them
          pass
          
    asyncio.create_task(task()) # we don't await for task execution, only register it

I don't have a good idea yet. It seems like we will need to leverage global context or similar for this purpose. Any suggestions are welcome too

@ndopj
Copy link
Author

ndopj commented Mar 19, 2022

@melvinkcx I agree on mentioned two steps of how to introduce the new architecture. Such solution is backwards compatible and by default provides users with expected current architecture based behavior. Also if anywhere in the future decision will be made to stick with either event loop or middleware its easy to discard the other architecture (let's say when there will be a new major release version with other changes breaking old behavior). Also we should use Enum instead of plain strings for "middleware" and "event_loop" but I am sure whoever is going to implement this feature would do so.

Regarding all handlers registration (not to be confused with event handler registration) we should first raise a valid points for what we want to achieve from new registration system (also note from here that I am nowhere close to experienced python developer and my development experience origins in different languages).

  1. as we are trying to preserve old behavior we should still allow users to register their handlers trough middle-ware or if new registration system would handle both cases nicely and would be preferred way we should at least be sure that registration of handler trough middle-ware doesn't break expected functionality so it would not break code bases built using old versions when upgrading.
  2. it would be feasible to have registration system which would handle both cases so there will be consistent way to register handlers instead of having two separate ways for each architecture which would lower user experience little bit.

I think that we can agree that global context solution is fallback option here for us. Note that for now there will always be need for middle-ware registration if current architecture is expected from user so this is only about how EventHandlerASGIMiddleware obtains registered handlers. Current handlers argument remains for back compatibility there will be just one line addition of handlers that are expected to execute by middle-ware but were registered trough global context. Event loop dispatched event can simply access new global context registry to obtain handlers. Example usage with on app start event:

app = FastAPI()
app.add_middleware(EventHandlerASGIMiddleware, # use this only when middle-ware arch is wanted
                   handlers=[local_handler])   # registering handler(s), not required to be done here - optional for back compatibility

@app.on_event("startup")
async def startup_event():
    ...
    # register handler(s) used either by event loop or middle-ware events
    fastapi_events.register_handler(MyCustomHandler(), local_handler)
    ...

While this might be solution we should at least try to find more cleaner approach.

First thing that I come up with was to use fastapi dependencies system. But this would anyway require to manually register handler(s) to global store, so in the end this can be used just to write cleaner code with our fallback option.

Another thing would be to somehow obtain registered middle-ware from fastapi so we can access all registered handlers even from event loop dispatched event. While this does maintain backwards compatibility it would require users using event loop to register middle-ware with specified event handlers despite not using middle-ware arch. so middle-ware would act only as registry for handler(s). This might be confusing to users and is probably also no go solution.

Last thing in my mind is to self register handler to the global context (similar to proposed fallback solution) in the handler base class. Since all handlers must be instantiated so users can register event processing functions we could modify base class in such way that in the constructor it would register itself to the global context store. Middle-ware and event loop events would then use this global store to access all handlers. Since this would automatically register all handlers, handlers passed to EventHandlerASGIMiddleware would be simply ignored (but argument for them remains for back compatibility). Those would be already registered because they were instantiated. This solution is exactly mentioned fallback solution striped for need to manually register handlers. This would also discard need to register handlers for middle-ware so usage would look like:

Library part:

# handlers/base.py
class BaseEventHandler(ABC):

    def __init__(self):
        fastapi_events.register_handler(self)

    async def handle_many(self, events: Iterable[Event]) -> None:
        await asyncio.gather(*[self.handle(event) for event in events])

    @abc.abstractmethod
    async def handle(self, event: Event) -> None:
        raise NotImplementedError


# middleware.py
class EventHandlerASGIMiddleware:
    def __init__(self, app: ASGIApp, handlers: Iterable[BaseEventHandler]) -> None:
        self.app = app
        # Not required anymore: self._handlers: Iterable[BaseEventHandler] = None 
   ...
   ...
       async def _process_events(self) -> None:
        q: Deque[Event] = event_store.get()
        await asyncio.gather(*[handler.handle_many(events=q)
                               for handler in fastapi_events.get_handlers()])

User part:

app = FastAPI()
app.add_middleware(EventHandlerASGIMiddleware, # use this only when middle-ware arch is wanted
                   handlers=[local_handler])   # just back compatibility, handlers are automatically registered

This way all the handlers would be accessed from the global context so even event loop events can access handlers registered trough middle-ware (current code bases) while in the mentioned fallback with manual registration this is not the case (unless modified little bit). If this proposal would change to implementation I would suggest to first implement automatic handler registration to ensure this will work. On top of that it should be easy to build solution with event loop dispatching and the system you mentioned to preserve backwards compatibility forcing by default current behavior .

@melvinkcx
Copy link
Owner

@ndopj Absolutely right about using enums! Let me separate my response into sections to make it easier to consume.

Goals

Before diving into the implementation details, I think we should first agree on the goals. Here are what I think:

  1. Event chaining: It should be possible for us to dispatch events within event handlers
  2. Minimum changes to API and existing behaviour for maximum backward compatibility

Backward Compatibility

To achieve goal#2 from the above, I think we should preserve these behaviours:

  1. Events dispatched within a request-response cycle should only be handled after the response is made or we risk breaking the use cases it currently supports.
    A typical example is dispatching events that trigger side effects when a resource is created. If a resource is not committed into the database yet, and the event handler is invoked, it will likely cause unintended errors.

  2. Configuring handlers and other settings through a middleware. The reason is that it is possible that multiple instances of middleware are created especially in testing. If we rely solely on a single global context to replace the middleware, it will be complicated and difficult to achieve similar isolation. More on the solutions to this in the sections below.

Achieving Event Chaining

The main idea of your proposal is to achieve event chaining, and we both agree that an architectural change is necessary. In order to preserve the behaviours outlined in the section "Backward Compatibility", I'm exploring the possibility of using a hybrid approach (event loop + middleware + global context). Here's my idea:

  1. Instead of keeping track of the handlers registered as an instance variable in the middleware. Each middleware instance should have a dedicated global context that is accessible without references to the middleware.

  2. Introduce a smarter dispatcher (dispatch()). The implementation of dispatch() should detect if it is called within the request-response cycle or is within an event handler:

    1. If it is called within a request-response cycle, the events will be stored in the event_store and be processed after a response is made. (The same as today)

      event_store: ContextVar = ContextVar("fastapi_context")

    2. If it is called within an event handler, the event handling should be put into the event loop immediately using your suggestion:

      def _dispatch(event_name: Union[str, Enum], payload: Optional[Any] = None) -> None:
          async def task():
              handlers = [] # TODO get all handlers from global context
              await asyncio.gather(*[handler.handle((event_name, payload)) for handler in handlers])
      
          asyncio.create_task(task()) # we don't await for task execution, only register it

With the above, I believe all the goals described will be achieved.

About Auto-Registering Handlers

Similar to how the isolation of middleware instances might be needed for certain use cases, such as testing. Implementing auto-registration upon instantiation might be too much of a side effect as it will drastically change the behaviour and there isn't a way to opt out either. I think we can revisit this idea in the future if there are more compelling arguments for us to introduce this.

Let me know what you think =)

@melvinkcx melvinkcx changed the title Proposal - use asyncio for events management Proposal - use asyncio for events management (Supporting Event Chaining) Apr 14, 2022
@melvinkcx
Copy link
Owner

This has been implemented in #24

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants