[RFC] Should Services know about Batching? #120

@Jack-Khuu

Code Pointer: service.py

The idea behind batching is simple: you aggregate similar requests into a single payload for more efficient processing.
What I want to get discussion on is: should Services and Replicas handle batching or should they just blindly pass payloads along?

IMO there are currently two potential patterns (that can/should co-exist) for supporting batching without upstreaming it to Services/Replicas:

  • Caller Batching: Agents expose an endpoint that explicitly expects a batched input
    • Caller + Agent operate on batched payloads
    • Con: Caller has to deal with batching
  • Actor Batching: Agents expose an endpoint to enqueue requests, and Agents write separate logic to batch-process them internally
    • Caller operates on singletons
    • Agents handle aggregation and operate on batched payloads
    • Con: Agents are a tad more complex and will share similar boilerplate
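
To make the contrast concrete, here is a minimal sketch of both patterns in plain asyncio. It deliberately avoids the Forge API: `CallerBatchedAgent`, `ActorBatchedAgent`, `max_batch_size`, and the doubling "work" are all hypothetical stand-ins, not code from service.py.

```python
import asyncio


class CallerBatchedAgent:
    """Caller Batching: the endpoint expects an already-batched input."""

    async def process_batched_request(self, batch: list[int]) -> list[int]:
        return [x * 2 for x in batch]  # stand-in for real batched work


class ActorBatchedAgent:
    """Actor Batching: callers send singletons, the agent batches internally."""

    def __init__(self, max_batch_size: int = 4) -> None:
        self.max_batch_size = max_batch_size  # hypothetical knob
        self.queue: asyncio.Queue = asyncio.Queue()

    async def add_request(self, item: int) -> int:
        # Enqueue the singleton together with a future the caller awaits.
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((item, future))
        return await future

    async def run(self) -> None:
        # Drain the internal queue one batch at a time.
        while True:
            pending = [await self.queue.get()]
            while len(pending) < self.max_batch_size and not self.queue.empty():
                pending.append(self.queue.get_nowait())
            batch = [item for item, _ in pending]
            results = await self.process_batched_request(batch)
            for (_, future), result in zip(pending, results):
                future.set_result(result)

    async def process_batched_request(self, batch: list[int]) -> list[int]:
        return [x * 2 for x in batch]


async def demo() -> tuple[list[int], list[int]]:
    # Caller Batching: the caller builds the batch itself.
    caller_side = await CallerBatchedAgent().process_batched_request([1, 2, 3])

    # Actor Batching: the caller only ever sends singletons.
    agent = ActorBatchedAgent()
    worker = asyncio.create_task(agent.run())
    actor_side = await asyncio.gather(*(agent.add_request(x) for x in [1, 2, 3]))
    worker.cancel()
    return caller_side, list(actor_side)
```

The queue, the futures, and the `run` loop are the boilerplate that every Actor Batching agent would end up repeating in some form.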

Where Services/Replicas can potentially come in is with replica routing and prebatching in the Actor Batching case.

  1. Callers call singleton endpoints
  2. Services route requests destined for batching to specific Replicas
    • Aware Routing: Services route payloads (w/ similarity criterion) to specific Replicas
  3. Replicas aggregate singleton requests
    • Prebatching: Replicas create an aggregated payload and pass it along to the Agent's Batched endpoint
  4. Agents process aggregated payload

~ Note that the singleton endpoint called by the user never reaches the Actor ~

In combination, this gives Forge a general way to customize the routing and batching process.
Beyond that, it has the added side effect of simplifying the Actor flow (both executing Actors and creating new ones).
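
The four-step flow above (singleton call, aware routing, prebatching, batched execution) could look roughly like the sketch below. Everything here is an assumption for illustration: `SketchService`, `SketchReplica`, `SketchAgent`, the CRC32-based similarity key, and the fixed `batch_size` are hypothetical, and returning results to the original callers is left out to keep it short.

```python
import zlib


class SketchAgent:
    """Only exposes a batched endpoint; it never sees singleton calls."""

    def process_batched_request(self, batch: list[str]) -> list[str]:
        return [text.upper() for text in batch]  # stand-in for real work


class SketchReplica:
    """Prebatching: aggregates singletons, then calls the batched endpoint."""

    def __init__(self, agent: SketchAgent, batch_size: int) -> None:
        self.agent = agent
        self.batch_size = batch_size  # hypothetical flush threshold
        self.pending: list[str] = []
        self.results: list[str] = []

    def submit(self, payload: str) -> None:
        self.pending.append(payload)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Hand the aggregated payload to the agent in a single call.
        if self.pending:
            self.results.extend(self.agent.process_batched_request(self.pending))
            self.pending = []


class SketchService:
    """Aware Routing: payloads sharing a key land on the same replica."""

    def __init__(self, replicas: list[SketchReplica]) -> None:
        self.replicas = replicas

    def route(self, key: str, payload: str) -> int:
        # A stable hash of the similarity key picks the replica.
        index = zlib.crc32(key.encode()) % len(self.replicas)
        self.replicas[index].submit(payload)
        return index
```

A caller would only ever invoke `service.route(key, payload)` with a singleton; whether the similarity criterion is a hash key, a sequence-length bucket, or something else is exactly the kind of customization the RFC is asking about.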

[Current] Without Services/Replica handling batching

class NewActor(ForgeActor):
    ...

    @endpoint
    async def add_request(self):
        # Enqueue the incoming singleton request
        ...

    async def run(self):
        # Process requests from the internal queue
        ...

    async def process_batched_request(self):
        # Execute on the aggregated batch
        ...

With Services/Replica handling batching

class NewActor(ForgeActor):
    ...

    @endpoint
    async def process_batched_request(self):
        # Execute on the batch prebuilt by the Replica
        ...

    @classmethod
    def enbatch(cls, request, cumulative_batch):
        # Add the request to the batch struct
        ...
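
As a rough illustration of how a Replica might drive the `enbatch` hook, here is a self-contained sketch. The dict-of-lists batch struct, the `prebatch` helper, and the `prompt`/`max_tokens` request fields are all assumptions made up for this example; the RFC does not specify what the batch struct looks like or whether `enbatch` returns it.

```python
class SketchActor:
    @classmethod
    def enbatch(cls, request: dict, cumulative_batch: dict) -> dict:
        # Add the request to the batch struct (here: column-wise lists).
        for field, value in request.items():
            cumulative_batch.setdefault(field, []).append(value)
        return cumulative_batch

    def process_batched_request(self, batch: dict) -> list[int]:
        # Stand-in for execution: one result per request in the batch.
        return [len(p) + n for p, n in zip(batch["prompt"], batch["max_tokens"])]


def prebatch(actor_cls, requests: list[dict]) -> dict:
    """What a replica might do before calling the batched endpoint."""
    batch: dict = {}
    for request in requests:
        batch = actor_cls.enbatch(request, batch)
    return batch
```

Because `enbatch` is a classmethod, the Replica can build the batch without touching Actor instance state, which is what lets the aggregation step live outside the Actor.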

All feedback is welcome, but here are a few seeding questions:

  • If we plan to upstream Service/Replica, does this make them too RL specific?
  • Does custom routing match the identity of Forge ("Users don't need to think about the network magic")?
