Description
Code Pointer: service.py
The idea behind batching is simple: you aggregate similar requests into a single payload for more efficient processing.
What I'd like to get discussion on is: should Services and Replicas handle batching, or should they just blindly pass payloads along?
IMO there are currently 2 potential patterns (that can/should co-exist) for supporting batching without upstreaming it to Services/Replicas:
- Caller Batching: Agents expose an endpoint that explicitly expects a batched input (see the first sketch after this list)
  - Caller + Agent operate on batched payloads
  - Con: Caller has to deal with batching
- Actor Batching: Agents expose an endpoint to enqueue requests, and each Agent writes separate logic to batch-process them internally
  - Caller operates on singletons
  - Agent handles aggregating and operates on batched payloads
  - Con: Agents are a tad more complex and will share similar boilerplate
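To make the Caller Batching pattern concrete, here is a minimal sketch. The endpoint name (`generate_batched`), the payload shape, and the invocation style are illustrative assumptions, not existing Forge APIs; it assumes the same `ForgeActor`/`@endpoint` imports as the snippets further down.

```python
# Caller Batching sketch: the Agent only exposes a batched endpoint and the
# caller owns the aggregation loop. All names below are illustrative.
class BatchedGenerator(ForgeActor):
    @endpoint
    async def generate_batched(self, requests: list[dict]) -> list[dict]:
        # The Agent only ever sees an already-aggregated payload.
        return [await self._generate_one(r) for r in requests]

    async def _generate_one(self, request: dict) -> dict:
        ...

async def caller_loop(generator, incoming, max_batch: int = 8):
    # Caller-side boilerplate: accumulate singletons, then fire one batched call.
    batch: list[dict] = []
    async for request in incoming:
        batch.append(request)
        if len(batch) >= max_batch:
            results = await generator.generate_batched(batch)  # assumed call style
            batch.clear()
            ...  # hand results back to whoever is waiting
```

The con noted above is visible here: every caller has to carry its own accumulation loop.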
Where Services/Replicas could potentially come in is with replica routing and prebatching for the Actor Batching case (a rough sketch follows this list):
- Callers call singleton endpoints
- Services route batched requests to specific Replicas
  - Aware Routing: Services route payloads to specific Replicas using a similarity criterion
- Replicas aggregate singleton requests
  - Prebatching: Replicas create an aggregated payload and pass it along to the Agent's Batched endpoint
- Agents process the aggregated payload
~ Note that the singleton endpoint called by the user never reaches the Actor ~
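Here is a rough sketch of what a Service/Replica-owned routing + prebatching layer could look like. None of these hooks (`route_key`, a fixed batch-size window, the replica handle calls) exist in Forge today; they only illustrate the flow above.

```python
# Hypothetical Service-owned routing + prebatching layer. Purely illustrative.
from collections import defaultdict

class PrebatchingRouter:
    def __init__(self, replicas, max_batch: int = 8):
        self.replicas = replicas          # replica handles the Service routes to
        self.max_batch = max_batch
        self.pending = defaultdict(list)  # route key -> pending singleton requests

    def route_key(self, request: dict) -> int:
        # "Aware Routing": keep similar requests on the same Replica. A real
        # similarity criterion might look at model, prompt length, lora id, etc.
        return hash(request.get("model", "")) % len(self.replicas)

    async def submit(self, request: dict):
        # Callers only ever submit singletons; aggregation happens here.
        key = self.route_key(request)
        self.pending[key].append(request)
        if len(self.pending[key]) >= self.max_batch:
            batch, self.pending[key] = self.pending[key], []
            # Prebatching: hand the aggregated payload straight to the Agent's
            # batched endpoint -- the singleton path never reaches the Actor.
            await self.replicas[key].process_batched_request(batch)
```

The similarity criterion is deliberately pluggable; the point is that neither the caller nor the Agent carries this logic.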
In combination, this would give Forge a general way to customize the routing and batching process.
Beyond that, it has the added side effect of simplifying the Actor flow, both for execution and for writing new Actors:
[Current] Without Services/Replica handling batching
```python
class NewActor(ForgeActor):
    ...

    @endpoint
    async def add_request(self, request):
        # Enqueue the singleton request onto an internal queue
        ...

    async def run(self):
        # Process batches from the internal queue
        ...

    async def process_batched_request(self, batch):
        # Execution over the aggregated payload
        ...
```
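Filled in, those stubs end up looking roughly like the following in every Agent that batches; the queue handling and batching window are the boilerplate each Agent would repeat. The batch policy and constructor wiring are illustrative assumptions.

```python
import asyncio

# Hypothetical fill-in of the stubs above, showing the repeated boilerplate.
class QueueingActor(ForgeActor):
    def __init__(self, max_batch: int = 8):
        super().__init__()  # constructor wiring depends on ForgeActor
        self.queue: asyncio.Queue = asyncio.Queue()
        self.max_batch = max_batch

    @endpoint
    async def add_request(self, request):
        # Enqueue a singleton; the caller never sees a batch.
        await self.queue.put(request)

    @endpoint
    async def run(self):
        # Drain the internal queue into batches and dispatch them.
        while True:
            batch = [await self.queue.get()]
            while not self.queue.empty() and len(batch) < self.max_batch:
                batch.append(self.queue.get_nowait())
            await self.process_batched_request(batch)

    async def process_batched_request(self, batch):
        # Execution over the aggregated payload.
        ...
```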
With Services/Replica handling batching
```python
class NewActor(ForgeActor):
    ...

    @endpoint
    async def process_batched_request(self, batch):
        # Execution over the aggregated payload
        ...

    @classmethod
    def enbatch(cls, request, cumulative_batch):
        # Fold the request into the cumulative batch struct
        ...
```
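And a hedged sketch of how the Replica side could drive that hook. The `ReplicaPrebatcher` plumbing does not exist; it only shows where the shared boilerplate would live once it moves out of individual Agents, and it assumes `enbatch` returns the updated batch struct.

```python
# Hypothetical Replica-side driver for the `enbatch` hook. Purely illustrative.
class ReplicaPrebatcher:
    def __init__(self, actor_cls, actor, max_batch: int = 8):
        self.actor_cls = actor_cls
        self.actor = actor
        self.max_batch = max_batch
        self.cumulative_batch = None
        self.count = 0

    async def on_request(self, request):
        # Framework-owned aggregation: the Agent only supplies `enbatch` and
        # the batched execution endpoint.
        self.cumulative_batch = self.actor_cls.enbatch(request, self.cumulative_batch)
        self.count += 1
        if self.count >= self.max_batch:
            batch, self.cumulative_batch, self.count = self.cumulative_batch, None, 0
            await self.actor.process_batched_request(batch)  # assumed call style
```

With that in place, Agents only declare how to fold a request into a batch and how to execute one.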
All feedback is welcome, but here are a few seeding questions:
- If we plan to upstream Service/Replica, does this make them too RL-specific?
- Does custom routing match the identity of Forge ("Users don't need to think about the network magic")?