
Add Batch Processing with Concurrency Limits to Model Adapters #7

@DevilsAutumn

Description

Problem:
The current implementation processes prompts sequentially, which makes evaluations on large datasets extremely slow. We need concurrent processing with rate-limit protection.

Requirements:

  1. Process prompts in batches with configurable concurrency
  2. Add semaphore-based rate limiting (see the `asyncio.Semaphore` documentation)
  3. Respect API rate limits
  4. Provide batch size configuration
  5. Handle failures gracefully in batch mode
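
Requirement 2 can be sketched with a small self-contained example (names like `call_api` and `run_demo` are illustrative, not part of the adapter API): a semaphore caps how many coroutines sit inside the `async with` block at once, which is exactly what bounds the number of in-flight API calls.

```python
import asyncio

async def call_api(sem: asyncio.Semaphore, state: dict) -> None:
    # Only `max_concurrent` tasks can be inside this block at a time.
    async with sem:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stand-in for a real API request
        state["active"] -= 1

async def run_demo(n_tasks: int = 10, max_concurrent: int = 3) -> int:
    sem = asyncio.Semaphore(max_concurrent)
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*[call_api(sem, state) for _ in range(n_tasks)])
    return state["peak"]

print(asyncio.run(run_demo()))  # -> 3: concurrency never exceeded the limit
```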

Acceptance Criteria:

  • ModelAdapter has batch_generate() method
  • Configurable max_concurrent_requests parameter
  • Configurable batch_size parameter
  • Respects rate limits (uses asyncio.Semaphore)
  • Failed requests don't stop entire batch
  • 10x faster on 100+ prompt datasets
  • Tests verify concurrent execution

Implementation Hints:

import asyncio
from typing import List, Union

async def batch_generate(
    self,
    prompts: List[str],
    batch_size: int = 10,
    max_concurrent: int = 5,
    **kwargs,
) -> List[Union[str, BaseException]]:
    # Cap the number of in-flight requests across the whole run.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def generate_with_limit(prompt: str):
        async with semaphore:
            return await self._generate_single(prompt, **kwargs)

    # Split prompts into fixed-size batches.
    batches = [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]

    all_results: List[Union[str, BaseException]] = []
    for batch in batches:
        # return_exceptions=True keeps one failed request from cancelling
        # the rest of the batch; failures come back in-place as exception
        # objects, so the return type is a union, not plain List[str].
        batch_results = await asyncio.gather(
            *[generate_with_limit(p) for p in batch],
            return_exceptions=True,
        )
        all_results.extend(batch_results)

    return all_results
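
For the "failed requests don't stop entire batch" criterion, here is a standalone sketch (with a hypothetical `flaky_generate` standing in for `_generate_single`) showing how `return_exceptions=True` surfaces failures in-place so callers can separate them from successes:

```python
import asyncio

async def flaky_generate(prompt: str) -> str:
    # Hypothetical stand-in for self._generate_single: one prompt fails.
    if prompt == "bad":
        raise RuntimeError(f"API error for {prompt!r}")
    return f"response:{prompt}"

async def demo():
    prompts = ["a", "bad", "c"]
    # One failure does not cancel the sibling requests.
    results = await asyncio.gather(
        *[flaky_generate(p) for p in prompts], return_exceptions=True
    )
    successes = [r for r in results if not isinstance(r, BaseException)]
    failures = [r for r in results if isinstance(r, BaseException)]
    return successes, len(failures)

print(asyncio.run(demo()))  # -> (['response:a', 'response:c'], 1)
```

Callers can then log or retry the exception objects without losing the successful responses.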
