Skip to content

feat: Implement BURST and STEP load patterns #217

@arekay-nv

Description

@arekay-nv

Problem

LoadPatternType.BURST and LoadPatternType.STEP are defined in the config schema enum but have no scheduler implementation:

# src/inference_endpoint/config/schema.py:72-73
BURST = "burst"
STEP = "step"

A user who selects either pattern in their YAML config will get no error, but also no expected behavior — the scheduler will silently fall through or raise an unhandled case.

Expected Behavior

  • BURST: Issue a configurable burst of N queries at a fixed interval (e.g. N queries every T seconds)
  • STEP: Incrementally step up QPS in stages (useful for finding saturation point)

Files to Modify

  • src/inference_endpoint/config/schema.py — document the pattern parameters
  • src/inference_endpoint/load_generator/scheduler.py — add scheduler implementations
  • src/inference_endpoint/config/runtime_settings.py — ensure RuntimeSettings handles these patterns

Metadata

Metadata

Assignees

No one assigned

    Labels

    area: core-engineLoad generator, scheduler, async utilspriority: P2Medium — address within quartertype: featureNew feature or capability

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions