[FEAT][executors]: optimize advanced vLLM sampler with shared prefix caching #11

Closed

Opened on May 18, 2026

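Refactor the rollout sampler to explicitly enable vLLM's prefix caching (called Automatic Prefix Caching in the vLLM docs), so that requests sharing a prompt reuse its cached KV blocks. GRPO generates multiple candidates per prompt; when those candidates do not share the prompt's KV cache, the same prefix is prefilled and stored once per candidate. Enabling prefix caching eliminates that redundancy.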
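A minimal sketch of what the refactor could look like, assuming the sampler wraps vLLM's offline `LLM` / `SamplingParams` API. `RolloutConfig`, `engine_kwargs`, `sampling_kwargs`, `prefill_tokens`, and `build_and_sample` are hypothetical names for illustration and are not taken from this repository. `prefill_tokens` is an idealized estimate that assumes the cache is shared at whole-block granularity (block size 16 here is an assumption); it is not a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutConfig:
    """Hypothetical sampler settings; field names are illustrative only."""
    model: str
    group_size: int = 8                  # GRPO candidates generated per prompt
    temperature: float = 1.0
    max_tokens: int = 512
    enable_prefix_caching: bool = True   # set explicitly instead of relying on defaults


def engine_kwargs(cfg: RolloutConfig) -> dict:
    """Keyword arguments intended for vllm.LLM(...)."""
    return {
        "model": cfg.model,
        "enable_prefix_caching": cfg.enable_prefix_caching,
    }


def sampling_kwargs(cfg: RolloutConfig) -> dict:
    """Keyword arguments intended for vllm.SamplingParams(...)."""
    return {
        "n": cfg.group_size,
        "temperature": cfg.temperature,
        "max_tokens": cfg.max_tokens,
    }


def prefill_tokens(prompt_len: int, group_size: int, shared: bool,
                   block_size: int = 16) -> int:
    """Idealized count of prompt tokens prefilled for one GRPO group.

    Assumes each candidate is its own request and that only full
    blocks of the prompt can be reused from the cache.
    """
    if not shared:
        return prompt_len * group_size
    cached = (prompt_len // block_size) * block_size
    # The first request computes the whole prompt; the rest recompute
    # only the trailing partial block.
    return prompt_len + (group_size - 1) * (prompt_len - cached)


def build_and_sample(cfg: RolloutConfig, prompts: list[str]) -> list[list[str]]:
    """Generate cfg.group_size completions per prompt (requires vLLM and a GPU)."""
    from vllm import LLM, SamplingParams  # imported lazily so the helpers above run anywhere

    llm = LLM(**engine_kwargs(cfg))
    outputs = llm.generate(prompts, SamplingParams(**sampling_kwargs(cfg)))
    return [[candidate.text for candidate in out.outputs] for out in outputs]
```

Under these assumptions, a 1024-token prompt with 8 candidates drops from 8192 prefilled prompt tokens to 1024. Two caveats: a single request with `n > 1` may already share the prompt across its samples, so the gain is most likely largest when candidates are submitted as separate requests or when prompts recur across steps; and some vLLM versions enable prefix caching by default, in which case setting the flag mainly documents intent.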

Metadata

Assignees

inaniloquentee

Labels

component: executors, enhancement, platform: cuda, platform: rocm, type: performance
