Summary
This parent issue tracks all outstanding issues with the tinker API server in SkyRL.
SkyRL-Train Backend
- Support PPO loss with Tinker
  - Support critic model in SkyRLTrainBackend
- KL Penalty is not supported
  - Support ref model in SkyRLTrainBackend
- Support logprobs with the sample() API.
- Provide an example for using truncated importance sampling with the tinker API server.
  - assigned to @tamoghnokandar
- Better validation for configuration parameters to the SkyRLTrainBackend: [tinker] Implement better validation for CLI parameters with the skyrl-train backend #1279
- Set LR outside of workers in skyrl-train backend: [tinkerification] Remove scheduler from inside the workers and default to use set_lr() and get_lr() in training loop #998
- Add sample API to the new inference codepath: [tinker] Sample API in new inference server #1286
- Add renderer to the new inference codepath: [SkyRL][tinker] Add renderer endpoint to inference client #1288
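For context on the truncated importance sampling item above, here is a minimal sketch of the underlying computation in plain Python. It is not the requested Tinker example and does not call the tinker API server or SkyRL. The function names `truncated_is_weights` and `tis_policy_gradient_loss` and the default `cap=2.0` are hypothetical, chosen only for illustration. The sketch assumes per-token logprobs are available both from the sampler (for example via a sample() call that returns logprobs) and from the training policy.

```python
import math


def truncated_is_weights(train_logprobs, sampler_logprobs, cap=2.0):
    """Per-token weights min(exp(lp_train - lp_sampler), cap).

    `cap` is an assumed truncation threshold, not a SkyRL or Tinker default.
    """
    if len(train_logprobs) != len(sampler_logprobs):
        raise ValueError("logprob sequences must have the same length")
    return [
        min(math.exp(lp_train - lp_sampler), cap)
        for lp_train, lp_sampler in zip(train_logprobs, sampler_logprobs)
    ]


def tis_policy_gradient_loss(train_logprobs, sampler_logprobs, advantages, cap=2.0):
    """Mean of -w * A * lp_train over tokens.

    In an autodiff framework the weights w would be treated as constants
    (no gradient flows through them); here everything is a plain float.
    """
    if not train_logprobs:
        raise ValueError("at least one token is required")
    if len(advantages) != len(train_logprobs):
        raise ValueError("advantages must match the number of tokens")
    weights = truncated_is_weights(train_logprobs, sampler_logprobs, cap)
    per_token = [
        -w * adv * lp_train
        for w, adv, lp_train in zip(weights, advantages, train_logprobs)
    ]
    return sum(per_token) / len(per_token)
```

Truncating the ratio from above bounds the variance introduced when the sampler and the trainer disagree on token probabilities, at the cost of some bias for tokens whose ratio exceeds the cap.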
Jax Backend
- Improve the memory handling and reporting to make it easier to see where the memory goes
- Add multimodal support to APIs and add support for a VLM model