Online DPO Support #1816

@dhineshkumar-r

Hey Folks,

Does NeMo-RL support Online DPO (https://arxiv.org/abs/2402.04792)?
Since NeMo-RL already has environments, I was wondering whether preferences could be generated online from outcomes in the environment and then used for fine-tuning.
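
To make the idea concrete, here is a minimal sketch of what is meant by generating preferences online. It does not use any NeMo-RL API: `build_preference_pair`, `dpo_loss`, and the toy `reward_fn` are hypothetical names introduced only for illustration, and the log-probabilities are assumed to come from the current policy and a frozen reference model.

```python
import math
from typing import Callable, List, Optional, Tuple


def build_preference_pair(
    prompt: str,
    responses: List[str],
    reward_fn: Callable[[str, str], float],  # hypothetical environment scorer
) -> Optional[Tuple[str, str]]:
    """Return (chosen, rejected) from freshly sampled responses, or None on a tie."""
    rewards = [reward_fn(prompt, r) for r in responses]
    best = max(range(len(responses)), key=lambda i: rewards[i])
    worst = min(range(len(responses)), key=lambda i: rewards[i])
    if rewards[best] == rewards[worst]:
        return None  # the environment gives no preference signal for this prompt
    return responses[best], responses[worst]


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * log-ratio margin))."""
    margin = beta * (
        (policy_logp_chosen - ref_logp_chosen)
        - (policy_logp_rejected - ref_logp_rejected)
    )
    # Numerically stable softplus(-margin), which equals -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Toy environment (assumption): reward 1.0 for the correct answer, else 0.0.
def toy_reward(prompt: str, response: str) -> float:
    return 1.0 if response == "4" else 0.0


pair = build_preference_pair("What is 2 + 2?", ["5", "4"], toy_reward)
```

In an online loop, the responses would be sampled from the current policy at every step, so the pairs track the policy as it changes instead of coming from a fixed offline dataset.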
