Conversation

@ucalyptus2

What does this PR do?

This PR introduces a new InterleaveTrainer class that alternates between different training strategies within a single training loop. This allows more flexible training patterns in which different optimization objectives are interleaved during model training.

Key additions:

  • Add InterleaveTrainer class and configuration

    • Implements a trainer that can alternate between different training strategies
    • Provides configurable scheduling of training phases
    • Supports seamless integration with existing TRL trainers
  • Add unit tests for interleaved training

    • Comprehensive test coverage for trainer functionality
    • Tests for configuration validation
    • Integration tests with different training scenarios
  • Update __init__.py files to expose the new trainer

    • Make InterleaveTrainer accessible through the main TRL package
    • Maintain consistent import patterns with other trainers
  • Implement trainer configuration with InterleaveConfig (see the usage sketch after this list)

    • Flexible configuration options for defining training schedules
    • Support for customizing phase transitions
    • Type-safe configuration validation
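
As a rough illustration, usage might look something like the sketch below. Note that the argument names (schedule, steps_per_phase, phases) and the dataset/model identifiers are assumptions for illustration, not the exact API introduced by this PR:

```python
# Hypothetical usage sketch: InterleaveTrainer and InterleaveConfig come from this PR,
# but every argument name below (schedule, steps_per_phase, phases) is an assumption.
from datasets import load_dataset
from trl import DPOTrainer, InterleaveConfig, InterleaveTrainer, SFTTrainer

# Example datasets/model; any SFT-style and preference-style datasets would do here.
sft_dataset = load_dataset("trl-lib/Capybara", split="train")
dpo_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

config = InterleaveConfig(
    output_dir="interleaved-run",
    schedule="round_robin",   # assumed option: alternate phases in a fixed order
    steps_per_phase=100,      # assumed option: optimizer steps before switching phase
)

trainer = InterleaveTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    args=config,
    phases={
        "sft": {"trainer_cls": SFTTrainer, "train_dataset": sft_dataset},
        "dpo": {"trainer_cls": DPOTrainer, "train_dataset": dpo_dataset},
    },
)
trainer.train()
```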

Technical Details

The InterleaveTrainer allows users to define multiple training phases that can be alternated during the training process. This is particularly useful for scenarios where you want to:

  • Alternate between different learning objectives
  • Switch between different datasets during training
  • Implement curriculum learning strategies
  • Balance multiple training goals in a controlled manner (a minimal scheduling sketch follows this list)
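
To make the interleaving idea concrete, here is a minimal, framework-free sketch of a round-robin phase schedule. The Phase dataclass and interleave_train helper are purely illustrative and are not part of this PR:

```python
# Minimal, framework-free sketch of interleaved training phases (illustrative only;
# it does not reflect the actual InterleaveTrainer implementation).
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Phase:
    name: str                       # e.g. "sft", "dpo", or a curriculum stage
    step_fn: Callable[[int], None]  # performs one optimization step for this phase
    steps_per_phase: int            # steps to run before switching to the next phase


def interleave_train(phases: List[Phase], total_steps: int) -> None:
    """Round-robin over phases, running each for steps_per_phase steps at a time."""
    step = 0
    while step < total_steps:
        for phase in phases:
            for _ in range(phase.steps_per_phase):
                if step >= total_steps:
                    return
                phase.step_fn(step)  # in practice: forward pass, loss, optimizer step
                step += 1


if __name__ == "__main__":
    phases = [
        Phase("sft", lambda s: print(f"step {s}: SFT objective"), steps_per_phase=2),
        Phase("dpo", lambda s: print(f"step {s}: DPO objective"), steps_per_phase=1),
    ]
    interleave_train(phases, total_steps=9)
```

The same loop structure generalizes to curriculum learning by ordering phases from easy to hard instead of cycling through them.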

Before submitting

  • Did you read the contributor guideline?
  • Did you write any new necessary tests?
  • Did you make sure to update the documentation with your changes?

Who can review?

Anyone familiar with TRL's trainer implementations and interested in advanced training strategies. @huggingface/trl-core-team would be great reviewers for this feature.

@qgallouedec
Member

Hi, thanks for your contribution! I’m unsure whether the potential benefit outweighs the added complexity. For now, I’m putting this on hold to gauge community interest in supporting this feature.

@ucalyptus2 closed this Apr 21, 2025