Custom trl.SFTTrainer that adds a KL divergence loss between a LoRA-adapted model and its base model.
Updated Jul 25, 2025 · Python
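As a rough illustration of the idea in the description, the sketch below computes a supervised fine-tuning loss (token-level cross-entropy) plus a weighted KL term between the adapted model's and the base model's next-token distributions. It is written in plain Python so the arithmetic is easy to follow; it is not the repository's code. The function names, the `kl_weight` parameter, and the KL direction (adapted relative to base) are all assumptions made for illustration.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(p_logits, q_logits):
    """KL(P || Q) for a single token position, computed from raw logits."""
    log_p = log_softmax(p_logits)
    log_q = log_softmax(q_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def sft_loss_with_kl(adapted_logits, base_logits, labels, kl_weight=0.1):
    """Mean cross-entropy plus kl_weight * mean KL(adapted || base).

    adapted_logits, base_logits: one list of vocabulary logits per token.
    labels: target token index for each position.
    kl_weight: hypothetical coefficient, not taken from the repository.
    """
    ce_total = 0.0
    kl_total = 0.0
    for adapted, base, label in zip(adapted_logits, base_logits, labels):
        ce_total += -log_softmax(adapted)[label]      # SFT term
        kl_total += kl_divergence(adapted, base)      # stay-close-to-base term
    n = len(labels)
    return ce_total / n + kl_weight * kl_total / n

# Toy example: two token positions, vocabulary of three tokens.
adapted = [[2.0, 0.5, -1.0], [0.0, 1.0, 0.0]]
base = [[1.5, 0.5, -0.5], [0.2, 0.8, 0.0]]
loss = sft_loss_with_kl(adapted, base, labels=[0, 1], kl_weight=0.1)
```

In an actual trainer the same quantities would be tensor operations, and the base logits would presumably come from a forward pass with the LoRA adapter disabled and gradients turned off, so that only the adapter weights receive a gradient from the KL term.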