lansinuote/Simple_TRL

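Training natural-language LLM models

Includes two implementations: training with TRL, and training by hand.

The training methods covered are DPO and PPO.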
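
As a rough illustration of what the hand-written DPO training has to compute, here is a minimal sketch of the standard DPO loss for a single preference pair, in plain Python. It is not taken from this repository: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for the example. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    All names and the beta default are hypothetical, chosen for this sketch.
    loss = -log(sigmoid(beta * margin)), where the margin measures how much
    more the policy prefers the chosen response than the reference does.
    """
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(x)) == softplus(-x), written in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Policy identical to the reference: margin is 0, so the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)

# Policy favours the chosen response more than the reference does: lower loss.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In real training the same expression would be evaluated on batched tensors and averaged; the scalar form is used here only to keep the arithmetic visible.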

Environment:

python=3.10

torch==2.1.0 (CUDA build)

transformers==4.34.0

datasets==2.14.5

trl==0.7.4
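
To compare a local environment against the versions listed above, a small check along the following lines may help. This is a sketch, not part of the repository: the helper name `find_mismatches` and the `PINNED` dictionary are made up for the example, and it assumes that a local build suffix such as `+cu118` on the installed torch version should be ignored.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the environment list above.
PINNED = {
    "torch": "2.1.0",
    "transformers": "4.34.0",
    "datasets": "2.14.5",
    "trl": "0.7.4",
}


def find_mismatches(pinned):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Drop a local build tag such as "+cu118" before comparing (assumption).
        base = installed.split("+")[0] if installed else None
        if base != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every listed package matches; the Python version itself is not checked here.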

Video course: in production.
