LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
An open-source, knowledgeable large language model framework.
Collaborative Training of Large Language Models in an Efficient Way
Code and analysis for optimizing dynamic neural networks, investigating and implementing various optimization techniques to enhance them.
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, a RESTful API, auto-scaling, compute resource management, monitoring, and more.
A toy large model for recommender systems based on LLaMA2, SASRec, and Meta's generative recommenders, plus notes and experiments on the official implementation of Meta's generative recommenders.
Shaping Language Models with Cognitive Insights
Minimal yet high-performance code for pretraining LLMs. Attempts to implement some SOTA features; supports training through DeepSpeed, Megatron-LM, and FSDP. WIP.
Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode; faster than ZeRO/ZeRO++/FSDP (a minimal DeepSpeed training sketch follows this list).
The official implementation of paper "Demystifying Instruction Mixing for Fine-tuning Large Language Models"
Best practice for training LLaMA models in Megatron-LM
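Several of the repositories above train through DeepSpeed. For readers new to the library, here is a minimal sketch of its standard `deepspeed.initialize` entry point with a ZeRO stage 2 config. The toy linear model, dimensions, and hyperparameters are placeholder assumptions for illustration, not taken from any project listed here.

```python
import torch
import deepspeed

# Toy model; all sizes and hyperparameters below are illustrative assumptions.
model = torch.nn.Linear(1024, 1024)

ds_config = {
    "train_batch_size": 4,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # partition optimizer states and gradients
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in an engine that handles ZeRO
# partitioning, mixed precision, and gradient reduction across ranks.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

x = torch.randn(4, 1024, device=engine.device, dtype=torch.half)
loss = engine(x).float().pow(2).mean()  # dummy loss for the sketch
engine.backward(loss)  # handles loss scaling and gradient all-reduce
engine.step()          # optimizer step plus ZeRO bookkeeping
```

Scripts like this are typically launched with the `deepspeed` CLI (e.g. `deepspeed train_sketch.py`), which sets up the distributed environment for each rank.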