Hi there 👋, I'm Yang Jianxin

[Image: yangjianxin1's GitHub stats]

I'm an NLPer interested in Large Language Models, and I graduated from SYSU with a master's degree.

In my free time, I like to write technical blogs on my WeChat Official Account (YeungNLP) and on Zhihu (红雨瓢泼).

🔭 Experience:

  • Shopee: responsible for building NLP algorithm capabilities for Customer Service (2022-04 to present).
  • Tencent: responsible for building NLP algorithm capabilities for Product Understanding (2021-06 to 2022-04).
  • Alibaba: internship (2020-06 to 2020-09).

⚙ Here are some of my public projects:

| Project | Description |
| --- | --- |
| Firefly | One-stop training for LLMs. Some achievements:<br>1. firefly-llama2-13b ranked 3rd among all 13B models on the Open LLM Leaderboard, only 0.5 points behind 1st.<br>2. firefly-llama-30b ranked 10th among all 30B models on the Open LLM Leaderboard, trained with a single V100.<br>3. firefly-baichuan-13b has over 1.63 million downloads.<br>4. firefly-qwen1.5-en-7b-dpo improves on the official chat model by 7.21 points.<br>5. firefly-gemma-7b improves on the official chat model by 9.37 points. |
| GPT2-chitchat | Chinese GPT-2 model for chitchat. |
| Firefly-LLaMA2-Chinese | Chinese Llama2 with an efficient and effective training method. |
| LongQLoRA | An efficient and effective method for extending the context length of Llama2 to 8192 with a single V100. Technical Report. |
| CPM | Chinese composition model based on CPM. |
| CLIP-Chinese | Chinese CLIP model trained on 1.4 million image-text pairs. |
| ClipCap-Chinese | Chinese image captioning model based on CLIP and Mengzi. |
| OFA-Chinese | Chinese multi-modal unified pre-training model. |
| LLMPruner | Prunes the vocabulary of LLMs to save memory during training (a minimal sketch of the idea follows below the table). |
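
To give a rough sense of what vocabulary pruning means in practice, here is a minimal PyTorch sketch of the general idea: keep only the embedding rows for the tokens you actually need. This is a simplified illustration under my own assumptions, not LLMPruner's actual implementation; the `prune_embedding` helper and the sizes are hypothetical, and a real pruner would also remap token ids in the tokenizer and shrink the output (lm_head) layer.

```python
import torch

def prune_embedding(embedding: torch.nn.Embedding, keep_ids: list[int]) -> torch.nn.Embedding:
    """Return a smaller embedding containing only the rows listed in keep_ids.

    Hypothetical helper for illustration; not part of LLMPruner's API.
    """
    keep = torch.tensor(sorted(set(keep_ids)), dtype=torch.long)
    pruned = torch.nn.Embedding(len(keep), embedding.embedding_dim)
    with torch.no_grad():
        # Copy only the kept rows; the new token id is the row's position in `keep`.
        pruned.weight.copy_(embedding.weight[keep])
    return pruned

# Example: shrink a 250k-token vocabulary to 30k kept tokens,
# cutting the embedding memory proportionally.
full = torch.nn.Embedding(250_000, 1024)
small = prune_embedding(full, keep_ids=list(range(30_000)))
print(small.weight.shape)  # torch.Size([30000, 1024])
```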

📁 Here are some of my technical blogs: