# Light Fairseq

This repo contains Hugging Face `transformers`-style conversions of some Fairseq model checkpoints.

## 1. Checkpoints

GPT-like SMoE and dense model checkpoints from [arXiv:2112.10684](https://arxiv.org/abs/2112.10684):

- `en_dense_lm_125m`: `"Phando/fairseq-dense-125m"`
- `en_moe_lm_15b`: `"Phando/fairseq-moe-15b"` / `"Phando/fairseq-moe-15b-bf16"`

## 2. Example usage

```python
from lightfs import FSGPTForCausalLM

# load `en_dense_lm_125m` from the 🤗 Hugging Face model hub
model = FSGPTForCausalLM.from_pretrained("Phando/fairseq-dense-125m")
```

```python
from lightfs import FSGPTMoEForCausalLM

# load `en_moe_lm_15b` from the 🤗 Hugging Face model hub, with 🤗 Accelerate model parallelism and bf16
model = FSGPTMoEForCausalLM.from_pretrained("Phando/fairseq-moe-15b-bf16", device_map="auto")
```
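
As a rough sketch of end-to-end generation (this assumes the checkpoint repo also hosts a compatible tokenizer and that `FSGPTForCausalLM` supports `generate()` like other `transformers` causal-LM classes; adjust as needed):

```python
from transformers import AutoTokenizer
from lightfs import FSGPTForCausalLM

# Assumption: a tokenizer is available in the same repo as the model weights.
tokenizer = AutoTokenizer.from_pretrained("Phando/fairseq-dense-125m")
model = FSGPTForCausalLM.from_pretrained("Phando/fairseq-dense-125m")

# Encode a prompt and sample a short continuation.
inputs = tokenizer("The Fairseq dense language model", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```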