OPT and causal language modeling

This example covers fine-tuning (or training from scratch) the library models for language modeling on a text dataset with OPT. Models such as OPT are trained or fine-tuned using a causal language modeling (CLM) loss.

The following example fine-tunes Facebook's OPT on WikiText-2. We use the raw WikiText-2 dataset (no tokens were replaced before tokenization). The loss is that of causal language modeling.
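To make the CLM objective concrete, the sketch below shows how the loss is computed with the Hugging Face transformers library. The facebook/opt-125m checkpoint and the sample sentence are illustrative assumptions only; the actual training loop lives in the script invoked below.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint; the script below lets you pick the model size.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

text = "The loss of a causal language model is next-token cross-entropy."
inputs = tokenizer(text, return_tensors="pt")

# For CLM the labels are the input ids themselves; the model shifts them
# internally so that each position predicts the following token.
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)  # average next-token cross-entropy for this sample

A training step would then back-propagate this loss and apply an optimizer update; the prepared script takes care of that loop along with the multi-GPU setup.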

This training script is adapted from the Hugging Face language modeling examples.

You can launch training with the prepared bash script; a concrete example invocation follows the argument list below.

bash ./run_clm.sh <batch-size-per-gpu> <mem-cap> <model> <gpu-num>
  • batch-size-per-gpu: number of samples fed to each GPU, default is 16
  • mem-cap: cap GPU memory usage at this value in GB; the default is 0 (no limit)
  • model: the size of the OPT model, default is 6.7b. Acceptable values are 125m, 350m, 1.3b, 2.7b, 6.7b, 13b, 30b, 66b.
  • gpu-num: the number of GPUs to use, default is 1.
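For example, to fine-tune the 6.7b model on 4 GPUs with a per-GPU batch size of 16 and no memory cap (values shown are illustrative, not recommendations):

bash ./run_clm.sh 16 0 6.7b 4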
