OPT and causal language modeling

This example covers fine-tuning (or training from scratch) the library models for language modeling on a text dataset with OPT. Models such as OPT are trained or fine-tuned using a causal language modeling (CLM) loss.

The following example fine-tunes Facebook's OPT on WikiText-2. We use the raw WikiText-2 dataset (no tokens were replaced before tokenization). The loss is that of causal language modeling.
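To make the CLM objective concrete, the sketch below shows how the loss is computed with the Hugging Face transformers library. The facebook/opt-125m checkpoint and the sample sentence are illustrative assumptions only; the actual training loop lives in the script invoked below.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint; the script below lets you pick the model size.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

text = "The loss of a causal language model is next-token cross-entropy."
inputs = tokenizer(text, return_tensors="pt")

# For CLM the labels are the input ids themselves; the model shifts them
# internally so that each position predicts the following token.
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)  # average next-token cross-entropy for this sample

A training step would then back-propagate this loss and apply an optimizer update; the prepared script takes care of that loop along with the multi-GPU setup.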

This training script is adapted from the Hugging Face language modeling examples.

You can launch training with the prepared bash script; a concrete example invocation follows the argument list below.

bash ./run_clm.sh <batch-size-per-gpu> <mem-cap> <model> <gpu-num>
  • batch-size-per-gpu: number of samples fed to each GPU, default is 16
  • mem-cap: cap GPU memory usage at this value in GB; the default is 0 (no limit)
  • model: the size of the OPT model, default is 6.7b. Acceptable values are 125m, 350m, 1.3b, 2.7b, 6.7b, 13b, 30b, 66b.
  • gpu-num: the number of GPUs to use, default is 1.
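For example, to fine-tune the 6.7b model on 4 GPUs with a per-GPU batch size of 16 and no memory cap (values shown are illustrative, not recommendations):

bash ./run_clm.sh 16 0 6.7b 4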
