After the advent of BERT, many pretrained language models were introduced, and most of them opted for larger model sizes to achieve better performance. While large-scale pretrained models indeed ensure good performance, they come with the drawback of being challenging to use in typical computing environments. To address this issue, there has been a movement to build efficient pretrained models that can offer a certain level of performance.
There are broadly two ways to improve model efficiency: reducing the number of model parameters, and improving the attention mechanism. This project compares representative models of each approach against the baseline, BERT, and assesses their efficiency gains on real tasks. The lightweight-focused models are ALBERT, DistilBERT, and MobileBERT; the attention-focused models are Reformer, Longformer, and BigBird.
LightWeight Focused Models
- ALBERT
A Lite BERT
- DistilBERT
Distilled BERT
- MobileBERT
Attention Focused Models
- Reformer
- Longformer
- BigBird
LightWeight Focused Models
Model | Params | Size | Param Ratio (vs. BERT) |
---|---|---|---|
BERT | 109,482,240 | 417.649 MB | 100% |
ALBERT | 11,683,584 | 44.577 MB | 10.67% |
DistilBERT | 66,362,880 | 253.158 MB | 60.62% |
MobileBERT | 24,581,888 | 93.776 MB | 22.45% |
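The BERT parameter count in the table can be reproduced from the standard BERT-base configuration (30,522-token vocabulary, hidden size 768, 12 layers, FFN size 3072, 512 positions, 2 segment types). A minimal sketch in plain Python — the function name is illustrative, not part of this repo:

```python
def bert_param_count(vocab=30522, hidden=768, layers=12,
                     ffn=3072, max_pos=512, type_vocab=2):
    """Estimate a BERT-style encoder's parameter count from its config."""
    # Token, position, and segment embeddings, plus the embedding LayerNorm
    embeddings = (vocab + max_pos + type_vocab) * hidden + 2 * hidden
    # Self-attention: Q, K, V, and output projections (weights + biases)
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward network: up- and down-projections (weights + biases)
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Two LayerNorms per layer (gain + bias each)
    layer_norms = 2 * (2 * hidden)
    per_layer = attention + feed_forward + layer_norms
    # Pooler: one dense layer over the [CLS] hidden state
    pooler = hidden * hidden + hidden
    return embeddings + layers * per_layer + pooler

print(bert_param_count())  # 109482240 — matches the table above
```

ALBERT's much smaller count comes from factorized embeddings and cross-layer parameter sharing, which this per-layer formula does not model.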
Attention Focused Models
Model | Params | Size | Attention Type |
---|---|---|---|
BERT | 109,482,240 | 417.649 MB | Full Attention |
Reformer | 148,654,080 | 567.070 MB | LSH Attention |
Longformer | 148,659,456 | 567.091 MB | Sliding Window + Global Attention |
BigBird | 127,468,800 | 486.317 MB | Sparse Attention (Random + Window + Global) |
LightWeight Focused Models
 | BERT | ALBERT | DistilBERT | MobileBERT |
---|---|---|---|---|
CoLA Accuracy | - | - | - | - |
Training Speed per Batch | - | - | - | - |
Attention Focused Models
 | BERT | Reformer | Longformer | BigBird |
---|---|---|---|---|
IMDB Accuracy | - | - | - | - |
Training Speed per Batch | - | - | - | - |
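Training speed per batch in the tables above can be measured with a simple wall-clock timer around the training step. A minimal sketch — the helper name and warmup policy are illustrative, not from this repo:

```python
import time

def time_per_batch(step_fn, batches, warmup=2):
    """Average wall-clock seconds per batch, skipping warmup steps."""
    # Warmup steps absorb one-time costs (compilation, cache fills)
    for batch in batches[:warmup]:
        step_fn(batch)
    start = time.perf_counter()
    for batch in batches[warmup:]:
        step_fn(batch)
    return (time.perf_counter() - start) / max(len(batches) - warmup, 1)
```

For GPU training, remember to synchronize the device before reading the clock, otherwise asynchronous kernel launches make the measured time misleadingly small.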
```
python3 run.py -mode {lightweight,attention}
```
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
- MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
- Reformer: The Efficient Transformer
- Longformer: The Long-Document Transformer
- Big Bird: Transformers for Longer Sequences