PGPS-Pretraining

This repository contains the structural and semantic pre-training of the language model used in PGPSNet.

Figure 1. Pipeline of structural and semantic pre-training.

Environmental Settings

The environment settings are the same as those of PGPSNet.

PGPS9K Dataset

You can download the dataset from the Dataset Homepage.

By default, unzip the dataset file into the folder ./datasets.
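As a rough sketch of the step above, the commands below extract a downloaded archive into ./datasets. The archive name `PGPS9K.zip` is an assumption made for illustration and is not given in this README; substitute the name of the file you actually downloaded.

```shell
# Hypothetical archive name; replace with the file you downloaded.
ARCHIVE="${ARCHIVE:-PGPS9K.zip}"
DEST="./datasets"

# Create the target folder if it does not exist yet.
mkdir -p "$DEST"

if [ -f "$ARCHIVE" ]; then
    # -q keeps the output short; -d selects the extraction folder.
    unzip -q "$ARCHIVE" -d "$DEST"
else
    echo "Archive $ARCHIVE not found; download it first." >&2
fi

# List what ended up in the dataset folder.
ls "$DEST"
```

If the archive already contains a top-level folder, check the listing and move its contents so that the layout matches what the config file expects.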

Pre-training

The default parameter configurations are set in the config file ./config/config_default.py, and the default training modes are shown in ./sh_files/train.sh. For example:

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch \
--nproc_per_node=1 \
--master_port=$((RANDOM + 10000)) \
start.py
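The command above starts a single process on GPU 0. The sketch below shows how the same flags might be assembled for several GPUs. It assumes that start.py supports more than one process, which this README does not state, and that GPU ids 0 to N-1 are free. `NUM_GPUS` is a hypothetical helper variable. The script only prints the command (a dry run) so that it can be inspected first.

```shell
# Hypothetical helper variable: number of GPUs / processes to launch.
NUM_GPUS="${NUM_GPUS:-2}"

# Build a comma-separated device list "0,1,...,N-1".
DEVICES=""
i=0
while [ "$i" -lt "$NUM_GPUS" ]; do
    DEVICES="${DEVICES:+$DEVICES,}$i"
    i=$((i + 1))
done

# Same port choice as the single-GPU example.
MASTER_PORT=$((RANDOM + 10000))

# Assemble the launch command; one process per visible GPU.
CMD="CUDA_VISIBLE_DEVICES=$DEVICES python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS --master_port=$MASTER_PORT start.py"

# Dry run: print the command instead of executing it.
echo "$CMD"
```

Once the printed command looks right, it can be run by pasting it into the shell. Batch size and learning rate settings in the config file may need adjusting when the number of processes changes.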

The training records of pre-training are saved in the folder ./log. We choose the model from the last epoch as the pre-trained language model.
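One way to locate that model after training is sketched below. This README does not state the checkpoint file name or format, so the `.pth` extension and the assumption that checkpoints sit directly inside the given folder are both guesses. The newest file by modification time is used as a stand-in for "last epoch". `latest_ckpt` is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper: print the most recently modified *.pth file in a folder.
# Assumption: checkpoints use the .pth extension and are not nested deeper.
latest_ckpt() {
    # ls -t sorts by modification time, newest first.
    ls -t "$1"/*.pth 2>/dev/null | head -n 1
}

LOG_DIR="${LOG_DIR:-./log}"
CKPT=$(latest_ckpt "$LOG_DIR")

if [ -n "$CKPT" ]; then
    echo "Latest checkpoint: $CKPT"
else
    echo "No .pth files found in $LOG_DIR" >&2
fi
```

If training writes each run into its own subfolder of ./log, point `LOG_DIR` at that subfolder instead.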
