The structural and semantic pre-training language model of PGPSNet.
Figure 1. Pipeline of structural and semantic pre-training.
The environment settings are the same as those of PGPSNet.
You can download the dataset from the Dataset Homepage.
By default, unzip the dataset file into the folder ./datasets.
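The extraction step can also be scripted. The following is a minimal sketch using Python's standard zipfile module; the function name unzip_dataset and the archive name dataset.zip in the usage note are placeholders chosen for illustration, not names taken from this repository.

```python
import zipfile
from pathlib import Path


def unzip_dataset(archive_path, target_dir="./datasets"):
    """Extract a dataset zip archive into target_dir and return the member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create ./datasets if it is missing
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

For example, unzip_dataset("dataset.zip") would place the archive contents under ./datasets, assuming the downloaded file is a zip archive with that (hypothetical) name.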
The default parameter configurations are set in the config file ./config/config_default.py, and the default training modes are shown in ./sh_files/train.sh, for example:
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch \
--nproc_per_node=1 \
--master_port=$((RANDOM + 10000)) \
start.py
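If the run needs to be started from a Python script rather than the shell, the same command can be assembled programmatically. This is only a sketch under stated assumptions: build_launch_command is a hypothetical helper, not part of the repository, and it reproduces just the flags shown in the command above. The port expression mirrors $((RANDOM + 10000)), since Bash's RANDOM yields an integer from 0 to 32767.

```python
import os
import random
import sys


def build_launch_command(gpu_ids="0", nproc_per_node=1, script="start.py"):
    """Return (env, argv) mirroring the single-GPU shell command (hypothetical helper)."""
    # Restrict the visible GPUs, as CUDA_VISIBLE_DEVICES=0 does in the shell.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_ids)
    # Bash's $RANDOM lies in [0, 32767], so the port falls in [10000, 42767].
    master_port = random.randint(0, 32767) + 10000
    argv = [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}",
        f"--master_port={master_port}",
        script,
    ]
    return env, argv
```

Passing the result to subprocess.run(argv, env=env, check=True) would then start the run. Picking a random port makes it less likely that two jobs launched on the same machine collide on the same master port.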
The pre-training logs are saved in the folder ./log. We choose the model from the last epoch as the pre-trained language model.
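Selecting the last-epoch model can be automated once the checkpoint naming is known. The sketch below assumes, purely for illustration, that checkpoints are written somewhere under ./log with names such as epoch_12.pth; the actual file names depend on the training code, so the pattern argument should be adjusted to match them. The helper last_epoch_checkpoint is hypothetical.

```python
import re
from pathlib import Path


def last_epoch_checkpoint(log_dir="./log", pattern=r"epoch_(\d+)\.pth$"):
    """Return the checkpoint path with the highest epoch number, or None if none match."""
    regex = re.compile(pattern)  # assumed naming scheme, not confirmed by the repo
    best_epoch, best_path = -1, None
    for path in Path(log_dir).rglob("*"):
        match = regex.search(path.name)
        if path.is_file() and match:
            epoch = int(match.group(1))  # compare numerically, so 10 > 9
            if epoch > best_epoch:
                best_epoch, best_path = epoch, path
    return best_path
```

Comparing the epoch numbers as integers avoids the lexicographic pitfall in which epoch_9.pth would sort after epoch_10.pth.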