Support setting arguments of pretraining by a config file #22

Merged: 9 commits, Nov 28, 2021
4 changes: 4 additions & 0 deletions .gitignore
@@ -127,3 +127,7 @@ dmypy.json

# Pyre type checker
.pyre/

# PyCharm
.idea
.vscode
48 changes: 48 additions & 0 deletions configs/roberta-base.yaml
@@ -0,0 +1,48 @@
model:
  attention_probs_dropout_prob: 0.1
  bos_token_id: 0
  eos_token_id: 2
  hidden_act: "gelu"
  hidden_dropout_prob: 0.1
  hidden_size: 768
  initializer_range: 0.02
  intermediate_size: 3072
  layer_norm_eps: 1e-05
  max_position_embeddings: 512
  model_type: "roberta"
  num_attention_heads: 12
  num_hidden_layers: 12
  pad_token_id: 1
  type_vocab_size: 1
  vocab_size: 51200
data:
  data_dir: "datasets/roberta"
  mlm_probability: 0.15
training:
  output_dir: "checkpoints/roberta"
  overwrite_output_dir: False
  do_train: True
  do_eval: False
  per_device_train_batch_size: 8
  per_device_eval_batch_size: 8
  gradient_accumulation_steps: 4
  learning_rate: 1e-5
  weight_decay: 0.1
  adam_beta1: 0.9
  adam_beta2: 0.999
  adam_epsilon: 1e-08
  max_grad_norm: 1.0
  max_steps: 1000000
  lr_scheduler_type: "linear"
  warmup_steps: 10000
  logging_strategy: "steps"
  logging_steps: 500
  evaluation_strategy: "no"
  eval_steps: 0
  save_strategy: "steps"
  save_steps: 10000
  seed: 42
  fp16: False
  sharded_ddp: False
  deepspeed:
  gradient_checkpointing: False
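
The config groups model, data, and training arguments into separate sections. Below is a minimal sketch of how such a file could be consumed with PyYAML and mapped onto Hugging Face's `RobertaConfig` and `TrainingArguments`; the `load_config` helper and the exact mapping are illustrative assumptions, not necessarily the repository's actual loader.

```python
# Illustrative sketch (assumed helper, not the repository's actual code):
# read the YAML config above and split it into model / data / training parts.
import yaml
from transformers import RobertaConfig, TrainingArguments


def load_config(path: str):
    """Load the YAML file and build the Hugging Face config objects."""
    with open(path) as f:
        cfg = yaml.safe_load(f)

    model_cfg = dict(cfg["model"])
    model_cfg.pop("model_type", None)  # RobertaConfig already defines its model_type
    model_config = RobertaConfig(**model_cfg)

    # Keys under "training" mirror TrainingArguments fields, so they can be
    # passed straight through as keyword arguments.
    training_args = TrainingArguments(**cfg["training"])
    return model_config, cfg["data"], training_args


model_config, data_cfg, training_args = load_config("configs/roberta-base.yaml")
print(training_args.learning_rate, data_cfg["mlm_probability"])
```

Keeping the YAML keys aligned one-to-one with `TrainingArguments` fields avoids a separate argument-parsing layer: adding a new training option only requires adding a line to the config file.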
123 changes: 32 additions & 91 deletions poetry.lock

Some generated files are not rendered by default.