Running the training script ignores the parameters passed in --config_overrides.
Environment info
transformers version: 4.10.2
Platform: Linux-5.4.0-84-generic-x86_64-with-Ubuntu-20.04-focal
Python version: 3.7.12
PyTorch version (GPU?): 1.9.0+cu111 (True)
Tensorflow version (GPU?): 2.7.0 (True)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using GPU in script?:
Using distributed or parallel set-up in script?:
Who can help
Information
Using the following to train GPT-2 from scratch:
The --config_overrides doesn't appear to take effect:
Starting the training produces the following output:
To reproduce
Steps to reproduce the behavior:
Expected behavior
I expected the --config_overrides string to override the training parameters. The documentation suggests that it is possible to specify something like --model_type="gpt2-medium", but doing so produces an error such as no such model.
Is there an alternative way to specify a medium or large GPT-2 model?
Thanks.
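For reference, one possible way to describe a larger model is to keep the base gpt2 model type and pass the size through --config_overrides, since gpt2-medium appears to be a checkpoint name rather than a model type. The sketch below is a minimal illustration and not the training script's own code. It assumes that --config_overrides takes comma-separated key=value pairs, and the helper names `build_config_overrides` and `parse_config_overrides` are hypothetical. The layer, embedding, and head counts are those of the published GPT-2 sizes.

```python
# Hyperparameters of the published GPT-2 sizes:
# number of layers, embedding width, number of attention heads.
GPT2_SIZES = {
    "gpt2": {"n_layer": 12, "n_embd": 768, "n_head": 12},
    "gpt2-medium": {"n_layer": 24, "n_embd": 1024, "n_head": 16},
    "gpt2-large": {"n_layer": 36, "n_embd": 1280, "n_head": 20},
}


def build_config_overrides(size: str) -> str:
    """Hypothetical helper: build a 'k1=v1,k2=v2' override string for a size."""
    params = GPT2_SIZES[size]
    return ",".join(f"{key}={value}" for key, value in params.items())


def parse_config_overrides(overrides: str) -> dict:
    """Hypothetical helper: parse 'k1=v1,k2=v2' back into a dict of ints."""
    pairs = (pair.split("=") for pair in overrides.split(","))
    return {key: int(value) for key, value in pairs}


if __name__ == "__main__":
    # Assumed format: comma-separated key=value pairs.
    print(build_config_overrides("gpt2-medium"))
```

Under these assumptions, the printed string would be passed as --model_type gpt2 --config_overrides "n_layer=24,n_embd=1024,n_head=16". Whether the script in this version applies it is the question raised above.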