TypeError: get() takes no keyword arguments - breaks training jobs #418
Comments
This is my command to start the training job:
estimator = PyTorch(
entry_point="train_deploy.py",
source_dir="code_chesterton",
role=role,
framework_version="1.5",
py_version="py3",
instance_count=2,  # this script only supports distributed training on GPU instances.
instance_type="ml.p3.8xlarge",
debugger_hook_config=False,
)
estimator.fit({"training": inputs_train, "validation": inputs_valid})
In the test script, the following tokenizer call, when invoked while mapping the dataset, changes the type of 'os.environ' from 'os._Environ' to 'dict':
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, use_fast=True)
This causes the later get() call to fail, because the get() method of the 'dict' class does not accept 'default' as a keyword argument.
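For anyone who wants to see the mechanism in isolation, here is a minimal sketch in plain Python. It does not use transformers or smdebug. The variable name HYPOTHETICAL_LOG_LEVEL is made up for illustration, and the keyword-style call is only assumed to resemble what the failing code does.

```python
import os

# os.environ is an os._Environ (a MutableMapping). Its get() is the
# pure-Python Mapping.get(key, default=None), so the keyword form works.
# HYPOTHETICAL_LOG_LEVEL is an illustrative name, assumed to be unset.
level = os.environ.get("HYPOTHETICAL_LOG_LEVEL", default="info")

# A plain dict copy uses the built-in dict.get, which only accepts its
# arguments positionally and rejects default=...
plain_env = dict(os.environ)
try:
    plain_env.get("HYPOTHETICAL_LOG_LEVEL", default="info")
    keyword_rejected = False
    error_message = ""
except TypeError as exc:
    keyword_rejected = True
    error_message = str(exc)

# Passing the default positionally works on both types.
safe_level = plain_env.get("HYPOTHETICAL_LOG_LEVEL", "info")

print(type(os.environ).__name__, level)
print(type(plain_env).__name__, keyword_rejected, error_message)
```

So any code that calls os.environ.get(..., default=...) works until something replaces os.environ with a plain dict, and then it raises the TypeError in the issue title.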
IMO we should file an issue with the transformers package.
We have filed an issue here: huggingface/datasets#2115
I have run into this issue recently. I use the HuggingFace container because I found it supported on SageMaker.
estimator = HuggingFace(
entry_point='train.py',
role=role,
instance_type='ml.p3.2xlarge',
instance_count=1,
transformers_version='4.4.2',
pytorch_version='1.6.0',
py_version='py36'
)
Later I found that this issue is solved in the newest version of the container (thanks to the contributors):
estimator = HuggingFace(
entry_point='train.py',
role=role,
instance_type='ml.p3.2xlarge',
instance_count=1,
transformers_version='4.11.0',
pytorch_version='1.9.0',
py_version='py38'
)
I have been fine-tuning distilbert from the HuggingFace Transformers project. When calling trainer.train(), somewhere smdebug tries to call os.environ.get() and I get the above error. There are no other messages.
It affects this line:
/smdebug/core/logger.py", line 51, in get_logger
whether or not I set debugger_hook_config=False.
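To narrow down which step swaps out os.environ, a small diagnostic along these lines might help. It is only a sketch: run_and_check and environ_type_name are hypothetical helper names, not part of smdebug, transformers, or datasets.

```python
import os


def environ_type_name() -> str:
    """Return the class name of the object currently bound to os.environ."""
    return type(os.environ).__name__


def run_and_check(step, label):
    """Run step() and report whether the type of os.environ changed.

    Hypothetical helper for debugging only. Returns (result, before, after).
    """
    before = environ_type_name()
    result = step()
    after = environ_type_name()
    if before != after:
        print(f"{label}: os.environ changed from {before} to {after}")
    return result, before, after
```

Wrapping the suspect calls one at a time (for example the dataset mapping step, then trainer.train()) should show where "_Environ" turns into "dict", assuming the cause is the one described in the comments.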