01_getting_started_pytorch notebook - Hyperparameters sent by the client aren't passed to the Training Arguments #67

hellosamstuart · 2021-08-04T15:14:00Z

Hello!

Reopening an issue connected to this thread here:

Hyperparameters sent by the client aren't passed to the Training Arguments #52

Thank you HuggingFace team for all you do! I was working from this notebook this summer when I noticed the gap described below. PS: this is my first GitHub issue, so any feedback is welcome.

Description (same as #52)

The hyperparameters sent by the client have underscores in their names (e.g. output_data_dir), whereas the argument parser in train.py declares them with hyphens (e.g. output-data-dir). The names never match, so the values sent by the client are not propagated through the train.py file and the defaults are used instead.
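The mismatch can be reproduced locally with plain argparse. This is a minimal sketch, not the notebook's code: the helper `build_parser`, the placeholder default, and the example path are hypothetical, and it assumes the script reads its arguments with `parse_known_args`, which skips unrecognized flags instead of raising an error.

```python
import argparse


def build_parser(flag: str) -> argparse.ArgumentParser:
    """Build a parser with a single string option (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    # Placeholder default standing in for the environment-variable default.
    parser.add_argument(flag, type=str, default="/default/from/env")
    return parser


# What the client effectively sends: the underscore spelling (example value).
client_argv = ["--output_data_dir", "/opt/ml/custom-output"]

# Parser declared with hyphens: the flag is not recognized, the default wins.
hyphen_args, hyphen_unknown = build_parser("--output-data-dir").parse_known_args(client_argv)

# Parser declared with underscores: the client's value is picked up.
underscore_args, underscore_unknown = build_parser("--output_data_dir").parse_known_args(client_argv)

print(hyphen_args.output_data_dir)      # the default, not the client's value
print(hyphen_unknown)                   # the skipped flag and its value
print(underscore_args.output_data_dir)  # the client's value
```

Both parsers store the value under the attribute `output_data_dir`, because argparse converts hyphens to underscores when it derives the attribute name. Only the flag spelling on the command line differs, which is why the problem is easy to miss.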

Why another issue?

Recent commits resolved the above-linked issue in most of the notebooks. However, the commit that fixes it for PyTorch covers only half of the affected arguments: it fixed train-batch-size and eval-batch-size, but output-data-dir and model-dir still need to be fixed.

Relevant Commits

9df51d5 (tensorflow + others)
c3fa5b5 (half the fix for pytorch)

Files

I have tested the solution on these files:

notebooks/sagemaker/01_getting_started_pytorch/sagemaker-notebook.ipynb
notebooks/sagemaker/01_getting_started_pytorch/scripts/train.py

Solution

In the train.py file, replace these lines:

parser.add_argument("--output-data-dir", type=str, default=os.environ["SM_OUTPUT_DATA_DIR"])
parser.add_argument("--model-dir", type=str, default=os.environ["SM_MODEL_DIR"])

with these:

parser.add_argument("--output_data_dir", type=str, default=os.environ["SM_OUTPUT_DATA_DIR"])
parser.add_argument("--model_dir", type=str, default=os.environ["SM_MODEL_DIR"])
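The corrected lines can be checked outside a training job with a short script. This is a sketch under stated assumptions: the wrapper `parse_args`, the placeholder environment values, and the example path are hypothetical, and `parse_known_args` is assumed as the parsing call.

```python
import argparse
import os

# Outside a SageMaker container these variables are usually unset, so
# placeholder values are supplied here for a local run (assumption).
os.environ.setdefault("SM_OUTPUT_DATA_DIR", "/tmp/output")
os.environ.setdefault("SM_MODEL_DIR", "/tmp/model")


def parse_args(argv):
    """Parse only the two arguments touched by the fix (hypothetical wrapper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--output_data_dir", type=str, default=os.environ["SM_OUTPUT_DATA_DIR"])
    parser.add_argument("--model_dir", type=str, default=os.environ["SM_MODEL_DIR"])
    args, _ = parser.parse_known_args(argv)
    return args


# A value sent with the underscore spelling now overrides the default.
overridden = parse_args(["--model_dir", "/opt/ml/custom-model"])
# With no arguments, the environment-variable defaults still apply.
defaults = parse_args([])

print(overridden.model_dir)
print(defaults.model_dir)
```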


hellosamstuart · 2021-08-04T16:17:40Z

Hi @philschmid! I got a tip from my colleague who filed issue #52 that tagging you here would be best for SageMaker issues. Please see my note above about a recent commit that left part of the issue unresolved. Thanks!

philschmid · 2021-08-04T16:59:28Z

Hey @hellosamstuart,
Thank you for finding those! Should be merged and good to use.

You must have spent way too much time creating this issue. It looks so well structured. BTW, you can always open a PR with the fix too.

philschmid mentioned this issue Aug 4, 2021

changed - _ #68

Merged

philschmid closed this as completed Aug 4, 2021
