Training job not saved to S3 despite providing S3 output location due to no model artifact saved under path /opt/ml/model #3455

@yshen92

Describe the bug
After training completes, the model is not saved to S3, even though an output S3 URI was specified for the Estimator.
From the CloudWatch log:
2022-11-04 03:58:58,042 sagemaker_tensorflow_container.training WARNING No model artifact is saved under path /opt/ml/model. Your training job will not save any model files to S3.
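
For anyone hitting the same warning, a minimal sketch of a check that could be run at the end of a training script to see whether anything was written to the model directory. This is not the container's actual check. It assumes the `SM_MODEL_DIR` environment variable set by the SageMaker training toolkit, falling back to the `/opt/ml/model` path from the log; the helper names `resolve_model_dir` and `has_model_artifacts` are hypothetical.

```python
import os
from pathlib import Path


def resolve_model_dir() -> Path:
    """Return the directory whose contents are expected to be uploaded to S3.

    Assumption: SM_MODEL_DIR is set inside the training container; otherwise
    fall back to the default path shown in the CloudWatch warning.
    """
    return Path(os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))


def has_model_artifacts(model_dir: Path) -> bool:
    """Return True if at least one regular file exists anywhere under model_dir."""
    if not model_dir.is_dir():
        return False
    # Empty subdirectories do not count as artifacts; only files do.
    return any(p.is_file() for p in model_dir.rglob("*"))
```

Calling `has_model_artifacts(resolve_model_dir())` after the model save step and logging or raising when it returns `False` would surface the problem in the training script itself instead of only in the container's warning.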

To reproduce
Tested on the Amazon_JumpStart_Text_Classification notebook, from section "4. Finetune the pre-trained model on a custom dataset" onwards.

Expected behavior
After HPO, the model is saved to the specified output S3 URI.

Screenshots or logs
hpo_text_classifier.log

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.116.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): Jumpstart model
  • Framework version: *
  • Python version: 3.10
  • CPU or GPU: GPU
  • Custom Docker image (Y/N): -
