Tensorflow error when using input of type sagemaker.session.s3_input #696

@rohitgmathews

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): TensorFlow
  • Framework Version: 1.12.0
  • Python Version: 2.7
  • CPU or GPU: CPU (ml.m5.xlarge)
  • Python SDK Version:
  • Are you using a custom image: No

Describe the problem

SageMaker TensorFlow training fails when the training data is passed to fit() as a dictionary of sagemaker.session.s3_input objects keyed by channel name.

For example:
Given

s3_input_train = sagemaker.s3_input(
            s3_data='s3://my_bucket/path/to/prefix',
            content_type='csv',
            distribution='ShardedByS3Key')

both of the following calls fail, because the training script looks for the training data in /opt/ml/input/data/training/ and that directory does not exist:

estimator.fit({'train': s3_input_train})
(or)
# Assume s3_eval_train was also created similar to s3_input_train
estimator.fit({'train': s3_input_train, 'validation': s3_eval_train})

But this succeeds:
estimator.fit(s3_input_train)
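
A possible explanation, offered as an assumption rather than a confirmed diagnosis: SageMaker mounts each input channel under /opt/ml/input/data/<channel_name>, and the Python SDK gives a bare input passed to fit() the default channel name 'training'. Under that assumption, the key 'train' would mount the data at /opt/ml/input/data/train, which is not where the entry point looks. The sketch below only models that mapping; channel_dirs is a hypothetical helper for illustration and is not part of the SageMaker SDK.

```python
import posixpath

# Assumed container layout: one directory per input channel.
INPUT_DATA_ROOT = "/opt/ml/input/data"
# Assumed default channel name used when fit() receives a bare input.
DEFAULT_CHANNEL = "training"


def channel_dirs(inputs):
    """Hypothetical helper: map fit() inputs to container directories.

    A dict is treated as {channel_name: input}; anything else is
    treated as a single input on the default channel.
    """
    if isinstance(inputs, dict):
        names = list(inputs)
    else:
        names = [DEFAULT_CHANNEL]
    return {name: posixpath.join(INPUT_DATA_ROOT, name) for name in names}


# Dict keyed by 'train': data would land in .../train, not .../training.
print(channel_dirs({"train": "s3://my_bucket/path/to/prefix"}))
# Bare input: data would land in .../training, matching the script.
print(channel_dirs("s3://my_bucket/path/to/prefix"))
```

If this assumption holds, naming the channel 'training' in the dictionary, for example estimator.fit({'training': s3_input_train}), should place the data where the script expects it.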

Minimal repro / logs

2019-03-12 15:12:29,570 ERROR - container_support.training - uncaught exception during training: [Errno 2] No such file or directory: '/opt/ml/input/data/training/'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/container_support/training.py", line 36, in start
    fw.train()
  File "/usr/local/lib/python2.7/dist-packages/tf_container/train_entry_point.py", line 173, in train
    train_wrapper.train()
  File "/usr/local/lib/python2.7/dist-packages/tf_container/trainer.py", line 69, in train
    estimator = self._build_estimator(run_config=run_config)
  File "/usr/local/lib/python2.7/dist-packages/tf_container/trainer.py", line 92, in _build_estimator
    return self.customer_script.estimator_fn(run_config, hyperparameters)
  File "/opt/ml/code/tensorflow_entry_point.py", line 26, in estimator_fn
    feature_columns = [tf.feature_column.numeric_column(INPUT_TENSOR_NAME, shape=get_shape())]
  File "/opt/ml/code/tensorflow_entry_point.py", line 20, in get_shape
    filename0 = os.listdir(training_dir)[0]
  • Exact command to reproduce:
s3_input_train = sagemaker.s3_input(
            s3_data='s3://my_bucket/path/to/prefix',
            content_type='csv',
            distribution='ShardedByS3Key')
estimator.fit({'train': s3_input_train})
