Closed
Description
System Information
- Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): Tensorflow
- Framework Version: 1.12.0
- Python Version: 2.7
- CPU or GPU: CPU (ml.m5.xlarge)
- Python SDK Version:
- Are you using a custom image: No
Describe the problem
SageMaker TensorFlow training fails when the training data is passed to fit() as a dict of s3_input channels.
For example:
Given
s3_input_train = sagemaker.s3_input(
    s3_data='s3://my_bucket/path/to/prefix',
    content_type='csv',
    distribution='ShardedByS3Key')
the following calls fail because the entry point looks for training data in /opt/ml/input/data/training/:
estimator.fit({'train': s3_input_train})
(or)
# Assume s3_eval_train was also created similar to s3_input_train
estimator.fit({'train': s3_input_train, 'validation': s3_eval_train})
But this succeeds:
estimator.fit(s3_input_train)
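A possible explanation, sketched below under two assumptions that are not confirmed anywhere in this report: that SageMaker mounts each channel at /opt/ml/input/data/<channel_name>, and that the SDK names the channel 'training' when fit() receives a single input instead of a dict. The helper channel_dirs is hypothetical and only illustrates the mapping; it is not part of the SDK.

```python
import posixpath

# Assumption: every input channel is mounted under this directory in the container.
INPUT_DATA_ROOT = '/opt/ml/input/data'
# Assumption: channel name used when fit() is given a single input, not a dict.
DEFAULT_CHANNEL = 'training'


def channel_dirs(inputs, root=INPUT_DATA_ROOT):
    """Hypothetical helper: map the argument of fit() to container directories."""
    if isinstance(inputs, dict):
        # The dict keys are taken as the channel names.
        names = list(inputs)
    else:
        # A single input falls back to the assumed default channel name.
        names = [DEFAULT_CHANNEL]
    return {name: posixpath.join(root, name) for name in names}


# Dict form, as in the failing calls: the data would land under .../train
print(channel_dirs({'train': 's3://my_bucket/path/to/prefix'}))
# Single-input form, as in the succeeding call: the data would land under .../training
print(channel_dirs('s3://my_bucket/path/to/prefix'))
```

If those assumptions hold, a script that hardcodes /opt/ml/input/data/training/ would only find data when the channel is named 'training', which would match the behaviour described above.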
Minimal repro / logs
2019-03-12 15:12:29,570 ERROR - container_support.training - uncaught exception during training: [Errno 2] No such file or directory: '/opt/ml/input/data/training/'
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/container_support/training.py", line 36, in start
fw.train()
File "/usr/local/lib/python2.7/dist-packages/tf_container/train_entry_point.py", line 173, in train
train_wrapper.train()
File "/usr/local/lib/python2.7/dist-packages/tf_container/trainer.py", line 69, in train
estimator = self._build_estimator(run_config=run_config)
File "/usr/local/lib/python2.7/dist-packages/tf_container/trainer.py", line 92, in _build_estimator
return self.customer_script.estimator_fn(run_config, hyperparameters)
File "/opt/ml/code/tensorflow_entry_point.py", line 26, in estimator_fn
feature_columns = [tf.feature_column.numeric_column(INPUT_TENSOR_NAME, shape=get_shape())]
File "/opt/ml/code/tensorflow_entry_point.py", line 20, in get_shape
filename0 = os.listdir(training_dir)[0]
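For debugging, the bare os.listdir(training_dir)[0] call in the last frame could be swapped for something that reports which channel directories actually exist. The following is only a sketch: first_file is a hypothetical helper, not code from the entry point in this report, and it sorts the listing so that the result is deterministic, which the original call does not guarantee.

```python
import os


def first_file(channel_dir):
    """Return the first file name (sorted) in channel_dir, or raise a descriptive error."""
    if not os.path.isdir(channel_dir):
        # List the sibling directories so the error shows which channels were mounted.
        parent = os.path.dirname(channel_dir.rstrip('/'))
        available = sorted(os.listdir(parent)) if os.path.isdir(parent) else []
        raise IOError('channel directory %r not found; available channels: %r'
                      % (channel_dir, available))
    names = sorted(os.listdir(channel_dir))
    if not names:
        raise IOError('channel directory %r is empty' % channel_dir)
    return names[0]
```

Called as first_file('/opt/ml/input/data/training/') inside the container, this would still fail in the scenario above, but the message would name the directories that are present instead of only the missing one.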
- Exact command to reproduce:
s3_input_train = sagemaker.s3_input(
    s3_data='s3://my_bucket/path/to/prefix',
    content_type='csv',
    distribution='ShardedByS3Key')
estimator.fit({'train': s3_input_train})