Description
Describe the bug
This may not quite be a bug, but it took me a while to debug, because it wasn't immediately clear where in the training process the failure was occurring, and the error message "Invalid length for parameter Key, value: 0, valid min length: 1" provides no clues either.
Basically, when using a sagemaker.inputs.TrainingInput with an input of type S3Prefix, the first argument needs to be a full S3 URI that includes both a bucket and a non-empty key prefix; a bucket-only URI results in the above error.
To reproduce
The following minimal script can be used to reproduce:
from sagemaker.tensorflow import TensorFlow
from sagemaker import get_execution_role
from sagemaker.inputs import TrainingInput
input_base_dir = TrainingInput(
    "s3://only-bucket",
    s3_data_type="S3Prefix",
)
est = TensorFlow(
    py_version="py37",
    framework_version="2.4",
    entry_point="train_model.py",
    role=get_execution_role(),
    instance_count=1,
    instance_type="local",
)
est.fit({"input": input_base_dir})
Expected behavior
The TrainingInput will simply consist of all the objects in the bucket.
Screenshots or logs
Traceback (most recent call last):
File "create_training_job.py", line 20, in <module>
est.fit({"input": input_base_dir})
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/estimator.py", line 689, in fit
self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/estimator.py", line 1468, in start_new
estimator.sagemaker_session.train(**train_args)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/session.py", line 585, in train
self.sagemaker_client.create_training_job(**train_request)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/local_session.py", line 186, in create_training_job
training_job.start(InputDataConfig, OutputDataConfig, hyperparameters, TrainingJobName)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/entities.py", line 221, in start
input_data_config, output_data_config, hyperparameters, job_name
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/image.py", line 202, in train
data_dir, input_data_config, output_data_config, hyperparameters
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/image.py", line 489, in _prepare_training_volumes
data_source = sagemaker.local.data.get_data_source_instance(uri, self.sagemaker_session)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/data.py", line 54, in get_data_source_instance
return S3DataSource(parsed_uri.netloc, parsed_uri.path, sagemaker_session)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/data.py", line 185, in __init__
sagemaker.utils.download_folder(bucket, prefix, working_dir, sagemaker_session)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/utils.py", line 277, in download_folder
s3.Object(bucket_name, prefix).download_file(file_destination)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/boto3/s3/inject.py", line 315, in object_download_file
ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/boto3/s3/inject.py", line 173, in download_file
extra_args=ExtraArgs, callback=Callback)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/boto3/s3/transfer.py", line 315, in download_file
future.result()
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/s3transfer/futures.py", line 106, in result
return self._coordinator.result()
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/s3transfer/futures.py", line 265, in result
raise self._exception
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/s3transfer/tasks.py", line 255, in _main
self._submit(transfer_future=transfer_future, **kwargs)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/s3transfer/download.py", line 343, in _submit
**transfer_future.meta.call_args.extra_args
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/botocore/client.py", line 391, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/botocore/client.py", line 692, in _make_api_call
api_params, operation_model, context=request_context)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/botocore/client.py", line 740, in _convert_to_request_dict
api_params, operation_model)
File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/botocore/validate.py", line 360, in serialize_to_request
raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid length for parameter Key, value: 0, valid min length: 1
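Judging from the traceback, the URI is split into parsed_uri.netloc and parsed_uri.path, and the path ends up as the S3 object key. The following is only a minimal sketch of that splitting step using the standard library, not the SDK's actual code; the helper name bucket_and_key is made up for illustration, and the stripping of the leading slash is an assumption about how the key is derived.

```python
from urllib.parse import urlparse


def bucket_and_key(uri):
    """Hypothetical helper: split an S3 URI into (bucket, key).

    Assumption: the key is the URI path with its leading "/" removed.
    """
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")


# A bucket-only URI has an empty path, so the key is an empty string.
print(bucket_and_key("s3://only-bucket"))             # ('only-bucket', '')
print(bucket_and_key("s3://only-bucket/data/train"))  # ('only-bucket', 'data/train')
```

An empty key is what botocore's parameter validation rejects with "valid min length: 1", which would explain why the message mentions Key rather than the TrainingInput.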
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.69.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): TensorFlow
- Framework version: 2.4
- Python version: 3.7
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Additional context
There are two ways this could be improved (I'm happy to contribute either one): check for bucket-only S3 URIs and produce a more helpful error message, or assume that the absence of a key in the URI means the TrainingInput should consist of all the objects in the bucket.
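To make the first option concrete, here is a rough sketch of what such a check might look like. The function name check_s3_prefix_uri and the wording of the messages are hypothetical, and where the check would live in the SDK is left open; it only uses the standard library.

```python
from urllib.parse import urlparse


def check_s3_prefix_uri(uri):
    """Hypothetical validation: return (bucket, prefix) or raise ValueError.

    Rejects bucket-only URIs such as "s3://only-bucket" with a message
    that points at the actual problem.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(
            f"Expected an S3 URI of the form s3://bucket/prefix, got: {uri!r}"
        )
    prefix = parsed.path.lstrip("/")
    if not prefix:
        raise ValueError(
            f"S3 URI {uri!r} contains only a bucket; "
            "an S3Prefix input also needs a non-empty key prefix"
        )
    return parsed.netloc, prefix
```

With something like this called before the download starts, the reproduction script above would fail with a ValueError naming the offending URI instead of a ParamValidationError deep inside botocore. The second option would instead treat the empty prefix as "list everything in the bucket", which is a behavior change rather than just a better message.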