Opaque error when specifying only bucket for TrainingInput #2776

@mszep

Describe the bug
This may not quite be a bug, but it took me a while to debug: it wasn't clear where in the training process the failure was occurring, and the error message "Invalid length for parameter Key, value: 0, valid min length: 1" gives no clue either.

Basically, when using a sagemaker.inputs.TrainingInput with s3_data_type="S3Prefix", the first argument needs to be a full S3 URI that includes both a bucket and a non-empty key. A bucket-only URI such as "s3://only-bucket" results in the error above.
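
To illustrate why the key ends up empty, here is a minimal sketch using only the standard library. It assumes the SDK splits the URI with urllib.parse.urlparse, which the parsed_uri.netloc and parsed_uri.path references in the traceback suggest. split_s3_uri is a hypothetical helper written for this illustration, not an SDK function.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Hypothetical helper: return (bucket, key) for an s3:// URI.

    The leading slash is stripped from the path, so a bucket-only URI
    yields an empty key.
    """
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")


print(split_s3_uri("s3://only-bucket"))             # ('only-bucket', '')
print(split_s3_uri("s3://only-bucket/data/train"))  # ('only-bucket', 'data/train')
```

An empty key is what S3 parameter validation rejects with "valid min length: 1".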

To reproduce
The following minimal script can be used to reproduce:

from sagemaker.tensorflow import TensorFlow
from sagemaker import get_execution_role
from sagemaker.inputs import TrainingInput

input_base_dir = TrainingInput(
    "s3://only-bucket",  # bucket only, no key after the bucket name
    s3_data_type="S3Prefix",
)

est = TensorFlow(
    py_version="py37",
    framework_version="2.4",
    entry_point="train_model.py",
    role=get_execution_role(),
    instance_count=1,
    instance_type="local",
)

est.fit({"input": input_base_dir})

Expected behavior
The TrainingInput will simply consist of all the objects in the bucket.
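
As a rough sketch of what "all the objects in the bucket" could mean in practice, listing with an empty prefix matches every key. This is not how the SDK currently behaves. iter_keys is a hypothetical helper, and s3_client is assumed to be a boto3 S3 client (for example boto3.client("s3")); get_paginator and list_objects_v2 are the boto3 calls used.

```python
def iter_keys(s3_client, bucket, prefix=""):
    """Hypothetical helper: yield every object key in `bucket` under `prefix`.

    With the default empty prefix, every object in the bucket is listed.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

Each key yielded this way is non-empty, so it could be passed to a per-object download without hitting the Key length validation.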

Screenshots or logs

Traceback (most recent call last):
  File "create_training_job.py", line 20, in <module>
    est.fit({"input": input_base_dir})
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/estimator.py", line 689, in fit
    self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/estimator.py", line 1468, in start_new
    estimator.sagemaker_session.train(**train_args)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/session.py", line 585, in train
    self.sagemaker_client.create_training_job(**train_request)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/local_session.py", line 186, in create_training_job
    training_job.start(InputDataConfig, OutputDataConfig, hyperparameters, TrainingJobName)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/entities.py", line 221, in start
    input_data_config, output_data_config, hyperparameters, job_name
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/image.py", line 202, in train
    data_dir, input_data_config, output_data_config, hyperparameters
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/image.py", line 489, in _prepare_training_volumes
    data_source = sagemaker.local.data.get_data_source_instance(uri, self.sagemaker_session)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/data.py", line 54, in get_data_source_instance
    return S3DataSource(parsed_uri.netloc, parsed_uri.path, sagemaker_session)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/local/data.py", line 185, in __init__
    sagemaker.utils.download_folder(bucket, prefix, working_dir, sagemaker_session)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/sagemaker/utils.py", line 277, in download_folder
    s3.Object(bucket_name, prefix).download_file(file_destination)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/boto3/s3/inject.py", line 315, in object_download_file
    ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/boto3/s3/inject.py", line 173, in download_file
    extra_args=ExtraArgs, callback=Callback)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/boto3/s3/transfer.py", line 315, in download_file
    future.result()
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/s3transfer/futures.py", line 265, in result
    raise self._exception
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/s3transfer/tasks.py", line 255, in _main
    self._submit(transfer_future=transfer_future, **kwargs)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/s3transfer/download.py", line 343, in _submit
    **transfer_future.meta.call_args.extra_args
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/botocore/client.py", line 391, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/botocore/client.py", line 692, in _make_api_call
    api_params, operation_model, context=request_context)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/botocore/client.py", line 740, in _convert_to_request_dict
    api_params, operation_model)
  File "/Users/szemark/.pyenv/versions/miniconda3-3.7-4.10.3/lib/python3.7/site-packages/botocore/validate.py", line 360, in serialize_to_request
    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid length for parameter Key, value: 0, valid min length: 1

System information
  • SageMaker Python SDK version: 2.69.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): TensorFlow
  • Framework version: 2.4
  • Python version: 3.7
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Additional context
There are two ways this could be improved, and I'm happy to contribute either one: (1) check for bucket-only S3 URIs and produce a more helpful error message; or (2) treat the absence of a key in the prefix as meaning that the TrainingInput should consist of all the objects in the bucket.
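
A minimal sketch of the first option, assuming the check would run before any S3 call is made. validate_s3_prefix_uri and its error wording are hypothetical and only show the shape of the check, not the SDK's actual code.

```python
from urllib.parse import urlparse


def validate_s3_prefix_uri(uri):
    """Hypothetical check: return (bucket, key) or raise a descriptive ValueError."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Expected an S3 URI of the form s3://bucket/key, got {uri!r}")
    key = parsed.path.lstrip("/")
    if not key:
        raise ValueError(
            f"S3 URI {uri!r} names a bucket but no key; "
            "add a key prefix after the bucket name, e.g. s3://bucket/prefix"
        )
    return parsed.netloc, key
```

With this in place, "s3://only-bucket" would fail early with a message that names the offending URI, instead of surfacing as a botocore ParamValidationError deep in the download path.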
