Local mode batch transform does not support split_type = "None" #2632

@tvoipio

Describe the bug

The batch transform API is inconsistent between SageMaker and local mode. When running on SageMaker, Transformer.transform() accepts the string "None" as a valid split_type value (as specified in the API documentation: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_TransformInput.html#sagemaker-Type-TransformInput-SplitType). Local mode accepts only the Python literal None; passing the string "None" raises a ValueError.

To reproduce

Given a Model object model suitable for local mode (tested with custom inference container):

transformer = model.transformer(
    instance_count=1,
    instance_type="ml.c4.xlarge",
    output_path=transform_output_path,
    accept="text/csv",
)

transformer.transform(
    data=transform_input_path,
    content_type="text/csv",
    split_type="None",  # how to split input into records; "None" passes as-is
)

works, but

transformer = model.transformer(
    instance_count=1,
    instance_type="local",
    output_path=transform_output_path,
    strategy="SingleRecord",
    accept="text/csv",
)

transformer.transform(
    data=transform_input_path,
    content_type="text/csv",
    split_type="None",  # how to split input into records; "None" passes as-is
)

does not: it fails in data.py with ValueError: Invalid Split Type: None
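
Until local mode accepts the string, one possible caller-side workaround is to translate "None" to the Python literal before calling transform(). This is only a sketch: local_safe_split_type is a hypothetical helper, not part of the SDK, and it assumes local mode is selected with the instance types "local" or "local_gpu".

```python
from typing import Optional

# Assumption: these are the instance type strings that select local mode.
LOCAL_INSTANCE_TYPES = ("local", "local_gpu")


def local_safe_split_type(instance_type: str, split_type: Optional[str]) -> Optional[str]:
    """Hypothetical helper: map the string "None" to the literal None in local mode.

    For any other instance type, or any other split_type, the value is
    returned unchanged.
    """
    if instance_type in LOCAL_INSTANCE_TYPES and split_type == "None":
        return None
    return split_type
```

The result would then be passed as split_type=local_safe_split_type(instance_type, "None") in the transform() call above.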

Expected behavior

Passing the string "None" as the value of split_type should also work in local mode.

System information

  • SageMaker Python SDK version: 2.59.2.dev0 (as of commit ebc3b3e)
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
  • Framework version: N/A
  • Python version: 3.8.10
  • CPU or GPU: CPU (local)
  • Custom Docker image (Y/N): Y

Additional context

Proposed solution: in data.py, change the check

if split_type is None:

to

if split_type == "None" or split_type is None:
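
The proposed check could be sketched in isolation as follows. This is not the SDK's actual code: is_no_split and describe_split are hypothetical names, and the "Line" and "RecordIO" branches are only illustrative stand-ins for whatever the real lookup in data.py returns.

```python
from typing import Optional


def is_no_split(split_type: Optional[str]) -> bool:
    """True when the input should be passed through without splitting.

    Accepts both the Python literal None and the API string "None".
    """
    return split_type is None or split_type == "None"


def describe_split(split_type: Optional[str]) -> str:
    """Hypothetical stand-in for the splitter lookup in data.py."""
    if is_no_split(split_type):
        return "no-split"
    if split_type == "Line":
        return "line"
    if split_type == "RecordIO":
        return "recordio"
    # Any other value is still rejected, as in the reported error.
    raise ValueError("Invalid Split Type: %s" % split_type)
```

With this check, describe_split("None") and describe_split(None) take the same branch, while unknown values still raise ValueError.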
