Description
Describe the bug
The batch transform API behaves inconsistently between SageMaker and local mode. When running on SageMaker, Transformer.transform() accepts the string "None" as a valid split_type value (as the documentation also specifies: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_TransformInput.html#sagemaker-Type-TransformInput-SplitType). Local mode accepts only the Python literal None; passing the string "None" raises a ValueError.
To reproduce
Given a Model object model suitable for local mode (tested with a custom inference container), the following call on a SageMaker instance:
transformer = model.transformer(
    instance_count=1,
    instance_type="ml.c4.xlarge",
    output_path=transform_output_path,
    accept="text/csv",
)
transformer.transform(
    data=transform_input_path,
    content_type="text/csv",
    split_type="None",  # how to split input into records; "None" passes as-is
)
works, but the same call in local mode:
transformer = model.transformer(
    instance_count=1,
    instance_type="local",
    output_path=transform_output_path,
    strategy="SingleRecord",
    accept="text/csv",
)
transformer.transform(
    data=transform_input_path,
    content_type="text/csv",
    split_type="None",  # how to split input into records; "None" passes as-is
)
does not: it fails in data.py with ValueError: Invalid Split Type: None
Expected behavior
The string "None" passed as the value of split_type should also work in local mode.
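Until local mode accepts the string, one possible caller-side workaround is to translate "None" to the Python literal before calling transform(). This is only a sketch: split_type_for is a hypothetical helper, not part of the SDK, and it assumes that local mode is selected by the instance types "local" or "local_gpu".

```python
from typing import Optional


def split_type_for(instance_type: str, split_type: Optional[str]) -> Optional[str]:
    """Hypothetical helper (not an SDK function).

    Maps the string "None" to the Python literal None when the
    transformer targets local mode, and leaves every other value alone.
    """
    # Assumption: these two instance types are the ones that select local mode.
    is_local = instance_type in ("local", "local_gpu")
    if is_local and split_type == "None":
        return None
    return split_type
```

With this helper, the failing call would pass split_type=split_type_for("local", "None") instead of the bare string.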
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.59.2.dev0 (as of commit ebc3b3e)
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
- Framework version: N/A
- Python version: 3.8.10
- CPU or GPU: CPU (local)
- Custom Docker image (Y/N): Y
Additional context
Proposed solution: Modify the check
if split_type is None:
to
if split_type == "None" or split_type is None:
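To illustrate the effect of the proposed condition, here is a minimal, self-contained sketch. The names get_splitter, NoneSplitter, and LineSplitter are stand-ins invented for this example and may not match the SDK's actual code in data.py; only the condition itself and the error message come from this report.

```python
from typing import Optional


class NoneSplitter:
    """Stand-in splitter: passes the input through as a single record."""


class LineSplitter:
    """Stand-in splitter: treats each line as a record."""


def get_splitter(split_type: Optional[str]):
    """Hypothetical dispatch function using the proposed condition."""
    # Proposed change: accept both the literal None and the string "None".
    if split_type == "None" or split_type is None:
        return NoneSplitter()
    if split_type == "Line":
        return LineSplitter()
    raise ValueError("Invalid Split Type: %s" % split_type)
```

Under this sketch, get_splitter("None") and get_splitter(None) select the same splitter, while an unknown value still raises the ValueError.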