Description
Describe the bug
I am looking for a Processor that lets me specify a requirements.txt per container/step and lets containers import custom code from scripts I write, in addition to the one passed as the code argument. sagemaker.processing.FrameworkProcessor looks like a good option for this.
However, when I use the source_dir argument (which is what lets me import custom code), step caching stops working.
To reproduce
Here is the code such that step caching does work:
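I define
import os

from sagemaker.processing import FrameworkProcessor, ProcessingOutput
from sagemaker.sklearn.estimator import SKLearn
from sagemaker.workflow.steps import CacheConfig, ProcessingStep

BASE_DIR = os.path.dirname(os.path.realpath(__file__))
and the processing step is defined as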
sklearn_processor = FrameworkProcessor(
    estimator_cls=SKLearn,
    framework_version="0.23-1",
    instance_type=processing_instance_type,
    instance_count=processing_instance_count,
    base_job_name=base_job_name,
    sagemaker_session=pipeline_session,
    role=role,
)
step_args = sklearn_processor.run(
    outputs=[
        ProcessingOutput(output_name="train", source="/opt/ml/processing/train"),
        ProcessingOutput(output_name="validation", source="/opt/ml/processing/validation"),
        ProcessingOutput(output_name="test", source="/opt/ml/processing/test"),
    ],
    code=os.path.join(BASE_DIR, "preprocess.py"),
    arguments=[
        '--input-data', input_data,
        '--random-seed', random_seed,
        '--train-fraction', train_fraction,
        '--validation-fraction', validation_fraction,
    ],
)
step_process = ProcessingStep(
    name=step_name,
    step_args=step_args,
    cache_config=CacheConfig(enable_caching=True, expire_after="T12h"),
)
This works as expected: the first run takes about 5 minutes, and later runs take only a couple of seconds.
If I switch to
    code="preprocess.py",
    source_dir=BASE_DIR,
caching no longer works and it takes 5 minutes to execute the pipeline every time.
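One way to narrow this down might be to compare the pipeline definitions generated for two consecutive runs. Step caching only reuses an earlier result when the step's arguments match, so any argument that changes between runs would explain a cache miss. The sketch below is plain Python and makes no SageMaker calls: it assumes the JSON string returned by pipeline.definition() has been saved for each run, and the helper names diff_paths and diff_definitions, along with the sample fragments and the CodeUri key, are hypothetical and used only for illustration.

```python
import json
from typing import Any, List


def diff_paths(a: Any, b: Any, path: str = "") -> List[str]:
    """Return the JSON paths at which two parsed definitions differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs: List[str] = []
        for key in sorted(set(a) | set(b)):
            child = f"{path}.{key}" if path else key
            if key not in a or key not in b:
                # Key present on one side only.
                diffs.append(child)
            else:
                diffs.extend(diff_paths(a[key], b[key], child))
        return diffs
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        diffs = []
        for i, (x, y) in enumerate(zip(a, b)):
            diffs.extend(diff_paths(x, y, f"{path}[{i}]"))
        return diffs
    # Scalars, mismatched types, or lists of different length.
    return [] if a == b else [path]


def diff_definitions(first_json: str, second_json: str) -> List[str]:
    """Compare two pipeline-definition JSON strings."""
    return diff_paths(json.loads(first_json), json.loads(second_json))


if __name__ == "__main__":
    # Hypothetical, hand-written fragments; NOT output from a real pipeline.
    run_1 = '{"Steps": [{"Name": "Preprocess", "Arguments": {"CodeUri": "s3://bucket/job-001/sourcedir.tar.gz"}}]}'
    run_2 = '{"Steps": [{"Name": "Preprocess", "Arguments": {"CodeUri": "s3://bucket/job-002/sourcedir.tar.gz"}}]}'
    for changed in diff_definitions(run_1, run_2):
        print(changed)
```

If a path such as an S3 location for the packaged source_dir showed up in the diff, that would suggest the uploaded code location changes on every run. I have not confirmed that this is the cause here.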
Note: if I instead switch to
    code=os.path.join(BASE_DIR, "preprocess.py"),
    source_dir=BASE_DIR,
System information
A description of your system. Please provide:
- SageMaker Python SDK version: sagemaker-2.196.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans):
- Framework version:
- Python version: 3.11
- CPU or GPU:
- Custom Docker image (Y/N):
