Missing dependencies in SKLearn based endpoint/Batch Transform job #975

@lihip

Description

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): SKLearn
  • Framework Version: sagemaker 1.36.1
  • Python Version: 3.6
  • CPU or GPU: CPU
  • Python SDK Version: sagemaker 1.36.1
  • Are you using a custom image: yes, using entry point script

Describe the problem

I created an SKLearn estimator with the dependencies parameter set to include a Python library on the instance (one that is not installed in the container by default). Training with the estimator worked (the dependency was installed), but creating an endpoint for that model with the deploy() method failed with an import error for that library. The same happens with Batch Transform jobs.
I would expect any dependency defined for the estimator to also be available in the endpoint container. In any case, I couldn't find another way to make it importable in the endpoint.

Minimal repro / logs

CloudWatch error:
sagemaker_containers._errors.ImportModuleError: No module named 'xgboost'
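
To narrow down which container lacks the library, a check like the one below could be called from the entry point script in both the training and the serving code paths. This is only a minimal sketch: `missing_modules` is a hypothetical helper written for illustration, not part of the sagemaker SDK or sagemaker_containers, and it assumes top-level module names such as `xgboost`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be located.

    Hypothetical diagnostic helper; it only looks modules up and does not import them.
    """
    missing = []
    for name in names:
        # find_spec returns None when no top-level module with this name is installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Printing at startup makes the result visible in the container logs
    print("missing modules:", missing_modules(["xgboost"]))
```

If the list is empty during training but contains the library when the endpoint starts, that would be consistent with the dependency only reaching the training container.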
