
Preloading models on Sagemaker multi-model endpoint doesn't work #1001

Open
sassarini-marco opened this issue Aug 5, 2022 · 0 comments

Hi,

I'm trying to load some models at SageMaker endpoint server startup so that they are already available when prediction requests arrive, skipping the loading phase on the first request.

I've configured MMS with the following parameters, according to the MMS documentation (a sketch of the resulting config.properties follows the list):

  • model_store = '/'
  • default_workers_per_model = 1
  • preload_model = 'true'
  • load_models = .. # the local path in the container where I store the model.
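
For reference, this is roughly what the config.properties I pass to the container looks like; the paths and the model directory name below are placeholders from my setup, not the real values:

```properties
# Sketch of the MMS config.properties used at container startup.
# Paths and the model directory name are placeholders.
model_store=/
default_workers_per_model=1
preload_model=true
# Local path inside the container where the decompressed model directory lives.
load_models=/opt/ml/model/my-model
```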

The model is a decompressed tar.gz archive produced by the SageMaker training job, plus a MAR-INF directory containing a MANIFEST.json with the model_name information.
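
To be concrete, the layout on disk is roughly the following (directory and file names are illustrative, not the real ones):

```text
/opt/ml/model/my-model/        <- placeholder path
├── model.pth                  <- artifacts from the SageMaker training job
└── MAR-INF/
    └── MANIFEST.json
```

and the MANIFEST.json is a minimal sketch of what I believe MMS expects (please correct me if the schema is different; only modelName is taken from my actual file):

```json
{
  "runtime": "python",
  "model": {
    "modelName": "my-model",
    "handler": "inference_handler:handle"
  }
}
```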

From the CloudWatch logs I can see that the model is loaded correctly on a worker thread, which then stops immediately after a scale-down call.

Below are some screenshots of the logs.

The configuration:
[screenshot: MMS configuration parameters logged at startup]

The load followed by the scale-down:
[screenshot: model loaded on a worker, then the worker stopped after a scale-down call]

I don't see any errors in the logs. What's going on? Is this a bug?

Best regards.
