
Preloading models on Sagemaker multi-model endpoint doesn't work #1001

Open
sassarini-marco opened this issue Aug 5, 2022 · 0 comments

Hi,

I'm trying to load some models at SageMaker endpoint server startup so that they are already available when prediction requests arrive, skipping the loading phase on the first request.

I've configured MMS with the following parameters, according to the MMS documentation (a sketch of the resulting config.properties follows the list):

  • model_store = '/'
  • default_workers_per_model = 1
  • preload_model = 'true'
  • load_models = .. # the local path in the container where I store the model.
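
For reference, this is roughly what the config.properties I pass to the container looks like; the paths and the model directory name below are placeholders from my setup, not the real values:

```properties
# Sketch of the MMS config.properties used at container startup.
# Paths and the model directory name are placeholders.
model_store=/
default_workers_per_model=1
preload_model=true
# Local path inside the container where the decompressed model directory lives.
load_models=/opt/ml/model/my-model
```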

The model is a decompressed tar.gz archive produced by the SageMaker training job, plus a MAR-INF directory containing a MANIFEST.json with the model_name information.
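
To be concrete, the layout on disk is roughly the following (directory and file names are illustrative, not the real ones):

```text
/opt/ml/model/my-model/        <- placeholder path
├── model.pth                  <- artifacts from the SageMaker training job
└── MAR-INF/
    └── MANIFEST.json
```

and the MANIFEST.json is a minimal sketch of what I believe MMS expects (please correct me if the schema is different; only modelName is taken from my actual file):

```json
{
  "runtime": "python",
  "model": {
    "modelName": "my-model",
    "handler": "inference_handler:handle"
  }
}
```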

From the CloudWatch logs I can see that the model is loaded correctly on a worker thread, which then stops immediately after a scale-down call.

Below are some screenshots of the logs.

The configuration:
[screenshot: MMS configuration parameters logged at startup]

The load followed by the scale-down:
[screenshot: model loaded on a worker, then the worker stopped after a scale-down call]

I don't see any errors in the logs. What's going on? Is this a bug?

Best regards.
