
Provide SageMaker compatible docker container #147

Closed
jaheba opened this issue Jun 25, 2019 · 8 comments
Labels
enhancement New feature or request

Comments

@jaheba
Contributor

jaheba commented Jun 25, 2019

Description

We should think about creating a GluonTS container which can be used in Amazon SageMaker. It could behave similarly to SageMaker DeepAR w.r.t. data loading (train and test channels) and evaluation of models.

Having these containers could make it a lot easier for customers to try out GluonTS models or even use them in a production setting.

This would require two things:

  • implementing a “shell” which can interact with the SageMaker system (sketched below)
  • building and distributing the Docker images
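
A very rough sketch of what the training side of such a shell could look like, assuming the standard SageMaker container layout under /opt/ml and FileDataset for loading; everything beyond the path conventions is illustrative, not a proposed design:

```python
# Rough sketch only: a SageMaker-style training entry point for the shell.
# Paths follow the standard SageMaker training container layout.
import json
from pathlib import Path

from gluonts.dataset.common import FileDataset

INPUT_PATH = Path("/opt/ml/input")
MODEL_PATH = Path("/opt/ml/model")


def train() -> None:
    # SageMaker passes hyperparameters as a JSON object of strings.
    hyperparameters = json.loads(
        (INPUT_PATH / "config" / "hyperparameters.json").read_text()
    )
    freq = hyperparameters["freq"]

    # The "train" and "test" channels are mounted as sub-directories.
    train_data = FileDataset(INPUT_PATH / "data" / "train", freq=freq)
    test_data = FileDataset(INPUT_PATH / "data" / "test", freq=freq)

    # ... instantiate an Estimator from the hyperparameters, call
    # estimator.train(train_data), and evaluate the resulting predictor
    # on test_data, mirroring what SageMaker DeepAR does ...

    # Whatever is written to /opt/ml/model is uploaded as the model
    # artifact, e.g. predictor.serialize(MODEL_PATH).


if __name__ == "__main__":
    train()
```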

References

jaheba added the enhancement (New feature or request) label on Jun 25, 2019
@jaheba
Contributor Author

jaheba commented Jun 25, 2019

One thing I’m unsure about is how we dispatch between different models.

Also, do we want to support training of more than one model per run?

/cc @aalexandrov @lostella

@lostella
Contributor

> Also, do we want to support training of more than one model per run?

Not sure why we would want to do that: the way I see it, each job would be one training run and should produce one servable model artifact.

@jaheba
Contributor Author

jaheba commented Jun 26, 2019

Although I think there is a point to be made that it could be simpler to use, SageMaker follows the concept that one training configuration is one job.

@aalexandrov

> One thing I’m unsure about is how we dispatch between different models.

The approach used in #151 is to dispatch the Estimator based on an environment variable SWIST_ESTIMATOR_CLASS (should really be GLUONTS_ESTIMATOR_CLASS) or, if that variable is not set, through a special estimator_class hyperparameter.
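
A sketch of that dispatch, assuming the class is given as a dotted path and resolved with importlib; only the names GLUONTS_ESTIMATOR_CLASS and estimator_class come from the description above, the rest is illustrative:

```python
# Illustrative dispatch: prefer the environment variable, fall back to the
# estimator_class hyperparameter, then import the class dynamically.
import importlib
import os


def resolve_estimator_class(hyperparameters: dict) -> type:
    dotted_path = os.environ.get(
        "GLUONTS_ESTIMATOR_CLASS",
        hyperparameters.get("estimator_class"),
    )
    if dotted_path is None:
        raise ValueError("no estimator class configured")

    module_name, _, class_name = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


# e.g. resolve_estimator_class(
#     {"estimator_class": "gluonts.model.deepar.DeepAREstimator"}
# )
```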

@jaheba
Contributor Author

jaheba commented Jun 26, 2019

> through a special estimator_class hyperparameter

Which is not available during inference, except if we store it as part of the model. And that would imply that all models have to go through training. Not sure I like that.

@jaheba
Contributor Author

jaheba commented Jun 26, 2019

Another thought:

Since we have models which don't require offline training, we can make the shell accept Forecasters, where:

Forecaster = Union[Predictor, Estimator]

That way we would skip training if a Predictor was specified and just start with the test step.
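
A sketch of how the shell could branch on that union (class resolution as above; how a Predictor gets constructed is illustrative only):

```python
# Illustrative: train only if the configured forecaster is an Estimator;
# a Predictor needs no offline training and goes straight to the test step.
from typing import Type, Union

from gluonts.model.estimator import Estimator
from gluonts.model.predictor import Predictor

Forecaster = Union[Predictor, Estimator]


def run(forecaster_class: Type[Forecaster], hyperparameters: dict, train_data):
    if issubclass(forecaster_class, Estimator):
        estimator = forecaster_class.from_hyperparameters(**hyperparameters)
        predictor = estimator.train(train_data)
    else:
        # How a Predictor is built from hyperparameters is up to the shell;
        # plain construction is shown here only for illustration.
        predictor = forecaster_class(**hyperparameters)

    # ... continue with the test/evaluation step using `predictor` ...
    return predictor
```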

@lostella
Contributor

lostella commented Jun 26, 2019

> through a special estimator_class hyperparameter

> Which is not available during inference, except if we store it as part of the model. And that would imply that all models have to go through training. Not sure I like that.

@jaheba at inference time there should be no Estimator involved; the container should be given instructions on how to instantiate a Predictor. This will be either whatever type of Predictor with its own settings, or a GluonPredictor (either a SymbolBlockPredictor or a RepresentableBlockPredictor, see https://github.com/awslabs/gluon-ts/blob/master/src/gluonts/model/predictor.py#L170 and the lines that follow) which will be created using the network that was written to S3 after some training job.
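
For the serving side this could boil down to something like the following sketch, assuming the artifact produced during training is a serialized predictor unpacked under the usual SageMaker model directory:

```python
# Illustrative inference-side loading: SageMaker unpacks the model artifact
# under /opt/ml/model, from where the concrete Predictor (whatever its type)
# can be reconstructed and used to serve requests.
from pathlib import Path

from gluonts.model.predictor import Predictor

predictor = Predictor.deserialize(Path("/opt/ml/model"))

# The serving code would then call predictor.predict(...) on the time
# series contained in each inference request.
```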

@jaheba
Contributor Author

jaheba commented Jun 26, 2019

> the container should be given instructions on how to instantiate a Predictor

That was exactly my point. During training we can pass a hyperparameter to select the right Estimator/Predictor. But during inference we would depend on the generated model artifact, which not all models require.
