Description
I am testing the deployment of a PyTorch inference model trained elsewhere using the SageMaker Python SDK, roughly like this; for now I am trying to test everything locally on my own machine:
from sagemaker.local import LocalSession
from sagemaker.pytorch import PyTorchModel
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

# local session so everything runs in a container on my own machine
sess = LocalSession()
sess.config = {'local': {'local_code': True}}

model = PyTorchModel(
    model_data="model1.tar.gz",
    role=MYROLE,
    framework_version="1.8.1",
    py_version="py3",
    entry_point="inference.py",
)
model.sagemaker_session = sess

predictor = model.deploy(
    instance_type="local",
    initial_instance_count=1,
    deserializer=JSONDeserializer(),
    serializer=JSONSerializer(),
)
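For completeness, this is roughly how I then exercise and tear down the local endpoint (the payload shape is just a placeholder for my actual input):

result = predictor.predict({"inputs": [[0.1, 0.2, 0.3]]})
print(result)

# stop the local serving container again
predictor.delete_endpoint()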
The model.deploy call creates two temporary directories: one for the docker-compose command and its config file, and one that gets mounted into the container as /opt/ml/model.
However, some things happen here that I find odd, incorrect, and especially annoying for large models:
- the tmp directory that gets mounted contains the extracted content of the model1.tar.gz file I provided PLUS the original model1.tar.gz file itself
- when the container initializes, it internally uses torch-model-archiver to create the file /sagemaker/ts/models/model.mar inside the container, which again contains the content of model1.tar.gz PLUS the model1.tar.gz file itself

In summary, the model (which may be big) is present 4 times on the filesystem of the instance running this (a quick check of the host side is sketched right after this list):
- in the mounted /tmp directory, twice: once as the tar.gz file and once extracted
- in the container FS inside a zip archive (the model.mar file), which contains the extracted files plus the original tar.gz file
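Roughly how one can verify the host-side duplication; the temp directory name below is hypothetical, since model.deploy generates a random one per run:

import os

# hypothetical path: the temp directory that model.deploy mounts as /opt/ml/model
mounted_dir = "/tmp/tmpabcd1234"

total = 0
for root, _, files in os.walk(mounted_dir):
    for name in files:
        path = os.path.join(root, name)
        size = os.path.getsize(path)
        total += size
        print(f"{size:>12}  {path}")

print(f"total: {total} bytes")
# With the duplication described above, the total comes out to roughly the size
# of the extracted model files plus the size of the model1.tar.gz copy next to them.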
UPDATE: actually, it seems the model is also present inside the container in
- /home/model-server/tmp/model
- something like /home/model-server/tmp/models/1547c25a74854a1aaaf72751c0febd02/

and in both cases the tar.gz file is present together with the unarchived files from the tar.gz. The second path (the one with the hash) apparently is the only one actually needed when I have my own model loading function, as that is the path I get (see the sketch below).
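For reference, my model-loading override in inference.py looks roughly like this (the weight file name is just an example; model_fn(model_dir) is the hook the SageMaker PyTorch serving stack calls):

import os
import torch

def model_fn(model_dir):
    # model_dir is the directory handed to the handler at load time;
    # in my local runs it resolves to /home/model-server/tmp/models/<hash>/
    model = torch.jit.load(os.path.join(model_dir, "model.pt"), map_location="cpu")
    model.eval()
    return model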
Surely that cannot be intentional; why would all that duplication be needed?
Since I am using my own model loading code, I do not understand why packing the model up into a model.mar file is even needed. Could I not just load it directly from the mounted /opt/ml/model directory?