Weird model quadruplication when deploying PyTorch inference model #2465

I am testing the deployment of a PyTorch inference model that was trained elsewhere, using the SageMaker API. For now I am trying to test everything locally on my own machine, roughly like this:

```
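# Imports assume SageMaker Python SDK v2 module paths.
from sagemaker.deserializers import JSONDeserializer
from sagemaker.local import LocalSession
from sagemaker.pytorch import PyTorchModel
from sagemaker.serializers import JSONSerializer

# Local mode: run the inference container via docker-compose on this machine.
sess = LocalSession()
sess.config = {'local': {'local_code': True}}
model = PyTorchModel(
    model_data="model1.tar.gz",
    role=MYROLE,  # placeholder for an IAM role ARN, defined elsewhere
    framework_version="1.8.1",
    py_version="py3",
    entry_point="inference.py",
)
model.sagemaker_session = sess
predictor = model.deploy(
    instance_type="local",
    initial_instance_count=1,
    deserializer=JSONDeserializer(),
    serializer=JSONSerializer(),
)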
```

The `model.deploy` call creates two temporary directories: one for the docker-compose command and config file, and the other for mounting in the container as `/opt/ml/model`.

However, some things happen that I find odd and incorrect, and especially annoying for large models:
* the tmp directory that gets mounted contains the contents of the model1.tar.gz file I provided PLUS the original model1.tar.gz file
* when the container initializes, it internally uses torch-model-archiver to create the file `/sagemaker/ts/models/model.mar` inside the container, which contains the contents of model1.tar.gz PLUS, AGAIN, the model1.tar.gz file itself.
* in summary, the model (which may be big) is present 4 times on the FS of the instance running this:
  * in the /tmp directory that is mounted, twice: once as the tar.gz file and once extracted
  * in the container FS inside a zip file (the model.mar file), which contains the extracted files plus the original tar.gz file
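
One way to confirm the nesting described above is to list the members of each archive. The following is a minimal sketch using only the Python standard library. It assumes the `.mar` file is an ordinary zip archive (as observed above) and that it has been copied somewhere readable first. The function names `list_archive_members` and `nested_archives` are illustrative and are not part of SageMaker or TorchServe.

```python
import tarfile
import zipfile
from pathlib import Path


def list_archive_members(path):
    """Return {member_name: size_in_bytes} for a tar(.gz) or zip/.mar archive."""
    path = Path(path)
    if tarfile.is_tarfile(path):
        # tarfile.open auto-detects gzip compression in the default "r" mode.
        with tarfile.open(path) as tf:
            return {m.name: m.size for m in tf.getmembers() if m.isfile()}
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return {i.filename: i.file_size for i in zf.infolist() if not i.is_dir()}
    raise ValueError(f"{path} is neither a tar nor a zip archive")


def nested_archives(members):
    """Return the sorted member names that look like archives themselves."""
    return sorted(n for n in members if n.endswith((".tar.gz", ".tgz", ".mar")))
```

For example, `nested_archives(list_archive_members("model.mar"))` would be expected to include `model1.tar.gz` if the archive really carries the original tarball alongside the extracted files.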

UPDATE: actually, it seems the model is also present in the following two locations, and in both cases the tar.gz is present together with the unarchived files from the tar.gz:
* /home/model-server/tmp/model
* something like /home/model-server/tmp/models/1547c25a74854a1aaaf72751c0febd02/
  * this apparently is the only path actually needed when I have my own model loading function, as this is the path I get
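
To count how many byte-identical copies actually exist, the directories mentioned above can be scanned and files grouped by content hash. This is a rough sketch using only the standard library. It would have to run where the paths are visible (for the container paths, that means inside the container), and hashing a large model several times is slow. The names `file_digest` and `find_duplicates` are hypothetical helpers, not part of any SageMaker tooling.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large models do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Map content digest -> sorted paths, keeping only content seen more than once."""
    by_digest = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                # Skip symlinks so a link is not reported as an extra copy.
                if os.path.isfile(full) and not os.path.islink(full):
                    by_digest[file_digest(full)].append(full)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling something like `find_duplicates(["/opt/ml/model", "/home/model-server/tmp"])` would then list every file that exists more than once across those trees. Note that hard links would still be counted as separate copies here.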

Surely that cannot be intentional. Why would all that duplication be needed?

Since I am using my own model loading code, I do not understand why packing the model into a model.mar file is even needed. Could I not just load it from the mounted /opt/ml/model directory?
