Error Loading Embeddings in Different Environment from Training #103

Open
tbouchik opened this issue Aug 21, 2023 · 1 comment
Comments

@tbouchik
Contributor

tbouchik commented Aug 21, 2023

Describe the bug
When I train my AE on a training server and then load it on a production server, I encounter an error while trying to use the embed function. However, the same function works without issues on the training server.

To Reproduce
Steps to reproduce the behavior:

1. Train the AE on the training server.
2. Load the trained model on the production server.
3. Execute `autoencoder_model.embed(someTensorX)`.
4. See error:

       Traceback (most recent call last):
         File "", line 1, in
         File "/usr/local/lib/python3.10/dist-packages/pythae/models/base/base_model.py", line 129, in embed
           return self(DatasetOutput(data=inputs)).z
         File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
           return forward_call(*args, **kwargs)
         File "/usr/local/lib/python3.10/dist-packages/pythae/models/ae/ae_model.py", line 76, in forward
           z = self.encoder(x).embedding
         File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
           return forward_call(*args, **kwargs)
         File "LOCAL_PATH_OF_TRAIN_SERVER_PYTHON_FILE_LOADING_PYTHAE_MODEL", line 40, in forward
       TypeError: 'c' not supported between instances of 'NoneType' and 'NoneType'

Expected behavior
I expect the model to embed the tensor without any errors, irrespective of the server it's being executed on.

Desktop Prod Server:
OS version:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
Kernel version:
5.15.0-76-generic

Desktop Train Server:
OS version:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS (beaver-osp1-gendry X45)
Release: 18.04
Codename: bionic
Kernel version:
5.4.0-150-generic

Additional context
The error seems to be related to the path of the Python file on the training server, as indicated in the traceback. It appears that the training environment's path is somehow hardcoded into the model when it's saved, which might be causing the issue when trying to load the model in a different environment.
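One possible reading of the path in the traceback, offered as a sketch and not a confirmed diagnosis: a Python code object records the file name it was compiled from, so if the custom encoder class is serialized by value (an assumption here, as cloudpickle-style pickling does for classes defined in a script), that recorded path travels with the saved model and shows up in tracebacks on any machine. The snippet below uses only the standard library; `TRAIN_PATH`, `SOURCE`, and the `forward` function are hypothetical stand-ins, and the `<` comparison is only illustrative.

```python
import traceback

# Hypothetical path standing in for the script location on the training server.
TRAIN_PATH = "/home/train_user/train_encoder.py"

# Hypothetical stand-in for a custom encoder's forward method.
SOURCE = '''
def forward(x, threshold=None):
    return x < threshold
'''

# Compile the source as if it had been loaded from TRAIN_PATH.
namespace = {}
exec(compile(SOURCE, TRAIN_PATH, "exec"), namespace)
forward = namespace["forward"]

# The code object keeps the file name it was compiled with.
recorded_path = forward.__code__.co_filename

# Trigger an error: the traceback names TRAIN_PATH even though
# no such file exists on this machine.
try:
    forward(None)
except TypeError:
    tb_text = traceback.format_exc()

print(recorded_path)
print(tb_text)
```

If this is what is happening, the path itself would be harmless metadata, and the actual failure would come from something else that differs between the two servers.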

@clementchadebec
Owner

clementchadebec commented Aug 25, 2023

Hi @tbouchik,

Thanks for reporting this issue. It is a weird bug. Can you share the Python environments of the training server and the production server (`pip freeze`)? In particular, do you have the same version of Python on both servers?
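To make that comparison easier, here is a minimal sketch of a script that could be run on both servers and the outputs diffed. The function name `environment_report` and the default package list are assumptions chosen for illustration (the traceback only confirms that `pythae` and `torch` are involved), and `importlib.metadata` requires Python 3.8 or newer, so on an older interpreter plain `pip freeze` remains the fallback.

```python
import json
import platform
from importlib import metadata


def environment_report(packages=("pythae", "torch", "cloudpickle")):
    """Collect the Python version and the versions of selected packages.

    The default package list is an illustrative assumption; extend it as needed.
    """
    report = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "packages": {},
    }
    for name in packages:
        try:
            report["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly so the diff shows them.
            report["packages"][name] = None
    return report


if __name__ == "__main__":
    print(json.dumps(environment_report(), indent=2, sort_keys=True))
```

Saving the output to a file on each server and diffing the two files should show quickly whether the Python or package versions differ.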
