Provided model cannot be used with the new Tensorflow pickle loader. (module tensorflow.python.training.tracking missing) #67

latendresse · 2023-12-08T02:59:09Z

Hello,

I tried using the trained model GNN_Edge_MLP_MoLeR__2022-02-24_07-16-23_best.pkl you provided. Generating 10 molecules failed while loading the model. It appears that the pickled model can no longer be loaded with newer Tensorflow versions. Please see the Python traceback below.

Do you have another trained model to try, or should we downgrade to an older version of Tensorflow?

Thank you,

-- Mario

molecule_generation sample /home/azureuser/molecule-generation/model 10
2023-12-08 01:49:50.037579: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "/home/azureuser/miniconda3/envs/moler-env/bin/molecule_generation", line 8, in <module>
sys.exit(main())
File "/home/azureuser/miniconda3/envs/moler-env/lib/python3.10/site-packages/molecule_generation/cli/cli.py", line 35, in main
run_and_debug(lambda: commands[args.command].run_from_args(args), getattr(args, "debug", False))
File "/home/azureuser/miniconda3/envs/moler-env/lib/python3.10/site-packages/dpu_utils/utils/debughelper.py", line 21, in run_and_debug
func()
File "/home/azureuser/miniconda3/envs/moler-env/lib/python3.10/site-packages/molecule_generation/cli/cli.py", line 35, in <lambda>
run_and_debug(lambda: commands[args.command].run_from_args(args), getattr(args, "debug", False))
File "/home/azureuser/miniconda3/envs/moler-env/lib/python3.10/site-packages/molecule_generation/cli/sample.py", line 30, in run_from_args
print_samples(
File "/home/azureuser/miniconda3/envs/moler-env/lib/python3.10/site-packages/molecule_generation/cli/sample.py", line 13, in print_samples
with load_model_from_directory(model_dir, **model_kwargs) as model:
File "/home/azureuser/miniconda3/envs/moler-env/lib/python3.10/site-packages/molecule_generation/wrapper.py", line 187, in load_model_from_directory
model_class = get_model_class(ModelWrapper._get_model_file(model_dir))
File "/home/azureuser/miniconda3/envs/moler-env/lib/python3.10/site-packages/molecule_generation/utils/model_utils.py", line 74, in get_model_class
data_to_load = pickle.load(in_file)
ModuleNotFoundError: No module named 'tensorflow.python.training.tracking'
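
For context on the error above: `pickle` stores classes by their module path, so loading fails with `ModuleNotFoundError` as soon as that path no longer exists in the installed package. One general way to work around a moved module is to subclass `pickle.Unpickler` and override `find_class`. The sketch below only illustrates that mechanism with the standard library. The names `MODULE_REMAP`, `RemappingUnpickler`, `load_with_remap` and the `old_collections` module are hypothetical, and whether any such remapping works for `tensorflow.python.training.tracking` is not established in this thread.

```python
import collections
import io
import pickle

# Hypothetical remap table. The entry is a stdlib stand-in so the sketch runs
# without TensorFlow; it is NOT a known fix for the MoLeR checkpoint.
MODULE_REMAP = {
    "old_collections": "collections",
}


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that rewrites module paths before resolving classes."""

    def find_class(self, module, name):
        # Swap in the new module path if the old one is listed in the table.
        module = MODULE_REMAP.get(module, module)
        return super().find_class(module, name)


def load_with_remap(data: bytes):
    """Unpickle `data`, applying MODULE_REMAP to every class lookup."""
    return RemappingUnpickler(io.BytesIO(data)).load()


# Hand-written pickle: look up old_collections.OrderedDict, call it with an
# empty argument tuple, and stop. The module "old_collections" does not exist.
LEGACY_PICKLE = b"cold_collections\nOrderedDict\n)R."

try:
    pickle.loads(LEGACY_PICKLE)
except ModuleNotFoundError as err:
    print("plain pickle failed:", err)

restored = load_with_remap(LEGACY_PICKLE)
print(type(restored).__name__)
```

This only helps when the moved classes still exist under another path and kept a compatible layout, which would need to be verified before relying on it.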

kmaziarz · 2023-12-08T11:47:54Z

Yes, there is a backward compatibility issue introduced around tensorflow 2.14. I wanted to take a look at handling this, but for now I would recommend downgrading. The newest tensorflow versions are fishy anyway, as I've recently found that starting with 2.10 there are memory leaks in certain scenarios (in MoLeR's case, they seem to appear when repeatedly encoding a lot of molecules in a loop). Neither the compatibility issues nor the potential leaks happen in 2.9, so I would recommend that version unless you really need to use a newer one.
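
As a rough illustration of the version advice above, a script could check the installed version before loading the checkpoint. This is a minimal sketch; the helper names `major_minor` and `is_supported_tensorflow` are made up for this example and are not part of `molecule_generation`. Versions are compared as integer tuples because plain string comparison would order `2.10` before `2.9`.

```python
import re


def major_minor(version: str) -> tuple:
    """Extract (major, minor) as integers from a version string like '2.9.3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported_tensorflow(version: str) -> bool:
    """Return True for versions below 2.10, the range recommended in this thread."""
    return major_minor(version) < (2, 10)


# String comparison gets this wrong: "2.10.0" sorts before "2.9.0".
print("2.10.0" < "2.9.0")
print(is_supported_tensorflow("2.9.3"))
print(is_supported_tensorflow("2.14.0"))
```

In practice the argument would be `tf.__version__` after `import tensorflow as tf`.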

This PR addresses two sources of memory leaks apparent when repeatedly encoding many molecules in a loop, both originating from `tensorflow`:

- First, there is a very mild leak, caused by `tensorflow` not fully cleaning up some of its internals, which appears across many `tensorflow` versions.
- Second, there is also a bigger leak introduced in `tensorflow` version `2.10`.

The first issue is addressed by manually clearing `_py_funcs_used_in_graph`, while for the second I temporarily pin the supported `tensorflow` version to `<2.10` until the issue is fixed upstream. The pin also avoids backward compatibility problems that start to appear in `2.14` and prevent the pretrained checkpoint from being loaded (see #67).

kmaziarz · 2024-01-04T12:42:19Z

Closing as we now require tensorflow<2.10 (due to memory leaks that got introduced in 2.10).

kmaziarz self-assigned this Dec 8, 2023

kmaziarz added the question Request for help or information label Dec 8, 2023

kmaziarz mentioned this issue Dec 8, 2023

Avoid memory leaks and other tensorflow issues #68

Merged

kmaziarz closed this as completed Jan 4, 2024
