
Help in finetuning generator #21

Closed
gmseabra opened this issue May 17, 2022 · 7 comments · Fixed by #30
Labels: enhancement (New feature or request)

@gmseabra

Hi,

I'm trying an experiment: fine-tuning the generator on a small set of molecules with specific properties, so that it will generate new molecules with similar properties. However, I'm running into some errors that I have been unable to solve. I'd really appreciate it if anyone could shed some light on what I'm doing wrong.

What I'm doing:

  1. Take a set of 10 molecules, split it 80:10:10 into train:valid:test, and put the resulting files into the folder finetune_moler/input.
  2. Run MoLeR in pre-process mode:
    $ molecule_generation preprocess finetune_moler/input finetune_moler/output finetune_moler/trace
  3. Then try to fine-tune the provided pre-trained model on the small set of molecules above:

```
molecule_generation train MoLeR finetune_moler/trace \
    --load-saved-model ./PRETRAINED_MODEL/GNN_Edge_MLP_MoLeR__2022-02-24_07-16-23_best.pkl \
    --load-weights-only \
    --save-dir finetune_moler/tuned_model
```

The pre-process step seems to run just fine. But in the fine-tuning step, I'm getting the following error:

(dumps a lot of informational messages)
Traceback (most recent call last):
  File "/opt/miniconda3/envs/moler-env/bin/molecule_generation", line 33, in <module>
    sys.exit(load_entry_point('molecule-generation', 'console_scripts', 'molecule_generation')())
  File "/home/seabra/work/source/repos/microsoft/molecule-generation/molecule_generation/cli/cli.py", line 35, in main
    run_and_debug(lambda: commands[args.command].run_from_args(args), getattr(args, "debug", False))
  File "/opt/miniconda3/envs/moler-env/lib/python3.10/site-packages/dpu_utils/utils/debughelper.py", line 21, in run_and_debug
    func()
  File "/home/seabra/work/source/repos/microsoft/molecule-generation/molecule_generation/cli/cli.py", line 35, in <lambda>
    run_and_debug(lambda: commands[args.command].run_from_args(args), getattr(args, "debug", False))
  File "/home/seabra/work/source/repos/microsoft/molecule-generation/molecule_generation/cli/train.py", line 140, in run_from_args
    loaded_model_dataset = training_utils.get_model_and_dataset(
  File "/opt/miniconda3/envs/moler-env/lib/python3.10/site-packages/tf2_gnn/cli_utils/model_utils.py", line 319, in get_model_and_dataset
    load_weights_verbosely(trained_model_file, model)
  File "/opt/miniconda3/envs/moler-env/lib/python3.10/site-packages/tf2_gnn/cli_utils/model_utils.py", line 148, in load_weights_verbosely
    K.batch_set_value(tfvar_weight_tuples)
  File "/opt/miniconda3/envs/moler-env/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/opt/miniconda3/envs/moler-env/lib/python3.10/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 911, in assign
    raise ValueError(
ValueError: Cannot assign value to variable ' decoder/node_categorical_features_embedding/categorical_features_embedding:0': Shape mismatch.The variable shape (98, 64), and the assigned value shape (166, 64) are incompatible.

Could someone point me to what I'm doing wrong? Would it be possible to get an example of successfully fine-tuning the model?

Thanks a lot!
Gustavo.

@kmaziarz kmaziarz self-assigned this May 17, 2022
@kmaziarz
Collaborator

The steps you tried are quite reasonable! The missing ingredient is that preprocessing computes various vocabularies of motif/atom types based on the data, and these vocabularies determine the shapes of some layers in the model. Those shapes are already fixed when a pretrained model is loaded, hence the shape mismatch. (Intuitively, the error is saying that the number of motif/atom types found in your fine-tuning dataset is smaller than the number found in the original dataset, which is expected.)
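For intuition, the failure mode can be reproduced in miniature with NumPy (a hedged sketch using the shapes from the traceback above; this is not MoLeR code, just the general mechanism of copying checkpoint weights into a variable whose shape was sized by a different vocabulary):

```python
import numpy as np

EMBEDDING_DIM = 64

# Shapes taken from the traceback (illustrative only): the fine-tuning
# vocabulary yields 98 atom/motif types, while the pretrained checkpoint
# was built with 166.
finetune_embedding = np.zeros((98, EMBEDDING_DIM))
pretrained_weights = np.zeros((166, EMBEDDING_DIM))

try:
    # Analogous to what happens inside K.batch_set_value: assign the
    # checkpoint's weights into the smaller, freshly-built variable.
    finetune_embedding[:] = pretrained_weights
except ValueError as err:
    print(f"Shape mismatch: {err}")
```

Running this prints a broadcasting error, which is the NumPy analogue of the `Cannot assign value to variable` failure above.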

So, one would have to tell preprocessing to use the vocabularies from the pretrained checkpoint instead of computing them afresh. This isn't supported in the current code (we did briefly experiment with fine-tuning, but not enough for it to end up in the release), but it shouldn't be hard to add. I'll hack something together this week and share it with you as a branch; once you've tested it, I can make a PR and merge it into main.

@kmaziarz kmaziarz added the enhancement New feature or request label May 19, 2022
@gmseabra
Author

Oh, I see, I get it now.

Thank you so much!

@kmaziarz
Collaborator

kmaziarz commented May 20, 2022

@gmseabra: can you pull kmaziarz/finetuning, re-install the package from there, and try fine-tuning again?

The only change to the workflow you described is passing --pretrained-model-path when doing preprocessing. However, note that by default molecule_generation train will run validation every 5000 steps and keep training until there is no improvement on the validation dataset. If you're fine-tuning on a small set of molecules, it may make sense to set this interval lower (so that training has a chance to stop before it overfits) and/or limit the total number of such validation rounds. For example, passing

```
--model-params-override '{"num_train_steps_between_valid": 50}' --max-epochs 8
```

means that training will run for some multiple of 50 steps, at most 8 * 50 = 400, but possibly fewer if validation stops improving.
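For concreteness, the step budget implied by those two settings works out as follows (a small sketch; here each "epoch" corresponds to one round of validation):

```python
num_train_steps_between_valid = 50  # from --model-params-override
max_epochs = 8                      # from --max-epochs

# Training pauses for validation every 50 steps; with at most 8 such
# rounds, the hard upper bound on total training steps is:
max_total_steps = num_train_steps_between_valid * max_epochs
print(max_total_steps)  # 400; early stopping may end training sooner
```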

Let me know how this goes!

@kmaziarz
Collaborator

@gmseabra Did you have any luck with fine-tuning?

@kmaziarz kmaziarz linked a pull request Aug 12, 2022 that will close this issue
@MeilinaR

Hi! I've been trying to replicate this example with the steps you provided above, fine-tuning on a small set of 3K molecules, but I still encounter the following error (running everything on Colab now). I took the existing checkpoint and wanted to fine-tune it on the smaller set:

```
!molecule_generation preprocess input output trace \
    --pretrained-model-path --load-saved-model /content/drive/MyDrive/subset_gpu_finetuning/moler/molecule-generation/best_model/GNN_Edge_MLP_MoLeR__2022-02-24_07-16-23_best.pkl
```

```
!molecule_generation train MoLeR trace \
    --model-params-override '{"num_train_steps_between_valid": 50}' --max-epochs 8 \
    --load-saved-model /content/drive/MyDrive/subset_gpu_finetuning/moler/molecule-generation/best_model/GNN_Edge_MLP_MoLeR__2022-02-24_07-16-23_best.pkl \
    --load-weights-only
```

But even when aligning the metadata with what the model was originally trained on (I took the same Guacamol files), it still wouldn't run. This is the error I encountered:

Traceback (most recent call last):
  File "/usr/local/bin/molecule_generation", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/molecule_generation/cli/cli.py", line 35, in main
    run_and_debug(lambda: commands[args.command].run_from_args(args), getattr(args, "debug", False))
  File "/usr/local/lib/python3.10/site-packages/dpu_utils/utils/debughelper.py", line 21, in run_and_debug
    func()
  File "/usr/local/lib/python3.10/site-packages/molecule_generation/cli/cli.py", line 35, in <lambda>
    run_and_debug(lambda: commands[args.command].run_from_args(args), getattr(args, "debug", False))
  File "/usr/local/lib/python3.10/site-packages/molecule_generation/cli/train.py", line 140, in run_from_args
    loaded_model_dataset = training_utils.get_model_and_dataset(
  File "/usr/local/lib/python3.10/site-packages/tf2_gnn/cli_utils/model_utils.py", line 319, in get_model_and_dataset
    load_weights_verbosely(trained_model_file, model)
  File "/usr/local/lib/python3.10/site-packages/tf2_gnn/cli_utils/model_utils.py", line 148, in load_weights_verbosely
    K.batch_set_value(tfvar_weight_tuples)
  File "/usr/local/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.10/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1022, in assign
    raise ValueError(
ValueError: Cannot assign value to variable ' decoder/node_type_selector/MLP_final_layer/kernel:0': Shape mismatch.The variable shape (256, 142), and the assigned value shape (256, 167) are incompatible.

I'm not sure whether I did something wrong, and where I could fix it!

@kmaziarz
Collaborator

kmaziarz commented Oct 9, 2023

Hi @MeilinaR, sorry for the silence, I forgot to respond to this one. Are you still having issues? I noticed the command you pasted passes --load-saved-model to preprocess; is this a typo?

@kmaziarz kmaziarz reopened this Oct 9, 2023
@kmaziarz
Collaborator

kmaziarz commented Jan 4, 2024

Closing due to lack of activity.

@kmaziarz kmaziarz closed this as completed Jan 4, 2024