
Can't train a model with the same reconstruction score as your model #15

Closed
manurubo opened this issue Jul 10, 2018 · 7 comments

@manurubo

Hello,

I've been trying to train a model following the instructions you provide in the Readme of molvae, but I have not been able to get the same reconstruction accuracy as you.
First I used your code as-is, and ran into problems with this part of vaetrain.py:

    if (it + 1) % 1500 == 0: #Fast annealing
        scheduler.step()
        print "learning rate: %.6f" % scheduler.get_lr()[0]
        torch.save(model.state_dict(), opts.save_path + "/model.iter-%d-%d" % (epoch, it + 1))
        beta = max(1.0, beta + anneal)

When the code first enters this if block, the reconstruction numbers in the logs go haywire, and the final model I obtained had a reconstruction score of 0. After reading and understanding more of your code, I suspected that the max in beta = max(1.0, beta + anneal) should be a min, so I changed that line to beta = min(1.0, beta + anneal).
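For reference, here is a minimal sketch of the annealing behaviour I believe was intended; only the names beta and anneal come from vaetrain.py, while the loop and the values are illustrative:

    # Illustrative sketch, not the repo's code: ramp the KL weight
    # beta up to 1.0 over training and then hold it there.
    beta = 0.005   # initial KL weight (e.g. --beta 0.005)
    anneal = 0.05  # increment per annealing step (made-up value)

    for it in range(30000):
        # ... forward pass, loss = recon_loss + beta * kl_loss, backward ...
        if (it + 1) % 1500 == 0:  # fast annealing, as in vaetrain.py
            # min() clamps beta at 1.0; the original max() jumps beta
            # straight to 1.0 at the first annealing step, which is
            # what makes the reconstruction logs blow up.
            beta = min(1.0, beta + anneal)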

With this change I trained the whole model and the logs looked fine, but when I tested the resulting model with reconstruct.py, it gives me a reconstruction score of about 0.52, far from the 0.77 reconstruction score of the model MPNVAE-h450-L56-d3-beta0.005.

I would like to know whether the change I made to beta = max(1.0, beta + anneal) is correct, and also why, with or without this change, I am not able to train a new model by following your instructions. Is there something I have to do that I am not doing to train a model? Did you train your model with the same instructions the Readme gives?

Thank you. I'm going crazy because this doesn't make sense at all; I would be very grateful for any idea of what's happening.

@wengong-jin
Owner

wengong-jin commented Jul 11, 2018

That's some sort of debugging code that I accidentally introduced. I've removed that line; please see the updated version. Many thanks.

For reconstruction performance, I set beta=0.001. This should give you the 77% performance.

@manurubo
Author

I've tried again with beta=0.001 and I'm still not getting 77% performance. Are the numbers of iterations defined by MAX_EPOCH in pretrain.py and vaetrain.py the same ones you used, or should I run more iterations?

Also, I tried to check whether the first part of the model, the one obtained from pretrain.py, has reconstruction performance similar to MPNVAE-h450-L56-d3-noKL/model.iter-2, but I got totally different performance; maybe the problem is there?
I saw that MPNVAE-h450-L56-d3-noKL/model.iter-2 has reconstruction performance similar to the model.iter-0 I obtained from pretrain.py, but the molvae Readme says I should use model.iter-2 for vaetrain.py, so that is the model I am using (note the --model flag):

    CUDA_VISIBLE_DEVICES=0 python vaetrain.py --train ../data/train.txt --vocab ../data/vocab.txt \
        --hidden 450 --depth 3 --latent 56 --batch 40 --lr 0.0007 --beta 0.005 \
        --model pre_model/model.2 --save_dir vae_model/

I changed the beta as you said when I actually ran the command; the above is just the example from the Readme.

Thank you.

@NamanChuriwala

The reconstruction accuracy obtained from model.iter-0 and model.iter-2 shouldn't be very different, since during training you can see the loss flatten out after a few thousand minibatches, within the first epoch itself. It is possible that the author of the repo trained their model on a larger set of molecules rather than only those in data/train.txt.
You could try increasing the training data size; the number of iterations shouldn't make a difference.

@maxime-langevin

Dear @wengong-jin ,

I'm currently trying out your repo (very nice work, by the way!) and I'm running into problems similar to the ones described in this issue.
I'd just like to check my understanding of your answer: if I want to use the model's encoding and reconstruction abilities, I should train it with a low beta value (0.001), and if I want to use purely its molecule-generation abilities, I should train it with the standard beta value (1)?
I sometimes use one ability and sometimes the other, and just wanted to know whether it is normal to train the model in a different fashion for each task (which is what I understand from your answer to this issue, and what I currently do). It would be a bit more convenient to train the model once and use it for both tasks, but the way I use it now already works well.

Thanks a lot!

@wengong-jin
Owner

Hi maxime1310,

To some degree what you said is right. Training with a small beta helps reconstruction, and a larger beta helps generation. Yet training with beta=1.0 won't necessarily give you the best molecule-generation ability, and this varies from dataset to dataset. In general you need to treat beta as a hyperparameter and try different values for the downstream task.
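To make the trade-off concrete: beta weights the KL term of the VAE objective. A minimal sketch with illustrative names, not this repo's code:

    # Illustrative only -- not the repo's code.
    # recon_loss: negative log-likelihood of reconstructing the input
    # kl_div: KL(q(z|x) || p(z)), regularizing the latent space
    def vae_loss(recon_loss, kl_div, beta):
        # Small beta (e.g. 0.001) -> near-autoencoder, best reconstruction.
        # beta = 1.0 -> the standard ELBO, with stronger prior matching,
        # which tends to help sampling/generation.
        return recon_loss + beta * kl_div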

@maxime-langevin

Thanks a lot for the quick answer!

On the datasets you worked on, did the hyperparameter search over beta sometimes find values that gave you both good reconstruction quality and good molecule generation? Or do you recommend first selecting the downstream task and then searching for a beta value at which the model performs well on that particular task?

@wengong-jin
Owner

I would recommend searching for a beta that performs well on the particular task.
