
Basic MolGAN model #2426

Merged
merged 22 commits into from Apr 21, 2021

Conversation

MiloszGrabski
Contributor

@MiloszGrabski MiloszGrabski commented Mar 6, 2021

BasicMolGAN model

Description

Final stage of the implementation of the basic version of MolGAN, originally created by Nicola De Cao and Thomas Kipf.
This is a generative graph convolutional model based on the WGAN architecture that enables the generation of small molecules.
SMILES strings are converted into model input by the MolGANFeaturizer class (tested on the MoleculeNet QM9 dataset).
Currently, training is somewhat unstable and sometimes requires a few separate training attempts to produce a valid model.
This is something I hope to fix in a future release.
A Jupyter notebook tutorial will be provided in a separate PR.
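For orientation, a minimal usage sketch. The class names follow this PR's description and title; the constructor arguments, the iterbatches batch generator, and the defeaturize call are illustrative assumptions, not the final API:

import deepchem as dc
from deepchem.feat import MolGANFeaturizer   # name per the description above; exact export may differ
from deepchem.models import BasicMolGAN      # name per the PR title; exact export may differ

featurizer = MolGANFeaturizer()
graphs = featurizer.featurize(["CCO", "CC(=O)O", "CCN"])  # SMILES -> graph inputs

gan = BasicMolGAN(learning_rate=0.001)       # illustrative arguments
gan.fit_gan(iterbatches(graphs))             # iterbatches: hypothetical batch generator
samples = gan.predict_gan_generator(1000)    # sample candidate graphs
molecules = featurizer.defeaturize(samples)  # assumed inverse of featurize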

Type of change

Please check the option related to your PR.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • In this case, we recommend discussing your modification in a GitHub issue before creating the PR
  • Documentation (modification of documents)

Checklist

  • My code follows the style guidelines of this project
    • Run yapf -i <modified file> and check no errors (yapf version must be 0.22.0)
    • Run mypy -p deepchem and check no errors
    • Run flake8 <modified file> --count and check no errors
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New unit tests pass locally with my changes
  • I have checked my code and corrected any misspellings

@codecov-io

codecov-io commented Mar 6, 2021

Codecov Report

Merging #2426 (4fdd415) into master (ab5fa05) will increase coverage by 0.09%.
The diff coverage is 98.72%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2426      +/-   ##
==========================================
+ Coverage   85.65%   85.75%   +0.09%     
==========================================
  Files         307      309       +2     
  Lines       27196    27353     +157     
==========================================
+ Hits        23295    23456     +161     
+ Misses       3901     3897       -4     
Impacted Files Coverage Δ
deepchem/models/tests/test_molgan_model.py 98.52% <98.52%> (ø)
deepchem/models/molgan.py 98.86% <98.86%> (ø)
deepchem/models/__init__.py 94.28% <100.00%> (+0.16%) ⬆️
deepchem/models/gan.py 87.36% <0.00%> (+1.05%) ⬆️
...hem/feat/molecule_featurizers/molgan_featurizer.py 84.61% <0.00%> (+4.39%) ⬆️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ab5fa05...4fdd415.

@MiloszGrabski
Contributor Author

The failing test is not connected with my code:
===End Flaky Test Report===
=========================== short test summary info ===========================
FAILED deepchem/splits/tests/test_scaffold_splitter.py::TestScaffoldSplitter::test_scaffolds
= 1 failed, 688 passed, 6 skipped, 35 deselected, 695 warnings in 2406.46s (0:40:06) =

Member

@rbharath rbharath left a comment

Did a first round of review. Looks good overall! I've made a few comments, mainly about the docs, below.

@peastman Could you do a review as well? This model would be the first molecular GAN in the library, and we might want to make sure the API is one we like.

deepchem/models/molgan.py

Returns
-------
keras.Model
Member

I'm not sure if this will render correctly in the docs. Could you try building the docs locally and seeing if this looks good?

Contributor Author

Do you mean running doctest?

Contributor Author

Hi, I tried running Sphinx but I'm struggling to get it working.
Should I just remove the Returns section?

@peastman
Contributor

Is it possible to add a test for correct operation? The current tests just check that certain parameters were recorded correctly. They don't tell you whether the model works. For example, can you train a small model, have it predict some molecules, and verify that they're valid?

@MiloszGrabski
Contributor Author

The problem is that it does not always train a model that can generate valid molecules.
I'm not sure how to accommodate that fact in a test.

@peastman
Contributor

Here is an example of a GAN test case:

def test_mix_gan():
  """Test a GAN with multiple generators and discriminators."""
  gan = ExampleGAN(n_generators=2, n_discriminators=2, learning_rate=0.01)
  gan.fit_gan(
      generate_data(gan, 1000, 100), generator_steps=0.5, checkpoint_interval=0)
  # See if it has done a plausible job of learning the distribution.
  means = 10 * np.random.random([1000, 1])
  for i in range(2):
    values = gan.predict_gan_generator(
        conditional_inputs=[means], generator_index=i)
    deltas = values - means
    assert abs(np.mean(deltas)) < 1.0
    assert np.std(deltas) > 1.0

It trains a GAN based on a very simple data distribution, and checks whether the generated data is reasonably close. Can you do something similar for this class? Train it on a very simple dataset, and see whether it has learned the basic properties of the training data?
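(For the validity check itself, a small RDKit helper along these lines would do; generated_smiles stands in for whatever list of SMILES the model produces:)

from rdkit import Chem

def fraction_valid(generated_smiles):
  """Fraction of SMILES strings that RDKit can parse into a molecule."""
  valid = sum(1 for s in generated_smiles if Chem.MolFromSmiles(s) is not None)
  return valid / max(len(generated_smiles), 1)

# In a test: assert fraction_valid(generated_smiles) > 0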

@MiloszGrabski
Contributor Author

I am not sure, as it generates molecules but does not always learn to do so.
However, I have not tried it on a very small set of compounds.
I will check whether this might work.

@MiloszGrabski
Contributor Author

MiloszGrabski commented Mar 21, 2021

I have run pytest on test_molgan_model and it worked fine, so I'm not sure what the issue is here.
The only issue I see is this: [image]
But I'm not sure why CI is picking it up.

@MiloszGrabski
Contributor Author

OK, so there seems to be a corner case where yapf and flake8 conflict.
Any ideas for a workaround?

@peastman
Contributor

OK, so there seems to be a corner case where yapf and flake8 conflict.

How do you mean? It's just complaining about the indentation of one line.

On the other hand, there seems to be an actual test failure on Windows (but not Linux):

FAILED deepchem/models/tests/test_molgan_model.py::test_molgan_model::test_training
= 1 failed, 690 passed, 6 skipped, 35 deselected, 696 warnings in 2863.04s (0:47:43) =
Exception ignored in: <function Model.__del__ at 0x000001BFDC1B34C8>
Traceback (most recent call last):
  File "D:\a\deepchem\deepchem\deepchem\models\models.py", line 61, in __del__
    shutil.rmtree(self.model_dir)
  File "c:\miniconda3\envs\deepchem\lib\shutil.py", line 516, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "c:\miniconda3\envs\deepchem\lib\shutil.py", line 404, in _rmtree_unsafe
    onerror(os.rmdir, path, sys.exc_info())
  File "c:\miniconda3\envs\deepchem\lib\shutil.py", line 402, in _rmtree_unsafe
    os.rmdir(path)
OSError: [WinError 145] The directory is not empty: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpkauecabw'
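(Illustrative only, not this PR's fix: a retrying cleanup along these lines sidesteps that WinError 145 race, which happens when Windows briefly holds handles inside a directory being deleted:)

import shutil

def safe_rmtree(path):
  # Retry with ignore_errors: transient handles from antivirus/indexing usually clear quickly.
  try:
    shutil.rmtree(path)
  except OSError:
    shutil.rmtree(path, ignore_errors=True)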

@peastman
Contributor

The new test case looks good. Just out of curiosity, could you post a few examples of molecules it generates? I'm just wondering what sorts of molecules it produces after training on that small number of examples. Has it memorized members of the training set? Or is it just outputting strings of 'C's?

@MiloszGrabski
Contributor Author

MiloszGrabski commented Mar 25, 2021

Actually, I was surprised to see that it did not recreate a single molecule from the original set.
Some of them were odd-looking, some of them were fine.
There has been no case so far that was just C's.
In terms of indentation, this is precisely the case where flake8 wants different indentation than yapf (I searched on Google and I am not the only one experiencing this issue).
So when I fixed it for flake8, yapf tripped over it. I will try to rewrite the code somewhat. Maybe it is worth considering switching the CI to run flake8 and yapf separately; I do not have much experience with automated code checking, so I'm not sure if this is a good idea.
I will post some molecules once I have a moment to run Jupyter.

@MiloszGrabski
Contributor Author

MiloszGrabski commented Mar 26, 2021

I'm not sure what the issue is here; I ran pytest on my machine and it passed:
pytest -v -m "not slow" D:\deepchem\deepchem\models\tests\test_molgan_model.py
[image]

FYI: I ran the test several times to be sure; all of them passed.

@MiloszGrabski
Contributor Author

Example model outputs (sample of 1000 generated each time):

The model was trained in a loop of 10 attempts, with each iteration creating a new model.
The loop breaks when valid molecules are generated, so the first successful model is retained.
Luckily, for these attempts a valid model was generated on the first attempt each time.
As you can see, it has quite low validity/uniqueness (based on SMILES) scores.
The authors solved this by implementing a few things which I would like to add in a future release.
One of them, which seems the most straightforward, is guiding via model checking after each training session:
if the model performs worse, go back to the previous checkpoint; otherwise, keep the current one.
The infrastructure still needs a lot of work to be useful, but the base seems solid enough.

What I also noticed is that with more compounds a low number of iterations is favourable (8-10), whereas for a small number (as in the test set) you need a much larger number (1000-5000).
I suppose this makes sense, as with large sets you get much more exposure per training loop.
When I plotted the losses, around 1000 iterations was generally enough for them to stabilize.
But the loss itself cannot be used as a measure of model performance, because even with small losses the model
can generate a lot of invalid molecules, i.e. I have observed models with worse losses that provided better metrics.
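A rough sketch of that checkpoint-guiding loop (fraction_valid and sample_smiles are hypothetical helpers; save_checkpoint/restore are the standard DeepChem model methods, and iterbatches is a user-written batch generator):

n_sessions = 10
best_score = 0.0
for session in range(n_sessions):
  gan.fit_gan(iterbatches(graphs), checkpoint_interval=0)  # one training session
  score = fraction_valid(sample_smiles(gan, 1000))         # hypothetical metric helper
  if score >= best_score:
    gan.save_checkpoint()  # keep the improved model
    best_score = score
  else:
    gan.restore()          # roll back to the last good checkpoint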

Model 1, 804 valid, 6 unique, first training attempt successful:

[image]

Model 2, 60 valid, 10 unique, first training attempt successful:

[image]

Model 3, 13 valid, 7 unique, first training attempt successful:

[image]

Model 4, 17 valid, 3 unique, first training attempt successful:

[image]

@peastman
Contributor

Those actually aren't too bad. Each model produces a decent amount of diversity. It does seem to have a tendency to produce the same molecules over and over, but that's not surprising after overfitting to a tiny dataset.

Luckily, for these attempts a valid model was generated on the first attempt each time.

If it usually produces valid molecules on the first attempt, maybe we should reduce the number of iterations? Letting it fail nine times before succeeding could cause the test not to notice if a future change makes the code stop working as well.

@rbharath
Member

These don't look bad at all! I'd be curious about how these compare to molecules generated by the normalizing flows. CC @ncfrey who might have some insight there

@ncfrey
Contributor

ncfrey commented Mar 30, 2021

Looks really cool @MiloszGrabski! The validity and uniqueness scores are pretty low, but that might not matter too much. About how long does it take to generate 1000 samples?

@MiloszGrabski
Contributor Author

MiloszGrabski commented Apr 1, 2021

@ncfrey It is almost instantaneous, I would say 1 second maybe; I never paid it much attention because it was so fast.
That is why in a future version I want to implement a guiding system, in the form of selecting which model progresses to the next epoch.
This will require sample generation and applying a metric after each epoch, which would be prohibitive if it were not fast enough.
@peastman I have thought about what you said and will modify the code slightly; let me know what you think of the change.
I have reduced the number of attempts to 6, and to speed up the process I have introduced a break: given that we only care whether it manages to pass once (at least for now), there is no point in continuing training once it has already produced a valid model.
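(The shape of that test loop, sketched with hypothetical helpers build_and_train and sample_smiles, reusing the fraction_valid helper sketched earlier:)

for attempt in range(6):  # reduced from 10 to 6
  gan = build_and_train()  # fresh model + one training run
  if fraction_valid(sample_smiles(gan, 1000)) > 0:
    break                  # a valid model was produced; stop retrying
else:
  raise AssertionError("no attempt produced a valid model")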

@MiloszGrabski
Contributor Author

I don't know what is going on here; it works fine on my machine. Any ideas?
[image]
The unit test also worked just fine.
From the error it looks like an input mismatch, but I have no idea why it is happening.
[image]
I would welcome any suggestions as to why this might be.

@MiloszGrabski
Contributor Author

MiloszGrabski commented Apr 1, 2021

@ncfrey @peastman @rbharath
I think I know what the issue is, albeit I do not understand why it works on my machine but not in CI.
My generator output is [a, b, c, d].
My discriminator input is [a, b].
Normally the additional outputs are just ignored, but in the test environment it seems to behave differently. Any clue?

The worst part is that I am unable to reproduce the failure on my machine, so every time I have to wait 40 minutes to find out whether a fix works.

@rbharath
Member

rbharath commented Apr 6, 2021

One off-hand thought: perhaps it's an issue with different randomness? Setting the random seed to a fixed value might help here.
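(For example, fixing both sources of randomness that typically differ between machines:)

import numpy as np
import tensorflow as tf

np.random.seed(123)      # NumPy-side randomness (e.g. noise batches)
tf.random.set_seed(123)  # TensorFlow-side randomness (e.g. weight initialization)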

@MiloszGrabski
Contributor Author

MiloszGrabski commented Apr 6, 2021

The current error seems to be connected with the GAN itself:
Windows test, line 847.
@peastman, any clue?

    # Train the discriminator.
    inputs = [self.get_noise_batch(self.batch_size)]
    for input in self.data_input_layers:
>     inputs.append(feed_dict[input.ref()])
E     KeyError: <Reference wrapping <KerasTensor: shape=(None, 9, 9) dtype=float32 (created by layer 'input_4')>>

deepchem\models\gan.py:348: KeyError

OK, I get it: it is because there are no inputs 3 and 4 in the real data. The whole concept crumbles just because I have to add the "fake" inputs to the discriminator to sort out the initial issue.
The problem is that the generator outputs differ between normal training and sample generation. The generator therefore has 4 outputs: the first 2 for training, the last 2 for generation of real samples. Because of the initial issue (which does not occur on my machine but only in CI), I added fake (unused) inputs to the discriminator, but that means I now also have to create fake data in the dataset. It would be best to be able to specify in the generator which sort of output to produce (training or sample generation); that would simplify everything. Any clue on how to do that, given that a simple "if" does not work here?
[image]

I have reverted back to the old version, which is giving the input mismatch.
There are two options: either I rewrite the custom training and sample-generation loops from WGAN, or we could change
if len(inputs) != len(input_spec) to if len(inputs) < len(input_spec). The latter would be simpler and should still work as intended, given that excess inputs will be ignored.
This is the current process:
[image]
I do not see how I can modify either the generator or the discriminator to recognize whether this is the training or the sample-generation version of the outputs.

@MiloszGrabski
Contributor Author

MiloszGrabski commented Apr 7, 2021

I was thinking about creating a simple layer that decides which type of output to provide,
but I am struggling to find a good method for passing a signal into the layers,
something similar to the training flag. Any ideas?
Is there a simpler way than subclassing keras.Model and modifying the call method to accept an additional option parameter?
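(A sketch of that subclassing idea; the name SimpleMolGANGenerator matches the layer mentioned later in this thread, and core is a hypothetical submodel producing (edges_logits, nodes_logits):)

import tensorflow as tf
from tensorflow import keras

class SimpleMolGANGenerator(keras.Model):

  def __init__(self, core, **kwargs):
    super().__init__(**kwargs)
    self.core = core  # hypothetical submodel producing (edges_logits, nodes_logits)

  def call(self, inputs, training=False):
    edges_logits, nodes_logits = self.core(inputs)
    if training:
      # soft, differentiable outputs for the discriminator
      return tf.nn.softmax(edges_logits), tf.nn.softmax(nodes_logits)
    # hard, discrete outputs for molecule generation
    return (tf.argmax(edges_logits, axis=-1),
            tf.argmax(nodes_logits, axis=-1))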

@peastman
Contributor

peastman commented Apr 7, 2021

I don't think that design is going to work very well. As the documentation for create_generator() says,

...by any conditional inputs. The number and shapes of its outputs must match
the return value from get_data_input_shapes(), since generated data must
have the same form as training data.

If you violate that requirement, you're going to create all sorts of problems for yourself.

Can't you do everything you need just by overriding predict_gan_generator()? Given the softmax outputs, it looks like it ought to be easy to calculate the argmax ones, and the only time you need them is when you're predicting molecules.
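(Roughly, an override along these lines, assuming the base class returns the soft softmax outputs as NumPy arrays; the base class and return type are assumptions about deepchem/models/gan.py:)

import numpy as np

from deepchem.models.gan import GAN  # the PR's model builds on WGAN from this module

class BasicMolGAN(GAN):  # illustrative subclass, not the PR's final code

  def predict_gan_generator(self, *args, **kwargs):
    # Convert the soft outputs to discrete indices only at prediction time.
    probs = super().predict_gan_generator(*args, **kwargs)
    return [np.argmax(p, axis=-1) for p in probs]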

@MiloszGrabski
Contributor Author

MiloszGrabski commented Apr 8, 2021

@peastman I was looking for a way but I do not see how.
The problem is that the real data goes through one-hot encoding, while the fake-data logits go through softmax for the discriminator, or Gumbel-softmax and then argmax for sample generation.
I cannot simply return logits, as the discriminator would then have to handle the conversion and differentiate between real and fake data.
The only option would be to reverse the softmax in order to recover logits in predict_gan_generator(). Do you know any built-in method to do so?
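(For what it's worth, taking the log of the probabilities yields usable pseudo-logits: the true logits differ from log(probs) only by a per-row additive constant, which neither argmax nor Gumbel-max sampling can see:)

import tensorflow as tf

probs = tf.constant([[0.7, 0.2, 0.1]])
pseudo_logits = tf.math.log(probs)  # softmax(log p) == p, so these behave as logits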

I have decided to go with subclassing and modifying the call method, as it makes everything clean and easy to understand.
It works quite well and should eliminate the issue.
I am cleaning up the code now.
These are samples after training (on the same set as before):
[image]

While predicting samples, I get this message the first time:
"WARNING:tensorflow:Layer SimpleMolGANGenerator is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because its dtype defaults to floatx.

If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.

To change all layers to have dtype float64 by default, call tf.keras.backend.set_floatx('float64'). To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor."
This comes from the NumPy random noise; what is the standard in DeepChem, float64 or float32?
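(NumPy draws float64 by default, while TensorFlow's default floatx is float32; casting the noise batch silences the warning. The shape below is illustrative:)

import numpy as np

noise = np.random.normal(size=(32, 100)).astype(np.float32)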

@MiloszGrabski
Contributor Author

MiloszGrabski commented Apr 8, 2021

@rbharath @peastman
It seems these failures are not connected with my PR:
FAILED deepchem/feat/tests/test_mol2vec_fingerprint.py::TestMol2VecFingerprint::test_mol2vec_fingerprint
FAILED deepchem/models/tests/test_mpnn.py::test_mpnn_classification - assert ...

Member

@rbharath rbharath left a comment

My apologies for the slow review turnaround! I think this is good to merge on my end.

Once @peastman is done with his review, we can go ahead and merge this in.


@rbharath
Member

Wanted to check in on this PR. I plan to go ahead and merge tomorrow if there are no further comments.

@MiloszGrabski
Contributor Author

Once you merge, I will make a few small tweaks and create a tutorial in JupyterLab.

@rbharath rbharath merged commit 2b136d2 into deepchem:master Apr 21, 2021
@rbharath
Member

@MiloszGrabski congrats on the merge! Major new feature added to DeepChem :). I'll be sure to give this a shout-out in the DeepChem 2.6 release notes (tentatively mid-May).

@atreyamaj atreyamaj mentioned this pull request Apr 21, 2021
@PARODBE

PARODBE commented Apr 21, 2021

@MiloszGrabski thank you for this contribution; could you create a notebook covering all the options in molgan.py and test_molgan?

Thank you!
