Added infograph model finetuning support #3491
Conversation
Force-pushed from 29b796e to bc90b17
@tonydavis629 , just want to check in - does it make sense to add finetuning support for the InfoGraph model? From my understanding, the InfoGraph paper performed two kinds of experiments:
Am I right in my understanding here? With the current setup, we can do exp 2, and I believe this code adds support for exp 1. But is there a better way to do it?
@arunppsg I believe this change is redundant; the functionality to do pretraining and finetuning is already there in InfoGraphModel and InfoGraphStarModel. The difference is that this code uses one combined InfoGraph model for finetuning and pretraining, while the current implementation uses InfoGraph for pretraining and InfoGraphStar for finetuning. The two experiments done in the paper are: 1. unsupervised mutual information maximization between a global and local encoded graph representation (with no finetuning or supervised objective), and 2. layer-by-layer mutual information maximization between a trained encoder and an untrained encoder, plus a supervised loss (the unsupervised objective + supervised objective being referred to in the paper as semi-supervised). So InfoGraphModel is reserved for the #1 unsupervised task to train the encoder. InfoGraphStarModel is to be used for #2 by loading the weights of a pretrained InfoGraph into the encoder. Your InfoGraphFinetune functionality is very similar to InfoGraphStarModel. This pretrain + finetune regime is implemented in
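The pretrain-then-finetune regime described above can be sketched with plain dicts standing in for PyTorch state_dicts. The function and key names here are illustrative only, not the actual DeepChem API: the point is that only the encoder weights transfer, while the supervised head stays untouched.

```python
# Minimal sketch of the pretrain -> finetune weight-transfer pattern,
# using plain dicts in place of PyTorch state_dicts. All names are
# hypothetical, not DeepChem's real API.

def pretrain_encoder():
    """Stand-in for unsupervised InfoGraph pretraining: returns encoder weights."""
    return {"conv1.weight": [0.1, 0.2], "conv2.weight": [0.3, 0.4]}

def load_pretrained_encoder(finetune_weights, encoder_weights):
    """Copy only the encoder entries into the finetuning model's weights,
    leaving its supervised head untouched (analogous to InfoGraphStarModel
    loading a pretrained InfoGraph encoder)."""
    for key, value in encoder_weights.items():
        finetune_weights[key] = value
    return finetune_weights

finetune_weights = {"conv1.weight": [0.0, 0.0],
                    "conv2.weight": [0.0, 0.0],
                    "head.weight": [0.5]}
finetune_weights = load_pretrained_encoder(finetune_weights, pretrain_encoder())
print(finetune_weights["conv1.weight"])  # encoder weights transferred
print(finetune_weights["head.weight"])   # supervised head untouched
```

Whether this lives in one combined model (as in this PR) or two separate models (InfoGraph + InfoGraphStar) is the design question being debated in this thread.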
I agree that InfoGraph* does semi-supervised finetuning and InfoGraph does unsupervised learning to train an encoder. From Table 2 in the paper, they have two results: results from the InfoGraph model finetuned in a supervised setting, and results from InfoGraph* finetuned via the semi-supervised learning approach. In this pull request, I am proposing an approach for finetuning the encoder in a supervised setting. If it will be useful, we can merge it in; else we can close it.
Figure 2 shows only InfoGraph*, while Figure 1 shows InfoGraph. Those two approaches were implemented with InfoGraphStarModel and InfoGraphModel. Finetuning the encoder with a supervised dataset is already possible with InfoGraphStarModel. The difference I see in your implementation is that you've combined both into a single model, which may be more convenient, but otherwise I believe it is the same functionality.
I leave it to @rbharath for final call on whether it will be useful or not to users. |
This is redundant with InfographStar, but I think it could be nice for users since our other models allow for pretraining/finetuning in the same model and it's convenient to have the same for infograph. @arunppsg Let me know once all tests are passing and this is ready for my full review |
This is ready for review and tests are passing. |
LGTM
@@ -194,8 +195,8 @@ def __init__(self,
         if device is None:
             if torch.cuda.is_available():
                 device = torch.device('cuda')
-            elif torch.backends.mps.is_available():
-                device = torch.device('mps')
+            # elif torch.backends.mps.is_available():
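For context, the fallback order in this hunk (CUDA first, then MPS, then CPU) can be sketched with plain booleans. This is a hypothetical stand-in for the torch availability checks, not the actual DeepChem code:

```python
def select_device(cuda_available: bool, mps_available: bool) -> str:
    # Mirrors the fallback order in the hunk above: prefer CUDA, then MPS
    # (Apple Silicon), otherwise CPU. With the MPS branch commented out,
    # as in this diff, MPS-only machines would fall through to 'cpu'.
    if cuda_available:
        return 'cuda'
    if mps_available:
        return 'mps'
    return 'cpu'

print(select_device(False, True))   # → mps
print(select_device(False, False))  # → cpu
```

The comment below asks for the commented-out branch to be removed entirely rather than left as dead code.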
Can you remove this before merging in? Looks like cruft
Sure, forgot to remove it. Removed.
@@ -386,6 +385,7 @@ def restore(  # type: ignore
         model_dir: Optional[str]
             The path to the model directory. If None, the model directory used to initialize the model will be used.
         """
+        logger.info('Restoring model')
These logger changes are also reflected in your other PR. I'm fine merging them in as part of this PR since they are small
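As an illustration of what the added line buys: with a standard module-level logger (the setup is assumed here, since it is not shown in the diff), the message surfaces once logging is configured. The `restore` stub below is hypothetical, not DeepChem's actual method:

```python
import logging

# Module-level logger, the conventional Python pattern assumed by the diff.
logger = logging.getLogger(__name__)

def restore(model_dir=None):
    # Hypothetical stand-in for the restore() method in the hunk above.
    logger.info('Restoring model')
    return model_dir or 'default_model_dir'

# Without some handler configuration, INFO messages are dropped by default.
logging.basicConfig(level=logging.INFO)
restore()  # emits an INFO "Restoring model" record
```

Because `logging.getLogger(__name__)` is a no-op until the application configures a handler, small `logger.info` additions like this are low-risk to merge.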
Description
Added support for infograph model finetuning task.
Type of change
Please check the option that is related to your PR.
Checklist
- I have run `yapf -i <modified file>` and checked no errors (yapf version must be 0.32.0)
- I have run `mypy -p deepchem` and checked no errors
- I have run `flake8 <modified file> --count` and checked no errors
- I have run `python -m doctest <modified file>` and checked no errors