V2: improve CLI for multitask #647

hwpang · 2024-02-13T14:11:30Z

Description

Fix some small bugs related to multitask training and prediction.
Add a test dataset for single-molecule multitask regression.
Add a CLI test for multitask training and prediction.
Improve saving predictions for:
- multi-task
- multi-task MVE model -> part of v2.1; won't be supported in v2.0 for now.
- multi-task multiclass -> looking for an example dataset to test this out. -> addressed in V2: Add dataset and tests for multiclass classification #672.

While implementing the CLI test for the multi-task MVE model, I encountered bugs related to MVE that are outside the scope of this PR. I will open another PR to fix them. Edit: single-task MVE is fixed in #664.

Question

Do we have an example multi-class dataset?

Checklist

linted with flake8?
(if appropriate) unit tests added?

KnathanM · 2024-02-28T22:43:48Z

I now see that the tests are failing.

In tests/cli/test_cli_regression_mol.py, test_train_quick is defined twice. The second definition uses regression-mve, which we are moving to v2.1, so I think you can just delete the second test_train_quick to fix it.

The second error seems a bit more difficult. In tests/cli/test_cli_regression_mol_multitask.py you try to load a multitask model but the model seems to have loc and scale initialized as 1x1 torch tensors, so overwriting them with the 1x12 tensors in the model file throws an error. Maybe the model's ntasks attribute needs to be set sooner so the model will be initialized with a 1x12 loc and scale?
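For illustration, the failure mode described above can be reproduced with a minimal PyTorch sketch. `ToyPredictor` and its `n_tasks` argument are hypothetical stand-ins, not chemprop's actual classes; the sketch only assumes that `loc` and `scale` are stored in the model's state dict, here as registered buffers.

```python
import torch
from torch import nn


class ToyPredictor(nn.Module):
    """Hypothetical stand-in for a regression head (not chemprop's class)."""

    def __init__(self, n_tasks: int = 1):
        super().__init__()
        # Buffers are saved in the state dict, so their shapes must match on load.
        self.register_buffer("loc", torch.zeros(1, n_tasks))
        self.register_buffer("scale", torch.ones(1, n_tasks))


# A "trained" multitask model with 12 tasks, and its saved state.
trained = ToyPredictor(n_tasks=12)
state = trained.state_dict()

# Loading into a model built with the default 1 x 1 buffers fails.
try:
    ToyPredictor().load_state_dict(state)
    mismatch_error = None
except RuntimeError as err:
    mismatch_error = err

# Building the model with the right number of tasks first makes the load succeed.
restored = ToyPredictor(n_tasks=12)
restored.load_state_dict(state)
```

In this sketch the shapes are fixed at construction time, which is why the number of tasks has to be known before the state dict is loaded.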

hwpang · 2024-02-28T23:14:04Z

I took a look into this. The error is the same one I saw previously: when loc is initialized as a 1 x 1 tensor but the trained model's loc is 1 x n_tasks, loading fails with this error.

I found that the isinstance(loc, float) type check is not working as expected. For some reason, the default loc is recognized as an integer. Changing loc to 0.0 doesn't work. I can change the type check to check for int instead, but that could be confusing.
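As a side note on the type check, here is a small plain-Python sketch of how `isinstance(..., float)` treats integer and float literals. The helper `is_scalar_stat` is hypothetical and only shows one possible way to accept both kinds of default; it is not what chemprop does.

```python
import numbers

# An integer literal is not a float, so a default such as loc=0 fails a float check.
int_default_is_float = isinstance(0, float)      # False
# A literal with a decimal point is a float.
float_default_is_float = isinstance(0.0, float)  # True


def is_scalar_stat(value) -> bool:
    """Hypothetical helper: True for a plain int or float, False otherwise.

    bool is excluded explicitly because it is a subclass of int.
    """
    return isinstance(value, numbers.Real) and not isinstance(value, bool)


print(is_scalar_stat(0), is_scalar_stat(1.0), is_scalar_stat([0.0]))
```

Checking against `numbers.Real` avoids having to choose between `int` and `float` in the check, at the cost of a slightly less obvious condition.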

hwpang added 5 commits February 12, 2024 21:18

Initializing tensor for loc and scale required for loading model

7299ffc

dimension should be 1 x number of tasks

4f12a18

Add test data for regression mol multitask

89d15d0

Add example model for cli test

f3dc05b

Add cli test for regression mol multitask

87eb0ad

hwpang added this to the v2.0.0 milestone Feb 13, 2024

hwpang linked an issue Feb 13, 2024 that may be closed by this pull request

[BUG]: v2 predict CLI does not work with multitask output - multitask MVE #642

Open

hwpang added 10 commits February 13, 2024 19:11

Add target columns as predict input to use as column header

70415f8

Merge branch 'v2/dev' into v2/cli/multitask

0cc1235

Also test case without target columns

5c9efb6

Add multitask mve cli test

0ad745c

Scale all regression task (including regression-mve, etc)

19dcd32

Add test to train mve

c4e8079

Add test for training regression-mve model

5905b5b

Scale all regression task, use scaler=None if not

3f4957c

Evaluate on the mean, not var

660cdfc

Merge branch 'v2/mve' into v2/cli/multitask

61a38fb

hwpang mentioned this pull request Feb 23, 2024

V2: Fixing MVE regression and adding tests #664

Merged

2 tasks

Merge branch 'v2/dev' into v2/cli/multitask

63d3741

hwpang marked this pull request as ready for review February 23, 2024 21:31

hwpang added 5 commits February 27, 2024 13:56

Remove unnecessary to list

d5d4f32

Remove mve for now as it is not in the goal of v2.0

647cbb5

Remove mve

9d2ea14

Formatting

ebb5e29

Fix dimension

6035d6d

KnathanM reviewed Feb 27, 2024

chemprop/models/model.py (outdated, resolved)

chemprop/nn/predictors.py (outdated, resolved)

hwpang added 3 commits February 28, 2024 16:45

Remove changes related to MVE as it's for v2.1

61f9c14

Change type check

867b0fb

Merge branch 'v2/dev' into v2/cli/multitask

0477271

Remove mve related changes

95154aa

hwpang added 7 commits February 28, 2024 18:14

Type check integer

bed4d46

Merge branch 'v2/dev' into v2/cli/multitask

d669910

Formatting

0a74476

Remove merge artifects

8bc682b

Update example model file

7f91eac

Use float in type check

b345795

loc and scale need . for python to recognize them as float

c85a814

KnathanM reviewed Feb 29, 2024

chemprop/models/model.py (outdated, resolved)

hwpang and others added 2 commits February 29, 2024 16:20

Remove unused MveFFN

d819f36

Co-authored-by: Nathan Morgan <nate.k.morgan@gmail.com>

Merge branch 'v2/dev' into v2/cli/multitask

1027006

KnathanM approved these changes Feb 29, 2024

KnathanM merged commit 34872d4 into chemprop:v2/dev Feb 29, 2024
2 of 3 checks passed

hwpang deleted the v2/cli/multitask branch February 29, 2024 21:47

hwpang mentioned this pull request Feb 29, 2024

[BUG]: v2 predict CLI does not work with multitask output - multitask MVE #642

Open
