Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V2: improve CLI for multitask #647

Merged
merged 34 commits into from Feb 29, 2024
Merged

Conversation

hwpang
Copy link
Contributor

@hwpang hwpang commented Feb 13, 2024

Description

  • Fix some small bugs related to multitask training and prediction.
  • Add a test dataset for single molecule multitask regression
  • Add a CLI test for training and predicting multitask
  • Improve saving predictions for

While implementing the cli test for multi-task MVE model, I have encountered bugs related to MVE outside the scope of this PR. I will make another PR to fix it. Edit: single-task MVE is fixed in #664.

Question

  • Do we have an example multi-class dataset?

Checklist

  • linted with flake8?
  • (if appropriate) unit tests added?

@hwpang hwpang added this to the v2.0.0 milestone Feb 13, 2024
@hwpang hwpang marked this pull request as ready for review February 23, 2024 21:31
chemprop/models/model.py Outdated Show resolved Hide resolved
chemprop/nn/predictors.py Outdated Show resolved Hide resolved
@KnathanM
Copy link
Contributor

I now see that the test are failing.

In tests/cli/test_cli_regression_mol.py you have test_train_quick defined twice. The second one uses regression-mve which we are moving to v2.1 so I think you can just delete the second test_train_quick to fix it.

The second error seems a bit more difficult. In tests/cli/test_cli_regression_mol_multitask.py you try to load a multitask model but the model seems to have loc and scale initialized as 1x1 torch tensors, so overwriting them with the 1x12 tensors in the model file throws an error. Maybe the model's ntasks attribute needs to be set sooner so the model will be initialized with a 1x12 loc and scale?

@hwpang
Copy link
Contributor Author

hwpang commented Feb 28, 2024

I took a look into this. The error is the same that I saw previously: when loc is initialized to be a 1 by 1 tensor but the trained model is 1 by n_tasks, it has this error.

I found that the isinstance(loc, float) type check is not working as expected. For some reason, the default loc is recognized as an integer. Changing loc to 0.0 doesn't work. I can change the type check to check for int instead, but it can be confusing.

chemprop/models/model.py Outdated Show resolved Hide resolved
hwpang and others added 2 commits February 29, 2024 16:20
Co-authored-by: Nathan Morgan <nate.k.morgan@gmail.com>
@KnathanM KnathanM merged commit 34872d4 into chemprop:v2/dev Feb 29, 2024
2 of 3 checks passed
@hwpang hwpang deleted the v2/cli/multitask branch February 29, 2024 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG]: v2 predict CLI does not work with multitask output - multitask MVE
2 participants