V2: consolidate --checkpoint CLI #731

Merged on Jun 13, 2024 (9 commits)

@hwpang hwpang commented Mar 14, 2024

My PR #667 was closed automatically when I changed my fork's source, and I couldn't reopen it, so I am opening this one to continue that work.

@kevingreenman kevingreenman added this to the v2.0.0 milestone Mar 14, 2024
@hwpang hwpang modified the milestones: v2.0.0, v2.1.0 Mar 19, 2024

hwpang commented Mar 19, 2024

Moving to v2.1 because this is more related to uncertainty estimation

@KnathanM

Can you check if --checkpoint handles the three cases being removed? We think this involves using nargs="+" and then iterating through the list of file and directory paths to collect all the model files.
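For reference, a minimal sketch of what nargs="+" gives you, using only the standard library. The parser below is a hypothetical stand-in, not chemprop's actual CLI; only the flag name and nargs="+" come from the comment above, and type=Path is an assumption made for convenience.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in parser for illustration; not chemprop's real CLI.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--checkpoint",
        nargs="+",   # one or more values are gathered into a single list
        type=Path,   # assumption: convert each value to a pathlib.Path
        help="One or more model files and/or directories containing them.",
    )
    return parser


# Files and directories can be mixed freely in one invocation.
args = build_parser().parse_args(["--checkpoint", "a.pt", "models", "b.ckpt"])
for path in args.checkpoint:
    print(path, "(suffix:", repr(path.suffix) + ")")
```

Each entry can then be inspected in a loop, which is where the file-versus-directory handling would go.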

@hwpang hwpang requested a review from KnathanM June 11, 2024 20:15

hwpang commented Jun 11, 2024

@KnathanM I have realized that all of these arguments should have been overridden by --model-path in predict.py. I have removed the stale --checkpoint, --checkpoint-path, --checkpoint-dir, and --checkpoint-paths arguments, and made the changes discussed in the chemprop dev meeting. Please review, thanks!

@KnathanM KnathanM left a comment

One small comment and then looks good to go!

chemprop/cli/predict.py (outdated, resolved)
Co-authored-by: Nathan Morgan <nate.k.morgan@gmail.com>

hwpang commented Jun 12, 2024

@KnathanM Thanks, fixed!

Comment on lines 173 to 179
    for model_path in model_paths:
        if model_path.suffix in [".ckpt", ".pt"]:
            collected_model_paths.append(model_path)
        elif model_path.is_dir():
            collected_model_paths.extend(
                list(model_path.rglob("*.ckpt")) + list(model_path.rglob("*.pt"))
            )
Contributor

I don't think it's a good idea to iterate through all the .ckpt and .pt files in a directory. Maybe we should only use the .pt file. When you train a model, you would have a best.pt and two .ckpt files in the checkpoint directory.

Contributor

Hmm, that is a good point. I think we shouldn't look for .ckpt files in a directory, but can still accept it if it is passed as a file path. The checkpoint files are primarily intended for restarting training, so it is reasonable to require checkpoint files to be explicitly passed for prediction (--model-path mymodel.ckpt) if that is intended instead of finding them implicitly in a directory. This way a user can do both:

SAVE_DIR=mydir
chemprop train -o $SAVE_DIR -i data
chemprop predict --model-path $SAVE_DIR -i data

(which will find all the "best.pt" files) and

chemprop predict -i data --model-path *.ckpt

(which will find all the .ckpt files in a directory, but note that they are all passed as file paths and not as the path to a directory).

Contributor Author

That's a good point. I have modified it to search only for .pt files when a directory is provided.
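A rough sketch of the behavior agreed on in this thread, for anyone reading later. It is not the merged code: the function name collect_model_paths is hypothetical, and sorting the directory results is an extra assumption added only to make the order deterministic.

```python
from pathlib import Path
from typing import Iterable, List


def collect_model_paths(model_paths: Iterable[Path]) -> List[Path]:
    """Hypothetical helper illustrating the agreed rule; not chemprop's API."""
    collected: List[Path] = []
    for model_path in model_paths:
        if model_path.suffix in (".ckpt", ".pt"):
            # Explicitly passed files are accepted for either suffix.
            collected.append(model_path)
        elif model_path.is_dir():
            # Directories are searched recursively, but only for .pt files,
            # so training checkpoints (.ckpt) are never picked up implicitly.
            # Sorting is an assumption here, used only for a stable order.
            collected.extend(sorted(model_path.rglob("*.pt")))
    return collected
```

With this rule, passing a training output directory yields only the .pt files under it, while a .ckpt file is used only when it is named directly on the command line.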

@shihchengli shihchengli left a comment

LGTM

@KnathanM KnathanM merged commit 866074c into chemprop:main Jun 13, 2024
13 checks passed
4 participants