Simplify examples #456
Conversation
Hi @RasmusOrsoe, Thanks for this! Some initial thoughts before I go into review-mode: As I see it, this PR does two things.
The second point I totally get — it fixes a known and well-defined issue that we have discussed and agree is worth solving. 🚀 The first point I am not sure I understand the need for. If the point is to disable W&B, I figure we could just add a
Hey @asogaard I definitely think there is a need for some resource that provides a simple point of entry into the code base in as few lines as possible. I'd like the example script to default to no wandb (not because of personal feelings!) because a number of people have had trouble running the example scripts because of wandb. It was actually caught on a live stream here. I hard-coded the configs because the example scripts (our only point of entry currently) hide these away in default arguments and don't really provide a transparent view of what is actually being configured. Perhaps the conclusion is that an actual written tutorial is a better format than an example script. So we could proceed as:
What do you think @asogaard?
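The "default to no wandb" idea discussed above could be implemented as an opt-in CLI flag. A minimal sketch, assuming an argparse-based script; the flag name `--wandb` is a hypothetical choice, not the actual graphnet CLI:

```python
import argparse

# Hypothetical sketch (not the actual graphnet CLI): make W&B logging
# opt-in via a --wandb flag that defaults to off, so new users never
# hit wandb setup errors when running the example script as-is.
parser = argparse.ArgumentParser(description="Train a model")
parser.add_argument(
    "--wandb",
    action="store_true",  # False unless --wandb is passed explicitly
    help="Enable Weights & Biases logging (disabled by default).",
)

# Default invocation: wandb stays disabled.
default_args = parser.parse_args([])
# Users who want experiment tracking opt in explicitly.
opted_in_args = parser.parse_args(["--wandb"])
```

The training script would then only construct its wandb logger inside an `if args.wandb:` branch.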
I completely agree that our main training example should be as simple as possible, but no simpler! 😊 So any ideas for making them a more friendly point of entry are more than welcome! I agree with your proposed solution, but please do feel free to make wandb opt-in (or should I make a separate PR for that?). Regarding the config files: all default values are described when you run

```
$ python examples/04_training/01_train_model.py --help
usage: 01_train_model.py [-h] [--gpus GPUS [GPUS ...]] [--max-epochs MAX_EPOCHS] [--early-stopping-patience EARLY_STOPPING_PATIENCE]
                         [--batch-size BATCH_SIZE] [--num-workers NUM_WORKERS] [--dataset-config DATASET_CONFIG] [--model-config MODEL_CONFIG]
                         [--prediction-names PREDICTION_NAMES [PREDICTION_NAMES ...]] [--suffix SUFFIX]
(...)
  --dataset-config DATASET_CONFIG
                        Path to dataset config file (default: {CONFIG_DIR}/datasets/training_example_data_sqlite.yml)
  --model-config MODEL_CONFIG
                        Path to model config file (default: {CONFIG_DIR}/models/example_energy_reconstruction_model.yml)
```

and similar for all example scripts. This is mentioned here, but if people are not familiar with the standard Python CLI, I think this would be good to point out in a tutorial or similar!
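The "(default: ...)" annotations in that help output come from argparse itself when a defaults-aware formatter is used. A minimal sketch of the pattern, with an illustrative placeholder path rather than graphnet's actual `{CONFIG_DIR}` value:

```python
import argparse

# Sketch of how argparse surfaces default values in --help output.
# ArgumentDefaultsHelpFormatter appends "(default: ...)" to each
# argument's help string automatically.
parser = argparse.ArgumentParser(
    prog="01_train_model.py",
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
parser.add_argument(
    "--dataset-config",
    default="configs/datasets/training_example_data_sqlite.yml",  # placeholder path
    help="Path to dataset config file",
)

# The rendered help text now includes the default path.
help_text = parser.format_help()
```

So the defaults are discoverable without opening the script, provided the reader knows to run it with `--help`.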
@asogaard OK - I think that's it.
Some suggestions. :)
Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>
OK. I think this is it @asogaard
I have added a few comments that I think might have been missed in the last batch, @RasmusOrsoe. :)
@asogaard Could we maybe go through these comments on Zoom sometime?
@RasmusOrsoe Absolutely!
@asogaard I think this is it. However, now I see that the data conversion unit tests fail even though no changes have been made to that part of the repo. I've scrolled through the error messages and I'm a bit puzzled. Do you have any ideas what could be causing this?
It's probably due to the pandas version problem I mention in the other PR. We'll rerun the tests once that is merged.
Try merging |
Thanks so much for your hard work here, @RasmusOrsoe! 🙏 I think this looks great, and is ready to merge. I have added a few minor suggestions (one, regarding `warn_once`, is actually a typo, probably mine, so at least that one should probably be added) that you can consider before hitting the big, green button.
Simplify examples
This PR attempts to introduce a simplified training script (`01_train_model_simple.py`) where we showcase training syntax with minimal lines of code. What I did to simplify was:

- `target_labels` and `output_labels` for all tasks, making these available under `model.output_labels` and `model.target_labels` (closes Make prediction column names self-contained #175)

These changes have brought down the training example from 183 to 68 lines.

In addition, I've made the default values for `target_labels` and `output_labels` more visible in `Task`s, and written them as `@property` such that we have to supply these default labels for tasks in the future. I think this makes it a bit easier to grasp what a given `Task` requires and returns.

I have left the other training scripts untouched but renamed them to make the distinction easier.

README has been updated.
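The `@property` pattern described above could look roughly like the sketch below. All class and label names here are illustrative, not graphnet's actual ones; the point is that abstract properties force every future `Task` subclass to declare its default labels up front, where a reader can see them:

```python
from abc import ABC, abstractmethod
from typing import List

class Task(ABC):
    """Illustrative base class; forces subclasses to declare default labels."""

    @property
    @abstractmethod
    def default_target_labels(self) -> List[str]:
        """Labels the task trains against."""

    @property
    @abstractmethod
    def default_prediction_labels(self) -> List[str]:
        """Labels the task attaches to its predictions."""

class EnergyReconstruction(Task):
    """Hypothetical concrete task with its defaults stated explicitly."""

    @property
    def default_target_labels(self) -> List[str]:
        return ["energy"]

    @property
    def default_prediction_labels(self) -> List[str]:
        return ["energy_pred"]

task = EnergyReconstruction()
```

Because the base-class properties are abstract, forgetting to define them in a new subclass raises a `TypeError` at instantiation rather than failing silently at prediction time.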