Test QuickStart #45

Open
NoraLoose opened this issue Aug 21, 2023 · 4 comments

@NoraLoose
Contributor

No description provided.

@cisaacstern
Contributor

Step 3 of the Quickstart reads:

ClimSim/README.md

Lines 81 to 83 in f94b862

**Step 3**
Train your model on the training data and validate using the validation data. If you wish to use something like a CNN, you will probably want to separate the variables into channels and broadcast scalars into vectors of the same dimension as vertically-resolved variables. Methods to do this can be found in the [climsim_utils/data_utils.py](https://github.com/leap-stc/ClimSim/blob/main/climsim_utils/data_utils.py) script.

Is it reasonable to expect that our target audience will know how to do this? As someone who has never trained an ML model before, I personally don't know where to begin, but I don't think I'm the target audience.

(Alternatively, if it were to be catered to a more beginner audience, the Quickstart Step 3 could simply describe how to replicate the paper results by running the baseline_models/ (or one of them) against the subsampled training data?)

If the expectation is that the target Quickstart audience knows how to train a model against this data without further instruction, then I'll just have to recuse myself from being a tester for this section until I know how to do that (which probably won't be today)! 😄
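For readers unsure what the reshaping in the quoted Step 3 amounts to, here is a minimal NumPy sketch of "separate the variables into channels and broadcast scalars into vectors". It is not the code in `climsim_utils/data_utils.py`; the function name `stack_channels`, the array shapes, and the choice of 60 vertical levels are all assumptions made for illustration.

```python
import numpy as np


def stack_channels(profiles, scalars):
    """Hypothetical helper, not part of climsim_utils.

    profiles: (n_samples, n_profile_vars, n_levels) vertically resolved variables
    scalars:  (n_samples, n_scalar_vars) one value per sample
    returns:  (n_samples, n_profile_vars + n_scalar_vars, n_levels)
    """
    n_samples, _, n_levels = profiles.shape
    # Add a trailing level axis and repeat each scalar across all levels.
    scalar_channels = np.broadcast_to(
        scalars[:, :, None], (n_samples, scalars.shape[1], n_levels)
    )
    # Stack along the channel axis so a 1D CNN sees one channel per variable.
    return np.concatenate([profiles, scalar_channels], axis=1)


# Illustrative shapes only: 8 samples, 2 profile variables, 3 scalars, 60 levels.
rng = np.random.default_rng(0)
profiles = rng.normal(size=(8, 2, 60))
scalars = rng.normal(size=(8, 3))

x = stack_channels(profiles, scalars)
print(x.shape)  # (8, 5, 60)
```

The result uses a channels-first layout, `(samples, channels, levels)`; a framework that expects channels last would need the last two axes swapped.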

@NoraLoose
Contributor Author

I agree. We cannot expect our audience to know how to train a model on this new dataset. I train ML models on other datasets, but the dataset makes a big difference to the workflow. I consider myself part of the target audience, yet I would not know how to quickly train such a model. (In fact, I would expect this to take days to weeks if I had to do it from scratch.)

> (Alternatively, if it were to be catered to a more beginner audience, the Quickstart Step 3 could simply describe how to replicate the paper results by running the baseline_models/ (or one of them) against the subsampled training data?)

Yes, this is a good suggestion.

@jerrylin96
Collaborator

I think the quickstart is in a condition ready for testing.

@NoraLoose
Contributor Author

@jerrylin96 did you close this issue on purpose? I have not completed the testing of the quickstart. In fact, I was waiting on instructions, see #55 (comment).

NoraLoose reopened this Aug 25, 2023