WIP: Implement bootstrap #107
Conversation
This modification helps with generating the bootstrap compute arguments. The default, however, is to return a flat list, which is the same behavior as before this modification.
This implementation allows the use of the `scipy.optimize.least_squares` function when using `_WrapperCalculator`.
These updates were made to help run bootstrap sampling for the neural network model.
Instead of sampling from each calculator independently, we combine all the compute arguments and sample from the combined list. Then, we split the sample back into the respective calculators.
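The combine-then-split step can be sketched as follows. This is a minimal illustration, not the actual kliff API; `bootstrap_cas` and the tagging scheme are assumptions for the sake of the example.

```python
import random

# Hypothetical sketch: combine compute arguments from all calculators,
# resample with replacement, then split the sample back per calculator.
def bootstrap_cas(cas_by_calc, seed=0):
    rng = random.Random(seed)
    # Tag each compute argument with its calculator index before combining.
    combined = [(i, ca) for i, cas in enumerate(cas_by_calc) for ca in cas]
    # Sample with replacement from the combined list.
    sample = [rng.choice(combined) for _ in range(len(combined))]
    # Split the sample back into the respective calculators using the tag.
    split = [[] for _ in cas_by_calc]
    for i, ca in sample:
        split[i].append(ca)
    return split
```

Tagging each compute argument with its calculator index before combining is what makes the split unambiguous afterwards, even when identifiers collide across calculators.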
We want to cache the initial guess set prior to training because, in the bootstrap, we want to start the training NOT from the last optimal parameters. If we used the last optimal values, the result might be biased by having seen the entire dataset. We want to treat each step of the bootstrap sampling as though the bootstrap sample of compute arguments were the only data we have.
The default is to use the cached initial parameter guess.
For the NN model, we use the `reset_parameters` method implemented in each layer module. For the empirical model, I removed the option to supply a custom function for the initial guess. This makes the class behave more similarly to the bootstrap NN class. However, it can be reinstated in the future if needed.
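For the empirical model, the caching idea can be sketched like this. The class and attribute names here are assumptions for illustration, not the actual implementation.

```python
import copy

# Hypothetical sketch: freeze the pre-training parameter values once,
# then restore them before every bootstrap iteration so training never
# starts from the previous optimum.
class InitialGuessCache:
    def __init__(self, params):
        # Deep-copy so later optimization cannot mutate the cache.
        self._cache = copy.deepcopy(params)

    def reset(self, model_params):
        # Overwrite the (possibly trained) parameters in place.
        for key, value in self._cache.items():
            model_params[key] = copy.deepcopy(value)
        return model_params
```

Deep-copying matters here: keeping a reference to the live parameter objects would let training silently overwrite the "cached" initial guess.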
When the default bootstrap cas generator is used with `_WrapperCalculator`, the number of configurations for each calculator might differ from the original list. As such, the old list of residual functions might not be appropriate for some configurations. For example, if we reused the original list of residual functions, we might apply `forces_residual` to a bootstrap configuration that only computes energy.
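One way to avoid the mismatch is to rebuild the residual-function list per bootstrap sample. The attribute and function names below are assumptions chosen for illustration, not the actual kliff API.

```python
# Hypothetical sketch: assign a residual function to each bootstrap
# compute argument based on what it actually computes, so that a
# configuration computing only energy never receives forces_residual.
def assign_residuals(cas, energy_residual, forces_residual):
    return [
        forces_residual if getattr(ca, "use_forces", False) else energy_residual
        for ca in cas
    ]
```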
Note that this is still a very rough draft, though functional.
thanks @yonatank93! Let me know when ready and I'll take a look.
I will, @mjwen. Sorry that it is still very premature. By the way, the bootstrap example I currently have reflects the workflow I intend users to follow: get an ensemble of parameters first, then propagate the error to some other predictions; in other words, the set of parameters is fixed. This answers your previous email.
@yonatank93 totally ok! I just wanted to make sure I get notified when it is ready.
…le calculators Previously, if the cas in separate calculators had the same identifier, there was a problem. Fix this by appending information about which calculator each ca comes from.
Note that we haven't tested the case when we have models for multiple elements.
I think this version works with `CalculatorTorchSeparateSpecies`.
* Create a parent class for the 2 bootstrap classes, because they share many methods.
* Create a wrapper class to automatically pick between the 2 bootstrap classes depending on the type of loss function.
* Update the test
* Add typing
@mjwen The bootstrap implementation is ready for you to look at. I haven't added anything in
The callback function can also break the loop inside the run method. It can additionally be used to monitor the convergence of the optimization in each iteration, etc.
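The callback mechanism can be sketched as below. The `run`/`step` names and the convention that returning `True` stops the loop are assumptions for illustration, not necessarily the actual kliff interface.

```python
# Hypothetical sketch: the callback is invoked after every bootstrap
# iteration; returning True requests early termination, and the callback
# can also simply record convergence information as a side effect.
def run(nsamples, step, callback=None):
    results = []
    for i in range(nsamples):
        results.append(step(i))
        if callback is not None and callback(i, results[-1]):
            break  # callback requested early termination
    return results
```

For example, `run(10, train_step, callback=lambda i, r: i >= 2)` would stop after three iterations.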
* Typing
* Fix documentation
I added an argument to specify how many bootstrap compute arguments to generate for each sample. Generally, we want each sample to have the same number of compute arguments as the original list. However, with this option there might be interesting studies to do, e.g., using fewer compute arguments in each sample.
* Previously, there was a problem when getting the parameters if we use GPU.
* I changed it so that if we set `flat=False` when getting parameters, it returns a list of `torch.nn.Parameter` instead of `torch.Tensor`.
Previously, I changed to use `torch.nn.Parameter` and updated the parameters using something like

```python
for param in model.parameters():
    param = <torch.nn.Parameter>
```

However, this doesn't update the parameters in the model, since rebinding the loop variable has no effect on the module. So, I reverted to updating the parameters via

```python
for param in model.parameters():
    param.data = <torch.Tensor>
```
The tests include:
* Test for the function to retrieve the size of the parameters.
* Test that the parameter values are updated accordingly.
* Test that changing parameters leads to a change in predictions.
optimizer settings
There was an issue where the old way of computing sigma raised a `VisibleDeprecationWarning` for some Python versions and an error for Python > 3.9. I think this was because we passed a ragged list of floats and arrays to `np.linalg.norm` to compute the norm of the value corresponding to each data point.
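A minimal reproduction of the ragged-list problem and one way around it, assuming (as described above) that the old code mixed scalars and arrays in a single list passed to NumPy:

```python
import numpy as np

# One scalar entry and one array entry, as the old sigma computation is
# assumed to have produced.  Newer NumPy rejects such ragged input, so
# flatten every entry to a 1-D array and concatenate before taking the norm.
residuals = [1.0, np.array([2.0, 3.0])]
flat = np.concatenate([np.atleast_1d(r) for r in residuals])
norm = np.linalg.norm(flat)  # sqrt(1 + 4 + 9)
```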
…liff into implement_bootstrap
@mjwen I applied what we just discussed. Now I see that adding the argument I also ran pre-commit and built the documentation, but you might want to double-check it. It is ready for you to review again.
We have CI checking the formatting, using the same pre-commit configs, so as long as it passes the checks, it is good. Everything is good now. I've merged it. Thanks!
Thanks for your help. |
We want to implement a bootstrap method of UQ. In general, we do this by taking the list of compute arguments and sampling from it with replacement. Then, we train the model using this sample of compute arguments. The optimal parameters give one point in the ensemble.
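The overall loop can be summarized as follows. This is a sketch under assumed names (`train` and `cas` are placeholders, not the actual kliff API).

```python
import random

# Minimal sketch of the bootstrap UQ loop described above: resample the
# compute arguments with replacement, retrain, and collect the optimal
# parameters as one point of the ensemble.
def bootstrap_ensemble(cas, train, nsamples, seed=0):
    rng = random.Random(seed)
    ensemble = []
    for _ in range(nsamples):
        sample = [rng.choice(cas) for _ in range(len(cas))]
        ensemble.append(train(sample))  # optimal params for this sample
    return ensemble
```

The spread of the resulting ensemble is then what gets propagated to other predictions as the uncertainty estimate.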
TODO: