Unified Parameter class #3
@mjwen @yonatank93
I believe the easiest way is to create a new branch to implement the Parameter, and make a PR to whatever branch has the old Parameter class. Then we can compare them. Of course, this depends on how significant the change is. Installing experimental versions should not be a problem for @yonatank93 and me.
Sorry, I still don't have a clear vision about it. So, I would suggest that @ipcamit start implementing the Parameter, and I can try it and give feedback. I don't have a problem installing the experimental version.
I would say the changes can be really significant; that is why I asked specifically. If you have no issues installing an experimental test version, then the best approach would be to clone an experimental branch and do `pip install -e` in a new conda env. I will push the branch soon and let you know, along with install info and a detailed explanation and reasoning for the proposed changes.
Problem: we have two separate kinds of parameters.
Thanks @ipcamit. This is great! Yes, I totally agree that transforms on parameters should be implemented within the Parameter class. I love the general idea of combining them, and we can just leave some fields empty when using them as a container for KIM parameters. I haven't run the example, but only looked at the implementation of the new Parameter class. I have two main questions. First, any idea on how to handle the case where, for a parameter under the same name, some components are fixed and others are allowed to optimize? To be more specific, if we use the SW model for a system with two species Si and C, then a parameter such as A is an array with one component per species pair. I guess the major distinction between the original ... Second, the use of ...
How is that currently dealt with in kimpy internally? Does it return an array, or three parameters with the same name?
This is also a viable solution, but I think it would be more confusing than the way KLIFF currently presents these parameters (as a single array). I think I can just modify the ... Also, for array params like that, we can set a separate ...
Ah! I misunderstood it then. I thought the Parameter class was keeping the state consistent. Honestly, passing it to scipy makes it even simpler. One of the biggest sources of headache was keeping bounds consistent in the transformed and inverse spaces. If bounds are just containers of numbers to be passed to scipy, great!
kimpy always returns a 1D array of floats/ints, even when its size is 1. Which components to optimize and which not are kept track of by the OptimizingParameters container (including their bounds and other stuff). Using bounds to set them is not doable; the major reason is that not all optimizers support the use of bounds. Then we would need to use the `set_array_opt` approach.
Yes, it is easier. The problem is how to communicate between an optimizer and the parameters. Previously, all of this was wrapped in the OptimizingParameters container. Now, since we are removing this class, then besides ...
Also, I forgot that you mentioned the MCMC example fails. Are you using parameter transformation? If yes, could it be that you forgot to do the reverse transform?
Yes, it's NaN due to a negative log in the likelihood function. Need to dig a bit deeper to ensure all parameters are transformed properly. Let me think more about the array optimization. I would like to keep the option of adding bounds to the model as a last resort; I will give first chance to some boolean mask. Something like:

```python
model.set_opt_params(["A"])
param = model.parameters()
param["A"].opt_mask_([True, False, True])
param["A"].set_transform_(f)
# two True mask entries, so bounds must be a 2x2 array
# (in general an n x 2 list/array for n True mask entries)
param["A"].bound_([[1., 2.], [10., np.inf]])
param["A"].finalize_()
```

That way all the information is still localized in Parameters.
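To make the mask-plus-bounds idea above concrete, here is a minimal standalone sketch (the dict layout and `flatten_for_scipy` helper are hypothetical, not the actual KLIFF API) of how masked components and their bounds could be collected into the flat `x0`/`bounds` arguments that `scipy.optimize.minimize` expects:

```python
import numpy as np

# Hypothetical container state mirroring the proposal above:
# one bounds row per True entry in the mask.
params = {
    "A": {
        "value": np.array([15.28, 15.28, 15.28]),
        "mask": np.array([True, False, True]),
        "bounds": np.array([[1.0, 2.0], [10.0, np.inf]]),
    },
}

def flatten_for_scipy(params):
    """Collect masked components and their bounds into a flat vector
    and a list of (low, high) pairs, as scipy.optimize.minimize expects."""
    x0, bounds = [], []
    for p in params.values():
        x0.append(p["value"][p["mask"]])
        bounds.extend(map(tuple, p["bounds"]))
    return np.concatenate(x0), bounds

x0, bounds = flatten_for_scipy(params)
# x0 has one entry per True mask element; bounds has one (low, high) pair each
```

With this layout the Parameter objects stay the single source of truth, and the optimizer only ever sees plain numbers.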
Looks good. Keep us posted once you have new proposals.
@mjwen @yonatank93 Updated API and fixed issues:
Todo:
Examples of new workflow:
```python
from kliff.dataset import Dataset
from kliff.models import KIMModel
from kliff.calculators import Calculator
from kliff.transforms import ParameterTransform
import numpy as np
from kliff.loss import Loss

# Load dataset
ds = Dataset("Si_training_set_4_configs")
# Load model
model = KIMModel("SW_StillingerWeber_1985_Si__MO_405512056662_006")
# model params
model.get_model_params()
# set params to optimize
model.set_opt_params(["A", "B"])
params = model.parameters()

# Set parameter transform
class CustomLogTransform(ParameterTransform):
    def __init__(self, base):
        super().__init__("Base" + str(base))
        self.base = base

    def transform(self, x):
        return np.emath.logn(self.base, x)

    def inverse(self, x):
        return self.base ** x

base10transform = CustomLogTransform(10.0)

# set to transformed value
params["A"].copy_(5.0)
# add bounds for optimization in transformed space
params["A"].add_bounds_(np.array([[0.1, 10]]))
params["B"].add_bounds_(np.array([[-np.inf, -0.1]]))
# set transformation class
params["A"].add_transform_(base10transform)
params["B"].add_transform_(base10transform)
# finalize model params
params["A"].finalize_()
params["B"].finalize_()

print(model.get_opt_params())  # should print ['A', 'B']

# Set calculator
calc = Calculator(model)
_ = calc.create(ds.get_configs())

# optimize
steps = 100
loss = Loss(calc, nprocs=2)
loss.minimize(method="L-BFGS-B", options={"disp": True, "maxiter": steps})
```
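As a quick sanity check on the transform used above, `transform` and `inverse` should round-trip exactly. This standalone sketch re-implements the same log-base logic without the KLIFF base class (the class name here is illustrative):

```python
import numpy as np

class Base10Transform:
    """Standalone re-implementation of the log-base transform above."""
    def __init__(self, base):
        self.base = base

    def transform(self, x):
        return np.emath.logn(self.base, x)  # log_base(x)

    def inverse(self, x):
        return self.base ** x

t = Base10Transform(10.0)
x = np.array([0.1, 1.0, 100.0])
assert np.allclose(t.transform(x), [-1.0, 0.0, 2.0])
assert np.allclose(t.inverse(t.transform(x)), x)  # round-trips back to x
```

If a likelihood or loss goes NaN, a failed round-trip like this (e.g. negative values fed to the log) is the first thing to rule out.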
```python
# Load dataset
ds = Dataset("MoS2_10")
# Load model
model = KIMModel("SW_MX2_WenShirodkarPlechac_2017_MoS__MO_201919462778_001")
print(model.get_model_params())

# set optimization parameters
model.set_opt_params(["A", "B"])
params = model.parameters()

# Add bounds in log space
params["A"].add_bounds_(np.array([[0.01, 10], [0.01, 10]]))
params["B"].add_bounds_(np.array([[-np.inf, -0.1], [-np.inf, -0.1]]))

# add transform
params["A"].add_transform_(base10transform)
params["B"].add_transform_(base10transform)

# Set Mo-Mo and Mo-S params as optimizable
B_mask = np.zeros_like(params["B"], dtype=bool)
B_mask[0] = True
B_mask[2] = True
A_mask = np.zeros_like(params["A"], dtype=bool)
A_mask[0] = True
A_mask[2] = True
params["A"].set_opt_mask(A_mask)
params["B"].set_opt_mask(B_mask)

params["A"].finalize_()
params["B"].finalize_()

print(model.get_opt_params())

# Set calculator
calc = Calculator(model)
_ = calc.create(ds.get_configs())

# optimize
steps = 100
loss = Loss(calc, nprocs=2)
loss.minimize(method="L-BFGS-B", options={"disp": True, "maxiter": steps})
```
Old API vs new API:

```diff
-# Instantiate a transformation class to do the log parameter transform
-params_transform = LogParameterTransform(param_names)
-
-# Create the model
-model = KIMModel(
-    model_name="SW_StillingerWeber_1985_Si__MO_405512056662_006",
-    params_transform=params_transform,
-)
-
-# Set the tunable parameters and the initial guess
-opt_params = {
-    "A": [["default", -8.0, 8.0]],
-    "lambda": [["default", -8.0, 8.0]],
-}
-
-model.set_opt_params(**opt_params)
-model.echo_opt_params()
+param_names = ["A", "lambda"]
+model = KIMModel(model_name="SW_StillingerWeber_1985_Si__MO_405512056662_006")
+
+model.set_opt_params(param_names)
+opt_params = model.parameters()
+
+class LogTransform(ParameterTransform):
+    def __init__(self):
+        super().__init__("log")
+
+    def transform(self, x):
+        return np.log(x)
+
+    def inverse(self, x):
+        return np.exp(x)
+
+log_transform = LogTransform()
+opt_params["A"].add_transform_(log_transform)
+opt_params["lambda"].add_transform_(log_transform)
+
+opt_params["A"].finalize_()
+opt_params["lambda"].finalize_()
```
@mjwen @ipcamit Out of curiosity, is there a reason why kliff doesn't use ASE `Atoms` for configurations?
@yonatank93 I think we did discuss it at the MACH conference. I would also like to use some standard format, like ASE atoms. But the rationale behind using a custom configuration object was to maintain minimal required control. I mean, ASE atoms might contain fields we do not need, or might be missing some required fields (for example, a reference to the ColabFit dataset that we are developing). So the summary was to keep the configuration object and provide parsers for ASE, pymatgen, etc. However, I am willing to move to either ASE or pymatgen if you and @mjwen agree. But coming back to the above discussion: what do you think of the above workflow and parameter declaration? Do you think it will bring more clarity, or will it make MCMC more convoluted?
@ipcamit I think for MCMC the new workflow is fine. With your update, it doesn't seem to affect the MCMC part's workflow; it only affects how we initialize the model. I do have a comment about how we set optimizing parameters in the original KLIFF version. Currently, I need to deal with an EAM potential, and my issue with KLIFF is that it takes a long time to set the optimizing parameters. See the example below.

```python
In [1]: from kliff.models import KIMModel

In [2]: model = KIMModel("EAM_Dynamo_MishinFarkasMehl_1999_Al__MO_651801486679_005")

In [3]: %time model.set_opt_params(embeddingData=[["default"]] * 10000)
CPU times: user 1min, sys: 103 ms, total: 1min
Wall time: 1min 1s
```

My guess is that to set the embedding data as the parameters, KLIFF needs to iterate over the entire set of tabulated values (in this example there are 10,000 of them). Even if I only want to tune a few parameters, I still need to specify either ...
Yes, that's the idea: Parameters will now be more self-contained while keeping things the same. Will change other things as we go along. Thank you for bringing up that example. Though EAM will be a little slow, I can't think of a reason why KLIFF should be this slow; I will check the timings to pinpoint the exact cause. I think the new parameter class might help here as well. For example, I tried your example with the new params class:

```python
In [10]: def init():
    ...:     import numpy as np
    ...:     model.set_opt_params(["embeddingData"])
    ...:     params = model.parameters()
    ...:     idx_mask = np.zeros_like(params["embeddingData"], dtype=bool)
    ...:     idx_mask[10:100:2] = True
    ...:     params["embeddingData"].set_opt_mask(idx_mask)
    ...:     params["embeddingData"].finalize_()
    ...:

In [11]: %time init()
CPU times: user 117 µs, sys: 22 µs, total: 139 µs
Wall time: 145 µs
```

It's a bit more verbose, but does it solve your problem? I can make more educated comments once I figure out exactly why the old version is so slow.
From what I can tell, the slow part in the old version is around this line; maybe it's because of the for loop in that method. Especially for EAM, there are 10,000 values to iterate over for the embeddingData and other parameters. I might be wrong, though.
First, regarding using ASE Atoms or pymatgen Structure: it is not easy to directly use them, as @ipcamit said. For the fitting, we need to keep track of other info besides cell, coords, etc. That's where we use our own Configuration to keep track of energy, reference forces, etc. In addition, we can easily add utility methods that are typically needed in potential fitting. That being said, it might be a good idea to use ASE Atoms or pymatgen Structure to represent the atomic structure. They are widely used and may offer other utilities. Either way, we still need a Configuration-like object to keep track of other info; I just don't feel it is right to attach energy, forces, etc. to Atoms or Structure. I have been moving along these directions in some of my other projects, where I used pymatgen Structure and added pydantic checks to ensure the correct shape and typing of all the data. My plan was to replace the KLIFF Configuration with the new stuff sometime in the future.
Second, @yonatank93 you are absolutely correct that the ... I was curious why you want to optimize all EAM embedding data. That's typically not how people develop EAM. If I remember correctly, people traditionally choose only a handful of these points (say 10), then optimize the model, and finally tabulate using more (e.g. 10k) interpolating points. EAM is not a data-driven model, so optimizing 10k data points may not be as good. I just wanted to let you know this; still, we need to make the code as fast as possible.
@mjwen I remember we talked about this when we were in Baltimore. There is another thing we want to try, which is to use some kind of surrogate model to approximate the embedding function, etc. Without changing the KIM model, we do this by using the surrogate models to generate the tabulated values. This is why we set all those tabulated values as parameters. The other case we tried was to compute the Jacobian of the EAM model in predicting forces for several configurations. I did this just to see which regimes of rho and r are sampled by the configurations, so that when we create our surrogate, or if we just use fewer tabulated values, we can appropriately represent these important sampled regimes. And the motivation for using surrogate models is that in some calculations I got KIM errors because I tried to sample the embedding function outside the interpolation regime. So, a surrogate model can help extend the domain of the embedding function to some degree.
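A minimal sketch of the surrogate idea described above: optimize a handful of knot values and regenerate the dense table the KIM model consumes. The knot placement and the placeholder embedding function here are made up purely for illustration:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# ~10 knots act as the actual optimization variables
rho_knots = np.linspace(0.0, 1.0, 10)
F_knots = -np.sqrt(rho_knots)      # placeholder embedding values, not real EAM data

surrogate = CubicSpline(rho_knots, F_knots)

# regenerate the dense table (e.g. the 10k-point embeddingData) from the surrogate
rho_dense = np.linspace(0.0, 1.0, 10_000)
embedding_table = surrogate(rho_dense)
```

The optimizer would then only ever touch the 10 knot values, while the KIM model still receives a full 10,000-point table.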
@ipcamit In general it looks great! We are almost there. I have two questions.
Going along with @ipcamit's comment: the MCMC module uses a uniform prior by default, and the user needs to specify the bounds of the prior. Since the sampling is done in the transformed space, it would make more sense to input bounds in the transformed space into the uniform prior. And for that, I might want to have default values of the prior bounds by extracting the parameter bounds set by ...
What do you mean by this? Particularly, what is the use case? From a user's perspective, ...
I agree leaving the decision to users is a good compromise. My thought is:
This is another reason I don't think ...
I think there is no default, i.e. if you do not call ...
@yonatank93 if bounds are not set for a parameter, scipy expects ...
I would not recommend functionality where the behavior of the optimizer is influenced by the mere ordering of initialization of two independent fields. Therefore, the best course of action would be to just fix the ...
Depending on what you mean by "initialization", it may or may not be reasonable. If "initialization" means registering the transform function and we do the execution later (e.g. in ...). But I agree this can still be confusing, and if we stick to a single space, it would be easier. For this, we'd better require the bounds to be in the original space, since that is the case when no parameter transformation is used. If we are doing this, then something like ...
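If bounds are supplied in the original space while the optimizer works in a transformed space, the bounds have to be mapped through the transform at execution time. A sketch of that mapping, assuming a strictly monotone transform (the helper name is hypothetical, not part of the KLIFF API):

```python
import numpy as np

def bounds_to_transformed(bounds, f):
    """Map per-component (low, high) bounds through a monotone transform f.

    If f is decreasing, the endpoints swap order, so each pair is re-sorted.
    Hypothetical helper for illustration only.
    """
    out = []
    for lo, hi in bounds:
        a, b = f(lo), f(hi)
        out.append((min(a, b), max(a, b)))
    return np.array(out)

# bounds given in the original space, mapped into log10 optimizer space
print(bounds_to_transformed([(0.1, 10.0)], np.log10))  # [[-1.  1.]]
```

Doing this once, at the point where the optimizer is invoked, avoids the ordering-of-initialization problem: the user always states bounds in one space, and the mapping happens deterministically later.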
An alternative could be to have ... I am keen on having the ability to add transformed-space bounds, as some quantities might be really easy or intuitive in the transformed space but not so in the inverse space.
I am all in on the new approach you proposed.
@mjwen @yonatank93 Updated API and fixed issues:

Examples of new workflow:

```python
params["A"].copy_to_param_(5.0)
params["B"].add_transformed_bounds(np.array([[-2, 1]]))
params["A"].add_transform_(base10transform)
params["B"].add_transform_(base10transform)
params["A"].add_bounds(np.array([[0.1, 100]]))
```

Now there is no need to call the `finalize_()` method.
```python
params["A"].add_bounds(np.array([[0.01, 10], [0.01, 10]]))
params["A"].add_transform_(base10transform)
A_mask = np.zeros_like(params["A"], dtype=bool)
A_mask[0] = True
A_mask[2] = True
params["A"].add_opt_mask(A_mask)
```

Now you can pick and choose array elements to optimize, and selectively mark the transformation space for any of the parameters. For example, above, param A is minimized in log space, whereas every other param is optimized in normal space. This paves the way for dealing with cases where different scaling transformations help different parameters.
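The per-parameter transform idea can be sketched standalone like this: each parameter carries its own (forward, inverse) pair, so the optimizer sees one parameter in log10 space while another stays untouched (the registry layout here is illustrative, not the KLIFF internals):

```python
import numpy as np

# Hypothetical per-parameter transform registry: "A" optimized in log10
# space, "B" in the original (identity) space.
transforms = {
    "A": (np.log10, lambda y: 10.0 ** y),
    "B": (lambda x: x, lambda y: y),
}
raw = {"A": 100.0, "B": 0.5}

# what the optimizer sees
opt_view = {k: float(transforms[k][0](v)) for k, v in raw.items()}  # {"A": 2.0, "B": 0.5}

# mapping back recovers the original values
back = {k: float(transforms[k][1](v)) for k, v in opt_view.items()}
```

Keeping the pair together per parameter is what makes mixed-space optimization safe: the forward map is only ever undone by its own inverse.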
```python
param_names = ["A", "lambda"]
model.set_opt_params(param_names)
opt_params = model.parameters()
opt_params["A"].add_transform_(log_transform)
opt_params["lambda"].add_transform_(log_transform)
```

Comments?
@ipcamit I like that we can apply different transformations to different parameters. Having separate ...
Opt masks can be used to selectively mark vector elements as optimizable. For example, for MoS2 the parameter A will be a 3x1 vector; applying an optimization mask of [True, False, False] makes the first component optimizable but hides the second and third from the optimizer. So you can optimize only the Mo-Mo coefficient while keeping the original Mo-S and S-S values.
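The masking described above is plain boolean indexing; here is a standalone illustration with made-up values (the [Mo-Mo, Mo-S, S-S] ordering and numbers are assumptions for the example):

```python
import numpy as np

# hypothetical 3-component parameter, ordered [Mo-Mo, Mo-S, S-S]
A = np.array([3.9, 3.2, 2.8])
mask = np.array([True, False, False])

free = A[mask]        # only the Mo-Mo component is exposed to the optimizer
A[mask] = free + 0.1  # an optimizer update writes back only the masked entry

# A[0] moved to ~4.0; A[1] and A[2] are untouched
```

Because reading and writing both go through the same mask, the unmasked components can never drift during optimization.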
Currently it is not possible, as I apply the transformations elementwise, but if needed I can add that feature; it is not difficult to add. However, if possible, let's do it in the next iteration, unless you want to use it straight away in a project.
No problem, I don't need it right now; it is just a possible case that I was thinking might come up in the future. I am OK keeping it for the next iteration.
@ipcamit You did not mention it, but I saw here that parameter setting is now also distinguished by whether it is in the original space or the transformed space. This is great! I believe this is ready for a PR into the main kliff repo. Also, I agree with you both that we should not implement functionality that we will probably never use...
@mjwen Great! Then I will close this issue, clean up a bit, document, and submit a PR. We need to discuss documentation (I remember you mentioning Jupyter Book). Can we schedule another meeting to discuss where we will merge it and the tasks going forward?
Yes, let's discuss it. Let's find a time over email or Slack...
Parameters should be treated identically for torch and physics-based models.