Introducing the Lux model for parameters and state #12

CedricTravelletti · 2024-04-29T10:37:46Z

This introduces state inside AtomsCalculators by follwing the Lux model.

This PR is a followup to the Zulip discussion following the April 10th 2024 CESMIX and JuliaMolSim community meeting. It also builds on the discussion in #11 and adresses the same needs.

Philosophy

The basic idea behing this PR is to separate high-level and low-level calls. High-level calls are defined as the ones to potential_energy, forces, forces! and virial, whereas the low-level ones are the ones to calculate.

High-level calls preserve backward compatiblity, while low level calls now follow the Lux model, by using immutable calculators, accepting parameters and state arguments and returning state.

Detailed documentation is bundled with the PR. We refer to it for details.

tjjarvinen · 2024-05-02T22:34:20Z

Thanks for the PR. For the main points it looks good!

I did not look the all inner workings of the @generate_interface macro yet, but I'll look for it soon.

Also, I would like this to become a breaking release (v0.2).
I have some bug corrections and other issues coming for v0.1 that I would like to merge first, So, that all the initial works are in good shape with a final release in v0.1 series. After that we could proceed with this and add other features too, like Hessian.

CedricTravelletti · 2024-05-07T14:29:33Z

@tjjarvinen Thank you for your feedback.

Waiting on these bug corrections then.

cortner · 2024-05-11T01:50:51Z

This is now blocking for me to proceed with the new ace models. Would be great to get it reviewed and merged.

tjjarvinen · 2024-05-11T01:54:51Z

This is now blocking for me to proceed with the new ace models. Would be great to get it reviewed and merged.

You can do review it too! We need couple of them anyways.

I will try to go trough the macro. But I presume it is in good shape. The macro does need more documentation on how it works. But we can do that in the future.

cortner · 2024-05-11T02:03:30Z

Of course I will as soon as somebody has done a first pass please.

tjjarvinen

Sorry for taking this long, but there are a lot of points to check with this.

I am ok with the idea and marked some points on the code that could be looked onto.

But then I have also two bigger issues. One onto which I would like to have some comments from others too. And other that is just implementing tests.

1 Lux interface

We add state and parameters and in the return structure we give out state.

For us this is logical. We return internal state (electronic structure) and we can then calculate some additional features out of it. Parameters like Lennard-Jones terms are not returned.

This is a contradiction to Lux where state is the one you applie learning onto - we don't want to learn internal state, but instead we want to learn parameters.

So, what I am saying here is that the Lux model does not apply straight to our case and then the question is how do we deal with this?

2 Tests

If we are defining the new interface to return state and the function calls to allow calls with 5 positional arguments, then we need to have tests for these also.

examples/low_vs_high_level_calculator.jl

docs/src/interface-definition.md

src/utils.jl

tjjarvinen · 2024-05-14T03:16:00Z

examples/low_vs_high_level_calculator.jl

+
+# Low-level call, implemented by the user.
+AtomsCalculators.calculate(AtomsCalculators.Energy(), hydrogen, LowLevelCalculator(nothing),
+                          state = 1.0)


Note this is not a correct call. state = 1.0 is now considered as a keyword and not the argument state.

julia> AtomsCalculators.calculate(AtomsCalculators.Energy(), hydrogen, LowLevelCalculator(1), 1, state=2) (energy = 0.0 Eₕ, state = nothing) #but julia> AtomsCalculators.calculate(AtomsCalculators.Energy(), hydrogen, LowLevelCalculator(1), 1, 2) (energy = 0.0 Eₕ, state = 2)

cortner · 2024-05-15T00:11:59Z

I think the state could in principle contain part of the output of the model, provided it is invalidated on the next call, or when the model is called with a new input. I actually quite like this usage of st. Do I understand this correctly? ...

at1 = # some structure
at2 = # another structure 

v1, st = virial(at1, calc, ps, st)
# st.virial, st.energy, st.forces are all computed during this call.

E1 = energy(at1, calc, ps, st) 
# no new computation to be done, just return st.energy
# (same with forces) 

E2 = energy(at2, calc, ps, st)
# new input structure at2, so the content of st must be invalidated.

To be clear, I'm not advocating for this yet. I am just thinking out loud -- is this the intention? And if yes, I don't really see how it contradicts the Lux model?

This is a contradiction to Lux where state is the one you applie learning onto

I don't understand this statement. In Lux the state is just an internal state of the model that can contain anything you want (temporary arrays for example) or just nothing at all.

My understanding of Lux is that you do

output, st = model(input, ps, st)

you learn parameters ps such that output matches some data. The value of output should be entirely independent of the contents of st.

cortner · 2024-05-15T00:14:36Z

Re tests, I agree of course.

tjjarvinen · 2024-05-15T00:35:09Z

The issue here is most likely what the terms mean.

Like if one implementation uses state for learning and other does not. Will this lead to some issue? I am concerned that this leads to some problems in the future.

In any case we should have more documentation that defines on what to do with these two (state and parameter).

tjjarvinen · 2024-05-15T00:37:09Z

And one thing I forgot from the review.

We need functions get_state, get_parameters and probably also set_state and set_parameters.

cortner · 2024-05-15T01:19:13Z

I agree with that. I already have those in ACEpotentials and could switch over to extending methods here.

CedricTravelletti · 2024-05-15T13:20:27Z

I think the state could in principle contain part of the output of the model, provided it is invalidated on the next call, or when the model is called with a new input. I actually quite like this usage of st. Do I understand this correctly? ...
at1 = # some structure
at2 = # another structure 

v1, st = virial(at1, calc, ps, st)
# st.virial, st.energy, st.forces are all computed during this call.

E1 = energy(at1, calc, ps, st) 
# no new computation to be done, just return st.energy
# (same with forces) 

E2 = energy(at2, calc, ps, st)
# new input structure at2, so the content of st must be invalidated. 
To be clear, I'm not advocating for this yet. I am just thinking out loud -- is this the intention? And if yes, I don't really see how it contradicts the Lux model?

This is a contradiction to Lux where state is the one you applie learning onto

I don't understand this statement. In Lux the state is just an internal state of the model that can contain anything you want (temporary arrays for example) or just nothing at all.

My understanding of Lux is that you do
output, st = model(input, ps, st)
you learn parameters ps such that output matches some data. The value of output should be entirely independent of the contents of st.

Exactly, and this is the intention in this PR.

Just to clarify (inspired from Why we wrote Lux:

Calculators must be immutable – cannot store any parameter/state but rather store the information to construct them
Calculators (in their low-level calls) are pure functions
Calculators (in their low-level call) return a Tuple containing the result and the updated state

Hope this clarifies the intent.

CedricTravelletti · 2024-05-15T13:23:10Z

@tjjarvinen and @cortner Thans for the feedback!
I will go over the suggestions and add the requested tests.

Concerning the get_state and set_state, if we strictly follow the Lux model, then Calculators calls are pure functions, so we don't have a state inside the calculator.

I can write some more documentation and examples to clarify the situation.

cortner · 2024-05-15T17:37:19Z

if we strictly follow the Lux model ...

I thought we had agreed to not do that. But I'm totally open to revisiting this. The reason we felt we can have a mixed model is that users never want to see ps, st. I thought that the suggestion was to have calls involving ps, st act in a pure way (Lux model) but that we can still store those parameters inside the model if we wish and in that case the calls can become mutating.

An alternative approach that was also discussed and which might be preferrable is to provide a simple wrapper

struct StatefulCalculator
    calc
    ps 
    st 
end

I am not able to anticipate all potential problems with either approaches. I immagine both have advantages and disadvantages.

tjjarvinen · 2024-05-15T21:22:43Z

We also have the high level interface that needs to have parameters and state in the calculator structure. So, we need to have the function calls to extract the state and parameters too.

If we go for the pure Lux formulation then we need to separate the calculators to low and high levels, which is not good.

cortner · 2024-05-15T21:30:26Z

I think at a minimum there is no harm - for now - in allowing calculators to store ps and st.

Provided the low level interface strictly follows Lux rules.

If we learn otherwise we can revisit this discussion.

tjjarvinen · 2024-05-15T21:35:05Z

While calculators contain state and parameters, in low level calls when you get state and parameters as an input, you can just ingnore the contained state and parameters, and perform the calculation with the ones you got as an input. Then you have pure function calls with low level interface.

mfherbst · 2024-05-16T06:00:36Z

In my memory we also advocated for a mixed model, where high-level is convenient and low-level is lux-like. This of course has a little bit the risk of suggesting some wrong expectations about how the mechanics works, so we have to be strict in code examples and document properly.

I would suggest for example that the low-level calls ignore internal state and parameters and that the high-level calls do not take state and parameters at all, but require you to externally take care of updating the state yourself. I.e. I advocate a get_state and set_state and similar and I advocate forcing users of the high-level interface to use those for updating the objects stored in the calculator to make the mechanics clear. Thoughts on that?

CedricTravelletti · 2024-05-16T08:26:32Z

I agree with the above comments about the state and parameters issue.
I think @cortner 's comment pins down the solution:

[...] there is no harm - for now - in allowing calculators to store ps and st.
Provided the low level interface strictly follows Lux rules.

So my implementation suggestion is the following:

low-level calls take (and need) ps and st, thereby strictly following the Lux convention.
bundled calculators are allowed, and are the convention for high-level calls.
high-level calls do not take ps and st.
This means that developpers of calculators are left free to specify the intended behaviour of high-level calls (re-use the state or start afresh or else) and users that want manual control have to use the getters/setters.
We include getters and setters for ps and st.

If you agree with those points, I'll begin working on an implementation.

tjjarvinen · 2024-05-16T18:24:06Z

We should also make sure that low level calculators work with high level calls and force it in the test set.

For usability point it is very important that all low level calculators work with high level calls.

CedricTravelletti · 2024-05-17T08:13:49Z

We should also make sure that low level calculators work with high level calls and force it in the test set.

For usability point it is very important that all low level calculators work with high level calls.

Agreed. Do you think that what is currently being done in AtomsCalculators.jl/src/submodules/AtomsCalculatorsTesting.jl covers this?

It seems to me that these tests cover both low-level and high-level calls for energy, forces and virial.

cortner · 2024-05-25T17:14:52Z

Can somebody give me a brief summary what is still needed to finish this PR please?

cortner · 2024-05-25T17:31:30Z

docs/src/interface-definition.md

@@ -32,6 +57,8 @@ Each of the individual calls have two common inputs: `AtomsBase.AbstractSystem`
 - First input is either `Energy()`, `Forces()` or `Virial()`
 - Second is `AtomsBase.AbstractSystem` compatible structure
 - Third is `calculator` structure
+- Fourth is `parameters`


flag parameters and state and optional?

cortner · 2024-05-25T17:33:34Z

I didn't see anything in the documentation or the PR about providing calls of the form

frc, vir = calculate( ( Forces(), Virial(), ), sys, calc, ...)

Is this not intended to be supported?

From the user's point of view the most common calls will be

energy only
forces ony
forces and virial (e.g. to do MD or optimization in position and cell space)
all of energy, forces and virials together

cortner · 2024-05-25T17:39:15Z

It is also not entirely clear to me how to make the calculate interface non-allocating. Something like the following might be needed?

frc = # allocate
frc, vir = calculate!(  (forces = frc, virial = nothing), sys, calc, ...)
E, frc, vir = calculate!(  (energy = nothing, forces = frc, virial = nothing), sys, calc, ...)

To be clear, I'm not happy with the nothing, I just couldn't think of how to do this more elegantly off the top of my head.

If I missed something when skimming the PR, please point it out.

Was the calculate interface a bad idea afterall? I know I was really pushing for it because it would be extendable. And I'd like to make it work. @CedricTravelletti -- are you free for a zoom call next week Wed-Fri to discuss it?

tjjarvinen · 2024-05-25T17:42:42Z

Calculate interface does not have these at this moment. It has always been discussed as what to do, but not implemented.

You can call

res = AtomsCalculators.energy_forces(sys, calc; kwords...)
res.energy # energy
res.forces # forces
# note there can be something else too - this is allowed by the interface

res = AtomsCalculators.energy_forces_virial(sys, calc; kwords...)
res.energy # energy
res.forces # forces
res.virial   # virial
# note there can be something else too - this is allowed by the interface

Also, the order might not be energy, forces, as that is not defined by the interface. The interface also allows you to return something else to gether with energy, forces and virial.

There are no forces_virial call at this moment.

cortner · 2024-05-25T17:47:35Z

One more comment - I am getting the impression that we are giving calc developers the option of implementing either interface. My first instinct is that this is not helpful. It will create confusion on which one to implement. I think there should be a very clear default, and anybody not following that proceeds at their own risk.

Ideally that default is relatively simple. My personal preference is that this should be a non-allocating low-level interface involving params and states and that the everything else gets derived from that automagically. (e.g. through macros)

cortner · 2024-05-25T17:49:17Z

@tjjarvinen

Also, the order might not ...

The way I wrote the call, the order would become unambiguous so one could in principle enforce it. It may or may not be a good idea but it is possible.

There are no forces_virial call at this moment

Yes, I'm aware. I'm just listing what are typical use-cases. For most use-cases this might not be needed since energy normally doesn't cost anything. But what about a calculator that has no energy? E.g. force mixing. You could return missing for the energy. Again, not clear what it should be, I'm just listing what I think are canonical use-cases.

tjjarvinen · 2024-05-25T18:01:18Z

One more comment - I am getting the impression that we are giving calc developers the option of implementing either interface. My first instinct is that this is not helpful. It will create confusion on which one to implement. I think there should be a very clear default, and anybody not following that proceeds at their own risk.

The issue with only having low level (lux call) that is implemented, is that developers then need to understand lux and that will just give new people impression that the interface is just for esoteric persons. Basically what they mostly want is just an easy way to create a custom calculator and get going, and not to bother with lux stuff.

So, the whole point here is that you can do it easy, and if you need the extra features the low level interface offers, then you can put the extra effort to implement it.

cortner · 2024-05-25T18:03:10Z

You are probably right. But the documentation needs to be crystal clear about it.

cortner · 2024-05-25T19:15:29Z

I think I've been derailing the discussion here a little bit and I apologize for this. I'm just concerned that with the slow speed of development and review I don't want another major revision of the interface immediately after merging the current one.

Having thought about this again a bit longer, could there be an alternative way to make calculators non-allocating by allowing the caller to pass in pre-allocated forces as a keyword argument? It would look something like that.

# Implementation 
function forces( sys, calc, ...;   frc = zero_forces(sys, calc),  kwargs...) 
     # ... 
     return frc 
end 

# allocating usage 
frc = forces(sys, calc) 

# non-allocating usage 
frc = # allocate by whatever means, e.g. bumper
forces(sys, calc; frc = frc)

I think this would have the huge advantage of requiring only one interface and we can get rid of forces! and similar.

I haven't thought through what the best option could be for the calculate interface, but I can see multiple ways how to achieve this.

tjjarvinen · 2024-05-25T20:29:36Z

There are problems with using keywords like that.

First, the keyword frc would need to be implemented for all calculators, as it would become a part of the interface.

Currently the intended interface usage is that keywords are used to give control commands to individual calculators. Calculators can ignore these, if they wish. Arguments on the other hand are intended to be reacted by every calculator.

Second issue is that, are there any Julia programs that mutate keywords? Or is it safe from language point?

suggestion for the issue

That being said, could you simply do

function myforces(sys, calc, ...;   frc = zero_forces(sys, calc),  kwargs...)
     return forces!(frc, sys, calc; kwargs...)
end

You could do that in your own package without need to change AtomsCalculators interface itself. We could even have this call in AtomsUtilityCalculators with the current version of AtomsCalculators.

calculator interface

For calculator interface I suggest doing

mutable struct Forces{T}
    f::T
 # add builder that makes sure f is either nothing or valid force array
end

# allocating call
F_allocating = Forces(nothing)  # or simply F()
calculate(F_allocating, sys, calc, ...; kwargs...)

# non allocating call
F_non_allocating = Forces( zero_forces(sys, calc) )
calculate(F_non_allocating, sys, calc, ...; kwargs...)

This would need an changes in calculator implementation and we would need to change the macro etc.

tjjarvinen · 2024-05-25T20:33:34Z

You could also write a macro that would do something like

@forces sys calc f_pre_allocated (args...) (kwargs...)

cortner · 2024-05-25T20:39:14Z

Part of my proposal was to avoid having too many interface functions. if it were just about forces!, where would not be an issue, but we also need energy_forces! and energy_virial_forces! and so forth ... My impression is that all of the alternative suggestions increase complexity rather than reduce it. Both AtomsBase and AtomsCalculators are already overwhelming and need to simplify them rather than making them more complex. And once we get to the calculate interface it gets even more complex.

First, I don't see the problem of requiring all calculators to accept the frc keyword argument. Why can we not ask that?

But secondly, I think one could even do this without that requirement. Once can simply make it clear in the interface that there is no guarantee that the returned forces are the same array as was passed in and therefore to require usage as follows

# Implementation 
function forces( sys, calc, ...;   frc = zero_forces(sys, calc),  kwargs...) 
     # ... 
     return frc 
end 

# allocating usage 
frc = forces(sys, calc) 

# non-allocating usage 
frc_alc = # allocate by whatever means, e.g. bumper
frc = forces(sys, calc; frc = frc_alc)

and analogous usage of forces_virial and of energy_forces_virial.

I actually think this doesn't even require any change to the interface, only documentation and can be entirely optional.

To be clear I think it would be kind of a corner-case usage where the allocation of the force vector creates a measurable slow-down. But I have encountered such cases and would therefore like to have that option if possible without creating too much additional complexity.

tjjarvinen · 2024-05-25T20:56:52Z

I actually think this doesn't even require any change to the interface, only documentation and can be entirely optional.

Yes, you can implement that with the current implementation. That is completely valid, but optional.

tjjarvinen · 2024-05-25T21:35:47Z

Agreed. Do you think that what is currently being done in AtomsCalculators.jl/src/submodules/AtomsCalculatorsTesting.jl covers this?

It seems to me that these tests cover both low-level and high-level calls for energy, forces and virial.

It covers it well with the current interface. But the new calculator interface returns state and that has to be taken into account. Also, some kind of tests that it is possible to have state and parameters, as arguments need to be implemented too.

CedricTravelletti · 2024-05-27T13:28:28Z

Agreed. Do you think that what is currently being done in AtomsCalculators.jl/src/submodules/AtomsCalculatorsTesting.jl covers this?
It seems to me that these tests cover both low-level and high-level calls for energy, forces and virial.

It covers it well with the current interface. But the new calculator interface returns state and that has to be taken into account. Also, some kind of tests that it is possible to have state and parameters, as arguments need to be implemented too.

Ok, then I'll include those tests.

CedricTravelletti · 2024-05-27T13:42:47Z

To summarize the discussion up to now:

It seems that we have reached a consensus on using the Lux model for parameters and state. And that the primary objective of this PR is thus adressed.

So, the whole point here is that you can do it easy, and if you need the extra features the low level interface offers, then you can put the extra effort to implement it.

The (intended) way to use this new interface should be documented more thoroughly.
There is still some debate concerning allocating vs non-allocating call, and how have combined calls (e.g. energy + forces + virial).

While I agree that 3. is important, I feel it is a separate issue, and that it can be somehow adressed by using states.
Indeed, for most calculators, if you want energy AND forces, what will be done under the hood is to first compute energies and use the resulting state of that calculation to compute forces. I thus tinkg that having forces! and states is sufficient for the moment.

CedricTravelletti · 2024-05-27T13:44:01Z

@cortner I am available on coming Wednesday or Thursday for a call.

cortner · 2024-05-27T14:16:39Z

I'll look at my schedule and get back to you on Zulip.

Regarding "most" - you are thinking of electronic structure. In the empirical and ML potentials world this is not normally done.

cortner · 2024-05-29T05:06:46Z

docs/src/interface-definition.md

+parameters through `potential_energy`, `forces`, `forces!` and `virial` and low-level calls 
+with user-specifiable parameters through `calculate`. 
+
+The low-level `calculate` interface follows the [Lux](https://lux.csail.mit.edu/stable/) model for parameters and state. This means that calculators are **immutable** structs that are passed 


calculators will NOT be immutable. But they need to ACT as immutable. This is a subtle but important difference.

cortner · 2024-05-30T06:09:56Z

@CedricTravelletti -- following up from our discussion ... to make it easier for me to contribute to this PR (especially docs), please create a new lux branch in the current REPO, redirect the PR to that branch. I will then merge it. And then we create a new PR to target main. I think it will also be good to restart the discussion with a clean slate, as there is now a lot of noise here.

CedricTravelletti · 2024-05-30T09:20:34Z

@cortner Ready for branch transfer. I will add yesterday's meeting notes to the new PR discussion.

CedricTravelletti added 3 commits April 29, 2024 12:06

Implemented Lux model for calculate. Includes tests.

fe23a6b

Added docs for Lux model.

aa81848

Added example for low-level vs high-level.

50d4468

CedricTravelletti changed the title ~~Introductin the Lux model for parameters and state~~ Introducing the Lux model for parameters and state Apr 29, 2024

tjjarvinen mentioned this pull request May 7, 2024

Testing improvements and a clarification to the interface #13

Merged

tjjarvinen requested review from cortner and tjjarvinen May 11, 2024 02:01

tjjarvinen requested changes May 14, 2024

View reviewed changes

rkurchin mentioned this pull request May 15, 2024

Introducing state #11

Closed

CedricTravelletti added 2 commits May 15, 2024 17:56

Implementing PR reviews.

0b40e38

Refactored macro to determine types in a single place.

0c2356b

Included getters and setters for ps and st.

6e8019c

Fixed Missing typo.

db245b8

cortner reviewed May 25, 2024

View reviewed changes

cortner reviewed May 29, 2024

View reviewed changes

CedricTravelletti changed the base branch from master to lux May 30, 2024 09:11

Merge branch 'lux' into lux

d231527

Fixed tests so that state and parameters default to nothing.

07083e8

cortner merged commit ac04750 into JuliaMolSim:lux May 30, 2024
4 checks passed

CedricTravelletti mentioned this pull request May 31, 2024

Interface re-design: state and Lux model #22

Merged

Introducing the Lux model for parameters and state #12

Introducing the Lux model for parameters and state #12

Conversation

CedricTravelletti commented Apr 29, 2024 • edited Loading

Philosophy

tjjarvinen commented May 2, 2024

CedricTravelletti commented May 7, 2024

cortner commented May 11, 2024

tjjarvinen commented May 11, 2024

cortner commented May 11, 2024

tjjarvinen left a comment

Choose a reason for hiding this comment

1 Lux interface

2 Tests

tjjarvinen May 14, 2024

Choose a reason for hiding this comment

cortner commented May 15, 2024 • edited Loading

cortner commented May 15, 2024

tjjarvinen commented May 15, 2024

tjjarvinen commented May 15, 2024

cortner commented May 15, 2024

CedricTravelletti commented May 15, 2024

CedricTravelletti commented May 15, 2024

cortner commented May 15, 2024 • edited Loading

tjjarvinen commented May 15, 2024

cortner commented May 15, 2024

tjjarvinen commented May 15, 2024

mfherbst commented May 16, 2024

CedricTravelletti commented May 16, 2024

tjjarvinen commented May 16, 2024

CedricTravelletti commented May 17, 2024

cortner commented May 25, 2024

cortner May 25, 2024

Choose a reason for hiding this comment

cortner commented May 25, 2024 • edited Loading

cortner commented May 25, 2024 • edited Loading

tjjarvinen commented May 25, 2024

cortner commented May 25, 2024

cortner commented May 25, 2024 • edited Loading

tjjarvinen commented May 25, 2024

cortner commented May 25, 2024

cortner commented May 25, 2024 • edited Loading

tjjarvinen commented May 25, 2024

suggestion for the issue

calculator interface

tjjarvinen commented May 25, 2024 • edited Loading

cortner commented May 25, 2024 • edited Loading

tjjarvinen commented May 25, 2024

tjjarvinen commented May 25, 2024

CedricTravelletti commented May 27, 2024

CedricTravelletti commented May 27, 2024

CedricTravelletti commented May 27, 2024

cortner commented May 27, 2024

cortner May 29, 2024

Choose a reason for hiding this comment

cortner commented May 30, 2024

CedricTravelletti commented May 30, 2024

CedricTravelletti commented Apr 29, 2024 •

edited

Loading

cortner commented May 15, 2024 •

edited

Loading

cortner commented May 15, 2024 •

edited

Loading

cortner commented May 25, 2024 •

edited

Loading

cortner commented May 25, 2024 •

edited

Loading

cortner commented May 25, 2024 •

edited

Loading

cortner commented May 25, 2024 •

edited

Loading

tjjarvinen commented May 25, 2024 •

edited

Loading

cortner commented May 25, 2024 •

edited

Loading