
public observed equivalent #3536

Closed · isaacsas opened this issue Apr 3, 2025 · 4 comments

@isaacsas (Member) commented Apr 3, 2025

In many biological file formats there are explicit "observed" fields defining functions of the unknowns. It would be nice to have a public equivalent to "observed" for MTK systems (basically so such variables can be evaluated and plotted given solution objects). We have been using observed for this from Catalyst for a long time, but given we were told it should be considered internal it would be useful to have another option. Perhaps such a field could be added for V10 (maybe user_observed)?
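
For context, the kind of downstream use being asked for, assuming a solution `sol` of an MTK system in which `total` is an observed quantity (the names here are purely illustrative, not from the issue):

```julia
using Plots  # for the solution plot recipe

# Assuming `sol` is a solution of a ModelingToolkit-based system and `total` is an
# observed variable of that system (illustrative names):
sol[total]                  # timeseries of the observed quantity
sol(0.5; idxs = total)      # evaluate it at an interpolated time point
plot(sol; idxs = [total])   # plot it directly from the solution object
```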

@hersle (Contributor) commented Apr 3, 2025

For me it usually works well to simply add all equations (including those I expect to end up as observeds) during model building, then structurally simplify the system, and the expected equations are automatically moved over to observeds. Are there some concrete example use cases that require manually adding observeds?
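
A minimal sketch of this workflow, assuming a simple ODE model (exact API details may vary slightly across ModelingToolkit versions):

```julia
using ModelingToolkit, OrdinaryDiffEq

@variables t x(t) y(t)
D = Differential(t)

# Write the algebraic definition as an ordinary model equation...
eqs = [D(x) ~ -x,
       y ~ 2x]

@named sys = ODESystem(eqs, t)

# ...and let simplification move it into the observed equations.
simplified = structural_simplify(sys)

prob = ODEProblem(simplified, [x => 1.0], (0.0, 1.0))
sol = solve(prob, Tsit5())

observed(simplified)   # now contains y ~ 2x
sol[y]                 # the observed variable can still be evaluated and plotted
```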

Sometimes I have to define a "more complicated observed", e.g. some function that involves the full timeseries of the solution instead of just values at one point in time. Then I usually define functions on the full solution object that compute this. I generally think of these as different types of "derived quantities".
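
A sketch of one such "derived quantity": a function of the whole solution rather than of values at a single time point (the trapezoidal time integral here is just an illustrative choice):

```julia
# Compute the trapezoidal time integral of a variable over the full solution timeseries.
function time_integral(sol, var)
    t = sol.t
    v = sol[var]
    sum((t[i + 1] - t[i]) * (v[i] + v[i + 1]) / 2 for i in 1:length(t) - 1)
end

time_integral(sol, y)   # `sol` and `y` as in the sketch above
```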

If it is needed to add "user observeds", I would rather suggest making the existing observed field compatible and exposing it. For example, maybe MTK could verify that observeds passed by the user are sensible and do not have to be solved for. If they pass that check, structural_simplify could append new observeds to this list instead of starting from scratch.

@isaacsas (Member, Author) commented Apr 3, 2025

Not all systems are supported by structural_simplify, and not all system types have a notion of an algebraic equation being a valid system equation.

This also relies on internals of structural_simplify always choosing the desired equations as observed. I'd rather not rely on it always doing the right thing, since there is no promise that it won't keep an intended observed and eliminate another equation instead.

Finally, it also means we are losing information in making the model, since observeds are no longer labeled in any distinct way.

I agree that preserving user-provided observeds would be nice.

@ChrisRackauckas (Member) commented

Not all systems are supported by structural_simplify

We should just define it for all systems? I can't think of a system type that wouldn't have a sensible notion of what it would mean and do.

This also relies on internals of structural_simplify always choosing the desired equations as observed. I'd rather not rely on it always doing the right thing, since there is no promise that it won't keep an intended observed and eliminate another equation instead.

No, it's really the exact opposite. Directly putting something into the observed list is relying on specific compiler internals and implementation details in the solver, and many times you can get that wrong. For example, what if you have @discrete x(t)::Bool y(t)::Bool z(t)::Bool and then you have z ~ x || y? Should that observed live in observed or sys.discrete_subsystems[1].observed? That's an implementation detail that is deep in the weeds, and if you do it wrong you will open an issue asking why MTK is broken because you did sol[z] and got an error. The answer is that it's always simpler not to have people trying to muck with compiler internals, and instead to force things to go through the routes that have error checking and guarantees of correctness. If you're trying to bypass the correctness checks, that's how you get bugs!

That's not the only case. Another case that shows up often is with callbacks. Whether or not a variable is reducible can depend on its relationship to callbacks. For example, z ~ x + y might seem like an innocuous equation, so let me just force it to always be deleted from the system, right? Then the user comes along and writes a callback that does the modification z ~ Pre(z) + 1. Okay, but if z is not in the system, how do you modify its value during a callback? You can't. The compiler should ensure that the provenance of this is invertible, i.e. if you also have an algebraic relationship 2x + y ~ 0, then

2x + y ~ 0
z ~ Pre(z) + 1
z ~ x + y

is a well-defined system, and the DAE does not need to define z because you can directly solve for x and y, so z ~ Pre(z) + 1 has a unique meaning as a callback that modifies x and y. But if you didn't have that extra algebraic condition, z ~ Pre(z) + 1 is not possible if z is eliminated, so you should keep that variable and possibly eliminate something else. How can you know to do this before you know all of the callbacks a user has added to the system? You simply can't, and if you make the wrong choice, then again that's how you get bugs.
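
As a quick worked check of the invertibility claim: from 2x + y ~ 0 we get y = -2x, and then z ~ x + y gives z = x - 2x = -x, i.e. x = -z and y = 2z. So a callback z ~ Pre(z) + 1 has a unique meaning as an update of x and y even though z itself is eliminated. Drop 2x + y ~ 0 and z ~ x + y alone no longer determines x and y from z, which is exactly why z cannot safely be eliminated when a callback may write to it.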

Finally, it also means we are losing information in making the model, since observeds are no longer labeled in any distinct way.

No information is lost. If you tell the system z ~ x || y, then it will guarantee that will be true or error. The source of most of the Catalyst.jl bugs was mucking with compiler internals, bypassing the checks, and then getting errors when the mucking was done incorrectly.

I think the real question to ask is, why do you feel it's so important to do that?

  • Is it for a performance optimization? The removal of observed equations like this is <1% of the construction time, because it's a linear-time part of the pass, so I don't see how it has a major effect on performance.
  • Is it because you want to guarantee that the simplification is always done to improve runtime performance? It's not clear that's a good idea, and as shown earlier there are cases where it's not correct.

I truly cannot figure out why you feel so compelled to bypass all of the correctness testing here, especially when your stated goal is a more sustainable implementation with fewer bugs.

As an analogy, you're effectively asking for the ability to bypass the Julia compiler and directly write x86 assembly code so that we know the code we're generating and have fewer bugs. What I'm saying is, no, that does not lead to fewer bugs: for example, x86 assembly code could give you incorrect behavior on Mac M-series chips, so putting that into your libraries is a very bad idea, and with modern compilers it's probably going to generate better assembly for you anyway, so just write Julia code (or at least LLVM IR). This is why we don't have inline assembly code in our packages, and equivalently it would be good to have none of the equivalent inline MTK "assembly code" in downstream packages. Especially since, with this specific case, there is no benchmark that I know of that shows it's actually saving time, while simultaneously there are loads of issues that document the bugs that are coming from it.

So I think the real question to ask here is, if you really want this feature, can you please explain the motivation and why it is so necessary that you are willing to take the extra maintenance cost?

@isaacsas (Member, Author) commented Apr 3, 2025

OK, it sounds like this is a no-go and not worth continuing to discuss, so I'll close this issue.

isaacsas closed this as completed Apr 3, 2025