
feat: add unified System type #3543

Draft
wants to merge 1 commit into base: master
Conversation

AayushSabharwal
Member

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes was updated
  • The new code follows the
    contributor guidelines, in particular the SciML Style Guide and
    COLPRAC.
  • Any new documentation only uses public API


@hersle
Contributor

hersle commented Apr 8, 2025

Is the idea to write down all system types (ODE/NonLinear/SDE/...) through System?
Then specialize on how to solve it when constructing the ODEProblem/NonlinearProblem/SDEProblem/... from it?

@AayushSabharwal
Member Author

Yes. Everything is a System and MTK is smart enough to know what problems can be built from it. We will error if you e.g. try to construct an SDEProblem from a system without noise.
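A minimal sketch of that flow, assuming a hypothetical unified `System` constructor alongside the existing problem types (the constructor signatures here are illustrative, not the final API):

```julia
using ModelingToolkit  # assumed available

@variables t x(t)
D = Differential(t)

# One system type, regardless of which problems it can produce.
sys = structural_simplify(System([D(x) ~ -x], t))

# ODEProblem works, because the system is a plain ODE:
oprob = ODEProblem(sys, [x => 1.0], (0.0, 1.0))

# SDEProblem would throw, because the system carries no noise information:
# SDEProblem(sys, [x => 1.0], (0.0, 1.0))  # error: system has no noise
```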

@AayushSabharwal
Member Author

Even now we have bits of this. DDEs are constructed through ODESystem, SDDEs through SDESystem.

@hersle
Contributor

hersle commented Apr 8, 2025

True. Very interesting, I think it sounds like a great idea! Simpler for the user, and I assume it simplifies the code base significantly and reduces the maintenance burden of duplicating changes across every system type.

Would structural simplification continue to work independently of the system type?

Personally I would be in favor of other similar "merges/simplifications" too, if they don't force ambiguities or "loss of information". E.g. to unify variables and parameters, specify parameter dependencies as normal (initialization?) equations, detect the independent variable from the only differentiated variable, etc. I guess that's a different discussion, though.

@ChrisRackauckas
Member

Those are all on the list. One at a time.

@AayushSabharwal
Member Author

reduces the maintenance burden of duplicating changes across every system type.

Yeah, maintaining so many different systems is really annoying. See the reviews in #3497 (review).

E.g. to unify variables and parameters

#3540

specify parameter dependencies as normal (initialization?) equations

Also something we're working on.

detect the independent variable from the only differentiated variable

iirc we did this at one point with the ODESystem(eqs) constructor but it was buggy. Requiring the indepvar for constructing time-dependent systems also makes things easier when merging it into one type.

@hersle
Contributor

hersle commented Apr 8, 2025

Cool! Yes, one at a time. Sorry for going off topic, just excited about the push in this direction 😅

@isaacsas
Member

isaacsas commented Apr 8, 2025

Along those lines: while keeping noise_eqs and jumps as fields libraries can pass in (we'd definitely prefer to use them from Catalyst), it would be good from a user-interface perspective for people writing symbolic models in MTK to be able to write jump terms as part of an equation. Perhaps there could be a Poissonian, analogous to a Brownian, that represents the increment of a Poisson process (or, more generally, a Poisson random measure)?

@ChrisRackauckas
Member

Perhaps there could be a Poissonian, analogous to Brownian, that represents the increment of a Poisson process (or maybe more generally a Poisson random measure)?

Yes, that's one thing here I hope we can get to. The main reason is that we want to capture the jump stuff at a higher level: if you've already created a ConstantRateJump/VariableRateJump, then you've already committed to it being irregular / directly handled jumps. That makes it impossible to later switch over to the RegularJump description of it as Poisson variables for tau-leaping.

But that also means there are restrictions. ConstantRateJump/VariableRateJump is more general in what it allows, since that definition can change the tolerance of the integrator in its affect!, mutate the number of equations, or do other crazy things. A Poissonian can only add/subtract values from an equation. But the mapping is a bit tricky to make. For example, think of just A + B -> AB mixed into some ODE which is flooding more A and B into the solution:

@parameters rate
@poissonian eta(rate*A*B)
D(A) ~ alpha*A - eta
D(B) ~ alpha2*B - eta
D(AB) ~ eta

The things that are weird about a Poissonian representation are:

  1. It needs somewhere to declare its rate function, and I don't know if using the call syntax is the best way, but it needs to hold that as metadata.
  2. From the 3 different equations we can see that the effect is A - 1, B - 1, and AB + 1 as the reactions, so we can build the affect! from that scan, but it's not local to a single equation.

So it's a bit trickier than a Brownian.
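The scan in point 2 can be sketched in plain Julia, independent of MTK (the representation of equations as coefficient maps is purely illustrative):

```julia
# Toy sketch: each equation is reduced to the signed coefficient that the
# Poissonian `eta` carries in it, read off from D(A), D(B), D(AB) above.
eta_coeffs = Dict(:A => -1, :B => -1, :AB => +1)

# The affect is "on each firing of eta, add that coefficient to the
# variable", gathered across all equations rather than read from any one:
affect = [(var, c) for (var, c) in eta_coeffs if c != 0]
# i.e. A -> A - 1, B -> B - 1, AB -> AB + 1
```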

@isaacsas
Member

isaacsas commented Apr 9, 2025

Having the rate as an argument would be consistent with how it is written mathematically, and detecting the coefficients and collecting them into a list to internally build an affect function seems not much harder than what is done for building the SDE noise matrix now.

I agree it would be less general in the affect, but we don't need to require its use -- it would just be a nice convenience for people that have equations easily written in that form.

@ChrisRackauckas
Member

Just to give more color here: there are already pieces that overlap in several ways.

  • ODESystem / SDESystem / DDEs / SDDEs is the obvious one. An SDESystem is just an ODESystem with noise equations, DDEs are just ODEs where some variables happen to be delayed, and SDDEs are just ODEs with noise equations that happen to have some delayed variables.
  • DiscreteSystem and ImplicitDiscreteSystem. All DiscreteSystems are ImplicitDiscreteSystems; whether an ImplicitDiscreteSystem is a DiscreteSystem is just a matter of whether you can causalize the equations, which happens to be exactly the structural simplification pass / DAE index reduction.
  • ODE / DAE, obvious, and the same connection between Discrete and Implicit Discrete
  • The clock system and the Poisson jumps are highly related. Poisson jumps are stochastic clocks.
  • JumpSystem -> others, it's just adding jumps to ODESystem / DiscreteSystem
  • OptimizationSystem and ODESystem: the combination of them is optimal control problems. Stochastic optimal control is then OptimizationSystem on SDESystem. Does OptimizationSystem on a JumpSystem even have a name? I don't know, but it seems useful. OptimizationSystem on a DiscreteSystem is dynamic programming, and OptimizationSystem on an ImplicitDiscreteSystem doesn't have a name as far as I know, but it's a useful generalization for writing model-predictive control.

So the proliferation of types is making it difficult to realize the combinations, especially the ones that aren't common and don't have names today. Why not have a JumpProblem with an ImplicitDiscreteSystem? You might think that's crazy, but it's actually a nice internal representation of implicit tau-leaping 😅, so why not use it as a codegen path? Why not mix SDDEs and jumps? DAEs plus jumps are the way to handle conservation laws.

The reality is that there is one fundamental mathematical underpinning that we're settling down towards. One piece that mathematically makes sense of it is the Lévy decomposition theorem, i.e. the Lévy–Khintchine representation (https://en.wikipedia.org/wiki/L%C3%A9vy_process), which states that any Lévy process can be written as a decomposition into a drift, Brownian motions, and jumps. This means that even colored noise processes, like pink noise, can in theory be written down in this system (e.g. you can represent pink noise as an SDE with white noise processes, and the solution will have the probability distribution of pink noise).
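Concretely, the Lévy–Itô decomposition behind that representation writes a Lévy process in standard notation as (with ν the Lévy measure and N the Poisson random measure):

```latex
X_t = \gamma t + \sigma B_t
      + \int_{|x| < 1} x \left( N(t, dx) - t\,\nu(dx) \right)
      + \int_{|x| \ge 1} x\, N(t, dx)
```

i.e. a deterministic drift, a Brownian part, compensated small jumps, and large jumps — exactly the "Brownian motions and jumps" decomposition referenced above.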

Thus we are taking that foundation, making it all implicit functions, so instead of du = f(u,p,t,dW,dEta) it's now 0 = f(du,u,p,t,dW,dEta), allowing optimization cost functions to be written on it, and allowing it to hybridize with discrete-time clocked dynamics.

It's just optimal discrete-continuous differential algebraic stochastic jump delay equations, or ODCDASJDEs for short. And we can put neural networks in there, so it's just your standard differentiable Neural ODCDASJDEs symbolic-numeric framework. That's it guys, we can touch grass now.

But it's funny that nearly every pairwise combination of these features tends to be enough of a novelty to be a domain and an active field of research:

  • DAEs + ODEs together with tools to transform between them, that's acausal modeling, Modelica
  • SDEs + ODEs together with tools to transform between them, that's moment closures and Feynman–Kac
  • SDEs + DDEs, that's stochastic delay differential equations where we have the first solvers and people are exploring this space
  • Jumps + SDEs, that's jump diffusions, Lévy processes
  • Jumps + ODEs, point processes, or mixed Gillespie+ODE models in systems biology (allowing terms to move back and forth between the two is the "hybrid" method of systems biology, and the holy grail to look at there)
  • Optimization + ODEs, that's optimal control in general, and parameter estimation, and BVPs
  • Optimization + SDEs, stochastic optimal control
  • Neural Networks + ODEs, "hybrid models", grey box UDEs
  • DiscreteSystem + ODEs, "hybrid models" (notice how in just this list there are like 3 or 4 different interpretations of "hybrid modeling" used in different domains... for reference that's why I don't use the word because it's not explicit and changes meaning depending on which room you're standing in), controls

and I can keep going. The key is that we are trying to smoothly move between combinations of more than just two of these fields: the naming isn't even developed, and combinations of more than about 3 generally don't have names, which makes sense because the number of combinations is egregiously large.

But you can imagine how this is going. I have an implicit jump diffusion with some continuous event handling in which I want to train a neural network. For the Brownian motions, I first simplify it with a Girsanov transformation (https://docs.sciml.ai/ModelingToolkit/dev/systems/SDESystem/#ModelingToolkit.Girsanov_transform) to turn the general noise into an additive noise system, then use MomentClosure (https://github.com/augustinas1/MomentClosure.jl) to approximate the jumps as ODEs, and on the additive-noise neural SDAE I use index reduction to send it to index-1 form and train the neural network using the differentiable SDAE solver, where the codegen gives me optimized Jacobians (sparsity etc.) and the symbolic-numeric improvements that are coming very soon. That's the dream. And I think having to make MomentClosure work for JumpSystem, JumpSystem + Optimization, Discrete, etc. is a type mess.

This system type, the ODCDASJDE, is the standard form. Every transformation takes an ODCDASJDE to an ODCDASJDE. We will need to be diligent that passes check their assumptions (i.e. this pass is only valid if there are no jumps, no discrete parts, etc.), but those can be standard checks (e.g. check_only_ode(sys)). But that's where we are heading.

@ChrisRackauckas
Member

So the way to think about this as a compiler is that the system type is a stable IR representation with layered dialects. If you think about the LLVM ecosystem, for example, there are MLIR dialects like StableHLO. You can write passes in StableHLO, and then lower to LLVM IR that doesn't use the higher-level intrinsics, and at the end of the day the codegen goes from that final form, which is closer to the machine. So MLIR has dialects for things like linear algebra, which you do optimizations on, before lowering to "everything is just arrays and indexing memory", and on that lower level the LLVM IR knows how to generate machine code because it is a closer model of the machine.

In the same way, the ODCDASJDE System needs to model the mathematical problem types it's trying to build towards. So it can have noise_eqs, callbacks, jumps, a discrete subsystem, and cost functions. But the construction just comes from an array of equations:

@poissonian eta(A*B)
@brownian zeta(t)

eqs = [
    D(A) ~ D(B) + eta + z
    D(B) ~ -eta + zeta
    0 ~ sin(z(k+1) + z(k))
]

The lowering passes on eqs then systematically decompose the higher-level representation into the lower-level representation: the Brownians make the noise_eqs, the discrete system is separated out, the jumps are constructed, etc. Each of those is just a pass in structural simplification, which we should make independent functions where we can. Expanding connections can also be thought of as a lowering here, just one specifically about hierarchical models.

Then, when building a problem, we validate. To build, eqs should have no Brownians, Poissonians, discrete updates, jumps, etc. in it; those should have been filtered out to the right fields. And certain problem types will need to error (at least by default): if you have noise_eqs filled, the ODEProblem constructor by default errors, saying your system has noise and this is not a valid problem type to codegen for the solver (while allowing you to override that, since there are cases where you might want to go "I want to see how this looks without noise first").
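A sketch of that validation step, with hypothetical accessor names (`get_noise_eqs`, `get_jumps`) standing in for whatever the real field queries end up being:

```julia
# Hypothetical validation pass run by problem constructors. It errors by
# default when the lowered System has fields the target problem can't use,
# with an escape hatch to deliberately drop them.
function validate_for_odeproblem(sys; ignore_noise = false)
    if !isempty(get_noise_eqs(sys)) && !ignore_noise
        error("System has noise equations; construct an SDEProblem, or pass " *
              "`ignore_noise = true` to preview the drift-only dynamics.")
    end
    isempty(get_jumps(sys)) ||
        error("System has jumps; construct a JumpProblem instead.")
    return true
end
```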

This form of layering transformations can make it valid to, say, construct the eqs without any Brownians and instead directly use noise_eqs, because you're basically just intercepting the compiler at a lower level. That's not something to commit to as a stable public API of course, but it is something we should document as a lower step to make the process easier to understand internally. And if someone feels really compelled to over-optimize, they could directly write to noise_eqs, though we would strongly recommend they don't, and we'd want to see benchmarks of how much time it saves before helping to maintain that at all (it had better be at least a 25% time reduction).

The only piece that is hard to make a fully independent part of the pipeline is the final simplifications of structural simplification, namely the index reduction and tearing passes. This is because the tearing passes include SCC construction and scheduling, as a major form of CSE and specialized codegen information. So tearing and index reduction somewhat need to go together at the end of the pipeline, because they rely on the incidence graph not changing in order to define their scheduling representation.

But the other thing to think about is: are there some transforms that may be easier to write on the higher-level representation? I'm not sure. Right now Girsanov is written on the noise eqs, and that's probably the easier way to do it. Regular <-> irregular jumping behavior is probably easier done by gathering the jumps and then swapping between them. Linearization for controls wants to be done on the lowered form (there's a nice generalization to stochastic systems and all of that too). So it is likely fairly common for the passes we are writing to want to come after lowering from the high-level representation to the internal (noise_eqs, jump list, etc.) representation. This means we may want an easy way to call the full set of lowering passes separately from structural simplification in general, though structural simplification will need to lower before the DAE parts.

@isaacsas
Member

isaacsas commented Apr 9, 2025

That's a lot to unpack, but on the regular vs. non-regular jump issue I'd think it is easier to have users specify individual symbolic jumps (explicitly or as Brownians), and then aggregate them if one wants a RegularJump under the hood. This would be relatively straightforward to do right now, and even easier with the new Pre operator functionality, and I had planned/wanted to extend JumpSystems with this at some point as the leaping solvers get better developed.

There are also cases where one might not want to aggregate the jumps for leaping as one may need access to the individual jump rates/affects at different times. So those leaping algorithms would probably prefer the individual jump representation (for example, split-step methods as I previously mentioned).

@ChrisRackauckas
Member

One notable thing to think about is what is left out of the ODCDASJDE System. There's a lot covered 😅, but some things are specifically not covered:

  • Partial derivatives. I think we keep PDESystem separate for now. I would over time, over the next decade, want to allow PDESystem to take Brownians and all of that, but let's wait for a bigger team 😅.
  • Integro-differential equations. In theory we can use the symbolic integral operator in the differential equations. We actually support this in the PDESystems already. Integro-differential equations are non-local operators, and non-locality in time requires the BVP / optimal control type of handling. In fact, the really nice general way to think of IDEs is that they lower to optimal control problems. So, future work.
  • Fractional differential equations. They are also non-local operators, wanting to use the same solution approach. So again, future work.

@TorkelE
Member

TorkelE commented Apr 9, 2025

I am slightly worried that throwing out the different representations will make it really hard to build packages on top of MTK (e.g. Catalyst), and that we are essentially forsaking MTK as an IR and turning it into a normal modelling package.

Realistically, when is this planned for? Adapting Catalyst to something like this could easily soak up all my available time for a year+. Right now I am barely holding on, just trying to keep up with the normal rate of change.

@TorkelE
Member

TorkelE commented Apr 9, 2025

This won't solve the removing-system-types thing; however, would it make sense to make System analogous to Catalyst's ReactionSystem? I.e. in Catalyst we have a ReactionSystem to represent a model. Then, just before simulation (i.e. when passed to e.g. an ODEProblem), the ReactionSystem is converted to something appropriate for the simulation. Here, System, just like ReactionSystem, would be something public that the user would know about. Catalyst and MTK could then both use similar principles to generate problems internally when the user passes a System/ReactionSystem onwards. However, this would not be something the user would need to be aware of or care about.

@ChrisRackauckas
Member

I am slightly worried that throwing out the different representations will make it really hard to build packages on top of MTK (e.g. Catalyst), and that we are essentially forsaking MTK as an IR and turning it into a normal modelling package.
Realistically, when is this planned for? Adapting Catalyst to something like this could easily soak up all my available time for a year+. Right now I am barely holding on, just trying to keep up with the normal rate of change.

Well, I don't think it will actually change much, because we effectively already do this, just half-assed 😅. The SDESystem code is almost a copy-paste of the ODESystem code at this point. Brownians with a system actually just generate an ODESystem with Brownians in the equations, and structural simplification then creates an SDESystem from that so it has a place to put the noise_eqs. That actually has a "bug" of sorts with sparse Jacobian construction: it cannot use the dependency graph from structural simplification, because SDESystem forgets to cache it while ODESystem does. So...

The point is, we already do this, there's just some weird quirks of history where we don't. This will mostly just delete duplicated code and make things a lot simpler.

This also answers "when is this planned for?": I don't know, maybe it's two months off? Again, it's not a huge change, because there are already lots of shared dispatches, code duplication, etc.; it ends up being mostly a refactor rather than an actual change. Most of MTK keeping "different system types" is effectively already faked at this point, just for backwards compatibility.

and that we are essentially forsaking the entire MTK as an IR and just turn it into a normal modelling package

No, the exact opposite. As I just described, you already have to do this with the IR right now; it's just that the interactions between the system types are underdefined, so you need to read the code to know the right way to do things. Brownians in ODESystem -> SDESystem? You have to read the code to know that's the "right" way to do it. ODESystem already has a cost function for the optimal control stuff; did you check for that when constructing from Catalyst? Etc. A lot of the issues that you recently opened from Catalyst using the IR come exactly from having too many systems. Things where SDESystem would convert initial conditions differently, etc., are all "oops, forgot to also do this on SDEs". With one type, that's not a bug that is possible.

With the change, everything is just a System, and there are standard queries for "only ODEs allowed", etc. It should simplify and document the IR much more cleanly, because instead of a plethora of 16 different types, each sharing 70% of its code with the others (not an underestimate or exaggeration), we just have one. It should make things a lot less buggy too, since we don't need to duplicate dispatches.

This gives us two well-defined points. There's the high-level modeling interface:

@poissonian eta(A*B)
@brownian zeta(t)

eqs = [
    D(A) ~ D(B) + eta + z
    D(B) ~ -eta + zeta
    0 ~ sin(z(k+1) + z(k))
]
sys = System(eqs)
sys = structural_simplify(sys)
prob = AbstractProblem(sys, [...])
sol = solve(prob)

but then there's also the IR form, lowered_form(sys), which just does the lowering to remove the Brownians and Poissonians etc. into a standard form for pre-processing. The point of what I wrote above is that having the single standard form would make a lot of bugs go away, and would make it much easier to document what the allowed interactions on the IR are.
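To make that concrete, a sketch of what querying the lowered IR form could look like (the accessor names here are hypothetical):

```julia
# Hypothetical: `lowered_form` filters Brownians/Poissonians out of `eqs`
# and into dedicated fields, leaving a deterministic implicit core.
low = lowered_form(sys)

equations(low)   # implicit ODE/DAE equations with no dW/dN terms left
noise_eqs(low)   # noise coefficients gathered from the Brownian terms
jumps(low)       # symbolic jumps gathered from the Poissonian terms
```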

@TorkelE
Member

TorkelE commented Apr 9, 2025

I will need some (lots of!) time to think this through. Ideally it would be nice not to rush into things, and to be sure we are in a stable place before we move on to this (i.e. I'd prefer waiting until at least the end of summer). I realise part of the idea is to make MTK easier to maintain. However, I do think it would make sense to first do some work on the test suite, figuring out which packages should downstream-test which ones, etc., so that when the big change happens, we are more than ready for any potential consequences.

@ChrisRackauckas
Member

That's a lot to unpack, but on the regular vs. non-regular jump issue I'd think it is easier to have users specify individual symbolic jumps (explicitly or as Brownians), and then aggregate them if one wants a RegularJump under the hood. This would be relatively straightforward to do right now, and even easier with the new Pre operator functionality, and I had planned/wanted to extend JumpSystems with this at some point as the leaping solvers get better developed.

There are also cases where one might not want to aggregate the jumps for leaping as one may need access to the individual jump rates/affects at different times. So those leaping algorithms would probably prefer the individual jump representation (for example, split-step methods as I previously mentioned).

Yeah, that was the reasoning behind the SymbolicJump representation I mentioned in the other thread. It could give us a way to lower a jump to a rate and affect without committing to RegularJump vs ConstantRateJump vs VariableRateJump vs MassActionJump. Basically, what's going on is that RegularJump/ConstantRateJump/VariableRateJump/MassActionJump are jump implementation details: they are pieces of code that describe how to compute the jump. JumpProcesses.jl is set up to take these representations and give you specialized fast code. But those are not a high-level representation of "the jump" itself; they are computable instantiations. We need a different layer to symbolically represent "the jump", and then, as a lowering pass, say "I want to compute a model with this jump as a ConstantRateJump", or "the things inside of the rate are continuous variables, so I need to compute this with a VariableRateJump", or "this is a mass action term, so I can specialize the computation using MassActionJump". But that's clearly a lowering pass from a higher-level representation to a computable representation.

So what we've been missing is just the SymbolicJump(rate, affect), like SymbolicJump(p*A*B, [A ~ Pre(A) - 1]) as something that tells us what the jump to do but makes no statement on how to compute it, and then lowering passes that say "turn this into a mass action jump".

Thinking about how this lower_jumps function would work, it would effectively be as mentioned above:

  1. The standard mode makes everything into irregular jumps (I really hate the phrasing; I took it from Platen's description of "regular time jumps" (tau-leap) vs "irregular time jumps" (i.e. adapt time and handle each jump exactly), but I think we've outgrown it and need to consider a better naming scheme).
    1.1. If any of the values in the rate function are continuous, make it a VariableRateJump
    1.2. If the term is mass action, make it a MassActionJump
    1.3. Otherwise make it a ConstantRateJump
  2. A different mode (kwarg?) that makes everything into one big RegularJump
  3. Down the line, add smarter things, and a hybrid switching mode?
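The classification in steps 1.1-1.3 could be sketched as a single dispatch function (`SymbolicJump` and the predicate/constructor helpers here are all hypothetical names, not existing API):

```julia
# Hypothetical lowering of one symbolic jump into a computable
# JumpProcesses.jl representation, following the mode-1 rules above.
function lower_jump(j::SymbolicJump, sys)
    if depends_on_continuous(j.rate, sys)
        to_variable_rate_jump(j)   # 1.1: rate varies between jump times
    elseif is_massaction(j.rate, sys)
        to_mass_action_jump(j)     # 1.2: specialize the mass-action form
    else
        to_constant_rate_jump(j)   # 1.3: default irregular jump
    end
end
```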

@isaacsas and this effectively solves the problem that stalled the regular jump stuff for a while. The biggest issue there was that allowing different regular jumps would be a much nicer model that would match the other jumps better, but it has always required an aggregated jump. It's always been a bit of an odd interface break, but no form of (rate, affect) per jump can come anywhere near the performance of the RegularJump; it's orders of magnitude different. But unlike ConstantRateJump, the space of allowed changes is much smaller, so it really doesn't need to allow an affect! function. So it was always this question of: what is the right specification for the user, and will this completely change the solver infrastructure?

The real answer is that the jumps should just be symbolically modeled, RegularJump solvers should always assume they have that aggregated function, and the building of that aggregated function can be done optimally with codegen. Trying to stuff a code-level representation of different Poisson jumps just can't be made performant, and really the issue is that the previous idea was trying to mix the user-level representation and the solver-level implementation, when in reality they should be kept separate.

@ChrisRackauckas
Member

Ideally, it would be nice not to rush into things, and to be sure we are in a stable place before we move on to this

Well, we're definitely not rushing into this... we're 10 years into the process, and we already started collapsing everything into a System type back in 2023 (#2207). So the more proper thing to say is that after 8 years of building these interfaces, we sat down and whiteboarded this for about a year, figured out the ins and outs and tested a whole lot of things, and have now been in the middle of the redesign for about 2 years. That's hardly a rush 😅.

I realise maybe part of the idea is to make MTK easier to maintain. However, I do think it would make sense to do some work on the test suite

Now is really the time to do it, because we're starting this combinatorial expansion of the feature set. The issue of the last 3 months has been "SDEs promote to float but ODEs don't, and we test it on ODEs but not SDEs", "ODE initialization has this hook in remake but Nonlinear is missing it", etc. Basically, 90% of @AayushSabharwal's fixes weren't actually doing new things; they were just taking code that existed for one system and putting it on another system that we hadn't done it on yet. So it's very obviously the easiest way to eliminate a ton of maintenance work and the source of many (most?) bugs. Structural issues need structural solutions: you can't keep bandaid-patching a combinatorially-sized space of feature combinations. We just need to fix the structure so all of those bugs are impossible.

But it is rather straightforward to do the split here, since this will be the main aspect of v10. The other things are relatively minor to most users but are good to do in terms of concretizing the definition of the IR and its proper interfaces (#3204). The big visible change will likely be this system unification. So Catalyst can stay on v9 for a bit if it wants, and we can backport bug fixes as needed.

But again, the types of bugs you hit, like:

    # Create model.
    rn = @reaction_network begin
        (k1,k2), X1 <--> X2
    end

    # Checks that `Float32` values (a valid type) are preserved.
    u0 = [:X1 => 1.0f0, :X2 => 3.0f0]
    ps = [:k1 => 2.0f0, :k2 => 3.0f0]
    oprob = ODEProblem(rn, u0, 1.0f0, ps)
    osol = solve(oprob, Tsit5())
    @test eltype(osol[:X1]) == eltype(osol[:X2]) == typeof(oprob[:X1]) == typeof(oprob[:X2]) == Float32 # These are all `Float64`
    @test eltype(osol.t) == typeof(oprob.tspan[1]) == typeof(oprob.tspan[2]) == Float32

That's a structural problem which needs a structural solution, so we can patch a few things in v9 as needed to get your tests running, but it's ultimately a waste of time. It needs this design change to actually be robust, so we should just change the design, and then you update to v10 if you want that stuff to go away. The possibility for those to act differently is just a consequence of the v9 design.

@TorkelE
Member

TorkelE commented Apr 9, 2025

But again, the types of bugs you hit, like:

I'd say this is just a CI issue (though maybe there are structural issues causing it). It is a bug (?); the reason we caught it in Catalyst is that we actually test this kind of thing. Had MTK/OrdinaryDiffEq actually had tests for this kind of thing (or checked what went on downstream), this would have been caught and not an issue.

Even right now, if System hits in 2 months, I don't even know how Catalyst would work when MTK drops the different types (given that ReactionSystem supports all the different system types). We would both have to actually come up with an interface and then implement it as well (while suddenly dropping everything we have had for the last 7 years). I want things to work out as much as you do, but I have really been pushing at the limits of my capacity for quite a while now, and I was really looking forward to getting v15 out and actually having a Sunday off. If something like this hits in June, I really do not know when and where I will be able to find the time to actually adapt Catalyst to it.

@ChrisRackauckas
Member

I'd say this is just a CI issue (though maybe there are structural issues causing it). It is a bug (?); the reason we caught it in Catalyst is that we actually test this kind of thing. Had MTK/OrdinaryDiffEq actually had tests for this kind of thing (or checked what went on downstream), this would have been caught and not an issue.

You're saying the obvious answer is to have an exponential set of tests do an exponential set of things and try to catch everything? Isn't the easier solution to just have one thing to test?

That's really not understanding the issue. The Catalyst tests are already minuscule in comparison to the MTK tests. But there's a combinatorial set of features you can test w.r.t. system types. You just found something not covered in the SDESystem tests, cool. But is it covered in OptimizationSystem? DiscreteSystem? Every feature for every system. Every combination.

The actual problem you are finding is that Catalyst tests and stresses the SDEs more extensively than anything else does. That is because 95% of contributors spend 100% of their time focusing only on ODESystem and maybe NonlinearSystem, so those two are by far the most robust. But any change they make probably needs to be mirrored to SDESystem. And if someone forgets, that's now a lingering bug. And if everybody just has to remember to do the right thing every single time, that will inevitably fail sometimes. ODEs will get more extensive tests, more extensive debugging, etc. The best thing to do is to change the design so that every ODE test is also an SDE test. The way to do that is to make it the same piece of code.
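The "same piece of code" argument can be sketched with a toy version of the unified design. To be clear, the names below (`ToySystem`, `build_rhs`, the problem constructors) are illustrative stand-ins, not MTK's actual implementation; the sketch only shows the dispatch pattern being argued for, where a single type carries an optional noise field and problem constructors check what the system actually contains:

    # Hedged sketch, not MTK's real API: one system type, shared lowering code.
    struct ToySystem
        eqs::Vector            # equations shared by every problem type
        noise::Union{Nothing, Vector}  # present only for SDE-capable systems
    end

    # One shared code path: any fix here benefits ODE and SDE construction alike.
    build_rhs(sys::ToySystem) = sys.eqs

    ode_problem_like(sys::ToySystem) = (; f = build_rhs(sys))

    function sde_problem_like(sys::ToySystem)
        sys.noise === nothing &&
            error("cannot build an SDE problem from a system without noise")
        return (; f = build_rhs(sys), g = sys.noise)
    end

Because both constructors go through the same `build_rhs`, every test that exercises the ODE path also exercises the code the SDE path depends on, which is the maintenance point being made above.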

I don't see how you could think it makes anything easier for anyone. The most common PR review comment on MTK right now is to tell someone "oh, you made a change? Do the same thing for XSystem for X in [SDESystem, OptimizationSystem, ...]". In order to contribute to MTK correctly, you already need to learn what SDESystem is, because if you change ODESystem you'd better figure out what to do with SDESystem too. For example, look at #3514 adding CSE. This is exactly the kind of thing where you will come by a year later and say "why is XSystem slower?" — oh, because the CSE PR missed a system. Oops. Why does adding a feature have to be copied 8 times, when forgetting even one copy leaves you with an issue?

Who is that easier for? For a new contributor, doing it for one system type would be much easier than having to learn about some system you've never used (OptimizationSystem) and then realizing that it needed the same change too. For a maintainer, doing it on one system type is much easier than having to paste around the same fix and double/triple check you got them all. I don't understand how this is an argument; it's clearly easier for nobody.

Even right now, if System hits in 2 months, I don't even know how Catalyst would work when MTK drops the different types (given that ReactionSystem supports all the different system types). We would both have to actually come up with an interface and then implement it as well (while suddenly dropping everything we have had for the last 7 years). I want things to work out as much as you do, but I have really been pushing all my capacities to keep up with stuff for quite a while now, and I was really looking forward to getting v15 out and actually having a Sunday off. If something like this hits in June, I really do not know when and where I will be able to find the time to actually adapt Catalyst to this.

What is with the fear-mongering? It would take almost zero changes for Catalyst to use this. I don't understand what is so scary here. In fact, Catalyst can adopt this in a completely non-breaking way just by find/replacing XSystem -> System. This would literally take you less than 5 minutes for the whole repo. I will just do that myself and make the PR. Literally this code:

using Catalyst, ModelingToolkit, Latexify
rn = @reaction_network Repressilator begin
    hillr(P₃,α,K,n), ∅ --> m₁
    hillr(P₁,α,K,n), ∅ --> m₂
    hillr(P₂,α,K,n), ∅ --> m₃
    (δ,γ), m₁ <--> ∅
    (δ,γ), m₂ <--> ∅
    (δ,γ), m₃ <--> ∅
    β, m₁ --> m₁ + P₁
    β, m₂ --> m₂ + P₂
    β, m₃ --> m₃ + P₃
    μ, P₁ --> ∅
    μ, P₂ --> ∅
    μ, P₃ --> ∅
end

odesys = convert(ODESystem, rn)

psymmap  = (rn.α => .5, rn.K => 40, rn.n => 2, rn.δ => log(2)/120,
    rn.γ => 5e-3, rn.β => 20*log(2)/120, rn.μ => log(2)/60)
u₀symmap = [rn.m₁ => 0., rn.m₂ => 0., rn.m₃ => 0., rn.P₁ => 20.,
    rn.P₂ => 0., rn.P₃ => 0.]

# time interval to solve on
tspan = (0., 10000.)

# create the ODEProblem we want to solve
oprob = ODEProblem(rn, u₀symmap, tspan, psymmap)

odesys = complete(odesys)
oprob2 = ODEProblem(odesys, u₀symmap, tspan, psymmap)

would work exactly the same way if you change this line of code to System: https://github.com/SciML/Catalyst.jl/blob/v14.4.1/src/reactionsystem_conversions.jl#L507 . It's 3 characters! You have spent more time writing in this thread than the total number of characters to change for the whole repo. I will just do the PR; it's literally not worth the effort to argue about how somebody will find the time for a 5-minute change.

Now again, I would recommend that odesys = convert(ODESystem, rn) change in Catalyst v16 to odesys = convert(MassActionODESystem, rn) to fix the naming ambiguity, but that change doesn't even have to happen. Even if Catalyst still wants to use this syntax 5 years after MTK drops the deprecation path, you can just do:

struct ODESystem end
export ODESystem

and the v15 syntax will still work without a change to the user interface! So you don't need to change anything regardless of whether the system type is unified.
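Another shape the same shim could take (a sketch, not code from this PR; it assumes the eventual `System` constructor accepts the same arguments `ODESystem` did) is a forwarding method with a deprecation warning, using Julia's `Base.depwarn`:

    # Hypothetical compatibility shim: forward old ODESystem calls to System.
    # A plain alias (`const ODESystem = System`) would also work if no warning
    # is wanted; the two approaches cannot be combined under one name.
    function ODESystem(args...; kwargs...)
        Base.depwarn("ODESystem is deprecated, use System instead", :ODESystem)
        return System(args...; kwargs...)
    end

Either way, downstream packages keep their existing call sites while the warning (if any) nudges users toward the unified type.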

@TorkelE
Copy link
Member

TorkelE commented Apr 10, 2025

Sorry, I thought this might be coupled with a generalisation of ODEProblem, SDEProblem, etc. to Problem, and the full automation of the pipeline with other stuff! I agree, if we can still use ODEProblem, everything becomes way more straightforward (and, as you say, once they are no longer mandatory, keeping the convert calls becomes easier).

Sounds good, let's keep the loop open on this one. I agree it can make a lot of things way easier for us (and for Aayush especially!)
