
Naming, Laziness, and Ordering #214

Closed
ChrisRackauckas opened this issue Dec 22, 2019 · 9 comments

@ChrisRackauckas
Member

There are three issues that are hitting the same fundamental problem. Here are a few questions to get started:

  1. When doing ODE order lowering, what should we name the outputs? The user has to know and understand the semantics in order to know how to set the inputs to the correct values, so this is not an internal detail.
  2. When doing component-based systems, how do we name things? How does the user know how to set the parameters of component 1 vs component 2?
  3. When building a simple ODE system with automated parameter and variable capturing, how can the user sanely set the parameters?
  4. When doing index reduction, same thing.
  5. Etc., etc.: this comes up with almost every transform.

These are all manifestations of the same underlying issue: when we generate code,

  1. We need some compiler-y way to make sure we don't clobber existing names.
  2. It needs to be user-facing, because the user must understand the resulting semantics to know how to set values.

How can we solve it? @HarrisonGrodin and I talked through this in a few different ways. One option is to give up trying to come up with sane names and keep things lazy for as long as possible. D(x) makes sense, so try keeping it. Names would then only be required at the very end, when we flatten the system to compile it, at which point hopefully the user doesn't have to interact with them anymore.

However, the issue that came up with that is that the user still has to understand the f that you built, because they need to give u0 and p. Do we then move u0 and p into the ODESystem and directly generate an ODEProblem? What about tspan? This is the line of thought I was going down before, and it doesn't seem very generalizable: we'd end up special-casing a lot of things specifically for DESystem types, and there wouldn't be much reusability for other domains. Also, it just feels wrong.

But a more recent idea I had is to build "map vector" functions for the user. For example, if we know internally the order of the variables we are using, then:

map_vector_state(odesys,[x=>1.5,y=>2.0,D(x)=>5.0])

can be used to spit out a u0 vector that matches the system we are generating. Parameters can be handled similarly, and this idea then applies even to NonlinearSystem, optimization, and beyond. It allows handling the lazy stuff in a nice way as well. It would also be a good place to throw errors asking for values. For example, if daesys comes from an index reduction, the user might not know up front which values they need to define, and the context of the system itself will be necessary to give the user a clue. From this we can say Error: D(y) required for the system, which is easily interpretable by the user.

This still needs more thought, but it could work. In order to do such matching, though, we need to enforce a normal form, which is not something we handle well right now.
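To make the idea concrete, here is a rough Python sketch of what such a map-vector function could do (the names and signature are hypothetical, not MTK's actual API): the system knows its internal variable ordering, the user supplies values by name, and missing values produce an interpretable error.

```python
def map_vector_state(ordered_vars, assignments):
    """Build a state vector matching the system's internal variable order."""
    lookup = dict(assignments)
    missing = [v for v in ordered_vars if v not in lookup]
    if missing:
        raise ValueError(", ".join(missing) + " required for the system")
    return [lookup[v] for v in ordered_vars]

# Suppose order lowering produced the internal ordering [x, y, D(x)]:
u0 = map_vector_state(["x", "y", "D(x)"],
                      [("x", 1.5), ("y", 2.0), ("D(x)", 5.0)])
print(u0)  # [1.5, 2.0, 5.0]

# A missing value is reported in terms the user can interpret:
try:
    map_vector_state(["x", "D(y)"], [("x", 1.0)])
except ValueError as e:
    print(e)  # D(y) required for the system
```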

@ranjanan @YingboMa @shashi

@YingboMa
Member

But recently an idea that I had was just about building "map vector" functions for the user. For example, if we know internally the order of the variables we are using, then:

map_vector_state(odesys,[x=>1.5,y=>2.0,D(x)=>5.0])

I like this idea. Each transformation pass should also return the names of the parameters that the function is expecting.

In the book Structure and Interpretation of Classical Mechanics, there is no such thing as differentiating an expression; it only deals with functions. I think we should do that, too. For scalar differentiation, it works like the following:

1 ]=> (D sin)
#| a-euclidean-derivative |#

1 ]=> ((D sin) 0)
#| 1 |#

1 ]=> ((D sin) 'x)
#|
(cos x)
|#


1 ]=> ((* 5 (D sin)) 'x)
#|
(* 5 (cos x))
|#

1 ]=> ((* 5 (D sin)) 0.1)
#| 4.975020826390129 |#

The D operator returns a functor that has arithmetic defined on it. This enables us to do cool things like taking Taylor series by exponentiating the operator:

1 ]=> (series:print (((exp D) cos) 'h) 8)
(cos h)
(* -1 (sin h))
(* -1/2 (cos h))
(* 1/6 (sin h))
(* 1/24 (cos h))
(* -1/120 (sin h))
(* -1/720 (cos h))
(* 1/5040 (sin h))
#| ... |#
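A rough Python analogue of the behavior above (not MTK code, and numerical rather than symbolic): D maps a function to a function, and arithmetic is defined on the function wrapper itself, so 5 * D(sin) is again a function.

```python
import math

class Func:
    """Wrap a callable so arithmetic (here just *) is defined on functions."""
    def __init__(self, f):
        self.f = f
    def __call__(self, x):
        return self.f(x)
    def __mul__(self, other):
        # Multiplying by a constant treats it as a constant function.
        g = other if callable(other) else (lambda _x: other)
        return Func(lambda x: self(x) * g(x))
    __rmul__ = __mul__

def D(f, h=1e-6):
    """Central-difference derivative operator: function -> function."""
    return Func(lambda x: (f(x + h) - f(x - h)) / (2 * h))

five_d_sin = 5 * D(Func(math.sin))   # again a function
print(five_d_sin(0.1))               # ≈ 5*cos(0.1) ≈ 4.97502
```

scmutils does this symbolically and on plain Scheme functions; the sketch only captures the "functions all the way down" shape of the API.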

Gradient works like this:

1 ]=> (define g (literal-function 'g
    (-> (UP Real Real) Real))) ; g: R^2 -> R
#| g |#

1 ]=> ((D g) (up 'x 'y))
#|
(down (((partial 0) g) (up x y)) (((partial 1) g) (up x y)))
|#

1 ]=> (define (f x y) (* (expt x 3)
                   (expt y 5)))
#| f |#

1 ]=> ((D f) 'x 'y)
#|
(down (* 3 (expt x 2) (expt y 5)) (* 5 (expt x 3) (expt y 4)))
|#

1 ]=> (define (k x) (* (expt (ref x 0) 3)
                 (expt (ref x 1) 5)))
#| k |#

1 ]=> ((D k) (up 'x 'y))
#|
(down (* 3 (expt x 2) (expt y 5)) (* 5 (expt x 3) (expt y 4)))
|#

This approach generalizes well to high dimensional objects, too.

@HarrisonGrodin
Contributor

Ultimately, it seems this boils down to implementing and working with alpha equivalence.

If we're creating new names (e.g. name * "_" * "t"^n), we have to be very careful that names don't overlap, which seems like a losing game. The likely better option is to let names be completely arbitrary (i.e. let gensym deal with this for us) and figure out what to guarantee the user about the result. We could either (a) return a mapping (array, Dict, etc.) from Tuple{Variable,UInt} to Variable or (b) have a better canonical form for "derivative of x with respect to t" which anyone can produce. The latter seems more sustainable, so I quite like the idea of D(x) being a value.


Regarding @YingboMa's point, I agree - differentiation is really just a higher-order function diff : (R -> R) -> (R -> R). Taking variable binding sites seriously is probably long overdue.

For example, consider the expression sinⁿ(x), meaning sin(sin(...(sin(x))...)). What does it mean to differentiate with respect to x? (For that matter, what does it mean to differentiate with respect to n?) It's probably more reasonable to treat this as a function, say, f : (x,n) ↦ sinⁿ(x). Then, diff(x ↦ f(x,2)) makes actual sense! Food for thought...
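Option (a) above can be sketched in a few lines of Python (illustrative only; in Julia, Base.gensym would play the fresh-name role): lowering generates arbitrary collision-free names and returns the (variable, derivative order) → name mapping to the user instead of guessing readable names like x_tt.

```python
import itertools

_counter = itertools.count()

def gensym(base):
    """Make a fresh, collision-free name from a base name."""
    return f"{base}#{next(_counter)}"

def lower_order(var, max_order):
    """Return a mapping (variable, derivative order) -> fresh variable name."""
    return {(var, n): gensym(var) for n in range(max_order + 1)}

mapping = lower_order("x", 2)
print(mapping)  # {('x', 0): 'x#0', ('x', 1): 'x#1', ('x', 2): 'x#2'}
```

The user never has to predict the generated names; they only consult the returned mapping.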

@shashi
Collaborator

shashi commented Dec 23, 2019

I was going to open an issue and write something similar to what Yingbo wrote! I was working on the D operator with ForwardDiff + ModelingToolkit ;)

Gradient works like this:
This approach generalizes well to high dimensional objects, too.

Another example: the Jacobian.

1 ]=> (define (foo x)
           (up (sin (ref x 0)) (cos (ref x 1)) (tan (ref x 2))))
1 ]=> ((D foo) (up 1 2 3))
#|
(down (up .5403023058681398 0 0) (up 0 -.9092974268256817 0) (up 0 0 1.0203195169424268))
|#

1 ]=> ((D foo) (up 'x 'y 'z))
#|
(down (up (cos x) 0 0) (up 0 (* -1 (sin y)) 0) (up 0 0 (/ 1 (expt (cos z) 2))))
|#

and (compose D D) is Hessian:

1 ]=> (define (bar x)
          (dot-product (foo x) (up 1 1 1)))

1 ]=> ((D (D bar)) (up 'x 'y 'z))
#|
(down (down (* -1 (sin x)) 0 0) (down 0 (* -1 (cos y)) 0) (down 0 0 (/ (* 2 (sin z)) (expt (cos z) 3))))
|#
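The 1-D version of "compose D with itself" can be sketched in plain Python with finite differences (not scmutils, just the same shape): applying a derivative operator twice yields the second derivative, the scalar analogue of the Hessian.

```python
import math

def D(f, h=1e-4):
    """Central-difference derivative operator: function -> function."""
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)

d2_sin = D(D(math.sin))   # composing the operator with itself
print(d2_sin(0.5))        # ≈ -sin(0.5) ≈ -0.4794
```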

Another very neat convention the book uses is having t as the first element of any state; it's not treated as special like in the DiffEq ecosystem. Unrelated, but I love that.

Ultimately, it seems this boils down to implementing and working with alpha equivalence.

Our programming language has it, so we should try to use it without re-implementing if possible. As a side effect, users will already know it.

For example, if we know internally the order of the variables we are using, then:
map_vector_state(odesys,[x=>1.5,y=>2.0,D(x)=>5.0])

Can you say a bit about why the user wouldn't know the order of the variables?

D(x) makes sense

From what I can tell, D(x) is just a name, right? Like xdot?

@shashi
Collaborator

shashi commented Dec 23, 2019

One more thing: ((partial i) g) (you can see it in one of Yingbo's examples), read in Julia as partial(i)(g), is an optimized version of x->D(g)(x)[i]. Super convenient to not have to remember names.
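A minimal Python sketch of that shape (names hypothetical; scmutils does this symbolically, here it's a finite difference): partial(i) returns an operator that takes a function of a coordinate tuple and returns the i-th gradient component directly, so the user never names the variable being differentiated.

```python
def partial(i, h=1e-6):
    """partial(i)(g): the i-th gradient component of g, as a function."""
    def op(g):
        def dg(x):
            xp, xm = list(x), list(x)
            xp[i] += h
            xm[i] -= h
            return (g(xp) - g(xm)) / (2 * h)
        return dg
    return op

g = lambda v: v[0] ** 3 * v[1] ** 5   # g(x, y) = x^3 * y^5
print(partial(0)(g)([2.0, 1.0]))      # ≈ ∂g/∂x = 3*x^2*y^5 = 12 at (2, 1)
```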

@YingboMa
Member

Another very neat convention that the book uses is having t as the first element of any state. It's not treated as special like in DiffEq ecosystem, unrelated, but I love that.

In fact, when plotting a solution object, time is treated as the 0-th state in DiffEqBase.jl. 😸 E.g.

plot(sol, vars=(0, 1))

plots t vs u[1]. We might want to do that in MTK, too?

@ChrisRackauckas
Member Author

That's not entirely correct though, since with MTK there's not just t, but also potentially x, y, z for example in PDEs.

@shashi
Collaborator

shashi commented Dec 23, 2019

The D operator returns a functor that has arithmetics defined on it.

Actually, it just returns a plain Scheme function! scmutils defines arithmetic directly on functions. It works quite well.

@MasonProtter
Contributor

If we're worried about overlap of variable names, how about returning some sort of locally scoped object like a module or closure where the bindings can be safely defined without worrying about variable name clashes?

@ChrisRackauckas
Member Author

Finished in #265
