<a href="https://colab.research.google.com/github/aniketjivani/generative_experiments/blob/master/ExploreHamiltonians.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

My somewhat silly attempt at a self-contained tutorial 🕵 on NNs + conservation laws + "real" applications. Hope this is useful! Things that are unclear to me, or where potential connections need more investigation, are highlighted in red, and there are plenty of these.

Hamiltonian mechanics is a branch of physics concerned with conservation laws and invariances. There is considerable interest in using deep neural networks to model dynamical systems, whether for system identification or long-term prediction purposes. Existing neural networks that are used to model dynamical systems, irrespective of whether they predict states in discrete timesteps or in a continuous fashion, do not learn exact conservation laws or invariant quantities - to quote one of the papers below, they may drift away from the true dynamics of the system as small errors accumulate.

One motivating work for this is the section on Hyperbolic CNNs from Ruthotto and Haber (https://doi.org/10.1007/s10851-019-00903-1) although their work only addresses parameter-efficient training of networks for image-classification type problems.

The methods we will see next, namely:

1. Hamiltonian NN (https://arxiv.org/abs/1906.01563) (see 🤦 P.S.)

2. Symplectic ODE Nets (https://arxiv.org/abs/1909.12077)

3. TRS-ODEN (https://arxiv.org/abs/2007.11362)

4. Hamiltonian OpInf (https://arxiv.org/abs/2107.12996)

are different neural network architectures / reduced-order models that build in better inductive biases, in the expectation that the trained model generalizes better and stays well-behaved over longer rollouts.

We will start with the following questions:

a. What is the Hamiltonian? How do we interpret it in terms of conservation laws?

b. What is symplectic integration? How does it tie in to Hamiltonian mechanics?

c. How do we write out the Hamiltonian for different systems of interest?

d. How do we learn it from data without explicit knowledge of its functional form? How do these approaches scale and what are interesting applications?

e. What is the typical improvement seen by enforcing the Hamiltonian in predictions over time?


🤦 P.S. Btw, there are Lagrangian Neural Networks too, though my rudimentary understanding is that this is more about convenience of representation than any specific advantage: you can [obtain the Hamiltonian from the Lagrangian](https://physics.stackexchange.com/questions/190471/constructing-lagrangian-from-the-hamiltonian) - see https://greydanus.github.io/2020/03/10/lagrangian-nns . Their [GitHub](https://github.com/MilesCranmer/lagrangian_nns) also helps sum up the hype nicely if you doubt the claims 😏 (the comparison table below is from there):

|	| Neural Networks  | [Neural ODEs](https://arxiv.org/abs/1806.07366) | [HNNs](https://arxiv.org/abs/1906.01563)  | [DeLaN (ICLR'19)](https://arxiv.org/abs/1907.04490) | LNNs |
| ------------- |:------------:| :------------:| :------------:| :------------:| :------------:|
| Can model dynamical systems | ✔ | ✔ | ✔ | ✔ | ✔ |
| Learns differential equations | | ✔ | ✔ | ✔ | ✔ |
| Learns exact conservation laws | | | ✔ | ✔ | ✔ |
| Learns from arbitrary coords. |✔ | ✔|| ✔ | ✔ |
| Learns arbitrary Lagrangians | | |  | | ✔ |


$\color{red}{\textrm{Free energy}}$?! - https://arxiv.org/pdf/1706.09010.pdf

### Chapter 1: Through goes Hamilton!

It's an [F1 🏎 joke : )](https://youtu.be/TOWEAIG-OXU?si=QxoLCvvL89ygUxwZ&t=104)

I took a stab at reading [David Morin's chapter](https://bpb-us-e1.wpmucdn.com/sites.harvard.edu/dist/0/550/files/2023/11/cmchap15.pdf) on this topic -  so we will begin by briefly defining the principles of Newtonian, Lagrangian and Hamiltonian mechanics (_the latter two reinterpreting the former in ways that enabled solving problems in other domains!_)

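With $L$ the Lagrangian and $q_i$ the generalized coordinates, the quantity

$$E \equiv \left(\sum_{i=1}^N \frac{\partial L}{\partial \dot{q}_i} \dot{q}_i\right) - L$$

is the energy of the system.

Rewriting $E$ in terms of the coordinates $q_i$ and the conjugate momenta $p_i \equiv \partial L / \partial \dot{q}_i$, instead of the velocities $\dot{q}_i$, defines the Hamiltonian $H(q, p)$.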

Consider a particle undergoing 1D motion under the influence of a potential $V(x)$, where $x$ is a Cartesian coordinate.

Then $L \equiv T - V = m\dot{x}^2/2 - V(x)$. This means $E$ takes the form:

$$E \equiv \frac{\partial L}{\partial \dot{x}} \dot{x} - L = 2T - (T - V) = T + V$$

aka $\color{cyan}{\textrm{the total energy}}$. This holds even for arbitrary coordinates $q$, as long as $x=x(q)$ has no explicit time dependence, i.e. $x \neq x(q, t)$, or equivalently $q \neq q(x, t)$. These complications are not considered in the book. $\color{red}{\textrm{Write equations showing this below}}$
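
A sketch of those equations for a single coordinate, assuming a time-independent map $x = x(q)$ and writing $x'(q) \equiv dx/dq$ (notation introduced here for illustration):

$$\dot{x} = x'(q)\,\dot{q} \quad\Rightarrow\quad T = \tfrac{1}{2} m\, x'(q)^2\, \dot{q}^2, \qquad L = T - V(x(q))$$

$$E = \frac{\partial L}{\partial \dot{q}}\,\dot{q} - L = m\, x'(q)^2\, \dot{q}^2 - (T - V) = 2T - (T - V) = T + V$$

The step that matters is that $T$ is purely quadratic in $\dot{q}$, so $(\partial T / \partial \dot{q})\,\dot{q} = 2T$. If instead $x = x(q, t)$, then

$$\dot{x} = \frac{\partial x}{\partial q}\,\dot{q} + \frac{\partial x}{\partial t}$$

and $T$ picks up terms that are linear in $\dot{q}$ and independent of $\dot{q}$, so $(\partial T / \partial \dot{q})\,\dot{q} \neq 2T$ and $E \neq T + V$ in general.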



> $\color{lime}{\textrm{Theorem:}}$ A necessary and sufficient condition for $E$ to be the total energy of a system whose Lagrangian is expressed in coordinates $q_i$ is that the $q_i$ are related to Cartesian coordinates $x_i$ ($q$ may be of smaller dimension than $x$) via (invertible) expressions with no explicit $t$ dependence, of the form:

$$x_1 = x_1(q_1, q_2, \cdots)$$

$$\vdots$$

$$x_N = x_N(q_1, q_2, \cdots)$$






$\color{red}{\textrm{Then what exactly is being conserved when there is time-dependence?}}$

Trivial example of $q$ having smaller dimension than $x$: a particle constrained to move in the $xy$ plane, which implies $z=0$. Other examples? Motion of a pendulum?

In closing, I'll quote from the text:



> When solving ordinary mechanics problems with the Hamiltonian formalism,
you should keep in mind that the purpose is not to gain anything in efficiency, but
rather to become familiar with a branch of physics that has numerous indispensable
applications to other branches


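Recipe for solving problems with the Hamiltonian method:

1. Calculate $T$ and $V$, and write $L = T - V$ in whichever coordinates are most convenient.

2. Calculate $p_i \equiv \partial L / \partial \dot{q}_i$.

3. Solve for the $\dot{q}_i$ in terms of the $q_i$ and $p_i$.

4. Write $H = \sum_i p_i \dot{q}_i - L$ and eliminate the $\dot{q}_i$, so that $H$ depends only on the $q_i$ and $p_i$.

5. Write down Hamilton's equations, $\dot{q}_i = \partial H / \partial p_i$ and $\dot{p}_i = -\partial H / \partial q_i$, and solve them.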
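
As a sketch of how the recipe plays out, carried through the remaining standard steps (forming $H$ and writing Hamilton's equations), take an ideal pendulum. The mass $m$, rod length $\ell$, angle $\theta$ measured from the downward vertical, and $V = 0$ at the pivot height are my choices for illustration.

$$T = \tfrac{1}{2} m \ell^2 \dot{\theta}^2, \qquad V = -m g \ell \cos\theta, \qquad L = \tfrac{1}{2} m \ell^2 \dot{\theta}^2 + m g \ell \cos\theta$$

$$p_\theta \equiv \frac{\partial L}{\partial \dot{\theta}} = m \ell^2 \dot{\theta} \quad\Rightarrow\quad \dot{\theta} = \frac{p_\theta}{m \ell^2}$$

$$H = p_\theta \dot{\theta} - L = \frac{p_\theta^2}{2 m \ell^2} - m g \ell \cos\theta$$

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{m \ell^2}, \qquad \dot{p}_\theta = -\frac{\partial H}{\partial \theta} = -m g \ell \sin\theta$$

Combining the last two gives the familiar $\ddot{\theta} = -(g/\ell)\sin\theta$, so nothing is gained in efficiency here, which is the point of the quote above.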



### Chapter 2: If you thought integration was hard...

It gets harder
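
To make "harder" a bit more concrete: a symplectic integrator is one whose update map preserves the phase-space structure of Hamiltonian flow, and in practice that shows up as energy error that stays bounded instead of drifting. Below is a minimal sketch, assuming a unit-mass, unit-stiffness harmonic oscillator with $H = (p^2 + q^2)/2$. All function names, the step size, and the number of steps are my own choices for illustration, not taken from any of the papers above.

```python
def energy(q, p):
    """Total energy H = (p^2 + q^2) / 2 of the toy oscillator (assumed system)."""
    return 0.5 * (p * p + q * q)


def explicit_euler_step(q, p, dt):
    """Explicit Euler: both updates use the old state. Not symplectic."""
    return q + dt * p, p - dt * q


def symplectic_euler_step(q, p, dt):
    """Symplectic Euler: update p first, then update q with the new p."""
    p_new = p - dt * q
    q_new = q + dt * p_new
    return q_new, p_new


def rollout_energy(step, q0, p0, dt, n_steps):
    """Integrate n_steps with the given step function and record the energy."""
    q, p = q0, p0
    energies = [energy(q, p)]
    for _ in range(n_steps):
        q, p = step(q, p, dt)
        energies.append(energy(q, p))
    return energies


dt, n_steps = 0.1, 1000  # illustrative values
energy_euler = rollout_energy(explicit_euler_step, 1.0, 0.0, dt, n_steps)
energy_symplectic = rollout_energy(symplectic_euler_step, 1.0, 0.0, dt, n_steps)

print("explicit Euler, final energy:", energy_euler[-1])
print("symplectic Euler, max |E - E0|:",
      max(abs(e - energy_symplectic[0]) for e in energy_symplectic))
```

For this particular system, explicit Euler multiplies the energy by $(1 + \Delta t^2)$ on every step, so it grows without bound, while the symplectic Euler energy only oscillates around its initial value. This is the same kind of drift the quoted papers worry about in learned models.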

### Chapter 3: Learning the Hamiltonian

### 3.1 When analytical form is known...

We are definitely considering harder problems than the canonical systems presented here, but it is instructive to see these anyways.
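
When $H$ is known in closed form, the dynamics follow directly from Hamilton's equations, $\dot{q} = \partial H/\partial p$ and $\dot{p} = -\partial H/\partial q$. Here is a minimal sketch for an ideal pendulum. The function names and default parameter values are hypothetical, and central finite differences stand in for the automatic differentiation one would normally use with a learned $H$.

```python
import math


def pendulum_hamiltonian(q, p, m=1.0, l=1.0, g=9.81):
    """H(q, p) = p^2 / (2 m l^2) - m g l cos(q) for an ideal pendulum (assumed parameters)."""
    return p * p / (2.0 * m * l * l) - m * g * l * math.cos(q)


def hamiltonian_vector_field(H, q, p, eps=1e-6):
    """Return (dq/dt, dp/dt) = (dH/dp, -dH/dq), using central differences."""
    dH_dq = (H(q + eps, p) - H(q - eps, p)) / (2.0 * eps)
    dH_dp = (H(q, p + eps) - H(q, p - eps)) / (2.0 * eps)
    return dH_dp, -dH_dq


# Time derivatives of the state at one (arbitrary) phase-space point.
q_dot, p_dot = hamiltonian_vector_field(pendulum_hamiltonian, q=0.5, p=0.2)
print("dq/dt:", q_dot, "dp/dt:", p_dot)
```

The same `hamiltonian_vector_field` helper would work unchanged for any scalar function of `(q, p)`, which is roughly the idea behind learning $H$ itself and differentiating it, rather than learning the time derivatives directly.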






### 3.2 Learning from "real" data

Pendulum experiment from https://github.com/greydanus/hamiltonian-nn/blob/master/analyze-pixels.ipynb






**Hamiltonian Neural Networks**:



$\color{lime}{\textrm{Performance on non-conservative systems?}}$ - as it turns out, many of these papers do consider examples from actual experiments, like measurements on a real pendulum, where air resistance causes dissipation. The HNN has smaller error than the baseline NN, but it always assumes a conserved quantity exists, so it cannot account for friction. This implies that dissipation would have to be modeled separately.
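
A short sketch of why, assuming a time-independent $H(q, p)$ and a linear drag term with coefficient $\gamma \geq 0$ (both introduced here for illustration). Along Hamilton's equations the energy cannot change:

$$\frac{dH}{dt} = \frac{\partial H}{\partial q}\,\dot{q} + \frac{\partial H}{\partial p}\,\dot{p} = \frac{\partial H}{\partial q}\frac{\partial H}{\partial p} - \frac{\partial H}{\partial p}\frac{\partial H}{\partial q} = 0$$

Adding drag, $\dot{p} = -\partial H/\partial q - \gamma\,\partial H/\partial p$ with $\dot{q} = \partial H/\partial p$ unchanged, gives instead

$$\frac{dH}{dt} = -\gamma \left(\frac{\partial H}{\partial p}\right)^2 \leq 0$$

So a model whose dynamics are exactly of Hamiltonian form has no way to produce the energy decay seen in the real pendulum data; the extra $\gamma$ term (or something like it) has to come from elsewhere.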

This is also a very nice segue into reversibility:

### Chapter 4: Let's put it to the sword

We will begin with experiments on Burgers' equation. For now we will focus on predictive ability compared to a baseline NN (can this be a Neural ODE?), and even a simple shifted OpInf-type ROM. How much mileage can we get from the HNN formalism?

And that's it. No part 2 of the HCU 🎥 :H
 hopefully!