## The Scientific Python Ecosystem as a Platform for the Algebraic Collective Model

*Arthur Ryman, 2022-06-04*

### A symposium celebrating the life and work of Prof. David Rowe

## Abstract

* Advances in theory, such as the Algebraic Collective Model (ACM) of the Atomic Nucleus, extend the range of physical systems that computers can model
* Historically, many computer languages have been used for implementing physical models
    * e.g. Fortran, APL, Pascal, C, C++, Mathematica, Maple, Matlab, ...
* The recent explosion of Internet data has stimulated the development of a powerful software and hardware stack for efficiently processing it
* Python has emerged as the dominant platform for Data Science and Machine Learning
* Python now has a rich set of scientific capabilities that also make it well-suited for physics
* This presentation reports on my experience converting the Welsh-Rowe ACM code from Maple to Python
* Based on this exercise, I argue that:
    * Python is a very appropriate platform for the ACM and for quantum mechanics in general 
    * Python has many engineering and economic advantages over special purpose computer algebra systems such as Maple, Mathematica, and Matlab

## Outline

* Background
* The Algebraic Collective Model (ACM) of the Atomic Nucleus
* Python and its Scientific Ecosystem
* Next Steps
* Conclusion

## Background

* 1977: met David and George Rosensteel at a Mathematical Physics conference in Tübingen, Germany
    * argued with George about bases of separable Hilbert spaces :-)
* 1978-79: joined David's research team at U. of Toronto
    * worked with George and Julianna Carvalho
* 1980-2021: I become a professional software engineer, but ...
    * maintained close friendship with David
    * discussed math, physics, computing
    * played squash, badminton
    * enjoyed sushi, scotch
* 2019-06-24: hosted lunch with David and family
    * David is disappointed with low adoption rate of ACM code :-(
        * cites Maple as an obstacle
    * I suggest conversion to Python and start `acmpy` project

## 2019-06-25 Email

Hi Arthur

...


Thus, we are optimistic that with a readily available computer code, the ACM, which can easily handle the full ACM with a variety of Hamiltonians, including its adiabatic and other approximations, would become a generally used tool.  But, no doubt I am being overly optimistic.  **That was our original hope but the use of Maple proved (I am assuming) to be the obstacle to this happening.**

My guess is that the most widely used computer languages are C, fortran, and python (but really I am only guessing).  It would seem to make most sense, if we consulted the journal Comput. Phys. Commun.
and if, for example, they would want to publish a short note with the inclusion of the more useable code **(I recall that, when we first submitted the code, they mentioned that the language would be a barrier to its usage, but at that point we were already committed)**. 

...

David

## Background cont'd

* 2019-2021: professional obligations delay `acmpy` progress
* 2022-03-04: symposium date locked in
    * push to complete `acmpy` development
* 2022-04-12: initial version complete!

## Collective Motion

* *Collective Motion* is a phenomenon that sometimes emerges when many individuals interact with each other in simple ways
* [schools of fish](https://www.sciencefocus.com/nature/how-do-schools-of-fish-swim-in-perfect-unison/)
* [murmurations of starlings](https://birdwatchireland.ie/have-you-seen-this-bird-and-tens-of-thousands-of-its-friends-murmuration-records-wanted/)
* protons and neutrons in an atomic nucleus
    * *The Liquid-Drop Model* by Niels Bohr
    * *The Collective Model* by Aage Bohr, Ben Mottelson, and James Rainwater [1975 Nobel Prize for Physics](https://physicstoday.scitation.org/doi/abs/10.1063/1.3069269?journalCode=pto)

## Collective Motion in a School of Fish

![How do schools of fish swim in perfect unison?](images/GettyImages-466221663-24bc935.jpeg)

## Collective Motion in a Murmuration of Starlings

![starling murmuration](images/wildlife-trusts-starling-murmuration-dusk-david-tipling-2020vision.jpeg)

## Algebraic Models

* quantum mechanics is normally formulated in terms of the Schrödinger differential equation for the wave function of a system
    * e.g. an atomic nucleus
* most differential equations do not have "simple" solutions, but can, in principle, be solved numerically by a sufficiently powerful computer
* often (always?) a simple solution owes its existence to an underlying symmetry group
    * e.g. rotations in space
* in this case, the differential equation can be expressed in terms of the group operators
    * the solutions can be found algebraically
    * i.e. "without" solving the differential equation
* the most well-known example is the harmonic oscillator
    * e.g. a weight on a spring
    * the underlying symmetry is symplectic
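
As a rough illustration of finding solutions "without" solving the differential equation, the sketch below represents the oscillator's lowering operator $a$ as a matrix in a truncated number basis and forms $H = a^\dagger a + \tfrac{1}{2}$ in units where $\hbar\omega = 1$. The basis size of 6 and the function names are assumptions made for this example; they are not taken from the ACM code.

```python
import numpy as np


def lowering_operator(dim):
    """Matrix of the lowering operator a in the basis |0>, ..., |dim-1>."""
    # <n-1| a |n> = sqrt(n), so the entries sit on the first superdiagonal.
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)


def oscillator_hamiltonian(dim):
    """H = a^dagger a + 1/2, built purely from operator algebra."""
    a = lowering_operator(dim)
    return a.T @ a + 0.5 * np.eye(dim)


# The energy levels come from a matrix eigenvalue problem,
# not from integrating a differential equation.
energies = np.linalg.eigvalsh(oscillator_hamiltonian(6))
print(energies)  # [0.5 1.5 2.5 3.5 4.5 5.5]
```

The evenly spaced levels $n + \tfrac{1}{2}$ follow from the commutation relations alone, which is the sense in which the solution is algebraic.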


## Quantum Mechanics of the Atomic Nucleus

* consider a nucleus that contains $N$ nucleons (protons and neutrons)
    * e.g. $N = 235$ for $^{235}U$ 
* the position of each nucleon is given by three spatial coordinates: $x, y, z$
* the configuration of the nucleus is given by the position of each nucleon
* the total dimension of the configuration space for the nucleus is therefore $3N$
* the quantum state of a system is given by a complex wave function defined on its configuration space
* therefore the wave function of the nucleus is given by a complex function of $3N$ real variables
* **no conceivable classical computer could ever directly solve the Schrödinger equation for large $N$**
    * a future quantum computer *might* be able to solve it
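
To get a feel for the scale, the sketch below counts the values needed to tabulate a wave function on a regular grid, assuming a very coarse 10 sample points per coordinate axis. The grid resolution and the function name are assumptions chosen for illustration.

```python
def grid_points(nucleons, points_per_axis):
    """Number of samples needed to tabulate a wave function on a regular grid."""
    dimension = 3 * nucleons  # three spatial coordinates per nucleon
    return points_per_axis ** dimension  # Python integers are exact


n_points = grid_points(nucleons=235, points_per_axis=10)
print(len(str(n_points)))  # number of decimal digits: 706
```

Even at this resolution the table would need $10^{705}$ complex numbers, far beyond any classical memory.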

## The Algebraic Collective Model (ACM) of the Atomic Nucleus

* the ACM dramatically reduces the dimension of the configuration space to just *five* 
* together these five coordinates form the *quadrupole moment tensor*
* the ACM has two underlying symmetry groups that enable an algebraic solution of the Schrödinger equation
* $SO(5)$ is associated with the *orientation* of the quadrupole moment
* $SU(1,1)$ is associated with the *magnitude* of the quadrupole moment

## Quadrupole Moments

* think of the swarm of $N$ nucleons as being approximated by a continuous mass distribution in space
* the quadrupole moment measures how far the distribution differs from perfect spherical symmetry
    * there are five independent components of the quadrupole moment
    * an $SO(3)$ rotation in position space induces an $SO(5)$ rotation in quadrupole moment space
* the quadrupole moment is mathematically related to the *covariance* of a data set in Statistics
* the quadrupole moment is mathematically related to the *volatility* or *beta* of an asset in Finance
    * by coincidence (?) the ACM uses the variable *beta* ($\beta$) for the magnitude of the quadrupole moment!
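
The link to covariance can be sketched numerically. The example below treats the nucleons as equal-mass points and builds a traceless quadrupole tensor from the covariance matrix of their positions. The per-particle normalization, the random test clouds, and the function name are assumptions for illustration; conventions differ between texts.

```python
import numpy as np


def quadrupole_tensor(positions):
    """Traceless quadrupole tensor of equal-mass points, about their centroid."""
    centered = positions - positions.mean(axis=0)
    # Second-moment matrix: the (biased) covariance of the point cloud.
    covariance = centered.T @ centered / len(centered)
    # Removing the trace leaves a symmetric traceless 3x3 matrix,
    # which has 6 - 1 = 5 independent components.
    return 3.0 * covariance - np.trace(covariance) * np.eye(3)


rng = np.random.default_rng(seed=0)
sphere = rng.normal(size=(100_000, 3))      # spherically symmetric cloud
cigar = sphere * np.array([1.0, 1.0, 2.0])  # same cloud stretched along z

q_sphere = quadrupole_tensor(sphere)
q_cigar = quadrupole_tensor(cigar)
print(np.round(q_sphere, 2))  # close to zero: no deviation from a sphere
print(np.round(q_cigar, 2))   # large positive zz entry: elongated along z
```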

## The 2015 Welsh-Rowe ACM Code

* [A computer code for calculations in the algebraic collective model of the atomic nucleus](https://doi.org/10.1016/j.cpc.2015.10.017)
    * 2015: available online
    * 2016: published in Computer Physics Communications
    * 5571 lines of extensively commented Maple source code
    * data files containing precomputed $SO(5) > SO(3)$ Clebsch-Gordan coefficients
    * large Maple worksheet containing examples of use and tests
* allows the user to define and solve very general Hamiltonians composed from the $SU(1,1)$ and $SO(5)$ operators
    * find the lowest energy levels and corresponding quantum states
    * compute electric quadrupole transition amplitudes and rates between the quantum states

## Demo of ACM Maple Code

![ACM Maple Demo](images/acm-maple-demo.png)

## `acm16` on GitHub

![acm16](images/github-agryman-acm16.png)

## Demo of `acmpy`

![acmpy Demo](images/acmpy-demo.png)

## `acmpy` on GitHub

![acmpy](images/github-agryman-acmpy.png)

## Programming Languages Previously Used for ACM Work

| Year | Language | Authors | Code |
|------|------|---------|----------|
| 2008 | C++ | T.A. Welsh | Calculation of SO(5) > SO(3) Clebsch-Gordan coefficients |
| 2009 | Mathematica | M.A. Caprio, D.J. Rowe, T.A. Welsh | Construction of SO(5) > SO(3) spherical harmonics and Clebsch–Gordan coefficients |
| 2015 | Maple | T.A. Welsh, D.J. Rowe | A computer code for calculations in the algebraic collective model of the atomic nucleus |

* I suggest that standardizing on a common programming language would 
    * reduce training, 
    * promote reuse, and 
    * increase productivity
* I argue that Python is currently the best choice

## Python

* [Python](https://python.org) is a well-designed, general purpose programming language
    * dynamically typed
    * object-oriented
    * functional programming features
* interpreted (not compiled)
    * but performance-critical parts are coded in C
    * existing libraries (e.g. LAPACK) written in any compiled language can be called from Python
* built-in high-level mathematical features
    * exact integer arithmetic
    * high precision real and complex numbers
    * sets, lists, tuples, and maps (dictionaries)
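
A few of these features can be tried directly in the interpreter. The sketch below uses only the language core and the standard library (`math`, `fractions`, `decimal`); the particular numbers are arbitrary choices for illustration.

```python
import math
from decimal import Decimal, getcontext
from fractions import Fraction

# Exact integer arithmetic: no overflow, however large the result.
big = math.factorial(30)

# Exact rational arithmetic from the standard library.
half = Fraction(1, 3) + Fraction(1, 6)

# Decimal arithmetic with a user-chosen precision (here 50 significant digits).
getcontext().prec = 50
root2 = Decimal(2).sqrt()

# Complex numbers are a built-in type.
z = complex(0, 1) ** 2

# A dictionary (map) from quantum number n to the exact level n + 1/2.
levels = {n: Fraction(2 * n + 1, 2) for n in range(3)}

print(big, half, root2, z, levels, sep="\n")
```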

## Python cont'd

* free and open source
    * governed by the [Python Software Foundation](https://python.org/psf-landing/)
* huge collection of third-party packages, e.g.
    * matplotlib, mpmath, NumPy, SciPy, SymPy, pandas, ...
    * [pytest](https://pytest.org/) for test case development and automation
    * [mypy](http://mypy-lang.org) for static type checking based on type hints
* [PyPI](https://pypi.org), the Python Package Index
* many excellent integrated development environments
    * I use [JetBrains PyCharm](https://www.jetbrains.com/pycharm/)

## Python

![python.org](images/python-org.png)

## Python Software Foundation

![PSF](images/python-psf.png)

## mypy

![mypy](images/mypy-org.png)

## pytest

![pytest](images/pytest-org.png)

## PyPI

![PyPI](images/pypi-org.png)

## PyCharm

![PyCharm](images/jetbrains-pycharm.png)

## The Scientific Python Ecosystem

* [mpmath](https://mpmath.org) - arbitrary precision real and complex floating-point math
* [matplotlib](https://matplotlib.org) - data visualization
* [NumPy](https://numpy.org) - vectorized n-dimensional array operations
* [pandas](https://pandas.pydata.org) - data analysis
* [Jupyter](https://jupyter.org) - interactive notebooks for Python and other languages
* [SciPy](https://scipy.org) - numerical analysis, optimization, machine learning, special functions
* [SymPy](https://sympy.org) - symbolic math (aka computer algebra)

## mpmath

![mpmath](images/mpmath-org.png)

## matplotlib

![matplotlib](images/matplotlib-org.png)

## NumPy

![numpy](images/numpy-org.png)

## pandas

![pandas](images/pandas-pydata-org.png)

## Jupyter

![jupyter](images/jupyter-org.png)

## SciPy

![scipy](images/scipy-org.png)

## SymPy

![sympy](images/sympy-org.png)

## Scientific Python Distributions

* [SageMath](https://www.sagemath.org)
    * "Mission: Creating a viable free open source alternative to Magma, Maple, Mathematica and Matlab."
    * includes many scientific computing applications
    * aimed at interactive users
* [Anaconda](https://www.anaconda.com)
    * "Anaconda offers the easiest way to perform Python/R data science and machine learning on a single machine."
    * package manager for Python and other languages
        * manages dependencies
        * generates compiled code for many platforms
    * aimed at data scientists and others who need to analyse data but not necessarily develop code

## SageMath

![SageMath](images/sagemath-org.png)

## Anaconda

![Anaconda](images/anaconda-com.png)

## More About Jupyter Notebooks

* evolved from IPython
    * the name refers to Julia, Python, and R
* supports many other languages via plug-in [kernels](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels)
* "[JupyterLab](https://jupyter.org/try-jupyter/lab/) is the latest web-based interactive development environment for notebooks, code, and data."
* "[JupyterHub](https://jupyter.org/hub) brings the power of notebooks to groups of users."
* several [websites](https://www.dataschool.io/cloud-services-for-jupyter-notebook/) host Jupyter notebooks
    * [Binder](https://mybinder.org)
    * [Kaggle](https://www.kaggle.com/code)
    * [Google Colab](https://colab.research.google.com)
    * [CoCalc](https://cocalc.com)
    * [JetBrains Datalore](https://datalore.jetbrains.com)

## JupyterLab

![JupyterLab](images/jupyterlab.png)

## JupyterHub

![JupyterHub](images/jupyterhub.png)

## Binder

![Binder](images/mybinder-org.png)

## Kaggle

![Kaggle](images/kaggle-com.png)

## Google Colab

![Colaboratory](images/google-colaboratory.png)

## CoCalc

![CoCalc](images/cocalc-com.png)

## JetBrains Datalore

![Datalore](images/datalore-jetbrains-com.png)

## My Experience Converting the ACM Code from Maple to Python

* Maple and Python have many language similarities
    * high-level
    * mathematical
    * interpreted
* the initial conversion was very direct
    * used SymPy for all symbolic and matrix functions
    * however, the code is actually a hybrid of symbolic and numeric computation
* initially Python was 80x slower than Maple :-(
    * used Python profiling tools to identify bottlenecks
    * selectively replaced slow SymPy functions with much faster NumPy and SciPy functions
* currently Python is 3x faster than Maple :-)
    * can improve performance even more using vectorized NumPy and SciPy functions
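
The general pattern of profiling a symbolic routine and then moving the numeric part to NumPy might look like the sketch below. It is not taken from `acmpy`: the toy matrix and the functions `toy_matrix`, `square_symbolic`, and `square_numeric` are hypothetical stand-ins, and it assumes SymPy and NumPy are installed.

```python
import cProfile
import pstats

import numpy as np
import sympy as sp


def toy_matrix(dim):
    """Hypothetical symmetric tridiagonal matrix with exact SymPy entries."""
    def entry(i, j):
        if i == j:
            return sp.Rational(2 * i + 1, 2)
        if abs(i - j) == 1:
            return sp.sqrt(max(i, j)) / 10
        return sp.Integer(0)
    return sp.Matrix(dim, dim, entry)


def square_symbolic(matrix):
    """Slow path: exact symbolic product, converted to floats at the end."""
    return np.array((matrix * matrix).tolist(), dtype=float)


def square_numeric(matrix):
    """Fast path: convert to floats once, then let NumPy do the product."""
    numeric = np.array(matrix.tolist(), dtype=float)
    return numeric @ numeric


matrix = toy_matrix(12)

# Profile the symbolic path to see where the time goes.
profiler = cProfile.Profile()
profiler.enable()
slow = square_symbolic(matrix)
profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)

# Both paths should agree to floating-point accuracy.
fast = square_numeric(matrix)
print(np.allclose(slow, fast))
```

Keeping the symbolic version around as a reference makes it straightforward to check that each numeric replacement still gives the same answers.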

## Some SymPy Issues

* found a couple of SymPy issues
* reported these to [SymPy project on GitHub](https://github.com/sympy/sympy)
    * rapid reply from development team
* [Issue 23497](https://github.com/sympy/sympy/issues/23497): different value for `binomial(-1, -1)`
* [Issue 23510](https://github.com/sympy/sympy/issues/23510): sometimes `solveset()` failed to find solution

## SymPy on GitHub

![SymPy Project](images/sympy-github.png)

## Advantages of Python versus Maple, Mathematica, Matlab, etc.

* Python is a well-designed, modern, general purpose programming language
    * several excellent integrated development environments
    * great tools for debugging, testing, execution profiling
    * supports object-oriented design
* Python has proven infrastructure, PyPI, for publishing code
* Python is free and open software
    * no licence fees
    * ability to fix bugs and add enhancements
    * future-proofs your work
* Python has a large developer base
    * undergraduates are more likely to have Python skills
    * graduates who use Python will have enhanced employment opportunities

## Stack Overflow 2021 Developer Survey

* 80,000 developers responded to the [survey](https://insights.stackoverflow.com/survey/2021)
* "Which programming, scripting, and markup languages have you done extensive development work in over the past year, and which do you want to work in over the next year? (If you both worked with the language and want to continue to do so, please check both boxes in that row.)"
    * Python ranked \#3 at 48.24% of developers
    * Matlab ranked \#23 at 4.66% of developers
    * Maple, Mathematica did not appear in the ranking

## Top-Ranked Programming Languages

![Top-Ranked Programming Languages](images/survey-top-ranked.png)

## Bottom-Ranked Programming Languages

![Bottom-Ranked Programming Languages](images/survey-bottom-ranked.png)

## Next Steps

* `acmpy` version 1.0
    * finish testing
    * reproduce all results from the Maple worksheet
    * release code on GitHub
    * ask interested researchers to provide feedback
* `acmpy` version 2.0
    * rewrite all numeric code using vectorized NumPy and SciPy functions
    * refactor code using object-oriented design to improve API and extensibility
    * bring code documentation up to SymPy standards (Sphinx)
    * identify portions of code that are suitable as contributions to SymPy
    * publish code on PyPI
    * write article for Computer Physics Communications

## Future Thoughts

* What if in the future, scientific papers contained more than just informal mathematical prose?
* Formal Specifications
    * What if scientific papers contained formal specifications?
    * formal software specification languages exist and have been used on a limited basis
    * the use of formal languages enables some forms of automated correctness checking
        * type checking
        * model checking
    * I like [Z Notation](https://en.wikipedia.org/wiki/Z_notation) and have used it professionally

## Future Thoughts cont'd

* Proof Checkers
    * What if the mathematical derivations in scientific papers could be checked for correctness?
    * this first requires that all definitions be formally specified
    * the [Lean Theorem Prover](https://leanprover.github.io) is gaining some adherents in the mathematics community

## Future Thoughts cont'd

* by including formal specifications and proofs in scientific papers, mathematics and physics could take advantage of many of the tools that have powered the software revolution
    * greater reuse of results
    * better indexing and search
    * improved correctness of software implementations

## Conclusion

* Python is now a viable alternative to traditional special purpose computer algebra systems
* Python is being used daily for large scale, numeric-intensive machine learning tasks
* Python language features make it more suitable for developing large, complex applications
* Python is free and open software which makes Python applications more future-proof
* Python has a large and active developer community which means Python skills are more readily available for new research projects
* Python skills are in high demand by industry, thereby enhancing the future employment prospects of graduates who acquire those skills
* By standardizing on Python, the physics community should be able to more easily build on past work thereby accelerating progress

## References

* T.A. Welsh and D.J. Rowe, "A computer code for calculations in the algebraic collective model of the atomic nucleus", in Computer Physics Communications 200 (2016) 220-253, https://doi.org/10.1016/j.cpc.2015.10.017
* J. Hunter, F. Pérez and B. Granger, "Python: An Ecosystem for Scientific Computing", in Computing in Science & Engineering, vol. 13, no. 02, pp. 13-21, 2011, https://doi.ieeecomputersociety.org/10.1109/MCSE.2010.119