Disentangling Scientific Software
This is a lecture on dissecting a piece of scientific software.
Messy but realistic toy model
I'll first show a very simple physical model (many particles moving randomly in a 2-dimensional box) from which physical quantities can be calculated (or diagnosed). As diagnostics, we'll use the center of mass and the moment of inertia of all the particles.
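To make the model concrete, here is a minimal sketch of such a single-script version. It is not the lecture's own implementation: the unit box, unit masses, Gaussian steps, clipping at the walls, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed, chosen only for repeatability

# Assumptions for illustration (not fixed by the lecture text):
# unit box, unit masses, Gaussian steps, particles clipped at the walls.
length_x, length_y = 1.0, 1.0
number_particles = 1000
step_length = 0.01
number_steps = 100

# Initial positions: uniformly distributed in the box.
x = rng.uniform(0.0, length_x, size=number_particles)
y = rng.uniform(0.0, length_y, size=number_particles)
masses = np.ones(number_particles)

# Random walk: every particle takes a small random step per iteration.
for _ in range(number_steps):
    x = np.clip(x + step_length * rng.normal(size=number_particles), 0.0, length_x)
    y = np.clip(y + step_length * rng.normal(size=number_particles), 0.0, length_y)

# Diagnostics: center of mass and moment of inertia about the center of mass.
total_mass = masses.sum()
com_x = (masses * x).sum() / total_mass
com_y = (masses * y).sum() / total_mass
moment_of_inertia = (masses * ((x - com_x) ** 2 + (y - com_y) ** 2)).sum()

print(f"center of mass: ({com_x:.3f}, {com_y:.3f})")
print(f"moment of inertia: {moment_of_inertia:.3f}")
```

Everything (geometry, particle state, randomness, diagnostics) lives in one flat namespace here, which is exactly the kind of entanglement the later steps take apart.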
The implementation of this example will be messy but fairly realistic.
Dissecting the structure
I'll then go on to separate different parts of the model (space where the particles live, a group of particles, a source of randomness) to more clearly understand the relationship between the different parts.
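One way this separation could look is sketched below, with one class per part. The class names (`Randomness`, `Space`, `Particles`), their methods, the unit masses, and the clipping at the walls are hypothetical choices for illustration, not the lecture's actual code.

```python
import numpy as np


class Randomness:
    """Source of randomness (hypothetical interface)."""

    def __init__(self, seed=None):
        self.rng = np.random.default_rng(seed)

    def uniform(self, low, high, size):
        return self.rng.uniform(low, high, size=size)

    def normal(self, size):
        return self.rng.normal(size=size)


class Space:
    """The box the particles live in. Assumption: positions are clipped at the walls."""

    def __init__(self, length_x=1.0, length_y=1.0):
        self.length_x = length_x
        self.length_y = length_y

    def confine(self, x, y):
        return np.clip(x, 0.0, self.length_x), np.clip(y, 0.0, self.length_y)


class Particles:
    """A group of unit-mass particles that knows its space and its randomness."""

    def __init__(self, space, randomness, number_particles=100, step_length=0.01):
        self.space = space
        self.randomness = randomness
        self.number_particles = number_particles
        self.step_length = step_length
        self.x = randomness.uniform(0.0, space.length_x, number_particles)
        self.y = randomness.uniform(0.0, space.length_y, number_particles)

    def move(self):
        # Ask the randomness for steps, then let the space confine the result.
        dx = self.step_length * self.randomness.normal(self.number_particles)
        dy = self.step_length * self.randomness.normal(self.number_particles)
        self.x, self.y = self.space.confine(self.x + dx, self.y + dy)

    def center_of_mass(self):
        return float(self.x.mean()), float(self.y.mean())

    def moment_of_inertia(self):
        com_x, com_y = self.center_of_mass()
        return float(((self.x - com_x) ** 2 + (self.y - com_y) ** 2).sum())
```

With this layout, `Particles(Space(), Randomness(seed=1))` builds a model whose dependencies are explicit: the particles only reach the box and the random numbers through the two objects they were given.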
The implementation of that part will be what scientific software developers do when they modularize their software to better understand its structure, to split the development into work packages, etc.
Distributing the different parts of the model
Finally, I'll use Dask to distribute the model. Each component will live on a different worker (which can be a thread, a process, a node in a cluster, a machine in a different data centre, ...). All parts will talk to each other using actors.
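As a rough sketch of the idea, the snippet below turns two components into Dask actors with `client.submit(..., actor=True)`. The classes, the function `run_distributed`, and the in-process two-worker cluster are assumptions for illustration. For simplicity the client mediates between the actors here; handing one actor to another so that they talk directly is another option that this sketch does not cover.

```python
import numpy as np


class Randomness:
    """Source of randomness (hypothetical component)."""

    def __init__(self, seed=None):
        self.rng = np.random.default_rng(seed)

    def normal(self, size):
        return self.rng.normal(size=size)


class Particles:
    """Group of unit-mass particles, all starting at the origin (assumption)."""

    def __init__(self, number_particles=100):
        self.x = np.zeros(number_particles)
        self.y = np.zeros(number_particles)

    def move(self, dx, dy):
        self.x = self.x + dx
        self.y = self.y + dy

    def center_of_mass(self):
        return float(self.x.mean()), float(self.y.mean())


def run_distributed(number_particles=100, number_steps=10):
    """Run the walk with each component living as an actor on a Dask worker."""
    # Imported here so that the component classes above stay free of Dask.
    from dask.distributed import Client

    # Assumption: a small in-process cluster; a real setup could use
    # separate processes or remote nodes instead.
    with Client(processes=False, n_workers=2, threads_per_worker=1) as client:
        # actor=True keeps the object (and its state) alive on one worker.
        randomness = client.submit(Randomness, 42, actor=True).result()
        particles = client.submit(Particles, number_particles, actor=True).result()

        for _ in range(number_steps):
            # Method calls on an actor return futures; .result() waits for them.
            dx = randomness.normal(number_particles).result()
            dy = randomness.normal(number_particles).result()
            particles.move(dx, dy).result()

        return particles.center_of_mass().result()
```

Calling `run_distributed()` returns the center of mass as a pair of floats. The component classes are ordinary Python classes, so they can still be used and tested without Dask.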
Room for own work
The distributed step will be used as a starting point to think about:
- how to add components to the model
- how to optimize the layout of the different components under different conditions
- how to ensure reproducibility of the computation
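For the reproducibility question, one possible starting point (an assumption, not something the lecture prescribes) is to derive an independent random stream for every component from a single root seed with NumPy's `SeedSequence`. The helper name `make_streams` is hypothetical.

```python
import numpy as np


def make_streams(root_seed, number_components):
    """Return one independent random generator per component.

    All generators are derived from a single root seed, so the whole
    set of streams can be recreated from that one number.
    """
    root = np.random.SeedSequence(root_seed)
    child_seeds = root.spawn(number_components)
    return [np.random.default_rng(child) for child in child_seeds]


# Two runs with the same root seed give identical streams per component.
first_run = make_streams(root_seed=12345, number_components=3)
second_run = make_streams(root_seed=12345, number_components=3)
same = np.array_equal(first_run[0].normal(size=5), second_run[0].normal(size=5))
print("component 0 reproducible:", same)
```

Seeding alone does not settle the question in a distributed setting, because the order in which components request random numbers can also change the result.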