Code design

Celeritas overall structure

We try to use HEP nomenclature where possible. We make up our own where necessary. We try to be explicit where a name might be ambiguous.

Assumptions for the short-medium term

All “transport” (propagation, physics, detectors, etc.) is done on the GPU.
GPU parallelism is over particle tracks. The more tracks we can run at once, the better our performance.
Multiple events can be transported simultaneously, but preference will be given to tracks with lower event IDs so that they can be finalized/flushed.

Code structure

Try to allow testing small components on CPU (see development guide).
Separate memory management from operations on the data.

Primary components

Simulation (manage events, relationships between tracks, simulation time, actions)
Random (RNG state for the tracks)
Geometry (model the real-world detector setup)
Propagation (move through the geometry in a field)
Materials (map physical volumes to atomic components)
Physics (model physical interactions and properties)
Detectors (tally and aggregate particle properties for output)

Memory limitations

Ordering in terms of anticipated memory burden; averages where applicable:

Sensitive detectors: Store energy deposition, positions, distance etc. during tracking: (total num tracks) * (num steps in detector)
Particle tracks: Basic state data plus cached physics: (num tracks) * (full geo state + ... + [physics state] * [num processes])
Track initializers: Pending "slimmed" versions of particle tracks, but possibly with full geo state to avoid reinitialization)
Secondaries: Output of an interaction, cutoffs are applied before creating track initializers from them: (num tracks) * (num secondaries per interaction)
Static data (cross sections/physics tables): (num particle types) * (num models) * (num materials)

Life cycle of a track

A "track" is comprised of several distinct structures, a member of the state vector for each of the above primary components. So a "track" is not a distinct code object. But the geometry state of a track is, and it can be accessed with the GeoTrackView.

We want to maximize the number of tracks running in parallel. I assume we determine some fixed (user-specified, run-time) number of tracks based on memory limitations (see section above). I also assume there's too much work to do and too many "long tails" to try to transport one event at a time on GPU.

Events on the CPU generate many primaries, equal to the number of empty "slots" in the device track vector.
These primaries are memcpy'd over to the GPU as "track initializers", and a kernel constructs track states for each corresponding empty slot.
When a particle undergoes an Interaction it may allocate and emit secondaries. Each Secondary object is a very thin reference to its parent ("mother" in Geant4 parlance) track plus the new direction/energy/identity of the particle. The resulting Interaction object is a change of state to the current track.
Secondaries are pruned based on user-provided cutoffs for particle types, and post-interaction tracks are also pruned.
Post-interaction tracks are pruned based on cutoffs.
Surviving secondaries create "track initializers" from their parent tracks.
Tracks that have cutoffs applied or exiting the world are "killed", leaving new slots to be filled in with the next iteration of track initializers.

The object that creates a track doesn't have to have the full set of data that a track contains: for example, it won't have any cached data or previous history, just a new track ID. There will be a performance/memory tradeoff for storing the "full" geometry state (regular state + one index per level) versus just the position/direction.

Stepping algorithm

...
Advance track clocks based on current speed
Apply interactions
...

Simulation

The simulation is responsible for choosing which action a particle undergoes. It maintains a count of the accumulated simulation time since the event and the accumulated path length of a track.

Initialize tracks from events
Initialize tracks from secondaries
Stepping loop: choose next interaction

Random

Contains parallel pseudorandom number generators and distributions that use those generators.

Classes

RngEngine

Maybe rename to RngEngineTrackView? Since this is the engine specific to a track. It still acts like a standard library "engine" locally.

Distributions

All have constructors that take the distribution parameters, and have an operator() that returns a result_type given a template<class Engine>.

Geometry

Classes

GeoTrackView (device)

Provide access to (and modify) the geometry state of a particle on the device

[describe other classes in use]

Propagation

Connections

Geometry: Field value depends on spatial position, and the Propagator moves the particle through space.
Simulation: Step limitation determines how far to propagate

Classes

FieldParams

Store a parameterized definition of the magnetic field. TODO: does this need to be just a function of space

FieldPropagator (device functor)

Integrate equation of motion
Constructor: takes FieldParamsPointers
operator(): takes GeoTrackView and maximum distance (from next physics interaction or step limiter), returns distance actually traveled
Additional device helper classes will probably be needed

Materials

Describes the atomic composition of a particular volume space. Each material is assumed to be completely homogeneous.

Classes

MaterialParams

Contains a list of individual materials, each of which has elements/nuclides and mass/number densities.

Physics

The physics models physical interactions and properties.

Connections

Random: Requires random number generation to sample physics
Geometry: Interactions change direction, so the physics takes an incident direction and calculates updated directions as well as the exiting directions of any secondary particles
Materials: A change in the material triggers a change in the cross sections.

Structural motivations and definitions

All physics processes and all models have common data (energy limits, applicable particles) and operations (cross section calculations, finding applicable processes and models, applying cutoffs). Take advantage of this as much as possible (more shared operations -> higher track parallelism).

Process: An observed physical phenomenon (e.g. pair production, Compton scattering)
Model: A mathematical description of a process

I think "decay" processes (AtRestDoIt) should be handled separately from processes that depend on path length.

Classes

ParticleParams

Parameters of fundamental particles: mass, charge, etc.
Unique ID for each particle type

ParticleState

Variables for a particular particle instance: energy, particle type

ParticleTrackView (device)

Combine Params and State for an interface to derivative quantities: momentum, speed, ...

SecondaryStackAllocator

Allocate space on a shared stack for secondaries.

Interactor: Encapsulates DoIt method of a process. Each of these will have to be hand-coded.

Sample (or hardcode) number of secondaries
Sample secondaries into SecondaryStack, return Interaction

ValueGridDef: quantities stored on an increasing grid, to be interpolated between grid points

Abscissa grid (energy)
Ordinate grid (cross section or other quantity)
Interpolation method enum (e.g. linear, log, semilog-x, spline (?), …)
Lookup acceleration option/data (e.g. known log-E grid)

ValueGridView: device lookup for an energy/value physics data (replaces PhysicsVector)

Take ValueGridPointers
Use interpolator classes to calculate values given energy

ModelDef: storage for basic physics properties

Applicable particle/energy range (?)
Map quantities (sigma, dE/dx) to ValueGridDef

ProcessDef: simple attributes common to all processes

Which particle(s?) it applies to (ParticleDefId particle_type)
What energy range it applies to (lower, upper)
What models it uses (??) (span<ModelDefId>: usually size one or two)
InteractorId that corresponds to Interactor class for this process

PhysicsParams

Manages all the different models, processes, physics tables for all the particles
Map particle + energy -> applicable processes
Has DeviceVector<ProcessDef>
Owns DeviceVector<ModelDef>
…
ProcessId is the index of the process in the list
ModelId for model

PhysicsState

MaterialId for current state (initialize to invalid; compare to geometry state to determine whether cross sections/tables need to be recalculated)
Cached cross sections (?)
Next collision distance (or CPL) and corresponding ProcessId
Flag (?) indicating the process Interactor successfully sampled secondaries

Execution sequence

Kernel: update material and calculate interaction length

Take MaterialTrackView (material ID) and compare against cached state material ID
Update/invalidate cached cross sections for processes
Sample distance-to-interaction for all processes, store distance and shortest ProcessId

Kernel(s): interact

Reduction to find all ProcessIds that are being used
Host-side virtual hierarchy, InteractorInterface abstract base class with a virtual function on host takes state vector and a “mask” for which ones apply to a process.
Interactor dispatcher (maybe in PhysicsParams) loops over all processes that have at least one track that selected this process interaction
Each virtual implementation launches a kernel asynchronously, masking so it calls Interactor only on applicable particles.
Alternatively we hard-code an enum of Interactor objects and use LLVM-like RTTI to dispatch among them

Kernel: cutoffs

Apply “production cuts” (kill secondaries produced by interaction as needed)

Detectors

Calculate/store/write out energy deposition (and other quantities as necessary) in particular volumes of space.

Connections

Simulation: Accumulated path lengths

Classes

DetectorParams

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code design

Celeritas overall structure

Assumptions for the short-medium term

Code structure

Primary components

Memory limitations

Life cycle of a track

Stepping algorithm

Simulation

Random

Classes

Geometry

Classes

Propagation

Connections

Classes

Materials

Classes

Physics

Connections

Structural motivations and definitions

Classes

Execution sequence

Detectors

Connections

Classes

Clone this wiki locally