<img src="https://creative.lbl.gov/wp-content/uploads/sites/3/2020/07/6_BL_Horiz_Rev_rgb.png"
     width="500px"
     alt="LBL logo"
     style="vertical-align:middle"/>


# Cloud microphysics training and aerosol inference with the Fiats deep learning library

**Authors:** [Damian Rouson](mailto:rouson@lbl.gov), [Zhe Bai](mailto:zhebai@lbl.gov), [Dan Bonachea](mailto:dobonachea@lbl.gov), [Baboucarr Dibba](mailto:bdibba@lbl.gov), [Ethan Gutmann](mailto:gutmann@ucar.edu), [Katherine Rasmussen](mailto:krasmussen@lbl.gov), [David Torres](mailto:davytorres@nnmc.edu), [Yunhao Zhang](mailto:yunhao2783@gmail.com), [Jordan Welsman](mailto:welsman@lbl.gov)

**Keywords:** deep learning, Fortran, cloud microphysics, aerosols, surrogate model, neural network

----------

## Abstract
This notebook presents two atmospheric sciences demonstration applications in the [Fiats](https://go.lbl.gov/fiats) software repository.  The first, `train-cloud-microphysics`, trains a neural-network cloud microphysics surrogate model that has been integrated into the [Berkeley Lab fork](https://go.lbl.gov/icar) of the Intermediate Complexity Atmospheric Research (ICAR) model. The second, `infer-aerosol`, performs parallel inference with an aerosol dynamics surrogate pretrained in PyTorch using data from the Energy Exascale Earth System Model ([E3SM](https://e3sm.org/)).  In addition to describing the structure and behavior of the demonstration applications, this notebook aims to provide enough information for the interested reader to gain experience with using Fiats for microphysics training and aerosol inference.  Toward this end, the notebook links to a pretrained aerosol model stored in the Fiats JavaScript Object Notation (JSON) file format, which the reader can download, import, and use to perform batch inference calculations with Fiats.  Because capturing even a single simulated year of physics-based ICAR cloud microphysics model inputs and outputs requires thousands of core-hours to produce and hundreds of gigabytes to store, this notebook links to a proxy application that gives the reader some experience with training one microphysics component: a saturated mixing ratio function.  The proxy application is the `learn-saturated-mixing-ratio` program in the `example` subdirectory of Fiats.  An accompanying `gnuplot` script plots a measure of the accuracy of the resulting surrogate model across a domain bounded by the physics-based model input extrema.

## Introduction
### Background
Fortran programs serve an important role in earth systems modeling, from weather and climate prediction {cite}`skamarock2008description,mozdzynski2015partitioned` to wildland fire modeling {cite}`vanella2021multi` and terrestrial ecosystem simulation {cite}`shi2024functionally`.  The cost of performing ensembles of runs of such complex, multiphysics applications at scale inspires investigations into replacing model components with neural-network surrogates in earth systems ranging from groundwater {cite}`asher2015review` to oceans {cite}`partee2022using`.  For the surrogates to be useful, they must provide accuracy and execution speed comparable to or better than those of the physics-based components that the surrogates replace.  The prospects for achieving faster execution depend on several properties of the surrogates, including their size, complexity, architecture, and implementation details.  Implementing the surrogates in the language of the supported application reduces interface complexity and increases runtime efficiency by reducing the need for wrapper procedures and data structure transformations.
Inspired partly by these concerns, native Fortran software packages are emerging to support the training and deployment of neural-network surrogate models.  Two examples are neural-fortran {cite}`curcic2019parallel` and ATHENA {cite}`taylor2024athena`.  This Jupyter notebook presents two demonstration applications in the Fiats deep learning library {cite}`rouson2025automatically`:
one trains a cloud microphysics surrogate for the Intermediate Complexity Atmospheric Research (ICAR) model {cite}`gutmann2016intermediate` and the other uses a pretrained surrogate for aerosols in the Energy Exascale Earth System Model (E<sup>3</sup>SM) {cite}`golaz2019doe`.

Fiats, an acronym that expands to "Functional inference and training for surrogates" or "Fortran inference and training for science," targets high-performance computing (HPC) applications in Fortran 2023. Fiats provides novel support for functional programming styles by providing inference and training procedures declared to be `pure`, a language requirement for invoking a procedure inside Fortran's loop-parallel construct: `do concurrent`. Because pure procedures clarify data dependencies, at least four compilers are currently capable of automatically parallelizing `do concurrent` on central processing units (CPUs) or graphics processing units (GPUs): `ifx` from Intel, `flang` from LLVM, `nvfortran` from NVIDIA, and `crayftn` from HPE.  At the time of this writing, Fiats supports `flang` and partially supports `ifx`, with only one known test failure when building with `ifx`; work is under way to support `ifx` fully.  With `flang`, Fiats offers automatic parallelization of batch inference calculations with strong scaling trends comparable to those achievable with OpenMP directives {cite}`rouson2025automatically`.

Fiats provides a derived type that encapsulates neural-network parameters and provides generic bindings for invoking inference functions and training subroutines of various precisions. A novel feature of the Fiats design is that all procedures involved in inference and training have the `non_overridable` attribute, which eliminates the need for dynamic dispatch at call sites. In addition to simplifying the structure of the resulting executable program and potentially improving performance, we expect this feature to enable the automatic offload of inference and training to GPUs.

### Objectives
The primary objectives of this notebook are to describe the use of Fiats in the `infer-aerosol` and `train-cloud-microphysics` demonstration programs and to explain how to run those programs and some similar but simpler example programs.  The [Methods](#methods) section of the notebook describes the use of Fiats in the applications and examples.  The [Discussion](#discussion) section explains how to run each of the programs.  We expect that the reader familiar with recent Fortran standards will take away an understanding of the program statements required to use Fiats.  We further expect that the interested reader who installs the prerequisite build system and compiler will take away an understanding of how to run the `infer-aerosol` demonstration application locally after downloading a pretrained neural network that this notebook provides.  In the case of `train-cloud-microphysics`, however, effective use by the reader is impractical because of the costs associated with generating, storing, and accessing training data.  An ICAR run covering one simulated year requires thousands of core-hours and produces hundreds of gigabytes of data.  To provide the reader experience with training a neural network using Fiats, this notebook describes a similar but much less demanding example that trains a neural-network surrogate to represent one function from an ICAR cloud microphysics model: a function that computes the saturated mixing ratio given a pressure and temperature.

## Methods
### Getting started
With the `tree` utility installed, the following `bash` shell commands will download the Fiats repository, check out the `git` branch used in writing this notebook, and show the Fiats directory tree:
```bash
git clone --branch sea-iss-2025 git@github.com:berkeleylab/fiats 
cd fiats
tree -d
.
├── demo
│   ├── app
│   ├── include -> ../include
│   ├── src
│   └── test
├── doc
│   └── uml
├── example
│   └── supporting-modules
├── include
├── scripts
├── src
│   └── fiats
└── test
```
The `src` directory contains the source comprising the Fiats library that programs link against to access Fiats functionality by invoking the library's procedures.  As such, `src` contains the only files that a Fiats user needs, and it contains no main programs.  The `fiats_m` `module` in `src/fiats_m.f90` contains all user-facing Fiats functions, procedures, derived types, and constants.  The `src/fiats` subdirectory contains the definitions of all of these entities in addition to any private entities not intended for users to access.  For a program to access Fiats entities, it suffices for a `use fiats_m` statement to appear in any program unit or subprogram that requires Fiats.

Apart from the library, the Fiats `git` repository contains main programs that demonstrate how to use Fiats.  The main programs are in two subdirectories:
1. `example/` contains relatively short and mostly self-contained programs and
2. `demo/app/` contains demonstration applications developed with collaborators for production use.

The following subsections describe programs in these locations.

### Demonstration application: aerosol inference
The `demo/app/infer-aerosol.f90` program demonstrates the use of Fiats to predict aerosol parameters for E<sup>3</sup>SM.   The following statement provides access to all Fiats entities employed by the program:
```fortran
   use fiats_m, only : unmapped_network_t, tensor_t, double_precision, double_precision_file_t
```
where the `unmapped_network_t` derived type encapsulates a neural network that performs no mappings on input and output tensors, `tensor_t` encapsulates network input and output tensors, `double_precision` is a kind type parameter that specifies the desired precision, and the `double_precision_file_t` derived type provides a file abstraction that determines how numerical values in model files will be interpreted.  Fiats focuses on surrogate models that must be compact in order to be competitive with the physics-based models they replace.  For such models, Fiats uses a JSON file format for its human readability: we have found the ability to inspect network parameters visually helpful in the early stages of experimenting with new algorithms.  Users with models trained in PyTorch can use the Fiats companion network export software [Nexport](https://github.com/berkeleylab/nexport) to export models to the Fiats JSON format.

After chores such as printing usage information if a user omits a required command-line argument, the following object declaration demonstrates the first direct use of Fiats in the program: 
```fortran
   type(unmapped_network_t(double_precision)) neural_network
```
Fiats uses derived type parameters -- specifically kind type parameters -- so that one neural-network type can be used to declare objects with any supported `kind` parameter.  Currently, the supported `kind` parameters are `default_real` and `double_precision`, corresponding to the chosen compiler's `real` (with no specified `kind` parameter) and `double precision` types.  Fiats types with a kind type parameter provide a default initialization of the parameter to `default_real`.  
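
As a rough illustration of the language feature described above, the following self-contained sketch declares a parameterized derived type whose kind type parameter defaults to the default `real` kind.  The names `toy_tensor_m`, `toy_tensor_t`, `values_`, and `dp` are hypothetical stand-ins chosen for this sketch and are not taken from the Fiats source:

```fortran
module toy_tensor_m
  !! Hypothetical illustration of a kind type parameter; not the Fiats implementation
  implicit none

  type toy_tensor_t(k)
    integer, kind :: k = kind(1.)          ! kind type parameter defaulting to default real
    real(k), allocatable :: values_(:)     ! component precision follows the parameter
  end type

end module toy_tensor_m

program kind_parameter_sketch
  use toy_tensor_m, only : toy_tensor_t
  implicit none
  integer, parameter :: dp = kind(1.D0)    ! assumed name for the double-precision kind
  type(toy_tensor_t)     :: default_tensor ! parameter omitted: default real
  type(toy_tensor_t(dp)) :: double_tensor  ! parameter given: double precision

  default_tensor%values_ = [1., 2.]
  double_tensor%values_  = [1._dp, 2._dp]

  ! The two objects share one type definition but store different kinds of real
  print *, "kind of default_tensor values: ", kind(default_tensor%values_)
  print *, "kind of double_tensor values:  ", kind(double_tensor%values_)
end program kind_parameter_sketch
```

One type definition thus serves every supported precision, which is the design choice the preceding paragraph attributes to Fiats.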

A later line defines the object:
```fortran
   neural_network = unmapped_network_t(double_precision_file_t(path // network_file_name))
```
where `unmapped_network_t` appears in this context as a generic interface patterned after Fortran's structure constructors that define new objects. Because the JSON specification does not differentiate types of numbers (e.g., JSON does not distinguish integers from real numbers), using the Fiats `double_precision_file_t` type specifies how to interpret values read from the model file.
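
The pattern of a generic interface that shares a derived type's name can be sketched as follows.  This is a minimal example under stated assumptions: `toy_file_m`, `toy_file_t`, `from_path`, and `name_` are hypothetical names invented for illustration, and the sketch makes no claim about how Fiats implements its own constructors:

```fortran
module toy_file_m
  !! Hypothetical illustration of a user-defined constructor; not the Fiats implementation
  implicit none
  private
  public :: toy_file_t

  type toy_file_t
    character(len=:), allocatable :: name_
  end type

  interface toy_file_t          ! generic interface with the same name as the type
    module procedure from_path
  end interface

contains

  pure function from_path(directory, base_name) result(file)
    !! Build a file object from two strings
    character(len=*), intent(in) :: directory, base_name
    type(toy_file_t) file
    file%name_ = directory // "/" // base_name
  end function

end module toy_file_m

program constructor_sketch
  use toy_file_m, only : toy_file_t
  implicit none
  type(toy_file_t) file

  ! The two-argument reference resolves to from_path rather than the
  ! intrinsic structure constructor, which would take one argument here
  file = toy_file_t("demo", "model.json")
  print *, file%name_
end program constructor_sketch
```

At the call site, the user-defined function reads exactly like a structure constructor, so client code need not know which of the two it invokes.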

Similarly, the later line 
```fortran
   type(tensor_t(double_precision)), allocatable :: inputs(:), outputs(:)
```
specifies the precision used for tensor objects, and the `tensor_t` generic interface in the following statement
```fortran
   inputs(i) = tensor_t(input_components(i,:))
```
resolves to an invocation of a function that produces a double-precision object because of a declaration earlier in the code (not shown here) that declares `input_components` to be of type `double precision`.  Ultimately, inference happens by invoking a type-bound `infer` procedure on the `neural_network` object and providing a `tensor_t` input object to produce the corresponding output:
```fortran
   !$omp parallel do shared(inputs,outputs,icc)
   do i = 1,icc
     outputs(i) = neural_network%infer(inputs(i))
   end do
   !$omp end parallel do
```
where we parallelize the loop using OpenMP directives.  Alternatively, the Fiats `example/concurrent-inferences.F90` program invokes `infer` inside a `do concurrent` construct, taking advantage of `infer` being `pure`.  This approach has the advantage that compilers can automatically parallelize the iterations without OpenMP directives.  Besides simplifying the code, switching to `do concurrent` means the exact same source code can run in parallel on a CPU or a GPU without change.  With most compilers, switching from running on one device to another requires simply recompiling with different flags.  See {cite:t}`rouson2025automatically` for more details on automatically parallelizing inference, including strong scaling results on one node of the Perlmutter supercomputer at the National Energy Research Scientific Computing Center (NERSC).
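
To make the `do concurrent` alternative concrete without depending on Fiats, the sketch below invokes a `pure`, `non_overridable` type-bound function inside `do concurrent`.  Everything here is a hypothetical stand-in: `toy_network_t` holds a single weight and bias rather than a real network, and its `infer` binding only mimics the calling pattern described above:

```fortran
module toy_network_m
  !! Hypothetical stand-in for a network type; not the Fiats implementation
  implicit none
  private
  public :: toy_network_t

  type toy_network_t
    real :: weight_ = 2., bias_ = 1.
  contains
    procedure, non_overridable :: infer  ! cannot be overridden in extended types
  end type

contains

  pure function infer(self, input) result(output)
    !! A single ReLU "neuron" standing in for a full forward pass
    class(toy_network_t), intent(in) :: self
    real, intent(in) :: input
    real output
    output = max(0., self%weight_*input + self%bias_)
  end function

end module toy_network_m

program concurrent_inference_sketch
  use toy_network_m, only : toy_network_t
  implicit none
  type(toy_network_t) network
  real, allocatable :: inputs(:), outputs(:)
  integer i

  inputs = [(real(i), i = -2, 2)]
  allocate(outputs(size(inputs)))

  ! Only pure procedures may be referenced inside do concurrent;
  ! a compiler is free to run these iterations in any order or in parallel
  do concurrent (i = 1:size(inputs))
    outputs(i) = network%infer(inputs(i))
  end do

  ! Self-check against the same formula evaluated as an array expression
  if (any(outputs /= max(0., 2.*inputs + 1.))) error stop "unexpected inference result"
  print *, outputs
end program concurrent_inference_sketch
```

Whether the loop actually executes in parallel depends on the compiler and the flags used, so the sketch demonstrates only the source-level pattern.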

The remainder of `infer-aerosol` contains problem-specific statements not directly related to the use of Fiats and is therefore beyond the scope of this notebook.

### Demonstration application: microphysics training
Training a neural network is an inherently more involved process than using a neural network for inference.  As such, `train-cloud-microphysics` uses a larger number of Fiats entities:
```fortran
  use fiats_m, only : tensor_t, trainable_network_t, input_output_pair_t, mini_batch_t, &
    tensor_map_t, training_configuration_t, training_data_files_t, shuffle
```
where only the `tensor_t` type intersects with the set of entities that `infer-aerosol` uses.  The remaining entities in the above `use` statement all relate to training neural networks.

The `trainable_network_t` type extends the `neural_network_t` type and thus offers the same type-bound procedures by inheritance. 
Outwardly, `trainable_network_t` differs from `neural_network_t` only in that the former provides public `train` and `map_to_training_ranges` generic bindings that the latter lacks.  Calling `train` performs a forward pass followed by a back-propagation pass that adjusts the neural-network weights and biases.  If the network input and output ranges for training differ from the corresponding tensor values for the application (e.g., we often find it useful to map tensor values to the unit interval [0,1] for training), then calling `map_to_training_ranges` performs the desired transformation and the resulting `tensor_map_t` type encapsulates the forward and inverse mappings.  Privately, the `trainable_network_t` type stores a `workspace_t` object containing a training scratchpad that gets dynamically sized in a way that is invisible to Fiats users.  Hiding this implementation detail without necessitating the definition of `neural_network_t` components needed only for training is the primary reason `trainable_network_t` exists.
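
The idea of mapping tensor values to the unit interval and back can be sketched with a linear (min-max) transformation.  This sketch assumes a simple component-wise linear map; the type `toy_map_t`, its components, its binding names, and the sample values are hypothetical and do not reproduce the Fiats `tensor_map_t` interface:

```fortran
module toy_map_m
  !! Hypothetical min-max mapping; not the Fiats tensor_map_t implementation
  implicit none
  private
  public :: toy_map_t

  type toy_map_t
    real, allocatable :: minima_(:), maxima_(:)
  contains
    procedure, non_overridable :: map_to_training_range
    procedure, non_overridable :: map_from_training_range
  end type

contains

  pure function map_to_training_range(self, x) result(y)
    !! Forward map: application values -> [0,1]
    class(toy_map_t), intent(in) :: self
    real, intent(in) :: x(:)
    real, allocatable :: y(:)
    y = (x - self%minima_)/(self%maxima_ - self%minima_)
  end function

  pure function map_from_training_range(self, y) result(x)
    !! Inverse map: [0,1] -> application values
    class(toy_map_t), intent(in) :: self
    real, intent(in) :: y(:)
    real, allocatable :: x(:)
    x = self%minima_ + y*(self%maxima_ - self%minima_)
  end function

end module toy_map_m

program mapping_sketch
  use toy_map_m, only : toy_map_t
  implicit none
  type(toy_map_t) map
  real, parameter :: raw(*) = [250., 5.E4]   ! made-up sample values
  real, allocatable :: normalized(:), recovered(:)

  map = toy_map_t(minima_=[200., 1.E4], maxima_=[300., 1.E5])  ! made-up extrema

  normalized = map%map_to_training_range(raw)
  recovered  = map%map_from_training_range(normalized)

  ! The round trip should reproduce the raw values to within roundoff
  if (any(abs(recovered - raw) > 1.E-3*abs(raw))) error stop "round trip failed"
  print *, "normalized: ", normalized
end program mapping_sketch
```

Storing the extrema in one object keeps the forward and inverse maps consistent, which is the role the text above assigns to `tensor_map_t`.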

The `input_output_pair_t` derived type encapsulates training-data pairings ensuring a one-to-one connection between `tensor_t` inputs and outputs as required for supervised learning {cite}`goodfellow2016deep`.  The `mini_batch_t` type supports the formation of `input_output_pair_t` subgroups. The ability to form mini-batches and the listed `shuffle` procedure combine to facilitate the implementation of the foundational stochastic gradient descent optimization algorithm for training.

Finally, the `training_configuration_t` and `training_data_files_t` types encapsulate file formats that Fiats users employ to define training hyperparameters (e.g., learning rate) and to specify the series of file names that contain training data.  With all of the aforementioned derived types in place, `train-cloud-microphysics` uses a capability of the external Julienne framework {cite}`julienne` to group the training data into bins:

```fortran
      bins = [(bin_t(num_items=num_pairs, num_bins=n_bins, bin_number=b), b = 1, n_bins)]
```
and then the training data are shuffled and regrouped by bin into new mini-batch subsets in each pass (epoch) through the data set:

```fortran
   do epoch = first_epoch, last_epoch

     if (size(bins)>1) call shuffle(input_output_pairs) ! set up for stochastic gradient descent
     mini_batches = [(mini_batch_t(input_output_pairs(bins(b)%first():bins(b)%last())), b = 1, size(bins))]

     call trainable_network%train(mini_batches, cost, adam, learning_rate)
```
which completes the presentation of essential Fiats capabilities employed by `train-cloud-microphysics`.
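
The binning and shuffling steps above can be mimicked on plain integer indices.  The following sketch is an assumption-laden stand-in: it partitions item indices into contiguous bins using one plausible rule (spreading any remainder over the leading bins) and reorders them with a Fisher-Yates shuffle.  The names `toy_shuffle`, `order`, `first`, and `last` are hypothetical, and neither the partitioning rule nor the shuffle is claimed to match what Julienne's `bin_t` or the Fiats `shuffle` does internally:

```fortran
program mini_batch_sketch
  !! Hypothetical index-only illustration of shuffled mini-batches
  implicit none
  integer, parameter :: num_pairs = 10, n_bins = 3, num_epochs = 2
  integer :: order(num_pairs), first(n_bins), last(n_bins)
  integer b, epoch, i

  order = [(i, i = 1, num_pairs)]

  ! Assumed partitioning rule: contiguous bins with the remainder
  ! distributed one item at a time over the leading bins
  do b = 1, n_bins
    first(b) = (b-1)*(num_pairs/n_bins) + min(b-1, mod(num_pairs, n_bins)) + 1
    last(b)  =     b*(num_pairs/n_bins) + min(b,   mod(num_pairs, n_bins))
  end do

  do epoch = 1, num_epochs
    if (n_bins > 1) call toy_shuffle(order)  ! new random ordering each epoch
    do b = 1, n_bins
      print *, "epoch", epoch, "mini-batch", b, ":", order(first(b):last(b))
    end do
  end do

  ! Shuffling permutes the indices, so their sum is unchanged
  if (sum(order) /= num_pairs*(num_pairs+1)/2) error stop "shuffle lost an item"

contains

  subroutine toy_shuffle(items)
    !! Fisher-Yates shuffle using the intrinsic random number generator
    integer, intent(inout) :: items(:)
    integer i, j, temp
    real harvest
    do i = size(items), 2, -1
      call random_number(harvest)     ! 0 <= harvest < 1
      j = 1 + int(harvest*i)          ! 1 <= j <= i
      temp = items(i); items(i) = items(j); items(j) = temp
    end do
  end subroutine

end program mini_batch_sketch
```

Because the bin boundaries stay fixed while the ordering changes, each epoch presents the optimizer with differently composed mini-batches, which is the essence of stochastic gradient descent.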

## Discussion
This section aims to provide the interested reader with experience in running programs that use Fiats for predicting atmospheric aerosol dynamics and for training a cloud microphysics model. 

### Prerequisites
Building Fiats requires the Fortran Package Manager (`fpm`) and the LLVM `flang` Fortran compiler...

```bash
git clone --branch sea-iss-2025 git@github.com:berkeleylab/fiats # skip this step if you have already cloned Fiats
cd fiats
fpm test
```

### Running infer-aerosol
Before running the `infer-aerosol` program, download the pretrained aerosol model, [model.json](./assets/model.json), and save it to the `demo/` subdirectory inside your local clone of Fiats.  Then set your present working directory to the same location and run the demonstration application setup script in a terminal window running the `bash` or `zsh` shell: 
```bash
cd demo
./setup.sh
```
If the setup script detects the macOS operating system, the script will attempt to use Homebrew to install three prerequisite software packages that the demonstration applications require: [HDF5](https://github.com/HDFGroup/hdf5/), [NetCDF](https://github.com/Unidata/netcdf-c/), and [NetCDF-Fortran](https://github.com/Unidata/netcdf-fortran).  On Linux, you will need to install the packages yourself -- possibly using your Linux distribution's package manager -- and then (re)run the script after setting the following environment variables to specify the installation paths for the aforementioned prerequisites:
```
HDF5_LIB_PATH
NETCDF_LIB_PATH
NETCDFF_LIB_PATH
```
If the setup script completes successfully, it creates a script in `demo/build/run-fpm.sh` that you will use to run `infer-aerosol`. All example programs or demonstration applications in Fiats print usage information if you run them without passing any arguments so you might start by running `infer-aerosol` to see the required arguments:
```bash
./build/run-fpm.sh run infer-aerosol
```

[Fig. 2](aerosol-viz) shows a visualization of predictions made by the aerosol model that we used with the `infer-aerosol` demonstration application.  The visualization is produced by software unrelated to Fiats and is provided for purposes of understanding the data the model produces.

```{figure} ../assets/ncvis_output_0104_a1.png
:name: aerosol-viz
:align: center

A global view of the number concentration of accumulation-mode aerosol particles as predicted by the model used in this section of the notebook.
```

### Running learn-saturated-mixing-ratio
As explained in the [Objectives](#objectives) section of this notebook,...
The `learn-saturated-mixing-ratio` program in the `example/` subdirectory uses an overlapping subset of the Fiats entities that `train-cloud-microphysics` uses:
```fortran
  use fiats_m, only : neural_network_t, trainable_network_t, mini_batch_t, tensor_t, &
    input_output_pair_t, shuffle
```
where the only entity included in the above `use` statement but not in `train-cloud-microphysics` is `neural_network_t`, which is imported primarily for the convenience of the constructor functions that `neural_network_t` provides but `trainable_network_t` does not.  Because of the overlap between `learn-saturated-mixing-ratio` and `train-cloud-microphysics`, the former requires no further presentation.


## Conclusions

## References

```{bibliography}
```