Merge pull request #56 from BerkeleyLab/update-readme
doc(README): mention experimental training feature
rouson committed May 24, 2023
2 parents a2b7027 + 42b5921 commit ed0afa5
Showing 1 changed file with 5 additions and 6 deletions.
README.md: 11 changes (5 additions & 6 deletions)
@@ -30,20 +30,19 @@ Table of contents
Overview
--------

-Inference-Engine is a software library for researching concurrent, large-batch inference and training of deep, feed-forward neural networks. Inference-Engine targets high-performance computing (HPC) applications with performance-critical inference and training needs. The initial target application is _in situ_ training of a cloud microphysics model proxy for the Intermediate Complexity Atmospheric Research ([ICAR]) model. Such a proxy must support concurrent inference at every grid point at every time step of an ICAR run. As of this writing, the code on the main branch supports concurrent inference. A draft pull request supports concurrent training. For validation purposes, Inference-Engine can also import neural networks exported from Python by the companion package [nexport].
+Inference-Engine is a software library for researching concurrent, large-batch inference and training of deep, feed-forward neural networks. Inference-Engine targets high-performance computing (HPC) applications with performance-critical inference and training needs. The initial target application is _in situ_ training of a cloud microphysics model proxy for the Intermediate Complexity Atmospheric Research ([ICAR]) model. Such a proxy must support concurrent inference at every grid point at every time step of an ICAR run. For validation purposes, Inference-Engine can also import neural networks exported from Python by the companion package [nexport]. The training capability is currently experimental: existing unit tests verify that Inference-Engine's network-training feature works for single-layer perceptrons, and future work includes unit tests verifying that training also works for deep neural networks.

Inference-Engine's implementation language, Fortran 2018, makes it suitable for integration into high-performance computing (HPC).
The novel features of Inference-Engine include

1. Exposing concurrency via
-- An `elemental` and implicitly `pure` inference function: `infer`.
-- An `elemental` and implicitly `pure` activation strategy pattern.
+- An `elemental`, polymorphic, and implicitly `pure` inference strategy,
+- An `elemental`, polymorphic, and implicitly `pure` activation strategy, and
+- A `pure` training subroutine.
2. Gathering network weights and biases into contiguous arrays
3. Runtime selection of inference strategy and activation strategy.

-Item 1 ensures that the `infer` procedure can be invoked inside Fortran's `do concurrent` construct, which some compilers can offload automatically to graphics processing units (GPUs). We envision offload being useful in applications that require large numbers of independent inferences. Item 2 exploits the special case where the number of neurons is uniform across the network layers. The use of contiguous arrays facilitates spatial locality in memory access patterns. Item 3 offers the possibility of adaptive inference method selection based on runtime information. The current methods include ones based on intrinsic functions, `dot_product` or `matmul`. Future work will include
-1. Exploring tradeoffs associated with language-based (`do concurrent`) versus directives-based (OpenMP and OpenACC) vectorization, multithreading, and accelerator offloading and
-2. Tradeoffs associated with different approaches to varying the number of neurons in each layer by daisy-chained `inference_engine_t` objects versus sparsely-connected `inference_engine_t` objects.
+Item 1 facilitates invoking Inference-Engine's `infer` function inside Fortran's `do concurrent` constructs, which some compilers can offload automatically to graphics processing units (GPUs). We envision this being useful in applications that require large numbers of independent inferences or networks to train. Item 2 exploits the special case where the number of neurons is uniform across the network layers. The use of contiguous arrays facilitates spatial locality in memory access patterns. Item 3 offers the possibility of adaptive inference method selection based on runtime information. The current methods include ones based on intrinsic functions, `dot_product` or `matmul`. Future work will explore the use of OpenMP and OpenACC for vectorization, multithreading, and/or accelerator offloading.
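
To make the `do concurrent` pattern above concrete, the following is a minimal, self-contained sketch rather than Inference-Engine's actual interface: `toy_infer` and its single-neuron arithmetic are hypothetical stand-ins for an `elemental`, implicitly `pure` inference function such as `infer`. The point is only that a loop over independent inferences can be expressed with `do concurrent`, which some compilers can map onto GPU kernels.

```fortran
! Illustrative sketch only: toy_infer is a hypothetical stand-in, not the
! Inference-Engine API. It shows how an elemental (hence implicitly pure)
! function may be referenced inside Fortran's do concurrent construct.
program concurrent_inference_sketch
  implicit none
  integer, parameter :: num_points = 1024   ! e.g., independent grid points
  real :: inputs(num_points), outputs(num_points)
  integer :: i

  call random_number(inputs)

  ! Each iteration is independent, so compilers that support automatic
  ! offload can map this loop onto GPU threads.
  do concurrent (i = 1:num_points)
    outputs(i) = toy_infer(inputs(i))
  end do

  print *, "first inference result: ", outputs(1)

contains

  elemental function toy_infer(x) result(y)
    ! Hypothetical single-neuron "network": sigmoid of a fixed affine map.
    real, intent(in) :: x
    real :: y
    y = 1./(1. + exp(-(0.5*x + 0.25)))
  end function

end program concurrent_inference_sketch
```

Because the function is `elemental`, the same inference could also be written as the whole-array expression `outputs = toy_infer(inputs)`; the explicit `do concurrent` form is the one the paragraph above associates with automatic offload.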

Downloading, Building and Testing
---------------------------------
