MikeMineter edited this page Jun 27, 2022 · 22 revisions

Welcome to the OptClim2 wiki! This is a draft being written in June 2022 for the archer2 branch.

What is OptClim?

OptClim is a software framework that uses a cyclic workflow to optimise parameters in models: it generates configurations (parameter sets) of a model, runs them, and then compares the model results against observations.

[Figure: overview1.JPG]

Models that include parametrised processes are commonplace. Examples in atmospheric modelling include the representation of cloud processes: cloud formation, and the processes governing rain and reflection/radiation. Parameters have ranges of possible values, and with OptClim these parameters can be tuned so that models better represent past observations and can therefore be projected forward with more confidence, e.g. to understand the impacts of increasing CO2 concentrations on climate.

OptClim runs require:

  • the definition of the parameters (4 or 5 is typical), each with a range of allowed values and a default value
  • the location of each parameter in the namelists of the model, so that it can be edited. Currently this information has to be added to the OptClim code specific to the model (it should be table-driven in future). There is also the concept of a "metaparameter", whereby one user-defined parameter can cause an array, or multiple other dependent namelist parameters, to be set.
  • observations (single numbers, such as globally averaged outgoing shortwave radiation for a climate model). Typically there is one more observation than the number of parameters being optimised.
  • user-provided code to generate simulated observations from the model results
  • selection of a supported optimisation method. Currently DFOLS is generally used in preference to the supported alternative, Gauss-Newton. The optimiser sets the parameter values for the runs to be orchestrated by OptClim.
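As an illustration of how these ingredients fit together, the sketch below builds the vector of residuals (simulated observation minus target) that a least-squares optimiser would try to reduce. It is not OptClim code: the parameter names, ranges, target values and the toy "model" are all hypothetical, and a real study would obtain the simulated observations from a model run followed by the user-provided post-processing code.

```python
# Hypothetical parameter definitions: name -> (minimum, maximum, default).
PARAMETERS = {
    "entrainment_rate": (0.5, 5.0, 3.0),
    "ice_fall_speed": (0.5, 2.0, 1.0),
}

# Hypothetical targets: one more observation than there are parameters.
TARGETS = {"osr_global": 100.0, "olr_global": 240.0, "precip_global": 2.9}


def check_in_range(values):
    """Raise ValueError if any parameter value lies outside its allowed range."""
    for name, value in values.items():
        low, high, _default = PARAMETERS[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")


def toy_simulated_observations(values):
    """Stand-in for a model run plus post-processing (not a real model)."""
    e = values["entrainment_rate"]
    v = values["ice_fall_speed"]
    return {
        "osr_global": 95.0 + 2.0 * e - 1.0 * v,
        "olr_global": 245.0 - 1.5 * e + 0.5 * v,
        "precip_global": 2.5 + 0.1 * e + 0.1 * v,
    }


def residuals(values):
    """Return simulated-minus-target differences, in the order of TARGETS."""
    check_in_range(values)
    simulated = toy_simulated_observations(values)
    return [simulated[name] - TARGETS[name] for name in TARGETS]


# Evaluate the residuals and the sum-of-squares cost at the default values.
defaults = {name: spec[2] for name, spec in PARAMETERS.items()}
res = residuals(defaults)
cost = sum(r * r for r in res)
print(res, cost)
```

In each cycle the optimiser proposes new parameter values within the allowed ranges, and the residuals from the resulting runs feed the next cycle.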

These are all held in a JSON file used by the OptClim software. A study (the workflow and model instances set up by OptClim) is generated in the directory holding the JSON file. The JSON file includes:

| JSON element | Purpose |
| --- | --- |
| Name | Name of the study (a directory with this name is created in the directory of the JSON file) |
| baseRunID | Prefix for the directory of each run, e.g. if yd, then runs are in yd001, yd002... |
| runCode | The Archer2 budget code against which jobs are accounted |
| machineName | Must be "slurm" for Archer2 |
| modelName | One of MITgcm, CESM, UKESM |
| study.referenceModelDirectory | The model directory that is cloned (note: not used for UKESM) |
| optimise.dfols | Settings for the dfols optimiser |
| Parameters | Defines each parameter: range of values and initial value to be used by OptClim |
| postProcess | Specifies the code and added data used for generating simulated observations from model outputs |
| targets | The target values of the simulated observations |
| simulatedObservations | Names and associated data for the simulated observations, to connect the outputs of the postProcess script to the optimiser |
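The sketch below shows one way such a file might be laid out and read. It is an assumption-laden illustration rather than a working OptClim configuration: every value is a placeholder, the dotted element names are assumed to denote nested objects, and the contents of the optimiser, parameter, post-processing and observation elements are left empty because their schema is not given here. The helper `run_directory_name` is hypothetical; its three-digit numbering is inferred from the yd001, yd002 example.

```python
import json

# Placeholder configuration; all values are hypothetical.
CONFIG_TEXT = """
{
  "Name": "example_study",
  "baseRunID": "yd",
  "runCode": "example-budget",
  "machineName": "slurm",
  "modelName": "MITgcm",
  "study": {"referenceModelDirectory": "/path/to/reference/model"},
  "optimise": {"dfols": {}},
  "Parameters": {},
  "postProcess": {},
  "targets": {},
  "simulatedObservations": {}
}
"""

SUPPORTED_MODELS = {"MITgcm", "CESM", "UKESM"}


def load_config(text):
    """Parse the JSON text and check the two elements with fixed choices."""
    config = json.loads(text)
    if config["machineName"] != "slurm":
        raise ValueError('machineName must be "slurm" for Archer2')
    if config["modelName"] not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported modelName: {config['modelName']}")
    return config


def run_directory_name(base_run_id, run_number):
    """Build a run directory name such as yd001 from the prefix and a counter."""
    return f"{base_run_id}{run_number:03d}"


config = load_config(CONFIG_TEXT)
run_names = [run_directory_name(config["baseRunID"], n) for n in (1, 2, 3)]
print(run_names)
```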

The OptClim2 software provides the following:

  1. Run the optimiser script, which assimilates all results to date and determines the next set of models to be run, with their corresponding parameter values.
  2. Queue an array of sequential jobs, one task for each of the models to be run. These are in the “held” state.
  3. Queue a job that awaits completion of all the array tasks and then runs the optimiser script again.
  4. Clone each model and modify its parameters. For each cloning, a “base model” is replicated: one already tested and amended so that it interfaces with OptClim as described below.
  5. Start each model. The model scripts include a “release command” for the model's array task, which is run on completion of the model.
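The queueing pattern in these steps can be sketched with standard SLURM commands. The code below only builds the command lines and does not submit anything. It is an illustration of the pattern, not the commands OptClim actually issues: the script names and the job ID are hypothetical, while `sbatch --hold`, `--array`, `--dependency=afterok:` and `scontrol release` are standard SLURM options.

```python
import shlex


def array_submit_command(script, n_models):
    """sbatch command for a held job array with one task per model."""
    return ["sbatch", "--parsable", "--hold", f"--array=1-{n_models}", script]


def next_cycle_command(script, array_job_id):
    """sbatch command for a job that starts only after every array task succeeds."""
    return ["sbatch", "--parsable", f"--dependency=afterok:{array_job_id}", script]


def release_command(array_job_id, task_index):
    """scontrol command a model script could run on completion to release its task."""
    return ["scontrol", "release", f"{array_job_id}_{task_index}"]


# Hypothetical script names and job ID, for illustration only.
array_cmd = array_submit_command("array_task.sh", 5)
optimiser_cmd = next_cycle_command("run_optimiser.sh", "123456")
release_cmd = release_command("123456", 3)

for cmd in (array_cmd, optimiser_cmd, release_cmd):
    print(shlex.join(cmd))
```

Because the array tasks start in the held state, the dependent optimiser job cannot start until each model has finished and released its own task.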

A glossary of terms used is in a separate wiki page.

History of OptClim

The initial prototype of OptClim, termed OptClim1, was developed and used on the University of Edinburgh cluster Eddie in a collaboration between Prof. Simon Tett, Prof. Coralia Cartis and Dr Mike Mineter..... links to papers (Any other names to include, Simon?)

OptClim2 was coded by Prof. Tett. This added functionality and reimplemented some of the bash scripts of OptClim1 in Python.

The archer2 branch of the GitHub repository was developed with support from eCSE to port OptClim2 to Archer2 with minimal amendments to the code, and with extensions for the MITgcm, CESM2 and UKESM models.

For guidance on installation see https://github.com/optclim/ModelOptimisation/wiki/Installing-OptClim

Documentation specific to each supported model

A separate wiki page exists for each supported model. An example study for each exists on Archer2.

Support

It is expected that first-time users will require some support and guidance, both to help finalise their plans and with initial configuration. Support is provided on a best-efforts basis and can be requested via the optclim_developers list in the email domain mlist.is.ed.ac.uk.

Extensions made to the code for Archer2

| Module | Amendments |
| --- | --- |
| runAlgorithm.py | Added an import of each new model's class, so that it imports all valid models; no other change |
| UKESM.py | Class for UKESM; child of ModelOptimisation; saves parameters to the stub directory zd001 etc. |
| MITgcm.py | Class for models using MITgcm |
| CESM.py | Class for CESM2 |
| config.py | Specific functions for SLURM |