Add Diagnostics documents converted from markdown

NCAR · Mar 19, 2021 · 3ad06cb · 3ad06cb
1 parent 691e822
commit 3ad06cb
Show file tree

Hide file tree

Showing 10 changed files with 598 additions and 11 deletions.
diff --git a/README.rst b/README.rst
@@ -202,12 +202,24 @@ References
    guide/mpi_intro
    guide/filters
    guide/inflation
+   guide/controlling-files-output
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Observations
+
+   guide/obs-seq-file
 
 .. toctree::
    :maxdepth: 2
    :caption: Diagnostics
 
+   guide/checking-your-assimilation
+   guide/computing-filter-increments
    guide/how-does-output-differ-from-input-increments
+   guide/dart-missing-data-value
+   guide/dart-quality-control
+   guide/examining-obs-seq-final
    guide/matlab-observation-space
 
 .. toctree::

diff --git a/guide/checking-your-assimilation.rst b/guide/checking-your-assimilation.rst
@@ -0,0 +1,42 @@
+Checking your initial assimilation
+==================================
+
+You may require several attempts to get your assimilation configured correctly.
+The next section, :doc:`computing-filter-increments`, describes how to take the
+difference between two assimilation stages to determine whether your initial
+assimilation worked as intented.
+
+If your assimilation does not change anything in the model state, you may need
+to rerun ``filter`` multiple times to understand what is wrong.
+
+Thus you should make ``filter`` very fast to run. You can do this by:
+
+1. Making an observation sequence file containing a single observation.
+2. Configuring your run so that filter does a single assimilation and exits
+   without having to advance the ensemble of models or do other work.
+
+Making an observation sequence file containing a single observation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You can use one of these methods to make an ``obs_seq`` with just a single
+observation:
+
+1. Run ``create_obs_sequence`` to make a new, short, observation sequence file.
+2. Use the ``obs_sequence_tool`` to cut an existing ``obs_seq.out`` file down
+   to just a few obs by selecting only a subset of the types and setting a very
+   short time window, such as a second or two when you know there are
+   observations available.
+
+Configuring your run so that filter does a single assimilation and exits
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To configure ``filter`` to only do a single assimilation:
+
+1. Edit the ``&filter_nml`` namelist in ``input.nml`` to set the
+   ``init_time_days`` and ``init_time_seconds`` to match the observation time
+   in your truncated observation sequence file. This overrides any times in the
+   input files and ensures that ``filter`` will only assimilate and not try to
+   advance the model.
+2. Make sure the truncated observation sequence file contains only a single 
+   observation or observations close enough together in time to fit into a
+   single assimilation window.
diff --git a/guide/computing-filter-increments.rst b/guide/computing-filter-increments.rst
@@ -0,0 +1,45 @@
+Computing filter increments
+===========================
+
+.. note::
+
+   This document is written as if your experiment was run with
+   ``single_file_out = .true.``. The potential permutations of filenames output
+   by filter is enormous, so it isn't feasible to write documentation for all
+   possible cases.
+
+After *filter* executes without error and produces an ``obs_seq.final`` file, a
+``preassim.nc`` file, and an ``analysis.nc`` file, the first questions to ask
+are:
+
+1. Is the model state output from ``filter`` different from the input?
+2. Were any observations successfully assimilated?
+
+You can check check if the output model state data was changed by the
+assimilation by using the ``ncdiff`` tool to create a file containing the
+difference of the ``preassim.nc`` and ``analysis.nc`` files. If you are running
+with ``single_file_in = .true.`` and ``single_file_out = .true.`` use
+``ncdiff`` on the files output for the analysis and preassim stages:
+
+.. code-block::
+
+   $ ncdiff analysis.nc preassim.nc increments.nc
+  
+Otherwise, if you are running with ``single_file_in = .false.`` and
+``single_file_out = .false.``, use ``ncdiff`` on the ensemble mean files for
+the analysis and preassim stages:
+
+.. code-block::
+
+   $ ncdiff analysis_mean.nc preassim_mean.nc increments.nc
+
+``ncdiff`` generates a file, ``increments.nc``, that contains the increments,
+or innovations, created by ``filter``. You can view the increments using
+``ncview``:
+
+.. code-block::
+
+   $ ncview increments.nc
+
+to examine the ensemble mean variables. If all values are 0, then the
+assimilation changed nothing in the state.
diff --git a/guide/controlling-files-output.rst b/guide/controlling-files-output.rst
diff --git a/guide/dart-missing-data-value.rst b/guide/dart-missing-data-value.rst
@@ -0,0 +1,18 @@
+DART missing data value
+=======================
+
+If all the prior and posterior mean values are -888888.0, which is the DART
+"missing data" value), those observations were not assimilated.
+
+.. note::
+
+   Some observations have precomputed values and the posterior values for these
+   will always be -888888.0, no matter if the observation was assimilated or
+   not.
+
+If it is not already set, edit the ``&filter_nml`` name list in ``input.nml``
+to set ``num_output_obs_members`` to be the same as the ensemble size.
+
+This will give you all the forward operator values for all the ensemble
+members. You can determine if all ensemble members are failing in the same way,
+or if only a few are problematic.
diff --git a/guide/dart-quality-control.rst b/guide/dart-quality-control.rst
@@ -0,0 +1,99 @@
+DART quality control field
+==========================
+
+For failing observations, the DART quality control (QC) field may indicate the
+reason. To learn more about how to intepret the QC field, see
+:doc:`obs-seq-file`.
+
+`How to locate the different values in an obs_seq.final file. <Observations.md#obs_seq_overview>`__
+
+The ‘DART QC’ field is usually the second of the 2 “quality control” copies.)
+A list of all the DART QC values can be found in the QC table.
+
+-  If the DART QC values are 4, the forward operators have failed. Look at the
+   *model_interpolate()* routine in your model_mod.f90 file, or the forward
+   operator code in *observations/forward_operators/obs_def_xxx_mod.f90* for
+   your observation type. A successful forward operator must return a valid
+   obs_val and an *istatus = 0*. If the forward operator code returns different
+   istatus values for different error types, you can set
+   *&filter_nml::output_forward_op_errors = .true.* and rerun *filter* to see
+   exactly what error istatus codes are being set. See `the filter
+   webpage <../../assimilation_code/programs/filter/filter.html>`__ for more
+   information on how to use the ‘output_forward_op_errors’ option. Negative
+   istatus values are reserved for the system, *istatus = 0* is success, and any
+   positive value indicates a failed forward operator. The code is free to use
+   different positive values to signal different types of errors.
+
+-  If the DART QC values are 5, those observation types were intentionally
+   ignored because they were not listed in the &obs_kind_nml namelist, in the
+   ‘assimilate_these_obs_types’ stringlist.
+
+-  If the DART QC values are 6, the data quality control that came with the
+   original observation data indicates this is a bad quality observation and it
+   was skipped for this reason.
+
+-  If the DART QC values are 7, the observation value is too far away from the
+   ensemble mean. Set *&filter_nml::outlier_threshold = -1* to ignore this for
+   now and rerun. In general, this is not the optimal strategy as the number of
+   observations inconsistent with the ensemble is a very powerful indicator of
+   the success or failure of the assimilation.
+
+If the prior and posterior values in the ``obs_seq.final`` are not -888888.0 but
+are identical, your obs are being assimilated but are having no impact.
+
+The most common reasons assimilated obs have no impact on the model state
+include:
+
+-  **Zero spread in ensemble members**
+   Your initial ensemble members must have different values for each state item.
+   If all members have identical values, the observations cannot make a change.
+   To diagnose this condition, look at the prior ensemble spread. This is either
+   in ``preassim.nc`` or ``preassim_sd.nc``, depending on your model. If all the
+   values are 0, this is your problem. One way to generate an ensemble with some
+   spread is to set *&filter_nml::perturb_from_single_instance = .false.,*
+   (which will still require a single filter initial condition file) but then
+   the *filter* code will add random gaussian perturbations to each state vector
+   item to generate an initial ensemble with spread. The magnitude of the
+   gaussian noise added is controlled by the
+   *&filter_nml::perturbation_amplitude*. It is also possible to write your own
+   perturbation routine in your ``model_mod.f90`` code.
+-  **Cutoff value too small**
+   If the localization radius is too small, the observation may not be ‘close
+   enough’ to the model grid to be able to impact the model. Check the
+   localization radius (*&assim_tools_nml::cutoff*). Set it to a very large
+   number (e.g. 100000) and rerun. If there is now an impact, the cutoff was
+   restricting the items in the state vector so your obs had no impact before.
+   Cutoff values are dependent on the location type being used. It is specified
+   in radians for the threed_sphere locations module (what most large models
+   use), or in simple distance (along a unit circle) if using a low order model
+   (lorenz, ikeda, etc).
+-  **Obs error values too large (less likely)**
+   If the observation error is very large, it will have no impact on the model
+   state. This is less likely a cause than other possibilities.
+-  **No correlation (unlikely)**
+   If there is no correlation between the distribution of the forward
+   observation values and the state vector values, the increments will be very
+   tiny. However there are generally still tiny increments applied, so this is
+   also a low likelyhood case.
+-  **Errors in forward operator location computations, or get_close_obs()**
+   If there is an error in the ``model_mod.f90`` code in either
+   *get_state_meta_data()*, *model_interpolate()*, or the vertical conversion
+   code in *get_close_obs()*, it is possible for the forward operators to appear
+   to be working correctly, but the distances computed for the separation
+   between the obs and the state vector values can be incorrect. The most
+   frequent problem is that the wrong locations are being passed back from
+   *get_state_meta_data()*. This can result in the increments being applied in
+   the wrong locations or not at all. This is usually one of the things to test
+   carefully when developing a new model interface, and usually why we recommend
+   starting with a single observation at a known location.
+-  **Incorrect vertical conversion**
+   If the model is using 3d coordinates and needs the capability to convert
+   between pressure, height, and/or model level, the conversion may be
+   incorrect. The state vector locations can appear to be too high or too low to
+   be impacted by an observation. Some models have a height limit built into
+   their model_mod code to avoid trying to assimilate observations at the model
+   top. The observations cannot make meaningful changes to the model state there
+   and trying to assimilate them can lead to problems with the inflation. If the
+   code in the model_mod is excluding observations incorrectly, or you are
+   testing with observations at the model top, this can result in no impact on
+   the model state.
diff --git a/guide/examining-obs-seq-final.rst b/guide/examining-obs-seq-final.rst
@@ -0,0 +1,24 @@
+Examining the obs_seq.final file
+================================
+
+If there are no changes in the model state after assimilation, then examine the
+``obs_seq.final`` file. There are two ways to do this.
+
+1. If you are testing with a single observation, just look in the file. If this
+   file is in binary format, edit the ``&obs_sequence_nml`` namelist in 
+   ``input.nml`` so the output observation sequence file will be written in
+   ASCII:
+
+   .. code-block:: fortran
+
+      &obs_sequence_nml
+        write_binary_obs_sequence = .false.
+      /
+    
+   Then rerun ``filter`` to regenerate an ``obs_seq.final`` file in ASCII. For 
+   an explanation of the contents of your ``obs_seq.final`` file, see
+   :doc:`obs-seq-file`.
+
+2. If you are using many observations, run the ``obs_diag`` program appropriate
+   for your model. The :doc:`matlab-observation-space` will help to summarize
+   your output and to explore what is going on.
diff --git a/guide/how-does-output-differ-from-input-increments.rst b/guide/how-does-output-differ-from-input-increments.rst
@@ -1,6 +1,5 @@
-##############################################
-Increments: How does output differ from input?
-##############################################
+Computing filter increments using a complex model
+===================================================
 
 The innovations to the model state are easy to derive. Use the `NCO
 Operator <http://nco.sourceforge.net/>`__ *ncdiff* to difference the two DART
@@ -24,7 +23,7 @@ image. It should be possible to get a sense of the magnitude of the innovations
 as a function of time.
 
 Example from a model of intermediate complexity: the bgrid model
-================================================================
+----------------------------------------------------------------
 
 I ran a perfect model experiment with the bgrid model in the DART-default
 configuration and turned on some adaptive inflation for this example. To fully
@@ -136,4 +135,4 @@ Clicking on any location generates a time series.
 This is fundamentally the same as the previous panel except that I have now
 selected the ‘**u**’ **u_mean** variable. Despite the fact the observations were
 only of ‘**t**’, the assimilation has generated (rightly so) increments to the
-‘**u**’ state variable.
+‘**u**’ state variable.
diff --git a/guide/obs-seq-file.rst b/guide/obs-seq-file.rst
@@ -0,0 +1,3 @@
+Observation sequence files
+==========================
+
diff --git a/theory/conditional-probability-bayes-theorem.rst b/theory/conditional-probability-bayes-theorem.rst
@@ -8,14 +8,15 @@ Conditional probability
 -----------------------
 
 Most real-world events involve uncertainty because the occurence of a specific
-outcome isn't guaranteed. You can sense in these hypothetical situations:
+outcome isn't guaranteed. You can sense that in situations in which these are
+possible outcomes:
 
 - your flight departs on time
 - you keep your New Year's resolution
 - your car needs repairs in the next 6 months
 
-that the stated outcomes aren't guaranteed. There is a chance that the opposite
-outcome occurs.
+there is a chance that the opposite outcome might occur. Describing such
+situations accurately requires making probabilistic statements.
 
 In mathematical notation, the probability of an event, :math:`A`, is denoted by
 :math:`P(A)`. If the event :math:`A` means that your flight departs on time,
@@ -84,7 +85,7 @@ the posterior, :math:`P(A|B)`. The theorem is:
 
     posterior = \frac{likelihood \times prior}{normalization}
 
-or
+or:
 
 .. math::
 
@@ -99,7 +100,7 @@ Prior
 You can estimate the probability of a carbon monoxide exposure event in your
 house, :math:`P(A)`, by dividing the number of carbon monoxide exposure events
 that occur annually in houses by the total number of houses in the United
-States, which is 140 million houses.
+States, which is 140 million houses:
 
 .. math::
 
@@ -110,7 +111,7 @@ Likelihood
 
 You can estimate the probability your detector sets off its alarm given that
 there is a carbon monoxide exposure event in your house, :math:`P(B|A)`, since
-you know the error rate of the detector, 0.1%.
+you know the error rate of the detector, 0.1%:
 
 .. math::