gammapy · adonath · Mar 24, 2021 · Mar 18, 2021 · Mar 19, 2021 · Mar 19, 2021
diff --git a/docs/estimators/index.rst b/docs/estimators/index.rst
@@ -11,80 +11,83 @@ estimators - High level estimators
 Introduction
 ============
 The `gammapy.estimators` submodule contains algorithms and classes
-for high level flux and significance estimation such as flux maps,
-flux points, flux profiles and flux light curves. All estimators
-feature a common API and allow to estimate fluxes in bands of reconstructed
-energy.
+for high level flux and significance estimation. This includes
+the estimation of flux maps, flux points, flux profiles and
+flux light curves. All estimators feature a common API and allow
+estimating fluxes in bands of reconstructed energy.
 
 The core of any estimator algorithm is hypothesis testing: a reference
 model or counts excess is tested against a null hypothesis. From the
-best fit reference model a flux is derived and a corresponding :math:`\sqrt{\Delta TS}`
-value from the difference in fit statistics to the null hypothesis,
-assuming one degree of freedom. In this case
-:math:`\sqrt{\Delta TS}` represents an approximation of the
-"classical significance".
+best fit reference model a flux is derived and a corresponding :math:`\Delta TS`
+value from the difference in fit statistics to the null hypothesis.
+Assuming one degree of freedom, :math:`\sqrt{\Delta TS}` represents an
+approximation (`Wilks' theorem <https://en.wikipedia.org/wiki/Wilks%27_theorem>`_)
+of the "classical significance". In case of a negative best fit flux,
+e.g. when the background is overestimated, the significance is defined
+as :math:`-\sqrt{\Delta TS}` by convention.
 
-In general the flux can be estimated using methods:
+In general the flux can be estimated using two methods:
 
-1. Based on model fitting: given a (global) best fit model with multiple model components,
-the flux of the component of interest is re-fitted in the chosen energy, time or spatial
-region. The new flux is given as a ``norm`` with respect to the global reference model.
-Optionally other component parameters in the global model can be re-optimised.
+#. **Based on model fitting:** given a (global) best fit model with multiple model components,
+   the flux of the component of interest is re-fitted in the chosen energy, time or spatial
+   region. The new flux is given as a ``norm`` with respect to the global reference model.
+   Optionally other component parameters in the global model can be re-optimised. This method
+   is also named **forward folding**.
 
-2. Based on excess: in the case of having one energy bin, neglecting the PSF and not re-optimising
-other parameters, once can estimate the flux based on excess and derive the significance
-analytically from the classical Li & Ma solution.
+#. **Based on excess:** in the case of having one energy bin, neglecting the PSF and
+   not re-optimising other parameters, one can estimate the significance based on the
+   analytical solution by [LiMa1983]. In this case the "best fit" flux and significance
+   are given by the excess over the null hypothesis. This method is also named
+   **backward folding**.
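The excess-based method can be illustrated with a minimal sketch of the Li & Ma (1983, Eq. 17) significance for a single on/off counting measurement. This is not the gammapy implementation: the function name ``li_ma_significance`` and the example counts are hypothetical, and both counts are assumed to be larger than zero. The sign follows the convention described above, where a deficit gives a negative significance.

```python
import numpy as np


def li_ma_significance(n_on, n_off, alpha):
    """Signed Li & Ma (1983, Eq. 17) significance for an on/off measurement.

    n_on, n_off: counts in the on and off regions (both assumed > 0)
    alpha: ratio of on to off exposure
    """
    n_on, n_off = float(n_on), float(n_off)
    total = n_on + n_off
    term_on = n_on * np.log((1 + alpha) / alpha * n_on / total)
    term_off = n_off * np.log((1 + alpha) * n_off / total)
    # TS of the excess hypothesis against the background-only null hypothesis
    ts = max(2 * (term_on + term_off), 0.0)  # guard against round-off below zero
    excess = n_on - alpha * n_off
    # Negative excess gives a negative significance by convention
    return float(np.sign(excess) * np.sqrt(ts))


significance = li_ma_significance(n_on=130, n_off=100, alpha=1.0)
deficit = li_ma_significance(n_on=100, n_off=130, alpha=1.0)
```

For ``alpha = 1`` swapping the on and off counts only flips the sign, which makes the sign convention easy to check.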
 
 
-The technical implementation follows the concept of a reference
-best fit model. Given a global best fit model, the source of interest
-(for which flux points are computed) is scaled in amplitude by fitting a ``norm``
-parameter. The fitting is done by grouping the data in time
-and reconstructed energy bins (reference?).
-
-Based on this algorithm most estimators compute the same basic quantities:
+For both methods, most estimators compute the same basic quantities:
 
 ================= =================================================
 Quantity          Definition
 ================= =================================================
-e_ref			  Reference energy
-e_min			  Minimum energy
-e_max			  Maximum energy
-norm			  Norm with respect to the reference spectral model
-norm_err		  Symmetric rrror on the norm derived from the Hessian matrix
-ts				  Difference in fit statistics (`stat_sum - null_value` )
-sqrt_ts			  Square root of TS, corresponds to significance (Wilk's theorem)
+norm              Best fit norm with respect to the reference spectral model
+norm_err          Symmetric error on the norm derived from the Hessian matrix
+stat              Fit statistics value of the best fit hypothesis
+stat_null         Fit statistics value of the null hypothesis
+ts                Difference in fit statistics (`stat - stat_null` )
+sqrt_ts           Square root of ts times sign(norm); in case of one degree of freedom, corresponds to significance (Wilks' theorem)
+npred             Predicted counts of the best fit hypothesis, equivalent to correlated counts for backward folding
+npred_null        Predicted counts of the null hypothesis, equivalent to correlated null counts for backward folding
+npred_excess      Predicted counts of the excess over `npred_null`, i.e. (`npred - npred_null`), equivalent to correlated counts for backward folding
 ================= =================================================
 
+
 In addition the following optional quantities can be computed:
 
 ================= =================================================
 Quantity          Definition
 ================= =================================================
-norm_errp		  Positive error of the norm
-norm_errn	      Negative error of the norm
-norm_ul			  Upper limit of the norm
-norm_scan		  Norm scan
-stat_scan		  Fit statistics scan
-stat			  Fit statistics value of the best fit model
-null_value		  Fit statistics value of the null hypothesis
+norm_errp         Positive error of the norm
+norm_errn         Negative error of the norm
+norm_ul           Upper limit of the norm
+norm_scan         Norm scan
+stat_scan         Fit statistics scan
 ================= =================================================
 
-
-To compute the assymetric errors as well as upper limits one can
+To compute the error, asymmetric errors as well as upper limits one can
 specify the arguments ``n_sigma`` and ``n_sigma_ul``. The ``n_sigma``
-arguments are translated into a TS value assuming ``ts = sigma ** 2``.
+arguments are translated into a TS difference assuming ``ts = n_sigma ** 2``.
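As a rough illustration of the ``ts = n_sigma ** 2`` rule, the sketch below reads a positive error and an upper limit off a fit statistic scan. The parabolic profile, the grid and the helper ``crossing_above`` are hypothetical and only stand in for a real ``norm_scan`` / ``stat_scan`` pair; the estimators themselves may locate the crossing differently.

```python
import numpy as np

# Toy fit statistic profile: parabola with best fit norm 1.2 and width 0.3
norm_scan = np.linspace(0.0, 3.0, 3001)
stat_scan = ((norm_scan - 1.2) / 0.3) ** 2


def crossing_above(norm_scan, stat_scan, n_sigma):
    """Norm above the best fit where the statistic rises by n_sigma ** 2.

    Assumes the profile increases monotonically above its minimum.
    """
    i_best = np.argmin(stat_scan)
    delta_ts = stat_scan[i_best:] - stat_scan[i_best]
    return float(np.interp(n_sigma ** 2, delta_ts, norm_scan[i_best:]))


norm_best = float(norm_scan[np.argmin(stat_scan)])
norm_errp = crossing_above(norm_scan, stat_scan, n_sigma=1) - norm_best
norm_ul = crossing_above(norm_scan, stat_scan, n_sigma=2)
```

With this toy profile the positive error comes out close to the parabola width (0.3) and the two sigma upper limit close to 1.8.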
 
-In addition to the norm values a reference spectral model is given.
-Using this reference spectral model the norm values can be converted
+In addition to the norm values a reference spectral model and energy ranges
+are given. Using this reference spectral model the norm values can be converted
 to the following different SED types:
 
 ================= =================================================
 Quantity          Definition
 ================= =================================================
-dnde 		      Differential flux at ``e_ref``
-flux 			  Integrated flux between ``e_min`` and ``e_max``
-eflux			  Integrated energy flux between ``e_min`` and ``e_max``
+e_ref             Reference energy
+e_min             Minimum energy
+e_max             Maximum energy
+dnde              Differential flux at ``e_ref``
+flux              Integrated flux between ``e_min`` and ``e_max``
+eflux             Integrated energy flux between ``e_min`` and ``e_max``
+e2dnde            Differential energy flux at ``e_ref``
 ================= =================================================
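The conversion from ``norm`` to the SED types can be sketched with a power law as reference spectral model. Everything below is an assumption made for illustration: the parameter values, the helper ``norm_to_sed`` and the choice of the geometric mean as ``e_ref`` are not taken from the gammapy code, and the analytic integrals only hold for a spectral index different from 1 and 2.

```python
import numpy as np

# Hypothetical power-law reference model: dnde = amplitude * (E / e0) ** -index
amplitude = 1e-12  # cm-2 s-1 TeV-1
index = 2.5
e0 = 1.0  # TeV


def dnde_ref(energy):
    """Differential flux of the reference model."""
    return amplitude * (energy / e0) ** (-index)


def flux_ref(e_min, e_max):
    """Integral flux of the reference model (index != 1)."""
    p = 1 - index
    return amplitude * e0 / p * ((e_max / e0) ** p - (e_min / e0) ** p)


def eflux_ref(e_min, e_max):
    """Integral energy flux of the reference model (index != 2)."""
    p = 2 - index
    return amplitude * e0 ** 2 / p * ((e_max / e0) ** p - (e_min / e0) ** p)


def norm_to_sed(norm, e_min, e_max):
    """Scale the reference model quantities by the fitted norm."""
    e_ref = np.sqrt(e_min * e_max)  # assumption: geometric mean of the bin
    dnde = norm * dnde_ref(e_ref)
    return {
        "e_ref": e_ref,
        "dnde": dnde,
        "flux": norm * flux_ref(e_min, e_max),
        "eflux": norm * eflux_ref(e_min, e_max),
        "e2dnde": e_ref ** 2 * dnde,
    }


sed = norm_to_sed(norm=2.0, e_min=1.0, e_max=10.0)
```

Errors and upper limits on the norm scale linearly in the same way, e.g. by passing ``norm_err`` or ``norm_ul`` instead of ``norm``.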
 
 The same can be applied for the error and upper limit information.
@@ -102,14 +105,13 @@ Getting Started
 Tutorials
 =========
 
-The main tutorial that demonstrates how to extract light curves from 1D and 3D datasets:
-
-* `Light Curve tutorial <../tutorials/light_curve.html>`__
-
-Light curve extraction on small time bins (i.e. smaller than the observation scale) for flares
-is demonstrated in the following tutorial:
+The main tutorials that demonstrate how to use the estimator classes are:
 
-* `Flare tutorial <../tutorials/light_curve_flare.html>`__
+* `Light Curve tutorial (LightCurveEstimator) <../tutorials/analysis/time/light_curve.html>`__
+* `Flare tutorial (LightCurveEstimator) <../tutorials/analysis/time/light_curve_flare.html>`__
+* `Source detection (TSMapEstimator) <../tutorials/analysis/2D/detect.html>`__
+* `Spectral analysis (FluxPointEstimator) <../tutorials/analysis/1D/spectral_analysis.html>`__
+* `Detailed 3D analysis (ExcessMapEstimator) <../tutorials/analysis/3D/analysis_3d.html>`__
 
 
 Reference/API