Update calibration.ipynb

Added line in intro about still needing optimization. Misc. other minor text changes.
casangi · Apr 15, 2021 · 8b5763e
1 parent 158c462
commit 8b5763e
Showing 1 changed file with 5 additions and 3 deletions.
diff --git a/docs/calibration.ipynb b/docs/calibration.ipynb
@@ -37,9 +37,11 @@
       "source": [
         "# Calibration\n",
         "\n",
-        "The initial demonstration of calibration in CNGI/ngCASA is centered around a streamlined implementation of on-the-fly self-calibration for synthesis visibility data. The purpose of the prototype implementation of self-cal is to demonstrate the use and throughput of the dask-based parallelization framework in the calibration context, and begin to explore how some essential components of general synthesis calibration (gain solving, application) might appear in python, and built upon the xarray visibility data structures. As such, this prototype is limited in several ways compared to the more conventional approach implemented in traditional CASA. In fact, there is no direct single-execution analog of ngCASA selfcal available within CASA, where self-calibration involves a sequence of executions of gaincal and applycal (and includes multiple passes through the data). In contrast, ngCASA selfcal is a single-pass implementation that takes an input data group (visibilities, weights, flags, and model), performs an antenna-based gain solution (at the native time granularity of the data), applies the result to the input data, and returns a calibrated output data group. Most notably, there is no way to examine the solved-for gain solutions themselves. This is mainly a practical consequence of not yet having designed the calibration solution container (caltable), nor the mechanisms for filling and examining it. Rather, the data product (for now), are the corrected visibilities, appropriate for comparison with the input visibilities and for imaging. More comprehensive tools for generating, managing, and examining calibration will be designed and implemented in ngCASA in future. Nonetheless, it is thought that a single-pass selfcal implementation of the sort demonstrated here will have a role in the processing of data from the larger arrays of the future (ngVLA, etc.) since self-calibration is likely the most I/O- and computationally-intensive calibration use case, insofar as it must process the (usually) most-voluminous science target visibilities in any synthesis visibility dataset. It will also be possible to integrate this self-calibration mechanism within the imaging regime, e.g., to implement a form a difference mapping. Thus, this demonstration is well-tuned to the specific question of applicability of the CNGI/ngCASA framework to the calibration domain.\n",
+        "The initial demonstration of calibration in CNGI/ngCASA is centered around a streamlined implementation of on-the-fly self-calibration for synthesis visibility data. The purpose of the prototype implementation of self_cal is to demonstrate the use and throughput of the dask-based parallelization framework in the calibration context, and to begin exploring how some essential components of general synthesis calibration (gain solving, application) might appear in Python when built upon the xarray visibility data structures. As such, this prototype is limited in several ways compared to the more conventional approach implemented in traditional CASA. In fact, there is no direct single-execution analog of ngCASA self_cal available within CASA, where self-calibration involves a sequence of executions of gaincal and applycal (and includes multiple passes through the data). In contrast, ngCASA self_cal is a single-pass implementation that takes an input data group (visibilities, weights, flags, and model), performs an antenna-based gain solution (at the native time granularity of the data), applies the result to the input data, and returns a calibrated output data group. Most notably, there is not yet a way to examine the solved-for gain solutions themselves. This is mainly a practical consequence of not yet having designed the calibration solution container (caltable), nor the mechanisms for filling and examining it. Rather, the data product (for now) is the set of corrected visibilities, appropriate for comparison with the input visibilities and for imaging. More comprehensive tools for generating, managing, and examining calibration will be designed and implemented in ngCASA in the future. Nonetheless, it is thought that a single-pass implementation of the sort demonstrated here will have a role in the processing of data from the larger arrays of the future (ngVLA, etc.) since self-calibration is likely the most I/O- and computationally-intensive calibration use case, insofar as it must process the (usually) most-voluminous science target visibilities in any synthesis visibility dataset. It will also be possible to integrate this self-calibration mechanism within the imaging regime, e.g., to implement a form of difference mapping. Thus, this demonstration is well-tuned to the specific question of applicability of the CNGI/ngCASA framework to the calibration domain.\n",
         "\n",
-        "The ngCASA selfcal function ingests an appropriately time-chunked xarray dataset including (possibly nominally calibrated) visibility data, weights, flags, and model. For each dask chunk in the xarray data group, the input visibility data, model, and weights are conditioned for the solve by (a) zeroing the weights for flagged or absent data, (b) slicing out just the parallel-hands, (c) forming the ratio of visibility and model (including weight update), (d) weighted averaging over frequency channels, (e) (optionally) combining the parallel-hand correlations for a single-pol ('T') solution (including weight update), (f) (optionally) weighted averaging of the time axis up to the virtual dask chunking (including weight update), (g) (optionally) dividing by the visibility amplitudes for phase-only solutions (including weight update). This results in data and weight arrays properly prepared and maximally collapsed for sliced ingestion within the solve loop. Then, for each timestamp and polarization, the solve loop will: (a) detect the available antenna-based constraints, zeroing the weights for all baselines involving antennas with insufficient data according to minblperant, (b) calculate a first guess for the gains based on the available baselines to the specified reference antenna, (c) perform the scalar gain solve via scipy.optimize.least_squares, supplying a weighted-residual calculation function that embodies the (scalar, for now) multiplicative algebra of the visibilities and gains, (d) derive solution error information, and (e) store the solved-for gains in a (temporary) array. Upon completion of the solve loop: (a) phase-only gains are enforced to have unit amplitude (if necessary), (b) the user-specified SNR threshold is applied, and (c) the original data arrays (all channels, correlations, times) are corrected. These results for all dask chunks are then aggregated into a new xarray data group.\n",
+        "Note that the self_cal function has not undergone the degree of performance optimization that imaging/gridding has, and is therefore not intended as a demonstration of framework performance.\n",
+        "\n",
+        "The ngCASA self_cal function ingests an appropriately time-chunked xarray dataset including (possibly nominally calibrated) visibility data, weights, flags, and model. For each dask chunk in the xarray data group, the input visibility data, model, and weights are conditioned for the solve by (a) zeroing the weights for flagged or absent data, (b) slicing out just the parallel-hands, (c) forming the ratio of visibility and model (including weight update), (d) weighted averaging over frequency channels, (e) (optionally) combining the parallel-hand correlations for a single-pol ('T') solution (including weight update), (f) (optionally) weighted averaging of the time axis up to the virtual dask chunking (including weight update), (g) (optionally) dividing by the visibility amplitudes for phase-only solutions (including weight update). This results in data and weight arrays properly prepared and maximally collapsed for sliced ingestion within the solve loop. Then, for each timestamp and polarization, the solve loop will: (a) detect the available antenna-based constraints, zeroing the weights for all baselines involving antennas with insufficient data according to minblperant, (b) calculate a first guess for the gains based on the available baselines to the specified reference antenna, (c) perform the scalar gain solve via scipy.optimize.least_squares, supplying a weighted-residual calculation function that embodies the (scalar, for now) multiplicative algebra of the visibilities and gains, (d) derive solution error information, and (e) store the solved-for gains in a (temporary) array. Upon completion of the solve loop: (a) phase-only gains are enforced to have unit amplitude (if necessary), (b) the user-specified SNR threshold is applied, and (c) the original data arrays (all channels, correlations, times) are corrected. These results for all dask chunks are then aggregated into a new xarray data group.\n",
         "\n",
         "Current notable limitations:\n",
         "\n",
@@ -464,4 +466,4 @@
       ]
     }
   ]
-}
+}
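
Reviewer note: the scalar gain solve that the notebook text describes (solve-loop steps (b) and (c): a first guess from baselines to the reference antenna, then `scipy.optimize.least_squares` on a weighted residual) can be sketched roughly as below. This is not the ngCASA self_cal code. The function name `solve_scalar_gains`, its arguments, the real/imaginary parameter packing, and the final phase referencing are all assumptions made for illustration; only the use of `scipy.optimize.least_squares` and the multiplicative visibility/gain algebra come from the notebook text.

```python
import numpy as np
from scipy.optimize import least_squares


def solve_scalar_gains(vis_ratio, weights, ant1, ant2, n_ant, ref_ant=0):
    """Hypothetical helper: solve vis_ratio[k] ~ g[ant1[k]] * conj(g[ant2[k]])
    for complex antenna gains g at a single timestamp and polarization."""
    vis_ratio = np.asarray(vis_ratio, dtype=complex)
    weights = np.asarray(weights, dtype=float)
    ant1 = np.asarray(ant1)
    ant2 = np.asarray(ant2)
    sqrt_w = np.sqrt(weights)

    def residual(params):
        # Parameters are packed as [real parts, imaginary parts] (an assumption).
        g = params[:n_ant] + 1j * params[n_ant:]
        r = sqrt_w * (vis_ratio - g[ant1] * np.conj(g[ant2]))
        # least_squares needs real residuals, so split real and imaginary parts.
        return np.concatenate([r.real, r.imag])

    # First guess: assume a unit gain on the reference antenna and read the
    # other gains directly off the unflagged baselines that include it.
    g0 = np.ones(n_ant, dtype=complex)
    to_ref = (ant2 == ref_ant) & (weights > 0)
    g0[ant1[to_ref]] = vis_ratio[to_ref]
    from_ref = (ant1 == ref_ant) & (weights > 0)
    g0[ant2[from_ref]] = np.conj(vis_ratio[from_ref])

    result = least_squares(residual, np.concatenate([g0.real, g0.imag]))
    g = result.x[:n_ant] + 1j * result.x[n_ant:]

    # The overall phase is unconstrained; reference it to ref_ant.
    return g * np.exp(-1j * np.angle(g[ref_ant]))
```

Here `vis_ratio` stands for the conditioned visibility/model ratio for one timestamp, with one entry per baseline, and `weights` for the matching weights (zero for flagged or absent baselines). Solution errors, the `minblperant` check, and the SNR threshold mentioned in the notebook are deliberately left out of this sketch.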