<center><strong><font size=+3>A Generalized Approach to Redundant Calibration with JAX</font></strong></center>
<br><br>
<center><strong><font size=+2>Matyas Molnar and Bojan Nikolic</font><br></strong></center>
<br><center><strong><font size=+1>Astrophysics Group, Cavendish Laboratory, University of Cambridge</font></strong></center>

Example notebook that performs redundant calibration with a generalized MLE framework (§1, see [HERA Memorandum #84](http://reionization.org/wp-content/uploads/2013/03/HERA084__A_Generalized_Approach_to_Redundant_Calibration_with_JAX.pdf)) and that compares redundantly calibrated visibilities across days by solving for degenerate parameter offsets between them (§2, see [HERA Memorandum #94](http://reionization.org/manual_uploads/HERA094__Comparing_Visibility_Solutions_from_Relative_Redundant_Calibration_by_Degenerate_Translation.pdf)).

[JAX](https://github.com/google/jax) is used for the calibration computations. Compared with pure NumPy and SciPy, it offers a considerable speed-up with little extra effort, even when moving away from Gaussianity, and without the need to linearize or approximate.

In [None]:
import os

import numpy
from matplotlib import pyplot as plt

from fit_diagnostics import abs_residuals, norm_residuals
from plot_utils import cplot, plot_red_vis
from red_likelihood import condenseMap, degVis, doRelCal, doRelCalD, doOptCal, \
doDegVisVis, flt_ant_pos, group_data, gVis, makeEArray, norm_rel_sols, red_ant_sep, \
relabelAnts, rotate_phase, split_rel_results, XDgVis
from red_utils import find_flag_file, find_nearest, find_zen_file, get_bad_ants, \
match_lst
from xd_utils import XDgroup_data

In [None]:
numpy.set_printoptions(threshold=500)

In [None]:
plt.rcParams['figure.figsize'] = (12, 8)
%matplotlib inline

In [None]:
plot_figs = False
if plot_figs:
    import matplotlib as mpl
    mpl.rcParams['figure.dpi'] = 300

from matplotlib import rc
rc('font',**{'family':'serif','serif':['cm']})
rc('text', usetex=True)
rc('text.latex', preamble=r'\usepackage{amssymb} \usepackage{amsmath}')

In [None]:
# Inputs
JD = 2458098.43869
JD_comp = 2458099 # for degenerate comparison
pol = 'ee' # polarization
freq_channel = 605 # frequency channel
time_integration = 0 # time integration of the 1st dataset on day JD
noise_dist = 'gaussian' # assumed noise distribution for neg log-likelihood minimizations

In [None]:
# Other options (development purposes only)
rel_cal_coords = 'cartesian' # parameter coordinate system
bounded_rel_cal = False # bound gain and visibility amplitudes in relative calibration
logamp = False # log(gain amplitude) parameter method to force positive amplitudes
lovamp = False # log(vis amplitude) parameter method to force positive amplitudes
rc_ref_ant_idx = None # constrain gain of reference antenna
tilt_reg = False # regularization term to constrain tilt shifts to 0
gphase_reg = False # regularization term to constrain the gain phase mean to 0
rot_phase = False # rotate phases of relative calibration gains with negative amplitudes 

## Retrieving the data

In [None]:
zen_fn = find_zen_file(JD)
bad_ants = get_bad_ants(zen_fn)
flags_fn = find_flag_file(JD, 'first') # import flags from firstcal

In [None]:
# Finding and loading the zen_file on JD_comp that matches the LAST of the 1st dataset
JD2 = match_lst(JD, JD_comp, tint=time_integration)
if len(str(JD2)) < 13:
    JD2 = str(JD2) + '0' # add a trailing 0 that is omitted in float
zen_fn2 = find_zen_file(JD2)
bad_ants2 = get_bad_ants(zen_fn2)
flags_fn2 = find_flag_file(JD2, 'first') # import flags from firstcal

# Taking the union of the bad antenna arrays for both datasets, if they're not equal
if not numpy.array_equal(bad_ants, bad_ants2):
    print('The visibilities for datasets {} and {} do not have the same bad antennas. '\
          'Selecting the union of the bad antennas for those datasets in this analysis.'.\
          format(JD, JD2))
    bad_ants = numpy.union1d(bad_ants, bad_ants2)

In [None]:
hdraw, RedG, cMData = group_data(zen_fn, pol, freq_channel, None, bad_ants, flags_fn)
cData = cMData.filled() # filled with nans for flags
flags = cMData.mask

# mitigating for multiple freqs - only chooses first one
if cData.shape[0] > 1:
    cData = cData[0, ...]
    flags = flags[0, ...]
    # freq_channel may be a scalar or a sequence, so index it safely
    print('Frequency channel {} selected for notebook analysis\n'\
          .format(numpy.atleast_1d(freq_channel)[0]))
cData = numpy.squeeze(cData)
flags = numpy.squeeze(flags)

if all(numpy.isnan(cData[time_integration, :])):
    print('All visibilities for channel {} and time integration {} are flagged '\
          '- choose different values'.format(freq_channel, time_integration))

ants = numpy.unique(RedG[:, 1:])
no_ants = ants.size
no_unq_bls = numpy.unique(RedG[:, 0]).size
cRedG = relabelAnts(RedG)
ant_pos_arr = flt_ant_pos(hdraw.antpos, ants)

In [None]:
plot_red_vis(cData, RedG, vis_type='amp')

# Redundant calibration

Fundamentally, the problem of calibration boils down to the measurement equation:

$$ V_{ij}^{\text{obs}} (\nu) = g_i (\nu) g_j^* (\nu) V_{ij}^{\text{true}}(\nu) + n_{ij} (\nu) $$

where the observed visibility $V_{ij}^{\text{obs}}$ between antennas $i$ and $j$ at a given time and frequency is related to the true underlying visibility $V_{ij}^{\text{true}}$ by a pair of complex and frequency-dependent gain factors, $g_i$ and $g_j$, if we assume per-antenna gains, along with uncorrelated random noise $n_{ij}$. The ultimate aim of calibration is to solve for these gains and true visibilities. 

An array with regularly spaced antennas has many redundant visibilities that are sensitive to the same modes on the sky. Redundant calibration uses the fact that the true visibilities from redundant baselines are equal. Supposing there are no direction-dependent calibration effects, we therefore have a system of equations for all antenna pairs $i$ and $j$:

$$ V_{ij}^{\text{obs}} (\nu) = g_i (\nu) g_j^* (\nu) U_{\alpha}(\nu) + n_{ij} (\nu) $$

where $U_{\alpha}(\nu) = V(\mathbf{r}_i-\mathbf{r}_j)$, the visibility for the baseline vector $\mathbf{b}_{ij} = \mathbf{r}_i-\mathbf{r}_j$, corresponds to a redundant baseline set that we index by $\alpha$.

For the planned full HERA array, there will be 331 elements in the hexagonal core, corresponding to $N_{\mathrm{bl}} = 331(331-1)/2 = 54,615$ baselines. The core only has 630 unique baselines, which means that we have a non-linear system of $54,615$ equations to determine the 630 true visibilities and 331 gains.
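
The counts quoted above can be checked with a short script. This is a minimal sketch that assumes an idealized, perfectly regular hexagonal lattice of 331 points (a hexagon of "radius" 10 in axial coordinates). `hex_array` is a hypothetical helper that is not part of the notebook's modules, and the actual HERA core layout is not modelled here.

```python
from itertools import combinations


def hex_array(radius):
    """Axial (q, r) coordinates of a filled hexagon on a triangular lattice."""
    return [(q, r)
            for q in range(-radius, radius + 1)
            for r in range(-radius, radius + 1)
            if abs(q + r) <= radius]


ants = hex_array(10)  # assumed idealized 331-element core
n_ants = len(ants)
n_bl = n_ants * (n_ants - 1) // 2

# Two baselines are redundant if their separation vectors agree up to a sign,
# so each separation is stored with a fixed orientation.
unique_bls = set()
for (q1, r1), (q2, r2) in combinations(ants, 2):
    sep = (q2 - q1, r2 - r1)
    if sep < (0, 0):
        sep = (-sep[0], -sep[1])
    unique_bls.add(sep)

n_unq_bls = len(unique_bls)
print(n_ants, n_bl, n_unq_bls)
```

For this idealized lattice the script reproduces the 331 antennas, 54,615 baselines and 630 unique baselines quoted in the text.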

## Relative calibration

With the redundant calibration prior, an MLE for the gains and true visibilities can be constructed by assuming a distribution for the observed visibility noise.

### Gaussian distribution

Assuming Gaussian uncorrelated noise with variance $\sigma_{ij}^2$, which is the expected noise from the receivers and the sky, through MLE considerations, the gains and true visibilities can be found by minimizing the following negative log-likelihood function:

$$ -\ln(\mathcal{L}^G_{\mathrm{rel}})(\nu) = \frac{1}{2} \sum_{\alpha} \sum_{\{i,j\}_{\alpha}} \ln(2 \pi \sigma_{ij}^2(\nu)) + \frac{ \left| V_{ij}^{\text{obs}} (\nu) - g_i (\nu) g_j^{*} (\nu) U_{\alpha}(\nu) \right|^2}{\sigma_{ij}^2(\nu)} $$

where $\{i,j\}_{\alpha}$ are sets of antennas that belong to baseline group $\alpha$. This minimization is equivalent to minimizing the $\chi^2$:

$$ \chi_{\mathrm{rel}}^{2} (\nu) = \sum_{\alpha} \sum_{\{i,j\}_{\alpha}} \frac{ \left| V_{ij}^{\text{obs}} (\nu) - g_i (\nu) g_j^* (\nu) U_{\alpha}(\nu) \right|^2 }{\sigma_{ij}^2(\nu)} $$

This non-linear least-squares optimization can be done independently for each frequency and time. Minimizing $ \chi_{\mathrm{rel}}^{2}$ has been the main focus of redundant calibration methods, with current efforts opting to linearize the measurement equation for computational ease.
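
As an illustration of the $\chi^2$ above, the sketch below evaluates it on a toy 4-antenna line array. It is not the notebook's implementation: `chi2_rel` is a hypothetical helper, the `(alpha, i, j)` row layout only mirrors how the notebook appears to index `RedG`, and the gains, visibilities and unit noise variance are made up.

```python
import cmath


def chi2_rel(obs, red_groups, gains, u, sigma=1.0):
    """chi^2 of the redundant measurement equation.

    red_groups holds (alpha, i, j) rows: baseline group, antenna i, antenna j.
    """
    total = 0.0
    for v_obs, (alpha, i, j) in zip(obs, red_groups):
        model = gains[i] * gains[j].conjugate() * u[alpha]
        total += abs(v_obs - model) ** 2 / sigma ** 2
    return total


# Toy 4-antenna line array: separations of 1, 2 and 3 units give 3 groups
red_groups = [(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 2), (1, 1, 3), (2, 0, 3)]
true_gains = [cmath.rect(a, p) for a, p in
              [(1.0, 0.0), (1.1, 0.3), (0.9, -0.2), (1.05, 0.1)]]
true_u = [2.0 + 1.0j, -0.5 + 1.5j, 0.3 - 0.7j]

# Noise-free observations generated from the measurement equation
obs = [true_gains[i] * true_gains[j].conjugate() * true_u[a]
       for a, i, j in red_groups]

chi2_truth = chi2_rel(obs, red_groups, true_gains, true_u)
chi2_unit = chi2_rel(obs, red_groups, [1.0 + 0.0j] * 4, true_u)
print(chi2_truth, chi2_unit)
```

With noise-free data the $\chi^2$ vanishes at the true parameters and is positive when the gains are wrongly assumed to be unity.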

### Cauchy distribution

Empirically, it is found that the noise from visibility observations from a redundant set may not be Gaussian, due to non-redundancies from instrumental effects and the presence of outliers (from e.g. RFI), and that the visibilities may follow a distribution with fatter tails. To be insensitive to outliers, one can employ robust statistics.

We can extend the MLE analysis to different distributions for the visibility noise. As an example, we can assume a Cauchy distribution for the visibility noise, which has the median as its location parameter. The median is a robust measure of central tendency and is used here to reduce the effect of RFI. The Cauchy probability density function is given by

$$ f(x; x_0, \gamma) = \frac{1}{\pi \gamma \left[ 1 + \left( \frac{x - x_0}{\gamma} \right)^2 \right] }$$

where $x_0$ is the location parameter (the median) and $\gamma$ is the scale parameter, which specifies the half width at half maximum (HWHM).
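
The role of $\gamma$ as the half width at half maximum can be verified numerically. The values of $x_0$ and $\gamma$ below are arbitrary and `cauchy_pdf` is a hypothetical helper written for this check only.

```python
import math


def cauchy_pdf(x, x0, gamma):
    """Cauchy probability density with location x0 and scale gamma."""
    return 1.0 / (math.pi * gamma * (1.0 + ((x - x0) / gamma) ** 2))


x0, gamma = 2.0, 0.5  # arbitrary illustrative values
peak = cauchy_pdf(x0, x0, gamma)
half_left = cauchy_pdf(x0 - gamma, x0, gamma)
half_right = cauchy_pdf(x0 + gamma, x0, gamma)
print(peak, half_left, half_right)
```

The density peaks at $1/(\pi\gamma)$ and falls to half of that value at $x_0 \pm \gamma$.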

We no longer assume that the noise in the measurement equation is Gaussian and instead assume that it is Cauchy distributed when solving for the relative redundant calibration parameters. Working in a maximum likelihood framework, the negative log-likelihood when solving for redundant baseline sets is then given by

$$ -\ln(\mathcal{L}^C_{\mathrm{rel}}) (\nu) = \sum_{\alpha} \sum_{\{i,j\}_{\alpha}} \ln(\pi \gamma_{ij} (\nu)) + \ln \left( 1 + \left( \frac{\left| V_{ij}^{\text{obs}} (\nu) - g_i (\nu) g_j^* (\nu) U_{\alpha}(\nu) \right|}{\gamma_{ij}(\nu)} \right)^2 \right) $$

This MLE with Cauchy-distributed noise fully encapsulates the distribution of the data, without being distorted by outliers, and is the best median estimator of the data.

The advantage of the Cauchy distribution in fitting for redundant baselines is not clear-cut: while it does reduce the impact of outliers, it only performs better than the Gaussian in the presence of RFI, which is in the minority of cases. However, the use of the Cauchy could still be significant, especially if weak RFI is not picked up by flagging and the observations are integrated down. Further investigation is required to see if there is any merit in using the Cauchy distribution in such MLE computations.
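
The effect of a single outlier on the two likelihoods can be seen in a one-dimensional toy problem. This is only a sketch under stated assumptions: the data are real-valued and made up, the scale parameters are fixed at an arbitrary 0.1, and a brute-force grid search stands in for the optimizer used in the notebook.

```python
import math


def gaussian_nll(x0, data, sigma):
    return sum(0.5 * math.log(2 * math.pi * sigma ** 2)
               + 0.5 * ((x - x0) / sigma) ** 2 for x in data)


def cauchy_nll(x0, data, gamma):
    return sum(math.log(math.pi * gamma)
               + math.log(1 + ((x - x0) / gamma) ** 2) for x in data)


def grid_minimum(nll, data, scale, lo=0.0, hi=60.0, step=0.01):
    """Location on a regular grid that minimizes the given NLL."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n_steps + 1)]
    return min(grid, key=lambda x0: nll(x0, data, scale))


data = [0.9, 1.0, 1.1, 0.95, 1.05, 50.0]  # made-up values; 50.0 mimics RFI

gauss_loc = grid_minimum(gaussian_nll, data, 0.1)
cauchy_loc = grid_minimum(cauchy_nll, data, 0.1)
print(gauss_loc, cauchy_loc)
```

The Gaussian estimate lands on the sample mean, which the outlier drags far from the bulk of the data, while the Cauchy estimate stays close to 1.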

In [None]:
obs_vis = cData[time_integration, :]
res_rel, initp = doRelCal(cRedG, obs_vis, no_unq_bls, no_ants, distribution=noise_dist, \
                   coords=rel_cal_coords, bounded=bounded_rel_cal, norm_gains=True, \
                   logamp=logamp, lovamp=lovamp, tilt_reg=tilt_reg, ant_pos_arr=ant_pos_arr, \
                   gphase_reg=gphase_reg, ref_ant_idx=rc_ref_ant_idx, max_nit=5000, return_initp=True)
res_relx = numpy.array(res_rel['x'])
if rel_cal_coords == 'polar' and rot_phase and (res_relx[-2*no_ants::2] < 0).any():
    # adjustment if negative amplitude solutions are found: the absolute values
    # of the amplitudes are taken, and phases of affected antennas are rotated by +pi
    print('Rotating gain and visibility phases to have positive amplitudes.')
    res_relx = rotate_phase(res_relx, no_unq_bls, norm_gains=True)
    
res_rel_vis, res_rel_gains = split_rel_results(res_relx, no_unq_bls, coords=rel_cal_coords)

In [None]:
# gain amplitudes
if rel_cal_coords == 'polar':
    gamps = res_relx[-2*no_ants::2]
if rel_cal_coords == 'cartesian':
    gamps = numpy.abs(res_rel_gains)
print(gamps)

In [None]:
# gain phases
if rel_cal_coords == 'polar':
    gphases = res_relx[-2*no_ants+1::2]
if rel_cal_coords == 'cartesian':
    gphases = numpy.angle(res_rel_gains)
print(gphases)

In [None]:
# some gain stats
print('Gains - average amp: {}, product of amps: {}, average phase: {}'.format(gamps.mean(), \
      gamps.prod(), gphases.mean()))

In [None]:
# tilt shifts
tiltx = (gphases * ant_pos_arr[:, 0]).sum()
tilty = (gphases * ant_pos_arr[:, 1]).sum()
print('Tilt in x-coordinate: {}\nTilt in y-coordinate: {}'.format(tiltx, tilty))

In [None]:
# Residuals for relative redundant step
pred_rel_vis = gVis(res_rel_vis, cRedG, res_rel_gains)
rel_residuals = obs_vis - pred_rel_vis
cplot(rel_residuals, xlabel='Baseline', ylabel='Residual')

In [None]:
# Relative residuals normalized by amplitude
norm_rel_residuals = norm_residuals(obs_vis, pred_rel_vis)
cplot(norm_rel_residuals, xlabel='Baseline', ylabel='Normalized residual')
print('Median absolute normalized residual - Real: {}, Imag: {}'\
      .format(*abs_residuals(norm_rel_residuals)))

## Constraining degeneracies (optimal calibration)

See [HERA Memo 63](http://reionization.org/wp-content/uploads/2013/03/HERA063_abs_cal_compare.pdf) for further details about constraining these degeneracies.

Relative calibration yields solutions with degeneracies that can be parameterized as four terms per frequency:
 - Overall amplitude $A(\nu)$
 - Overall phase $\Delta(\nu)$
 - Phase gradient components $\Delta_x(\nu)$ and $\Delta_y(\nu)$
 
since the below transformations of these degenerate parameters leave $-\ln(\mathcal{L})$ (for both the Gaussian and Cauchy distributions) unchanged:
 - $g_i \rightarrow A g_i$ accompanied by $U_{\alpha} \rightarrow A^{-2} U_{\alpha}$
 - $g_k = |g_k|e^{i\phi_k} \rightarrow |g_k|e^{i(\phi_k + \Delta)}$, s.t. $g_k g_l^{*} = |g_k| |g_l| e^{i(\phi_k - \phi_l)} \rightarrow |g_k| |g_l| e^{i(\phi_k + \Delta - \phi_l - \Delta)} = g_k g_l^{*}$
 - $g_k = |g_k|e^{i\phi_k} \rightarrow |g_k|e^{i(\phi_k + \Delta_x x_k + \Delta_y y_k)}$ accompanied by $U_{\alpha} = |U_{\alpha}| e^{i\phi_{\alpha}} \rightarrow |U_{\alpha}| e^{i(\phi_{\alpha} - \Delta_x x_{\alpha} - \Delta_y y_{\alpha})}$
 
where in the last line, the array is assumed to be co-planar, and $(x_k, y_k)$ are the positional coordinates of antenna $k$, and $(x_{\alpha}, y_{\alpha})$ are the separations of the antennas that form baselines in redundant set $\alpha$.
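
These invariances can be checked numerically. The sketch below applies all of the transformations at once to a made-up co-planar array and confirms that the model visibilities $g_i g_j^* U_{\alpha}$ do not change. The positions, gains, offsets and the `baseline_vis` function are illustrative assumptions, and baselines are taken as $\mathbf{r}_i - \mathbf{r}_j$ as in the text.

```python
import cmath

# Toy co-planar array: antenna positions (x, y) in arbitrary units
ant_pos = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.5, 1.0), (1.5, 1.0)]
gains = [cmath.rect(1.0 + 0.05 * k, 0.1 * k) for k in range(len(ant_pos))]


def baseline_vis(sep):
    """Made-up 'true' visibility that depends only on the separation."""
    return cmath.rect(1.0 + abs(sep[0]), 0.7 * sep[0] - 0.4 * sep[1])


def model(gains, vis_of_sep):
    """Model visibilities g_i g_j^* U for every antenna pair i < j."""
    out = []
    for i in range(len(ant_pos)):
        for j in range(i + 1, len(ant_pos)):
            sep = (ant_pos[i][0] - ant_pos[j][0], ant_pos[i][1] - ant_pos[j][1])
            out.append(gains[i] * gains[j].conjugate() * vis_of_sep(sep))
    return out


amp, delta, delta_x, delta_y = 1.3, 0.4, 0.2, -0.15  # arbitrary degenerate offsets

new_gains = [amp * g * cmath.exp(1j * (delta + delta_x * x + delta_y * y))
             for g, (x, y) in zip(gains, ant_pos)]


def new_vis(sep):
    """Visibility with the compensating amplitude and phase-gradient change."""
    phase = cmath.exp(-1j * (delta_x * sep[0] + delta_y * sep[1]))
    return amp ** -2 * phase * baseline_vis(sep)


max_change = max(abs(a - b) for a, b in
                 zip(model(gains, baseline_vis), model(new_gains, new_vis)))
print(max_change)
```

The largest change in any model visibility is at the level of floating-point rounding, so any likelihood built from these products is unchanged.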

The degenerate parameters can be calculated from a sky model in an absolute calibration step. Alternatively, we can solve for these degenerate parameters by calculating them directly from the $-\ln(\mathcal{L})$ by applying a few conditions. This method, however, still ultimately needs to reference the sky to set the flux scale and phase centre.

We first define a set of parameters $h_i$ to be the gains that obey the following constraints:

$$
\begin{align}
    \frac{1}{N} \sum_i^N |h_i| = 1 \quad \rightarrow \quad & \text{mean gain amplitude of 1} \\
	\frac{1}{N} \sum_i^N \mathrm{Arg} (h_i) = 0 \quad \rightarrow \quad & \text{mean gain phase of 0} \\
    \sum_i^N x_i \mathrm{Arg} (h_i) = 0 \quad \rightarrow \quad & \text{phase gradient of 0 in }x \\
	\sum_i^N y_i \mathrm{Arg} (h_i) = 0 \quad \rightarrow \quad & \text{phase gradient of 0 in }y
\end{align}
$$

such that the antenna gains can be written as

$$ g_i (\nu) = A(\nu) e^{i \left[ \Delta (\nu) + \Delta_{x} (\nu) x_{i} + \Delta_{y} (\nu) y_{i} \right]} h_i (\nu) $$

where $(x_i, y_i)$ is the position of antenna $i$, so that all degenerate dependencies are removed from $h_i$. We note that these constraints are arbitrary.
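
One way to see how these constraints fix $A$, $\Delta$, $\Delta_x$ and $\Delta_y$ is to split a given set of gains into degenerate parameters and constrained gains $h_i$. The sketch below does this with a least-squares plane fit to the gain phases, whose normal equations coincide with the three phase constraints. It is not the method used by `doOptCal`. `split_degeneracies` is a hypothetical helper, the gains and positions are made up, and phases are assumed not to wrap.

```python
import cmath


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    sols = []
    for col in range(3):
        mc = [list(row) for row in m]
        for row in range(3):
            mc[row][col] = rhs[row]
        sols.append(det3(mc) / d)
    return sols


def split_degeneracies(gains, ant_pos):
    """Return (A, Delta, Delta_x, Delta_y, h) such that h obeys the constraints."""
    n = len(gains)
    amp = sum(abs(g) for g in gains) / n
    phases = [cmath.phase(g) for g in gains]  # assumes no phase wrapping
    xs = [p[0] for p in ant_pos]
    ys = [p[1] for p in ant_pos]
    # Normal equations of a least-squares plane fit to the phases; the fit
    # residuals satisfy the mean and gradient constraints by construction
    basis = [[1.0] * n, xs, ys]
    m = [[sum(u * v for u, v in zip(bi, bj)) for bj in basis] for bi in basis]
    rhs = [sum(u * p for u, p in zip(bi, phases)) for bi in basis]
    delta, delta_x, delta_y = solve3(m, rhs)
    h = [g / (amp * cmath.exp(1j * (delta + delta_x * x + delta_y * y)))
         for g, x, y in zip(gains, xs, ys)]
    return amp, delta, delta_x, delta_y, h


ant_pos = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.5, 1.0), (1.5, 1.0)]
gains = [cmath.rect(a, p) for a, p in
         [(1.2, 0.30), (0.9, -0.10), (1.1, 0.25), (1.0, 0.05), (0.8, -0.20)]]

amp, delta, delta_x, delta_y, h = split_degeneracies(gains, ant_pos)
h_phases = [cmath.phase(hi) for hi in h]
mean_amp = sum(abs(hi) for hi in h) / len(h)
mean_phase = sum(h_phases) / len(h)
tilt_x = sum(x * p for (x, _), p in zip(ant_pos, h_phases))
tilt_y = sum(y * p for (_, y), p in zip(ant_pos, h_phases))
print(mean_amp, mean_phase, tilt_x, tilt_y)
```

The printed mean amplitude is 1 and the mean phase and both phase gradients are zero to rounding error, so the $h_i$ satisfy all four constraints.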

Non-degenerate formulations of the negative log-likelihoods from relative calibration are therefore given by 

 - Gaussian distribution:
 
$$ -\ln(\mathcal{L}^G_{\mathrm{constr}})(\nu) = \frac{1}{2} \sum_{\alpha} \sum_{\{i,j\}_{\alpha}} \ln(2 \pi \sigma_{ij}^2(\nu)) + \frac{ \left| V_{ij}^{\text{obs}} (\nu) - h_i (\nu) h_j^{*} (\nu) W_{\alpha} (\nu) \right|^2 }{\sigma_{ij}^2(\nu)} $$

 - Cauchy distribution:

$$ -\ln(\mathcal{L}^C_{\mathrm{constr}}) (\nu) = \sum_{\alpha} \sum_{\{i,j\}_{\alpha}} \ln(\pi \gamma_{ij} (\nu)) + \ln \left( 1 + \left( \frac{\left| V_{ij}^{\text{obs}} (\nu) - h_i (\nu) h_j^{*} (\nu) W_{\alpha} (\nu) \right|}{\gamma_{ij}(\nu)} \right)^2 \right) $$

where

$$ W_{\alpha} (\nu) = A^2(\nu) e^{i \left[ \Delta_{x} (\nu) x_{\alpha} + \Delta_{y} (\nu) y_{\alpha} \right]} U_{\alpha} $$

and $(x_{\alpha}, y_{\alpha})$ are the baseline coordinates of redundant set $\alpha$.
 
The overall phase is also degenerate and is set by requiring that the phase of the gain of a reference antenna is zero; it is an arbitrary convention with no physical significance:

$$ \mathrm{Arg} (h_\text{ref}) = 0 $$

In [None]:
ref_ant = 85 # to set the overall phase
ref_ant_idx = condenseMap(ants)[ref_ant]
ant_sep = red_ant_sep(RedG, hdraw.antpos)

res_opt = doOptCal(cRedG, obs_vis, no_ants, ant_pos_arr, ant_sep, res_rel_vis, \
                   distribution=noise_dist, ref_ant_idx=ref_ant_idx, logamp=False)

In [None]:
new_gain_params, new_deg_params = numpy.split(res_opt['x'], [no_ants*2,])
new_amps = new_gain_params[:no_ants*2:2]
new_phases = new_gain_params[1:no_ants*2:2]
new_gains = makeEArray(new_gain_params)

print('Degenerate parameters: {}'.format(str(new_deg_params)[1: -1]))
print('Amplitude mean: {}'.format(numpy.mean(new_amps)))
print('Phase mean: {}'.format(numpy.mean(new_phases)))

In [None]:
# Optimal residuals for optimal redundant step
opt_w_alpha = degVis(ant_sep, res_rel_vis, *new_deg_params[[0, 2, 3]])
pred_opt_vis = gVis(opt_w_alpha, cRedG, new_gains)
opt_residuals =  obs_vis - pred_opt_vis
cplot(opt_residuals, xlabel='Baseline', ylabel='Residual')

In [None]:
# Normalized amplitude residuals
fig, ax = plt.subplots(figsize=(12, 8))

ax.plot(norm_residuals(numpy.abs(obs_vis), numpy.abs(pred_opt_vis)))

ax.set_xlabel('Baseline')
ax.set_ylabel('Normalized residual')
ax.set_ylim((-1, 1))

fig.tight_layout()
plt.show()

In [None]:
# Phase residuals
diff_phases = numpy.angle(obs_vis) - numpy.angle(pred_opt_vis)
# wrap to the interval [-pi, pi)
diff_phases_wrapped = (diff_phases + numpy.pi) % (2 * numpy.pi) - numpy.pi

fig, ax = plt.subplots(figsize=(12, 8))

ax.plot(diff_phases, label='raw difference')
ax.plot(diff_phases_wrapped, label='wrapped')

ax.set_xlabel('Baseline')
ax.set_ylabel('Phase residual')
ax.legend(loc='best')

fig.tight_layout()
plt.show()

In [None]:
# Residuals normalized by amplitude
norm_opt_residuals = norm_residuals(obs_vis, pred_opt_vis)
cplot(norm_opt_residuals, xlabel='Baseline', ylabel='Residual', ylim=(-1, 1))
print('Median absolute normalized residual - Real: {}, Imag: {}'\
      .format(*abs_residuals(norm_opt_residuals)))

# Comparing relative calibrations

To check the stability of the true visibilities for each baseline set, we perform relative calibration on neighbouring datasets in time, frequency, or JD, and compare their visibility solutions. These should be consistent up to the degenerate parameters $A$, $\Delta_x$ and $\Delta_y$. Solving for these parameters within an MLE framework enables us to compare these solutions without having to first constrain their degeneracies, which can be computationally expensive.

To compare two datasets, we transform the visibility solutions $U_{\alpha}$ of the first dataset into $W_{\alpha}$, as defined above, and solve for the degenerate parameters that bring them closest to the visibility solutions $U_{\alpha}'$ of the second dataset by minimizing:

- Gaussian distribution:

$$ -\ln(\mathcal{L}^G_{\mathrm{deg}}) (\nu) = \frac{1}{2} \sum_{\alpha} \sum_{\{i,j\}_{\alpha}} \ln(2 \pi \sigma_{ij}^2(\nu)) + \frac{ \left| U_{\alpha}' (\nu) - W_{\alpha} (\nu) \right|^2 }{\sigma_{ij}^2(\nu)} $$

- Cauchy distribution:

$$ -\ln(\mathcal{L}^C_{\mathrm{deg}}) (\nu) = \sum_{\alpha} \sum_{\{i,j\}_{\alpha}} \ln(\pi \gamma_{ij} (\nu)) + \ln \left( 1 + \left( \frac{\left| U_{\alpha}' (\nu) - W_{\alpha} (\nu) \right|}{\gamma_{ij} (\nu)} \right)^2 \right) $$
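
To make the degenerate translation concrete, the sketch below recovers $A$, $\Delta_x$ and $\Delta_y$ between two made-up visibility sets. Note the swapped-in technique: instead of minimizing the negative log-likelihoods above, it log-linearizes the problem (mean log-amplitude ratio, then a linear least-squares fit to the phase differences), which only matches the Gaussian solution in the noise-free limit and assumes that phase offsets do not wrap. `deg_translate` is a hypothetical helper and not the notebook's `doDegVisVis`.

```python
import cmath
import math


def deg_translate(sep, u1, u2):
    """Estimate (A, Delta_x, Delta_y) with u2 ~ A^2 exp(i(Dx*x + Dy*y)) u1."""
    ratios = [b / a for a, b in zip(u1, u2)]
    # Amplitude: the mean of the log amplitude ratios gives ln(A^2)
    amp = math.exp(0.5 * sum(math.log(abs(r)) for r in ratios) / len(ratios))
    # Phase gradients: 2x2 normal equations of phase = Dx*x + Dy*y
    ph = [cmath.phase(r) for r in ratios]  # assumes offsets too small to wrap
    sxx = sum(x * x for x, _ in sep)
    sxy = sum(x * y for x, y in sep)
    syy = sum(y * y for _, y in sep)
    sxp = sum(x * p for (x, _), p in zip(sep, ph))
    syp = sum(y * p for (_, y), p in zip(sep, ph))
    det = sxx * syy - sxy ** 2
    return amp, (sxp * syy - syp * sxy) / det, (syp * sxx - sxp * sxy) / det


# Made-up baseline separations and first-dataset visibility solutions
sep = [(1.0, 0.0), (2.0, 0.0), (0.5, 1.0), (1.5, 1.0), (-0.5, 1.0)]
u1 = [2.0 + 1.0j, -0.5 + 1.5j, 0.3 - 0.7j, 1.2 + 0.4j, -0.8 - 0.6j]

true_amp, true_dx, true_dy = 1.1, 0.05, -0.08
u2 = [true_amp ** 2 * cmath.exp(1j * (true_dx * x + true_dy * y)) * u
      for (x, y), u in zip(sep, u1)]

amp, dx, dy = deg_translate(sep, u1, u2)
print(amp, dx, dy)
```

In this noise-free case the fitted parameters agree with the injected offsets to rounding error. With noisy or outlier-contaminated solutions, a direct minimization of the likelihoods above would be the more appropriate choice.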

In [None]:
hdraw2, RedG2, cMData2 = group_data(zen_fn2, pol, freq_channel, None, bad_ants, flags_fn2)
cData2 = cMData2.filled()
flags2 = cMData2.mask

if cData2.shape[0] > 1:
    cData2 = cData2[0, ...]
    flags2 = flags2[0, ...]
cData2 = numpy.squeeze(cData2)
flags2 = numpy.squeeze(flags2)  

ants2 = numpy.unique(RedG2[:, 1:])
no_ants2 = ants2.size
no_unq_bls2 = numpy.unique(RedG2[:, 0]).size
cRedG2 = relabelAnts(RedG2)

redg_eq = numpy.array_equal(RedG, RedG2)
print('Do the visibilities for JDs {} and {} have:\n'\
      'the same flags? {}\n'\
      'the same redundant grouping? {}'.format(JD, JD2, \
      numpy.array_equal(flags, flags2), redg_eq))

In [None]:
# Find time integration in dataset 2 that corresponds to closest LST to that of dataset 1
# This currently assumes that dataset 2 contains the correct time integration...
time_integration2 = find_nearest(hdraw2.lsts, hdraw.lsts[time_integration])[1]

In [None]:
plot_red_vis(cData2, RedG2, vis_type='amp')

In [None]:
# Relative calibration for the 2nd dataset
obs_vis2 = cData2[time_integration2, :]
if not redg_eq:
    initp = None
    phase_reg_initp = False
else:
    phase_reg_initp = True
res_rel2 = doRelCal(cRedG2, obs_vis2, no_unq_bls2, no_ants2, distribution=noise_dist, \
                    coords=rel_cal_coords, bounded=bounded_rel_cal, norm_gains=True, \
                    logamp=logamp, lovamp=lovamp, tilt_reg=tilt_reg, ant_pos_arr=ant_pos_arr, \
                    gphase_reg=gphase_reg, ref_ant_idx=rc_ref_ant_idx, max_nit=5000, initp=initp, \
                    phase_reg_initp=phase_reg_initp)
res_relx2 = numpy.array(res_rel2['x'])

if rel_cal_coords == 'polar' and rot_phase and (res_relx2[-2*no_ants2::2] < 0).any():
    print('Rotating gain and visibility phases to have positive amplitudes.')
    res_relx2 = rotate_phase(res_relx2, no_unq_bls2, norm_gains=True)

res_rel_vis2, res_rel_gains2 = split_rel_results(res_relx2, no_unq_bls2, coords=rel_cal_coords)

In [None]:
print('The negative log-likelihoods of the 1st and 2nd relative redundant calibrations '\
      'are:\n{} and\n{}'.format(res_rel['fun'], res_rel2['fun']))

In [None]:
# Gain amplitudes for 2nd relative calibration results
if rel_cal_coords == 'polar':
    gamps2 = res_relx2[-2*no_ants2::2]
elif rel_cal_coords == 'cartesian':
    gamps2 = numpy.abs(res_rel_gains2)
print(gamps2)

In [None]:
# Gain phases for 2nd relative calibration results
if rel_cal_coords == 'polar':
    gphases2 = res_relx2[-2*no_ants2+1::2]
elif rel_cal_coords == 'cartesian':
    gphases2 = numpy.angle(res_rel_gains2)
print(gphases2)

In [None]:
print('Gains - average amp: {}, product of amps: {}, average phase: {}'.format(gamps2.mean(), \
      gamps2.prod(), gphases2.mean()))

In [None]:
ant_pos_arr2 = flt_ant_pos(hdraw2.antpos, ants2)
tiltx2 = (gphases2 * ant_pos_arr2[:, 0]).sum()
tilty2 = (gphases2 * ant_pos_arr2[:, 1]).sum()
print('Tilt in x-coordinate: {}\nTilt in y-coordinate: {}'.format(tiltx2, tilty2))

In [None]:
# Visibility amplitudes from the 1st relative calibration
numpy.abs(res_rel_vis)

In [None]:
# Visibility amplitudes from the 2nd relative calibration
numpy.abs(res_rel_vis2)

In [None]:
# Visibility phases from the 1st relative calibration
numpy.angle(res_rel_vis)

In [None]:
# Visibility phases from the 2nd relative calibration
numpy.angle(res_rel_vis2)

In [None]:
# Translating between relatively calibrated visibility sets
res_deg = doDegVisVis(ant_sep, res_rel_vis, res_rel_vis2, \
                      distribution=noise_dist)
deg_tr_params = res_deg['x']

print('Degenerate parameters from degenerate fitting are: {}'.format(deg_tr_params))
print('The negative log-likelihood for this fitting is: {}'.format(res_deg['fun']))

In [None]:
# Residuals for degenerate comparison
deg_w_alpha = degVis(ant_sep, res_rel_vis, *deg_tr_params)
deg_residuals = res_rel_vis2 - deg_w_alpha
cplot(deg_residuals, xlabel='Redundant baseline type', ylabel='Residual')

In [None]:
# Degenerate residuals normalized by amplitude
norm_deg_residuals = norm_residuals(res_rel_vis2, deg_w_alpha)
cplot(norm_deg_residuals, xlabel='Redundant baseline type', ylabel='Residual')
print('Median absolute normalized residual - Real: {}, Imag: {}'\
      .format(*abs_residuals(norm_deg_residuals)))

## Comparing optimally calibrated solutions

In [None]:
ant_sep2 = red_ant_sep(RedG2, hdraw2.antpos)
res_opt2 = doOptCal(cRedG2, obs_vis2, no_ants2, ant_pos_arr2, ant_sep2, res_rel_vis2, \
                    distribution=noise_dist, ref_ant_idx=ref_ant_idx)

In [None]:
# 2nd optimally calibrated dataset
new_gain_params2, new_deg_params2 = numpy.split(res_opt2['x'], [no_ants2*2,])
new_amps2 = new_gain_params2[:no_ants2*2:2]
new_phases2 = new_gain_params2[1:no_ants2*2:2]
new_gains2 = makeEArray(new_gain_params2)

print('Degenerate parameters: {}'.format(str(new_deg_params2)[1: -1]))
print('Amplitude mean: {}'.format(numpy.mean(new_amps2)))
print('Phase mean: {}'.format(numpy.mean(new_phases2)))

In [None]:
_, new_deg_params2 = numpy.split(res_opt2['x'], [no_ants*2,])
opt_w_alpha2 = degVis(ant_sep2, res_rel_vis2, *new_deg_params2[[0, 2, 3]])

In [None]:
res_opt_deg = doDegVisVis(ant_sep, opt_w_alpha, opt_w_alpha2, \
                          distribution=noise_dist)
# use the result of the fit between the optimally calibrated solutions
deg_opt_tr_params = res_opt_deg['x']

print('Degenerate parameters from degenerate fitting are: {}'.format(deg_opt_tr_params))
print('The negative log-likelihood for this fitting is: {}'.format(res_opt_deg['fun']))

In [None]:
# Degenerate residuals normalized by amplitude
deg_opt_w_alpha = degVis(ant_sep, opt_w_alpha, *deg_opt_tr_params)
norm_deg_opt_residuals = norm_residuals(opt_w_alpha2, deg_opt_w_alpha)
cplot(norm_deg_opt_residuals, xlabel='Redundant baseline type', ylabel='Residual')
print('Median absolute normalized residual - Real: {}, Imag: {}'\
      .format(*abs_residuals(norm_deg_opt_residuals)))

# Redundant calibration across JDs

As an extension of the redundant calibration presented in §1, we wish to solve for all data across JDs simultaneously to find a single set of redundant visibility solutions for any LAST, which will give the best location and scale estimates for the redundant visibilities. This unified solver is advantageous for several reasons:

 - The total number of parameters to be solved, compared with solving separately for each JD, is reduced, since only a single set of redundant visibilities is solved for: we require $2 \times N_{\text{days}} \times N_{\text{ants}} + 2 \times N_{\text{unq_bls}}$ parameters instead of $2 \times N_{\text{days}} \times N_{\text{ants}} + 2 \times N_{\text{days}} \times N_{\text{unq_bls}}$.
 - We avoid having to perform the degenerate translation step that finds the degenerate parameter offsets between the redundant visibilities so that they can be compared (see §2), which is additional computation.
 - The statistics of statistics of non-Gaussian distributions can be meaningless (e.g. the median of the median $\neq$ the median of the whole dataset). By considering the entire dataset, we can obtain the best location and scale parameter estimates that fully encapsulate the data. This is especially relevant when dealing with robust distributions, such as the Cauchy distribution.
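
The first and third points can be illustrated with a few lines of arithmetic. The parameter counts below use the full-core numbers quoted earlier (331 antennas, 630 unique baselines) for an assumed 2 days of data, and the two "days" of values in the median example are made up. The function names are hypothetical.

```python
from statistics import median


def n_params_per_day(n_days, n_ants, n_unq_bls):
    """Separate calibration per day: gains and visibilities for every day."""
    return 2 * n_days * n_ants + 2 * n_days * n_unq_bls


def n_params_across_days(n_days, n_ants, n_unq_bls):
    """Joint calibration: gains for every day, one shared visibility set."""
    return 2 * n_days * n_ants + 2 * n_unq_bls


separate = n_params_per_day(2, 331, 630)
joint = n_params_across_days(2, 331, 630)
print(separate, joint)

# Median of per-day medians versus median of the pooled data (made-up numbers)
day1 = [1, 2, 3]
day2 = [4, 5, 100, 101, 102]
median_of_medians = median([median(day1), median(day2)])
pooled_median = median(day1 + day2)
print(median_of_medians, pooled_median)
```

For these inputs the joint solve needs 2584 real parameters instead of 3844, and the median of the daily medians (51) is far from the median of the pooled data (4.5).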
 
The solved gains and redundant visibilities can be found by minimizing the following negative log-likelihood functions:

 - Gaussian uncorrelated noise with variance $\sigma_{ij}^2$, which is the expected noise from the receivers and the sky

$$ -\ln(\mathcal{L}^G_{\mathrm{xd\_rel}})(\nu) = \frac{1}{2} \sum_{d \in D} \sum_{\alpha} \sum_{\{i,j\}_{\alpha}} \ln(2 \pi \sigma_{ij, d}^2(\nu)) + \frac{ \left| V_{ij, d}^{\text{obs}} (\nu) - g_{i, d} (\nu) g_{j, d}^{*} (\nu) U_{\alpha}(\nu) \right|^2}{\sigma_{ij, d}^2(\nu)} $$

 - Cauchy distribution assumed for the noise

$$ -\ln(\mathcal{L}^C_{\mathrm{xd\_rel}}) (\nu) = \sum_{d \in D} \sum_{\alpha} \sum_{\{i,j\}_{\alpha}} \ln(\pi \gamma_{ij, d} (\nu)) + \ln \left( 1 + \left( \frac{\left| V_{ij, d}^{\text{obs}} (\nu) - g_{i, d} (\nu) g_{j, d}^{*} (\nu) U_{\alpha}(\nu) \right|}{\gamma_{ij, d}(\nu)} \right)^2 \right) $$

where there is now an added sum over the set of JDs $D$, with individual days indexed by $d$.

The MLE with Cauchy-distributed noise fully encapsulates the distribution of the data, without being distorted by outliers, and is the best median estimator of the data.
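
The structure of the across-days likelihood, with per-day gains and a single shared set of redundant visibilities, can be sketched as follows. This is a toy illustration rather than the notebook's `doRelCalD`: `xd_cauchy_nll` is a hypothetical helper, the array, gains and visibilities are made up, and a single scale parameter $\gamma = 1$ is assumed for all baselines and days.

```python
import cmath
import math


def xd_cauchy_nll(obs, red_groups, gains, u, gamma=1.0):
    """Cauchy negative log-likelihood summed over days and baselines.

    obs[d][k] is the visibility on day d for row k of red_groups, gains[d][i]
    is the gain of antenna i on day d, and u[alpha] is shared by all days.
    """
    nll = 0.0
    for obs_d, gains_d in zip(obs, gains):
        for v_obs, (alpha, i, j) in zip(obs_d, red_groups):
            model = gains_d[i] * gains_d[j].conjugate() * u[alpha]
            nll += math.log(math.pi * gamma)
            nll += math.log(1.0 + (abs(v_obs - model) / gamma) ** 2)
    return nll


# Toy 3-antenna line array observed on 2 days with different gains
red_groups = [(0, 0, 1), (0, 1, 2), (1, 0, 2)]
u = [2.0 + 1.0j, -0.5 + 1.5j]
gains = [[cmath.rect(1.0, 0.0), cmath.rect(1.1, 0.2), cmath.rect(0.9, -0.1)],
         [cmath.rect(1.05, 0.1), cmath.rect(0.95, -0.3), cmath.rect(1.0, 0.2)]]
obs = [[g[i] * g[j].conjugate() * u[a] for a, i, j in red_groups] for g in gains]

nll_truth = xd_cauchy_nll(obs, red_groups, gains, u)
# Corrupt one visibility on day 2 to mimic an outlier
obs[1][0] += 100.0
nll_outlier = xd_cauchy_nll(obs, red_groups, gains, u)
print(nll_truth, nll_outlier)
```

With noise-free data each of the six terms contributes $\ln(\pi\gamma)$, and a residual of 100 on one visibility adds only $\ln(1 + 100^2)$ to the total, which shows how weakly a single outlier pulls on the Cauchy likelihood.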

In [None]:
_, _, xd_cdata, xd_cndata = XDgroup_data(JD, [int(JD), JD_comp], pol, chans=freq_channel, \
    tints=time_integration, bad_ants=True, use_flags='first', noise=True)

xd_cdata = numpy.squeeze(xd_cdata.data)
no_days = xd_cdata.shape[0]

In [None]:
xd_res_rel = doRelCalD(cRedG, xd_cdata, no_unq_bls, no_ants, \
                       distribution=noise_dist, noise=None, initp=None, \
                       return_initp=False, xd=True)

In [None]:
xd_res_rel_vis, xd_res_rel_gains = split_rel_results(xd_res_rel['x'], no_unq_bls, \
                                                     coords='cartesian')
xd_res_rel_gains = xd_res_rel_gains.reshape(no_days, -1)
xd_res_rel_vis = numpy.tile(xd_res_rel_vis, no_days).reshape((no_days, -1))

In [None]:
# Residuals with observed raw data
pred_xd_rel_vis = XDgVis(xd_res_rel_vis, cRedG, xd_res_rel_gains)
xd_norm_rel_residuals = norm_residuals(xd_cdata, pred_xd_rel_vis)

cplot(xd_norm_rel_residuals.transpose(), xlabel='Baseline', ylabel='Residual', alpha=0.5)

In [None]:
# Residuals with individually solved redundant calibration
# amplitude only since phases are not degenerately consistent
res_day1 = norm_residuals(numpy.abs(xd_res_rel_vis[0, :]), numpy.abs(res_rel_vis))
res_day2 = norm_residuals(numpy.abs(xd_res_rel_vis[1, :]), numpy.abs(res_rel_vis2))

fig, ax = plt.subplots(figsize=(12, 8))

ax.plot(res_day1, label=int(JD))
ax.plot(res_day2, label=JD_comp)

ax.set_ylabel('Normalized residual of amplitudes')
ax.set_xlabel('Baseline')
ax.legend(loc='best')

fig.tight_layout()
plt.show()