# HERA Data Part II: Calibration Exploration Solutions

**CHAMP Bootcamp**
<br>
**June 10, 2022**
<br>

In this demo we will explore HERA data, calibration solutions, and how to apply them to the data.

In [14]:
%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np
from hera_cal.io import HERAData, HERACal
from pyuvdata import utils as uvutils, UVData, UVCal
import uvtools.plot as plotter

## 1) Load the un-calibrated data into a `HERAData` object

The data file is at `/users/zxu/lustre/champ_data/HERADataPartII/zen.2458116.24482.xx.HH.OCRU.uvh5`

In [15]:
# load the raw data: this may take up to ~15 seconds...
hd = HERAData("/users/zxu/lustre/champ_data/HERADataPartII" \
              "/zen.2458116.24482.xx.HH.OCRU.uvh5")
data, flags, _ = hd.read()

Fixing auto-correlations to be be real-only, after some imaginary values were detected in data_array.


In [16]:
# recall DataContainer is a pseudo-dictionary: print its keys
print(data.keys())

odict_keys([(0, 0, 'ee'), (0, 1, 'ee'), (0, 2, 'ee'), (0, 11, 'ee'), (0, 12, 'ee'), (0, 13, 'ee'), (0, 14, 'ee'), (0, 23, 'ee'), (0, 24, 'ee'), (0, 25, 'ee'), (0, 26, 'ee'), (0, 27, 'ee'), (0, 36, 'ee'), (0, 37, 'ee'), (0, 38, 'ee'), (0, 39, 'ee'), (0, 40, 'ee'), (0, 41, 'ee'), (0, 50, 'ee'), (0, 51, 'ee'), (0, 52, 'ee'), (0, 53, 'ee'), (0, 54, 'ee'), (0, 55, 'ee'), (0, 65, 'ee'), (0, 66, 'ee'), (0, 67, 'ee'), (0, 68, 'ee'), (0, 69, 'ee'), (0, 70, 'ee'), (0, 71, 'ee'), (0, 82, 'ee'), (0, 83, 'ee'), (0, 84, 'ee'), (0, 85, 'ee'), (0, 86, 'ee'), (0, 87, 'ee'), (0, 88, 'ee'), (0, 98, 'ee'), (0, 120, 'ee'), (0, 121, 'ee'), (0, 122, 'ee'), (0, 123, 'ee'), (0, 124, 'ee'), (0, 136, 'ee'), (0, 137, 'ee'), (0, 138, 'ee'), (0, 139, 'ee'), (0, 140, 'ee'), (0, 141, 'ee'), (0, 142, 'ee'), (0, 143, 'ee'), (1, 1, 'ee'), (1, 2, 'ee'), (1, 11, 'ee'), (1, 12, 'ee'), (1, 13, 'ee'), (1, 14, 'ee'), (1, 23, 'ee'), (1, 24, 'ee'), (1, 25, 'ee'), (1, 26, 'ee'), (1, 27, 'ee'), (1, 36, 'ee'), (1, 37, 'ee'), (1, 3
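To make the pseudo-dictionary idea concrete, here is a minimal sketch that uses a plain Python `dict` filled with synthetic arrays rather than a real `DataContainer`. The antenna numbers, the array shape, and the random values are made up for illustration; only the `(antenna 1, antenna 2, polarization)` key layout is taken from the output above.

```python
import numpy as np

# Synthetic stand-in for a DataContainer: a plain dict keyed by
# (antenna 1, antenna 2, polarization) tuples. Shapes and values are made up.
rng = np.random.default_rng(0)
n_times, n_freqs = 4, 8
toy_data = {
    (ant1, ant2, 'ee'): rng.normal(size=(n_times, n_freqs))
    + 1j * rng.normal(size=(n_times, n_freqs))
    for ant1, ant2 in [(0, 0), (0, 1), (1, 1)]
}

# auto-correlations pair an antenna with itself; cross-correlations do not
autos = [key for key in toy_data if key[0] == key[1]]
crosses = [key for key in toy_data if key[0] != key[1]]
print(autos)
print(crosses)

# each value is a 2D complex array of shape (Ntimes, Nfrequencies)
print(toy_data[(0, 1, 'ee')].shape)
```

The same key-filtering pattern should carry over to `data.keys()`, assuming the real container iterates over its keys like a dictionary.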

In [17]:
# make a waterfall plot of data amplitude between antenna 24 & antenna 25
plt.figure(figsize=(9, 5))
baseline = (24, 25, 'ee')
plotter.waterfall(np.abs(data[baseline]), mode='real', mx=.05)
plt.xlabel('frequency channel' , fontsize=14)
plt.ylabel('times' , fontsize=14)
plt.colorbar(label='amplitude')
plt.title("data amplitude for 24 -- 25")

Text(0.5, 1.0, 'data amplitude for 24 -- 25')

In [18]:
# this time, make a waterfall plot of *visibility phase* between antenna 24 & antenna 25
plt.figure(figsize=(9, 5))
baseline = (24, 25, 'ee')
plotter.waterfall(np.angle(data[baseline]), mode='real')
plt.xlabel('frequency channel' , fontsize=14)
plt.ylabel('times' , fontsize=14)
plt.colorbar(label='phase')
plt.title("data phase for 24 -- 25")

Text(0.5, 1.0, 'data phase for 24 -- 25')
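The two waterfalls above rely on `np.abs` for amplitude and `np.angle` for phase. As a quick sanity check of what those functions return, the sketch below builds a single synthetic complex number with a known amplitude and phase (both values are made up) and recovers them.

```python
import numpy as np

# a synthetic complex "visibility" with known amplitude and phase (made-up values)
amplitude = 0.03
phase = np.pi / 3
vis = amplitude * np.exp(1j * phase)

# np.abs recovers the amplitude; np.angle recovers the phase in radians
recovered_amp = np.abs(vis)
recovered_phase = np.angle(vis)
print(recovered_amp, recovered_phase)

# phases wrap: adding 2*pi to the phase gives the same complex number,
# so np.angle returns the same value again
wrapped = np.angle(amplitude * np.exp(1j * (phase + 2 * np.pi)))
print(wrapped)
```

Because of this wrapping, sharp jumps between roughly $-\pi$ and $+\pi$ in a phase waterfall are usually a plotting artifact rather than a physical discontinuity.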

## start breakout for (2) & (3)

I have created a template below, but you need to fill in the details!

## 2) Load gain solutions from a `calfits` file into a `HERACal` object

Use the `zen.2458116.24482.xx.HH.uv.abs.calfits` file.

In [19]:
# load the gain solutions: gains is a dictionary keyed by (antenna number, polarization) tuples! Inspect it!
hc = HERACal("/users/zxu/lustre/champ_data/HERADataPartII"
             "/zen.2458116.24482.xx.HH.uv.abs.calfits")
gains, gflags, _, _ = hc.read()

telescope_location is not set. Using known values for HERA.
antenna_positions is not set. Using known values for HERA.


In [20]:
# print the gains keys
print(gains.keys())

odict_keys([(1, 'Jee'), (11, 'Jee'), (12, 'Jee'), (13, 'Jee'), (14, 'Jee'), (23, 'Jee'), (24, 'Jee'), (25, 'Jee'), (26, 'Jee'), (27, 'Jee'), (36, 'Jee'), (37, 'Jee'), (38, 'Jee'), (39, 'Jee'), (40, 'Jee'), (41, 'Jee'), (51, 'Jee'), (52, 'Jee'), (53, 'Jee'), (54, 'Jee'), (55, 'Jee'), (65, 'Jee'), (66, 'Jee'), (67, 'Jee'), (68, 'Jee'), (69, 'Jee'), (70, 'Jee'), (71, 'Jee'), (82, 'Jee'), (83, 'Jee'), (84, 'Jee'), (85, 'Jee'), (86, 'Jee'), (87, 'Jee'), (88, 'Jee'), (120, 'Jee'), (121, 'Jee'), (122, 'Jee'), (123, 'Jee'), (124, 'Jee'), (137, 'Jee'), (138, 'Jee'), (139, 'Jee'), (140, 'Jee'), (141, 'Jee'), (142, 'Jee'), (143, 'Jee')])


In [21]:
# make a waterfall plot of antenna 24 gain amplitude for "Jee" polarization
plt.figure(figsize=(9, 5))
antenna = (24, 'Jee')
# you need to input the gain array from the "gains" dictionary into the plotter
plotter.waterfall(np.abs(gains[antenna]) , mode='real', mx=.05)
plt.xlabel('frequency channel' , fontsize=14)
plt.ylabel('times' , fontsize=14)
plt.colorbar(label='amplitude')
plt.title("calibration amplitude antenna 24")

Text(0.5, 1.0, 'calibration amplitude antenna 24')

In [22]:
# given that gains[key] returns a 2D array of shape (Ntimes, Nfrequencies),
# how do I slice the array to get all times for just frequency channel 512?
gains[(24, 'Jee')][:, 512]

array([-0.02192443-0.01993183j, -0.01992052-0.01993538j,
       -0.02236155-0.01927051j, -0.02057734-0.02012827j,
       -0.01960187-0.02076929j, -0.01987572-0.01829819j,
       -0.02080974-0.02063646j, -0.01879953-0.02061766j,
       -0.01966285-0.0203867j , -0.0197662 -0.0214253j ,
       -0.02286029-0.01944869j, -0.02011409-0.01925742j,
       -0.01976778-0.02009801j, -0.02165381-0.01902633j,
       -0.02131193-0.01912317j, -0.01932811-0.0218056j ,
       -0.01934283-0.02074288j, -0.01864468-0.02064276j,
       -0.02077   -0.01941248j, -0.01829339-0.0210166j ,
       -0.01997005-0.01962966j, -0.01901973-0.02008016j,
       -0.02039805-0.01901028j, -0.01920239-0.01999248j,
       -0.02103276-0.01781506j, -0.02078491-0.01981796j,
       -0.02031302-0.02040905j, -0.02066882-0.01848871j,
       -0.02092815-0.0189808j , -0.02074181-0.02012991j,
       -0.02061887-0.02045363j, -0.02089133-0.01864762j,
       -0.02068172-0.01985409j, -0.02003721-0.0199572j ,
       -0.0225697 -0.01831683j,
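The slicing convention is easier to verify on a small array whose values encode their own indices. The sketch below is a toy example, assuming only the `(Ntimes, Nfrequencies)` axis order stated in the cell above; the array size and values are made up.

```python
import numpy as np

n_times, n_freqs = 3, 5
# toy array whose value encodes its own index: 10 * time_index + freq_index
toy = 10 * np.arange(n_times)[:, None] + np.arange(n_freqs)[None, :]

# first axis is time, second axis is frequency
all_times_one_channel = toy[:, 2]   # every time, frequency channel 2
one_time_all_channels = toy[1, :]   # time index 1, every frequency channel

print(all_times_one_channel.shape, all_times_one_channel)
print(one_time_all_channels.shape, one_time_all_channels)
```

Here `toy[:, 2]` holds the values 2, 12, 22 (one per time), while `toy[1, :]` holds 10 through 14 (one per channel).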

In [23]:
# make a line plot of all antenna gain amplitudes in one plot, and in another plot gain phases
# at frequency channel 512
# Hint 1: use a FOR loop over hc.ant_array

fig = plt.figure(figsize=(9, 5))

# this plot below will be for phase: recall, how do you convert the array to phase?
plt.subplot(2, 1, 2)
for ant in hc.ant_array:
    # plot antenna phases here
    plt.plot(np.angle(gains[(ant, 'Jee')])[:, 512])

plt.xlabel('time')
plt.ylabel('phase')
    
# this plot below will be for amplitude: recall, how do you convert the array to amplitude?
plt.subplot(2, 1, 1)
for ant in hc.ant_array:
    # plot antenna amplitudes here
    plt.plot(np.abs(gains[(ant, 'Jee')])[:, 512])

plt.ylabel("amplitude")
plt.legend(hc.ant_array, ncol=2, borderaxespad=-3, loc=0, fontsize=8)

<matplotlib.legend.Legend at 0x2b2a3057a590>

How much does the gain amplitude vary over the course of the file?

Do certain antennas look different than others? Which ones? Hint: Try plotting just a few antennas to better distinguish them.
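One possible way to put a number on the first question is the fractional scatter of the gain amplitude over time (standard deviation divided by mean) for each antenna. The sketch below is not part of the original exercise: it uses a synthetic dictionary `toy_gains` with made-up shapes, gain values, and noise levels, where antenna 2 is deliberately noisier. With the real data you would substitute `gains` and channel 512.

```python
import numpy as np

rng = np.random.default_rng(1)
n_times, n_freqs, chan = 60, 16, 8

# synthetic gains: a constant complex value plus small noise (all values made up);
# antenna 2 gets ten times more noise so that it stands out
toy_gains = {}
for ant, noise in [(0, 0.001), (1, 0.001), (2, 0.01)]:
    g = 0.03 * np.exp(1j * 0.5) * np.ones((n_times, n_freqs))
    g = g + noise * (rng.normal(size=g.shape) + 1j * rng.normal(size=g.shape))
    toy_gains[(ant, 'Jee')] = g

# fractional amplitude scatter over time at one frequency channel
frac_var = {}
for key, g in toy_gains.items():
    amp = np.abs(g[:, chan])
    frac_var[key] = np.std(amp) / np.mean(amp)
    print(key, f"{frac_var[key]:.3f}")
```

Sorting antennas by this number is a quick way to pick out a few to plot on their own, as the hint above suggests.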

### next plot the antenna bandpass spectrum (i.e. as a function of frequency) at time bin index 30

In [24]:
# make a line plot of all antenna gain amplitudes in one plot at time index 30
# Hint 1: use a FOR loop over hc.ant_array
# Hint 2: use plt.ylim(min, max) to change the y-axis range to focus on the broadband structure

fig = plt.figure(figsize=(9, 5))

for ant in hc.ant_array:
    # plot antenna amplitudes here
    plt.plot(np.abs(gains[(ant, 'Jee')])[30, :])
    
plt.ylim(0, .075) # set a min and max value for the y axis to focus on the broadband structure
plt.xlabel('frequency channel')
plt.ylabel('amplitude')
plt.legend(hc.ant_array, ncol=2, borderaxespad=-3, loc=1, fontsize=10)

<matplotlib.legend.Legend at 0x2b2a305799f0>

What do you see in the gain amplitude across frequency? What might cause these features?

## 3) Load the model visibility

Use the `zen.2458116.24482.xx.HH.XRS2.uvh5` file.

In [25]:
# load the model: this may take up to ~15 seconds
hd2 = HERAData('/users/zxu/lustre/champ_data/HERADataPartII'
               '/zen.2458116.24482.xx.HH.XRS2.uvh5')
model, _, _ = hd2.read(bls=[(24, 25)])

In [26]:
# make a waterfall plot of model amplitude between antenna 24 & antenna 25
# use previous visibility waterfall plot as a template!
plt.figure(figsize=(9, 5))
baseline = (24, 25, 'ee')
plotter.waterfall(np.abs(model[baseline]), mode='real', mx=30)
plt.xlabel('frequency channel' , fontsize=14)
plt.ylabel('times' , fontsize=14)
plt.colorbar(label='amplitude')
plt.title("model amplitude for 24 -- 25")

Text(0.5, 1.0, 'model amplitude for 24 -- 25')

What is different about the model visibility compared to the original data visibility? Compare both the overall amplitude scale and the structure across time and frequency.

## 4) Apply the calibration solution to the data

Recall that for a visibility between antenna i and antenna j, the calibration equation reads:

\begin{align}
\Large V_{ij}^{\rm data} = g_ig_j^\ast V_{ij}^{\rm model}
\end{align}

so dividing the data by the gain product gives a calibrated visibility that should resemble the model:

\begin{align}
\Large V_{ij}^{\rm updated\ data} = V_{ij}^{\rm data} / (g_i g_j^\ast)
\end{align}

__Note:__ the conjugation $^\ast$ operation can be done using `np.conj(...)`
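The two equations can be checked numerically before touching the real data. The sketch below builds a synthetic model and synthetic gains (all shapes and values are made up), corrupts the model with $g_i g_j^\ast$, and then divides the gains back out. It also shows what happens if the conjugate on $g_j$ is forgotten.

```python
import numpy as np

rng = np.random.default_rng(2)
shape = (4, 8)  # (Ntimes, Nfreqs); a made-up size

def random_complex(shape):
    """Return complex Gaussian noise of the given shape."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

# synthetic model visibility and per-antenna gains (values are illustrative only)
V_model = random_complex(shape)
g_i = 0.03 * np.exp(1j * 0.4) + 0.001 * random_complex(shape)
g_j = 0.02 * np.exp(-1j * 1.1) + 0.001 * random_complex(shape)

# forward equation: V_data = g_i * conj(g_j) * V_model
V_data = g_i * np.conj(g_j) * V_model

# inverse: divide out the same gain product to recover the model
V_calibrated = V_data / (g_i * np.conj(g_j))
recovered = np.allclose(V_calibrated, V_model)

# forgetting the conjugate keeps the amplitude right but gets the phase wrong
V_wrong = V_data / (g_i * g_j)
amp_ok = np.allclose(np.abs(V_wrong), np.abs(V_model))
phase_ok = np.allclose(V_wrong, V_model)
print(recovered, amp_ok, phase_ok)
```

In this noiseless toy case the recovery is exact; real data contain noise and imperfect gain solutions, so the calibrated data will only approximate the model.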

In [30]:
# apply calibration from the gains dictionary to data[(24, 25, 'ee')]
V_data = data[(24, 25, 'ee')]
g_24 = gains[(24, 'Jee')]
g_25 = gains[(25, 'Jee')]
V_update = V_data / (g_24 * np.conj(g_25))
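The cell above ignores the `flags` and `gflags` arrays that were loaded alongside the data and gains. A sample is only trustworthy if neither the data nor either antenna gain is flagged. The sketch below shows one possible way to combine them, using synthetic arrays only. It assumes that flags are boolean arrays with `True` marking bad samples, and it pretends that a flagged gain is zero; neither the values nor the handling is taken from the actual HERA files.

```python
import numpy as np

n_times, n_freqs = 3, 6
# synthetic data and gains (values are made up)
V_data = np.full((n_times, n_freqs), 0.02 + 0.01j)
g_i = np.full((n_times, n_freqs), 0.03 + 0.0j)
g_j = np.full((n_times, n_freqs), 0.0 + 0.02j)

# boolean flag arrays, True = bad sample (assumed convention)
data_flags = np.zeros((n_times, n_freqs), dtype=bool)
flags_i = np.zeros_like(data_flags)
flags_j = np.zeros_like(data_flags)
data_flags[0, 1] = True   # one bad data sample
flags_i[:, 4] = True      # one bad channel in antenna i's gain
g_i[:, 4] = 0.0           # pretend the flagged gain is zero

# a sample is bad if the data OR either gain is flagged
combined = data_flags | flags_i | flags_j

# calibrate only the good samples; leave the rest as NaN
V_cal = np.full(V_data.shape, np.nan + 0j)
good = ~combined
V_cal[good] = V_data[good] / (g_i[good] * np.conj(g_j[good]))
print(combined.sum(), np.isnan(V_cal).sum())
```

Restricting the division to unflagged samples avoids divide-by-zero warnings, and the NaN entries make the excluded samples easy to spot downstream.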

In [31]:
# make a waterfall plot of updated (calibrated) data amplitude between antenna 24 & antenna 25
# use the model amplitude waterfall plot as a template!
plt.figure(figsize=(9, 5))
baseline = (24, 25, 'ee')
plotter.waterfall(np.abs(V_update), mode='real', mx=30)
plt.xlabel('frequency channel' , fontsize=14)
plt.ylabel('times' , fontsize=14)
plt.colorbar(label='amplitude')
plt.title("updated data amplitude for 24 -- 25")

Text(0.5, 1.0, 'updated data amplitude for 24 -- 25')

In [32]:
# plot waterfall of model and updated data phase side-by-side
plt.figure(figsize=(9, 5))

# plot the updated data phase
plt.subplot(1, 2, 1)
plotter.waterfall(np.angle(V_update), mode='real')
plt.xlabel('frequencies')
plt.ylabel('times')
plt.title("updated data phase [rad]")
plt.colorbar()

# plot the model phase
plt.subplot(1, 2, 2)
plotter.waterfall(np.angle(model[(24, 25, 'ee')]), mode='real')
plt.xlabel('frequencies')
plt.ylabel('times')
plt.title("model phase [rad]")
plt.colorbar()

<matplotlib.colorbar.Colorbar at 0x2b2a316c0ca0>
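Comparing two phase waterfalls by eye is hard, so one optional follow-up is to compute the phase difference directly. The sketch below uses synthetic arrays (the names `calibrated` and `toy_model`, the shapes, and the injected phase error are made up) and multiplies by the conjugate of the model, which keeps the difference wrapped to the range of `np.angle`.

```python
import numpy as np

rng = np.random.default_rng(3)
shape = (5, 10)  # (Ntimes, Nfreqs); a made-up size

# unit-amplitude toy model with random phases
toy_model = np.exp(1j * rng.uniform(-np.pi, np.pi, size=shape))

# toy "calibrated data": the model with a small injected phase error (radians)
phase_error = 0.05 * rng.normal(size=shape)
calibrated = toy_model * np.exp(1j * phase_error)

# multiplying by the conjugate subtracts the phases and keeps the result wrapped
residual_phase = np.angle(calibrated * np.conj(toy_model))
print(np.max(np.abs(residual_phase)))

# subtracting the two angles directly can jump by about 2*pi near the wrap
naive = np.angle(calibrated) - np.angle(toy_model)
print(np.max(np.abs(naive)))
```

With the real arrays, the equivalent expression would be `np.angle(V_update * np.conj(model[(24, 25, 'ee')]))`, which could be passed to `plotter.waterfall` like the other phase plots; residuals near zero indicate that calibration brought the data phase into agreement with the model.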

That's the end of the calibration demo! Hopefully you've gained some intuition for how calibration is done and applied to the data. Next we'll look to image the data after applying calibration!