# **SSINS Tutorial**
Notebook by: Imani Ware 

Tutorial by: Michael J. Wilensky

June 2019
>Also check the UVData documentation at https://pyuvdata.readthedocs.io/en/latest/

### SS Tutorial

**SS:** Initializing an SS object and reading in raw data

In [1]:
from SSINS import SS

# An SS object has all the attributes of a UVData object
ss = SS()

# Visibilities are differenced in time (not in frequency) within the read function
ss.read('SSINS/data/1061313128_99bl_1pol_half_time.uvfits', ant_str = 'cross')
# Passing diff=False defers the differencing until later, but this is not recommended for most situations

# Checks that required parameters exist. Checks that parameters have
# appropriate shapes and optionally that the values are acceptable.
ss.check()

True
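
The time differencing that happens on read can be pictured with plain NumPy. The sketch below does not use SSINS; the array shape `(times, baselines, freqs)`, the toy dimensions, and the noise level are all assumptions chosen for illustration. It shows that a signal that is constant in time cancels when adjacent integrations are subtracted, leaving only the noise-like part.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: 6 integrations, 3 baselines, 4 frequency channels
n_times, n_bls, n_freqs = 6, 3, 4

# A "sky" signal that is constant in time, plus small complex noise per integration
sky = rng.normal(size=(n_bls, n_freqs)) + 1j * rng.normal(size=(n_bls, n_freqs))
noise = 0.01 * (rng.normal(size=(n_times, n_bls, n_freqs))
                + 1j * rng.normal(size=(n_times, n_bls, n_freqs)))
vis = sky[np.newaxis] + noise

# Difference adjacent integrations along the time axis (axis 0)
vis_diff = np.diff(vis, axis=0)

# One fewer sample along the time axis after differencing
print(vis.shape, vis_diff.shape)
```

Note that differencing shortens the time axis by one, which is why sky-subtracted arrays have one fewer time sample than the raw data in this picture.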

**SS:** Passing keyword arguments to SS.read

In [2]:
import numpy as np
inpath = 'SSINS/data/1061313128_99bl_1pol_half_time.uvfits'

# Start from a fresh SS object so that data_array is empty
ss = SS()

# Read only the metadata first, so that we can pick the elements of the time_array that we want to look at
ss.read(inpath, read_data=False)

# Take the unique values of the time_array attribute (metadata) and drop the first and last integrations
times = np.unique(ss.time_array)[1:-1]

# With read_data=True the data (not just the metadata) are read, restricted to the selected times, i.e. the second through second-to-last unique integrations
ss.read(inpath, read_data=True, times=times)
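
The slicing used to pick the times can be checked on a small stand-in array. This is a minimal sketch with a hypothetical `time_array` (four integrations, each repeated for three baselines); the values are made up and no file is read.

```python
import numpy as np

# Hypothetical time_array: 4 integrations, each repeated once per baseline (3 baselines)
time_array = np.repeat([10.0, 10.5, 11.0, 11.5], 3)

# np.unique returns the sorted unique integration times
unique_times = np.unique(time_array)

# [1:-1] drops the first and last integrations
times = unique_times[1:-1]

print(unique_times)
print(times)
```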

**SS:** Applying flags

In [3]:
# SS.data_array is a numpy masked array. To "apply flags" is to change the mask of the data_array.
# The proper way to apply flags to the sky-subtracted data is to use the apply_flags method

# To apply the original flags in the raw data file, make the following call
ss.apply_flags(flag_choice='original')

# Note that the original flags are always stored in the flag_array attribute
# The flag_choice keyword is stored in the flag_choice attribute
print(ss.flag_choice)

original


In [4]:
# You can apply flags from a custom flag array that has the same shape as the data
# Start from a boolean array with nothing flagged
custom = np.zeros_like(ss.flag_array, dtype=bool) 

# flagging only the first frequency channel 
custom[:, 0, 0, :] = True

# Apply the custom flags, which flag everything in the zeroth (first) frequency channel
ss.apply_flags(flag_choice='custom', custom=custom)
print(ss.flag_choice)

custom


In [5]:
# Unflag the data by setting flag_choice=None (note this is actually the default!!)
ss.apply_flags(flag_choice=None)

# Check if anything is flagged, for demonstration purposes
print(np.any(ss.data_array.mask))

False
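
Since "applying flags" amounts to changing the mask of a NumPy masked array, the effect can be illustrated without SSINS. The sketch below uses a small hypothetical 2D array rather than a real `data_array`; it shows that masked samples are excluded from statistics and that clearing the mask restores them.

```python
import numpy as np

# Hypothetical stand-in for a data array: 3 times x 4 channels
data = np.ma.masked_array(np.arange(12, dtype=float).reshape(3, 4))

# "Apply" a custom flag array by assigning it to the mask (True means flagged)
custom = np.zeros(data.shape, dtype=bool)
custom[:, 0] = True            # flag the first channel at all times
data.mask = custom
flagged_mean = data.mean()     # masked entries are excluded from the mean

# "Unflag" by clearing the mask
data.mask = np.zeros(data.shape, dtype=bool)
unflagged_mean = data.mean()

print(flagged_mean, unflagged_mean)
```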


**SS:** Plotting using Catalog_Plot

In [6]:
from SSINS import Catalog_Plot as cp
# Standard-library module for interacting with the operating system (used here to check file paths)
import os

# The Catalog_Plot library contains wrappers around plot_lib functions for basic plotting needs
# See the documentation: https://ssins.readthedocs.io/en/latest/Catalog_Plot.html
# Each function in Catalog_Plot requires a class instance and a filename prefix as arguments (the wrapper appends a suffix)
# Any information that uniquely identifies the plot should be specified in the prefix
prefix = 'tutorial_outputs/tutorial_'   # prefix for the output files; each wrapper appends its own tag

# To make a Histogram of the Visibility Differences (a VDH, figure 1 of paper), and save it as a pdf, do the following
# This also plots a fit estimated from the data
# VDH_plot function is from Catalog_Plot
cp.VDH_plot(ss, prefix, file_ext='pdf', post_flag=False, xlabel="Visibility Differences", legend=True)

# Check to see that the file exists
# The %s placeholder in the string is replaced by the value that follows the % operator
print(os.path.exists('%s_VDH.pdf' % (prefix))) 


No handles with labels found to put in legend.


True
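
The filename check relies on old-style `%` string formatting. A short sketch of what the expression evaluates to, using the same `prefix` string as above (the f-string form is shown only as an equivalent alternative):

```python
prefix = 'tutorial_outputs/tutorial_'

# Old-style formatting: %s is replaced by the value after the % operator
old_style = '%s_VDH.pdf' % prefix

# Equivalent f-string form
new_style = f'{prefix}_VDH.pdf'

print(old_style)
```

Because the prefix already ends in an underscore and the suffix begins with one, the resulting filename contains a double underscore.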


In [7]:
# Let's apply flags and plot the flagged data alongside the unflagged data, without fits
# We also want legend labels and a legend
ss.apply_flags('original')
new_prefix = '%s_flag_unflag_nofits' % prefix
cp.VDH_plot(ss, new_prefix, file_ext='pdf', pre_flag=True, post_flag=True,
            pre_model=False, post_model=False, post_label='Post-Flag Data',
            pre_label='Pre-Flag Data', legend=True)

# Check to see that the plot exists
print(os.path.exists('%s_VDH.pdf' % (new_prefix)))

True


### INS Tutorial

**INS:** Making an INS from SS data

In [8]:
from SSINS import INS

# This averages the amplitudes of the sky-subtracted data over the baselines, taking into account flags that were applied
ins = INS(ss)
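
The baseline averaging can be sketched with NumPy masked arrays. This is an illustration of the idea only, not the SSINS implementation; the array shape `(times, baselines, freqs)` and the random data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical differenced visibilities with shape (times, baselines, freqs)
n_times, n_bls, n_freqs = 5, 8, 4
raw = (rng.normal(size=(n_times, n_bls, n_freqs))
       + 1j * rng.normal(size=(n_times, n_bls, n_freqs)))
vis_diff = np.ma.masked_array(raw)

# Flag one baseline entirely (mask=True means "flagged")
mask = np.zeros(vis_diff.shape, dtype=bool)
mask[:, 0, :] = True
vis_diff.mask = mask

# Average the amplitudes over the baseline axis; masked samples are skipped
spectrum = np.abs(vis_diff).mean(axis=1)

# Number of unflagged baselines contributing to each (time, freq) sample
n_contrib = (~vis_diff.mask).sum(axis=1)

print(spectrum.shape, n_contrib[0])
```

Keeping track of how many samples went into each average matters when the averaged amplitudes are later standardized.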

**INS:** Plotting using Catalog_Plot

In [9]:
# Plotting INS is similar to plotting a VDH, just with a different function
# This plots all polarizations present in the file separately
# The first column shows the baseline-averaged amplitudes, while the second column shows the mean-subtracted data (z-scores)
cp.INS_plot(ins, prefix, file_ext='pdf')
print(os.path.exists('%s_SSINS.pdf' % prefix))


True
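
One way to picture how mean-subtracted z-scores could be formed is sketched below. It assumes noise-only amplitudes that follow a Rayleigh distribution, for which the ratio of variance to squared mean is 4/π − 1; this is a simplified illustration with simulated data and hypothetical dimensions, and it is not claimed to match the library's implementation in detail.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical noise-only amplitudes with shape (times, baselines, freqs)
n_times, n_bls, n_freqs = 200, 100, 50
amps = rng.rayleigh(scale=1.0, size=(n_times, n_bls, n_freqs))

# Baseline-averaged amplitudes, shape (times, freqs)
spectrum = amps.mean(axis=1)

# For a Rayleigh variable, variance / mean**2 = 4/pi - 1
C = 4 / np.pi - 1

# Estimate the mean of each frequency channel from the time axis, then
# standardize: averaging n_bls samples shrinks the variance by a factor n_bls
mean_est = spectrum.mean(axis=0)
z = (spectrum / mean_est - 1) * np.sqrt(n_bls / C)

# For pure noise the z-scores should have roughly zero mean and unit spread
print(z.shape, round(float(z.std()), 2))
```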


In [10]:
# You can specify various plotting nuances with keywords
# Let's set some frequency ticks every 50 channels
xticks = np.arange(0, len(ins.freq_array), 50)
# Label the ticks in MHz (freq_array is in Hz)
xticklabels = ['%.1f' % (ins.freq_array[tick] * 1e-6) for tick in xticks]
tick_prefix = '%s_ticks' % prefix

cp.INS_plot(ins, tick_prefix, file_ext='pdf', xticks=xticks, xticklabels=xticklabels)
print(os.path.exists('%s_SSINS.pdf' % tick_prefix))

True


**INS:** Plotting using the plot_lib library

In [11]:
import matplotlib.pyplot as plt
from matplotlib import cm
from SSINS import plot_lib

# Let's plot the first polarization data and z-scores
fig, ax = plt.subplots(nrows=2, figsize=(16, 9))

# The averaged amplitudes are stored in the metric_array parameter
plot_lib.image_plot(fig, ax[0], ins.metric_array[:, :, 0], title='XX Amplitudes',
                    xticks=xticks, xticklabels=xticklabels)

# The z-scores are stored in the metric_ms parameter.
# Let's choose a diverging colorbar and center it on zero using the cmap and midpoint keywords.
plot_lib.image_plot(fig, ax[1], ins.metric_ms[:, :, 0], title='XX z-scores',
                    xticks=xticks, xticklabels=xticklabels, cmap=cm.coolwarm,
                    midpoint=True)
fig.savefig('%s_plot_lib_SSINS.pdf' % prefix)
print(os.path.exists('%s_plot_lib_SSINS.pdf' % prefix))

True


**INS:** Saving out and reading in a spectrum

In [12]:
# The INS.write method saves out h5 files that can be read both by INS objects and UVFlag objects
# By default it saves out the metric_array in the file, z-scores must be saved separately
# Set clobber=True to overwrite files with the same prefix (default is False)
ins.write(prefix, clobber=True)
ins.write(prefix, output_type='z_score', clobber=True)
print(os.path.exists('%s_SSINS_data.h5' % prefix))
print(os.path.exists('%s_SSINS_z_score.h5' % prefix))


File tutorial_outputs/tutorial__SSINS_data.h5 exists; clobbering
File tutorial_outputs/tutorial__SSINS_z_score.h5 exists; clobbering
True
True


In [13]:
# This file can later be read upon instantiation of a new object
# The z-scores will be recalculated on instantiation, so no need to read in the z-scores
new_ins = INS('%s_SSINS_data.h5' % prefix)

# Check equality
print(np.all(ins.metric_array == new_ins.metric_array))

True


### match_filter (MF) Tutorial

**MF:** Constructing a filter with no additional sub-bands

In [14]:
from SSINS import MF

In [15]:
# The MF class requires a frequency array and significance threshold as positional arguments
# We will disable searching for broadband streaks and provide no additional sub-bands for the filter
mf = MF(ins.freq_array, 5, streak=False)
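
The role of the significance threshold can be illustrated on its own. The sketch below is a simplified picture using simulated z-scores, not the MF class: it marks every sample whose absolute z-score exceeds the threshold. The array shape and the injected outliers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical z-score array with shape (times, freqs)
z = rng.standard_normal((20, 32))
z[10, 20] = 8.0     # inject one strong positive outlier
z[15, 5] = -6.5     # and one strong negative outlier

# Same threshold value as passed to MF above
sig_thresh = 5

# Flag samples whose absolute z-score exceeds the threshold
flags = np.abs(z) > sig_thresh

print(np.argwhere(flags))
```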

**MF:** Constructing a filter for streaks and Western Australian DTV in MWA EoR Highband
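
A filter with additional sub-bands needs each sub-band's frequency edges mapped onto channel indices. The following is a minimal NumPy sketch of that mapping and of a simple sub-band test statistic; it does not call SSINS. The frequency axis, the `bands` dictionary and its edge values, and the injected signal are all illustrative assumptions rather than values taken from the library.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical frequency axis: 384 channels of 80 kHz starting near 167 MHz
freqs = 167.1e6 + 80e3 * np.arange(384)

# Illustrative sub-band edges in Hz, keyed by a label (assumed values)
bands = {'TV6': (1.74e8, 1.81e8),
         'TV7': (1.81e8, 1.88e8),
         'TV8': (1.88e8, 1.95e8)}

# Convert each pair of band edges to a slice of channel indices
slices = {}
for name, (lo, hi) in bands.items():
    start, stop = np.searchsorted(freqs, [lo, hi])
    slices[name] = slice(int(start), int(stop))

# Hypothetical z-scores with a faint signal filling TV7 at time index 5
z = rng.standard_normal((20, freqs.size))
z[5, slices['TV7']] += 2.0

# Averaging N channels and scaling by sqrt(N) keeps unit variance for noise
# while boosting a signal that fills the whole sub-band
sig_thresh = 5
detections = []
for name, slc in slices.items():
    n_chan = slc.stop - slc.start
    stat = z[:, slc].mean(axis=1) * np.sqrt(n_chan)
    for t in np.flatnonzero(np.abs(stat) > sig_thresh):
        detections.append((int(t), name))

print(slices)
print(detections)
```

The injected signal is only 2 sigma in any single channel, so a per-sample threshold of 5 would miss it, while the sub-band statistic picks it up.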

### Writing

We can write the information from an INS out to h5 files using the write method. There are ***three main data products*** to write out: 

(1) The baseline averaged visibility difference amplitudes, 

(2) The z-scores from mean-subtraction, and 

(3) any mask that may have come from flagging.

**INS:** Writing the three main data products

In [16]:
#prefix = 'SSINS/data/tutorial_'

# writing the data
ins.write('joy', output_type='data')

# writing the z-scores
# A z-score is the deviation from the mean in units of the standard deviation
# (a sample 3.1 sigma from the mean has a z-score of 3.1)
ins.write('job', output_type='z_score')

# We detail how to use the match_filter to flag an INS in the match_filter section
# This will apply masks to the data, which we write as follows
ins.write('jog', output_type='mask')

# These masks can be applied when reading the output file by passing the mask_file keyword on initialization

**INS:** Writing time-propagated flags

In [17]:
print(type(ins.metric_array))

<class 'numpy.ma.core.MaskedArray'>
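
Because each sample of the differenced data is built from two adjacent integrations, a flag on a difference plausibly has to cover both contributing integrations in the original data. A minimal NumPy sketch of that idea follows; the array shapes are hypothetical and this is an illustration, not the SSINS implementation.

```python
import numpy as np

# Hypothetical mask on differenced data: (n_times - 1) rows, 3 channels
diff_mask = np.zeros((4, 3), dtype=bool)
diff_mask[1, 0] = True   # the difference between integrations 1 and 2, channel 0

# Flags on the original (undifferenced) integrations: one more row than diff_mask
flags = np.zeros((diff_mask.shape[0] + 1, diff_mask.shape[1]), dtype=bool)
flags[:-1] |= diff_mask  # earlier integration of each flagged pair
flags[1:] |= diff_mask   # later integration of each flagged pair

print(flags[:, 0])
```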
