# Sample Pipeline

This notebook shows an end-to-end radio interferometry pipeline, from the simulation of the sky to the final image. The pipeline consists of three modules:

- Simulation
    - Sky module: OSKAR
    - Telescope module incl. calibration: OSKAR
- Processing
    - Calibration after observation: RASCIL
    - Deconvolution: RASCIL
- Analysis & comparison
    - Quantitative and qualitative analysis of algorithms

In [None]:
import sys
import oskar
import matplotlib
import matplotlib.pyplot as plt
from astropy.visualization import astropy_mpl_style
import numpy as np

from astropy.utils.data import get_pkg_data_filename
from astropy.io import fits
from astropy import wcs

In [None]:
plt.style.use(astropy_mpl_style)

## Simulation

The sky and telescope simulation is currently provided completely by OSKAR.

### Sky Module

The sky module of OSKAR contains radiation sources, which are defined as an array and passed to `oskar.Sky.from_array`.

In [None]:
# Set the numerical precision to use.
precision = "single"

# Create a sky model containing three sources from a numpy array.
sky_data = np.array([
        [20.0, -30.0, 1, 0, 0, 0, 100.0e6, -0.7, 0.0, 0,   0,   0],
        [20.0, -30.5, 3, 2, 2, 0, 100.0e6, -0.7, 0.0, 600, 50,  45],
        [20.5, -30.5, 3, 0, 0, 2, 100.0e6, -0.7, 0.0, 700, 10, -10]])
sky = oskar.Sky.from_array(sky_data, precision)  # Pass precision here.
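
The twelve columns in each row follow the column order of OSKAR sky model files: right ascension and declination in degrees, Stokes I, Q, U and V in Jy, reference frequency in Hz, spectral index, rotation measure in rad/m², and the major axis, minor axis (both FWHM in arcseconds) and position angle (degrees) of an optional Gaussian source. The sketch below only labels the columns of the array above with numpy. The names in `SKY_COLUMNS` and the helper `describe_source` are hypothetical, chosen for this illustration, and are not OSKAR identifiers.

```python
import numpy as np

# Hypothetical labels for the 12 sky model columns (not OSKAR identifiers).
SKY_COLUMNS = [
    "ra_deg", "dec_deg", "I_jy", "Q_jy", "U_jy", "V_jy",
    "ref_freq_hz", "spectral_index", "rotation_measure",
    "major_arcsec", "minor_arcsec", "position_angle_deg",
]

# Same three sources as in the cell above.
sky_data = np.array([
    [20.0, -30.0, 1, 0, 0, 0, 100.0e6, -0.7, 0.0, 0,   0,   0],
    [20.0, -30.5, 3, 2, 2, 0, 100.0e6, -0.7, 0.0, 600, 50,  45],
    [20.5, -30.5, 3, 0, 0, 2, 100.0e6, -0.7, 0.0, 700, 10, -10]])


def describe_source(row):
    """Return one sky model row as a {label: value} dictionary."""
    return dict(zip(SKY_COLUMNS, row.tolist()))


labelled = [describe_source(row) for row in sky_data]
for source in labelled:
    print(source)
```

Read this way, the first source is an unpolarised point source, while the second and third are polarised, extended Gaussian sources.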

Instead of using completely artificial sources, an external catalog can also be used. As an example, we use the GLEAM survey, which can be downloaded from the [VizieR](https://cdsarc.unistra.fr/viz-bin/cat/VIII/100) service. We use only the right ascension (`RAJ2000`), the declination (`DEJ2000`) and the Stokes I peak flux at 76 MHz (`Fp076`). In addition, we drop the sources that have no value in this frequency band.

In [None]:
from astropy.table import Table

gleam = Table.read('./GLEAM_EGC.fits')
df_gleam = gleam.to_pandas()
df_gleam

In [None]:
start_frequency_hz = 76e6
df_gleam = df_gleam[~df_gleam['Fp076'].isna()]
ra, dec, fp = df_gleam['RAJ2000'], df_gleam['DEJ2000'], df_gleam['Fp076']
sky_array = np.column_stack((ra, dec, fp, np.zeros(ra.shape[0]), np.zeros(ra.shape[0]), 
                             np.zeros(ra.shape[0]), [start_frequency_hz]*ra.shape[0]))
precision = "single"
sky = oskar.Sky.from_array(sky_array, precision) # 307455 sources
sky.num_sources
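
The same selection can be tried without the catalog file. The sketch below uses a small made-up table (the coordinates and fluxes are invented purely for illustration) to show how rows with a missing `Fp076` value are dropped and how the remaining columns are stacked into the seven-column layout used above: RA, Dec, Stokes I, Q, U, V and reference frequency.

```python
import numpy as np

start_frequency_hz = 76e6

# Invented example values, not taken from GLEAM.
ra = np.array([250.0, 251.2, 249.5, 250.4])
dec = np.array([-80.0, -79.8, -80.3, -80.1])
fp076 = np.array([1.2, np.nan, 0.4, 2.5])  # NaN marks a missing measurement

# Keep only the sources with a measured flux in this band.
keep = ~np.isnan(fp076)
n_kept = int(keep.sum())

# Columns: RA, Dec, I, Q, U, V, reference frequency.
zeros = np.zeros(n_kept)
sky_array = np.column_stack((
    ra[keep], dec[keep], fp076[keep],
    zeros, zeros, zeros,
    np.full(n_kept, start_frequency_hz),
))
print(sky_array.shape)  # (3, 7)
```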

Before further steps are taken, the phase center of the telescope must be defined.

In [None]:
ra0 = 250
dec0 = -80
phase_center = [ra0,dec0]

In [None]:
def plot_sky(sky, phase_center):
    ra0, dec0 = phase_center[0], phase_center[1]
    data = sky.to_array()  # Get sky model data as numpy array.
    ra = np.radians(data[:, 0] - ra0)
    dec = np.radians(data[:, 1])
    log_flux = np.log10(data[:, 2])
    x = np.cos(dec) * np.sin(ra)
    y = np.cos(np.radians(dec0)) * np.sin(dec) - \
                np.sin(np.radians(dec0)) * np.cos(dec) * np.cos(ra)
    sc = plt.scatter(x, y, s=5, c=log_flux, cmap='plasma',
                vmin=np.min(log_flux), vmax=np.max(log_flux))
    #_ = plt.scatter(ra0, dec0)
    plt.axis('equal')
    plt.xlabel('x direction cosine')
    plt.ylabel('y direction cosine')
    plt.colorbar(sc, label='Log10(Stokes I flux [Jy])')
    plt.show()
    
plot_sky(sky, phase_center)
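
The projection inside `plot_sky` maps each source to direction cosines relative to the phase center. As a quick check, the sketch below repeats that formula in a standalone function (`direction_cosines` is a name introduced here for illustration) and confirms that the phase center itself lands at the origin.

```python
import numpy as np


def direction_cosines(ra_deg, dec_deg, ra0_deg, dec0_deg):
    """Project (RA, Dec) onto the tangent plane at the phase center."""
    d_ra = np.radians(np.asarray(ra_deg, dtype=float) - ra0_deg)
    dec = np.radians(np.asarray(dec_deg, dtype=float))
    dec0 = np.radians(dec0_deg)
    x = np.cos(dec) * np.sin(d_ra)
    y = np.cos(dec0) * np.sin(dec) - np.sin(dec0) * np.cos(dec) * np.cos(d_ra)
    return x, y


# First source: the phase center itself. Second source: one degree further north.
x, y = direction_cosines([250.0, 250.0], [-80.0, -79.0], 250.0, -80.0)
print(x, y)
```

For a source at the same right ascension as the phase center, `y` reduces to the sine of the declination offset, which is a convenient sanity check.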

Now that the phase center is defined, we can take a closer look at how the point sources are distributed around it. For this we use a 2D image projection with the phase center as the origin, built with the world coordinate system ([WCS](https://docs.astropy.org/en/stable/wcs/index.html)) module of astropy. Specific information about the WCS parameters can be found [here](https://docs.astropy.org/en/stable/api/astropy.wcs.Wcsprm.html). For this illustration we ignore the flux, so that all sources are clearly visible.

In [None]:
# Construct the WCS projection
w = wcs.WCS(naxis=2)
# Coordinate reference pixel per axis
w.wcs.crpix = [0, 0]
# Coordinate increments on the sphere per axis (degrees per pixel)
w.wcs.cdelt = np.array([-1, 1])
# Coordinate reference values per axis (RA, Dec of the phase center in degrees)
w.wcs.crval = phase_center
# Coordinate axis types
w.wcs.ctype = ["RA---AIR", "DEC--AIR"]
# Convert world coordinates to pixel coordinates
px, py = w.wcs_world2pix(ra, dec, 1)

plt.scatter(px,py)
plt.title('Zoomed Projection of the Sky around the Phase Center')
plt.xlim(-1,1)
plt.ylim(-1,1)
plt.show()

To keep only part of the sky, we can use `filter_by_radius`, which keeps the sources between an inner and an outer radius (in degrees) around the phase center. OSKAR also provides `filter_by_flux` to filter by Stokes I flux. For more details, see the OSKAR [documentation](https://fdulwich.github.io/oskarpy-doc/sky.html).

In [None]:
sky.filter_by_radius(0, .55, phase_center[0], phase_center[1])
data = sky.to_array()
ra_filtered = data[:,0]
dec_filtered = data[:,1]
stokes_i_flux = data[:,2]
sky.num_sources
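
To build intuition for what a radius filter selects, the sketch below computes the angular separation of each source from the phase center with the haversine formula and keeps the sources inside an annulus. This is a standalone numpy illustration with invented coordinates, not OSKAR's implementation; `angular_separation_deg` and `select_by_radius` are hypothetical helper names.

```python
import numpy as np


def angular_separation_deg(ra_deg, dec_deg, ra0_deg, dec0_deg):
    """Great-circle separation (degrees) from a reference point, via haversine."""
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    ra0, dec0 = np.radians(ra0_deg), np.radians(dec0_deg)
    h = (np.sin((dec - dec0) / 2.0) ** 2
         + np.cos(dec) * np.cos(dec0) * np.sin((ra - ra0) / 2.0) ** 2)
    return np.degrees(2.0 * np.arcsin(np.sqrt(h)))


def select_by_radius(ra_deg, dec_deg, inner_deg, outer_deg, ra0_deg, dec0_deg):
    """Boolean mask of sources between the inner and outer radius."""
    sep = angular_separation_deg(ra_deg, dec_deg, ra0_deg, dec0_deg)
    return (sep >= inner_deg) & (sep <= outer_deg)


# Invented test positions around the phase center (250, -80).
ra = np.array([250.0, 250.0, 253.0, 250.0])
dec = np.array([-80.0, -79.5, -80.0, -79.0])
mask = select_by_radius(ra, dec, 0.0, 0.55, 250.0, -80.0)
print(mask)
```

Note that the third source is offset by 3 degrees in right ascension yet still lies within 0.55 degrees of the phase center, because lines of constant right ascension converge close to the pole.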

Now let's look at the filtered sources.

In [None]:
px, py = w.wcs_world2pix(ra_filtered, dec_filtered, 1)
plt.scatter(px, py)
plt.title('Filtered Projection of the Sky around the Phase Center')
plt.show()

### Telescope Module

Various observation parameters and meta information (`params`) must be passed to the telescope module `oskar.Interferometer` as an `oskar.SettingsTree`.

In [None]:
# Basic settings. (Note that the sky model is set up later.)
params = {
    "simulator": {
        "use_gpus": False
    },
    "observation" : {
        #"num_channels": 64,
        "num_channels": 16,
        "start_frequency_hz": start_frequency_hz,
        "frequency_inc_hz": 20e6,
        "phase_centre_ra_deg": phase_center[0],
        "phase_centre_dec_deg": phase_center[1],
        "num_time_steps": 24,
        "start_time_utc": "01-01-2000 12:00:00.000",
        "length": "12:00:00.000"
    },
    "telescope": {
        "input_directory": "../data/telescope.tm"
    },
    "interferometer": {
        "ms_filename": "visibilities_gleam.ms",
        "channel_bandwidth_hz": 1e6,
        "time_average_sec": 10
    }
}
settings = oskar.SettingsTree("oskar_sim_interferometer")
settings.from_dict(params)

precision = "single"
if precision == "single":
    settings["simulator/double_precision"] = False

# Create the interferometer simulator from the settings.
sim = oskar.Interferometer(settings=settings)
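
With these settings the channel frequencies follow directly from `start_frequency_hz`, `frequency_inc_hz` and `num_channels`. The short sketch below lists them, assuming (as the parameter names suggest) that channel `i` is centred at `start_frequency_hz + i * frequency_inc_hz`.

```python
# Values copied from the `params` dictionary above.
start_frequency_hz = 76e6
frequency_inc_hz = 20e6
num_channels = 16

# Assumption: channel i is centred at start + i * increment.
channel_frequencies_hz = [
    start_frequency_hz + i * frequency_inc_hz for i in range(num_channels)
]

for i, freq in enumerate(channel_frequencies_hz):
    print(f"channel {i:2d}: {freq / 1e6:.0f} MHz")
```

Under this assumption the simulated channels run from 76 MHz to 376 MHz, while each individual channel is only `channel_bandwidth_hz` = 1 MHz wide.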

### Observation Simulation

Now the sky model must be passed to the interferometer, and the simulation of the observation must be started to generate the measurement set.

In [None]:
sim.set_sky_model(sky)
sim.run()

## Processing

After the observation is made with the telescope, a calibration of the measured data must be performed, followed by the reconstruction of the image.

### Calibration after Observation

toDo

In [None]:
# Code here

### Imaging

Start an mmclean deconvolution with `visibilities_gleam.ms` as input.
To use a Dask cluster on which you can follow the progress, first create a cluster in the Dask extension on the left.
Then copy the scheduler address into the variable below. It might already be correct.

If you don't do this, remove the `--dask_scheduler` option from the options in the `start_imager` call.
RASCIL then starts its own scheduler, but you will not be able to see the dashboard, because the port is probably not forwarded by Docker.

In [None]:
from rascil.apps import rascil_imager
from rascil.processing_components.util.performance import (
    performance_store_dict,
    performance_environment,
)
    
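def start_imager(rawargs):
    """Run the RASCIL imager with a list of command line arguments."""
    parser = rascil_imager.cli_parser()
    args = parser.parse_args(rawargs)
    # Record the environment and the parsed arguments in the performance file.
    performance_environment(args.performance_file, mode="w")
    performance_store_dict(args.performance_file, "cli_args", vars(args), mode="a")
    # Run the imager and hand its result back to the caller.
    image_name = rascil_imager.imager(args)
    return image_name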

start_imager(
    [
        '--ingest_msname','visibilities_gleam.ms',
        '--ingest_dd', '0', 
        #'--ingest_vis_nchan', '64',
        '--ingest_vis_nchan', '16',
        #'--ingest_chan_per_blockvis', '4',
        '--ingest_chan_per_blockvis', '1' ,
        '--ingest_average_blockvis', 'True',
        '--imaging_npixel', '2048', 
        '--imaging_cellsize', '3.878509448876288e-05',
        '--imaging_weighting', 'robust',
        '--imaging_robustness', '-0.5',
        '--clean_nmajor', '2' ,
        '--clean_algorithm', 'mmclean',
        '--clean_scales', '0', '6', '10', '30', '60',
        '--clean_fractional_threshold', '0.3',
        '--clean_threshold', '0.12e-3',
        '--clean_nmoment' ,'5',
        '--clean_psf_support', '640',
        '--clean_restored_output', 'integrated'
    ])
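
The values passed to `--imaging_npixel` and `--imaging_cellsize` fix the angular size of the image. Assuming the cell size is given in radians, the sketch below converts it to arcseconds and derives the field of view covered by 2048 pixels.

```python
import math

# Values copied from the start_imager call above.
imaging_npixel = 2048
imaging_cellsize_rad = 3.878509448876288e-05  # assumed to be in radians

# Size of one pixel in arcseconds and of the full image in degrees.
cellsize_arcsec = math.degrees(imaging_cellsize_rad) * 3600.0
field_of_view_deg = imaging_npixel * math.degrees(imaging_cellsize_rad)

print(f"cell size: {cellsize_arcsec:.3f} arcsec")
print(f"field of view: {field_of_view_deg:.3f} deg")
```

Under this assumption each pixel is 8 arcseconds wide and the image spans about 4.55 degrees, so the 0.55 degree radius selected in the sky module fits well inside it.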

## Analysis and Comparison

toDo

Now we have the image produced by the pipeline and the corresponding pixel coordinates of the sources as ground truth. The number of pixels and the cell size are important here, because the pixel coordinates are calculated from them.

In [None]:
hdulist = fits.open('./visibilities_gleam_nmoment5_cip_deconvolved.fits')
w_fits = wcs.WCS(hdulist[0].header)

# Construct a WCS that matches the reference pixel and increments of the image
w = wcs.WCS(naxis=2)
w.wcs.crpix = w_fits.wcs.crpix[0:2] # coordinate reference pixel per axis --> on image
w.wcs.cdelt = w_fits.wcs.cdelt[0:2] # coordinate increments on sphere per axis
w.wcs.crval = [phase_center[0], phase_center[1]]
w.wcs.ctype = ["RA---AIR", "DEC--AIR"] # coordinate axis type
px, py = w.wcs_world2pix(ra_filtered, dec_filtered, 1) # coordinate conversion
plt.scatter(px,py)
plt.show()

In [None]:
# Note: matplotlib.use("Agg") in rascil_imager.py can prevent matplotlib from plotting in the notebook.
image_file = './visibilities_gleam_nmoment5_cip_deconvolved.fits'  # local file, same as opened above
fits.info(image_file)

In [None]:
image_data = fits.getdata(image_file)
image_data = np.log(image_data.sum(axis=(0,1)))
_ = plt.figure(figsize=(8,6))
_ = plt.imshow(image_data, cmap='gray')
_ = plt.colorbar()
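
A deconvolved image usually contains zero or negative pixels, for which `np.log` returns `-inf` or NaN. One possible workaround, shown here on a small invented array rather than the real image, is to clip the data to a small positive floor before taking the logarithm. The helper `safe_log` and the floor value are assumptions made for this sketch and should be adapted to the noise level of the actual image.

```python
import numpy as np


def safe_log(image, floor=1e-6):
    """Natural log after clipping values below `floor` (an arbitrary choice)."""
    return np.log(np.clip(image, floor, None))


# Invented pixel values, including a zero and a negative entry.
toy_image = np.array([[1.0, 0.0],
                      [-0.5, 2.0]])
scaled = safe_log(toy_image)
print(scaled)
```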