# Review Instructions

Please review the MSv4 `antenna_xds` schema and the XRADIO interface (`ps['MSv4_name'].antenna_xds`). The PS (processing set) interface and the `main_xds` are out of scope for this review.

The `antenna_xds` schema specification: https://docs.google.com/spreadsheets/d/14a6qMap9M5r_vjpLnaBKxsR9TF4azN5LVdOxLacOX-s/edit#gid=257301047

The processing set is a loose collection of MSv4s, which might come from multiple MSv2s (or ASDMs). Consequently, arbitrary ids are avoided in favor of descriptive strings.

Two example datasets will be used: VLBA_TL016B_split_lsrk.ms (VLBI) and uid___A002_X1015532_X1926f.small.ms (Single Dish).

## Preparatory Material
Go over Xarray nomenclature and selection syntax:
- https://docs.xarray.dev/en/latest/user-guide/terminology.html
- https://docs.xarray.dev/en/latest/user-guide/indexing.html

MSv2 and CASA documentation:
- MSv2 schema: https://casacore.github.io/casacore-notes/229.pdf
- MSv3 schema: https://casacore.github.io/casacore-notes/264.pdf
- VLBI MSv2 extension: https://casacore.github.io/casacore-notes/265.pdf

## `antenna_xds` Schema
In MSv2, the ANTENNA, FEED, GAIN_CURVE, PHASE_CAL, and INTERFEROMETER_MODEL (VLBI, not yet included) tables hold closely related information, all of it about the antenna. Because an MSv4 contains a single SPW, each antenna has only a single feed associated with it.

Antenna ids are no longer used; antennas are identified by name instead (for antennas that can move, the name consists of the antenna name + "_" + station). This makes it easy to compare MSv4s from different observations.
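The naming rule can be illustrated with a small sketch. The helper `antenna_label`, its `movable` flag, and the antenna and station names below are hypothetical and chosen only for illustration; they are not part of the XRADIO API.

```python
def antenna_label(name: str, station: str, movable: bool) -> str:
    """Build a descriptive antenna label (hypothetical helper, not XRADIO API)."""
    # Movable antennas are disambiguated by the station they occupy.
    return f"{name}_{station}" if movable else name


# Made-up antenna lists from two separate observations.
obs_a = [antenna_label("DA41", "A004", True), antenna_label("DA42", "A005", True)]
obs_b = [antenna_label("DA42", "A005", True), antenna_label("DA41", "A010", True)]

# Labels compare directly across observations; no id remapping is needed.
common = sorted(set(obs_a) & set(obs_b))
print(common)  # ['DA42_A005']
```

In this sketch DA41 sits on a different station in the second observation, so only DA42_A005 counts as the same antenna configuration in both.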

## Key Questions to Answer
### Schema Questions
- 1.1) Are there missing use cases?
- 1.2) Is all the information present needed for offline processing?
- 1.3) (VLBI) Instead of storing BASELINE_REFERENCE in main_xds, can we store it in the antenna_xds? This would assume that the reference antennas remain constant for the duration of the MSv4.
- 1.4) Is the order of the dims correct (antenna_id)?
- 1.5) Should BEAM_OFFSET be sky_dir_label (Ra, Dec) or local_sky_label (Az, Alt)?
- 1.6) Should we add prefixes to organize data variables? For example, PHASE_DELAY -> VLBI_PHASE_DELAY?
- 1.7) Should we include POLARIZATION_RESPONSE? It does not seem to be used.
- 1.8) Can we get rid of POLARIZATION_RESPONSE?
- 1.9) ANTENNA_FEED_OFFSET is a new data variable that is calculated as (Antenna_Table.offset + Feed_Table.position). Is there any need to store these values separately?
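For question 1.9, a small worked example may help. It is only a sketch: the numeric values are made up, and it assumes both quantities are 3-vectors expressed in the same frame and units.

```python
import numpy as np

# Made-up values in metres; not taken from any real dataset.
antenna_offset = np.array([0.0, 0.0, 1.5])   # stands in for Antenna_Table.offset
feed_position = np.array([0.25, 0.0, 0.5])   # stands in for Feed_Table.position

# The combined quantity described in question 1.9.
antenna_feed_offset = antenna_offset + feed_position
print(antenna_feed_offset.tolist())  # [0.25, 0.0, 2.0]

# A different split of the two components yields the same sum, so the
# individual values cannot be recovered from ANTENNA_FEED_OFFSET alone.
other_offset = np.array([0.25, 0.0, 0.0])
other_position = np.array([0.0, 0.0, 2.0])
same = np.allclose(antenna_feed_offset, other_offset + other_position)
print(same)  # True
```

If any downstream use case needs the two components individually, storing only the sum would lose that information.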

### XRADIO
- 2.1) Please check all the data variables (names, dimensions, measures) and coordinates (names, dimensions, measures) in both the Google spreadsheet and the notebook (ipynb).
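One possible way to carry out check 2.1 is to dump the name, dimensions, and attributes of every coordinate and data variable and compare the listing with the spreadsheet. The sketch below uses only standard xarray calls. The helper `describe` is hypothetical, and the toy dataset (variable name, dimension names, and attribute keys) is an assumption made for illustration rather than a statement of the actual schema.

```python
import numpy as np
import xarray as xr


def describe(xds: xr.Dataset) -> list[tuple[str, str, tuple, dict]]:
    """List (kind, name, dims, attrs) for each coordinate and data variable."""
    rows = []
    for kind, group in (("coord", xds.coords), ("data_var", xds.data_vars)):
        for name, da in group.items():
            rows.append((kind, str(name), tuple(da.dims), dict(da.attrs)))
    return rows


# Toy stand-in for an antenna_xds; all names and attrs here are illustrative.
toy = xr.Dataset(
    data_vars={
        "ANTENNA_POSITION": (
            ("antenna_name", "cartesian_pos_label"),
            np.zeros((2, 3)),
            {"units": ["m", "m", "m"], "type": "location"},
        )
    },
    coords={
        "antenna_name": ["ANT1_PAD1", "ANT2_PAD2"],
        "cartesian_pos_label": ["x", "y", "z"],
    },
)

for row in describe(toy):
    print(row)
```

With a converted dataset, the same helper could be applied to the loaded `ant_xds` from the examples in this notebook.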

# Environment instructions

It is recommended to use the conda environment manager to create a clean, self-contained runtime where xradio and all its dependencies can be installed:

```bash
conda create --name xradio python=3.11 --no-default-packages
conda activate xradio
```

Clone the repository, checkout the review branch and do a local install:

```bash
git clone https://github.com/casangi/xradio.git
cd xradio
git checkout 168-review-ms_xdsattrsantenna_xds-schema-and-xradio-interface
pip install -e .
```

On macOS, python-casacore must be pre-installed using `conda install -c conda-forge python-casacore`.

# Data
Two examples are given: VLBA_TL016B_split_lsrk.ms (VLBI) and uid___A002_X1015532_X1926f.small.ms (Single Dish). You are welcome to use a different dataset instead, either one downloaded through GraphVIPER or your own MSv2. The GraphVIPER function `list_files` lists the available datasets.


In [11]:
from graphviper.utils.data import list_files
list_files()

# VLBI Example

## Download Dataset

In [1]:
from xradio.vis.convert_msv2_to_processing_set import convert_msv2_to_processing_set
from xradio.vis.read_processing_set import read_processing_set
import graphviper

graphviper.utils.data.download(file="VLBA_TL016B_split_lsrk.ms")

[2024-08-08 15:34:55,852] INFO graphviper: Updating file metadata information ...
[2024-08-08 15:34:56,770] INFO graphviper: File exists: VLBA_TL016B_split_lsrk.ms

## Start Dask cluster
Choose an appropriate number of cores and a `memory_limit` (the limit applies per core).
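Because the limit is per core, the total memory the cluster may use is the product of the two settings. The arithmetic below is a sketch that assumes "4GB" is interpreted as decimal gigabytes (4 × 10^9 bytes); under that assumption it reproduces the `memory=14.90 GiB` figure reported by the client in the log output of this section.

```python
cores = 4
memory_limit_bytes = 4 * 10**9  # "4GB", assumed to mean decimal gigabytes

per_worker_gib = memory_limit_bytes / 2**30
total_gib = cores * memory_limit_bytes / 2**30

print(f"{per_worker_gib:.2f} GiB per worker, {total_gib:.2f} GiB total")
# 3.73 GiB per worker, 14.90 GiB total
```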

In [2]:
from graphviper.dask.client import local_client

viper_client = local_client(cores=4, memory_limit="4GB")
viper_client

[2024-08-08 15:34:56,853] INFO graphviper: Checking parameter values for client.local_client
[2024-08-08 15:34:56,854] INFO graphviper: Module path: /Users/jsteeb/Downloads/yes/envs/zinc/lib/python3.11//site-packages/
[2024-08-08 15:34:57,573] INFO client: Created client <MenrvaClient: 'tcp://127.0.0.1:51975' processes=4 threads=4, memory=14.90 GiB>


- Connection method: Cluster object; cluster type: distributed.LocalCluster
- Dashboard: http://127.0.0.1:8787/status
- Scheduler: tcp://127.0.0.1:51975; status: running; using processes: True
- Workers: 4; total threads: 4; total memory: 14.90 GiB
- Each worker: 1 thread, 3.73 GiB memory


## Convert dataset

In [3]:
from xradio.vis.convert_msv2_to_processing_set import convert_msv2_to_processing_set

in_file = "VLBA_TL016B_split_lsrk.ms"
out_file = "VLBA_TL016B_split_lsrk.vis.zarr"

convert_msv2_to_processing_set(
    in_file=in_file,
    out_file=out_file,
    parallel=True,
    overwrite=True,
)

[2024-08-08 15:34:57,609] INFO client: Partition scheme that will be used: ['DATA_DESC_ID', 'OBSERVATION_ID', 'FIELD_ID']
[2024-08-08 15:34:57,615] INFO client: Number of partitions: 4
[2024-08-08 15:34:57,615] INFO client: OBSERVATION_ID [0], DDI [0], STATE [-1], FIELD [0], SCAN [0]
[2024-08-08 15:34:57,859] INFO client: OBSERVATION_ID [0], DDI [0], STATE [-1], FIELD [1], SCAN [0]
[2024-08-08 15:34:58,161] INFO client: OBSERVATION_ID [0], DDI [1], STATE [-1], FIELD [0], SCAN [0]
[2024-08-08 15:34:58,429] INFO client: OBSERVATION_ID

## Inspect Processing Set

In [5]:
import pandas as pd

# Set the maximum number of rows displayed before scrolling
pd.set_option("display.max_rows", 1000)

from xradio.vis.read_processing_set import read_processing_set

ps = read_processing_set("VLBA_TL016B_split_lsrk.vis.zarr")
ps.summary()

| | name | obs_mode | shape | polarization | spw_name | field_name | source_name | field_coords | start_frequency | end_frequency |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | VLBA_TL016B_split_lsrk_3 | obs_0 | (540, 55, 6, 2) | [RR, LL] | spw_1 | [J1154+6022_1] | [Unknown] | [fk5, 11h54m04.54s, 60d22m20.82s] | 5068199000.0 | 5070699000.0 |
| 1 | VLBA_TL016B_split_lsrk_2 | obs_0 | (200, 55, 6, 2) | [RR, LL] | spw_1 | [4C39.25_0] | [Unknown] | [fk5, 9h27m03.01s, 39d02m20.85s] | 5068199000.0 | 5070699000.0 |
| 2 | VLBA_TL016B_split_lsrk_0 | obs_0 | (200, 55, 6, 2) | [RR, LL] | spw_0 | [4C39.25_0] | [Unknown] | [fk5, 9h27m03.01s, 39d02m20.85s] | 5004196000.0 | 5006697000.0 |
| 3 | VLBA_TL016B_split_lsrk_1 | obs_0 | (540, 55, 6, 2) | [RR, LL] | spw_0 | [J1154+6022_1] | [Unknown] | [fk5, 11h54m04.54s, 60d22m20.82s] | 5004196000.0 | 5006697000.0 |


## Inspect antenna_xds

In [6]:
ant_xds = ps['VLBA_TL016B_split_lsrk_0'].attrs['antenna_xds'].load()
ant_xds

# Single Dish Example

## Download Dataset

In [15]:
from xradio.vis.convert_msv2_to_processing_set import convert_msv2_to_processing_set
from xradio.vis.read_processing_set import read_processing_set
import graphviper

graphviper.utils.data.download(file="uid___A002_X1015532_X1926f.small.ms")

[2024-08-08 16:15:32,994] INFO client: Updating file metadata information ...

## Convert Dataset

In [16]:
from xradio.vis.convert_msv2_to_processing_set import convert_msv2_to_processing_set

in_file = "uid___A002_X1015532_X1926f.small.ms"
out_file = "uid___A002_X1015532_X1926f.small.vis.zarr"

convert_msv2_to_processing_set(
    in_file=in_file,
    out_file=out_file,
    parallel=True,
    overwrite=True,
)

[2024-08-08 16:16:46,110] INFO client: Partition scheme that will be used: ['DATA_DESC_ID', 'OBS_MODE', 'OBSERVATION_ID', 'FIELD_ID']
[2024-08-08 16:16:46,142] INFO client: Number of partitions: 20
[2024-08-08 16:16:46,142] INFO client: OBSERVATION_ID [0], DDI [0], STATE [10], FIELD [1], SCAN [2 4]
[2024-08-08 16:16:46,143] INFO client: OBSERVATION_ID [0], DDI [0], STATE [11], FIELD [1], SCAN [2 4]
[2024-08-08 16:16:46,143] INFO client: OBSERVATION_ID [0], DDI [0], STATE [12], FIELD [1], SCAN [2 4]
[2024-08-08 16:16:46,143] INFO client:

## Inspect Processing Set

In [18]:
import pandas as pd

# Set the maximum number of rows displayed before scrolling
pd.set_option("display.max_rows", 1000)

from xradio.vis.read_processing_set import read_processing_set

ps = read_processing_set("uid___A002_X1015532_X1926f.small.vis.zarr")
ps.summary()

| | name | obs_mode | shape | polarization | spw_name | field_name | source_name | field_coords | start_frequency | end_frequency |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | uid___A002_X1015532_X1926f.small_9 | OBSERVE_TARGET#ON_SOURCE | (864, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_2#SW-01#FULL_RES_1 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 345029700000.0 | 345076500000.0 |
| 1 | uid___A002_X1015532_X1926f.small_7 | CALIBRATE_ATMOSPHERE#HOT,CALIBRATE_WVR#HOT | (30, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_2#SW-01#FULL_RES_1 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 345029700000.0 | 345076500000.0 |
| 2 | uid___A002_X1015532_X1926f.small_0 | CALIBRATE_ATMOSPHERE#OFF_SOURCE,CALIBRATE_WVR#... | (30, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_1#SW-01#FULL_RES_0 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 345764900000.0 | 345765000000.0 |
| 3 | uid___A002_X1015532_X1926f.small_19 | OBSERVE_TARGET#ON_SOURCE | (864, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_4#SW-01#FULL_RES_3 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 354473200000.0 | 354473600000.0 |
| 4 | uid___A002_X1015532_X1926f.small_10 | CALIBRATE_ATMOSPHERE#OFF_SOURCE,CALIBRATE_WVR#... | (30, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_3#SW-01#FULL_RES_2 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 355117600000.0 | 355119000000.0 |
| 5 | uid___A002_X1015532_X1926f.small_17 | CALIBRATE_ATMOSPHERE#HOT,CALIBRATE_WVR#HOT | (30, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_4#SW-01#FULL_RES_3 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 354473200000.0 | 354473600000.0 |
| 6 | uid___A002_X1015532_X1926f.small_1 | CALIBRATE_ATMOSPHERE#AMBIENT,CALIBRATE_WVR#AMB... | (30, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_1#SW-01#FULL_RES_0 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 345764900000.0 | 345765000000.0 |
| 7 | uid___A002_X1015532_X1926f.small_6 | CALIBRATE_ATMOSPHERE#AMBIENT,CALIBRATE_WVR#AMB... | (30, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_2#SW-01#FULL_RES_1 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 345029700000.0 | 345076500000.0 |
| 8 | uid___A002_X1015532_X1926f.small_8 | OBSERVE_TARGET#OFF_SOURCE | (510, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_2#SW-01#FULL_RES_1 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 345029700000.0 | 345076500000.0 |
| 9 | uid___A002_X1015532_X1926f.small_16 | CALIBRATE_ATMOSPHERE#AMBIENT,CALIBRATE_WVR#AMB... | (30, 4, 4, 2) | [XX, YY] | X780709176#ALMA_RB_07#BB_4#SW-01#FULL_RES_3 | [Jupiter_1] | [Jupiter_1] | Ephemeris | 354473200000.0 | 354473600000.0 |


## Inspect antenna_xds

In [20]:
ant_xds = ps['uid___A002_X1015532_X1926f.small_9'].attrs['antenna_xds'].load()
ant_xds