# Review Instructions

Please review the MSv4 `antenna_xds` schema and the XRADIO interface (`ps['MSv4_name'].antenna_xds`). Note that neither the PS (processing set) interface nor the main_xds should be reviewed.

The `antenna_xds` schema specification: https://docs.google.com/spreadsheets/d/14a6qMap9M5r_vjpLnaBKxsR9TF4azN5LVdOxLacOX-s/edit#gid=257301047

The processing set is a loose collection of MSv4s that might come from multiple MSv2s (or ASDMs). Consequently, arbitrary ids are avoided in favor of descriptive strings.

Two example datasets will be used: VLBA_TL016B_split_lsrk.ms (VLBI) and uid___A002_X1015532_X1926f.small.ms (Single Dish).

## Preparatory Material
Go over Xarray nomenclature and selection syntax:
- https://docs.xarray.dev/en/latest/user-guide/terminology.html
- https://docs.xarray.dev/en/latest/user-guide/indexing.html
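
As a quick refresher on the selection syntax covered in those pages, here is a minimal sketch using a small synthetic dataset. The variable name `ANTENNA_POSITION`, the dimension names, and the antenna names and values are made up for illustration and are not taken from the schema.

```python
import numpy as np
import xarray as xr

# Synthetic dataset: names, dimensions, and values are illustrative only.
toy_xds = xr.Dataset(
    data_vars={
        "ANTENNA_POSITION": (
            ("antenna_name", "cartesian_pos_label"),
            np.arange(9, dtype=float).reshape(3, 3),
        )
    },
    coords={
        "antenna_name": ["ANT_A", "ANT_B", "ANT_C"],
        "cartesian_pos_label": ["x", "y", "z"],
    },
)

# Label-based selection with .sel() uses coordinate values.
by_label = toy_xds.ANTENNA_POSITION.sel(antenna_name="ANT_B")

# Position-based selection with .isel() uses integer indices.
by_index = toy_xds.ANTENNA_POSITION.isel(antenna_name=1)

# Both calls select the same row of the array.
same = bool((by_label == by_index).all())
print(by_label.values, same)
```

Label-based selection is the form that matters most for this review, since the schema favors descriptive strings over integer ids.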

MSv2 and CASA documentation:
- MSv2 schema: https://casacore.github.io/casacore-notes/229.pdf
- MSv3 schema: https://casacore.github.io/casacore-notes/264.pdf

VLBI MSv2 extension: 
https://casacore.github.io/casacore-notes/265.pdf

## `antenna_xds` Schema
The ANTENNA, FEED, GAIN_CURVE, PHASE_CAL, and INTERFEROMETER_MODEL (VLBI, not yet included) tables in the MSv2 contain closely related information, all of which pertains to the antenna. An MSv4 contains a single SPW; consequently, an antenna will only have a single feed associated with it.

Antenna ids are no longer used; antennas are identified by name instead (for antennas that can move, the name consists of the name + "_" + station). This makes it easy to compare MSv4s from different observations.
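
A minimal sketch of that naming rule and of why it helps when comparing observations. The helper `antenna_label` and the antenna and station names are hypothetical; this is not the XRADIO implementation.

```python
from typing import Optional


def antenna_label(name: str, station: Optional[str] = None) -> str:
    """Movable antennas: name + "_" + station; fixed antennas: name only."""
    return f"{name}_{station}" if station else name


# Hypothetical antennas from two observations (not from the example datasets).
obs_a = {antenna_label("DA41", "A001"), antenna_label("DA42", "A002")}
obs_b = {antenna_label("DA41", "A001"), antenna_label("DA42", "A005")}

# Labels can be compared directly across observations, unlike integer ids.
common = sorted(obs_a & obs_b)
differing = sorted(obs_a ^ obs_b)
print(common)     # ['DA41_A001']
print(differing)  # ['DA42_A002', 'DA42_A005']
```

In this toy case the second antenna appears under a different label in each observation because it sits on a different station, which an integer id alone would not reveal.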

## Key Questions to Answer
### Schema Questions
- 1.1) Are there missing use cases?
- 1.2) Is all the information present needed for offline processing?
- 1.3) (VLBI) Instead of storing BASELINE_REFERENCE in main_xds, can we store it in the antenna_xds? This would assume that the reference antennas remain constant for the duration of the MSv4.
- 1.4) Is the order of the dims correct (antenna_id)?
- 1.5) Should BEAM_OFFSET be sky_dir_label (Ra, Dec) or local_sky_label (Az, Alt)?
- 1.6) Should we add prefixes to organize data variables? For example PHASE_DELAY -> VLBI_PHASE_DELAY?
- 1.7) Should we include POLARIZATION_RESPONSE? It doesn't seem to be used.
- 1.8) Can we get rid of POLARIZATION_RESPONSE?
- 1.9) ANTENNA_FEED_OFFSET is a new data variable that is calculated using (Antenna_Table.offset + Feed_Table.position). Is there any need to store these values separately?
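
To make question 1.9 concrete, here is a small sketch of the sum it describes. The numbers are hypothetical and the tuples merely stand in for the MSv2 columns.

```python
# Hypothetical x, y, z values in metres.
antenna_offset = (0.0, 0.0, 1.5)   # stands in for Antenna_Table.offset
feed_position = (0.25, 0.0, 0.5)   # stands in for Feed_Table.position

# Element-wise sum, as described in question 1.9.
antenna_feed_offset = tuple(
    a + f for a, f in zip(antenna_offset, feed_position)
)
print(antenna_feed_offset)  # (0.25, 0.0, 2.0)
```

Once only the sum is stored, the two original terms cannot be recovered from it, which is the trade-off the question asks about.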

### XRADIO
- 2.1) Please check all the data variables (names, dimensions, measures) and coordinates (names, dimensions, measures) in both the Google spreadsheet and the ipynb.
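
One possible way to tabulate names, dimensions, and attributes for comparison against the spreadsheet is sketched below. The helper `summarize_variables` is hypothetical, and the sketch assumes that measure information is stored in each variable's `attrs`; the toy dataset is synthetic.

```python
import numpy as np
import xarray as xr


def summarize_variables(xds: xr.Dataset) -> list:
    """List (kind, name, dims, attrs) for every coordinate and data variable."""
    rows = []
    for name, coord in xds.coords.items():
        rows.append(("coord", str(name), coord.dims, dict(coord.attrs)))
    for name, var in xds.data_vars.items():
        rows.append(("data_var", str(name), var.dims, dict(var.attrs)))
    return rows


# Synthetic dataset for illustration only.
toy_xds = xr.Dataset(
    {"EXAMPLE_VAR": (("antenna_name",), np.zeros(2), {"units": "m"})},
    coords={"antenna_name": ["ANT_A", "ANT_B"]},
)

rows = summarize_variables(toy_xds)
for row in rows:
    print(row)
```

With a real dataset, the same call would be `summarize_variables(ant_xds)` once `ant_xds` has been loaded as shown in the examples.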

# Environment instructions

It is recommended to use the conda environment manager to create a clean, self-contained runtime where xradio and all its dependencies can be installed:

```bash
conda create --name xradio python=3.11 --no-default-packages
conda activate xradio
```

Clone the repository, check out the review branch, and do a local install:

```bash
git clone https://github.com/casangi/xradio.git
cd xradio
git checkout 168-review-ms_xdsattrsantenna_xds-schema-and-xradio-interface
pip install -e .
```

On macOS, python-casacore must be pre-installed using `conda install -c conda-forge python-casacore`.

# Data
Two examples are given: VLBA_TL016B_split_lsrk.ms (VLBI) and uid___A002_X1015532_X1926f.small.ms (Single Dish). You are welcome to change things up by downloading a different dataset or using your own MSv2 dataset. The GraphVIPER function `list_files` will give you a list of available datasets.


In [1]:
from graphviper.utils.data import list_files
list_files()

# VLBI Example

## Download Dataset

In [2]:
from xradio.vis.convert_msv2_to_processing_set import convert_msv2_to_processing_set
from xradio.vis.read_processing_set import read_processing_set
import graphviper

graphviper.utils.data.download(file="VLBA_TL016B_split_lsrk.ms")

[2024-08-23 14:48:50,752] INFO graphviper: Updating file metadata information ...
[2024-08-23 14:48:51,589] INFO graphviper: File exists: VLBA_TL016B_split_lsrk.ms


## Start Dask cluster 
Choose an appropriate number of cores and memory_limit (this is per core).
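
Because the limit is per core, the total memory is the product of the two settings. The sketch below works through the values used in this notebook, assuming that "4GB" is read as decimal gigabytes (10^9 bytes) while the client summary reports binary GiB; that assumption is consistent with the 3.73 GiB per worker and 14.90 GiB total shown in the output.

```python
cores = 4
memory_limit_bytes = 4 * 10**9  # "4GB", assuming decimal gigabytes

# Convert to GiB (2**30 bytes), the unit used in the client summary.
per_worker_gib = memory_limit_bytes / 2**30
total_gib = cores * per_worker_gib

print(f"{per_worker_gib:.2f} GiB per worker, {total_gib:.2f} GiB total")
# 3.73 GiB per worker, 14.90 GiB total
```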

In [3]:
from graphviper.dask.client import local_client

viper_client = local_client(cores=4, memory_limit="4GB")
viper_client

[2024-08-23 14:48:51,651] INFO graphviper: Checking parameter values for client.local_client
[2024-08-23 14:48:51,651] INFO graphviper: Module path: /Users/jsteeb/Downloads/yes/envs/zinc/lib/python3.11//site-packages/
[2024-08-23 14:48:52,411] INFO client: Created client <MenrvaClient: 'tcp://127.0.0.1:51104' processes=4 threads=4, memory=14.90 GiB>


Client summary:
- Connection method: Cluster object
- Cluster type: distributed.LocalCluster
- Dashboard: http://127.0.0.1:8787/status
- Scheduler comm: tcp://127.0.0.1:51104
- Workers: 4, total threads: 4, total memory: 14.90 GiB
- Status: running, using processes: True

| Worker comm | Dashboard | Nanny | Threads | Memory |
|---|---|---|---|---|
| tcp://127.0.0.1:51117 | http://127.0.0.1:51119/status | tcp://127.0.0.1:51107 | 1 | 3.73 GiB |
| tcp://127.0.0.1:51115 | http://127.0.0.1:51120/status | tcp://127.0.0.1:51109 | 1 | 3.73 GiB |
| tcp://127.0.0.1:51118 | http://127.0.0.1:51125/status | tcp://127.0.0.1:51111 | 1 | 3.73 GiB |
| tcp://127.0.0.1:51116 | http://127.0.0.1:51121/status | tcp://127.0.0.1:51113 | 1 | 3.73 GiB |


## Convert Dataset

In [12]:
from xradio.vis.convert_msv2_to_processing_set import convert_msv2_to_processing_set

in_file = "VLBA_TL016B_split_lsrk.ms"
out_file = "VLBA_TL016B_split_lsrk.vis.zarr"

convert_msv2_to_processing_set(
    in_file=in_file,
    out_file=out_file,
    parallel=False,
    overwrite=True,
    phase_cal_interpolate=True,
)

[2024-08-23 14:49:22,915] INFO client: Partition scheme that will be used: ['DATA_DESC_ID', 'OBSERVATION_ID', 'FIELD_ID']
[2024-08-23 14:49:22,933] INFO client: Number of partitions: 4
[2024-08-23 14:49:22,934] INFO client: OBSERVATION_ID [0], DDI [0], STATE [-1], FIELD [0], SCAN [0]
[2024-08-23 14:49:23,577] INFO client: OBSERVATION_ID [0], DDI [0], STATE [-1], FIELD [1], SCAN [0]
[2024-08-23 14:49:23,828] INFO client: OBSERVATION_ID [0], DDI [1], STATE [-1], FIELD [0], SCAN [0]
[2024-08-23 14:49:24,047] INFO client: OBSERVATION_ID [0], DDI [1], STATE [-1], FIELD [1], SCAN [0]
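
The partition count in the log can be reproduced with a small sketch. It assumes that every combination of the partitioning keys occurs in the data, which happens to hold for this dataset; it is an illustration of the counting, not the XRADIO partitioning code.

```python
from itertools import product

# Unique values per partitioning key, as read from the log lines above.
unique_values = {
    "DATA_DESC_ID": [0, 1],
    "OBSERVATION_ID": [0],
    "FIELD_ID": [0, 1],
}

# One partition per combination of key values.
partitions = [
    dict(zip(unique_values, combo))
    for combo in product(*unique_values.values())
]

for partition in partitions:
    print(partition)
print(len(partitions))  # 4
```

Each partition becomes one MSv4 in the processing set, which matches the four rows of the summary in the next section.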


## Inspect Processing Set

In [13]:
import pandas as pd

# Set the maximum number of rows displayed before scrolling
pd.set_option("display.max_rows", 1000)

from xradio.vis.read_processing_set import read_processing_set

ps = read_processing_set("VLBA_TL016B_split_lsrk.vis.zarr")
ps.summary()

| | name | obs_mode | shape | polarization | scan_number | spw_name | field_name | source_name | line_name | field_coords | start_frequency | end_frequency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | VLBA_TL016B_split_lsrk_0 | [obs_0] | (200, 55, 6, 2) | [RR, LL] | [0] | spw_0 | [4C39.25_0] | [Unknown] | [] | [fk5, 9h27m03.01s, 39d02m20.85s] | 5004196000.0 | 5006697000.0 |
| 3 | VLBA_TL016B_split_lsrk_1 | [obs_0] | (540, 55, 6, 2) | [RR, LL] | [0] | spw_0 | [J1154+6022_1] | [Unknown] | [] | [fk5, 11h54m04.54s, 60d22m20.82s] | 5004196000.0 | 5006697000.0 |
| 1 | VLBA_TL016B_split_lsrk_2 | [obs_0] | (200, 55, 6, 2) | [RR, LL] | [0] | spw_1 | [4C39.25_0] | [Unknown] | [] | [fk5, 9h27m03.01s, 39d02m20.85s] | 5068199000.0 | 5070699000.0 |
| 0 | VLBA_TL016B_split_lsrk_3 | [obs_0] | (540, 55, 6, 2) | [RR, LL] | [0] | spw_1 | [J1154+6022_1] | [Unknown] | [] | [fk5, 11h54m04.54s, 60d22m20.82s] | 5068199000.0 | 5070699000.0 |


## Inspect antenna_xds

In [14]:
ant_xds = ps['VLBA_TL016B_split_lsrk_0'].attrs['antenna_xds'].load()
ant_xds

Current dimensions:
- GAIN_CURVE (name, gain_curve_time, receptor_name, poly_term)
- GAIN_CURVE_INTERVAL (name, gain_curve_time)
- GAIN_CURVE_SENSITIVITY (name, gain_curve_time, receptor_name)

Proposal:
- GAIN_CURVE (name, receptor_name, poly_term)
- GAIN_CURVE_INTERVAL (name)
- GAIN_CURVE_SENSITIVITY (name, receptor_name)
- gain_curve_time (name)
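
The proposed layout can be sketched as an xarray dataset filled with placeholder zeros. The antenna and receptor names and the number of polynomial terms are hypothetical, and `gain_curve_time` is modeled here as a non-index coordinate along `name`, which is one possible reading of the proposal.

```python
import numpy as np
import xarray as xr

names = ["ANT_A", "ANT_B"]      # hypothetical antenna names
receptors = ["pol_0", "pol_1"]  # hypothetical receptor names
n_poly = 3                      # hypothetical number of polynomial terms

proposed = xr.Dataset(
    data_vars={
        "GAIN_CURVE": (
            ("name", "receptor_name", "poly_term"),
            np.zeros((2, 2, n_poly)),
        ),
        "GAIN_CURVE_INTERVAL": (("name",), np.zeros(2)),
        "GAIN_CURVE_SENSITIVITY": (("name", "receptor_name"), np.zeros((2, 2))),
    },
    coords={
        "name": names,
        "receptor_name": receptors,
        "poly_term": np.arange(n_poly),
        # One time per antenna instead of a separate gain_curve_time dimension.
        "gain_curve_time": (("name",), np.zeros(2)),
    },
)

for var_name, var in proposed.data_vars.items():
    print(var_name, var.dims)
print("gain_curve_time", proposed["gain_curve_time"].dims)
```

Dropping the `gain_curve_time` dimension means each antenna carries a single gain curve per MSv4, so this layout only works if the gain curve does not change over the duration of the MSv4.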


antenna_xds: antenna, feed, pointing, sys_cal, phase_cal, gain_curve, beam (requires Zernike, polynomial, Airy disk, URL)






In [7]:
ant_xds.GAIN_CURVE

# Single Dish Example

## Download Dataset

In [8]:
from xradio.vis.convert_msv2_to_processing_set import convert_msv2_to_processing_set
from xradio.vis.read_processing_set import read_processing_set
import graphviper

graphviper.utils.data.download(file="uid___A002_X1015532_X1926f.small.ms")

[2024-08-23 14:48:54,464] INFO client: Updating file metadata information ...
[2024-08-23 14:48:55,274] INFO client: File exists: uid___A002_X1015532_X1926f.small.ms


## Convert Dataset

In [9]:
from xradio.vis.convert_msv2_to_processing_set import convert_msv2_to_processing_set

in_file = "uid___A002_X1015532_X1926f.small.ms"
out_file = "uid___A002_X1015532_X1926f.small.vis.zarr"

convert_msv2_to_processing_set(
    in_file=in_file,
    out_file=out_file,
    parallel=True,
    overwrite=True,
)

[2024-08-23 14:48:55,293] INFO client: Partition scheme that will be used: ['DATA_DESC_ID', 'OBS_MODE', 'OBSERVATION_ID', 'FIELD_ID']
[2024-08-23 14:48:55,325] INFO client: Number of partitions: 20
[2024-08-23 14:48:55,326] INFO client: OBSERVATION_ID [0], DDI [0], STATE [10], FIELD [1], SCAN [2 4]
[2024-08-23 14:48:55,327] INFO client: OBSERVATION_ID [0], DDI [0], STATE [11], FIELD [1], SCAN [2 4]
[2024-08-23 14:48:55,328] INFO client: OBSERVATION_ID [0], DDI [0], STATE [12], FIELD [1], SCAN [2 4]
[2024-08-23 14:48:55,329] INFO client:

## Inspect Processing Set

In [10]:
import pandas as pd

# Set the maximum number of rows displayed before scrolling
pd.set_option("display.max_rows", 1000)

from xradio.vis.read_processing_set import read_processing_set

ps = read_processing_set("uid___A002_X1015532_X1926f.small.vis.zarr")
ps.summary()

| | name | obs_mode | shape | polarization | scan_number | spw_name | field_name | source_name | line_name | field_coords | start_frequency | end_frequency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 14 | uid___A002_X1015532_X1926f.small_00 | [CALIBRATE_ATMOSPHERE#OFF_SOURCE, CALIBRATE_WV... | (30, 4, 4, 2) | [XX, YY] | [2, 4] | X780709176#ALMA_RB_07#BB_1#SW-01#FULL_RES_0 | [Jupiter_1] | [Jupiter_1] | [SPW1_-_CO_v=0_3-2(ID=3768104)] | Ephemeris | 345764900000.0 | 345765000000.0 |
| 11 | uid___A002_X1015532_X1926f.small_01 | [CALIBRATE_ATMOSPHERE#AMBIENT, CALIBRATE_WVR#A... | (30, 4, 4, 2) | [XX, YY] | [2, 4] | X780709176#ALMA_RB_07#BB_1#SW-01#FULL_RES_0 | [Jupiter_1] | [Jupiter_1] | [SPW1_-_CO_v=0_3-2(ID=3768104)] | Ephemeris | 345764900000.0 | 345765000000.0 |
| 0 | uid___A002_X1015532_X1926f.small_02 | [CALIBRATE_ATMOSPHERE#HOT, CALIBRATE_WVR#HOT] | (30, 4, 4, 2) | [XX, YY] | [2, 4] | X780709176#ALMA_RB_07#BB_1#SW-01#FULL_RES_0 | [Jupiter_1] | [Jupiter_1] | [SPW1_-_CO_v=0_3-2(ID=3768104)] | Ephemeris | 345764900000.0 | 345765000000.0 |
| 3 | uid___A002_X1015532_X1926f.small_03 | [OBSERVE_TARGET#OFF_SOURCE] | (510, 4, 4, 2) | [XX, YY] | [3, 5] | X780709176#ALMA_RB_07#BB_1#SW-01#FULL_RES_0 | [Jupiter_1] | [Jupiter_1] | [SPW1_-_CO_v=0_3-2(ID=3768104)] | Ephemeris | 345764900000.0 | 345765000000.0 |
| 2 | uid___A002_X1015532_X1926f.small_04 | [OBSERVE_TARGET#ON_SOURCE] | (864, 4, 4, 2) | [XX, YY] | [3, 5] | X780709176#ALMA_RB_07#BB_1#SW-01#FULL_RES_0 | [Jupiter_1] | [Jupiter_1] | [SPW1_-_CO_v=0_3-2(ID=3768104)] | Ephemeris | 345764900000.0 | 345765000000.0 |
| 1 | uid___A002_X1015532_X1926f.small_05 | [CALIBRATE_ATMOSPHERE#OFF_SOURCE, CALIBRATE_WV... | (30, 4, 4, 2) | [XX, YY] | [2, 4] | X780709176#ALMA_RB_07#BB_2#SW-01#FULL_RES_1 | [Jupiter_1] | [Jupiter_1] | [SPW2(ID=3768104)] | Ephemeris | 345029700000.0 | 345076500000.0 |
| 10 | uid___A002_X1015532_X1926f.small_06 | [CALIBRATE_ATMOSPHERE#AMBIENT, CALIBRATE_WVR#A... | (30, 4, 4, 2) | [XX, YY] | [2, 4] | X780709176#ALMA_RB_07#BB_2#SW-01#FULL_RES_1 | [Jupiter_1] | [Jupiter_1] | [SPW2(ID=3768104)] | Ephemeris | 345029700000.0 | 345076500000.0 |
| 15 | uid___A002_X1015532_X1926f.small_07 | [CALIBRATE_ATMOSPHERE#HOT, CALIBRATE_WVR#HOT] | (30, 4, 4, 2) | [XX, YY] | [2, 4] | X780709176#ALMA_RB_07#BB_2#SW-01#FULL_RES_1 | [Jupiter_1] | [Jupiter_1] | [SPW2(ID=3768104)] | Ephemeris | 345029700000.0 | 345076500000.0 |
| 12 | uid___A002_X1015532_X1926f.small_08 | [OBSERVE_TARGET#OFF_SOURCE] | (510, 4, 4, 2) | [XX, YY] | [3, 5] | X780709176#ALMA_RB_07#BB_2#SW-01#FULL_RES_1 | [Jupiter_1] | [Jupiter_1] | [SPW2(ID=3768104)] | Ephemeris | 345029700000.0 | 345076500000.0 |
| 13 | uid___A002_X1015532_X1926f.small_09 | [OBSERVE_TARGET#ON_SOURCE] | (864, 4, 4, 2) | [XX, YY] | [3, 5] | X780709176#ALMA_RB_07#BB_2#SW-01#FULL_RES_1 | [Jupiter_1] | [Jupiter_1] | [SPW2(ID=3768104)] | Ephemeris | 345029700000.0 | 345076500000.0 |


## Inspect antenna_xds

In [11]:
ant_xds = ps['uid___A002_X1015532_X1926f.small_09'].attrs['antenna_xds'].load()
ant_xds