WesterborkCookbook

This document is out of date, and is kept around for historical purposes only

Please refer to CalicoDocumentation for newer documents.

The Many Ways To Skin a (WSRT) Cat With Calico

This document is meant to be a step-by-step recipe for reducing a WSRT dataset with Calico. Readers are HIGHLY ENCOURAGED to extend this with their own experiences and comments. The target audience is the "button-pushing astronomer". Power users who mean to extend Calico scripts with new features, or write new Calico modules, will need their own documentation.

In this example, we will reduce a 1.4GHz observation of 3C147. You can download the measurement set, 3C147_spw0.MS, and a sky model, BAND0.MDL here:

ftp://ftp.astron.nl/pub/smirnov/3C147_spw0.MS.tgz
ftp://ftp.astron.nl/pub/smirnov/BAND0.MDL A complete log of my data reduction experiences can be found in CalibrationEfforts and ["OlegSmirnov/CalibrationEfforts/3C147_Band0"].

Prerequisites

For WSRT data, three Calico scripts are currently provided:

calico-wsrt.py is the workhorse calibration script.
calico-view-ms.py is meant for "quick look" at Measurement Sets, and also to manage flags.
calico-flagger.py is a primitive flagger. I only use it to flag whole channels. For real flagging, I have been using the aips++ autoflag tool (more on this below). In the near future will provide a "wrapper script" around autoflag to make the process easier. All scripts currently reside under Timba/FW/Calico. Place your MSs there (or alternately, copy the scripts to another location.)

Quick Look: calico-view-ms.py

This script is meant for quick look at the data, and for flag management. When you load this script in the browser, you're presented with a standard MS selection menu. The top part of the menu is common to all Calico scripts, so it is documented on a separate page: ["../MSSelectionMenu"]

The last option in the menu, when enabled, inverts the phases when reading the input data (on-the-fly, the MS is not actually changed.) This is not important for quick look, but may be important in calibration.

Once you have set up the MS selection, press "Compile" to build the tree. A menu called "TDL Jobs & Runtime Options" will pop up.

Looking at bookmarks

To see anything useful, you want to load a few Bookmarks from the top-level menu of the browser. Do this before clicking anything in the Runtime Options menu.

The "Inspect input visibilities" bookmark gives you a set of 1D tracks of frequency-averaged data vs. time, per baseline. You probably always want to load this one.
"Input visibilities by baseline" is a set of bookmarks for viewing individual baselines in all their detailed glory. If you're interested in a specific baseline, load the appropriate bookmark.

Running the viewer

Before you run the viewer, you must make sure the "Data selection & flag handling" submenu is set up properly. This is common to all Calico scripts, so it is documented on a separate page: ../DataSelectionMenu.

To look at your raw data, select the DATA column under "Data selection". Now, make sure that "Data description ID" is set to 1 (since we're dealing with spectral window 1 here.) Now select "view MS" and observe the plots.

Preliminary flagging

For starters, read ../FlagHandling. This provides an overview of how MeqTrees and Calico handle flags.

Following the explanation in ["OlegSmirnov/CalibrationEfforts/3C147_Band0"], we want to start by flagging the even channels (0,2,4,...), plus channels 5,7,53,55. This can be done by running the calico-flagger.py script. In compile-time options, select

Using the autoflag tool

For preliminary flagging, you want to flag:

the start and end of the band (e.g. channels 0-2, 58-63). This is somewhat optional, since the "Channel selection" menus throughout Calico allow you to ignore these anyway.
obvious RFI This is easiest to do with the aips++ autoflag tool as follows:
{{{glish -l autoflag.g

af:=autoflag('whatever.MS') # this flags channels 0-2 and 58-63 (glish indices are 1-based!) - af.setselect(chan=1,3],[59,64) # these two will flag obvious outliers - af.settimemed() - af.setuvbin(plotchan=T,nbins=100,thr=.01) # applies to all data - af.setdata() # runs the flagging - af.run(plotdev=[2,2]) }}}

Note the flagreport.ps file that will show up in your current directory. This contains a flagging report that may be useful to study.

If using Hanning tapering (and thus calibrating on every second channel), you also want to flag every even or every odd channel. This is due to a bug in the aips++/casarest imager -- it is unable to use channel stepping propely, so it's easier to just flag the even/odd channels so that the imager ignores them.

NB: you must use calico-flagger.py for this. Insert instructions here.

Transferring flags into a flagset

Once all this is done, you probably want to transfer these flags into a flagset. Read ../FlagHandling for an overview. The basic steps are:

Review flags to make sure they're sensible

Compile calico-view-ms.py
Under "Data selection & flag handling", enable "Read flags from MS" and "Include legacy FLAG column". DISABLE "Fill legacy FLAG column" and "Convert legacy FLAG column into output flagset" -- we want the first run to be read-only.
Disable the "Channel selection" menu. We want to view all channels (even flagged ones.)
Load a few bookmarks.
Click on "view MS".
Study the plots (use right-click to turn flag display on and off).

Write these flags into a flagset

Compile calico-view-ms.py
Under "Data selection & flag handling", enable "Read flags from MS" and "Include legacy FLAG column". Enable "Convert legacy FLAG column into output flagset", enable "Write output flagset". Pick a name for the output flagset, e.g. "BASIC", or "FLAG0", or something like that.
Disable "Fill legacy FLAG column".
Click on "view MS" and wait for the job to finish.

Verify new flagset

Congratulations, you have your first flagset. You may verify this by re-compilingFootNote(currently, scripts need to be recompiled every time a new flagset is created, otherwise the new name is not picked up by the menus) the calico-view-ms.py script, and then

Under "Data selection & flag handling", enable "Read flags from MS", disable "Include legacy FLAG column", enable "Include flagset BASIC" (or whatever you called it.) DISABLE "Fill legacy FLAG column" and "Convert legacy FLAG column into output flagset" -- we want to go back to being read-only.
Load a few bookmarks.
Click on "view MS".
Study the plots (use right-click to turn flag display on and off).
The results should be the same as at step 1 -- but due to different option settings, we are now using the flagset BASIC, rather than the legacy FLAG column.

Calibration proper: initial setup

Now load the calico-wsrt.py script. This is main calibration script. You will want to setup compile-time options as follows:

MS selection is documented in MSSelectionMenu. Pick your MS, select None (i.e. all) for the antenna subset. Select "2x2, diagonal terms only" for the correlation type.
In the "What do we want to do" submenu, enable 'Calibrate', 'Subtract', 'Correct', 'Invert phases in input'. Under the 'Calibrate' submenu, select 'Calibrate on: visibilities'. Set the baseline selection to None -- we want to use all baselines for now.
Under "Measurement Equation options", enable "Include time & bandwidth smearing", and set it to apply to the 100 brightest sources (which may be overkill...)
Under "Sky model", select "MeowLSM". Pick BAND1.MDL as the model file, set it to NEWSTAR format. Under "Use subset of LSM sources" enter "0:19". This will use only the 20 brightest sources for calibration. Uncheck all the other options, but enter something sensible for "Export LSM as Karma annotations file", e.g. 3C147.ann.
Disable "Use E Jones" for now
Enable "B Jones" and "G Jones", select the "FullRealImag" module for both. This adds two Jones terms to your M.E., both of them represented by solvable real and imaginary parts. (The other option -- "DiagAmplPhase" -- represents them by solvable amplitude and phase, which allows one to solve for gains and phases on different timescales, but can converge much slower.)
Disable interfeometer gains & phases for now. You can now press "Compile" to build the tree.

Calibration proper: setting up input data

Under the runtime options menu that pops up, go to "Data selection & flag handling". The full menu is documented in ../DataSelectionMenu and ../FlagHandling. What you need to do here is:

set input column to DATA, output column to CORRECTED_DATA
enable Hanning tapering
enable reading of MS flags, and include flagset BASIC (which we created above)
set data description ID to 1 (for 3C147_spw1.MS. It is an unfortunate "feature" of the WSRT MS splitter scripts we use that the SPECTRALW_WINDOW table gets copied to all MSs, so each MS "thinks" it contains all 8 spectral windows, when in fact each only contains 1 window.)
set channel selection to start 3, end 57, step 2 (this selects the odd channels -- the ones we HAVEN'T flagged above).

Calibrating For G

Time for an actual calibration. Open up the "Calibrate G diagonal terms" menu. What we want to do is solve for the G term (receiver gain), constant in frequency, with one independent solution per each timeslot. This is accomplished by setting:

Tile size: 20 timeslots.
"Select solvabilty by..." should both be disabled, so that all Gs are solved for (these options are only useful when you want to solve for specific subsets.)
Polynomial degree in time and freq is 0.
Solution subinterval is None in freq, and 1 in time (the latter, in combination with the tile size =20 option above, causes 20 timeslots to be read in one chunk, and 20/1=20 independent solutions to be computed simultaneously.)
Under 'Solver options', set 'Convergence threshold' to 1e-6.
Leave everything else at default values. Now, load a few bookmarks. At the least, load "inspector:G" (this will show the G solutions as a function of time), and "Inspect corrected data or residuals" (which will display per-baseline frequency-averaged residuals as a function of time).

Click on "Calibrate G diagonal terms" and wait for the solution to end. If all went well, you should see:

Some reasonable G curves (except for antenna 7, perhaps)
Some reasonable residuals (except for baselines with antenna 7).
Chi-sq should have been converging to the order of 0.001~0.0001.

Calibrating For B

The previous solution only took out the average (over frequency) receiver gain. Now we need to solve for antenna bandpasses. Bandpasses have almost the opposite behaviour to reciever gains: they have significant variation in frequency, but almost none in time. Bandpasses are designated by the B term in our measurement equation.

Note that the B term and the G term are implemented in exactly the same way, the only difference is their time and frequency behaviour. To get the desired behaviour for B, open up the "Calibrate B diagonal terms" menu, and set:

Tile size: 120 timeslots.
"Select solvabilty by..." should both be disabled, so that all Bs are solved for.
Polynomial degree in time and freq is 0.
Solution subinterval is 1 in freq, and None in time. In combination with the tile size =120 option above, this causes an independent solution to be made for each 1 channel and 120 timeslots.
Leave everything else at default values. Now, load a few bookmarks. At the least, load "inspector:G" (this will show your previous G solutions as a function of time), and "Inspect corrected data or residuals" (which will display per-baseline frequency-averaged residuals as a function of time). Also, open up a page or two under "B diagonal terms".

Now click on "Calibrate B diagonal terms" and wait for the solution to end. If all went well, you should see:

Some reasonable B curves (in frequency)
Some reasonable residuals (except for baselines with antenna 7).
Chi-sq should have been converging to the order of 1e-5~1e-6.

FLagging On Residuals

Recalibrating For G

Now that the bandpasses have been fitted, you may want to go back and repeat the G solution, it will be considerably improved.

Making an image

It's time to make a residual image. Go into 'Imaging options', set:

Image type: CORRECTED_DATA
Frequency channels: mfs
Imaging weights: uniform
Stokes parameters: IQUV (or just I)
Image size: 1024 pixels, 120 arcmin.
Image padding factor: 1.2
No w-projection
Use custom MS selection for imaging
- Data description ID: 1
- Channels 3-57, stepping 1 (remember that the imager ignores the stepping argument, which is why we've flagged all the even channels.) Now click on "MAke a dirty image", and wait for the image to come up.

Viewing the image

If you're using the Karma viewer, you can go to Overlay: Load annotations to display the LSM annotation file produced at the previous step. This shows where the model soures are.

More flagging

It is obvious from the image that some more flagging is required. Also, from the residuals seen during the solution, it's clear that all baselines with RT7 are bad. We can flag all baselines with RT7 as follows:

{{{$ glish -l autoflag.g

af:=autoflag('3C147_spw1.MS') - af.setdata() - af.setselect(ant=['RT7']) - af.run() }}}

You can then transfer these flags to a new flagset (call it "RT7". It's handy to have them in a sepaate flagset, so that later we can compare calibration with and without RT7 flagged.) Then, you can flag outliers in the residuals of the previous solution (i.e. the CORRECTED_DATA column):

{{{$ glish -l autoflag.g

af:=autoflag('3C147_spw1.MS') - af.setdata() - af.setuvbin(plotchan=T,nbins=100,thr=.01,column='CORR') - af.settimemed(column='CORR',hw=20,rowhw=20) - af.run(plotdev=[2,2]) }}}

Again check the flagreport.ps file for flag statistics. Transfer these flags to a new flagset, as described above, call it "RESIDUALS1".

Provide feedback

Saved searches

Use saved searches to filter your results more quickly