# Mastering TaQL

*Or: Don't dislike TaQL*

TaQL is part of [casacore](https://github.com/casacore/casacore), a set of libraries for radio astronomy data processing.

TaQL stands for "Table Query Language", but it can also be used without tables.

## Using this notebook

This notebook is written in Jupyter, but it contains a binding to the TaQL kernel. If you highlight a code cell, you can press **Shift-Enter** to evaluate it in TaQL.

You can evaluate all the TaQL commands already present in this notebook. To understand the commands, you can try to predict the outcome of a statement before evaluating it. Also, you are encouraged to change the commands: you can enter any valid TaQL-statement in this notebook.

Navigation in Jupyter notebooks can be tricky. If you are in *command mode* (not editing a cell), lots of keyboard shortcuts are active, like `"j"` and `"k"` for scrolling, `"a"` for inserting a cell, or `"dd"` for deleting one (these should be familiar if you know `vi`). If you want to type in a cell, make sure you are in *edit mode* by checking that you see a blinking cursor.

  * To go to edit mode, press Enter or double click a cell.
  * To go back to command mode, press Esc or single click another cell.

<div class="alert alert-success">**Exercise**: what will happen if you type "`add`" in command mode? Try it to see if you're right.</div>

## TaQL as a calculator

Although it's not the main intended use, you can use TaQL like a regular calculator:

In [135]:
6*7

42


Exponentiation is done using `**` (as in Python). The operator `^` is a bitwise *xor* – if you don't know what that is, just don't use `^`.

In [136]:
4^2

6


In [138]:
4**2

16.0
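
Because TaQL borrows both operators from Python, the difference can also be checked in plain Python. The sketch below is a Python analogue of the two cells above, not TaQL itself; note that TaQL printed `16.0` for `4**2`, whereas Python keeps the result an integer.

```python
# Python analogue of the two TaQL cells above (illustration only, not TaQL).
a, b = 4, 2

# Bitwise xor works on the binary digits: 0b100 xor 0b010 gives 0b110.
xor_result = a ^ b
print(f"{a:03b} xor {b:03b} = {xor_result:03b} (decimal {xor_result})")

# Exponentiation: 4 to the power 2.
power_result = a ** b
print(f"{a} ** {b} = {power_result}")
```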


TaQL can do complex numbers as well:

In [139]:
(3+4i)/(8-1i)

(0.307692307692+0.538461538462j)


Most functions you expect to work are actually there:

In [141]:
sin(pi()/2)

1.0


<div class="alert alert-success">**Exercise**: try some functions and see if they work. Does TaQL respect operator precedence?</div>

### Arrays and indexing

Unlike some other languages, TaQL supports lists (or actually arrays):

In [142]:
[10,34,21,0,-3.4,8]

[ 10. ,  34. ,  21. ,   0. ,  -3.4,   8. ]


The Python way of specifying ranges is used:

In [144]:
[1:10:2]

[1, 3, 5, 7, 9]


A lot of functions exist specifically for lists:

In [145]:
mean([10,34,21,0,-3.4,8])

11.6


In [146]:
stddev([10,34,21,0,-3.4,8])

13.8938835464
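
The two results above can be cross-checked with Python's standard library, which also shows which definition of the standard deviation matches the printed value. This is a check in Python, not a statement about how TaQL computes it internally.

```python
import statistics

values = [10, 34, 21, 0, -3.4, 8]

mean_value = statistics.mean(values)

# Sample standard deviation: divides the summed squared deviations by n - 1.
sample_std = statistics.stdev(values)

# Population standard deviation: divides by n, shown for comparison.
population_std = statistics.pstdev(values)

print(mean_value, sample_std, population_std)
```

The sample standard deviation (dividing by n - 1) is the one that agrees with the `13.8938835464` printed above.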


<div class="alert alert-success">**Exercise**: use the function `sumsqr` (sum of squares) to compute the length of the vector *(3, 4)*.</div>

Indexing arrays works like in Python, with a range specified by `start:end:incr`. For example, `a[4:12:3]` takes the elements at indices 4, 7 and 10. Note that the end is *exclusive*.

Arrays can be created using the Python way of creating **ranges**. `0:5` creates a range `0,1,2,3,4` (so the last number is exclusive). You can use a third part to specify the step size: `0:16:5` creates the range `0,5,10,15`. To make an array with a range, enclose it in square brackets.
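
Since the range and slice rules follow Python, they can be tried out in Python directly. The list `a` below is hypothetical data chosen so that each element reveals its own index.

```python
# Python analogue of TaQL ranges and slices (illustration only, not TaQL).
first_range = list(range(0, 5))       # the end value 5 is excluded
stepped_range = list(range(0, 16, 5)) # the third part is the step size

# Hypothetical data: a[i] == 100 + i, so the values show which indices were picked.
a = list(range(100, 120))
picked = a[2:9:3]                     # indices 2, 5 and 8; index 9 is excluded

print(first_range)
print(stepped_range)
print(picked)
```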

<div class="alert alert-success">**Exercise**: create the array `[13, 17, 21, … 45, 49]` and compute its average.</div>

Higher dimensional arrays are supported:

In [316]:
array([37,1,3,34,5,7], 2, 3)

[[37,  1,  3],
 [34,  5,  7]]


As you see, this yields an array of two rows and three columns.

**Note**: the command line version `taql` prints this as
```
Axis Lengths: [3, 2]  (NB: Matrix in Row/Column order)
[37, 34
 1, 5
 3, 7]
```
In this notebook, it is printed in Column/Row order, just like in Python and C.

Indexing higher-dimensional arrays works with `[…, …]`. Leaving out an indexing expression selects the entire axis.

<div class="alert alert-success">**Exercise**: add an indexing expression in the array below to select the row with values *(37, 1, 3)*. Afterwards, change the indexing expression to select the column with values *(1, 5)*.</div>

In [335]:
array([37,1,3,34,5,7], 2, 3)[]

Error in TaQL command: using style Python SELECT array([37,1,3,34,5,7], 2, 3)[]
  parse error at or near position 56 ']':                                     ^


To take the mean over one axis of a higher dimensional array, use `mean`**`s`** (just like in NumPy):

In [338]:
means(array([37,1,3,34,5,7], 2, 3), 0)

[ 35.5,   3. ,   5. ]


In [339]:
means(array([37,1,3,34,5,7], 2, 3), 1)

[ 13.66667,  15.33333]
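
To see which axis is collapsed in each case, the same averages can be computed by hand. The sketch below uses only the Python standard library and stores the array as a list of rows; it illustrates the axis convention and is not how TaQL implements `means`.

```python
from statistics import mean

# The 2x3 array from the cells above, as a list of rows.
matrix = [[37, 1, 3],
          [34, 5, 7]]

# Axis 0: average down each column, which collapses the two rows.
means_axis0 = [mean(column) for column in zip(*matrix)]

# Axis 1: average along each row, which collapses the three columns.
means_axis1 = [mean(row) for row in matrix]

print(means_axis0)
print(means_axis1)
```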


Most operators and functions act sensibly when you apply them to an array:

In [340]:
3 + [1,2,3]

[4, 5, 6]


In [341]:
sin([pi()/2, pi()/4, pi()/3, pi()/6])

[ 1.     ,  0.70711,  0.86603,  0.5    ]


### Comparison

In [342]:
sqrt(2)/2 == sin(pi()/4)

False


Why is `sin(pi()/4)` not exactly equal to `sqrt(2)/2`? The answer is floating point precision: computers store most real numbers only approximately, so you shouldn't test them for exact equality. That's why TaQL has a function `near` to check whether two numbers are relatively near each other, with a default tolerance of *10<sup>-13</sup>*.

In [343]:
near(sqrt(2)/2, sin(pi()/4), 1.e-13)

True


For testing with a relative tolerance of *10<sup>-5</sup>* (useful for single precision numbers), you can use the shorthand operator `~=`:

In [344]:
sqrt(2)/2 ~= sin(pi()/4)

True
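
The idea of a relative tolerance can be sketched in Python with `math.isclose`. The helper `relatively_near` is a hypothetical name, and the formula in its docstring is an assumption about the general idea behind `near`; casacore's handling of edge cases such as zero may differ.

```python
import math

a = math.sqrt(2) / 2
b = math.sin(math.pi / 4)

def relatively_near(x, y, tol=1e-13):
    """Relative comparison: |x - y| <= tol * max(|x|, |y|).

    Assumption: this mirrors the idea behind TaQL's near(); the exact
    edge-case handling in casacore may differ.
    """
    return math.isclose(x, y, rel_tol=tol, abs_tol=0.0)

print(a == b)                        # exact comparison, fragile
print(relatively_near(a, b))         # tolerance 1e-13, as in near()
print(relatively_near(a, b, 1e-5))   # tolerance 1e-5, as in ~=
```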


### Sets and intervals

In [168]:
[1:10:3]

[1, 4, 7]


In [83]:
3.4 in 1:10

False


In [85]:
3.4 in <3.4,>

False


In [167]:
3.4 in 3.4<:

False


In [89]:
3.4 in {3.4,10>

True


In [221]:
3.4 in 3.4=:<10

True
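
Judging from the results above, `<` and `>` mark an open end of an interval, `{` and `=` mark a closed end, and a missing bound means the interval is unbounded on that side, while `1:10` is a discrete range of integers. The Python sketch below models that reading; the `Interval` class is a hypothetical helper, not part of TaQL or casacore.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Interval:
    """Continuous interval; a bound of None means unbounded on that side."""
    start: Optional[float] = None
    end: Optional[float] = None
    closed_start: bool = False
    closed_end: bool = False

    def __contains__(self, x):
        if self.start is not None:
            if x < self.start or (x == self.start and not self.closed_start):
                return False
        if self.end is not None:
            if x > self.end or (x == self.end and not self.closed_end):
                return False
        return True

# Discrete range, like 1:10: only the integers 1..9 are members.
print(3.4 in range(1, 10))
# Open start, unbounded end, like <3.4,> or 3.4<: the bound itself is excluded.
print(3.4 in Interval(start=3.4))
# Closed start, open end, like {3.4,10> or 3.4=:<10: the start is included.
print(3.4 in Interval(start=3.4, end=10, closed_start=True))
```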


## Units

TaQL has basic support for units, even obscure ones.

In [270]:
4m + 3in

4.0762 m


SI prefixes like `p`, `n`, `u` (for µ), `m`, `c`, `d`, `da`, `h`, `k`, `M`, `G`, `T` can be used.
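
Adding quantities with different units amounts to converting everything to a common base unit first. The Python sketch below reproduces the `4m + 3in` example; the lookup tables and the function `to_meters` are hypothetical helpers covering only a few units and prefixes, and they say nothing about how casacore stores units internally.

```python
# Conversion factors to meters; the inch is exactly 0.0254 m by definition.
BASE_UNITS_IN_METERS = {"m": 1.0, "in": 0.0254}

# A few SI prefixes ("u" stands for micro, as in TaQL).
SI_PREFIXES = {"n": 1e-9, "u": 1e-6, "m": 1e-3, "c": 1e-2, "k": 1e3}

def to_meters(value, unit, prefix=""):
    """Convert a value with an optional SI prefix to meters."""
    scale = SI_PREFIXES[prefix] if prefix else 1.0
    return value * scale * BASE_UNITS_IN_METERS[unit]

total = to_meters(4, "m") + to_meters(3, "in")
print(round(total, 4), "m")
```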

<div class="alert alert-success">**Exercise**: Evaluate with TaQL whether a *millifoot* is in the open interval between 100 and 200 *nano-mile* (this is an accidental feature of TaQL).</div>

Unit support is not perfect: reduction of units does not work.

In [291]:
200m/200m

1.0 m/(m)


In [294]:
1/1s+1Hz

Error in TaQL command: 'using style Python SELECT 1/1s+1Hz'
  Error in select expression: Units s and Hz do not conform

Some checking of units is performed:

In [293]:
sqrt(3km)

Error in TaQL command: 'using style Python SELECT sqrt(3km)'
  Error in select expression: Erronous use of function sqrt - UnitVal::UnitVal Illegal unit dimensions for root

### Angles

Angles can be given in `h:m:s` or `d:m:s` format, or in radians or degrees:

In [347]:
4h56m03.5 + 4d12m43.7 + 1 deg - 0.3 rad

1.08276715834 rad


If you want the result in a different unit, append that unit to an expression:

In [33]:
(4h56m03.5 + 4d12m43.7 + 1 deg - 0.3 rad) deg

62.0379883683 deg


It also helps to know that the unit of an expression will be the same as the unit of its first component:

In [348]:
0deg + 4h56m03.5 + 4d12m43.7 + 1 deg - 0.3 rad

62.0379883683 deg
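
The arithmetic behind these results can be followed step by step in Python: an hour angle is converted at 15 degrees per hour, and everything is summed in radians. The function names are hypothetical, and this is a sketch of the unit conversions only, not of TaQL's parser.

```python
import math

def hms_to_rad(hours, minutes, seconds):
    """Time-style angle: 24 h is a full circle, so 1 h equals 15 degrees."""
    total_hours = hours + minutes / 60 + seconds / 3600
    return math.radians(total_hours * 15)

def dms_to_rad(degrees, minutes, seconds):
    """Sexagesimal degrees (non-negative input only, for brevity)."""
    return math.radians(degrees + minutes / 60 + seconds / 3600)

# 4h56m03.5 + 4d12m43.7 + 1 deg - 0.3 rad
total_rad = (hms_to_rad(4, 56, 3.5) + dms_to_rad(4, 12, 43.7)
             + math.radians(1) - 0.3)
total_deg = math.degrees(total_rad)

print(total_rad, "rad")
print(total_deg, "deg")
```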


To format an angle in hours, minutes and seconds, use the function `hms()`. Similarly, to format it in degrees, minutes, seconds, use `dms()`. To format an array with RA-DEC values, use `hdms()`, which formats even elements with `hms()` and odd elements with `dms()`.

<div class="alert alert-success">**Exercise**: put the coordinates of Westerbork, *(6.60417°, 52.91692°)* in an array, and format it in the conventional RA-DEC notation.</div>

Functions for calculations with angles are built in, for example for computing the angular distance between two positions:

In [365]:
angdist([6.60417, 52.91692] deg, [0, 90] deg) deg

37.08308 deg
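
For positions given as two angles, the angular distance is the great-circle separation on the sphere. The Python sketch below uses a numerically stable `atan2` formulation; it is one common way to compute this, and the formula casacore actually uses is not stated here.

```python
import math

def angular_distance(lon1, lat1, lon2, lat2):
    """Great-circle separation of two (lon, lat) positions, all in radians.

    Uses the atan2 form, which stays accurate for small and large angles.
    """
    dlon = lon2 - lon1
    y = math.hypot(
        math.cos(lat2) * math.sin(dlon),
        math.cos(lat1) * math.sin(lat2)
        - math.sin(lat1) * math.cos(lat2) * math.cos(dlon),
    )
    x = (math.sin(lat1) * math.sin(lat2)
         + math.cos(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.atan2(y, x)

# The example above: Westerbork's coordinates against the pole.
separation_deg = math.degrees(angular_distance(
    math.radians(6.60417), math.radians(52.91692),
    math.radians(0.0), math.radians(90.0)))
print(separation_deg, "deg")
```

Because the second position is the pole, the result is simply 90° minus the latitude of the first position.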


### Dates

![ISO 8601 was published on 06/05/88 and most recently amended on 12/01/04.](https://imgs.xkcd.com/comics/iso_8601.png "XKCD 1179")

Literal dates can be entered directly into TaQL, for example using the above ISO standard (which was introduced after the first version of casacore).

In [185]:
1981-04-01

1981/04/01


Values can be converted to dates with the function `date()` or `datetime()`. Without arguments, this gives the current date (or date + time).

In [186]:
date(0.)

1858/11/17


As you can guess from the above, dates are internally stored as Modified Julian Date (MJD), which counts days since 17 November 1858.
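
The relation between a day number and a calendar date can be sketched with Python's `datetime` module. The helper names are hypothetical, and only whole days are handled.

```python
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)  # day 0 of the Modified Julian Date

def mjd_to_date(mjd):
    """Calendar date for a whole-number Modified Julian Date."""
    return MJD_EPOCH + timedelta(days=mjd)

def date_to_mjd(day):
    """Whole-number Modified Julian Date for a calendar date."""
    return (day - MJD_EPOCH).days

print(mjd_to_date(0))
print(date_to_mjd(date(2000, 1, 1)))
```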

To convert a date to a pretty-printed date, you can use `cdate()`:

In [187]:
cdate(date(0.))

17-Nov-1858


Similarly for showing times there is `ctime()`, and for showing both date and time there is `cdatetime()`.

Calculations on dates work like you would expect:

In [189]:
date() - 1981-01-04

12808.0 d


<div class="alert alert-success">**Exercise**: when were you 10,000 days old?</div>

### Times

The function `time()` gives the time (current time if no arguments given) in *radians*. This makes it possible to write times in the same way as angles: 

In [194]:
time() > 12h38m

True


<div class="alert alert-success">**Exercise**: check that `datetime() - date()` (which gives a result in days) is consistent with `time()`.</div>

To convert a time to a string, use the function `ctime()` (remember it as "*see time*"), or `cdatetime()` to include the date.

In [199]:
ctime(5000 s)

01:23:20.000
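
Both ideas in this section, a time of day expressed as an angle and a time formatted as a string, can be sketched in a few lines of Python. The function names are hypothetical, and the output format is modelled on the `ctime` result above.

```python
import math

def format_time(seconds):
    """Format a number of seconds since midnight as hh:mm:ss.mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"

def time_to_radians(hours, minutes=0, seconds=0.0):
    """A full day (24 h) maps onto a full circle (2*pi radians)."""
    return (hours + minutes / 60 + seconds / 3600) / 24 * 2 * math.pi

print(format_time(5000))
print(time_to_radians(12, 38))
```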


## Measures

The prefix `meas.` is for functions linking to casacore's *measures* library. These functions make it possible to convert measures like directions, epochs, and positions from one reference frame to another.

### Times

To do really accurate computations with times, one should use Measures. When you specify a time, it is interpreted with respect to the `UTC` frame (Coordinated Universal Time). To convert to a different frame, e.g. `TAI` (International Atomic Time), use `meas.epoch`:

In [200]:
cdatetime(meas.epoch("TAI", 2016-01-28 15:00:00, "UTC"))

['2016/01/28/15:00:36.000']


Since the default time frame is `UTC`, it may be omitted.

As you see, there is a discrepancy between `UTC` and `TAI`. This is due to leap seconds. Leap seconds are announced only about half a year in advance (for example, here's the [announcement](ftp://hpiers.obspm.fr/iers/bul/bulc/bulletinc.49) for 2015's leap second). This is one of the reasons that you sometimes get warnings if your casacore data directory is out of date.

In [201]:
meas.epoch("TAI","30-Jun-2015")-meas.epoch("UTC","30-Jun-2015")

[ 35.] s


In [202]:
meas.epoch("TAI","01-Jul-2015")-meas.epoch("UTC","01-Jul-2015")

[ 36.] s


As you see, a leap second was inserted in `UTC` between June and July 2015. Leap seconds are not applied in the `TAI` standard; otherwise the two standards are the same.
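
The conversion is essentially a table lookup, which is why an outdated table gives wrong answers. The Python sketch below hard-codes only three entries of the TAI minus UTC offset; the table and function names are hypothetical, and a real table (such as the one in the casacore data directory) goes back much further and must be kept up to date.

```python
from datetime import date, datetime, timedelta

# TAI - UTC in whole seconds, valid from the given UTC date onward.
# Only a few entries are listed here; this is deliberately incomplete.
TAI_MINUS_UTC = [
    (date(2012, 7, 1), 35),
    (date(2015, 7, 1), 36),
    (date(2017, 1, 1), 37),
]

def tai_minus_utc(utc_day):
    """Look up the offset that applies on the given UTC date."""
    offset = None
    for start, seconds in TAI_MINUS_UTC:
        if utc_day >= start:
            offset = seconds
    if offset is None:
        raise ValueError("date is before the first table entry")
    return offset

def utc_to_tai(utc):
    """Shift a UTC datetime to TAI using the table above."""
    return utc + timedelta(seconds=tai_minus_utc(utc.date()))

print(tai_minus_utc(date(2015, 6, 30)), tai_minus_utc(date(2015, 7, 1)))
print(utc_to_tai(datetime(2016, 1, 28, 15, 0, 0)))
```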

<div class="alert alert-success">**Exercise**: calculate the number of seconds between `1997-01-01 00:00 UTC` and `2000-01-01 00:00 UTC`, and explain why the answer is *not* `94608000 s`.</div>

Available time frames are:

`"LAST"` (Local Apparent Sidereal Time), `"LMST"` (Local Mean Sidereal Time), `"GMST1"` (Greenwich Mean ST1), `"GAST"` (Greenwich Apparent Sidereal Time), `"UT1"`, `"UT2"`, `"UTC"`, `"TAI"`, `"TDT"`, `"TCG"`, `"TDB"`, `"TCB"`

### Positions

Positions on Earth must be given with respect to a reference frame. Two important reference frames are `WGS84` and `ITRF`. Positions can be converted between reference frames with the function `meas.position` (or `meas.pos`).

In [205]:
meas.position("ITRF", [6.60417, 52.91692] deg, "WGS")

[ 3828485.54946,   443253.26237,  5064974.012  ] m


Since `WGS` is the default, it may be omitted.
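
The conversion from longitude and latitude to Cartesian coordinates can be sketched with the textbook formula for the WGS84 ellipsoid. The Python below assumes a height of zero above the ellipsoid and treats the resulting Earth-centred coordinates as a stand-in for `ITRF`; both are simplifying assumptions, and casacore's own conversion is not reproduced here.

```python
import math

WGS84_A = 6378137.0            # semi-major axis in meters
WGS84_F = 1 / 298.257223563    # flattening

def geodetic_to_ecef(lon_deg, lat_deg, height_m=0.0):
    """Earth-centred x, y, z in meters for a geodetic position."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    e2 = WGS84_F * (2 - WGS84_F)   # first eccentricity squared
    # Radius of curvature in the prime vertical at this latitude.
    n = WGS84_A / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + height_m) * math.sin(lat)
    return x, y, z

# Longitude and latitude used in the cell above.
print(geodetic_to_ecef(6.60417, 52.91692))
```

Under these assumptions the result should land close to the values printed by `meas.position` above.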

The positions of most radio telescopes are predefined:
`"ALMA"`, `"ARECIBO"`, `"ATCA"`, `"BIMA"`, `"CLRO"`, `"DRAO"`, `"DWL"`, `"GB"`, `"GBT"`, `"GMRT"`, `"IRAM PDB"`, `"IRAM_PDB"`, `"JCMT"`, `"MOPRA"`, `"MOST"`, `"NRAO12M"`, `"NRAO_GBT"`, `"PKS"`, `"SAO SMA"`, `"SMA"`, `"VLA"`, `"VLBA"`, `"WSRT"`, `"ATF"`, `"ATA"`, `"CARMA"`, `"ACA"`, `"OSF"`, `"OVRO_MMA"`, `"EVLA"`, `"ASKAP"`, `"APEX"`, `"SMT"`, `"NRO"`, `"ASTE"`, `"LOFAR"`, `"MeerKAT"`, `"KAT-7"`, `"EVN"`, `"LWA1"`, `"PAPER_SA"`, `"PAPER_GB"`, `"e-MERLIN"`, `"MERLIN2"`, `"Effelsberg"`, `"MWA32T"`, `"AMI-LA"`

By default, `meas.position` returns Cartesian coordinates in meters from the origin. By appending `LL` to the code for the frame, you get longitude and latitude instead.

In [207]:
meas.position("WGSLL", "WSRT") deg

[  6.60417,  52.91692] deg


<div class="alert alert-success">**Exercise**: compute the angular distance between ALMA and MeerKAT.</div>

### Directions

Casacore knows a lot of reference frames. Conversions between them are done with `meas.dir`:

In [95]:
meas.dir("GALACTIC", [-6h52m36.7, 34d25m56.1], "J2000")

[ 1.00291,  0.61843] rad


Since `J2000` is the default, it may be omitted.

Several directions have been predefined, like all the planets, the sun and the moon, and standard sources (`"CasA"`, `"CygA"`, `"TauA"`, `"VirA"`, `"HerA"`, `"HydA"`, `"PerA"`).

If you want to convert to a coordinate frame which is tied to the Earth, it is necessary to also specify a time and a position.

In [96]:
meas.dir("AZEL", "Jupiter", datetime(), "WSRT")

[ 1.38195, -0.05051] rad


The reference frames of the time and of the position can be given explicitly (and should be if they are not `UTC` and `WGS84`, respectively):

In [97]:
meas.dir("AZEL", "Jupiter", 2000-01-01 00:00, "TAI", 
        [3826577.110, 461022.900, 5064892.758] m, "ITRF")

[-1.57846,  0.19452] rad


Supported reference frames include `J2000`, `GALACTIC` and `AZEL`, all used in the examples above.


There is a special function to see when a source will be visible on a given day:

In [366]:
meas.riseset("SUN", date(), [6.60417, 52.91692] deg)

[30-Jan-2016/07:21:18, 30-Jan-2016/16:12:19]


<div class="alert alert-success">**Exercise**: when will Cassiopeia A rise tomorrow?</div>

## Tables

In [None]:
select mean(DATA[FLAG]) as mymean, WEIGHT from ~/projects/tim/tim.MS limit 2

Columns

limit, offset, etc

Storing the output

### Using groupby

## Structure of a Measurement Set

Example with subquery

Example with mscal

## Baseline selection syntax