# WALLABY Database Access Notebook

<span style="font-weight: bold; color: #FF0000;">⚠ Make sure the Jupyter Notebook server is loaded with the wallaby/python-3.9.1 module!</span>

## Connect to Database

The first step is to connect to the WALLABY database by importing the `wallaby_data_access` module (aliased here as `wallaby`) and calling the `wallaby.connect()` function. This connects you to the database using a generic WALLABY user account and provides read access to all data tables.

In [None]:
import wallaby_data_access as wallaby
wallaby.connect()

## Retrieve Catalogue

Once you are connected, you can then use the `wallaby.get_catalog()` function to retrieve the source catalogue as an Astropy table object. Catalogues are retrieved by tag, where tags define different collections of sources, e.g. all sources from a specific data release. The following tags are currently supported:

In [None]:
wallaby.print_tags()

As an example, let us retrieve all sources from the phase 2 pilot observations included in the NGC 5044 DR1 release by supplying the `NGC 5044 DR1` tag to the `wallaby.get_catalog()` function:

In [None]:
# Retrieve catalogue as Astropy table
from astropy.table import Table
table = wallaby.get_catalog("NGC 5044 DR1")

# Sort table by flux (brightest first)
table.sort("f_sum", reverse=True)

# Print table
table.pprint()

The source catalogue returned by the function should have been printed above (if not, check for error messages) and is stored in the variable `table`. We can now use basic indexing to access different catalogue entries. For example, `table["f_sum"]` will return the entire column of integrated flux measurements, and we can use `table["f_sum"][0]` etc. to extract the individual fluxes for each source. Likewise, `table[0]` will extract the entire first row of the catalogue, i.e. a list of all parameters of the first source.

## Calculate Physical Parameters

The next example demonstrates how to retrieve certain parameters from the catalogue and use basic arithmetic to convert some of the raw measurements made by SoFiA into physically meaningful parameters such as redshift or HI mass. These can be directly appended to the catalogue as additional columns using `table["parameter_name"] = <expression>`.
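For reference, the conversions applied in the following cell can be written out as below. Here $f_{\rm obs}$ is the catalogue `freq` column in Hz, $S$ is `f_sum`, $D_{\rm L}$ is the luminosity distance and $w_{20}$ is the observed line width in Hz. The units of Jy Hz for `f_sum` are an assumption implied by the factor 49.7 rather than something stated in this notebook.

```latex
\begin{aligned}
z &= \frac{f_{\rm rest}}{f_{\rm obs}} - 1, \\
\frac{M_{\rm HI}}{M_{\odot}} &= 49.7 \left(\frac{D_{\rm L}}{\rm Mpc}\right)^{2} \frac{S}{\rm Jy\,Hz}, \\
\Delta v &= \frac{c \, (1 + z) \, w_{20}}{f_{\rm rest}}
\end{aligned}
```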

In [None]:
import numpy as np
import scipy.constants as const
from astropy.cosmology import FlatLambdaCDM

# Set up HI rest frequency and cosmology
f_rest = 1.42040575e+9  # HI rest frequency in Hz
cosmo = FlatLambdaCDM(H0=70, Om0=0.3, Tcmb0=2.725)

# Calculate redshift
table["redshift"] = f_rest / table["freq"] - 1.0

# Calculate luminosity distance in Mpc and HI mass in solar masses
table["dl"] = cosmo.luminosity_distance(table["redshift"]).value
table["log_mhi"] = np.log10(49.7 * table["dl"] * table["dl"] * table["f_sum"])

# Calculate source rest frame velocity width in km/s
table["dv"] = const.c * (1.0 + table["redshift"]) * table["w20"] / f_rest / 1000.0

# Show our new parameters
table["name", "id", "redshift", "dl", "log_mhi", "dv"].pprint(max_width=-1)
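As a quick sanity check, the same arithmetic can be carried out by hand for a single source. The sketch below uses only the standard library and invented input values (`freq_hz`, `w20_hz`, `f_sum` and `dl_mpc` are hypothetical, not taken from the catalogue); the luminosity distance is supplied directly rather than computed from a cosmology.

```python
import math

F_REST_HZ = 1.42040575e9   # HI rest frequency in Hz (as in the cell above)
C_M_PER_S = 299792458.0    # speed of light in m/s

# Hypothetical measurements for one source (illustrative values only)
freq_hz = 1.40e9    # observed frequency in Hz
w20_hz = 1.0e6      # observed line width in Hz
f_sum = 5000.0      # integrated flux, assumed to be in Jy Hz
dl_mpc = 63.0       # luminosity distance in Mpc, assumed rather than computed

# Same expressions as applied to the table columns
redshift = F_REST_HZ / freq_hz - 1.0
log_mhi = math.log10(49.7 * dl_mpc**2 * f_sum)
dv_kms = C_M_PER_S * (1.0 + redshift) * w20_hz / F_REST_HZ / 1000.0

print(f"z = {redshift:.4f}, log10(M_HI/M_sun) = {log_mhi:.2f}, dv = {dv_kms:.1f} km/s")
```

Because $(1 + z) / f_{\rm rest} = 1 / f_{\rm obs}$, the velocity width reduces to $c \, w_{20} / f_{\rm obs}$, which gives an easy cross-check on the result.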

## Create a Plot

Once we’ve done our analysis, we can then create plots of any of the parameters in our table. In this example, let us plot the logarithmic HI mass against redshift and additionally colour the data points by source rest frame velocity width. If desired, the resulting plot can be exported as a PDF file and then downloaded to your local computer, e.g. to use in a presentation or publication.

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

plt.rcParams["figure.figsize"] = (14, 8)
plt.rcParams["font.size"] = 16

plt.scatter(table["redshift"], table["log_mhi"], s=16, c=table["dv"], cmap="jet")
plt.xlabel(r"$z$")
plt.ylabel(r"$\log_{10}(M_{\rm HI} / M_{\odot})$")
cbar = plt.colorbar()
cbar.set_label(r"$\Delta v \; (\mathrm{km \, s}^{-1})$")
plt.xlim(0.0, 0.1)
plt.ylim(7.0, 11.0)
plt.grid(True)

# Uncomment the following line to make a PDF copy in the notebook folder for download
#plt.savefig("my_plot.pdf", format="pdf", bbox_inches="tight", pad_inches=0.05)

plt.show()

## Filter the Catalogue

Once we have the catalogue loaded into an Astropy table object, we can easily make selections to suit our scientific needs. The following examples illustrate how the catalogue can be filtered by certain criteria such as parameter ranges or the presence of comments and tags.

**Example 1: Filter sources by parameter range**

In [None]:
# Select all sources within a certain RA and Dec range

mask = (table["ra"] > 202.0) & (table["ra"] < 203.0) & (table["dec"] > -22.5) & (table["dec"] < -21.5)
table[mask].pprint()

**Example 2: Filter sources tagged as components of a galaxy**

In [None]:
# Select all sources that have the "Component" tag set

mask = ["Component" in tags for tags in table["tags"]]
table[mask]["name", "id", "tags"].pprint_all()

**Example 3: Filter sources that have comments attached**

In [None]:
# Select all sources with at least one comment

mask = [len(comments) > 0 for comments in table["comments"]]
table[mask]["name", "id", "comments"].pprint_all()
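List-based masks such as those in Examples 2 and 3 can be combined with the comparison-based mask from Example 1 once they are converted to NumPy boolean arrays. The sketch below demonstrates this on invented arrays standing in for the `ra`, `dec` and `tags` columns. It assumes each `tags` entry is a list of strings. With the real catalogue, `table[mask]` would take the place of the final index lookup.

```python
import numpy as np

# Hypothetical stand-ins for catalogue columns (values invented for illustration)
ra = np.array([202.3, 202.8, 204.1, 202.5])
dec = np.array([-22.0, -21.7, -22.1, -23.0])
tags = [["Component"], [], ["Component"], ["Component"]]

# Comparison-based mask: already a NumPy boolean array
in_region = (ra > 202.0) & (ra < 203.0) & (dec > -22.5) & (dec < -21.5)

# List-comprehension mask: convert to an array so that & and ~ work element-wise
is_component = np.array(["Component" in t for t in tags])

# Sources inside the region that are also tagged as components
mask = in_region & is_component
selected = np.flatnonzero(mask)
print(selected)
```

Inverting a condition works the same way, e.g. `in_region & ~is_component` selects sources in the region that do not carry the tag.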

## Create Overview Plot for a Specific Source

It is also possible to display an overview plot of a specific source (as identified by its catalogue ID) by calling the `wallaby.overview_plot()` function. The function displays four panels showing the moment 0 and moment 1 maps, a DSS image with HI contours, and the integrated spectrum of the source. **Note that it may take up to half a minute before the plot is displayed, as Astropy must first download the DSS image from SkyView.** If the SkyView download fails, which happens occasionally, try again a few hours later.

In [None]:
%matplotlib inline

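# Suppress all warnings for the rest of the session
import warnings
warnings.filterwarnings("ignore")

# Create the overview plot for the source with catalogue ID 4713
# (note that this rebinds the name plt to the return value of overview_plot)
plt = wallaby.overview_plot(id=4713)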

# Uncomment the following line to make a PDF copy in the notebook folder for download
#plt.savefig("my_plot.pdf", format="pdf", bbox_inches="tight", pad_inches=0.05)

plt.show()