In [1]:
from IPython.display import HTML
HTML('../style/course.css')
HTML('../style/code_toggle.html')

## Predicting HI Detections

### Overview

* [Introduction: HI Surveys](#Introduction-Surveys)
* [Essential Survey Parameters for Making Predictions](#Essential-Parameters)
  * [Survey Sensitivity](#Survey-Sensitivity)
  * [Survey Volume](#Survey-Volume)
* [Predicting Galaxy Detections](#Predicting-Detections)
* [More Advanced Predictions: Resolved Galaxy Number Counts](#Advanced-Resolved)

### Introduction: HI Surveys

The next decade will be an exciting one for HI science, since several major new surveys of cosmological volumes on SKA precursor facilities are underway or set to begin in 2017. These range from wide-field, relatively shallow surveys of the local Universe in emission and the distant one in absorption ([WALLABY on ASKAP](http://www.atnf.csiro.au/research/WALLABY/), [FLASH on ASKAP](http://www.caastro.org/research/evolving/flash), [Apertif Shallow Northern Survey on WSRT](http://www.astron.nl/astronomy-group/apertif/surveys-and-documents/apertif-surveys-and-documents), [MALS on MeerKAT](http://mals.iucaa.in/)) to deeper *pencil beam* surveys that will detect gas in and around galaxies out to $z \approx 1$ ([CHILES on the VLA](http://chiles.astro.columbia.edu/), [LADUMA on MeerKAT](http://www.ast.uct.ac.za/laduma/node/6), [DINGO on ASKAP](http://internal.physics.uwa.edu.au/~mmeyer/dingo/welcome.html)).

A common feature of the majority of the surveys mentioned above is that they are *blind*: they will uniformly survey patches of sky in which the HI content is not a priori known at the requisite sensitivity and resolution (this is, of course, why one would want to survey there in the first place!). Predicting the number and nature of detections anticipated in these surveys is therefore critical for estimating the scientific returns from these large investments of limited telescope time, the impact of modifications to survey plans, and the resources required to analyse the resulting datasets (e.g. Duffy et al. 2012; Maddox et al. 2015; Giovanelli & Haynes 2016). Such predictions can also be useful a posteriori, to understand the properties of survey detections and to control for systematics.

This chapter provides general guidelines for predicting HI detections for blind HI surveys of cosmological volumes. All reliable predictions require a thorough understanding of the planned survey's basic parameters. With these parameters in hand, we discuss predicting survey detections using the HI mass function, with a mention of other techniques. We then turn to predicting resolved galaxy properties.

### Essential Survey Parameters for Making Predictions

The key ingredients for making robust predictions of HI detections in a given survey are the survey sensitivity and volume. In this section, we discuss the essential parameters that determine these quantities. Several excellent resources exist that describe the fundamental properties of radio telescopes, both in print (e.g. Thompson, Moran and Swenson 2001) and online (e.g. [Essential Radio Astronomy](http://www.cv.nrao.edu/course/astr534/PDFnew.shtml), [Radio Astronomy Tools and Techniques](https://casper.berkeley.edu/astrobaki/index.php/Radio_Astronomy:_Tools_and_Techniques)). Here, we tailor the more general (and complete) discussion in those resources to the specific case of HI survey predictions.

#### Survey Sensitivity

A survey's sensitivity is determined by a combination of the observational strategy and the instrument characteristics: the longer one integrates and the more sensitive the instrument, the more sensitive the resulting observation. If the observations are not limited by either astronomical confusion (e.g. Condon et al. 2012) or instrument systematics (e.g. Grobler et al. 2014), the defining equation for the point-source sensitivity ($\sigma_{\rm{PS}}$, in Jy) per spectral channel of a radio interferometer at the pointing centre is the radiometer equation: $$\sigma_{\rm{PS}} = \frac{SEFD}{n_c \sqrt{n_{pol} \, N (N-1) \, t_{int} \, \delta_\nu}} ,$$ where $SEFD$ (in Jy) is the system equivalent flux density of an individual antenna (defined as the flux density of a radio source that doubles the system temperature), $n_c$ is the correlator efficiency ($n_c \sim 1$ for modern systems), $n_{pol}$ is the number of polarization products included in the image ($n_{pol}=2$ for the vast majority of HI surveys), $N$ is the number of antennas, $t_{int}$ (in seconds) is the net integration time of the observation, and $\delta_\nu$ (in Hz) is the spectral channel width. This equation illustrates that the noise of an observation scales linearly with the antenna sensitivity (i.e. the SEFD), and with the inverse square root of both the observing time and the spectral channel width. For a given instrument configuration, estimates of $\sigma_{PS}$ and the assumed spectral channel width $\delta_\nu$ are essential for predicting HI survey detections. Clearly, smoothing the data to a lower spectral resolution (larger $\delta_\nu$) reduces the noise per channel and so increases a survey's sensitivity.
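As a rough illustration of the scalings described above, the sketch below evaluates the radiometer equation in its standard form for an $N$-antenna interferometer, $\sigma_{PS} = SEFD / (n_c \sqrt{n_{pol} N (N-1) t_{int} \delta_\nu})$. The function name and the array parameters in the example (36 antennas, SEFD of 2000 Jy, 8 hours, 20 kHz channels) are hypothetical and are not taken from any particular survey.

```python
import math

def point_source_sensitivity(sefd_jy, n_antennas, t_int_s, chan_width_hz,
                             n_pol=2, corr_eff=1.0):
    """Per-channel point-source rms noise (Jy) at the pointing centre.

    Assumes identical antennas and noise that is purely thermal
    (no confusion or instrumental systematics).
    """
    baseline_term = n_antennas * (n_antennas - 1)
    return sefd_jy / (
        corr_eff * math.sqrt(n_pol * baseline_term * t_int_s * chan_width_hz)
    )

# Hypothetical array and observation, chosen only for illustration.
sigma_jy = point_source_sensitivity(
    sefd_jy=2000.0, n_antennas=36, t_int_s=8 * 3600.0, chan_width_hz=20e3
)
print(f"sigma_PS = {1e3 * sigma_jy:.2f} mJy per 20 kHz channel")

# Quadrupling the integration time should halve the noise.
ratio = sigma_jy / point_source_sensitivity(2000.0, 36, 32 * 3600.0, 20e3)
print(f"noise ratio after 4x longer integration: {ratio:.2f}")
```

Keeping every quantity in the units used in the text (Jy, seconds, Hz) avoids hidden conversion factors; any smoothing in frequency can be modelled simply by passing a larger `chan_width_hz`.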

It is important to recognize that the equation above provides the instrument sensitivity per synthesized beam per spectral channel; the flux from sources that are spatially or spectrally resolved relative to this beam or channel width will therefore be distributed across more than one pixel of the resulting dataset. In particular, the spatial scales to which an interferometer is sensitive are determined by the distribution of its antennas and the weighting applied during imaging (see [here](https://github.com/griffinfoster/fundamentals_of_interferometry/blob/edd3120cb0f2a3a62ecf8b307b282fd4dfa82756/5_Imaging/5_0_introduction.ipynb) for a detailed derivation). The antenna configuration therefore determines the angular resolution (and therefore the synthesized beam) of an observation, which is required to determine the likelihood that survey detections will be spatially resolved.

The column density sensitivity of a survey dictates the detectability of spatially resolved sources (see Chapter 4.1). The column density sensitivity ($\sigma_{NHI}$, in $\rm{atoms} \, \rm{cm}^{-2}$) per spectral channel of an observation is given by: $$\sigma_{NHI} = 2.23 \times 10^{24} \frac{\sigma_{PS} \, \delta_\nu}{\theta_a \, \theta_b \, \nu_c^2} ,$$ where $\theta_a$ and $\theta_b$ are the full widths at half-maximum of the Gaussian beam along its major and minor axes, respectively (typically $\theta_a$=$\theta_b$ for planar arrays), $\nu_c$ is the observing frequency in GHz, $\sigma_{PS}$ is the point-source sensitivity defined above, and $\delta_\nu$ is the spectral channel width. The numerical prefactor corresponds to $\theta_a$ and $\theta_b$ in arcseconds, $\sigma_{PS}$ in Jy per beam, and the channel width $\delta_\nu$ expressed as its equivalent velocity width in km/s. The column density sensitivity of an observation depends on both the point-source sensitivity and the synthesized beam shape; the latter property is therefore critical for estimating resolved galaxy detections. Note that different weights applied to survey data at the imaging stage will change $\theta_a$ and $\theta_b$, and therefore the column density sensitivity.
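The column density relation above can be sketched numerically as follows. The unit convention is an assumption: the prefactor $2.23 \times 10^{24}$ is what results from combining the Rayleigh-Jeans brightness temperature of a Gaussian beam with $N_{HI} = 1.823 \times 10^{18} \int T_B \, dv$, for $\sigma_{PS}$ in Jy per beam, the channel width in km/s, beam widths in arcseconds and $\nu_c$ in GHz. The function names and example values are hypothetical.

```python
C_KMS = 299792.458  # speed of light in km/s

def channel_width_kms(chan_width_hz, obs_freq_hz):
    """Velocity width (km/s) equivalent to a frequency channel width."""
    return C_KMS * chan_width_hz / obs_freq_hz

def column_density_sensitivity(sigma_ps_jy, chan_width_kms,
                               theta_a_arcsec, theta_b_arcsec, obs_freq_ghz):
    """Per-channel column density rms (atoms cm^-2).

    Assumed units: Jy/beam, km/s, arcsec, GHz (see the lead-in text).
    """
    beam_area = theta_a_arcsec * theta_b_arcsec
    return 2.23e24 * sigma_ps_jy * chan_width_kms / (beam_area * obs_freq_ghz ** 2)

# Hypothetical observation: 1.6 mJy/beam rms in 20 kHz channels at 1.4 GHz.
dv = channel_width_kms(20e3, 1.4e9)
for beam in (10.0, 30.0):  # circular beams, FWHM in arcsec
    n_hi = column_density_sensitivity(1.6e-3, dv, beam, beam, 1.4)
    print(f"beam = {beam:4.0f} arcsec -> sigma_NHI = {n_hi:.2e} cm^-2")
```

The loop makes the trade-off explicit: at fixed point-source sensitivity, a larger synthesized beam gives a lower (better) column density limit, which is why imaging weights matter for resolved detections.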

#### Survey Volume

Given the point-source and surface-brightness sensitivities of each pointing, the second important factor governing the HI detections in a survey is the survey volume. The areal coverage of the survey is clearly an important factor in determining this volume. For surveys that will tile the sky to achieve near-uniform sensitivity in the survey footprint (e.g. [WALLABY on ASKAP](http://www.atnf.csiro.au/research/WALLABY/)), the survey area is a trivial factor to include when computing HI detections. The issue is more complex for surveys with variable sensitivity due to the sparseness of individual pointings (e.g. [MALS on MeerKAT](http://mals.iucaa.in/)). A survey's volume is also determined by its spectral coverage. While the large bandwidths of modern correlators rarely limit survey volume, radio frequency interference can limit a survey's sensitivity at some frequencies at which redshifted HI lines may fall (due to the elimination of corrupted baselines, effectively reducing $N$ in the sensitivity equation; Fernandez et al. 2013), or blind it altogether (e.g. Catinella et al. 2015). Spectral windows with strong RFI outside the protected 1.4 GHz bands should therefore be taken into account when estimating the survey volume for high-redshift HI detections.
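A minimal sketch of how areal and spectral coverage translate into a survey volume is given below, for a uniformly tiled survey. It assumes a flat $\Lambda$CDM cosmology with $H_0 = 70$ km/s/Mpc and $\Omega_m = 0.3$ (these values, the function names, and the example band and RFI window are illustrative assumptions, not survey specifications), and it removes the volume lost to fully blinded frequency windows.

```python
import math

HI_REST_MHZ = 1420.405752  # rest frequency of the HI line
C_KMS = 299792.458         # speed of light in km/s

def redshift_of_hi(obs_freq_mhz):
    """Redshift at which the HI line falls at the observed frequency."""
    return HI_REST_MHZ / obs_freq_mhz - 1.0

def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, n_steps=2000):
    """Line-of-sight comoving distance in flat LCDM (trapezoid rule)."""
    if z <= 0.0:
        return 0.0

    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + 1.0 - omega_m)

    dz = z / n_steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, n_steps):
        total += inv_e(i * dz)
    return C_KMS / h0 * total * dz

def shell_volume_mpc3(area_deg2, freq_lo_mhz, freq_hi_mhz):
    """Comoving volume (Mpc^3) seen over a sky area between two frequencies."""
    solid_angle_sr = area_deg2 * (math.pi / 180.0) ** 2
    d_near = comoving_distance_mpc(redshift_of_hi(freq_hi_mhz))
    d_far = comoving_distance_mpc(redshift_of_hi(freq_lo_mhz))
    return solid_angle_sr / 3.0 * (d_far ** 3 - d_near ** 3)

# Hypothetical survey: 100 deg^2, 1200-1420 MHz, one window lost to RFI.
area = 100.0
total = shell_volume_mpc3(area, 1200.0, 1420.0)
rfi_windows = [(1260.0, 1300.0)]  # assumed to lie inside the band
lost = sum(shell_volume_mpc3(area, lo, hi) for lo, hi in rfi_windows)
print(f"usable volume: {total - lost:.3e} Mpc^3 ({100 * lost / total:.1f}% lost to RFI)")
```

Windows where RFI only degrades the sensitivity, rather than blinding the survey, are better handled by assigning those frequency slices a higher $\sigma_{PS}$ than by removing their volume.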

### Predicting Galaxy Detections

We now turn to techniques for predicting HI detections in a survey of known sensitivity and volume. For surveys that probe a cosmological volume (i.e. a volume much larger than the one spanned by typical large scale structures in the galaxy distribution; e.g. Martin et al. 2012), the HI mass function (HIMF) is an important tool for making such predictions (e.g. Giovanelli et al. 2005; Duffy et al. 2012; Maddox et al. 2016). In this section, we focus on exploiting the HIMF to make predictions for the simplest case of spatially unresolved sources, and include a brief description of other approaches; the discussion of resolved disks is deferred to the next section.

[Section 3.X]() defines the HIMF as the number density of HI detections as a function of their HI mass, and it is now well-measured for HI masses in the range $M_{\rm{HI}} \gtrsim 10^{6.5}\,M_{\odot}$ in the local universe (e.g. Jones et al. 2016; our knowledge of the low-mass end of the HIMF is limited by the sensitivity of extant surveys to these faint sources, of which there are many). The number of HI detections as a function of HI mass in a given volume slice of an HI survey can therefore be obtained by integrating the HIMF: $$N(M_{\rm{HI}}) = \int \phi(M_{\rm{HI}}) \, dV,$$

where $\phi(M_{\rm{HI}})$ is the HIMF and $dV$ is a volume element; the equation above may be evaluated analytically from Schechter function fits to the HIMF, although extrapolating galaxy number counts beyond the range spanned by the data (e.g. $M_{\rm{HI}} \leq 10^6 M_{\odot}$) should be treated with caution. As described in Chapter 1, the HI mass of a source scales with the flux integral of the detected line and distance squared. Given the distance of a volume slice in a survey, the detectability of an HI source predicted to lie in that volume from the HIMF therefore depends on the expected linewidth and the survey sensitivity. Specifically, for a source of HI mass $M_{\rm{HI}}$ (in $M_{\odot}$) whose line profile is approximated as a top hat, $M_{\rm{HI}} \sim 235.6 \, D^2 \, S_{\rm{peak}} \, W$, so the expected signal-to-noise $SN_{\rm{PS}}$ of a (point-source) detection in a given spectral channel can be approximated by: $$SN_{\rm{PS}} \sim \frac{S_{\rm{peak}}}{\sigma_{PS}} \sim \frac{M_{\rm{HI}}}{235.6 \, D^2 \, W \, \sigma_{PS}},$$

where $D$ is the distance to the source in Mpc, $S_{\rm{peak}}$ is the peak flux of the source in mJy, $W$ is the characteristic width of the source in km/s, and $\sigma_{PS}$ in mJy is the survey sensitivity for that channel width. The simplest approach for estimating $SN_{PS}$ is to adopt a fixed $W$ for all detections (e.g. Maddox et al. 2016). For the inclined, rotating disks that are expected to make up the majority of HI sources (see Chapter 4), $W$ depends on the rotation speed in the outer disk $V_{rot}$, the inclination $i$ of the disk to the line of sight, and the velocity dispersion $\sigma_V$ of the disk: $$W \sim 2 \times \sqrt{ (V_{rot} \, \sin{i})^2 + \sigma_V^2 }. $$
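The linewidth relation and the point-source signal-to-noise estimate can be sketched together as below. The sketch assumes a top-hat line profile, so that $M_{\rm{HI}} \approx 235.6 \, D^2 \, S_{\rm{peak}} \, W$ with $M_{\rm{HI}}$ in $M_{\odot}$, $D$ in Mpc, $S_{\rm{peak}}$ in mJy and $W$ in km/s, and takes the per-channel signal-to-noise as $S_{\rm{peak}} / \sigma_{PS}$. The default $\sigma_V = 10$ km/s, the detection threshold of 5, and the example galaxy are illustrative assumptions.

```python
import math

def linewidth_kms(v_rot_kms, incl_deg, sigma_v_kms=10.0):
    """W ~ 2 sqrt((V_rot sin i)^2 + sigma_V^2), in km/s."""
    v_los = v_rot_kms * math.sin(math.radians(incl_deg))
    return 2.0 * math.sqrt(v_los ** 2 + sigma_v_kms ** 2)

def peak_flux_mjy(m_hi_msun, dist_mpc, width_kms):
    """Peak flux density (mJy) of a top-hat profile of width W."""
    return m_hi_msun / (235.6 * dist_mpc ** 2 * width_kms)

def point_source_snr(m_hi_msun, dist_mpc, width_kms, sigma_ps_mjy):
    """Per-channel signal-to-noise of an unresolved source."""
    return peak_flux_mjy(m_hi_msun, dist_mpc, width_kms) / sigma_ps_mjy

def min_detectable_mass(dist_mpc, width_kms, sigma_ps_mjy, snr_min=5.0):
    """HI mass (Msun) whose peak flux just reaches snr_min * sigma_PS."""
    return 235.6 * dist_mpc ** 2 * width_kms * snr_min * sigma_ps_mjy

# Hypothetical galaxy: 1e9 Msun of HI at 100 Mpc, V_rot = 120 km/s, i = 60 deg.
w = linewidth_kms(120.0, 60.0)
snr = point_source_snr(1e9, 100.0, w, sigma_ps_mjy=1.6)
m_lim = min_detectable_mass(100.0, w, sigma_ps_mjy=1.6)
print(f"W = {w:.0f} km/s, S/N per channel = {snr:.2f}, "
      f"mass limit at 100 Mpc = {m_lim:.2e} Msun")
```

Because the mass limit scales as $D^2 W$, edge-on massive disks are harder to detect per channel than face-on systems of the same HI mass at the same distance; adopting a single fixed $W$ hides this dependence.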

*Velocity dispersion paragraph here.*
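To connect the mass limit back to the HIMF integral, the sketch below counts the sources expected above a given HI mass in a single volume slice, using a Schechter function expressed per dex in mass. The Schechter parameters in the example are illustrative values of the order found in local fits, not numbers taken from this chapter, and should be replaced by a published fit; the function names, slice volume and mass limit are likewise hypothetical.

```python
import math

def schechter_per_dex(log_m, phi_star, log_m_star, alpha):
    """Schechter HIMF in Mpc^-3 dex^-1 at log10(M_HI / Msun) = log_m."""
    x = 10.0 ** (log_m - log_m_star)
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)

def number_density_above(log_m_min, phi_star, log_m_star, alpha,
                         log_m_max=11.5, n_steps=2000):
    """Number density (Mpc^-3) with log_m_min <= log M_HI <= log_m_max."""
    if log_m_min >= log_m_max:
        return 0.0
    step = (log_m_max - log_m_min) / n_steps
    total = 0.5 * (
        schechter_per_dex(log_m_min, phi_star, log_m_star, alpha)
        + schechter_per_dex(log_m_max, phi_star, log_m_star, alpha)
    )
    for i in range(1, n_steps):
        total += schechter_per_dex(log_m_min + i * step,
                                   phi_star, log_m_star, alpha)
    return total * step

def expected_detections(volume_mpc3, log_m_min, phi_star, log_m_star, alpha):
    """Expected number of sources above the mass limit in one volume slice."""
    return volume_mpc3 * number_density_above(
        log_m_min, phi_star, log_m_star, alpha
    )

# Illustrative Schechter parameters (assumed, not from the text).
himf = dict(phi_star=5e-3, log_m_star=9.95, alpha=-1.3)
n_det = expected_detections(volume_mpc3=1e6, log_m_min=8.5, **himf)
print(f"expected detections in the slice: {n_det:.0f}")
```

A full prediction would repeat this for each distance slice, with the mass limit of every slice set by the survey sensitivity and the assumed linewidth at that distance, and then sum the slice totals.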