diff --git a/tutorials/astropy-coordinates/4_Coordinates_Crossmatch.ipynb b/tutorials/astropy-coordinates/4_Coordinates_Crossmatch.ipynb new file mode 100644 index 00000000..13d222c4 --- /dev/null +++ b/tutorials/astropy-coordinates/4_Coordinates_Crossmatch.ipynb @@ -0,0 +1,435 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Astronomical Coordinates 4: Cross-matching Catalogs Using astropy.coordinates and astroquery\n", + "\n", + "## Authors\n", + "Adrian Price-Whelan, Lúthien Liu, Zihao Chen, Saima Siddiqui\n", + "\n", + "## Learning Goals\n", + "* Demonstrate how to retrieve a catalog from Vizier using astroquery\n", + "* Show how to perform positional cross-matches between catalogs of sky coordinates\n", + "\n", + "## Keywords\n", + "coordinates, OOP, astroquery, gaia\n", + "\n", + "\n", + "## Summary\n", + "\n", + "In the previous tutorials in this series, we introduced many of the key concepts underlying how to represent and transform astronomical coordinates using `astropy.coordinates`, including how to work with both position and velocity data within the coordinate objects.\n", + "\n", + "In this tutorial, we will explore how the `astropy.coordinates` package can be used to cross-match two catalogs that contain overlapping sources that may have been observed at different times. You may find it helpful to keep [the Astropy documentation for the coordinates package](http://docs.astropy.org/en/stable/coordinates/index.html) open alongside this tutorial for reference or additional reading. In the text below, you may also see some links that look like ([docs](http://docs.astropy.org/en/stable/coordinates/index.html)). These links will take you to parts of the documentation that are directly relevant to the cells from which they link. \n", + "\n", + "*Note: This is the 4th tutorial in a series of tutorials about astropy.coordinates. If you are new to astropy.coordinates, you may want to start from the beginning or an earlier tutorial.*\n", + "- [Previous tutorial: Astronomical Coordinates 3: Working with Velocity Data](3-Coordinates-Velocities)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Imports\n", + "\n", + "We start by importing some general packages that we will need below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import warnings\n", + "\n", + "import matplotlib.pyplot as plt\n", + "%matplotlib inline\n", + "import numpy as np\n", + "\n", + "from astropy import units as u\n", + "from astropy.coordinates import SkyCoord, Distance\n", + "from astropy.io import fits\n", + "from astropy.table import QTable\n", + "from astropy.time import Time\n", + "from astropy.utils.data import download_file\n", + "\n", + "from astroquery.vizier import Vizier" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Cross-matching and comparing catalogs\n", + "\n", + "In this tutorial, we are going to return to a set of data that we downloaded from the *Gaia* archive back in [Tutorial 1](1-Coordinates-Intro) of this series.\n", + "\n", + "Let's recap what we did in that tutorial: We defined a `SkyCoord` object to represent the center of an open cluster (NGC 188), we queried the *Gaia* DR2 catalog to select stars that are close (on the sky) to the center of the cluster, and we used the parallax values from *Gaia* to select stars that are near NGC 188 in 3D position. Here, we will briefly reproduce those selections so that we can start here with a catalog of sources that are likely members of NGC 188 (see [Tutorial 1](1-Coordinates-Intro) for more information):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ngc188_table = QTable.read('gaia_results.fits')\n", + "ngc188_table = ngc188_table[ngc188_table['parallax'] > 0.25*u.mas]\n", + "\n", + "ngc188_center_3d = SkyCoord(12.11*u.deg, 85.26*u.deg,\n", + " distance=1.96*u.kpc,\n", + " pm_ra_cosdec=-2.3087*u.mas/u.yr,\n", + " pm_dec=-0.9565*u.mas/u.yr)\n", + "\n", + "# Deal with masked quantity data in a backwards-compatible way:\n", + "parallax = ngc188_table['parallax']\n", + "if hasattr(parallax, 'mask'):\n", + " parallax = parallax.filled(np.nan)\n", + " \n", + "velocity_data = {\n", + " 'pm_ra_cosdec': ngc188_table['pmra'],\n", + " 'pm_dec': ngc188_table['pmdec'],\n", + " 'radial_velocity': ngc188_table['radial_velocity']\n", + "}\n", + "for k, v in velocity_data.items():\n", + " if hasattr(v, 'mask'):\n", + " velocity_data[k] = v.filled(0.)\n", + " velocity_data[k][np.isnan(velocity_data[k])] = 0.\n", + "\n", + "ngc188_coords_3d = SkyCoord(\n", + " ra=ngc188_table['ra'], \n", + " dec=ngc188_table['dec'],\n", + " distance=Distance(parallax=parallax),\n", + " obstime=Time('J2015.5'),\n", + " **velocity_data\n", + ")\n", + "\n", + "sep3d = ngc188_coords_3d.separation_3d(ngc188_center_3d)\n", + "pm_diff = np.sqrt(\n", + " (ngc188_table['pmra'] - ngc188_center_3d.pm_ra_cosdec)**2 + \n", + " (ngc188_table['pmdec'] - ngc188_center_3d.pm_dec)**2)\n", + "\n", + "ngc188_members_mask = (sep3d < 50*u.pc) & (pm_diff < 1.5*u.mas/u.yr)\n", + "ngc188_members = ngc188_table[ngc188_members_mask]\n", + "ngc188_members_coords = ngc188_coords_3d[ngc188_members_mask]\n", + "len(ngc188_members)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "From the selections above, the table `ngc188_members` and the `SkyCoord` instance `ngc188_members_coords` contain 216 sources that, based on their 3D positions and proper motions, are consistent with being members of the open cluster NGC 188." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's assume that we now want to cross-match our catalog of candidate members of NGC 188 — here, based on *Gaia* data — to some other catalog. In this tutorial, we will demonstrate how to manually cross-match these *Gaia* sources with the 2MASS photometric catalog to retrieve infrared magnitudes for these stars, and then we will plot a color–magnitude diagram. To do this, we first need to query the 2MASS catalog to retrieve all sources in a region around the center of NGC 188, as we did for *Gaia*. Here, we will also take into account the fact that the *Gaia* data release 2 reference epoch is J2015.5, whereas the 2MASS coordinates are likely reported at their time of observation (in the late 1990's). \n", + "\n", + "*Note that some data archives, like the Gaia science archive, support running cross-matches at the database level and even support epoch propagation. If you need to perform a large cross-match, it will be much more efficient to use these services!*\n", + "\n", + "We will again use `astroquery` to execute this query. This will again require an internet connection, but we have included the results of this query in a file along with this notebook in case you are not connected to the internet. To query 2MASS, we will use the `astroquery.vizier` module ([docs](https://astroquery.readthedocs.io/en/latest/vizier/vizier.html)) to run a cone search centered on the sky position of NGC 188 with a search radius of 0.5º:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NOTE: skip this cell if you do not have an internet connection\n", + "\n", + "# II/246 is the catalog name for the main 2MASS photometric catalog\n", + "v = Vizier(catalog=\"II/246\", columns=['*', 'Date']) \n", + "v.ROW_LIMIT = -1\n", + "\n", + "result = v.query_region(ngc188_center_3d, radius=0.5*u.deg)\n", + "tmass_table = result[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Alternatively, we can read the 2MASS table provided along with this tutorial:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# the .read() below produces some warnings that we can safely ignore\n", + "with warnings.catch_warnings(): \n", + " warnings.simplefilter('ignore', UserWarning)\n", + " \n", + " tmass_table = QTable.read('2MASS_results.ecsv')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As with the *Gaia* results table, we can now create a single `SkyCoord` object to represent all of the sources returned from our query to the 2MASS catalog. Let's look at the column names in this table by displaying the first few rows:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "tmass_table[:3]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "From looking at the column names, the two relevant sky coordinate columns are `RAJ2000` for `ra` and `DEJ2000` for `dec`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "tmass_coords = SkyCoord(tmass_table['RAJ2000'], \n", + " tmass_table['DEJ2000'])\n", + "len(tmass_coords)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note also that the table contains a \"Date\" column that specifies the epoch of the coordinates. Are all of these epochs the same?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "np.unique(tmass_table['Date'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It looks like all of the sources in our 2MASS table have the same epoch, so let's create an `astropy.time.Time` object to represent this date:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "tmass_epoch = Time(np.unique(tmass_table['Date']))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We now want to cross-match our *Gaia*-selected candidate members of NGC 188, `ngc_members_coords`, with this table of photometry from 2MASS. However, as noted previously, the *Gaia* coordinates are given at a different epoch J2015.5, which is nearly ~16 years after the 2MASS epoch of the data we downloaded (1999-10-19 or roughly J1999.88). We will therefore first use the `SkyCoord.apply_space_motion()` method ([docs](http://docs.astropy.org/en/latest/api/astropy.coordinates.SkyCoord.html#astropy.coordinates.SkyCoord.apply_space_motion)) to transform the *Gaia* positions back to the 2MASS epoch before we do the cross-match:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# you can ignore the warning raised here\n", + "ngc188_members_coords_1999 = ngc188_members_coords.apply_space_motion(\n", + " tmass_epoch)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The object `ngc188_members_coords_1999` now contains the coordinate information for our *Gaia*-selected members of NGC 188, as we think they would appear if observed on 1999-10-19.\n", + "\n", + "We can now use the ``SkyCoord.match_to_catalog_sky`` method to match these two catalogs ([docs](http://docs.astropy.org/en/latest/coordinates/matchsep.html#astropy-coordinates-matching)), using the `ngc188_members_coords_1999` as our NGC 188 members coordinates. \n", + "\n", + "Note that order matters with this method: Here we will match *Gaia* to 2MASS. `SkyCoord.match_to_catalog_sky` returns three objects: (1) the indices into `tmass_coords` that get the closest matches in `ngc188_members_coords_1999`, (2) the angular separation between each `ngc188_members_coords_1999` coordinate and the closest source in `tmass_coords`, and (3) the 3D distance between each `ngc188_members_coords_1999` coordinate and the closest source in `tmass_coords`. Here, the 3D distances will not be useful because the 2MASS coordinates do not have associated distance information, so we will ignore these quantities:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "idx_gaia, sep2d_gaia, _ = ngc188_members_coords_1999.match_to_catalog_sky(\n", + " tmass_coords)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's now look at the distribution of separations (in arcseconds) for all of the cross-matched sources:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "plt.hist(sep2d_gaia.arcsec, histtype='step', \n", + " bins=np.logspace(-2, 2., 64))\n", + "plt.xlabel('separation [arcsec]')\n", + "plt.xscale('log')\n", + "plt.yscale('log')\n", + "plt.tight_layout()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "From this, it looks like all of sources in our *Gaia* NGC 188 member list cross-match to another sources within a few arcseconds, so these all seem like they are correctly matches to a 2MASS source!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "(sep2d_gaia < 2*u.arcsec).sum(), len(ngc188_members)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With our cross-match done, we can now make `Gaia`+2MASS color–magnitude diagrams of our candidate NGC 188 members using the information returned by the cross-match:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Jmag = tmass_table['Jmag'][idx_gaia] # note that we use the index array returned above\n", + "Gmag = ngc188_members['phot_g_mean_mag']\n", + "Bmag = ngc188_members['phot_bp_mean_mag']" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(1, 2, figsize=(10, 5))\n", + "\n", + "ax = axes[0]\n", + "ax.scatter(Gmag - Jmag, Gmag, \n", + " marker='o', color='k', \n", + " linewidth=0, alpha=0.5)\n", + "ax.set_xlabel('$G - J$')\n", + "ax.set_ylabel('$G$')\n", + "ax.set_xlim(0, 3)\n", + "ax.set_ylim(19, 10) # backwards because magnitudes!\n", + "\n", + "ax = axes[1]\n", + "ax.scatter(Bmag - Gmag, Jmag, \n", + " marker='o', color='k', \n", + " linewidth=0, alpha=0.5)\n", + "ax.set_xlabel('$G_{BP} - G$')\n", + "ax.set_ylabel('$J$')\n", + "ax.set_xlim(0.2, 1)\n", + "ax.set_ylim(17, 8) # backwards because magnitudes!\n", + "\n", + "fig.tight_layout()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Those both look like color-magnitude diagrams of a main sequence + red giant branch of an intermediate-age stellar cluster, so it looks like our selection and cross-matching has worked!\n", + "\n", + "For more on what matching options are available, check out the [separation and matching section of the Astropy documentation](https://astropy.readthedocs.io/en/stable/coordinates/matchsep.html). Or for more on what you can do with `SkyCoord`, see [its API documentation](http://astropy.readthedocs.org/en/stable/api/astropy.coordinates.SkyCoord.html)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercises" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Instead of transforming the Gaia positions back to the 2MASS epoch for cross-matching, try to move the 2MASS positions to the Gaia's epoch (J2015.5), and name it as `tmass_coords_2015`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now apply the `SkyCoord.match_to_catalog_sky` method to do a new cross-matching between these 2 catalogs, using the `tmass_coords_2015` as our NGC 188 members coordinates this time. Plot a histogram to show the distribution of separation. Is there any difference between this trial and the result above?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Instead of picking all stars within 50 pc of the cluster center, try to enlarge to a larger range of 80 pc for cross-match and make a comparison between the result above." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "language_info": { + "codemirror_mode": { + "name": "ipython" + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 1 +} diff --git a/tutorials/astropy-coordinates/requirements.txt b/tutorials/astropy-coordinates/requirements.txt deleted file mode 100644 index f3cb4ae9..00000000 --- a/tutorials/astropy-coordinates/requirements.txt +++ /dev/null @@ -1,4 +0,0 @@ -astropy -astroquery -matplotlib -numpy