# A Hubble Source Catalog (HSC) Use Case

- [Example #4: Using the Discovery Portal to perform cross-matching between an input catalog and the HSC][1]
  - (Search for HST data relevant to Supernova 2005cs in the galaxy M51)
  
![crossmatch_0][2]


  [1]: https://archive.stsci.edu/hst/hsc/help/use_case_4_v1.html
  [2]: screenshots/crossmatch_0.png

<span style="color:red;">Goal</span>: This tutorial shows you how to use the [MAST Discovery Portal][1] to perform cross-matching between an input catalog and the HSC. It also shows how to use the [Hubble Legacy Archive][2] to search for Hubble data that is not in the HSC.

<span style="color:red;">SCIENCE CASE</span>: The science case is to search for HST data relevant to a supernova (i.e. SN2005cs in the galaxy M51; see Maund et al. 2005, MNRAS, 364, 33, and Li et al. 2006, ApJ, 641, 1060).


  [1]: https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html
  [2]: https://hla.stsci.edu/

In [None]:
import time
import sys
import os
import requests
import json
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

from scipy.stats import linregress
from sklearn.linear_model import LinearRegression

from PIL import Image
from io import BytesIO

from astroquery.mast import Observations
import astropy
from astropy.coordinates import SkyCoord
from astropy.table import Table, join
from astropy.io import ascii

# check that version of mastcasjobs is new enough
# we are using some features not in version 0.0.1
from pkg_resources import get_distribution
from packaging.version import Version as V

assert V(get_distribution("mastcasjobs").version) >= V('0.0.2'), """
A newer version of mastcasjobs is required.
Update mastcasjobs to current version using this command:
pip install --upgrade git+https://github.com/rlwastro/mastcasjobs@master
"""

import mastcasjobs

from image_utils import *

# set width for pprint
astropy.conf.max_width = 150

import warnings
warnings.filterwarnings('ignore')

In [None]:
pd.set_option('display.max_columns', 700)
pd.set_option('display.max_rows', 400)
pd.set_option('display.min_rows', 10)
pd.set_option('display.expand_frame_repr', True)

In [None]:
HSCContext = "HSCv3"

Set up Casjobs environment.

In [None]:
import getpass
if not os.environ.get('CASJOBS_USERID'):
    os.environ['CASJOBS_USERID'] = input('Enter Casjobs UserID:')
if not os.environ.get('CASJOBS_PW'):
    os.environ['CASJOBS_PW'] = getpass.getpass('Enter Casjobs password:')

### Useful functions

* The `hcvcone(ra,dec,radius [,keywords])` function searches the HSC (including the HCV tables) near a position.
* The `hcvsearch()` function performs general non-positional queries.
* The `hcvmetadata()` function gives information about the columns available in a table.

In [None]:
hscapiurl = "https://catalogs.mast.stsci.edu/api/v0.1/hsc"


def hcvcone(ra, dec, radius, table="hcvsummary", release="v3", format="csv", magtype="magaper2",
            columns=None, baseurl=hscapiurl, verbose=False, **kw):
    """Do a cone search of the HSC catalog (including the HCV)
    
    Parameters
    ----------
    ra (float): (degrees) J2000 Right Ascension
    dec (float): (degrees) J2000 Declination
    radius (float): (degrees) Search radius (<= 0.5 degrees)
    table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    format: csv, votable, json
    columns: list of column names to include (None means use defaults)
    baseurl: base URL for the request
    verbose: print info about request
    **kw: other parameters (e.g., 'numimages.gte':2)
    """
    
    data = kw.copy()
    data['ra'] = ra
    data['dec'] = dec
    data['radius'] = radius
    return hcvsearch(table=table, release=release, format=format, magtype=magtype,
                     columns=columns, baseurl=baseurl, verbose=verbose, **data)


def hcvsearch(table="hcvsummary", release="v3", magtype="magaper2", format="csv",
              columns=None, baseurl=hscapiurl, verbose=False, **kw):
    """Do a general search of the HSC catalog (possibly without ra/dec/radius)
    
    Parameters
    ----------
    table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    format: csv, votable, json
    columns: list of column names to include (None means use defaults)
    baseurl: base URL for the request
    verbose: print info about request
    **kw: other parameters (e.g., 'numimages.gte':2).  Note this is required!
    """
    
    data = kw.copy()
    if not data:
        raise ValueError("You must specify some parameters for search")
    if format not in ("csv", "votable", "json"):
        raise ValueError("Bad value for format")
    url = f"{cat2url(table, release, magtype, baseurl=baseurl)}.{format}"
    if columns:
        # check that column values are legal
        # create a dictionary to speed this up
        dcols = {}
        for col in hcvmetadata(table, release, magtype)['name']:
            dcols[col.lower()] = 1
        badcols = []
        for col in columns:
            if col.lower().strip() not in dcols:
                badcols.append(col)
        if badcols:
            raise ValueError(f"Some columns not found in table: {', '.join(badcols)}")
        # two different ways to specify a list of column values in the API
        # data['columns'] = columns
        data['columns'] = f"[{','.join(columns)}]"

    # either get or post works
    # r = requests.post(url, data=data)
    r = requests.get(url, params=data)

    if verbose:
        print(r.url)
    r.raise_for_status()
    if format == "json":
        return r.json()
    else:
        return r.text


def hcvmetadata(table="hcvsummary", release="v3", magtype="magaper2", baseurl=hscapiurl):
    """Return metadata for the specified catalog and table
    
    Parameters
    ----------
    table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    baseurl: base URL for the request
    
    Returns an astropy table with columns name, type, description
    """
    url = f"{cat2url(table,release,magtype,baseurl=baseurl)}/metadata"
    r = requests.get(url)
    r.raise_for_status()
    v = r.json()
    # convert to astropy table
    tab = Table(rows=[(x['name'], x['type'], x['description']) for x in v],
                names=('name', 'type', 'description'))
    return tab


def cat2url(table="hcvsummary", release="v3", magtype="magaper2", baseurl=hscapiurl):
    """Return URL for the specified catalog and table
    
    Parameters
    ----------
    table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    baseurl: base URL for the request
    
    Returns a string with the base URL for this request
    """
    checklegal(table, release, magtype)
    if table == "summary":
        url = f"{baseurl}/{release}/{table}/{magtype}"
    else:
        url = f"{baseurl}/{release}/{table}"
    return url


def checklegal(table, release, magtype):
    """Checks if this combination of table, release and magtype is acceptable
    
    Raises a ValueError exception if there is problem
    """
    
    releaselist = ("v2", "v3")
    if release not in releaselist:
        raise ValueError(f"Bad value for release (must be one of {', '.join(releaselist)})")
    if release == "v2":
        tablelist = ("summary", "detailed")
    else:
        tablelist = ("summary", "detailed", "propermotions", "sourcepositions", "hcvsummary", "hcv")
    if table not in tablelist:
        raise ValueError(f"Bad value for table (for {release} must be one of {', '.join(tablelist)})")
    if table == "summary":
        magtypelist = ("magaper2", "magauto")
        if magtype not in magtypelist:
            raise ValueError(f"Bad value for magtype (must be one of {', '.join(magtypelist)})")
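
To make the request format concrete, the sketch below rebuilds, using only the standard library, the kind of URL that `hcvcone` would pass to `requests.get` for a 3 arcsec cone around the position 202.4699167 47.17658333 quoted in Step 1. It mirrors the `cat2url` logic for the `summary` table; the `numimages.gte` filter is borrowed from the docstring example, the variable names are illustrative, and no request is actually sent.

```python
from urllib.parse import urlencode

# Values taken from the helper functions above
baseurl = "https://catalogs.mast.stsci.edu/api/v0.1/hsc"
release, table, magtype, fmt = "v3", "summary", "magaper2", "csv"

# cat2url appends the magnitude type only for the summary table
if table == "summary":
    url = f"{baseurl}/{release}/{table}/{magtype}.{fmt}"
else:
    url = f"{baseurl}/{release}/{table}.{fmt}"

params = {
    "ra": 202.4699167,       # degrees, position quoted in Step 1
    "dec": 47.17658333,      # degrees
    "radius": 3.0 / 3600.0,  # the search radius is given in degrees; 3 arcsec here
    "numimages.gte": 2,      # example filter from the docstring
}
full_url = f"{url}?{urlencode(params)}"
print(full_url)
```

Note the unit conversion: the helper functions take the radius in degrees, whereas the Discovery Portal cross-match dialog used later works in arcsec.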

## <span style="color:red;">Step 1</span>

Create a catalog of objects to be cross-matched with the HSC. In this case, we want a catalog of known supernovae in 2005. This list was created using data from the [IAU Central Bureau for Astronomical Telegrams][1]. The [list][2] is in CSV format, and looks like this:

```lang-none
RA,Dec,Name,Host Galaxy
37.68021,-2.93883,SN 2005A,NGC 958
268.70342,71.54292,SN 2005B, UGC 11066
168.87258,60.75153,SN 2005C, Anon.
```

There are 367 objects listed in the file, which are located all over the sky.

NOTE: The coordinates for SN 2005cs in the attached list have been changed from the IAU position to the improved position 202.4699167 47.17658333 (see discussion in Maund et al. 2005, and Li et al. 2006).


  [1]: http://www.cbat.eps.harvard.edu/lists/Supernovae.html
  [2]: https://archive.stsci.edu/hst/hsc/help/use_case_4_v1/sn2005.csv
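
As a minimal sketch of reading this format, the snippet below parses the three sample rows shown above with the standard `csv` module. The sample text is copied from the listing; the dictionary keys (`ra`, `dec`, `name`, `host`) are illustrative choices, not names used by the Discovery Portal.

```python
import csv
import io

# The header and first three rows shown above, copied verbatim
sample = """RA,Dec,Name,Host Galaxy
37.68021,-2.93883,SN 2005A,NGC 958
268.70342,71.54292,SN 2005B, UGC 11066
168.87258,60.75153,SN 2005C, Anon.
"""

rows = []
for rec in csv.DictReader(io.StringIO(sample)):
    rows.append({
        "ra": float(rec["RA"]),               # degrees
        "dec": float(rec["Dec"]),             # degrees
        "name": rec["Name"].strip(),
        "host": rec["Host Galaxy"].strip(),   # some hosts carry a leading space
    })

for row in rows:
    print(row)
```

Stripping whitespace matters here because some host names follow the comma with a space. With pandas, passing `skipinitialspace=True` to `pd.read_csv` has a similar effect.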

## <span style="color:red;">Step 2</span> - Go to the [MAST Discovery Portal][1]

Upload the catalog by clicking on the ![import][2] Upload File icon (<span style="color:blue;">blue</span>). Use the browse button to find the supernova file you downloaded in Step 1, then click on the import button. The catalog now has its own tab (<span style="color:green;">green</span>), and the AstroView window shows the location of the first object in the catalog.

![crossmatch_1][3]

  [1]: http://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html
  [2]: screenshots/import.png
  [3]: screenshots/crossmatch_1.png

In [None]:
df = pd.read_csv('sn2005.csv', usecols=[0, 1, 2, 3])
df.head()

In [None]:
objname = df.loc[0, 'Host Galaxy']
filters = "gri"
size_deg = 0.2
zoom = 0.33

pixel_size = 0.25 # PS1 pixel size in arcsec
ra, dec = resolve(objname)
# get size in pixels from image size in degrees
size = int(size_deg*3600/pixel_size)
print(f"{objname} ra {ra} dec {dec} size {size}")

In [None]:
target = df.loc[0, 'Host Galaxy']
# resolve the host galaxy name to coordinates
coord_host = SkyCoord.from_name(target)

ra_host = coord_host.ra.degree
dec_host = coord_host.dec.degree
radius = 500 # arcsec

print(f'ra: {ra_host}\ndec: {dec_host}')

In [None]:
filters = "gri"
size_deg = 0.2
zoom = 0.33

pixel_size = 0.25 # PS1 pixel size in arcsec
ra = ra_host
dec = dec_host
# get size in pixels from image size in degrees
size = int(size_deg*3600/pixel_size)

In [None]:
# color JPEG image
cim = getcolorim(ra, dec, zoom=zoom, size=size, filters=filters, format="jpg", autoscale=99.5)
print(f"Got {cim.size} color jpeg image")

# get Gaia EDR3 catalog (search a circle that just touches the image edges)
gcat = getgaia(ra, dec, radius=size_deg/2)
print(f"Got {len(gcat)} Gaia EDR3 sources")

# extract image WCS
iwcs = getwcs(cim)
gcat['x'], gcat['y'] = iwcs.wcs_world2pix(gcat['ra'],gcat['dec'],0)

plt.rcParams.update({'font.size':12})
plt.figure(1, (10,10))
plt.title(f'{objname} PS1 {filters}')
# note we must specify the extent to get the pixel origin in the lower left corner
plt.imshow(cim, origin="upper", extent=(0, cim.size[0], 0, cim.size[1]))
# plt.plot(gcat['x'], gcat['y'], 'o', markerfacecolor='none')

## <span style="color:red;">Step 3</span> - Perform the cross-matching

Click on the ![crossmatch_24][1] icon (<span style="color:blue;">blue</span>), which generates a popup where you can select the catalog (i.e. HSC; <span style="color:green;">green</span>) and matching radius (default of 3 arcsec; <span style="color:orange;">orange</span>); see below for a discussion of the search radius. Click Cross-Match (<span style="color:yellow;">yellow</span>).

![crossmatch_2][3]


  [1]: screenshots/crossmatch_24.png
  [2]: https://archive.stsci.edu/hst/hsc/help/use_case_4_v1.html#radius
  [3]: screenshots/crossmatch_2.png
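
For readers who want to see what a radius-based cross-match amounts to, here is a minimal brute-force sketch in plain Python. It is not how the Discovery Portal is implemented; the function names and the two catalog entries are made up for illustration, and only the input position comes from Step 1.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec between two positions in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600


def crossmatch(inputs, catalog, radius_arcsec=3.0):
    """Return (input_name, catalog_id, separation) for every pair within the radius."""
    matches = []
    for name, ra, dec in inputs:
        for cat_id, cra, cdec in catalog:
            sep = angular_sep_arcsec(ra, dec, cra, cdec)
            if sep <= radius_arcsec:
                matches.append((name, cat_id, sep))
    return matches


# Input position from Step 1; the catalog entries are hypothetical offsets from it
sn = [("SN 2005cs", 202.4699167, 47.17658333)]
fake_catalog = [
    ("A", 202.4699167, 47.17658333 + 2.0 / 3600),   # 2 arcsec north
    ("B", 202.4699167, 47.17658333 + 10.0 / 3600),  # 10 arcsec north
]
matches = crossmatch(sn, fake_catalog)
print(matches)
```

Because every pair inside the radius is kept, one input position can produce several rows, which is why the results table in the next step can list more than one HSC match per supernova. The nested loop is fine for a few hundred input rows against a short list, but real catalogs need a spatial index.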

## <span style="color:red;">Step 4</span> - Review the results

There are 60 matches (<span style="color:blue;">blue</span>). A single supernova can have multiple HSC matches if several HSC objects lie within 3 arcsec of its position (in this case, there are two matches for SN 2005H, two for SN 2005P, and 10 for SN 2005V). Note that the RA, Dec, Name, and Host Galaxy columns from the input catalog (<span style="color:green;">green</span>) are at the beginning of the Cross-Match table, followed by information from the HSC, such as the distance between the catalog position and the HSC position (<span style="color:orange;">orange</span>).

![crossmatch_3][1]

If you scroll over in the table, you will see the Target Name from the HST observation. For example, lines 3 and 4 (<span style="color:blue;">blue</span>) were given the name SN2005P, while lines 5-10 (<span style="color:green;">green</span>) were given the name of the galaxy. Note that this name is what the HST observer chose to call the object, so the name may not be useful in determining information about the target (e.g. the first target in the list is ambiguous).

![crossmatch_4][2]



  [1]: screenshots/crossmatch_3.png
  [2]: screenshots/crossmatch_4.png
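
If the cross-match table is exported, the number of HSC matches per supernova can be tallied in a few lines. The sketch below assumes the Name column has been read into a plain list; it uses only an illustrative subset built from the three counts quoted above, not the full 60-row table.

```python
from collections import Counter

# Name column of a cross-match table, one entry per matched HSC object.
# Illustrative subset only: the counts for these three supernovae are the ones quoted above.
names = ["SN 2005H"] * 2 + ["SN 2005P"] * 2 + ["SN 2005V"] * 10

counts = Counter(names)
for name, n in counts.most_common():
    print(f"{name}: {n} HSC match(es)")

print(f"{len(names)} rows, {len(counts)} distinct supernovae")
```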

## <span style="color:red;">Step 5</span> - Review the matches in AstroView

Under Filters on the left, the Name box shows you how many matches each supernova had with an HSC object (<span style="color:blue;">blue</span>); click on the "Show 16 more" button to see the complete list of the 21 (out of the 367) objects that had matches.

Clicking on the Focus ![crosshair_24][1] icon (<span style="color:green;">green</span>) centers the object in the AstroView window (in this case SN 2005P). It's interesting to browse through some of them to see where the supernovae lie in their galaxies. If you zoom the AstroView display, you can see both the HSC object(s) (in <span style="color:blue;">blue</span>) and the position of the supernova (in <span style="color:orange;">orange</span>), shown with differently colored symbols (<span style="color:yellow;">yellow</span>).

![crossmatch_5][2]



  [1]: screenshots/crosshair_24.png
  [2]: screenshots/crossmatch_5.png

## <span style="color:red;">Step 6</span> - Examine the HST data for SN 2005cs in detail

Click on the 2005cs button (<span style="color:blue;">blue</span>) to restrict the data to just this object. We find that there are 10 matches. If you scroll over to the right you will find that line 4 has NumImages = 9, while the others have NumImages between 1 and 5. Let's look at the detailed information for line 4 by clicking the Load Detailed Results ![detailed][1] icon on that row (<span style="color:green;">green</span>).

![crossmatch_10][2]

This creates a new tab (<span style="color:blue;">blue</span>), and presents cutout preview images for each member of the match (<span style="color:green;">green</span>). Hover over the cutout and it will tell you the name of the image, in this case hst_12762_a6_wfc3_uvis_f689m. There are five measurements with WFPC2 and four WFC3 measurements (<span style="color:orange;">orange</span>) for this object, MatchID = 3972863. The WFC3 observations might be of particular interest since they were taken after the images discussed in the Maund et al. (2005) and Li et al. (2006) papers.

![crossmatch_11][3]


  [1]: screenshots/detailed.png
  [2]: screenshots/crossmatch_10.png
  [3]: screenshots/crossmatch_11.png

## <span style="color:red;">Step 7</span> - Compare with Li et al. (2006)

Click on the Toggle Overlay Image ![overlay][1] icon (<span style="color:yellow;">yellow</span> in the image above) to overlay the Hubble image in AstroView. Another option is to click on the cutout preview image (<span style="color:green;">green</span>) itself to bring up the HLA Interactive Display, which provides more flexibility for viewing the image (e.g., changing the contrast level), although you will need to use the RA and DEC to find the object.

When you zoom in with AstroView, you can see the 10 potential matches (blue squares) to the original search position (orange square). The match that we selected is shown by the light blue square, and turns out to be very close to the position of SN2005cs, as shown by comparison with the Li et al. (2006) image below.

![crossmatch_12][2]

Note that the Li et al. (2006) and Maund et al. (2005) papers primarily used ACS observations, which are NOT INCLUDED IN THE HSC. This is an important point: while the HSC is a useful way to look for HST data, users should not assume that no other data exist in the HLA just because they are not listed in the HSC. In Step 8 we show how to make a more detailed evaluation of existing data using the HLA.



  [1]: screenshots/overlay.png
  [2]: screenshots/crossmatch_12.png

## <span style="color:red;">Step 8</span> - Go to the [HLA][1] to search for related data

Enter the coordinates (202.4699167 47.17658333) for SN2005cs into the search box (<span style="color:blue;">blue</span>); click on advanced search (<span style="color:green;">green</span>); then click on cutouts (<span style="color:orange;">orange</span>), and then search to see the Inventory. Click on the Images button (<span style="color:yellow;">yellow</span>) to view the cutouts.

![crossmatch_14][2]

Here are some of the cutouts, including HST_10498_01_ACS_HRC_F555W that has the supernova itself! This is around cutout #350 if you want to find it. It is not included in the HSC because it is an ACS/HRC image. Source lists for this detector will be added to the HSC in the future.

![crossmatch_15][3]

There are 824 images returned from the HLA search, though in many cases the cutout is labeled "Cutout position is outside image". You can reduce the number to look at by going to the inventory view and selecting a subset (e.g., by putting `*acs*` in the box under Detector and clicking).


  [1]: http://hla.stsci.edu/
  [2]: screenshots/crossmatch_14.png
  [3]: screenshots/crossmatch_15.png

## <span style="color:red;">Search Radius selection</span>

The selection of the search radius can be tricky, and depends on the science you are trying to accomplish. For this case, we want a radius large enough to find all the galaxies that have HST data, but small enough to exclude HSC objects that are not part of the galaxy. One approach is to start with a large search radius (say 20") and then try smaller ones until you start losing cases you would like to keep. Note that a more typical search radius (e.g., for matching stellar fields) is ~0.2", which is roughly the accuracy of the absolute astrometry for the HSC.
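
The shrinking-radius approach can be sketched as a simple tally of how many input objects still have a match at each trial radius. The separations below are hypothetical placeholders rather than values from this cross-match; in practice they would be the nearest-neighbor distances from the input catalog to the HSC.

```python
# Hypothetical separations (arcsec) between input positions and their nearest catalog objects
separations = [0.1, 0.4, 1.2, 2.8, 4.5, 9.0, 18.0]

trial_radii = [20.0, 10.0, 5.0, 3.0, 1.0, 0.2]  # largest first
n_matched = {r: sum(sep <= r for sep in separations) for r in trial_radii}

for r in trial_radii:
    print(f'radius {r:5.1f}": {n_matched[r]} matched')
```

Inspecting which objects drop out between successive radii shows where the cases you want to keep start to disappear.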

Return to <span style="color:red;">Step 3</span>.