In [None]:
__author__ = 'Yumi Choi <yumi.choi@noirlab.edu>'
__version__ = '20231127' 
__datasets__ = ['phat_v2']  
__keywords__ = ['M31', 'stars', 'interactive plot', 'plot:cmd', 'plot:sed']

# Exploring Resolved Stellar Populations in M31 with PHAT

*Yumi Choi & the Astro Data Lab Team*

### Table of contents
* [Goals & Summary](#goals)
* [Disclaimer & attribution](#attribution)
* [Imports & setup](#import)
* [Authentication](#auth)
* [Explore the main PHAT object table](#exploreTable)
* [Make Healpix maps of the brick number, MS and RGB stars](#chapter1)
* [Variation in stellar populations and photometric quality across M31's disk](#chapter2)
* [Resources and references](#resources)

<a class="anchor" id="goals"></a>
# Goals
* Learn how to use an SQL query to make Healpix maps of PHAT brick number as well as young and old stellar populations 
* Learn how to retrieve data for each brick and plot color-magnitude diagrams and broad-band spectral energy distributions for individual stars
* Explore how stellar populations and photometry quality vary across the M31 disk

# Summary
Our own galaxy, the Milky Way (MW), provides detailed views of astrophysical processes, anchoring much of our understanding of galaxy formation and evolution. However, because we observe the MW from within it, our observations suffer from complications arising from line-of-sight reddening, uncertain distances, and background/foreground confusion. External galaxies, which are free of these projection effects, offer a much cleaner view of an entire galaxy. The closest massive galaxy to the MW, the Andromeda galaxy (also known as M31), provides a superb laboratory: it is close enough that we can resolve its individual stars, yet far enough away that we can observe the entire galaxy. Furthermore, M31 contains a wide range of local environments consisting of young and old stellar populations. It also has various structures, including spiral arms, star-forming rings, a bar, and a bulge.

The Panchromatic Hubble Andromeda Treasury (PHAT; PI Dalcanton) was a Hubble Space Telescope multi-cycle program to map roughly a third of M31's star-forming disk, using six filters covering the ultraviolet through the near infrared.


<a class="anchor" id="attribution"></a>
# Disclaimer & attribution
If you use this notebook for your published science, please acknowledge the following:

* Data Lab concept paper: Fitzpatrick et al., "The NOAO Data Laboratory: a conceptual overview", SPIE, 9149, 2014, http://dx.doi.org/10.1117/12.2057445

* Data Lab disclaimer: https://datalab.noirlab.edu/disclaimers.php

* PHAT Reduction paper: Williams et al., "Reducing and Analyzing the PHAT Survey with the Cloud", ApJS, 2018, 236, 4: https://ui.adsabs.harvard.edu/abs/2018ApJS..236....4W

<a class="anchor" id="import"></a>
# Imports and setup

In [None]:
# std lib
from getpass import getpass

# 3rd party
import numpy as np
import matplotlib.pyplot as plt
import healpy as hp
import random

# Data Lab
from dl import authClient as ac, queryClient as qc, storeClient as sc
from dl.helpers.utils import convert

<a class="anchor" id="auth"></a>
# Authentication
Much of the functionality of Data Lab can be accessed without explicitly logging in (the service then uses an anonymous login). But some capacities, for instance saving the results of your queries to your virtual storage space, require a login (i.e. you will need a registered user account).

If you need to log in to Data Lab, un-comment the cell below and execute it:

In [None]:
#token = ac.login(input("Enter user name: (+ENTER) "),getpass("Enter password: (+ENTER) "))
#ac.whoAmI()

<a class="anchor" id="exploreTable"></a>
# Explore the main PHAT object table
The notebook `M31withPhat.ipynb` listed available tables in the PHAT database. This notebook will use the main PHAT object table, phat_v2.phot_mod, which contains combined average photometry.

### Examine the columns of the phat_v2.phot_mod table

First, query 10 rows from phat_v2.phot_mod just to get some basic information about the table.

In [None]:
query = """SELECT *
           FROM phat_v2.phot_mod
           LIMIT 10
        """

In [None]:
try:
    result = qc.query(sql=query) # by default the result is a CSV formatted string
except Exception as e:
    print(e)  # not every exception type has a .message attribute

Convert the result, which is by default a CSV formatted string, to a Pandas dataframe.

In [None]:
df = convert(result,'pandas')
print("Number of columns:",len(df.columns))
print("List of columns:", df.columns)
df
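
Since the query result is a CSV-formatted string, a comparable dataframe can also be built with pandas directly. The sketch below uses a small made-up CSV string (the values are invented for illustration) and is not meant to describe how `convert` is implemented internally.

```python
import io
import pandas as pd

# Made-up CSV string standing in for a query result (values are illustrative only)
csv_text = "ra,dec,brick\n11.01,41.20,1\n11.35,41.65,15\n"

# Parse the CSV text into a dataframe, one column per CSV field
df_demo = pd.read_csv(io.StringIO(csv_text))
print(df_demo.shape)
print(list(df_demo.columns))
```

The `convert` helper remains the more convenient choice in this notebook, since it is used consistently for every query result.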

<a class="anchor" id="chapter1"></a>
# Make Healpix maps of the brick number, MS and RGB stars

PHAT tiled the survey area with 23 bricks. Each brick consists of a 3$\times$6 mosaic of 18 HST pointings (<a href="http://adsabs.harvard.edu/abs/2012ApJS..200...18D">Dalcanton et al., 2012</a>).

One of the columns in the PHAT object table, pix4096, is the Healpix index (NSIDE=4096, nested scheme) for the object's RA and Dec. Healpix is a handy tessellation of the sky into tiles of equal area. To make maps of aggregate quantities in PHAT, we're going to use the database to return results in a query grouped by Healpix index value. We can then put the results into arrays, and use `healpy`'s functionality to display the maps.
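
To get a feel for the resolution of this pixelization: a Healpix grid with resolution parameter NSIDE has 12·NSIDE² equal-area pixels, so a typical pixel size follows from dividing the full-sky area by that number. The sketch below does this arithmetic with the standard library only. The "pixel size" here is the side of a square with the same area, which is an approximation because Healpix pixels are not squares.

```python
import math

nside = 4096
npix = 12 * nside**2                               # total number of Healpix pixels on the sky
sky_area_deg2 = 4 * math.pi * (180 / math.pi)**2   # full sky in square degrees
pix_area_deg2 = sky_area_deg2 / npix               # area of one pixel
pix_size_arcmin = math.sqrt(pix_area_deg2) * 60    # side of an equal-area square

print(npix)
print(round(pix_size_arcmin, 2))
```

At NSIDE=4096 each pixel is a little under one arcminute across, which sets the finest spatial detail the maps in this section can show.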

In this first query, the GROUP BY clause tells the database to aggregate the results by the values in the pix4096 column, and return the average RA and Dec of objects in those groups, as well as the pix4096 value itself and the count of the number of objects in the group. Here we only retrieve blue and relatively bright main sequence (MS) stars with good photometric quality.

In [None]:
query = """SELECT avg(ra) as ra0, avg(dec) as dec0, pix4096, count(pix4096) as nb, 
            avg(brick) as brick
           FROM phat_v2.phot_mod
           WHERE f475w_gst=1 AND f814w_gst=1 AND 
            f475w_vega-f814w_vega>-0.5 AND f475w_vega-f814w_vega<0.25 AND f814w_vega<24
           GROUP BY pix4096
          """

In [None]:
try:
    result = qc.query(sql=query) # by default the result is a CSV formatted string
except Exception as e:
    print(e)  # not every exception type has a .message attribute

Convert the result of MS stars to a Pandas dataframe.

In [None]:
df_MS = convert(result,'pandas')
print("Number of rows:", len(df_MS))
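
The GROUP BY aggregation in the query above can be mimicked locally with pandas, which may help clarify what each returned row means. This is only an illustration on a made-up five-object table. In practice, doing the aggregation in the database avoids transferring every individual star.

```python
import pandas as pd

# Toy catalog: five objects falling into two Healpix pixels (values are made up)
toy = pd.DataFrame({
    'pix4096': [10, 10, 10, 42, 42],
    'ra':      [11.0, 11.2, 11.4, 12.0, 12.2],
    'dec':     [41.0, 41.0, 41.3, 42.0, 42.4],
})

# Same idea as SELECT avg(ra), avg(dec), pix4096, count(pix4096) ... GROUP BY pix4096
grouped = (toy.groupby('pix4096')
              .agg(ra0=('ra', 'mean'), dec0=('dec', 'mean'), nb=('ra', 'size'))
              .reset_index())
print(grouped)
```

Each output row corresponds to one Healpix pixel, with the mean position of its objects and the number of objects it contains.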

Compute the center of the RA and Dec distribution of the objects

In [None]:
rarot, decrot = np.median(df_MS['ra0']), np.median(df_MS['dec0'])

### Healpix map of the PHAT brick number

In [None]:
brickmap = np.zeros(hp.nside2npix(4096))
brickmap[df_MS['pix4096']] = df_MS['brick']
hp.gnomview(brickmap, title='PHAT Brick Number', reso=0.4, nest=True, rot=(rarot,decrot,0), notext=True, cmap='jet', min=0, max=23)
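
The map-filling step relies on NumPy integer-array indexing: the array covers every pixel on the sky, and only the pixels returned by the query receive a value. A minimal sketch on a tiny grid, with made-up pixel indices and values:

```python
import numpy as np

nside_demo = 4                         # tiny grid for illustration (the notebook uses 4096)
npix_demo = 12 * nside_demo**2         # 192 pixels

# Made-up query output: pixel indices and a value for each of them
pix = np.array([5, 17, 100])
val = np.array([1.0, 1.0, 23.0])

demo_map = np.zeros(npix_demo)         # every pixel starts at 0 ("no data")
demo_map[pix] = val                    # integer-array indexing fills only the listed pixels

print(np.count_nonzero(demo_map), "of", npix_demo, "pixels filled")
```

Pixels that the query did not return stay at 0, so a value of 0 in the map means "no selected stars here" rather than a measured quantity.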

### Healpix map of MS stars

Young MS stars are clearly clustered around the spiral arms and the 10 kpc ring, where active recent star formation takes place. 

In [None]:
msmap = np.zeros(hp.nside2npix(4096))
msmap[df_MS['pix4096']] = df_MS['nb']
hp.gnomview(msmap, title='Map of MS stars', notext=True, reso=0.4, nest=True, 
            rot=(rarot,decrot,0), norm='log', min=1, max=200)

The same query as above, but for red-giant branch (RGB) stars with good photometry quality.

In [None]:
query = """SELECT pix4096, count(pix4096) as nb
           FROM phat_v2.phot_mod
           WHERE f110w_gst=1 AND f160w_gst=1 AND 
            f110w_vega-f160w_vega>0.75 AND f110w_vega-f160w_vega<2.0 AND 
            f160w_vega>18.5 AND f160w_vega<22.0
           GROUP BY pix4096
          """

In [None]:
try:
    result = qc.query(sql=query) # by default the result is a CSV formatted string
except Exception as e:
    print(e)  # not every exception type has a .message attribute

Convert the result to a Pandas dataframe.

In [None]:
df_RGB = convert(result,'pandas')
print("Number of rows:", len(df_RGB))

### Healpix map of RGB stars

Contrary to MS stars, old RGB stars show a smoother spatial distribution, mainly following a stellar density profile described as a combination of an exponential disk and a bulge.

In [None]:
rgbmap = np.zeros(hp.nside2npix(4096))
rgbmap[df_RGB['pix4096']] = df_RGB['nb']
hp.gnomview(rgbmap, title='Map of RGB stars', notext=True, reso=0.4, nest=True, 
            rot=(rarot,decrot,0), norm='log', min=1, max=1e4)
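
One way to compare the two populations quantitatively, offered as an optional sketch rather than a step of this notebook, is to match the two per-pixel tables on `pix4096` and take a ratio of counts. The numbers below are invented, and `ms_to_rgb` is a hypothetical column name.

```python
import pandas as pd

# Made-up per-pixel counts standing in for df_MS and df_RGB
ms = pd.DataFrame({'pix4096': [10, 42, 77], 'nb': [4, 20, 6]})
rgb = pd.DataFrame({'pix4096': [10, 42, 99], 'nb': [400, 500, 50]})

# Keep only pixels present in both tables, then form the MS-to-RGB count ratio
both = ms.merge(rgb, on='pix4096', suffixes=('_ms', '_rgb'))
both['ms_to_rgb'] = both['nb_ms'] / both['nb_rgb']
print(both)
```

Because the MS and RGB selections use different filters and magnitude limits, such a ratio is only a relative indicator of where young stars are over-represented.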

<a class="anchor" id="chapter2"></a>
# Variation in stellar populations and photometric quality across M31's disk

As seen in the maps of MS and RGB stars above, both the spatial distribution of stellar populations and the stellar number density vary with position within the galaxy disk. The number density of RGB stars, which dominate by number, increases towards the center of M31, where stellar "crowding" becomes the dominant source of error in the HST photometry. The rest of the notebook will explore how stellar populations and photometric quality change with position within the disk by looking at multiple color-magnitude diagrams (CMDs) in representative environments (Bricks 1, 15, and 23). We will also plot stellar broad-band spectral energy distributions (SEDs) for randomly selected stars in each environment.

## Do query for Brick 1 (most crowded central region)

In [None]:
query = """SELECT f275w_vega, f336w_vega, f475w_vega, f814w_vega, f110w_vega, f160w_vega 
           FROM phat_v2.phot_mod
           WHERE ((f275w_gst=1 AND f336w_gst=1) AND 
                  (f475w_gst=1 AND f814w_gst=1) AND 
                  (f110w_gst=1 AND f160w_gst=1)) AND
                  brick=1
          """

In [None]:
try:
    result = qc.query(sql=query) # by default the result is a CSV formatted string
except Exception as e:
    print(e)  # not every exception type has a .message attribute

In [None]:
df_b1 = convert(result,'pandas')
print("Number of rows:", len(df_b1))
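
Queries of this form differ only in the brick number, so the SQL text can be generated by a small helper. The function name `brick_query` is hypothetical, and the 1 to 23 range check assumes the bricks are numbered consecutively from 1, which the notebook suggests but does not state outright.

```python
def brick_query(brick):
    """Return the SQL text selecting six-band photometry for one brick."""
    brick = int(brick)
    if not 1 <= brick <= 23:           # assumption: bricks are numbered 1 to 23
        raise ValueError("brick must be between 1 and 23")
    bands = ['f275w', 'f336w', 'f475w', 'f814w', 'f110w', 'f160w']
    columns = ", ".join(f"{b}_vega" for b in bands)      # magnitude columns
    quality = " AND ".join(f"{b}_gst=1" for b in bands)  # good-star flag in every band
    return (f"SELECT {columns} "
            f"FROM phat_v2.phot_mod "
            f"WHERE {quality} AND brick={brick}")

print(brick_query(15))
```

The returned string could then be passed to `qc.query(sql=...)` in the same way as the hand-written queries.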

### Make UV, optical, and IR CMDs for Brick 1

Young, massive stars are the dominant sources of UV photons, while old, cool stars are the dominant sources of IR photons. Most stellar populations emit a substantial number of optical photons unless they are heavily embedded in dust. Accordingly, the UV CMD highlights mostly young stars, the optical CMD shows nearly all stellar populations, and the IR CMD highlights mostly evolved cool stars.

Let's define a function to plot UV, optical, and IR CMDs for a given Brick table.

In [None]:
def make_cmds(brick, starlist=None):
    """
    brick: Pandas dataframe for a given PHAT Brick
    starlist: list of indices for stars (default=None)
    """
    
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(18,4))
    if starlist is None:
        cmap = plt.cm.viridis
    else:
        cmap=plt.cm.gray_r
    
    huv = ax1.hist2d(brick['f275w_vega']-brick['f336w_vega'], brick['f336w_vega'], 
                     bins=200, range=((-2,4),(15,27)), cmap=cmap, 
                     norm=plt.matplotlib.colors.LogNorm())
    ax1.set_xlabel('F275W - F336W',fontsize=15)
    ax1.set_ylabel('F336W',fontsize=15)
    ax1.set_xlim(huv[1].min()-0.5,huv[1].max()+0.5)
    ax1.set_ylim(huv[2].max()+1,huv[2].min()-1)
    ax1.set_title('UV CMD',fontsize=20)

    hopt = ax2.hist2d(brick['f475w_vega']-brick['f814w_vega'], brick['f814w_vega'], 
                      bins=200, range=((-1,6),(14,29)), cmap=cmap,
                      norm=plt.matplotlib.colors.LogNorm())
    ax2.set_xlabel('F475W - F814W',fontsize=15)
    ax2.set_ylabel('F814W',fontsize=15)
    ax2.set_xlim(hopt[1].min()-0.5,hopt[1].max()+0.5)
    ax2.set_ylim(hopt[2].max()+1,hopt[2].min()-1)
    ax2.set_title('Optical CMD',fontsize=20)

    hir = ax3.hist2d(brick['f110w_vega']-brick['f160w_vega'], brick['f160w_vega'], 
                     bins=200, range=((-1,3),(11,28)), cmap=cmap,
                     norm=plt.matplotlib.colors.LogNorm())
    ax3.set_xlabel('F110W - F160W',fontsize=15)
    ax3.set_ylabel('F160W',fontsize=15)
    ax3.set_xlim(hir[1].min()-0.5,hir[1].max()+0.5)
    ax3.set_ylim(hir[2].max()+1,hir[2].min()-1)
    ax3.set_title('IR CMD',fontsize=20)
    
    if starlist is not None:
        # overplot the selected stars on each CMD (one color per star)
        for i in starlist:
            ax1.scatter(brick['f275w_vega'][i]-brick['f336w_vega'][i], brick['f336w_vega'][i], s=50)
            ax2.scatter(brick['f475w_vega'][i]-brick['f814w_vega'][i], brick['f814w_vega'][i], s=50)
            ax3.scatter(brick['f110w_vega'][i]-brick['f160w_vega'][i], brick['f160w_vega'][i], s=50)


    plt.show()
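
Each CMD panel is a two-dimensional histogram of color against magnitude. The binning itself can be reproduced without any plotting, which is handy for inspecting the counts directly. The sketch below uses synthetic magnitudes (not PHAT data) and the same bin count and range as the optical panel.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Synthetic magnitudes (not PHAT data): a blue band and a red band
f475w = rng.uniform(20.0, 27.0, n)
f814w = f475w - rng.uniform(0.0, 3.0, n)   # makes the color fall between 0 and 3

color = f475w - f814w
# Count stars in color-magnitude bins, using the same range as the optical panel
counts, color_edges, mag_edges = np.histogram2d(
    color, f814w, bins=200, range=((-1, 6), (14, 29)))

print(counts.shape, int(counts.sum()))
```

The edge arrays play the same role as the second and third elements returned by `hist2d`, which the plotting function uses to set the axis limits.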

Since Brick 1 covers the innermost region of M31, this region suffers the most from stellar crowding. The effect of crowding on the photometry quality is shown as broadening of the features and shallow depth in each CMD.

In [None]:
make_cmds(df_b1)

### Plot a broad-band spectral energy distribution of selected stars in Brick 1

First, let's define the pivot wavelengths for the PHAT filters (from the __[HST Instrument Handbooks](https://hst-docs.stsci.edu/hom)__).

In [None]:
# Pivot wavelength in nm for each filter (F275W, F336W, F475W, F814W, F110W, F160W)
plambda = [270.97, 335.45, 474.44, 805.98, 1153.4, 1536.9]
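
Pairing these wavelengths with the magnitude columns in the same filter order gives the SED of a single star. A minimal sketch on one made-up star follows. The helper name `sed_of` and the magnitudes are invented for illustration.

```python
import pandas as pd

filters = ['f275w', 'f336w', 'f475w', 'f814w', 'f110w', 'f160w']
pivot_nm = [270.97, 335.45, 474.44, 805.98, 1153.4, 1536.9]  # same values as plambda

# One made-up star; magnitudes are invented for illustration
toy = pd.DataFrame({f + '_vega': [m] for f, m in
                    zip(filters, [24.1, 23.5, 23.0, 22.2, 21.8, 21.5])})

def sed_of(table, i):
    """Return (wavelengths in nm, Vega magnitudes) for the star at row position i."""
    mags = [table[f + '_vega'].iloc[i] for f in filters]
    return pivot_nm, mags

wl, mags = sed_of(toy, 0)
for w, m in zip(wl, mags):
    print(f"{w:8.2f} nm  {m:.1f} mag")
```

Keeping the filter list and the wavelength list in the same order is what ties each magnitude to the right point on the wavelength axis.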

Define a function to select stars, in a given Brick, that have good photometry in all 6 bands.

In [None]:
def good_stars(brick):
    stars_6b, = np.where((brick['f275w_vega'] < 30) & (brick['f336w_vega'] < 30) &
                         (brick['f475w_vega'] < 30) & (brick['f814w_vega'] < 30) & 
                         (brick['f110w_vega'] < 30) & (brick['f160w_vega'] < 30))
    print('There are %d stars with good measurements in all 6 bands!' % (len(stars_6b))) 
    
    return stars_6b
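
The same selection can be written as a single row-wise test in pandas. The sketch below assumes that missing measurements are flagged with a large placeholder magnitude. The value 99.999 is a hypothetical choice for illustration, and the three stars are made up.

```python
import numpy as np
import pandas as pd

cols = ['f275w_vega', 'f336w_vega', 'f475w_vega',
        'f814w_vega', 'f110w_vega', 'f160w_vega']

# Three made-up stars; 99.999 is used here as a hypothetical "no measurement" flag
toy = pd.DataFrame(
    [[24.0, 23.5, 23.0, 22.0, 21.5, 21.0],
     [99.999, 99.999, 25.0, 23.0, 22.0, 21.5],
     [22.0, 21.8, 21.5, 21.0, 20.8, 20.5]],
    columns=cols)

# True only for rows where every band is brighter than magnitude 30
good = (toy[cols] < 30).all(axis=1)
good_rows = np.flatnonzero(good.to_numpy())   # row positions of the good stars
print(good_rows)
```

Comparisons with NaN evaluate to False, so a cut of this kind also drops rows where a magnitude is missing entirely.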

Define a function to pick three random stars from the sample with good 6-band photometry, plot their broad-band SEDs, and mark their positions in the UV, optical, and IR CMDs.

In [None]:
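def make_seds(brick):
    """
    Plot broad-band SEDs for three randomly selected stars with good 6-band
    photometry, then mark the same stars on the UV, optical, and IR CMDs.
    brick: Pandas dataframe for a given PHAT Brick
    """
    fig, ax1 = plt.subplots(1, 1, figsize=(6,4))

    stars_6b = good_stars(brick)
    # sample without replacement so the three stars are distinct
    sIDs = random.sample(list(stars_6b), k=3)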
    for i in sIDs:
        ax1.plot(plambda, [brick['f275w_vega'][i], brick['f336w_vega'][i], 
                           brick['f475w_vega'][i], brick['f814w_vega'][i],
                           brick['f110w_vega'][i], brick['f160w_vega'][i]], '.-')
    ymin = np.min(brick.iloc[sIDs].values.ravel())
    ymax = np.max(brick.iloc[sIDs].values.ravel())
    ax1.set_ylim(ymax+0.5, ymin-0.5)
    ax1.set_xlabel(r'$\lambda$ [nm]',fontsize=15)
    ax1.set_ylabel('magnitude',fontsize=15)

    make_cmds(brick, sIDs)

    plt.show()
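
Because the stars are chosen at random, each run of the function shows different stars. If reproducible picks that are guaranteed to be distinct are preferred, NumPy's random generator can be seeded and asked to sample without replacement. A minimal sketch with made-up candidate row positions:

```python
import numpy as np

rng = np.random.default_rng(42)                 # fixed seed so the pick is reproducible
candidates = np.array([3, 8, 15, 21, 34, 55])   # made-up row positions

# replace=False guarantees three different entries
picked = rng.choice(candidates, size=3, replace=False)
print(picked)
```

Rerunning with the same seed returns the same three positions, which makes it easier to discuss a specific star across sessions.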

Pick three random stars from the sample with good 6-band photometry, plot their broad-band SEDs, and mark their positions in each CMD.

In [None]:
make_seds(df_b1)

## Do query for Brick 15 (10 kpc star-forming ring region)

Brick 15 covers a portion of the 10 kpc star-forming ring in M31's disk. This region suffers less from stellar crowding than Brick 1, and thus has deeper CMDs. However, it contains more dust, and the resulting extinction and reddening make the CMD features fainter and broader.

In [None]:
query = """SELECT f275w_vega, f336w_vega, f475w_vega, f814w_vega, f110w_vega, f160w_vega 
           FROM phat_v2.phot_mod
           WHERE ((f275w_gst=1 AND f336w_gst=1) AND 
                  (f475w_gst=1 AND f814w_gst=1) AND 
                  (f110w_gst=1 AND f160w_gst=1)) AND
                  brick=15
          """

In [None]:
try:
    result = qc.query(sql=query) # by default the result is a CSV formatted string
except Exception as e:
    print(e)  # not every exception type has a .message attribute

In [None]:
df_b15 = convert(result,'pandas')
print("Number of rows:", len(df_b15))

### Make UV, optical, and IR CMDs for Brick 15

In [None]:
make_cmds(df_b15)

Pick three random stars from the sample with good 6-band photometry, plot their broad-band SEDs, and mark their positions in each CMD.

In [None]:
make_seds(df_b15)

## Do query for Brick 23 (outermost low-density region)

Brick 23 covers the outermost part of the star-forming disk, so we expect to see prominent young MS stars. Stellar crowding is weakest here, and dust has less effect than in Bricks 1 and 15. Together, these allow Brick 23 to reach the deepest and sharpest CMDs within the PHAT footprint.

In [None]:
query = """SELECT f275w_vega, f336w_vega, f475w_vega, f814w_vega, f110w_vega, f160w_vega
           FROM phat_v2.phot_mod
           WHERE ((f275w_gst=1 AND f336w_gst=1) AND 
                  (f475w_gst=1 AND f814w_gst=1) AND 
                  (f110w_gst=1 AND f160w_gst=1)) AND
                  brick=23
          """

In [None]:
try:
    result = qc.query(sql=query) # by default the result is a CSV formatted string
except Exception as e:
    print(e)  # not every exception type has a .message attribute

In [None]:
df_b23 = convert(result,'pandas')
print("Number of rows:", len(df_b23))

### Make UV, optical, and IR CMDs for Brick 23

In [None]:
make_cmds(df_b23)

Pick three random stars from the sample with good 6-band photometry, plot their broad-band SEDs, and mark their positions in each CMD.

In [None]:
make_seds(df_b23)
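
The difference in depth between the bricks can also be summarized with a single number per brick. One rough proxy, suggested here as an assumption rather than as the survey's completeness limit, is a high percentile of the magnitude distribution in one band. The sketch uses synthetic magnitudes (not PHAT data), and `faint_limit` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Synthetic F814W magnitudes (not PHAT data) with different faint limits
toy_bricks = {
    'crowded': pd.DataFrame({'f814w_vega': rng.uniform(18.0, 24.0, 5000)}),
    'sparse':  pd.DataFrame({'f814w_vega': rng.uniform(18.0, 27.0, 5000)}),
}

def faint_limit(table, band='f814w_vega', pct=95):
    """Magnitude that pct percent of the stars are brighter than: a rough depth proxy."""
    return float(np.percentile(table[band], pct))

for name, table in toy_bricks.items():
    print(name, round(faint_limit(table), 2))
```

Applied to `df_b1`, `df_b15`, and `df_b23`, a larger value would indicate that the catalog reaches fainter stars in that brick.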

<a class="anchor" id="resources"></a>
# Resources and references
Dalcanton, J.J. et al. (2012, ApJS, 200, 18), "The Panchromatic Hubble Andromeda Treasury"
http://adsabs.harvard.edu/abs/2012ApJS..200...18D

Williams, B.F. et al. (2023, arXiv:2307.09681), "The Panchromatic Hubble Andromeda Treasury XXI. The Legacy Resolved Stellar Photometry Catalog"
https://ui.adsabs.harvard.edu/abs/2023arXiv230709681W/abstract