# [$\alpha$/Fe] vs [Fe/H] throughout the Galaxy

This notebook is where you will generate the central plot of this project: the alpha-element-to-iron abundance ratio against metallicity for different regions of the Milky Way.

## Setup

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# for the random sampling function
from numpy.random import default_rng
# for coordinate conversions
import astropy.units as u
from astropy.coordinates import SkyCoord, Galactic, Galactocentric

In [None]:
# matplotlib settings
plt.rcParams.update({
    'xtick.direction': 'in',  # set ticks pointing in
    'ytick.direction': 'in',
    'xtick.top': True,        # add ticks to top and right edges
    'ytick.right': True,
    'figure.dpi': 300,        # high resolution for poster
})

In the cell below, use the `pd.read_csv()` function to import the APOGEE data into a DataFrame named `full_df`. Refer back to previous notebooks if you're unsure.

### Sample selection

In [None]:
# Limit to main survey targets
sample_df = full_df[full_df['EXTRATARG'] == 0]
# Remove all stars flagged as bad (bit 23 of ASPCAPFLAG)
sample_df = sample_df[(sample_df['ASPCAPFLAG'] & (2**23)) == 0]
# Replace the 99.999 placeholder values with NaN
# (reassigning avoids modifying a slice of full_df in place)
sample_df = sample_df.replace(99.999, np.nan)
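If the bitmask test in the cell above looks opaque, here is a small sketch using made-up flag values (not real APOGEE data). `2**23` is an integer with only bit 23 set, so `flag & 2**23` is nonzero exactly when bit 23 is set in `flag`. The names `toy_df`, `bad_bit`, and `is_good` are just for illustration.

```python
import numpy as np
import pandas as pd

# Made-up flag values for illustration only
toy_df = pd.DataFrame({'ASPCAPFLAG': [0, 2**23, 2**3, 2**23 + 2**3]})

bad_bit = 2**23  # integer with only bit 23 set
# True where bit 23 is NOT set, whatever the other bits are
is_good = (toy_df['ASPCAPFLAG'] & bad_bit) == 0
print(is_good.tolist())
```

Stars with other bits set (like the third row) are kept; only bit 23 decides the cut.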

In the blank cell below, limit the contents of `sample_df` to stars with signal-to-noise ratio greater than 80, surface gravity $\log(g)$ between 1.0 and 3.8, and temperature $T_{\rm{eff}}$ between 3500 K and 5500 K. Refer back to the `ExploringAPOGEE.ipynb` notebook if you're unsure.
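As a reminder of the general pattern, here is a minimal sketch of combining several cuts with boolean masks on a made-up DataFrame. The column names `SNR`, `LOGG`, and `TEFF` are my assumptions here, so check them against the columns in your own DataFrame before reusing anything.

```python
import pandas as pd

# Made-up stars; the column names are assumptions, not guaranteed to match
toy_df = pd.DataFrame({
    'SNR':  [120.0, 60.0, 200.0, 95.0],
    'LOGG': [2.5, 2.0, 4.4, 1.2],
    'TEFF': [4800.0, 4500.0, 5200.0, 3400.0],
})

# Each condition is a boolean Series; wrap each one in parentheses
# and join them with & (element-wise "and")
keep = (
    (toy_df['SNR'] > 80)
    & (toy_df['LOGG'] > 1.0) & (toy_df['LOGG'] < 3.8)
    & (toy_df['TEFF'] > 3500) & (toy_df['TEFF'] < 5500)
)
toy_cut = toy_df[keep]
print(len(toy_cut))
```

Comparisons against NaN evaluate to False, so rows with missing values in any of these columns are dropped by the mask as well.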

Print the sample DataFrame as a sanity check: there should be roughly 192,000 stars.

In [None]:
# reset the DataFrame index to sequential integers
sample_df.reset_index(inplace=True, drop=True)
# print the sample as a sanity check
sample_df

### Galactocentric coordinates

In [None]:
def galactic_to_galactocentric(l, b, distance):
    r"""
    Use astropy's SkyCoord to convert Galactic (l, b, distance) coordinates
    to galactocentric (r, phi, z) coordinates.

    Parameters
    ----------
    l : array-like
        Galactic longitude in degrees
    b : array-like
        Galactic latitude in degrees
    distance : array-like
        Distance (from Sun) in parsecs

    Returns
    -------
    galr : numpy array
        Galactocentric radius in kpc
    galphi : numpy array
        Galactocentric phi-coordinates in degrees
    galz : numpy array
        Galactocentric z-height in kpc
    """
    # u.Quantity attaches units to plain arrays and converts inputs that
    # already carry units, so existing units are not silently dropped
    l = u.Quantity(l, u.deg)
    b = u.Quantity(b, u.deg)
    d = u.Quantity(distance, u.pc)
    if not (l.shape == b.shape == d.shape):
        raise ValueError('Inputs must have the same shape.')
    galactic = SkyCoord(l=l, b=b, distance=d, frame=Galactic())
    galactocentric = galactic.transform_to(Galactocentric())
    # Switch to cylindrical coordinates to read off (rho, phi, z)
    galactocentric.representation_type = 'cylindrical'
    galr = galactocentric.rho.to(u.kpc).value
    galphi = galactocentric.phi.to(u.deg).value
    galz = galactocentric.z.to(u.kpc).value
    return galr, galphi, galz

Using the function above, generate three arrays of galactocentric radius ($R_{\rm{Gal}}$), azimuth ($\phi$), and distance from the midplane ($z$). You'll need to pass three columns from the sample DataFrame as inputs: `GLON`, `GLAT`, and `GAIAEDR3_R_MED_PHOTOGEO`. Then, create three new columns in the DataFrame named `GALR`, `GALPHI`, and `GALZ`, one for each of the new arrays. Refer back to `PlottingAbundances.ipynb` if you're unsure.
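If you want a rough cross-check on the values in your new columns, here is a plain-numpy sketch of the underlying geometry. It is a simplification, not what astropy does internally: it assumes the Sun sits exactly in the midplane at 8.122 kpc from the Galactic center (my assumed value), whereas astropy's frame also accounts for the Sun's small height above the plane. Expect small differences in $z$ and $R_{\rm{Gal}}$. The name `approx_galactocentric` is hypothetical.

```python
import numpy as np

R_SUN_KPC = 8.122  # assumed Sun-to-Galactic-center distance in kpc

def approx_galactocentric(l_deg, b_deg, dist_pc):
    """Approximate (R, phi, z), ignoring the Sun's height above the midplane."""
    l = np.radians(np.asarray(l_deg, dtype=float))
    b = np.radians(np.asarray(b_deg, dtype=float))
    d = np.asarray(dist_pc, dtype=float) / 1000.0  # pc -> kpc
    # Cartesian position with the Galactic center at the origin
    # and the Sun on the negative x-axis
    x = d * np.cos(b) * np.cos(l) - R_SUN_KPC
    y = d * np.cos(b) * np.sin(l)
    z = d * np.sin(b)
    galr = np.hypot(x, y)
    galphi = np.degrees(np.arctan2(y, x))
    return galr, galphi, z

# A star 1 kpc away toward the anticenter (l = 180 deg), in the plane
galr, galphi, galz = approx_galactocentric([180.0], [0.0], [1000.0])
print(galr, galz)
```

A star in the plane toward the anticenter should come out 1 kpc farther from the center than the Sun, which is a quick sanity check on the sign conventions.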

### Other useful functions

Below is a function to randomly sample N rows from a DataFrame. This might be useful if you want to make a scatterplot of stars from APOGEE without cluttering the plot with hundreds of thousands of points.

In [None]:
def sample_dataframe(df, n, weights=None, reset=True):
    """
    Randomly sample n unique rows from a pandas DataFrame.

    Parameters
    ----------
    df : pandas DataFrame
    n : int
        Number of random samples to draw
    weights : array, optional
        Probability weights of the given DataFrame
    reset : bool, optional
        If True, reset sample DataFrame index

    Returns
    -------
    pandas DataFrame
        Re-indexed DataFrame of n sampled rows
    """
    if isinstance(df, pd.DataFrame):
        # Initialize default numpy random number generator
        rng = default_rng()
        # Randomly sample without replacement
        rand_indices = rng.choice(df.index, size=n, replace=False, p=weights)
        sample = df.loc[rand_indices]
        if reset:
            sample.reset_index(inplace=True, drop=True)
        return sample
    else:
        raise TypeError('Expected pandas DataFrame.')
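As an aside, pandas has a built-in `DataFrame.sample()` method that does a similar job. The sketch below is an alternative to the function above, not how that function is implemented. It uses a made-up DataFrame, and the fixed `random_state` is only there so that rerunning the cell draws the same rows.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only
toy_df = pd.DataFrame({'FE_H': np.linspace(-1.0, 0.5, 10)})

# Draw 4 distinct rows (sampling is without replacement by default);
# random_state fixes the seed so reruns give the same rows
toy_sample = toy_df.sample(n=4, random_state=42).reset_index(drop=True)
print(len(toy_sample))
```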

## Make the plot

The goal of this plot is to illustrate how the distribution of [$\alpha$/Fe] vs [Fe/H] in stars changes with location in the Galaxy. There are many ways to do this, and the choice often comes down to factors like clarity, aesthetics, and personal preference. Here are a couple of options, but feel free to experiment and let me know if you want to try a different route. You only need to pick one.

### Scatter plots

- Make three subplots representing different radial zones: 2-6 kpc, 6-10 kpc, and 10-14 kpc. 
- In each zone, make a scatter plot of [$\alpha$/M] vs [Fe/H] and color-code the points by their _absolute_ distance from the midplane $|z|$. 
- It would be a good idea to limit your sample to only stars with $0\leq |z| < 2$ kpc. 
- You'll also want to use `sample_dataframe()` from above to plot only a small number of all stars in each panel (start with 1000 and try different numbers until it looks good). 
- Make sure to include a colorbar, and use `ax.set_title()` to label each panel with the range of radii.
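To show how the pieces in the list above fit together, here is a rough layout sketch using random numbers in place of APOGEE data. The point sizes, colormap, figure size, and the random distributions are all placeholder choices of mine, so treat it as a starting skeleton rather than the finished plot.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Made-up values standing in for [Fe/H], [alpha/M], and |z|; not APOGEE data
feh = rng.normal(-0.2, 0.3, size=(3, 500))
alpha = rng.normal(0.1, 0.08, size=(3, 500))
absz = rng.uniform(0.0, 2.0, size=(3, 500))
zones = [(2, 6), (6, 10), (10, 14)]  # radial zone edges in kpc

fig, axs = plt.subplots(1, 3, figsize=(12, 4), sharex=True, sharey=True)
for ax, (rmin, rmax), x, y, c in zip(axs, zones, feh, alpha, absz):
    # vmin/vmax pin the color scale so all panels can share one colorbar
    pts = ax.scatter(x, y, c=c, s=2, cmap='viridis', vmin=0, vmax=2)
    ax.set_title(f'{rmin}-{rmax} kpc')
    ax.set_xlabel('[Fe/H]')
axs[0].set_ylabel(r'[$\alpha$/M]')
# Passing ax=axs attaches a single colorbar to the whole row of panels
fig.colorbar(pts, ax=axs, label='|z| [kpc]')
```

Because every panel uses the same `vmin` and `vmax`, the colorbar built from the last panel is valid for all three.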

### 2D histograms

- Make a 2-by-3 grid of subplots where the columns represent different radial zones (2-6 kpc, 6-10 kpc, and 10-14 kpc), and the rows represent different bins in $|z|$ (0-0.5 kpc and 0.5-2 kpc). 
- In each subplot, plot a 2D histogram of the number of stars in [$\alpha$/M] vs [Fe/H] space. 
- You'll need to set the plot range using the keyword argument `range=[[<xmin>, <xmax>], [<ymin>, <ymax>]]` (here I'm using angle brackets <> to represent numbers you'll need to change). 
- Also adjust the number of bins using `bins=(<xbins>, <ybins>)`: pick numbers that make the bins look more or less square, with enough resolution to make out some of the finer details. 
- Label each panel with the region of the Galaxy that it represents: you can try using `ax.text()`, `ax.set_title()`, `ax.set_ylabel()`, or some combination of those to make the labels clear.
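Here is a rough skeleton of the grid described above, again with random numbers in place of APOGEE data. The plot range, bin counts, colormap, and label positions are placeholder values I picked for the fake data, so you will need to tune them for the real sample.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
zones = [(2, 6), (6, 10), (10, 14)]  # columns: radial zones in kpc
heights = [(0, 0.5), (0.5, 2)]       # rows: |z| bins in kpc

fig, axs = plt.subplots(2, 3, figsize=(12, 6), sharex=True, sharey=True)
for i, (zmin, zmax) in enumerate(heights):
    for j, (rmin, rmax) in enumerate(zones):
        ax = axs[i, j]
        # Made-up points standing in for the stars that fall in this region
        feh = rng.normal(-0.2, 0.3, size=2000)
        alpha = rng.normal(0.1, 0.08, size=2000)
        counts, xedges, yedges, image = ax.hist2d(
            feh, alpha, bins=(60, 40),
            range=[[-1.2, 0.6], [-0.2, 0.4]], cmap='Greys')
        # transform=ax.transAxes places the text in axes-fraction coordinates
        ax.text(0.95, 0.92, f'$|z|$ = {zmin}-{zmax} kpc',
                transform=ax.transAxes, ha='right', va='top')
        if i == 0:
            ax.set_title(f'{rmin}-{rmax} kpc')
        if i == len(heights) - 1:
            ax.set_xlabel('[Fe/H]')
        if j == 0:
            ax.set_ylabel(r'[$\alpha$/M]')
```

Using the same `range` and `bins` in every panel keeps the bin edges identical across the grid, which makes the panels directly comparable.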

### Other notes

- You'll want to adjust the size of the figure using the `figsize=(<width in inches>, <height in inches>)` keyword argument in `plt.subplots()`. Choose a size that keeps the font size fairly large and make it wider than it is tall.
- Check out the [matplotlib page on colormaps](https://matplotlib.org/stable/tutorials/colors/colormaps.html) (you can set the colormap using the `cmap='<name>'` keyword argument). 
- Also see [the page for ax.text()](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.text.html) for how to put text in your subplots. 

As always, let me know if you have any questions or if any difficulties arise. Happy plotting!