# DC2 object catalog: removing Milky Way dust

Contributed by: **Sam Schmidt [@sschmidt23](https://github.com/LSSTDESC/DC2-analysis/issues/new?body=@sschmidt23)**

Last Verified to Run: **2020-06-19** (by @sschmidt23)

The DC2 object catalogs generated from mock images include simulated Milky Way dust.  If you need object colors, the effects of this foreground must be removed.  This notebook gives a quick demo of using the `dustmaps` package to remove Milky Way foreground dust from the DC2 object catalogs.

Typically, foreground dust is parameterized by E(B-V), and the extinction to remove in each specific band, A_lambda, is found via an A_lambda/E(B-V) parameter specific to each filter.  A separate notebook [LINK HERE LATER] can show you how to derive these parameters for the LSST bandpass shapes assumed in the DC2 simulations, but for the purposes of this notebook we will simply list the A_lambda/E(B-V) parameters, which for filters `u,g,r,i,z,y` are `4.81,3.64,2.70,2.06,1.58,1.31`.
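
As a rough illustration of the relation above, the extinction in a band is A_lambda = [A_lambda/E(B-V)] × E(B-V), and the dereddened magnitude is the observed magnitude minus A_lambda. The sketch below uses plain Python only. The dictionary name `A_EBV` and the helper `deredden` are made up for this example, and the sample magnitude and E(B-V) are arbitrary illustrative values, not catalog data.

```python
# A_lambda / E(B-V) coefficients quoted above for the six LSST filters
A_EBV = {'u': 4.81, 'g': 3.64, 'r': 2.70, 'i': 2.06, 'z': 1.58, 'y': 1.31}


def deredden(mag, ebv, band):
    """Return the magnitude with Milky Way extinction removed (hypothetical helper)."""
    a_lambda = A_EBV[band] * ebv  # extinction in this band, in magnitudes
    return mag - a_lambda


# Illustrative values only: an observed u-band magnitude of 22.0 at E(B-V) = 0.05
u_dered = deredden(22.0, 0.05, 'u')
print(f"A_u = {A_EBV['u'] * 0.05:.4f} mag, dereddened u = {u_dered:.4f}")
```

Keying the coefficients by band name, rather than by position in an array, is one way to avoid mismatching a coefficient and a filter if the band order ever changes.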

__Logistics__: This notebook is intended to be run through the Jupyter Lab NERSC interface available here: https://jupyter.nersc.gov/. To set up your NERSC environment, please follow the instructions available here: https://confluence.slac.stanford.edu/display/LSSTDESC/Using+Jupyter+at+NERSC

__Other notes__: 
This demo uses the non-DESC `dustmaps` package, which employs astropy units, so both `dustmaps` and `astropy` must be available in the user's Python path.  Both packages are available in the `desc-python-dev` kernel.

The DC2 simulations assume SFD reddening, with interpolation between map pixels turned on.  The `dustmaps` package can work with several dust maps derived from a variety of sources.  We will point the `dustmaps` code to the SFD maps with the `config['data_dir']` parameter in the cell below.

In [None]:
import numpy as np
import pandas as pd
import GCRCatalogs
import dustmaps
from dustmaps.sfd import SFDQuery
from astropy.coordinates import SkyCoord
from dustmaps.config import config
import matplotlib.pyplot as plt
config['data_dir'] = '/global/cfs/cdirs/lsst/groups/PZ/PhotoZDC2/run2.2i_dr6_test/TESTDUST/mapdata' #update this path when dustmaps are copied to a more stable location!
%matplotlib inline

In [None]:
# set the A_lambda/E(B-V) values for the six LSST filters (u, g, r, i, z, y)
band_a_ebv = np.array([4.81,3.64,2.70,2.06,1.58,1.31])

Let's grab a sample set of DC2 data to deredden.  We'll use run2.2i_dr3, tract 3450, and store it in a pandas DataFrame for simplicity:

In [None]:
cat = GCRCatalogs.load_catalog("dc2_object_run2.2i_dr3")

In [None]:
columns = ['ra','dec','extendedness','blendedness']
for band in ['u','g','r','i','z','y']:
    columns.append(f'mag_{band}_cModel') #cModel magnitudes
    columns.append(f'mag_{band}') #alias for the PSF magnitudes

In [None]:
data = cat.get_quantities(columns,native_filters=['tract==3450'])

In [None]:
df = pd.DataFrame(data)

In [None]:
df.head()

Now we need to create a set of astropy `SkyCoord` coordinates for all of our RA and Dec values:

In [None]:
coords = SkyCoord(df['ra'], df['dec'], unit='deg', frame='fk5')

Looking up the E(B-V) value at each position is now a simple procedure with `dustmaps`:

In [None]:
sfd = SFDQuery()
ebvvec = sfd(coords)
df['ebv'] = ebvvec

To deredden the magnitudes, we simply need to subtract [A_lambda/E(B-V)] × E(B-V) from the magnitude in each band:

In [None]:
for i,band in enumerate(['u','g','r','i','z','y']):
    df[f'mag_{band}_cModel_dered']= df[f'mag_{band}_cModel']-df['ebv']*band_a_ebv[i]
    df[f'mag_{band}_dered'] = df[f'mag_{band}']-df['ebv']*band_a_ebv[i]

That's it!  But, to check if our dereddening worked correctly we'll make a few plots.  Let's see what our E(B-V) map looks like in this tract:

In [None]:
fig = plt.figure(figsize=(15,12))
plt.scatter(df['ra'][::10],df['dec'][::10],s=15,c=df['ebv'][::10],cmap='hot')
plt.xlabel("RA (degrees)",fontsize=18)
plt.ylabel("DEC (degrees)",fontsize=18)
plt.colorbar();

We see varying amounts of foreground dust, with E(B-V) going as high as 0.05.  For the u-band this means a dereddening correction as high as about 0.24 magnitudes (4.81 × 0.05), with lesser effects for longer wavelength bands.  Let's select some non-blended, non-extended samples from the region with high E(B-V), and compare to a tabulated list of some of the star colors used in IMSIM: the Kurucz and "old mlt" (red) stars:

In [None]:
mask = (df['ebv']>.04) & (df['blendedness']<.05) & (df['extendedness']<.1) & (df['mag_i_cModel_dered']<23.5)
gooddf = df[mask]

For comparison, we have queried the truth fluxes with and without Milky Way dust from the postgres SQL database (which you can learn about in the postgres_truth.ipynb notebook in the tutorials directory).  We ran a query to grab true fluxes for stars in the same region of sky as the example tract.  To save time we've saved this data in a parquet file that we will read in with Pandas:

In [None]:
stardf = pd.read_parquet("data/demo_star_fluxes_colors_dust.parquet",engine='pyarrow')

In [None]:
stardf.head()
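
The parquet file already contains the color columns used in the plots below (for example `gmr` and `gmr_nomw`). For reference, a color follows from two fluxes through the standard magnitude relation m_1 - m_2 = -2.5 log10(f_1/f_2), provided both fluxes are in the same linear units on a common zero point, so that the zero point cancels. The sketch below is not the query or conversion actually used to build the file. The function name and the flux values are invented for illustration.

```python
import math


def color_from_fluxes(flux_blue, flux_red):
    """Color (m_blue - m_red) from two fluxes given in the same linear units.

    Hypothetical helper: assumes both fluxes share a zero point, so it cancels.
    """
    return -2.5 * math.log10(flux_blue / flux_red)


# Made-up fluxes: the redder band is 2.5 times brighter than the bluer one
g_minus_r = color_from_fluxes(1000.0, 2500.0)
print(f"g-r = {g_minus_r:.3f}")
```

A positive color means the object is brighter in the redder band, which is the sense in which dust "reddens" the observed colors.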

Let's plot a color-color diagram of r-i vs g-r and show the observed star colors before and after dereddening, and compare to the truth table star colors with and without MW dust included:

In [None]:
fig = plt.figure(figsize=(10,10))
plt.scatter(gooddf['mag_g']-gooddf['mag_r'],gooddf['mag_r']-gooddf['mag_i'],s=10,c='r',label="stars before dered")
plt.scatter(gooddf['mag_g_dered']-gooddf['mag_r_dered'],gooddf['mag_r_dered']-gooddf['mag_i_dered'],s=10,c='dodgerblue',label='dereddened stars')
plt.scatter(stardf['gmr'],stardf['rmi'],s=20,c='purple',label ="truth star colors with MW dust")
plt.scatter(stardf['gmr_nomw'],stardf['rmi_nomw'],s=20,c='k',label ="truth star colors with no MW dust" )
plt.xlim(-.5,2.5)
plt.ylim(-.5,2.5)
plt.xlabel("g-r",fontsize=18)
plt.ylabel("r-i",fontsize=18)
plt.legend(loc='lower right',fontsize=16);

We see the reddening vector as a shift in the colors, easily visible between the red and blue points and between the purple and black points.  This reddening vector is roughly aligned with the stellar locus for the bluer stars, but an offset is evident in the red M and L dwarfs.  We see that the dereddening procedure does, indeed, correct for the dust extinction.  We can plot the two datasets on separate axes to make this clearer:

In [None]:
fig = plt.figure(figsize=(20,10))
plt.subplot(121)
plt.scatter(gooddf['mag_g']-gooddf['mag_r'],gooddf['mag_r']-gooddf['mag_i'],s=10,c='r',label="stars before dered")
plt.scatter(stardf['gmr'],stardf['rmi'],s=20,c='purple',label="truth star colors with MW dust")
#plt.scatter(stardf['gmr_nomw'],stardf['rmi_nomw'],s=50,c='k')
plt.xlim(-.5,2.5)
plt.ylim(-.5,2.5)
plt.xlabel("g-r",fontsize=18)
plt.ylabel("r-i",fontsize=18)
plt.legend(loc='lower right',fontsize=16)
plt.subplot(122)
plt.scatter(gooddf['mag_g_dered']-gooddf['mag_r_dered'],gooddf['mag_r_dered']-gooddf['mag_i_dered'],s=10,c='dodgerblue',label='dereddened stars')
#plt.scatter(stardf['gmr'],stardf['rmi'],s=20,c='purple',label = "truth star colors")
plt.scatter(stardf['gmr_nomw'],stardf['rmi_nomw'],s=20,c='k',label='truth star colors without MW dust')
plt.xlim(-.5,2.5)
plt.ylim(-.5,2.5)
plt.xlabel("g-r",fontsize=18)
plt.ylabel("r-i",fontsize=18)
plt.legend(loc='lower right',fontsize=16);
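
The size of the reddening vector seen in the color-color diagrams above can be estimated directly from the A_lambda/E(B-V) coefficients: the shift in a color is the difference of the two band coefficients times E(B-V). The sketch below works this out in plain Python for E(B-V) = 0.05, roughly the largest value in this tract. The names `A_EBV` and `color_shift` are made up for this example.

```python
# A_lambda / E(B-V) coefficients listed at the top of this notebook
A_EBV = {'u': 4.81, 'g': 3.64, 'r': 2.70, 'i': 2.06, 'z': 1.58, 'y': 1.31}


def color_shift(blue_band, red_band, ebv):
    """Reddening of the color (blue_band - red_band), in magnitudes (hypothetical helper)."""
    return (A_EBV[blue_band] - A_EBV[red_band]) * ebv


ebv = 0.05  # roughly the largest E(B-V) seen in this tract
d_gr = color_shift('g', 'r', ebv)
d_ri = color_shift('r', 'i', ebv)
d_iz = color_shift('i', 'z', ebv)
print(f"shift in g-r: {d_gr:.3f}, r-i: {d_ri:.3f}, i-z: {d_iz:.3f}")
```

With these coefficients the shifts come out to about 0.047 in g-r, 0.032 in r-i and 0.024 in i-z, so the offsets between the observed and dereddened points should be a few hundredths of a magnitude in each color.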

And, finally, we will also plot i-z vs g-r:

In [None]:
fig = plt.figure(figsize=(10,10))
plt.scatter(gooddf['mag_g']-gooddf['mag_r'],gooddf['mag_i']-gooddf['mag_z'],s=10,c='r',label="stars before dered")
plt.scatter(stardf['gmr_nomw'],stardf['imz_nomw'],s=20,c='k',label="truth star colors with no MW dust")
plt.scatter(gooddf['mag_g_dered']-gooddf['mag_r_dered'],gooddf['mag_i_dered']-gooddf['mag_z_dered'],s=10,c='dodgerblue',label='dereddened stars')
plt.scatter(stardf['gmr'],stardf['imz'],s=20,c='purple',label='truth star colors with MW dust')
plt.xlim(-.5,2.)
plt.ylim(-.5,1.5)
plt.xlabel("g-r",fontsize=18)
plt.ylabel("i-z",fontsize=18)
plt.legend(loc='lower right',fontsize=16);

In [None]:
fig = plt.figure(figsize=(20,10))
plt.subplot(121)
plt.scatter(gooddf['mag_g']-gooddf['mag_r'],gooddf['mag_i']-gooddf['mag_z'],s=10,c='r',label="stars before dered")
plt.scatter(stardf['gmr'],stardf['imz'],s=20,c='purple',label='truth star colors with MW dust')
#plt.scatter(stardf['gmr_nomw'],stardf['imz_nomw'],s=20,c='k',label ='truth star colors with no MW dust')
plt.xlim(-.5,2.)
plt.ylim(-.5,1.5)
plt.xlabel("g-r",fontsize=18)
plt.ylabel("i-z",fontsize=18)
plt.legend(loc='lower right',fontsize = 16)
plt.subplot(122)
plt.scatter(gooddf['mag_g_dered']-gooddf['mag_r_dered'],gooddf['mag_i_dered']-gooddf['mag_z_dered'],s=10,c='dodgerblue',label='dereddened stars')
#plt.scatter(stardf['gmr'],stardf['imz'],s=20,c='purple',label='truth star colors')
plt.scatter(stardf['gmr_nomw'],stardf['imz_nomw'],s=20,c='k',label='truth star colors with no MW dust')
plt.xlim(-.5,2.)
plt.ylim(-.5,1.5)
plt.xlabel("g-r",fontsize=18)
plt.ylabel("i-z",fontsize=18)
plt.legend(loc='lower right',fontsize=16);