# In-class notebook for reading in data using astroML and other codes

In this notebook, we will get familiar with the basic functions for downloading SDSS data with astroML and with some basic plotting techniques. We will then switch gears to talk a little about common speed-up strategies in sorting and searching.

This notebook is intended to support Chapters 1-2 of the textbook. See the [Data Set Examples](https://www.astroml.org/examples/datasets/index.html) for astroML to explore and work the exercises.

Material is taken from the following scripts (from astroML):
* https://github.com/astroML/astroML_figures/blob/main/book_figures/chapter1/fig_SDSS_imaging.py
* https://github.com/astroML/astroML_figures/blob/main/book_figures/chapter1/fig_sdss_S82standards.py
* https://github.com/astroML/astroML_figures/blob/main/book_figures/chapter1/fig_S82_scatter_contour.py
* https://github.com/astroML/astroML_figures/blob/main/book_figures/chapter1/fig_S82_hess.py
* https://github.com/astroML/astroML_figures/blob/main/book_figures/chapter1/fig_sdss_spectrum.py

## Exercise 1: Color-magnitude diagrams from SDSS imaging

Code can be found [here](https://www.astroml.org/examples/datasets/plot_sdss_imaging.html).
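
The linked script separates the imaging sample into stars and galaxies using the SDSS photometric `type` flag (6 for stars, 3 for galaxies) and then plots a color against a magnitude. The sketch below illustrates that selection on a small synthetic structured array, so it runs without downloading anything. The field names `type`, `gRaw`, and `rRaw` are assumed to match the imaging sample, and the values are made up.

```python
import numpy as np

# Synthetic stand-in for the imaging sample (values are invented).
# Field names are assumed to mirror the real catalog.
sample = np.array(
    [(6, 18.2, 17.6), (3, 20.1, 19.3), (6, 19.0, 18.7), (3, 21.4, 20.2)],
    dtype=[('type', 'i4'), ('gRaw', 'f8'), ('rRaw', 'f8')],
)

# SDSS photometric type: 6 = star, 3 = galaxy
stars = sample[sample['type'] == 6]
galaxies = sample[sample['type'] == 3]

# Axes of a color-magnitude diagram: color (g - r) and magnitude (r)
star_color = stars['gRaw'] - stars['rRaw']
star_mag = stars['rRaw']

print(stars.dtype.names)
print(len(stars), len(galaxies))
```

With the real catalog, the same boolean masks would be applied to the array returned by the fetch function used in the linked script.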

In [None]:
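# print the column names for both stars and galaxies
# (assumes `stars` and `galaxies` were created by running the linked example code)
print(stars.dtype.names)
print(galaxies.dtype.names)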

### Any idea what these columns are? 

The SDSS schema browser describes each column: https://skyserver.sdss.org/dr7/en/help/browser/browser.asp

## Exercise 2: Plot a Hess diagram for SDSS stars

Code can be found [here](https://www.astroml.org/examples/datasets/plot_SDSS_SSPP.html).

## Exercise 3: Plot an SDSS spectrum

Code can be found [here](https://www.astroml.org/examples/datasets/plot_sdss_spectrum.html).

If the code above fails, the cell below should work. It retrieves the spectrum with astroquery.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
from astropy.coordinates import SkyCoord
from astropy import units as u
from astroquery.sdss import SDSS


## choose a spectrum
plate = 1615
mjd = 53166
fiber = 513

## read in the spectrum and convert the output to useful quantities
spec = SDSS.get_spectra(plate=plate, mjd=mjd, fiberID=fiber)[0]

hdu = spec[1].data
wavelength = 10**hdu['loglam']          # Å
flux = hdu['flux']
error = 1 / np.sqrt(hdu['ivar'])

## query the SDSS spectroscopic object data to get the redshift, redshift error, and class of the object

res = SDSS.query_specobj(
    plate=plate,
    mjd=mjd,
    fiberID=fiber,
    fields=['z', 'zErr', 'class']
)

z = res['z'][0]


## make the plot


ax = plt.axes()

ax.plot(wavelength, flux, '-k', label='spectrum')
ax.plot(wavelength, error, '-', color='gray', label='error')

ax.legend(loc=4)

ax.set_title(f'Plate = {plate}, MJD = {mjd}, Fiber = {fiber}')

ax.text(0.05, 0.95, f'z = {z:.2f}',
        size=16, ha='left', va='top', transform=ax.transAxes)

ax.set_xlabel(r'$\lambda (\AA)$')
ax.set_ylabel('Flux')

ax.set_ylim(-10, 300)

plt.show()
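
The cell above converts `loglam` to wavelength and `ivar` (inverse variance) to a 1-sigma error. Pixels with `ivar == 0` then produce an infinite error along with a divide-by-zero warning, and the wavelengths stay in the observed frame. Here is a minimal sketch, on made-up arrays, of one way to mask those pixels and shift to the rest frame using the queried redshift. The helper name `spectrum_arrays` is hypothetical and not part of astroML or astroquery.

```python
import numpy as np

def spectrum_arrays(loglam, ivar, z):
    """Return rest-frame wavelength and 1-sigma error (hypothetical helper)."""
    loglam = np.asarray(loglam, dtype=float)
    ivar = np.asarray(ivar, dtype=float)

    wavelength_obs = 10 ** loglam               # observed-frame wavelength
    wavelength_rest = wavelength_obs / (1 + z)  # shift to the rest frame

    # Leave NaN where ivar <= 0 instead of dividing by zero
    error = np.full(ivar.shape, np.nan)
    good = ivar > 0
    error[good] = 1 / np.sqrt(ivar[good])
    return wavelength_rest, error

# Invented values, only to exercise the function
wave, err = spectrum_arrays([3.6, 3.7, 3.8], [4.0, 0.0, 0.25], z=0.1)
print(wave)
print(err)
```

Matplotlib skips NaN values when drawing a line, so the masked pixels would appear as gaps in the error curve.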

## Download a set of SDSS standard stars and plot their color-color diagram

In [None]:
from astroML.datasets import fetch_sdss_S82standards

# Fetch the Stripe 82 data (see https://github.com/astroML/astroML/blob/main/astroML/datasets/sdss_S82standards.py)
data = fetch_sdss_S82standards()

# select the first 10000 points
data = data[:10000]

In [None]:
print(data.dtype.names)

In [None]:
# select the mean magnitudes for g, r, i
g = data['mmu_g']
r = data['mmu_r']
i = data['mmu_i']

### Plot

In [None]:
# Plot the g-r vs r-i colors
fig, ax = plt.subplots(figsize=(5, 3.75))
ax.plot(g - r, r - i, marker='.', markersize=2,
        color='black', linestyle='none')

ax.set_xlim(-0.6, 2.0)
ax.set_ylim(-0.6, 2.5)

ax.set_xlabel(r'${\rm g - r}$')
ax.set_ylabel(r'${\rm r - i}$')

### We can plot it differently
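
With many thousands of points, a scatter plot saturates where the stars are densest. A Hess-style diagram instead bins the two colors and displays the (log) counts per bin. The sketch below shows only the binning step, on synthetic colors drawn from normal distributions (not SDSS data). It masks empty bins rather than overwriting them with 1, which is an alternative to the approach used later in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
g_r = rng.normal(0.6, 0.3, size=5000)   # synthetic g - r colors
r_i = rng.normal(0.3, 0.2, size=5000)   # synthetic r - i colors

edges = np.linspace(-0.5, 2.5, 50)      # 50 edges give 49 bins per axis
H, xedges, yedges = np.histogram2d(g_r, r_i, bins=(edges, edges))

# Mask empty bins so log10 is only evaluated where counts > 0
logH = np.ma.log10(np.ma.masked_equal(H, 0))

print(H.shape)
print(int(H.sum()), 'of', g_r.size, 'points fall inside the bins')
```

The masked array can be passed to `imshow` as `logH.T` with `origin='lower'`; masked bins should then be left blank.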

In [None]:
# Fetch the full Stripe 82 catalog again (the earlier cell kept only the first 10000 rows)
# (see https://github.com/astroML/astroML/blob/main/astroML/datasets/sdss_S82standards.py)
data = fetch_sdss_S82standards()

# select the mean magnitudes for g, r, i
g = data['mmu_g']
r = data['mmu_r']
i = data['mmu_i']

In [None]:
# Compute and plot the 2D histogram
H, xbins, ybins = np.histogram2d(g - r, r - i,
                                 bins=(np.linspace(-0.5, 2.5, 50),
                                       np.linspace(-0.5, 2.5, 50)))

# Use the image display function imshow() to plot the result
fig, ax = plt.subplots(figsize=(5, 3.75))
H[H == 0] = 1  # prevent warnings in log10

ax.imshow(np.log10(H).T, origin='lower',
          extent=[xbins[0], xbins[-1], ybins[0], ybins[-1]],
          cmap='binary', interpolation='nearest',
          aspect='auto')

ax.set_xlabel(r'${\rm g - r}$')
ax.set_ylabel(r'${\rm r - i}$')

ax.set_xlim(-0.6, 2.5)
ax.set_ylim(-0.6, 2.5)