# A Period-Magnitude Relation in Cepheid Stars

* Cepheids are stars whose brightness oscillates with a stable period that appears to be strongly correlated with their luminosity (or absolute magnitude).


* A lot of _monitoring_ data (repeated imaging and subsequent "photometry" of the star) can provide a measurement of the period of the oscillation and, if we know the distance to its host galaxy, of the absolute magnitude.


* Let's look at some Cepheid measurements reported by Riess et al. (2011, ApJ, 730, 119). Like the correlation function summaries, they come as data points with error bars, and it is not clear how those error bars were derived (or what they mean).


* Our goal is to infer the parameters of a simple relationship between Cepheid period and, in the first instance, apparent magnitude.

In [None]:
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (15.0, 8.0) 

## A Look at Each Host Galaxy's Cepheids

Let's read in all the data, and look at each galaxy's Cepheid measurements separately. Instead of using `pandas`, we'll write our own simple data structure, and give it a custom plotting method so we can compare the different host galaxies' datasets.

In [None]:
# First, we need to know what's in the data file.

!head -15 R11ceph.dat

In [None]:
class Cepheids(object):
    
    def __init__(self,filename):
        # Read in the data and store it in this master array:
        self.data = np.loadtxt(filename)
        self.hosts = self.data[:,1].astype('int').astype('str')
        # We'll need the plotting setup to be the same each time we make a plot:
        colornames = ['red','orange','yellow','green','cyan','blue','violet','magenta','gray']
        self.colors = dict(zip(self.list_hosts(), colornames))
        self.xlimits = np.array([0.3,2.3])
        self.ylimits = np.array([30.0,17.0])
        return
    
    def list_hosts(self):
        # The list of (9) unique galaxy host names:
        return np.unique(self.hosts)
    
    def select(self,ID):
        # Pull out one galaxy's data from the master array:
        index = (self.hosts == str(ID))
        self.m = self.data[index,2]
        self.merr = self.data[index,3]
        self.logP = np.log10(self.data[index,4])
        return
    
    def plot(self,X):
        # Plot all the points in the dataset for host galaxy X.
        ID = str(X)
        self.select(ID)
        plt.rc('xtick', labelsize=16) 
        plt.rc('ytick', labelsize=16)
        plt.errorbar(self.logP, self.m, yerr=self.merr, fmt='.', ms=7, lw=1, color=self.colors[ID], label='NGC'+ID)
        plt.xlabel('$\\log_{10} P / {\\rm days}$',fontsize=20)
        # Mathtext drops plain spaces, so use an explicit \; to keep the gap before (AB):
        plt.ylabel('${\\rm magnitude\\;(AB)}$',fontsize=20)
        plt.xlim(self.xlimits)
        plt.ylim(self.ylimits)
        plt.title('Cepheid Period-Luminosity (Riess et al 2011)',fontsize=20)
        return

    def overlay_straight_line_with(self,m=0.0,c=24.0):
        # Overlay a straight line with gradient m and intercept c.
        x = self.xlimits
        y = m*x + c
        plt.plot(x, y, 'k-', alpha=0.5, lw=2)
        plt.xlim(self.xlimits)
        plt.ylim(self.ylimits)
        return
    
    def add_legend(self):
        plt.legend(loc='upper left')
        return


In [None]:
data = Cepheids('R11ceph.dat')
print(data.colors)

OK, now we are all set up! Let's plot one of the datasets.

In [None]:
data.plot(4258)

# for ID in data.list_hosts():
#     data.plot(ID)
    
data.overlay_straight_line_with(m=-2.0,c=24.0)

data.add_legend()

### Q: Is the Cepheid Period-Luminosity relation likely to be well-modeled by a power law?

With your neighbor, try plotting the different host galaxies' Cepheid datasets. Can you find straight lines that "fit" all the data from each host? And do you get the same "fit" for each host? Notice that you can plot multiple datasets on the same axes.
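
One way to see why a straight line is the relevant test is sketched below. It assumes a pure power law $L \propto P^{\alpha}$ (the exponent $\alpha$ is a symbol introduced here for illustration) and that all the Cepheids in one host lie at the same distance $d$, so that the flux is $F = L / 4\pi d^2$:

$\log_{10} L = \alpha\,\log_{10} P + {\rm const}$

$m = -2.5\,\log_{10} F + {\rm const} = -2.5\,\alpha\,\log_{10} P + {\rm const}$

Under these assumptions a power law appears as a straight line in the $m$ versus $\log_{10} P$ plane with gradient $-2.5\,\alpha$, and changing the distance shifts only the intercept.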

## Inferring the Period-Magnitude Relation

* Let's try inferring the parameters $a$ and $b$ of the following linear relation:

$m = a\;\log_{10} P + b$

* We have data consisting of *observed magnitudes with quoted uncertainties*, of the form 

$m_{\rm obs} = 24.51 \pm 0.31$ at $\log_{10} P = \log_{10} (13.0/{\rm days})$

* Let's draw the PGM together, on the whiteboard.

### Q: What is the PDF for $m$, ${\rm Pr}(m|a,b,H)$?

### Q: What are reasonable assumptions about the sampling distribution ${\rm Pr}(m_{\rm obs}|m,H)$?

### Q: What are reasonable assumptions about the prior ${\rm Pr}(a,b|H)$?

We should now be able to code up functions for the log likelihood, log prior and log posterior, such that we can evaluate them on a 2D parameter grid. Let's fill them in:

In [None]:
def log_likelihood(logP,m,merr,a,b):
    return 0.0 # np.sum()? 

def log_prior(a,b):
    return 0.0 # Ranges? Functions?

def log_posterior(logP,m,merr,a,b):
    return log_likelihood(logP,m,merr,a,b) + log_prior(a,b)
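
One possible way to fill in these functions is sketched below; it is an illustration under stated assumptions, not the intended answer. It assumes that the relation $m = a\;\log_{10} P + b$ holds exactly, that each $m_{\rm obs}$ is drawn from an independent Gaussian centered on $m$ with standard deviation equal to the quoted uncertainty, and that the prior is uniform in $a$ and $b$. The prior limits `alimits` and `blimits` are hypothetical defaults introduced here, not values taken from the data.

```python
import numpy as np

def log_likelihood(logP, m, merr, a, b):
    # Assumption: independent Gaussian errors with known standard deviations merr,
    # and no intrinsic scatter about the straight line.
    m_pred = a * logP + b
    chisq = np.sum(((m - m_pred) / merr) ** 2)
    lognorm = -np.sum(np.log(merr * np.sqrt(2.0 * np.pi)))
    return lognorm - 0.5 * chisq

def log_prior(a, b, alimits=(-10.0, 10.0), blimits=(10.0, 40.0)):
    # Assumption: uniform priors; the limits are illustrative placeholders.
    inside = (alimits[0] <= a <= alimits[1]) and (blimits[0] <= b <= blimits[1])
    if not inside:
        return -np.inf
    return -np.log(alimits[1] - alimits[0]) - np.log(blimits[1] - blimits[0])

def log_posterior(logP, m, merr, a, b):
    # Unnormalized log posterior: log likelihood plus log prior.
    lp = log_prior(a, b)
    if not np.isfinite(lp):
        return -np.inf
    return log_likelihood(logP, m, merr, a, b) + lp
```

With uniform priors the posterior is proportional to the likelihood inside the allowed ranges, so on a grid that sits within those ranges the prior only adds a constant to the log posterior.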

Now, let's set up a suitable parameter grid and compute the posterior PDF!

In [None]:
# Select a Cepheid dataset:
data.select(4258)

# Set up parameter grids:
npix = 100
amin,amax = -5.0,0.0
bmin,bmax = 18.0,27.0
agrid = np.linspace(amin,amax,npix)
bgrid = np.linspace(bmin,bmax,npix)
logprob = np.zeros([npix,npix])

# Loop over parameters, computing unnormalized log posterior PDF:
for i,a in enumerate(agrid):
    for j,b in enumerate(bgrid):
        logprob[j,i] = log_posterior(data.logP,data.m,data.merr,a,b)

# Normalize and exponentiate to get posterior density:
Z = np.max(logprob)
prob = np.exp(logprob - Z)
norm = np.sum(prob)
prob /= norm
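
The reason for subtracting the maximum before exponentiating can be seen in a small sketch. The `logprob` values below are made up for illustration; they stand in for a grid whose log posterior values are all very negative, as can happen when there are many data points.

```python
import numpy as np

# Hypothetical unnormalized log posterior values on a tiny 2x2 grid:
logprob = np.array([[-1200.0, -1201.0],
                    [-1203.0, -1210.0]])

# Exponentiating directly underflows: every pixel becomes exactly 0.0,
# so the grid could not be normalized.
naive = np.exp(logprob)

# Subtracting the maximum first makes the largest pixel exp(0) = 1,
# which preserves the relative heights of all the pixels.
shifted = np.exp(logprob - np.max(logprob))
prob = shifted / np.sum(shifted)

print(naive.sum(), prob.sum())
```

The subtracted constant cancels when the grid is normalized, so it does not change the resulting posterior.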

Now plot the posterior PDF, with contours enclosing 68% and 95% of the probability:

In [None]:
# Sort the pixel values in ascending order (avoiding the name of the builtin `sorted`):
sorted_prob = np.sort(prob.flatten())
C = sorted_prob.cumsum()

# Find the pixel values that lie at the levels that contain
# 68% and 95% of the probability:
lvl68 = np.min(sorted_prob[C > (1.0 - 0.68)])
lvl95 = np.min(sorted_prob[C > (1.0 - 0.95)])

plt.imshow(prob, origin='lower', cmap='Blues', interpolation='none', extent=[amin,amax,bmin,bmax])
# Contour levels must be given in increasing order, and lvl95 < lvl68:
plt.contour(prob,[lvl95,lvl68],colors='black',extent=[amin,amax,bmin,bmax])
plt.xlabel('slope a')
plt.ylabel('intercept b / AB magnitudes')
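
The level-finding step can be checked on a case where the answer is known. The sketch below wraps the same sort-and-cumulative-sum logic in a helper, `credible_levels` (a name introduced here for illustration), and applies it to a synthetic, normalized 2D Gaussian grid rather than to the Cepheid posterior.

```python
import numpy as np

def credible_levels(prob, fractions=(0.68, 0.95)):
    # For each fraction, return the pixel value above which the pixels
    # contain (at least) that fraction of the total probability.
    # Assumes prob is normalized so that it sums to 1.
    flat = np.sort(prob.flatten())
    C = flat.cumsum()
    return [np.min(flat[C > (1.0 - f)]) for f in fractions]

# Synthetic test case: a unit 2D Gaussian evaluated on a fine grid.
x = np.linspace(-5.0, 5.0, 201)
X, Y = np.meshgrid(x, x)
prob = np.exp(-0.5 * (X**2 + Y**2))
prob /= prob.sum()

lvl68, lvl95 = credible_levels(prob)

# Sum the probability in all pixels at or above each level:
enclosed68 = prob[prob >= lvl68].sum()
enclosed95 = prob[prob >= lvl95].sum()
print(enclosed68, enclosed95)
```

The enclosed sums should come out close to the requested fractions, slightly above them because the grid is discrete. The 95% level is the lower of the two pixel values, since a larger region reaches further down the sides of the peak.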