# Hubble's law and the age of the Universe

Hubble's law is the observation that galaxies are moving away from
Earth at speeds proportional to their distances from Earth: $$v = H_0 \, D,$$
where $H_0$ is the *Hubble constant*, $D$ is the distance to a
galaxy, and $v$ is the speed of separation.

The Hubble constant is most frequently quoted in km/s/Mpc, which gives
the speed in km/s of a galaxy 1 megaparsec away. (1 Mpc = $3.09 \times 10^{19}$ km; the parsec, pc, is a unit of length used to measure the large distances to astronomical objects outside the Solar System.) However, the SI unit of $H_0$ is
simply $\mathrm{s}^{-1}$. The reciprocal of $H_0$ is known as the
*Hubble time*. The Hubble time is the age the Universe would
have if the expansion had been uniform in time; it differs from the
real age of the Universe because the real expansion is not
uniform. However, the Hubble time and the age of the Universe are
related by a dimensionless factor that depends on the mass-energy
content of the Universe; it is assumed to be close to 0.96.
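
As a rough illustration of the unit conversion, assume a round value $H_0 = 70$ km/s/Mpc (a number chosen only for the arithmetic, not the result of this analysis) and a year of about $3.156 \times 10^{7}$ s:

$$H_0 = \frac{70\ \mathrm{km/s}}{3.09 \times 10^{19}\ \mathrm{km}} \approx 2.27 \times 10^{-18}\ \mathrm{s}^{-1}, \qquad \frac{1}{H_0} \approx 4.41 \times 10^{17}\ \mathrm{s} \approx 1.4 \times 10^{10}\ \mathrm{yr}.$$

Multiplying by the factor 0.96 would give an age of roughly $1.3 \times 10^{10}$ years for this assumed $H_0$.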

We determine the Hubble constant from
observational data on the distance modulus and redshift of supernovae.

Load the required packages:

In [None]:

using CSV
using DataFrames
using PyPlot

The URL of the database of the observations:

In [None]:

url = "https://vizier.u-strasbg.fr/viz-bin/asu-txt?-source=J/ApJ/716/712/tableb2&-out=SN&-out=zCMB&-out=mu"

Download the data into a temporary file on the local machine:

In [None]:

catalog = download(url);


Read the data from disk into a DataFrame, skipping the header lines (the data rows start at line 38 of the file) and naming the columns `name`, `redshift`, and `modulus`: the name of the supernova, its redshift, and its distance modulus.

In [None]:

df = CSV.read(catalog, DataFrame, skipto=38, delim=' ', ignorerepeated=true,
              types=[String, Float64, Float64], silencewarnings=true,
              header=["name", "redshift", "modulus"],)

Drop records with missing data:

In [None]:
dropmissing!(df)

The *distance modulus* is a logarithmic measure of the distance to an astronomical object, 
calculated from its apparent brightness and absolute brightness. The distance modulus, $\mu$, is 
related to the object's distance from the observer through the formula 

$$\mu \equiv 5 \log_{10}(D) + 25,$$ 

where $D$ is the distance in megaparsecs.

Let's define a helper function, `dist`, that, given the modulus, calculates the distance in Mpc:

In [None]:
dist(modulus) = 10.0 ^ (modulus / 5 - 5)
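
As a quick sanity check of the inversion, the sketch below converts a few distances to moduli and back. The helper `modulus_of` is a hypothetical name introduced here only for the check; `dist` is repeated so that the snippet is self-contained.

```julia
# Distance modulus for a distance D given in megaparsecs
# (hypothetical helper, used only for this check)
modulus_of(D) = 5 * log10(D) + 25

# Inverse relation, as defined in this notebook
dist(modulus) = 10.0 ^ (modulus / 5 - 5)

# Round trip: distance -> modulus -> distance
for D in (10.0, 100.0, 1000.0)
    mu = modulus_of(D)
    println("D = ", D, " Mpc, mu = ", mu, ", dist(mu) = ", dist(mu))
end
```

For example, a modulus of 35 corresponds to $10^{35/5 - 5} = 10^2 = 100$ Mpc.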


The redshift is a dimensionless parameter defined as follows:

$$z = \frac{\lambda_{\mathrm{ob}} - \lambda_{\mathrm{em}} }{\lambda_{\mathrm{em}}}.$$

Here $\lambda_{\mathrm{em}}$ is the wavelength of the emitted light, and $\lambda_{\mathrm{ob}}$ is the wavelength measured by the observer.

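Redshift can arise from the relative motion of the radiation source and the observer, which gives rise to the *Doppler effect*, and from radiation escaping a gravitational potential (*gravitational redshift*). On cosmological scales the redshift of distant galaxies is due to the expansion of space itself; for nearby galaxies it can be treated as a Doppler shift.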

In this assignment we consider only the case of small redshift, $z \ll 1$. For small $z$, the redshift is related to the speed of separation as follows:

$$z \approx \frac{v}{c} .$$

Here $v$ is the speed of separation of the galaxy hosting the supernova, and $c$ is the speed of light. Combining this with Hubble's law gives

$$z \approx \frac{v}{c} = \frac{H_0}{c}D ,$$

i.e. the slope of the graph $z(D)$ gives the Hubble constant divided by the speed of light.
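
For a sense of scale, take an assumed round value $H_0 = 70$ km/s/Mpc (chosen only for illustration, not the result of this analysis) and $c \approx 3 \times 10^{5}$ km/s. A galaxy at $D = 100$ Mpc would then have

$$v = H_0 D = 7000\ \mathrm{km/s}, \qquad z \approx \frac{v}{c} \approx 0.023,$$

and the expected slope of $z(D)$ would be $H_0 / c \approx 2.3 \times 10^{-4}\ \mathrm{Mpc}^{-1}$.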

Let's keep only the observational records for small values of $z$, as specified in the problem statement.

In [None]:

zmax =  # <= your code here
filter!(row -> row.redshift < zmax, df)
sort!(df, "modulus")

Calculate the distances to the supernovae in the DataFrame `df`:

In [None]:
distances = dist.(df.modulus)

Let's plot the redshift versus the distance to the supernova:

In [None]:

plot(distances, df.redshift, ".", label="measurements")
grid(true)
legend()
ylabel("Red Shift")
xlabel("Distance (Mpc)");

We use a least-squares fit to find the parameters of the linear regression.
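
For reference, the standard closed-form expressions for an ordinary least-squares fit $y = \alpha + \beta x$ to $n$ points are given below. The estimate of $\sigma$ (residual variance with $n - 2$ degrees of freedom) is one common convention and is stated here as an assumption; the assignment may intend a different one.

$$\beta = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad \alpha = \bar{y} - \beta \bar{x},$$

$$\sigma^2 = \frac{1}{n-2} \, \frac{\sum_i (y_i - \alpha - \beta x_i)^2}{\sum_i (x_i - \bar{x})^2},$$

where $\bar{x}$ and $\bar{y}$ are the sample means of $x$ and $y$.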

In [None]:
"""
    alpha, beta, sigma = linear_regression(x, y)

Least-squares linear regression fit `y = alpha + beta * x`.
`sigma` is the standard deviation of `beta`.
"""
function linear_regression(x, y)
    # your code here
end

In [None]:
alpha, beta, sigma = linear_regression(distances, df.redshift)

Plot of the result of the fit:

In [None]:
plot(distances, df.redshift, ".", label="measurements")
plot(distances, alpha .+ beta .* distances, label="LSq linear fit")
grid(true)
legend()
ylabel("Red Shift")
xlabel("Distance (Mpc)");

Now we can calculate the Hubble constant, in km/s/Mpc:

In [None]:
const c = 300000.0  # speed of light, km/sec
H0 =  # <= your code here

Standard deviation of the Hubble constant, in km/s/Mpc (it should be much smaller than `H0`):

In [None]:
dH0 = c * sigma

Hubble constant in 1/sec:

In [None]:
const mpc = 3.09e19     # 1 megaparsec in km
h0 = H0 / mpc
round(h0, sigdigits=3)

Hubble time in seconds:

In [None]:
Th =   # <= your code here
round(Th, sigdigits=3)

Hubble time in years:

In [None]:
round(Th/(), sigdigits=2)  # <= your code here

Compare your values of the Hubble constant and the Hubble time with the ones found in the literature. Describe your findings in the cell(s) below: