# Stellar velocities and the Milky Way
---
<div>Michael C. Stroh<br>
michael.stroh@northwestern.edu<br>
Center for Interdisciplinary Exploration and Research in Astrophysics<br>
Northwestern University<br>
2021
</div>

---



# Section A: Viewing the Milky Way

We live in a barred spiral galaxy, the Milky Way. Below is an artist's concept of the Milky Way if one were able to look down upon it.

<img src='https://github.com/mcstroh/python-tutorials/blob/master/velocity_stars_in_milky_way/images/ssc2008-10a.png?raw=true' alt="Artist's concept of the Milky Way" width="500" />

We cannot take such an image of the Milky Way because we live within it, in one of the spiral arms. It is therefore important to view the Milky Way in alternative ways in order to understand its structure.

To begin, we can build upon a concept that is familiar from everyday life: the common way we view the surface of the Earth.

![Longitude-latitude diagram view of the Earth](https://github.com/mcstroh/python-tutorials/blob/master/velocity_stars_in_milky_way/images/earth_l_b_comparison.png?raw=true)

We can view our galaxy, the Milky Way, in a similar manner. Below is an image of carbon monoxide clouds in the Milky Way using a similar longitude-latitude coordinate system.

![Longitude-latitude diagram of the Milky Way](https://github.com/mcstroh/python-tutorials/blob/master/velocity_stars_in_milky_way/images/dame_l_b_simplified.png?raw=true)

This isn't particularly useful: the Milky Way looks flat because most of the clouds reside in a very thin disk.
As an alternative, we can replace latitude with something else.
Below, we show a similar diagram in which the velocity of the clouds is plotted along the y-axis.

![Longitude-velocity diagram of the Milky Way](https://github.com/mcstroh/python-tutorials/blob/master/velocity_stars_in_milky_way/images/dame_l_v_simplified.png?raw=true)

# Section B: The Doppler effect

We often think of light as a wave, described by a wavelength (the distance between nearby peaks), or a frequency (how often peaks are detected).
If the star is stationary, the peaks of the wave will be detected at regular intervals as demonstrated in the Wikipedia image below (think of the circles as representing separate peaks of the wave moving away from the star):

![Sound emitted by a stationary source](https://upload.wikimedia.org/wikipedia/commons/e/e3/Dopplereffectstationary.gif)

However, if the star is moving, an observer will measure a difference in the detected light.

![Sound emitted by a moving source](https://upload.wikimedia.org/wikipedia/commons/c/c9/Dopplereffectsourcemovingrightatmach0.7.gif)

How the light changes depends on the direction of motion:

*   If a star is moving **away from you**, the peaks will arrive further apart. This is referred to as **redshifting** because red light has a longer wavelength (lower frequency) than blue light.
*   If a star is moving **towards you**, the peaks will arrive closer together. This is referred to as **blueshifting** because blue light has a shorter wavelength (higher frequency) than red light.


Therefore, if we know what the frequency of light **should be**, we can calculate the velocity of a star by measuring the frequency that we observe. In particular, we can find the velocity using:

$v = \bigg(1 - \frac{f_{observed}}{f_{stationary}} \bigg)c$

In this equation, 
*    v is the velocity,
*    $f_{observed}$ is the frequency measured by the observer,
*    $f_{stationary}$ is the frequency when the star is stationary, and
*    c is the speed of light (approximately $3 \times 10^5$ km/s, or $6.706 \times 10^8$ miles/hour).
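
As a minimal sketch of how this equation could be evaluated in Python, the cell below defines a helper named `doppler_velocity` (a hypothetical name chosen for illustration). The stationary frequency of 43.122090 GHz and the observed frequency of 43.115 GHz are made-up example inputs, not measurements.

```python
C_KM_S = 3.0e5  # approximate speed of light in km/s, as quoted above


def doppler_velocity(f_observed, f_stationary, c=C_KM_S):
    """Evaluate v = (1 - f_observed / f_stationary) * c."""
    return (1.0 - f_observed / f_stationary) * c


# Hypothetical example values, both in GHz
f_stationary = 43.122090
f_observed = 43.115

v = doppler_velocity(f_observed, f_stationary)
print(f"v = {v:.1f} km/s")
```

Here the observed frequency is lower than the stationary one, so the result is positive.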


## Regarding units
Frequency is often measured in units of hertz (Hz), which indicates how many peaks arrive per second. 
In this notebook we will commonly use units of GHz which are $10^9$ Hz. 
Thus if observing light at 43 GHz, the wave of light peaks $43\times10^{9}$ times per second.

Since the frequencies appear as a ratio, their units cancel out. The units you use for the speed of light will therefore be the units of the velocity you calculate.
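
To illustrate the point about units, the short sketch below uses two made-up frequencies. It checks that the ratio is the same whether the frequencies are written in GHz or Hz, and then computes the velocity in both km/s and miles/hour.

```python
# Hypothetical example frequencies in GHz
f_stationary_ghz = 43.0
f_observed_ghz = 42.99

# The ratio is the same whether we work in GHz or Hz
ratio_ghz = f_observed_ghz / f_stationary_ghz
ratio_hz = (f_observed_ghz * 1e9) / (f_stationary_ghz * 1e9)
print(ratio_ghz, ratio_hz)

# The velocity takes on the units of the speed of light
c_km_s = 3.0e5    # km/s
c_mph = 6.706e8   # miles/hour

v_km_s = (1 - ratio_ghz) * c_km_s
v_mph = (1 - ratio_ghz) * c_mph
print(f"{v_km_s:.1f} km/s is the same speed as {v_mph:.0f} miles/hour")
```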

# Section C: Clouds around the Stars



We will use observations of clouds surrounding old stars, referred to as Mira variables.
These stars are losing mass which feeds the clouds we are observing.
The stars cycle through being bright and dim, over periods of hundreds of days.

Below we show a movie (from Diamond & Kemball, 2003) of the clouds in a similar star to the ones presented here.
The star is located in the center, but is not bright enough to be detected by these observations.
The 3 second movie spans an approximately 2 year period.

![VLBA SiO maser variability from Diamond & Kemball, 2003](https://github.com/mcstroh/python-tutorials/blob/master/velocity_stars_in_milky_way/images/diamond_kemball_sio_movie.gif?raw=true)


Although we are technically observing the clouds surrounding the stars, the velocity of the clouds will give us the velocity of the stars.



# Section D: Goals for this project

<a href='https://public.nrao.edu/telescopes/vla/'>
<img src='https://www.nsf.gov/news/mmg/media/images/vlasunrisejuly2008_h.jpg' alt='VLA observatory' />
</a>

We will use the data collected from the Karl G. Jansky Very Large Array, often referred to as the VLA.
The VLA observes with 27 radio antennas and is located in New Mexico.

Generally, this notebook will guide you through the following steps:
* Download, load, and plot a radio spectrum.
* Detect at least one spectral line in the spectrum you choose.
* Determine how fast the star is moving towards or away from you.
* Decide how the star fits into the structure of the Milky Way, based on its velocity.




The observations shared in this notebook are related to the <a href='http://www.phys.unm.edu/~baade/index.html'>Bulge Asymmetries and Dynamical Evolution (BAaDE)</a> survey.
The BAaDE survey performs a similar analysis to what you are performing here but for 28,062 stars across the Milky Way.
Some questions of interest to the BAaDE survey:
*   What does the galaxy look like based on the velocities of Mira variables?
*   What are the properties of the stars in the survey?
*   What causes the clouds surrounding the star to emit these spectral lines?

## Important note going forward
When working through this project, it's OK to run into problems and struggle.
Make notes of what hurdles you are running into.
If you get stuck, write down what you know, and what you think you need to move forward.

# Section 1: Downloading and plotting a spectrum


Download the five VLA spectra by running the cell below.

In [None]:
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/spectra/ad3a-16952.txt
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/spectra/ad3a-16500.txt
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/spectra/ad3a-17123.txt
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/spectra/ad3a-17332.txt
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/spectra/ad3a-18081.txt


It's always useful to plot the data to get a sense of what you're dealing with. Write code to plot the spectrum with flux (Jy) as the y-axis, and frequency (GHz) along the x-axis.
The flux is a measurement of how strong the spectrum is.

The files are formatted similarly to the table below, but with more than 4000 lines:
><p align='right'>Frequency (GHz)</p> |    <p align='right'>Flux (Jy)</p>
> --- | ---
>     <p align='right'>$42.3048852$</p> |   <p align='right'>$-0.0112890$</p>
>     <p align='right'>$42.3051352$</p> |   <p align='right'>$ 0.0055765$</p>
>     <p align='right'>$42.3053852$</p> |   <p align='right'>$-0.0218999$</p>
>     <p align='right'>$42.3056352$</p> |   <p align='right'>$ 0.0051897$</p>
>     <p align='right'>$42.3058852$</p> |   <p align='right'>$-0.0139943$</p>



I include an example of a spectrum to give you an idea of what you're looking for. But you may have fewer peaks in your figure.

![VLA radio spectrum from ce3a-00250](https://github.com/mcstroh/python-tutorials/blob/master/velocity_stars_in_milky_way/images/ce3a-00250.png?raw=true)

You have the freedom to customize how your figure looks. 
Above all, make sure you're happy with your figure.

Hint: The pandas function read_fwf() may be useful to read the files.
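
As one possible starting point, the sketch below writes a tiny stand-in file containing the five rows from the table above, reads it with `read_fwf()`, and plots it. The file name `example_spectrum.txt`, the column names, and the assumption that the files have two aligned columns with no header row are all illustrative; check the downloaded files and adjust as needed.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Write a small stand-in file using the five rows from the table above,
# so this sketch runs without the downloaded spectra.
rows = [
    (42.3048852, -0.0112890),
    (42.3051352, 0.0055765),
    (42.3053852, -0.0218999),
    (42.3056352, 0.0051897),
    (42.3058852, -0.0139943),
]
with open("example_spectrum.txt", "w") as handle:
    for freq, flx in rows:
        handle.write(f"{freq:12.7f}{flx:12.7f}\n")

# Assumption: two aligned columns and no header row
spectrum = pd.read_fwf(
    "example_spectrum.txt", header=None, names=["frequency", "flux"]
)

fig, ax = plt.subplots()
ax.plot(spectrum["frequency"], spectrum["flux"])
ax.set_xlabel("Frequency (GHz)")
ax.set_ylabel("Flux (Jy)")
```

To use this on real data, replace the stand-in file name with one of the files you downloaded.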



# Section 2: Creating your own line detection function

Calculate the average flux of the spectrum. The NumPy "mean" function (https://numpy.org/doc/stable/reference/generated/numpy.mean.html), or Pandas "mean" function (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mean.html) may be useful.

Now calculate the standard deviation of the flux in the spectrum. This gives an estimate of how wiggly the spectrum is. You can use the NumPy "std" function (https://numpy.org/doc/stable/reference/generated/numpy.std.html) or the Pandas "std" function (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.std.html).

Astronomers often use the following formula to find spectral lines:

$D =~$average$~+~5~\times~$standard deviation

For a line to be detected, it must be brighter than this value.
Anything below this value is considered to be noise.
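
The sketch below shows how this threshold could be computed with NumPy. The spectrum here is synthetic (random noise plus one artificial bright channel), generated only so that the cell runs on its own; with real data you would use the frequency and flux you loaded in Section 1.

```python
import numpy as np

# Synthetic stand-in spectrum: random noise plus one artificial bright channel
rng = np.random.default_rng(0)
frequency = np.linspace(42.3, 43.5, 4000)
flux = rng.normal(0.0, 0.01, frequency.size)
flux[2000] += 0.5

average = np.mean(flux)
standard_deviation = np.std(flux)

# D = average + 5 x standard deviation
threshold = average + 5 * standard_deviation
n_above = int(np.sum(flux > threshold))

print(f"Detection threshold D = {threshold:.4f} Jy")
print(f"Channels brighter than D: {n_above}")
```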

Using this formula, how bright must a line be in order for it to be detected?

Create a **function** to find the frequency and flux of the peak of any line found in the spectrum you downloaded. You may want to use some of the previous steps, but are not required to do so!

If your file has multiple detected lines as shown in the plot you made above, first focus on finding one line.
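
If you would like to compare against one possible approach, here is a minimal sketch. The function name `find_brightest_line` and the `n_sigma` parameter are hypothetical, and the spectrum is synthetic so that the cell is self-contained. It simply takes the brightest channel and checks it against the threshold D; other strategies are equally valid.

```python
import numpy as np


def find_brightest_line(frequency, flux, n_sigma=5):
    """Return (frequency, flux) of the brightest channel if it exceeds
    D = average + n_sigma * standard deviation, otherwise (None, None)."""
    frequency = np.asarray(frequency)
    flux = np.asarray(flux)
    threshold = np.mean(flux) + n_sigma * np.std(flux)
    peak_index = np.argmax(flux)
    if flux[peak_index] <= threshold:
        return None, None
    return frequency[peak_index], flux[peak_index]


# Synthetic stand-in spectrum with one artificial line centered near 43.0 GHz
rng = np.random.default_rng(1)
frequency = np.linspace(42.3, 43.5, 4001)
flux = rng.normal(0.0, 0.01, frequency.size)
flux += 0.3 * np.exp(-0.5 * ((frequency - 43.0) / 0.001) ** 2)

peak_frequency, peak_flux = find_brightest_line(frequency, flux)
print(peak_frequency, peak_flux)
```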

In [None]:
def my_line_finder(frequency, flux):

    # Find peak - the values below are placeholders
    peak_frequency = None
    peak_flux = None

    # Send this information back
    return peak_frequency, peak_flux

#
# The code below is included so that you can focus on the function above.
#

# Now use the function you created
peak_frequency, peak_flux = my_line_finder(frequency, flux)

# Print the output
print(f"A spectral line was found at {peak_frequency} GHz, which has a peak flux of {peak_flux} Jy.")

## (Optional) Section 2.1: Finding all spectral lines


If your file has multiple lines, you can try creating another version of your function that finds *all* spectral lines!
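
One way to approach this, sketched below under stated assumptions, is to collect every channel brighter than D, group neighboring channels together, and report the brightest channel in each group. The function name `find_all_lines` is hypothetical and the two-line spectrum is synthetic.

```python
import numpy as np


def find_all_lines(frequency, flux, n_sigma=5):
    """Return a list of (frequency, flux) peaks, one per group of
    adjacent channels brighter than the detection threshold."""
    frequency = np.asarray(frequency)
    flux = np.asarray(flux)
    threshold = np.mean(flux) + n_sigma * np.std(flux)
    above = np.flatnonzero(flux > threshold)
    if above.size == 0:
        return []
    # Start a new group wherever consecutive indices are not adjacent
    groups = np.split(above, np.flatnonzero(np.diff(above) > 1) + 1)
    peaks = []
    for group in groups:
        best = group[np.argmax(flux[group])]
        peaks.append((frequency[best], flux[best]))
    return peaks


# Synthetic stand-in spectrum with two artificial lines
rng = np.random.default_rng(2)
frequency = np.linspace(42.3, 43.5, 4001)
flux = rng.normal(0.0, 0.005, frequency.size)
flux += 0.3 * np.exp(-0.5 * ((frequency - 42.82) / 0.001) ** 2)
flux += 0.2 * np.exp(-0.5 * ((frequency - 43.12) / 0.001) ** 2)

lines = find_all_lines(frequency, flux)
for line_frequency, line_flux in lines:
    print(f"{line_frequency:.4f} GHz, {line_flux:.3f} Jy")
```

Note that in a noisy spectrum a single line can occasionally be split into two groups if one channel near its edge dips below the threshold, so it is worth checking the results against your plot.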



# Section 3: Calculating the velocity of the star



Recall from the Doppler effect section that we can find the velocity if we compare the detected frequency to the frequency that the line would be if the star were stationary.
Below is a table of possible frequencies for the lines in this sample.
Which line is closest in frequency to the line you found in Section 2?


> <p align='right'>#</p>|    <p align='right'>Frequency (GHz)</p>
> --- | ---
>  <p align='right'>$1$</p>   | <p align='right'>$42.373341$</p>
>  <p align='right'>$2$</p>   | <p align='right'>$42.519375$</p>
>  <p align='right'>$3$</p>   | <p align='right'>$42.583827$</p>
>  <p align='right'>$4$</p>   | <p align='right'>$42.820570$</p>
>  <p align='right'>$5$</p>   | <p align='right'>$42.879941$</p>
>  <p align='right'>$6$</p>   | <p align='right'>$43.122090$</p>
>  <p align='right'>$7$</p>   | <p align='right'>$43.423853$</p>


In [None]:
# Note the line number and frequency, no coding necessary unless you want to.

Use the following equation to find the velocity of the star from the frequency you found above.
This is the same equation we introduced in the Doppler effect section above.

$v = \bigg(1 - \frac{f_{observed}}{f_{stationary}} \bigg)c$

In this equation 
*    v is the velocity,
*    $f_{observed}$ is the frequency measured by the observer,
*    $f_{stationary}$ is the frequency when the star is stationary, and
*    c is the speed of light (approximately $3 \times 10^5$ km/s, or $6.706 \times 10^8$ miles/hour).


A positive velocity means the star is moving away from you, and a negative velocity means the star is moving towards you.

Is the star moving towards or away from you?

## Section 3.1: Are these stars consistent with spiral arms?

Below we show figures from Vallée (2017), who modeled the spiral arms of the Milky Way in two longitude-velocity diagrams.

![Vallée model of the Milky Way](https://github.com/mcstroh/python-tutorials/blob/master/velocity_stars_in_milky_way/images/vallee_2017_models.png?raw=true)

The left panel shows the structure if you were to look directly towards the center of the Milky Way. 
Note that the x-axis here is backwards from the figure we showed at the end of Section A.
The right panel shows the spiral structure if you are looking away from the center of the Milky Way.

If you haven't already, calculate the velocity of the star in km/s. Based on the value, at what **Galactic longitudes** would you be able to find stars of this velocity?

Now that you've narrowed down the Galactic longitude ranges where this star could be found, which spiral arms, if any, could such a star belong to? Is it possible to say definitively which spiral arm the star must belong to without knowing the Galactic longitude?

## (Optional - Requires Section 2.1) Section 3.2: Calculate the velocity of all spectral lines

Calculate the velocity of all spectral lines you found in Section 2.1.

How well do the velocities agree? Are there any outliers?

# (Optional) Section 4: Running your code on multiple stars

Calculate the velocity of all of the stars using all five files you downloaded at the start.
It may be useful to use a loop.
Make notes of problems that you run into.
This isn't a simple task!

## (Optional) Section 4.1: A single velocity for each star

Some files have multiple spectral lines and they likely agree. 
If this is the case, calculate a single velocity for each star by calculating the average of the velocities you found for the star.
Also state the error in the average velocity using:

average error = standard deviation / $\sqrt{N}$

where N is the number of data points used to calculate the average.
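
As a small worked example, the sketch below applies this formula to four made-up velocities. The use of `ddof=1` (the sample standard deviation) is a choice made here for illustration; the NumPy default of `ddof=0` gives a slightly smaller value.

```python
import numpy as np

# Hypothetical velocities (km/s) from four spectral lines of one star
velocities = np.array([-12.1, -11.4, -12.8, -11.9])

average_velocity = np.mean(velocities)

# Assumption: use the sample standard deviation (ddof=1)
standard_deviation = np.std(velocities, ddof=1)
average_error = standard_deviation / np.sqrt(velocities.size)

print(f"v = {average_velocity:.2f} +/- {average_error:.2f} km/s")
```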

# (Optional) Section 5: A better noise calculation

In Section 2 you calculated the detection threshold using

$D =~$average$~+~5~\times~$standard deviation

but the average and the standard deviation took into account the bright spectral lines.
To go a little deeper, recalculate the average, and standard deviation of the flux values after throwing away the brightest 5% and lowest 5% of flux values.
Using these more robust calculations of the average and standard deviations of the noise, recalculate D.
Repeat the analysis you previously did in Sections 2 through 4 with this updated value.

**If running on multiple files, make sure the D you use to calculate is based on the file you are using and not on a different file.**
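
A minimal sketch of the trimming step is shown below, using `np.percentile` to find the cut-off values. The spectrum is synthetic, and the variable names are illustrative.

```python
import numpy as np

# Synthetic stand-in spectrum with one artificial line
rng = np.random.default_rng(3)
frequency = np.linspace(42.3, 43.5, 4001)
flux = rng.normal(0.0, 0.01, frequency.size)
flux += 0.3 * np.exp(-0.5 * ((frequency - 43.0) / 0.001) ** 2)

# Original threshold, using every channel
naive_threshold = np.mean(flux) + 5 * np.std(flux)

# Throw away the lowest 5% and brightest 5% of flux values
low, high = np.percentile(flux, [5, 95])
trimmed = flux[(flux > low) & (flux < high)]

robust_average = np.mean(trimmed)
robust_std = np.std(trimmed)
robust_threshold = robust_average + 5 * robust_std

print(f"Original D = {naive_threshold:.4f} Jy")
print(f"Updated D  = {robust_threshold:.4f} Jy")
```

Keep in mind that trimming also removes the tails of the noise itself, so the trimmed standard deviation is somewhat smaller than the true noise level.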

Were you able to detect more lines than you originally were able to? If so, which lines were new and for what source(s)?

In [None]:
# Comment

# (Optional) Section 6: Functionalize your approach

Above you created a function to find the peak of a single line and perhaps some other functions. Here, put it all together and create a function that you can pass the file name and it will give you the velocity of the star. Your new function can use functions you created above.

In [None]:
# Write your function here

# (Optional) Section 7: Light Curves


These stars are variable and will brighten and dim over time. Use the next cell to download the light curves (brightness versus time) for the five sources you used above.

These files are from the American Association of Variable Star Observers (AAVSO) at aavso.org.

In [None]:
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/lightcurves/lc_ad3a-16952.csv
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/lightcurves/lc_ad3a-16500.csv
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/lightcurves/lc_ad3a-17123.csv
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/lightcurves/lc_ad3a-17332.csv
!wget https://raw.githubusercontent.com/mcstroh/python-tutorials/master/velocity_stars_in_milky_way/lightcurves/lc_ad3a-18081.csv


These text files have the extension .csv which stands for Comma Separated Value. When you load these files, you want to split using commas.

The fields of interest are JD, Magnitude, Uncertainty, and Band.

JD (Julian Date) is widely used in astronomy and is the number of days since noon on Monday, 1 January 4713 BC. It is often more convenient to work with MJD (Modified Julian Date, the number of days since midnight on November 17, 1858), which is given by

$MJD = JD - 2400000.5$

Convert all values into MJD.
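
The sketch below shows one way to read comma-separated data with pandas and add an MJD column. The three rows are made-up stand-ins held in a string so that the cell runs on its own; the column names follow the fields of interest listed above, and the real files may contain additional columns or entries that need cleaning.

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded light curve file
csv_text = """JD,Magnitude,Uncertainty,Band
2458331.5,12.00,0.05,V
2458340.7,11.80,0.05,V
2458351.6,9.10,0.02,I
"""

light_curve = pd.read_csv(io.StringIO(csv_text))

# MJD = JD - 2400000.5
light_curve["MJD"] = light_curve["JD"] - 2400000.5
print(light_curve[["JD", "MJD", "Magnitude", "Band"]])
```

For a downloaded file, pass the file name to `pd.read_csv` in place of the `io.StringIO` object.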

Finally, plot the light curve for at least one source since August 1st, 2018. You will want magnitude along the y-axis, and time (as MJD) along the x-axis.

Band refers to the filter used for the observation. In this case, you'll want to create *separate* light curves for U, B, V, Vis, R, and I. You can overplot these on the same plot but use separate colors.

Brighter magnitudes are smaller, so you'll want the magnitude to decrease along the y-axis.

Useful colors for your bands would be:
- U - purple
- B - blue
- V - green
- Vis - black
- R - red
- I - pink

You can try adding error bars to your magnitude measurements by using the uncertainty field.
In this case, a magnitude 12 measurement with an uncertainty of 0.05 magnitudes would require error bars from 11.95 to 12.05.
The uncertainties may be too small to see relative to the marker size you use. In that case, your data points may be better presented without error bars.
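
Putting these plotting suggestions together, here is a minimal sketch using matplotlib. The data are made up, the column names are assumed to match the fields of interest above, and the band labels are assumed to match the list of colors exactly; check how the bands are spelled in your files.

```python
import matplotlib.pyplot as plt
import pandas as pd

band_colors = {
    "U": "purple", "B": "blue", "V": "green",
    "Vis": "black", "R": "red", "I": "pink",
}

# Hypothetical stand-in light curve, already converted to MJD
light_curve = pd.DataFrame({
    "MJD": [58300.0, 58340.2, 58351.1, 58400.5, 58410.3],
    "Magnitude": [12.4, 12.0, 9.1, 11.2, 8.9],
    "Uncertainty": [0.05, 0.05, 0.02, 0.05, 0.02],
    "Band": ["V", "V", "I", "V", "I"],
})

# Keep only observations since August 1st, 2018
start_mjd = pd.Timestamp("2018-08-01").to_julian_date() - 2400000.5
recent = light_curve[light_curve["MJD"] >= start_mjd]

fig, ax = plt.subplots()
for band, color in band_colors.items():
    subset = recent[recent["Band"] == band]
    if subset.empty:
        continue
    ax.errorbar(subset["MJD"], subset["Magnitude"],
                yerr=subset["Uncertainty"], fmt="o", color=color, label=band)

# Brighter magnitudes are smaller, so flip the y-axis
ax.invert_yaxis()
ax.set_xlabel("MJD")
ax.set_ylabel("Magnitude")
ax.legend()
```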


In [None]:
# Read in your file(s) and plot the light curves

The period of the star is the time it takes to complete one cycle, from maximum to maximum, or minimum to minimum.

Did your star complete one full period during this time frame?

In [None]:
# A comment is sufficient here

What is the period of the star? You may need to expand the time range you are using to do this. These stars typically have periods between 80 and 2000 days.

In [None]:
# Comment here if it is convenient