## AST 208 Lab 3: Using *Gaia* data

<i class="fa fa-pencil" style="font-size:1.5em; color:red"></i>
Your Name  
Dates  
**Team Name:** 
**Collaborators:** Collaborator 1, Collaborator 2, Collaborator 3  

```python
# Run this cell first to load the numpy and matplotlib.pyplot modules
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
```

## Introduction

As we've discussed in class, the 2018 *Gaia* data release was a huge leap forward for astrometric measurements, and by extension, our understanding of the structure of the Milky Way. Today we are going to investigate the environs of the globular star cluster M4. 

A globular cluster is a group of stars that are gravitationally bound to each other. The stars are thought to have formed at the same time from a single cloud of gas, and therefore have similar ages, chemical compositions, and motions through the Galaxy. Star clusters have been crucial for our understanding of stellar properties and evolution. Globular clusters are also very densely packed with stars, which means interesting interactions between stars can happen and create unusual objects (Prof. Chomiuk studies these in her research).

You can query the *Gaia* catalog yourself from the <a href="http://vizier.u-strasbg.fr/viz-bin/VizieR-3?-source=I/345/gaia2">VizieR website</a>. However, for ease of use today, I've already done the query for you, and you can download the data from D2L (`m4.dat`). I searched an area of several square degrees around M4, and made a few additional cuts. I required that stars have magnitudes $G < 19.5$ (fractional errors are always larger on fainter stars), and that the stars had to have been observed at least 8 times (to ensure decent astrometry).

## Load in the Data
Open the file `m4.dat` in the Jupyter Notebook and note the layout of the file.  The first few lines look as follows.
```
RA_ICRS;DE_ICRS;Plx;e_Plx;pmRA;e_pmRA;pmDE;e_pmDE;Gmag;e_Gmag;BPmag;e_BPmag;RPmag;e_RPmag
deg;deg;mas;mas;mas/yr;mas/yr;mas/yr;mas/yr;mag;mag;mag;mag;mag;mag
---------------;---------------;----------;-------;---------;------;---------;------;-------;------;-------;------;-------;------
244.91878681568;-27.52180033584;    0.7939; 0.2079;    5.437; 0.469;    1.111; 0.325;18.4624;0.0016;19.5898;0.0296;17.3570;0.0129
244.92988982198;-27.51248551394;    0.5138; 0.1347;    0.678; 0.358;   -7.356; 0.252;17.6449;0.0010;18.4419;0.0137;16.7256;0.0069
244.92945962047;-27.51678045967;    0.1269; 0.0513;    5.894; 0.144;   -6.477; 0.096;12.9826;0.0005;14.1728;0.0022;11.9026;0.0012
244.87880439398;-27.51832254013;    1.7601; 0.2455;   -8.437; 0.795;  -12.256; 0.468;18.0699;0.0016;19.2928;0.0333;16.8784;0.0108
```
Each row contains information about a different star. The columns are, from left to right:  
1) the star's right ascension in decimal degrees  
2) declination in decimal degrees  
3) the parallax angle in mas (milliarcseconds)  
4) the parallax uncertainty in mas  
5) the proper motion in the RA direction in mas/yr  
6) the uncertainty in the RA proper motion in mas/yr  
7) the proper motion in the Dec direction in mas/yr  
8) the uncertainty in the Dec proper motion in mas/yr   
9) the $G$-band magnitude in units of magnitude  (*Gaia* $G$ band is similar to traditional $V$ band)  
10) the error on the $G$-band magnitude in units of magnitude  
11) the $B$-band magnitude in units of magnitude  
12) the error on the $B$-band magnitude in units of magnitude  
13) the $R$-band magnitude in units of magnitude  
14) the error on the $R$-band magnitude in units of magnitude  

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 1) Run the line of code below to load `m4.dat` into an array. What is the size of the array? How many stars are in the `m4.dat` file?

<i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i> **Tip:** You can see the size of an array by printing its `shape` attribute:
```python
print(in_array.shape)
```
The first dimension corresponds to rows in the text file (number of stars), and the second dimension refers to columns in the text file.
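
As a quick illustration of how to read `shape`, here is a minimal sketch on a made-up array. The name `demo` and its size are hypothetical stand-ins, not the lab data.

```python
import numpy as np

# Hypothetical stand-in for in_array: 4 "stars" (rows) by 3 columns
demo = np.zeros((4, 3))

# shape is a tuple: (number of rows, number of columns)
n_rows, n_cols = demo.shape
print(demo.shape)
print("rows:", n_rows, "columns:", n_cols)
```

Unpacking the tuple this way gives you the number of rows directly, without counting by hand.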

<span style="color:red">(2 points total)</span>

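```python
# Load the data from the text file, skipping the 67 header rows of m4.dat.
# The import cell at the top of the notebook must be run first;
# otherwise this line fails with "NameError: name 'np' is not defined".
in_array = np.genfromtxt('m4.dat', skip_header=67, delimiter=';')
```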
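
To see what `skip_header` and `delimiter` do, the sketch below reads a tiny made-up table from a string instead of a file. The text in `sample_text` is invented for illustration and is much shorter than `m4.dat`.

```python
import io
import numpy as np

# Made-up miniature "file": 2 header lines, then 3 data rows separated by ';'
sample_text = """x;y;z
deg;deg;mag
1.0;2.0;3.0
4.0;5.0;6.0
7.0;8.0;9.0
"""

# genfromtxt accepts a file-like object as well as a filename.
# skip_header=2 drops the two non-numeric lines at the top.
sample = np.genfromtxt(io.StringIO(sample_text), skip_header=2, delimiter=';')
print(sample)
```

If `skip_header` is too small, the leftover header lines cannot be converted to numbers and typically show up as `nan` values.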

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 2) `in_array` is a two-dimensional array, which can be complicated to manipulate and do calculations on. For simplicity, let's transfer each column to its own array. In the below code cell, store each column in its own array. Make sure they are named memorable things, like `ra`, not `array1`.

<i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i> **Tips:** First of all, counting in Python starts with zero, not one! That means that the first column of your array is column 0.

To refer to the element in the first row of the first column of our array, we would write `in_array[0,0]`. If instead we wanted to refer to the first three columns for the first star, we would write `in_array[0,0:3]` (the stop index of a slice is not included, so `0:3` selects columns 0, 1, and 2). If we wanted to refer to all columns for the first star, we would write `in_array[0,:]`. 

To copy over all rows of the second column (i.e., to get an array of declinations), you can write
```python
dec = in_array[:,1]
``` 
and similarly for the other variables. 
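
The indexing rules above can be tried out on a small made-up array. In this sketch, `demo` and its values are hypothetical, chosen only so the row and column of each number are easy to read off.

```python
import numpy as np

# Hypothetical 3-star, 4-column array (values are made up)
demo = np.array([[10.0, 11.0, 12.0, 13.0],
                 [20.0, 21.0, 22.0, 23.0],
                 [30.0, 31.0, 32.0, 33.0]])

first_element = demo[0, 0]   # first row, first column
first_three = demo[0, 0:3]   # first star, columns 0, 1, 2 (stop index excluded)
whole_row = demo[0, :]       # first star, all columns
second_column = demo[:, 1]   # all stars, second column

print(first_element)
print(first_three)
print(whole_row)
print(second_column)
```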

<span style="color:red">(2 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 3) Plot the distribution of stars on the sky (i.e., Dec as a function of RA). Describe the appearance of any structures or over-densities that are visible.

<i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i> **Tip:** There are a lot of stars to plot, so throughout this lab, you'll want to use a small plotting symbol size. The comma as a plotting symbol does that. The below command would plot the `yy` array as a function of the `xx` array, with small plot symbols `','`.
```python
plt.plot(xx,yy,',')
``` 

Don't forget to label your axes! (see Lab 1 to remember how to do this). You should also always include units on your axis labels. So here, as RA is in units of degrees, label the axis with something like "RA (deg)".
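
As a reminder of the plotting calls, here is a minimal sketch that uses random made-up points rather than the lab data. The arrays `xx` and `yy` and the label text are placeholders; swap in your own arrays and units.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up points, only to demonstrate the plotting calls
rng = np.random.default_rng(0)
xx = rng.uniform(0.0, 1.0, 5000)
yy = rng.uniform(0.0, 1.0, 5000)

plt.figure()
plt.plot(xx, yy, ',')             # ',' draws each point as a single pixel
plt.xlabel('x quantity (unit)')   # placeholder label: name the quantity and its unit
plt.ylabel('y quantity (unit)')
```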

<span style="color:red">(6 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 4) Plot the proper motions of stars on the sky, by plotting proper motion in Dec (PM_Dec) as a function of proper motion in RA (PM_RA). Again, plot with the small ',' symbol. Zoom in on the proper motion range of most stars by setting the X and Y ranges of the plot to span -40 to 40 mas/yr, on each axis.

<i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i> **Tip:** The below line sets the range of the x axis to span -40 to 40:
```python
plt.xlim([-40,40])
``` 
And this sets the same range for the y axis. 
```python
plt.ylim([-40,40])
``` 
Insert these lines after your `plt.plot` command. And don't forget to label your axes, and include units!

<span style="color:red">(4 points total)</span>

## Find M4!

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 5) You should be able to see two distinct clumps of stars in proper motion space. Let's work to isolate the smaller proper motion clump (the globular cluster M4!) by making some cuts (or "filters") on PM_RA and PM_Dec. First decide what PM_RA and PM_Dec cuts you should  make to isolate the smaller clump. Re-create the proper motion plot from #4, but draw lines on it to signify your proper motion cuts.

<i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i> **Tip:** You can draw a vertical line with:
```python
plt.vlines(7,-100,100)
``` 
which places the line at x=7, and makes it extend from y = -100 to y=100. Similarly, you can draw a horizontal line with 
```python
plt.hlines(13,-500,500)
```
which will draw a line at y=13 and extend from x = -500 to x=500.
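
Putting the axis limits and line commands together might look like the sketch below. The points are random made-up data, and the cut values `x_lo`, `x_hi`, `y_lo`, `y_hi` are arbitrary placeholders, not the right cuts for M4.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up points scattered around zero, only to have something to draw on
rng = np.random.default_rng(1)
xx = rng.normal(0.0, 10.0, 5000)
yy = rng.normal(0.0, 10.0, 5000)

plt.figure()
plt.plot(xx, yy, ',')
plt.xlim([-40, 40])
plt.ylim([-40, 40])

# Placeholder cut values (NOT the answer for M4)
x_lo, x_hi = -5.0, 5.0
y_lo, y_hi = -8.0, 8.0

# vlines and hlines accept a list of positions, so each call draws two lines
plt.vlines([x_lo, x_hi], -40, 40, colors='red')
plt.hlines([y_lo, y_hi], -40, 40, colors='red')
```

Storing the cut values in variables means you can reuse exactly the same numbers when you apply the cuts to your arrays.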

<span style="color:red">(3 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 6) Now let's actually make these cuts on our arrays. In the code cell below, determine which stars belong to the proper motion clump you isolated in #5.

<i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i> **Tip:** One way to do this is with `np.where` statements. For example, 
```python
dcut = np.where((dec < -26))
``` 
will return a variable called `dcut` which contains the indices (aka "addresses") of all the stars that have Dec less than -26 degrees. You could then see the Decs of these stars by printing `dec[dcut]`. You could also see the RAs of these same stars by looking at `ra[dcut]`.

You can combine multiple cuts with "and" statements. For example, 
```python
rdcut = np.where((dec < -26) & (ra > 246.0))
``` 
will mean that `rdcut` includes addresses for all stars that have Dec less than -26 degrees **and** RA > 246 degrees.
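
Here is a minimal sketch of a combined cut on five made-up stars. The `ra` and `dec` values below are invented for illustration. Note that `np.where` called with a single condition returns a tuple holding the index array, which is why the indices are read from `rdcut[0]`.

```python
import numpy as np

# Made-up coordinates for five stars (not from m4.dat)
ra = np.array([245.0, 246.5, 247.0, 245.5, 246.2])
dec = np.array([-27.0, -26.5, -25.0, -25.5, -27.5])

# Each condition needs its own parentheses when combined with &
rdcut = np.where((dec < -26) & (ra > 246.0))

print(rdcut[0])        # indices of the stars passing both cuts
print(ra[rdcut])       # their RAs
print(dec[rdcut])      # their Decs
print(len(rdcut[0]))   # how many stars passed
```

The same index variable can be applied to any other array with one entry per star, so a cut defined on one pair of columns can select the matching entries in every other column.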

<span style="color:red">(2 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 7) Check your work! Remake the plot from #5, but this time only plotting the stars that adhere to your proper motion cuts.

<span style="color:red">(2 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 8) Now let's plot these same stars' positions on the sky. Remake the plot from #3 (Dec as a function of RA), but this time only plotting the stars that adhere to your proper motion cuts. How has the plot changed, compared to #3? Explain why your proper motion cuts led to this change.

<span style="color:red">(7 points total)</span>

## Distance and Selection Effects

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 9) Calculate a new array that contains the distances to the stars in units of pc.

<span style="color:red">(2 points total)</span>
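
For reference, the standard parallax relation is $d\,[\mathrm{pc}] = 1 / p\,[\mathrm{arcsec}]$, which becomes $d\,[\mathrm{pc}] = 1000 / p\,[\mathrm{mas}]$ for parallaxes in milliarcseconds. The sketch below applies it to made-up parallaxes; the array names `plx_mas` and `dist_pc` are hypothetical.

```python
import numpy as np

# Made-up parallaxes in milliarcseconds (mas), not values from m4.dat
plx_mas = np.array([10.0, 2.0, 0.5])

# d [pc] = 1 / p [arcsec] = 1000 / p [mas]
dist_pc = 1000.0 / plx_mas
print(dist_pc)   # distances in pc for the three made-up stars
```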

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 10) Estimate and print the distance to the proper motion structure that you filtered out in #6. Do a sanity check and explain why your answer does or does not make sense.

<span style="color:red">(5 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 11) Plot $G$-band apparent magnitude as a function of distance for all the stars in your file. Set your distance range (x-axis range) to span 0--10 kpc (there are some stars with wild distances, due to large errors on their parallax measurements; this will eliminate most of them).  Consider the distribution of points in the plot, and explain why the plot looks this way.

<i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i> **Tip:** When plotting magnitude, "flip" the axis so that smaller numbers (brighter stars) are at the top.
```python
plt.ylim([20,0])
``` 

<span style="color:red">(7 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 12) Calculate a new array that contains the $G$-band absolute magnitudes of the stars. We will just focus on $G$-band magnitudes here, as these are much better measured by *Gaia* than $B$ or $R$ band (but we will later use $B$ and $R$ bands to estimate colors).

<i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i> **Tip:** To calculate a base-10 logarithm in python, try `np.log10(array)`.
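
As a sketch of how `np.log10` enters, the standard distance modulus is $M = m - 5\log_{10}(d / 10\,\mathrm{pc})$. The example below uses made-up magnitudes and distances and assumes no extinction by dust, which is a simplification; the names `app_mag`, `dist_pc`, and `abs_mag` are hypothetical.

```python
import numpy as np

# Made-up apparent magnitudes and distances (pc), not values from m4.dat
app_mag = np.array([15.0, 10.0, 7.5])
dist_pc = np.array([1000.0, 100.0, 10.0])

# Distance modulus, ignoring extinction: M = m - 5 * log10(d / 10 pc)
abs_mag = app_mag - 5.0 * np.log10(dist_pc / 10.0)
print(abs_mag)
```

A star at exactly 10 pc has the same apparent and absolute magnitude, which makes a handy check on the formula.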

<span style="color:red">(2 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 13) Make a similar plot as in #11, but plot $G$-band **absolute** magnitude as a function of distance (rather than apparent magnitude). Again, plot all stars in your file (not just those adhering to the proper motion cut).  Consider the distribution of points in the plot. How does it look different from the plot in #11? Why? What observational biases or selection effects are shaping this plot?


<span style="color:red">(7 points total)</span>

## Color Magnitude Diagrams
Finally, let's investigate the properties of the stars in this field, by making color-magnitude diagrams (CMDs).

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 14) Calculate a new array that contains the $B-R$ color index.

<span style="color:red">(2 points total)</span>

To make high-quality color-magnitude diagrams, we need to exclude the points with large uncertainties on their distance estimates.

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 15)  Calculate a new array that contains the fractional error on the distances to the stars. Be aware that some parallaxes are (erroneously) negative, and so you'll want to use absolute values. 

<i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i> **Tip:** To calculate the absolute value of an array, try 
```python
pospar = np.abs(par)
``` 
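
One common approach, shown here as a sketch under a stated assumption, is to approximate the fractional distance error by the fractional parallax error, $\sigma_d / d \approx \sigma_p / |p|$ (first-order error propagation). The values and the names `par`, `e_par`, `frac_err`, and `good` below are made up for illustration.

```python
import numpy as np

# Made-up parallaxes and uncertainties in mas (one parallax is negative)
par = np.array([2.0, -0.5, 1.0])
e_par = np.array([0.2, 0.1, 0.5])

# Assumption: fractional distance error ~ fractional parallax error
frac_err = e_par / np.abs(par)
print(frac_err)

# Boolean array marking stars with fractional error below 30%
good = frac_err < 0.3
print(good)
```

Note that taking the absolute value only keeps the fractional error positive; it does not make a negative parallax physically meaningful.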

<span style="color:red">(2 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 16) Make a cut on the full star sample that will exclude stars with fractional errors on their distances of >30%.

<span style="color:red">(2 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 17) Modify your proper motion filter from #6 to also exclude stars with fractional errors on their distances of >30%.

<span style="color:red">(2 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 18) Make a color-magnitude diagram ($G$-band absolute magnitude as a function of $B-R$ color) for all stars with fractional errors on their distances of <30%.

<span style="color:red">(4 points total)</span>

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> 19) Make a color-magnitude diagram ($G$-band absolute magnitude as a function of $B-R$ color) for stars in the proper motion clump and with fractional errors on their distances of <30%. How does the CMD compare with that in Question #18? Why do you think the two CMDs look different?

<span style="color:red">(7 points total)</span>

## Extra Credit

<i class="fa fa-cogs" style="font-size:1.5em; color:red"></i> EC) There is a second, subtler over-density of stars visible in #3. Try to filter it out based on position, parallax, and proper motion cuts. Discuss how its CMD compares with M4!

<span style="color:red">(6 points total)</span>

## <i class="fa fa-exclamation-triangle" style="font-size:1.5em; color:red"></i>Closeout

Prepare this lab for submission: Remove any "tips" and unnecessary instruction text or cells. Leave in numbered questions, and make sure you have answered all questions clearly and thoroughly. Make sure that all markdown cells are rendered, that code cells execute properly, and that this notebook is saved.  After you've saved the notebook, select `File : Close and Halt`. Follow the instructions on D2L for uploading this notebook to the appropriate dropbox.