<a href="https://colab.research.google.com/github/jimmynewland/colabnotebooks/blob/main/Color_Magnitude_Diagram_using_Gaia_and_IRSA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Color-Magnitude Diagrams using Gaia data from IRSA

This Python Jupyter Notebook running on Google Colab is a companion to the [Color-Magnitude Diagram using Gaia and IRSA activity](https://docs.google.com/document/d/12A1GQ6mf0feTEJzhqQqQHBhHmT3FknKORnT4TZSfGCY/edit?usp=sharing).

This notebook takes sources extracted from the Gaia catalog within the Infrared Science Archive (IRSA) and builds a color-magnitude diagram for the dataset.

The columns used in this notebook from the [Gaia catalog at IRSA](https://irsa.ipac.caltech.edu/data/Gaia/dr3/gaia_dr3_source_colDescriptions.html) are [parallax](https://crpurcell.github.io/AstroParallax/) ('parallax'), [color index](https://openstax.org/books/astronomy-2e/pages/17-2-colors-of-stars) ('bp_rp', the BP magnitude minus the RP magnitude), and [magnitude](https://openstax.org/books/astronomy-2e/pages/17-1-the-brightness-of-stars) for the source in Gaia's broad G band ('phot_g_mean_mag').



# The Hertzsprung-Russell Diagram
The famous Hertzsprung-Russell (HR) diagram has many uses in stellar astronomy. An HR diagram shows relationships between stellar populations by plotting luminosity against temperature. This activity focuses on stars that are members of a cluster. Stars in a cluster have a common distance, age, and composition. If you have magnitudes measured using different filters (brightnesses measured over a limited range of wavelengths) for a group of stars that are known members of the same cluster, you can make a color-magnitude diagram (CMD).

Comparing the amount of light in two filters provides a proxy for temperature at optical wavelengths. Things are different in other bands of the EM spectrum. Since magnitudes are logarithmic, a difference in magnitudes corresponds to a ratio of fluxes. See the appendix for a lot more detail. This difference is called a [color index](http://spiff.rit.edu/classes/phys440/lectures/color/color.html), and it is defined relative to [the star Vega](http://spiff.rit.edu/classes/phys440/lectures/color/color.html) (assuming no intervening dust). If the index is negative, the star is bluer than the standard star (Vega). If the index is positive, the star is redder than Vega. You will use actual data from the Gaia mission in the IRSA database to create a CMD for some star clusters.
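
The link between a magnitude difference and a flux ratio can be sketched in a few lines. This is a minimal illustration using Pogson's relation; the function name `flux_ratio` and the sample color indices are made up for this example and do not come from the cluster files. Because Gaia magnitudes are tied to Vega, the ratio is best read as "relative to Vega's own BP/RP ratio".

```python
def flux_ratio(delta_mag):
    """Flux ratio F1/F2 implied by a magnitude difference m1 - m2."""
    # Pogson's relation: m1 - m2 = -2.5 * log10(F1 / F2)
    return 10 ** (-0.4 * delta_mag)


# Hypothetical color indices (BP - RP), invented for illustration
for bp_rp in (-0.5, 0.0, 1.5):
    ratio = flux_ratio(bp_rp)
    print(f"BP - RP = {bp_rp:+.1f} -> BP/RP flux ratio relative to Vega = {ratio:.2f}")
```

A negative index gives a ratio above 1 (more blue light than Vega, so a bluer star), and a positive index gives a ratio below 1 (a redder star).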

| Cluster Name | Angular Size (arcminutes) | Parallax Range (mas) | Cluster Type |
| --- | --- | --- | --- |
| [Collinder 110](https://simbad.cds.unistra.fr/simbad/sim-basic?Ident=collinder+110&submit=SIMBAD+search) | 18 | > 0.30 and < 0.50 | Open cluster |
| [NGC 4755](https://simbad.u-strasbg.fr/simbad/sim-basic?Ident=jewel+box&submit=SIMBAD+search) | 10 | > 0.35 and < 0.50 | Open cluster |
| [Messier 13](https://simbad.cds.unistra.fr/simbad/sim-basic?Ident=m+13&submit=SIMBAD+search) | 20 | > 0.10 and < 0.20 | Globular cluster |
| [Melotte 71](https://simbad.cds.unistra.fr/simbad/sim-basic?Ident=melotte+71&submit=SIMBAD+search) | 9 | > 0.30 and < 0.50 | Open cluster |



# Set Up the Notebook

In [None]:
# @title Install Astroquery
!pip install -U astroquery

In [None]:
# @title Import Libraries
from astropy.table import Table
from astroquery.ipac.irsa import Irsa
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

In [None]:
# @title Cluster Data URL List
clusters = [
    'https://thinkingwithcode.com/datascience/collinder110_gaia_dr3.tbl', # Collinder 110 at index 0
    'https://thinkingwithcode.com/datascience/m13_gaia_dr3.tbl',          # M13 at index 1
    'https://thinkingwithcode.com/datascience/melotte71_gaia_dr3.tbl',    # Melotte 71 at index 2
    'https://thinkingwithcode.com/datascience/ngc4755_gaia_dr3.tbl']      # NGC 4755 at index 3

## Select a Cluster

* Collinder 110 is clusters[0]
* M13 is clusters[1]
* Melotte 71 is clusters[2]
* NGC 4755 is clusters[3]
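
If remembering the index for each cluster is error-prone, one option is to look clusters up by name instead. The sketch below repeats the URL list from the cell above and pairs it with names; `cluster_names`, `cluster_urls`, and `selected` are hypothetical names introduced only for this example.

```python
# Same URLs, in the same order, as the notebook's `clusters` list
clusters = [
    'https://thinkingwithcode.com/datascience/collinder110_gaia_dr3.tbl',
    'https://thinkingwithcode.com/datascience/m13_gaia_dr3.tbl',
    'https://thinkingwithcode.com/datascience/melotte71_gaia_dr3.tbl',
    'https://thinkingwithcode.com/datascience/ngc4755_gaia_dr3.tbl',
]

# Hypothetical helper: names listed in the same order as the URLs
cluster_names = ['Collinder 110', 'M13', 'Melotte 71', 'NGC 4755']
cluster_urls = dict(zip(cluster_names, clusters))

selected = 'M13'  # change this name to pick another cluster
print(selected, '->', cluster_urls[selected])
```

The selected URL could then be passed to `Table.read(..., format='ascii.ipac')` in the same way the notebook passes `clusters[0]`.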

In [None]:
# @title Load A Single Cluster as a Table and Create a Dataframe
# change the number in clusters[ ... ] to select a given cluster
t = Table.read(clusters[0], format='ascii.ipac')
df = t.to_pandas()

# Load and Reduce the Data and Plot

Not every source returned by the Gaia database is a cluster member. Let’s learn to filter our data using known distances from [Simbad](https://simbad.cds.unistra.fr/simbad/). Using the simple search on the Simbad site, search for the name of the cluster we are analyzing. Each cluster name is linked above to its Simbad entry. Find the value for the parallax, which is given in milliarcseconds (mas). Note that the value in square brackets is the uncertainty in this value. Try limiting the data to parallax values between a minimum and a maximum that bracket the Simbad value. For example, if the result says `0.425 [0.002]`, enter `lower = 0.3` and `upper = 0.5` in the filter cell below. The table above has some recommended values to try. The filtered data will contain mostly stars that belong to the cluster rather than stars that merely lie along our line of sight. Be sure to rerun the code cell to apply the filter.
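
The filtering idea described above can be tried on a tiny made-up table before applying it to real data. This is a sketch only: the `toy` DataFrame and its values are invented for illustration, and only the column names match the Gaia table.

```python
import pandas as pd

# Toy data standing in for a Gaia table (values are invented)
toy = pd.DataFrame({
    'parallax': [0.05, 0.31, 0.42, 0.49, 1.80],        # mas
    'phot_g_mean_mag': [17.2, 14.1, 12.8, 15.5, 9.3],
})

lower, upper = 0.30, 0.50

# Boolean mask: True where the parallax lies strictly between the limits
in_range = (toy['parallax'] > lower) & (toy['parallax'] < upper)

# .copy() gives an independent table, so columns can be added to it later
toy_filtered = toy[in_range].copy()

print(f"kept {len(toy_filtered)} of {len(toy)} sources")
print(toy_filtered)
```

The very distant source (0.05 mas) and the nearby foreground source (1.80 mas) are both dropped, which is the behavior we want from the real filter.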

In [None]:
# @title Filter the Parallax Values
# use the ranges in the table above for your selected cluster
lower = 0.30
upper = 0.50
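# .copy() makes df_filtered an independent table, so adding a column later does not raise a pandas warning
df_filtered = df[(df['parallax'] > lower) & (df['parallax'] < upper)].copy()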

In [None]:
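# @title Define Distance Modulus Function
def distance_modulus(app_mag, parallax):
  # Returns the absolute magnitude M = m - (5*log10(d) - 5),
  # where d = 1000/parallax is the distance in parsecs (parallax in mas)
  return app_mag - (5*np.log10(1000/parallax) - 5)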

In [None]:
# @title Calculate Absolute Magnitudes and Add to Dataframe
abs_mag = distance_modulus(df_filtered['phot_g_mean_mag'], df_filtered['parallax'])
df_filtered['abs_mag'] = abs_mag

In [None]:
# @title Display First Rows of Table for Inspection
display(df_filtered[['bp_rp', 'parallax', 'abs_mag']].head())
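
As a sanity check on the absolute-magnitude step, it helps to run the distance modulus on stars whose answer is known in advance. The sketch below assumes parallaxes in milliarcseconds and ignores dust extinction; the function name `absolute_magnitude` and the three test stars are invented for this example.

```python
import numpy as np


def absolute_magnitude(app_mag, parallax_mas):
    """Absolute magnitude from apparent magnitude and parallax in mas."""
    distance_pc = 1000.0 / parallax_mas            # d [pc] = 1000 / p [mas]
    # Distance modulus: m - M = 5 * log10(d) - 5
    return app_mag - 5.0 * np.log10(distance_pc) + 5.0


# Three invented stars, each with apparent magnitude 12
m = np.array([12.0, 12.0, 12.0])
p = np.array([100.0, 10.0, 1.0])   # mas, i.e. 10 pc, 100 pc, and 1000 pc

print(absolute_magnitude(m, p))
```

The star at 10 pc keeps its magnitude of 12 (by definition of absolute magnitude), while the stars at 100 pc and 1000 pc come out at 7 and 2: a star that looks equally bright from farther away must be intrinsically brighter, so its absolute magnitude is a smaller number.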

# Plot Data as Color-Magnitude Diagram

In [None]:
# @title CMD using MatPlotLib
fig, ax = plt.subplots()

x = df_filtered['bp_rp']
y = df_filtered['abs_mag']
c = df_filtered['parallax']

scatter = ax.scatter(x, y, c=c, cmap='rainbow', s=1)
fig.colorbar(scatter, ax=ax, label='Parallax (mas)')

ax.set_xlabel('Color Index (BP - RP)')
ax.set_ylabel('Absolute Magnitude (G)')
ax.set_title('Color-Magnitude Diagram')
ax.invert_yaxis()

plt.show()

# Analysis and Questions for Each Cluster

You can return to the top of this notebook and change the code for each cluster to load the data and then to filter the data. Then re-run the notebook with the new table and new filter values.

1. Screenshot and upload your CMD plots as images into this cell (double-click to edit). Describe the x- and y-axes to indicate which way the temperature and brightness increase. You’ll need a CMD for each cluster listed. Analyze the plots and attempt to organize the clusters from oldest to youngest. In a single sentence, give justifications for your choices. (There are some limitations here. The field could have foreground or background stars (field stars) that skew the data.)

2. For each of the 4 clusters, describe the types of stars (e.g., main sequence, red giant, white dwarf) and their relative presence in that particular cluster. (double-click to edit)
* Collinder 110
* NGC 4755
* Messier 13
* Melotte 71

3. Why do we analyze clusters, and how could field stars be a problem for our CMD? And how does this make a CMD different from a typical H-R diagram?

4. Label, mark, or describe areas like the sub-giant branch, the red giant branch, the asymptotic giant branch, the horizontal branch, the main sequence, the white dwarfs, etc.

5. Remember that larger parallax angles mean closer distances. How does this visualization of the parallax values affect your interpretation of the CMD you produced?

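6. Now, calculate the distance to the cluster in parsecs using the given parallax value from Simbad. Remember d = 1/p when p is in arcseconds. Simbad and Gaia report parallax in milliarcseconds (mas), so d = 1000/p when p is in mas.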

7. Describe the variability of the distribution of stars in each cluster. Discuss outliers and the difference between the shapes and features of each cluster CMD.
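
For the distance calculation in question 6, the unit conversion is the step most likely to go wrong. Here is a minimal sketch, assuming the parallax is given in milliarcseconds; the function name `parallax_to_distance_pc` and the example parallax are made up and do not belong to any of the four clusters.

```python
def parallax_to_distance_pc(parallax_mas):
    """Distance in parsecs from a parallax in milliarcseconds."""
    if parallax_mas <= 0:
        # Zero or negative parallaxes cannot be inverted into a distance
        raise ValueError("parallax must be positive")
    return 1000.0 / parallax_mas   # d [pc] = 1000 / p [mas]


example_parallax = 2.0  # mas, an invented value for illustration
distance = parallax_to_distance_pc(example_parallax)
print(f"{example_parallax} mas -> {distance} pc")
```

A parallax of 2.0 mas corresponds to 500 pc. Swap in the Simbad value for your cluster to get its distance.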


# What to turn in
Create a document with your answers to the analysis questions and your CMDs. Each group needs one document. Don’t forget to include everyone in your submission!

# Thanks and Creative Commons
Thanks to Dr. Luisa Rebull for showing me how to use IRSA at an American Astronomical Society meeting.

This work is released under a Creative Commons license.
Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/

Edited by Dr. James Newland (https://jimmynewland.com/) as a member of the BIg NITARP Alumni Project (BINAP) - https://nitarp.ipac.caltech.edu/team/89-BINAP-2024 - Last edited 2025/10/17

For more information on the Gaia mission and Hertzsprung-Russell Diagrams, see:

Babusiaux, C., van Leeuwen, F., Barstow, M. A., Jordi, C., Vallenari, A., Bossini, D., Bressan, A., Cantat-Gaudin, T., van Leeuwen, M., Brown, A. G. A., Prusti, T., de Bruijne, J. H. J., Bailer-Jones, C. A. L., Biermann, M., Evans, D. W., Eyer, L., Jansen, F., Klioner, S. A., Lammers, U., … Zwitter, T. (2018). Observational Hertzsprung-Russell diagrams. Astronomy and Astrophysics, 616. https://doi.org/10.1051/0004-6361/201832843

For more information about using data from IRSA in the astronomy classroom, see:

Rebull, L. M. (2024). Astronomy data in the classroom. Physics Today, 77(2), 44–50. https://doi.org/10.1063/pt.vlhh.iudp   
