# Evolution of the Smallest Galaxies in the Universe

In [2]:
import pandas as pd
import math
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sb



The Milky Way swarms with orbiting satellite dwarf galaxies of astounding diversity. Some galaxies continue to form stars while others stop and dim in brightness. In computer simulations, the evolutionary history of each dwarf galaxy that leads to these differences is known. Galaxies can lose gas and stop forming stars due to early exposure to stellar radiation (reionization), interaction with the hot gas of the host (ram-pressure stripping), or gravitational interactions with the host/dwarf galaxies (tidal effects).

We propose to use the characteristics of simulated dwarf galaxies from a suite of 20 Milky Way-like galaxies, each of which hosts ~100 dwarf galaxies, to train a supervised machine learning algorithm. The goal is to use only those dwarf galaxy characteristics observable in the present day as features to predict key aspects of their evolutionary history, such as the dominant mechanism by which they lose gas (classification), and/or the timescale of gas loss (regression). We will then apply the simulation-trained algorithm to the characteristics of our own Milky Way’s dwarf galaxy population, which have uncertain evolutionary histories. Identifying the likely histories of dwarf galaxies will help us to understand how galaxies form and evolve over time.


Question: Can simulations of populations of dwarf galaxies around Milky Way-like galaxies be used to reliably predict the evolutionary histories of the observed dwarf galaxies around the Milky Way?

We'll work with two types of data: simulations and observations. The observations come from a wide variety of sources in the literature. The first stage is to become familiar with the observational data; the next task will be to get the data from the simulations.

# Observational Data

This is a compilation of the observed characteristics of dwarf galaxies in the Local Group. The list is originally based on McConnachie 2012 (https://arxiv.org/abs/1204.1562) and has been expanded with data from a variety of sources in the literature (e.g. https://arxiv.org/pdf/1901.05465.pdf). The list is created in the following notebook: (https://github.com/janagrc/HIdwarflimits/blob/master/HI_Limits.ipynb)

In [4]:
# Read the compiled table; lines starting with '#' are treated as comments
dwarfs = pd.read_csv('./dwarfs.csv', header=0, comment='#')

In [5]:
dwarfs

Unnamed: 0.1,Unnamed: 0,GalaxyName,RA_hr,RA_min,RA_sec,Dec_deg,Dec_arcmin,Dec_arcsec,EB-V,m-M,...,r_apo_err2,v_apo_mean,v_apo_median,v_apo_std,n_halo_apo,n_halo_apo_err1,n_halo_apo_err2,ell_surf_log10_abs,min_dist_mw_or_m31,Closer_MW_M31
0,0,,,,,,,,,,...,,,,,,,,,,
1,1,,,,,,,,,,...,,,,,,,,,,
2,2,SagittariusdSph,18.0,55.0,19.5,-30.0,32.0,43.0,0.153,17.10,...,12.227629,104.536117,105.278226,8.764286,0.004338,0.000753,0.001042,-2.325838,18.587576,MW
3,3,TucanaIII,23.0,56.0,36.0,-59.0,36.0,0.0,9.999,17.01,...,1.745390,36.939552,37.089534,3.743090,0.000098,0.000089,0.000177,-2.318699,23.298941,MW
4,4,DracoII,15.0,52.0,47.6,64.0,33.0,55.0,0.016,16.90,...,26.110572,77.314244,77.777376,12.929979,0.000521,0.000460,0.000959,-2.330260,26.126576,MW
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
138,138,HydrusI,2.0,29.0,33.3,-79.0,18.0,32.0,,17.20,...,25.920137,73.949053,73.697460,9.908038,0.000494,0.000187,0.000267,-2.310180,25.741743,MW
139,139,CarinaII,7.0,36.0,25.5,-57.0,59.0,57.0,,17.79,...,61.056593,49.882156,50.330522,7.637017,0.001654,0.001038,0.001686,-2.285622,37.296364,MW
140,140,CarinaIII,7.0,38.0,31.1,-57.0,53.0,59.0,,17.22,...,226.758301,61.517261,55.847231,28.095470,0.003703,0.003150,0.010406,-2.303074,29.180644,MW
141,141,CraterII,11.0,49.0,14.4,-18.0,24.0,47.0,,20.35,...,6.749871,80.577944,81.659607,19.234086,0.000411,0.000158,0.000361,-2.122559,116.471698,MW
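The preview shows two leftover index columns (`Unnamed: 0.1`, `Unnamed: 0`) and leading rows with no `GalaxyName`. A minimal sketch of one way to tidy this, run on a toy frame that copies a few values from the preview; the helper name `tidy` is hypothetical, and whether the empty rows should really be dropped is an assumption to check against the source notebook:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the first rows of dwarfs.csv (values copied from the preview)
toy = pd.DataFrame({
    "Unnamed: 0.1": [0, 1, 2, 3],
    "Unnamed: 0": [0, 1, 2, 3],
    "GalaxyName": [np.nan, np.nan, "SagittariusdSph", "TucanaIII"],
    "m-M": [np.nan, np.nan, 17.10, 17.01],
})

def tidy(df):
    """Drop leftover index columns and rows without a galaxy name (hypothetical helper)."""
    keep = [c for c in df.columns if not str(c).startswith("Unnamed")]
    return df[keep].dropna(subset=["GalaxyName"]).reset_index(drop=True)

clean = tidy(toy)
print(clean)
```

Applied to the real table this would be `dwarfs = tidy(dwarfs)`; it is worth comparing `len(dwarfs)` before and after to see how many rows were removed.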


In [6]:
list(dwarfs.columns)

['Unnamed: 0',
 'GalaxyName',
 'RA_hr',
 'RA_min',
 'RA_sec',
 'Dec_deg',
 'Dec_arcmin',
 'Dec_arcsec',
 'EB-V',
 'm-M',
 'm-M_err_pos',
 'm-M_err_neg',
 'vh(km/s)',
 'Vmag',
 'Vmag_err_pos',
 'Vmag_err_neg',
 'PA',
 'e=1-b/a',
 'muVo',
 "rh(')",
 'rh_err_pos',
 'rh_err_neg',
 'vsig_s',
 'vsig_err_pos',
 'vsig_err_neg',
 'vrot_s',
 'vrot_s_err_pos',
 'vrot_s_err_neg',
 'MHI',
 'Data',
 'Key',
 'vsig_g',
 'vsig_g_err_pos',
 'vsig_g_err_neg',
 'vrot_g',
 'note',
 'M-m',
 'Notes',
 'ra',
 'dec',
 'M_V',
 'dist_kpc',
 'M_dyn',
 'M_*',
 'vh_(km/s)_err',
 'vsig_s_upper_limit',
 'vsig_s_err',
 'vh(km/s)_err',
 'orb_pericenter',
 'orb_apocenter',
 'orb_eccentricity',
 'orb_period',
 'orb_period_type',
 'orb_pericenter_0.8',
 'orb_apocenter_0.8',
 'orb_eccentricity_0.8',
 'tau_50',
 'tau_90',
 'dist_pc',
 'dist_mpc',
 'Radius',
 'gal_interference_min',
 'gal_interference_max',
 'Galactic Interference Range',
 'MHI_computed',
 'MHI_source',
 'MHI_type',
 '1sigma',
 'rh_subtable',
 'MHI_method',
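Several of these columns are related by standard formulas, which helps when working out what they mean physically. The sketch below converts the sexagesimal position columns to decimal degrees and the distance modulus `m-M` to a distance via d = 10^((m-M)/5 + 1) pc. Whether the table's own `ra`, `dec`, and `dist_kpc` columns were produced exactly this way is an assumption; the helper name `derive_positions` and the output column names are hypothetical:

```python
import numpy as np
import pandas as pd

def derive_positions(df):
    """Return decimal-degree coordinates and distance in kpc (hypothetical helper)."""
    # Right ascension: 1 hour = 15 degrees
    ra_deg = 15.0 * (df["RA_hr"] + df["RA_min"] / 60 + df["RA_sec"] / 3600)
    # Declination: the sign lives on the degrees field only
    sign = np.where(np.signbit(df["Dec_deg"]), -1.0, 1.0)
    dec_deg = sign * (df["Dec_deg"].abs() + df["Dec_arcmin"] / 60 + df["Dec_arcsec"] / 3600)
    # Distance modulus: m - M = 5 log10(d / 10 pc)
    dist_kpc = 10 ** (df["m-M"] / 5 + 1) / 1000
    return pd.DataFrame({"ra_deg": ra_deg, "dec_deg": dec_deg, "dist_kpc_from_mM": dist_kpc})

# One row copied from the preview of the table (SagittariusdSph)
toy = pd.DataFrame({
    "RA_hr": [18.0], "RA_min": [55.0], "RA_sec": [19.5],
    "Dec_deg": [-30.0], "Dec_arcmin": [32.0], "Dec_arcsec": [43.0],
    "m-M": [17.10],
})
derived = derive_positions(toy)
print(derived)
```

Comparing these derived values with the existing `ra`, `dec`, and `dist_kpc` columns is a quick consistency check on the compilation.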


In [8]:
# Subset of column names (pre-rename)
cols = ['min_dist_mw_or_m31', 'orb_pericenter', 'orb_apocenter', 'orb_eccentricity', 'orb_period',
        'Radius', 'M_V', 'ell_surf_dist', 'M_dyn', 'vsig_s', 'vrot_s', 'vsig_g', 'vrot_g', 'muVo',
        'tau_90', '[Fe/H]', 'V_GSR', 'V_LGSR', 'MHI_computed']

# Rename columns to be more descriptive.
# Backslashes are doubled so LaTeX commands such as \sigma are not read as string escapes.
d = {
    'l': 'l',
    'b': 'b',
    'dist_pc': 'Distance',
    'min_dist_mw_or_m31': 'Dist MW or M31, nearest',
    'orb_pericenter': 'Orbit Pericenter',
    'orb_apocenter': 'Orbit Apocenter',
    'orb_eccentricity': 'Orbit Eccentricity',
    'orb_period': 'Orbit Period',
    "rh(')": 'Angular Radius',
    'Radius': 'Radius [pc]',
    'M_V': 'V band Magnitude M$_{V}$',
    'ell_surf_dist': 'Distance to LG surf',
    'M_dyn': 'Estimated Dynamical Mass',
    'vsig_s': 'Stellar Velocity Dispersion $\\sigma_{star}$',
    'vrot_s': 'Stellar Rotation Velocity $V_{rot,star}$',
    'vsig_g': 'HI Velocity Dispersion $\\sigma_{gas}$',
    'vrot_g': 'HI Rotation Velocity $V_{rot,gas}$',
    'muVo': 'Surface Brightness $\\mu_{V}$',
    'vh(km/s)': 'Heliocentric Velocity $v_{helio}$',
    'tau_90': 'Star Formation Timescale $\\tau_{90}$',
    '[Fe/H]': 'Mean Metallicity $\\langle $[Fe/H]$\\rangle $',
    'V_GSR': 'Galactocentric Velocity $V_{GSR}$',
    'V_LSR': 'Local Std of Rest Velocity $V_{LSR}$',
    'V_LGSR': 'Local Group Std of Rest Velocity $V_{LGSR}$',
    'MHI_computed': 'HI Mass $M_{HI}$ (Detected or Limit)',
    'X': '3D Position (X)',
    'Y': '3D Position (Y)',
    'Z': '3D Position (Z)',
}
dwarfs = dwarfs.rename(columns=d)
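Note that `DataFrame.rename` silently ignores mapping keys that are not present in the table, and that `cols` still holds the pre-rename names. A small sketch of how one might check both, using a toy frame and a shortened mapping (the names `toy`, `mapping`, `unmatched`, and `present` are illustrative only):

```python
import pandas as pd

# Toy frame with a few of the original column names
toy = pd.DataFrame(columns=["GalaxyName", "M_V", "tau_90", "MHI_computed"])

# Shortened version of the rename mapping, for illustration
mapping = {
    "M_V": "V band Magnitude M$_{V}$",
    "tau_90": "Star Formation Timescale $\\tau_{90}$",
    "[Fe/H]": "Mean Metallicity $\\langle $[Fe/H]$\\rangle $",
}

# Keys that would be ignored by rename because no such column exists
unmatched = sorted(set(mapping) - set(toy.columns))
print("Not in table:", unmatched)

renamed = toy.rename(columns=mapping)

# Translate a list of old names to the new labels, keeping only those present
old_cols = ["M_V", "tau_90", "[Fe/H]"]
new_cols = [mapping.get(c, c) for c in old_cols]
present = [c for c in new_cols if c in renamed.columns]
print("Usable after rename:", present)
```

If a missing key should be an error rather than a silent no-op, `rename(columns=..., errors="raise")` raises a `KeyError` instead.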

-Familiarize yourself with the observational data.

-Work out what the columns represent physically.

-Do some open-ended exploration, perhaps looking at relationships between different characteristics and making some plots.

-We are particularly interested in the star formation timescale (tau_90) and the neutral hydrogen gas content, i.e. the HI mass (MHI_computed). Why? Because we observed the HI mass for the first time for some of these dwarfs, so it is our unique contribution, and it also relates to the evolutionary history of the dwarfs.
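As a starting point for the exploration, the sketch below looks at the star formation timescale against the HI mass: it drops rows missing either value, takes log10 of the HI mass, computes a Spearman rank correlation (as the Pearson correlation of the ranks), and draws a scatter plot. The data here are synthetic placeholders, not real measurements, and the toy frame uses the original column names `tau_90` and `MHI_computed`; after the rename cell those columns carry the longer descriptive labels.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic placeholder values, NOT real measurements
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "tau_90": rng.uniform(1, 13, 30),
    "MHI_computed": 10 ** rng.uniform(3, 8, 30),
})
toy.loc[[3, 7], "tau_90"] = np.nan  # mimic missing entries

# Keep only rows where both quantities are available
pair = toy[["tau_90", "MHI_computed"]].dropna()
pair = pair.assign(log_MHI=np.log10(pair["MHI_computed"]))

# Spearman rank correlation = Pearson correlation of the ranks
spearman = pair["tau_90"].rank().corr(pair["log_MHI"].rank())
print(f"N = {len(pair)}, Spearman rho = {spearman:.2f}")

fig, ax = plt.subplots()
ax.scatter(pair["tau_90"], pair["log_MHI"])
ax.set_xlabel(r"Star Formation Timescale $\tau_{90}$")
ax.set_ylabel(r"$\log_{10}$ HI Mass $M_{HI}$")
```

One caveat: the label for `MHI_computed` says it mixes detections and limits, so a plain correlation like this treats upper limits as if they were measurements. Splitting or marking points by `MHI_type` would be a natural next step.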

# Simulations

In [None]:
# We still need to get the simulated data. I'll email Christine Simpson at UChicago and cc you
# to request data access; then we can compare the simulations with the observations.