# Workshop 3: Exploring IllustrisTNG simulations to derive observationally comparable star formation rates and metallicities

## Notebook 1: IllustrisTNG API and downloading data

In this notebook, you will be introduced to the different data types available for the TNG simulations and learn how to use the TNG API to download the data you need.

In [1]:
#BRYANNE: give install link
import iapi_TNG as iapi
#this package contains useful functions for downloading the necessary data
#make sure you have edited iapi_TNG.py to include your personal API key
import numpy as np
import h5py #most TNG data is downloaded as hdf5 files

h=0.6774 #BRYANNE: remove for final version
baseUrl = 'http://www.tng-project.org/api/'


In [16]:
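#getredshift takes one snapshot number at a time; passing an array raises
#TypeError: only integer scalar arrays can be converted to a scalar index
b=np.arange(99)
z_b=[iapi.getredshift(int(s)) for s in b] #one lookup per snapshot, so this can be slow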

### General simulation data

BRYANNE: pull examples from your code

In [2]:
#EDIT THIS FOR YOUR MACHINE
#dirc ='/path/to/directory/'
dirc='/projectnb/res-star/TNG Workshop/'

##specify which simulation you want to explore
sim='TNG100-1'
"""
TNG100-1 is the highest resolution simulation in the 100 Mpc simulation box
TNG50-1 is the highest resolution available in a 50 Mpc box
TNG300-1 is the largest volume simulation (300 Mpc)
Lower resolutions are available with -N replacing -1, allowing for testing resolution dependency
'Dark' simulations are also available: these are dark-matter only runs
Subboxes are available that provide higher time resolution
For the exercises in this workshop, you will need to use a baryonic simulation; we recommend TNG100-1, TNG50-1, or TNG300-1
Check all available simulations by uncommenting the line below:
"""
r=iapi.get(baseUrl)
#print([sim['name'] for sim in r['simulations']])

In [3]:
#check the properties of the simulation you have selected
simUrl = baseUrl+sim
print(simUrl) #view the simulation data in your browser by following the URL (make sure you are logged in!)
simdata = iapi.get(simUrl)
print(simdata['description'])

##uncomment line below to see all the simulation-level information available, or follow the simUrl
#print(simdata.keys())


http://www.tng-project.org/api/TNG100-1
Main high-resolution IllustrisTNG100 run including the full TNG physics model.


#### Exercise: Find the value of the Hubble constant used in the simulation you chose to explore
In simulations, units often include 'little h', the Hubble constant divided by 100 km/s/Mpc.

In these simulations, the value of h is stored in the simulation data under the 'hubble' key.

In [4]:
##Complete this line
h=
print(h)

SyntaxError: invalid syntax (4157845934.py, line 2)

### Group catalogs

Group catalogs contain properties of all identified halos or subhalos (galaxies) in a given snapshot. These are good for obtaining masses, positions, and other global properties. You can check out details about the available fields here: https://www.tng-project.org/data/docs/specifications/#sec2

In iapi_TNG there are two similar functions that obtain a field for all subhalos or all halos in a given simulation at a given snapshot:

> getSubhaloField(field, simulation='TNG100-1', snapshot=99, fileName='tempCat', rewriteFile=1)

> getHaloField(field, simulation='TNG100-1', snapshot=99, fileName='tempCat', rewriteFile=1)

- field (str): name of field to be returned from the table linked above, e.g. 'SubhaloPos'
- simulation (str): name of simulation, e.g. 'TNG100-1'
- snapshot (int): snapshot to pull data from. For TNG, snapshot=99 is z=0, which is the default
- fileName (str): path to the file where you want to save the data, recommended to avoid repeated API requests
- rewriteFile (0 or 1): if 0 (recommended), will attempt to pull from an existing file (fileName) before downloading; if 1 will download and overwrite
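
The `rewriteFile` option described above follows a cache-or-download pattern. The sketch below shows that pattern in isolation. `fetch_with_cache` and `fake_download` are hypothetical names made up for illustration; they are not part of iapi_TNG, whose internals and file format may differ.

```python
import os
import tempfile
import numpy as np

def fetch_with_cache(file_name, download, rewrite_file=0):
    """Return an array, reading file_name + '.npy' if allowed and present, else downloading."""
    path = file_name + '.npy'
    if rewrite_file == 0 and os.path.exists(path):
        return np.load(path)      # reuse the local copy, no request is made
    data = download()             # stand-in for the API request
    np.save(path, data)           # store (or overwrite) the local copy
    return data

calls = []
def fake_download():
    # hypothetical downloader that records how often it is called
    calls.append(1)
    return np.arange(5)

cache_dir = tempfile.mkdtemp()
name = os.path.join(cache_dir, 'SubhaloFlag')
first = fetch_with_cache(name, fake_download, rewrite_file=0)   # file missing: downloads
second = fetch_with_cache(name, fake_download, rewrite_file=0)  # file present: reads it
third = fetch_with_cache(name, fake_download, rewrite_file=1)   # forced: downloads again
```

With `rewrite_file=0` the second call never reaches the downloader, which is why that setting is recommended when the same field is reused across sessions.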

Now let's fetch the fields we will want for our later analysis:

In [4]:
#the flag field indicates whether a subhalo is cosmological in origin
#you will generally only want to use subhalos that have flag=1
flag=iapi.getSubhaloField('SubhaloFlag',simulation=sim,fileName=dirc+'catalogs/SubhaloFlag',rewriteFile=0)

In [5]:
#let's fetch a field that will tell us about the mass of the galaxy
#SubhaloMassType gives the total mass of all bound particles, separated by particle type
mass=iapi.getSubhaloField('SubhaloMassType',simulation=sim,fileName=dirc+'catalogs/MassType',rewriteFile=0)
print(mass.shape)

#note that there are 6 entries for each subhalo, one for each particle type:
#0 - gas
#1 - dark matter
#2 - unused
#3 - tracers (you can ignore these)
#4 - stars/wind
#5 - black holes

#Pull the stellar mass: 
stellar_mass=mass[:,4]

#OLIVIA: turn this into an exercise?
#check the subhalo catalog for the default units (10^10 Msun/h), then convert into solar masses
stellar_mass=stellar_mass*10**10/h
#print(min(stellar_mass[np.nonzero(stellar_mass)]))

(4371211, 6)
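
The mass conversion used in the cell above can be wrapped in a small helper so the same factor is applied consistently to every mass field. This is a minimal sketch: `to_solar_masses` is a hypothetical helper name, and the value of h below is only illustrative; read the real value from the simulation metadata.

```python
def to_solar_masses(mass_code_units, h):
    """Convert catalog masses given in 1e10 Msun/h to Msun."""
    return mass_code_units * 1e10 / h

# illustrative value only; take h from the 'hubble' entry of the simulation data
h_example = 0.6774

# a catalog entry of 1.5 (in units of 1e10 Msun/h) expressed in solar masses
m_sun = to_solar_masses(1.5, h_example)
```

The function uses only multiplication and division, so it works unchanged on a NumPy array such as `mass[:,4]`.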


#### Exercise 1:

There are several other fields relating to galaxy mass. Review those found at the link above and fetch at least one other field relating to mass. Later, we will test the effect of using other definitions of mass on the global star formation main sequence. Generally, you will want to use a mass that is most comparable to how mass was measured in observations you want to compare to.

OLIVIA

In [6]:
sfr_inst = iapi.getSubhaloField('SubhaloSFR',simulation = sim,fileName=dirc+'SubhaloSFR',rewriteFile=0)
#the subhalo catalog includes SubhaloSFR, which is the sum of SFRs over all gas particles bound to the subhalo
#this is NOT directly comparable to SFRs obtained from observations, 
#because observational tracers generally detect already formed stars, not stars about to be formed

#if we want to get more comparable SFRs, we'll have to dig into particle data or merger trees

#### Exercise 2: Metallicities
BRYANNE what do you want here
OLIVIA set this up

In [7]:
#process the data to make our galaxy catalog
#it's useful to keep track of the subhalo ID (subID), the index into the fields
subID=np.arange(0,len(flag))

#make cuts based on flag, and any other cuts to generate your sample
wh_incl=np.nonzero((flag==1))
#to make additional cuts, add additional criteria to the line above 
#e.g. wh_incl=np.nonzero((flag==1) & (stellar_mass>masscut))

#Now store fields for our sample in a dictionary
##Update with any other fields you would like to store

IDs=subID[wh_incl]
s_mass=stellar_mass[wh_incl]
sfr_i = sfr_inst[wh_incl]

galcat = {
    'subID' : IDs,
    'M_*' : s_mass,
    'SFR_inst': sfr_i
}

#save the galaxy catalog for later use
np.save(dirc+'galcat', galcat)
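
Because `galcat` is a dictionary, `np.save` stores it as a pickled object, and reading it back takes two extra steps. The sketch below shows the round trip with a small made-up catalog written to a temporary directory; the keys mirror the ones above, but the values are invented.

```python
import os
import tempfile
import numpy as np

# toy catalog with invented values
toy_cat = {
    'subID': np.array([0, 3, 7]),
    'M_*': np.array([1.0e9, 5.0e10, 2.0e8]),
}

path = os.path.join(tempfile.mkdtemp(), 'galcat')
np.save(path, toy_cat)   # NumPy appends '.npy' and pickles the dictionary

# allow_pickle=True is required for object arrays; .item() unwraps the dictionary
loaded = np.load(path + '.npy', allow_pickle=True).item()
```

Only load pickled files that you created yourself or otherwise trust, since unpickling can execute arbitrary code.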


### Merger Trees

Tracing a subhalo through cosmic time is complicated by the major and minor mergers that ultimately form a z=0 galaxy. The merger trees trace the most massive progenitor of a subhalo through previous snapshots. See the TNG data specifications for more information: https://www.tng-project.org/data/docs/specifications/#sec2

In this workshop, we will be using the SubLink merger trees.

iapi_TNG contains the function gettree(snapnum,subID), which obtains the tree for a given galaxy. The trees contain all the fields in the Halo and Subhalo group catalogs, for each snapshot. Subhalo information will always be for the progenitor of the subID at snapnum. The group/halo of a subhalo may change, so the group information in previous snapshots may not be for the group the subhalo is a member of at snapnum.

getredshift(snapnum) is another useful function in iapi_TNG. It returns the redshift of a given snapshot.

In [8]:
#Let's explore the history of a random subhalo in our sample
sub = np.random.choice(IDs)
print(sub)
subTreeFile = iapi.gettree(99,sub)

#open the hdf5 file that contains the tree
subTree= h5py.File(subTreeFile,'r')

#What fields are available?
print(subTree.keys())


2153210
<KeysViewHDF5 ['DescendantID', 'FirstProgenitorID', 'FirstSubhaloInFOFGroupID', 'GroupBHMass', 'GroupBHMdot', 'GroupCM', 'GroupFirstSub', 'GroupGasMetalFractions', 'GroupGasMetallicity', 'GroupLen', 'GroupLenType', 'GroupMass', 'GroupMassType', 'GroupNsubs', 'GroupPos', 'GroupSFR', 'GroupStarMetalFractions', 'GroupStarMetallicity', 'GroupVel', 'GroupWindMass', 'Group_M_Crit200', 'Group_M_Crit500', 'Group_M_Mean200', 'Group_M_TopHat200', 'Group_R_Crit200', 'Group_R_Crit500', 'Group_R_Mean200', 'Group_R_TopHat200', 'LastProgenitorID', 'MainLeafProgenitorID', 'Mass', 'MassHistory', 'NextProgenitorID', 'NextSubhaloInFOFGroupID', 'NumParticles', 'RootDescendantID', 'SnapNum', 'SubfindID', 'SubhaloBHMass', 'SubhaloBHMdot', 'SubhaloBfldDisk', 'SubhaloBfldHalo', 'SubhaloCM', 'SubhaloGasMetalFractions', 'SubhaloGasMetalFractionsHalfRad', 'SubhaloGasMetalFractionsMaxRad', 'SubhaloGasMetalFractionsSfr', 'SubhaloGasMetalFractionsSfrWeighted', 'SubhaloGasMetallicity', 'SubhaloGasMetallicity

In [9]:
#pull the snapshot numbers that correspond to each entry in each of the fields for this tree
snaps = subTree['SnapNum'][:]
print(snaps[0])
#notice how the first entry corresponds to z=0! The latest entries are first, while the earliest entries are last
print(snaps[-1])
#Some subhalos have shorter merger trees than others
#If the random subhalo you selected earlier has a short merger tree (earliest snapshot > ~70), rerun the previous block

99
13
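
Since tree entries run from the latest snapshot to the earliest, and a tree may not contain an entry for every snapshot, it is safer to look up a snapshot by matching `SnapNum` than by assuming a fixed index. The sketch below uses made-up arrays in place of a real tree; `value_at_snapshot` is a hypothetical helper name.

```python
import numpy as np

# toy tree: latest snapshot first, with snapshot 96 missing (values are invented)
snaps = np.array([99, 98, 97, 95, 94])
sfr = np.array([1.2, 1.5, 0.9, 2.0, 2.4])

def value_at_snapshot(snaps, values, snapnum):
    """Return the tree value at snapnum, or None if the tree has no such entry."""
    idx = np.nonzero(snaps == snapnum)[0]
    return values[idx[0]] if len(idx) > 0 else None

sfr_97 = value_at_snapshot(snaps, sfr, 97)   # present in the toy tree
sfr_96 = value_at_snapshot(snaps, sfr, 96)   # missing, so None

# reverse both arrays together to get chronological (earliest first) order
snaps_chrono = snaps[::-1]
sfr_chrono = sfr[::-1]
```

Reversing every field with the same slice keeps the entries aligned, which is convenient when plotting a quantity forward in time.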


In [13]:
#What redshift does the earliest snapshot correspond to?
print(iapi.getredshift(snaps[-1]))

#construct an array of redshifts corresponding to snaps
z=np.empty(len(snaps))
for i in range(0,len(snaps)):
    #this may take a few minutes
    z[i] = iapi.getredshift(snaps[i])
#print(z)

6.0107573988449
[2.22044605e-16 9.52166697e-03 2.39744284e-02 3.37243719e-02
 4.85236300e-02 5.85073228e-02 7.36613847e-02 8.38844308e-02
 9.94018026e-02 1.09869940e-01 1.25759332e-01 1.41876204e-01
 1.52748769e-01 1.69252033e-01 1.80385262e-01 1.97284182e-01
 2.14425036e-01 2.25988386e-01 2.43540182e-01 2.61343256e-01
 2.73353347e-01 2.97717685e-01 3.10074120e-01 3.28829724e-01
 3.47853842e-01 3.60687657e-01 3.80167867e-01 3.99926965e-01
 4.19968942e-01 4.40297849e-01 4.60917794e-01 4.81832943e-01
 5.03047523e-01 5.24565820e-01 5.46392183e-01 5.75980845e-01
 5.98543288e-01 6.21428745e-01 6.44641841e-01 6.76110411e-01
 7.00106354e-01 7.32636182e-01 7.57441373e-01 7.91068249e-01
 8.16709979e-01 8.51470901e-01 8.86896938e-01 9.23000816e-01
 9.50531352e-01 9.97294226e-01 1.03551045e+00 1.07445789e+00
 1.11415056e+00 1.15460271e+00 1.20625808e+00 1.24847261e+00
 1.30237846e+00 1.35757667e+00 1.41409822e+00 1.49551217e+00
 1.53123903e+00 1.60423452e+00 1.66666956e+00 1.74357057e+00
 1.82268
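
Looking up redshifts one snapshot at a time is slow, and the same snapshots recur in every tree. One option is to remember each result so that a given snapshot is only requested once. This is a sketch under stated assumptions: `make_cached_lookup` and `slow_lookup` are hypothetical names, and `slow_lookup` returns made-up numbers rather than real redshifts. In the notebook, the wrapped function would be `iapi.getredshift`.

```python
def make_cached_lookup(lookup):
    """Wrap a one-snapshot lookup so each snapshot is requested only once."""
    cache = {}
    def cached(snapnum):
        key = int(snapnum)            # NumPy integers and Python ints share one key
        if key not in cache:
            cache[key] = lookup(key)
        return cache[key]
    return cached

requests = []
def slow_lookup(snapnum):
    # hypothetical stand-in for a network request; the return value is made up
    requests.append(snapnum)
    return 0.01 * (99 - snapnum)

get_z = make_cached_lookup(slow_lookup)
tree_a = [99, 98, 97]
tree_b = [99, 98, 96]
z_a = [get_z(s) for s in tree_a]   # three requests
z_b = [get_z(s) for s in tree_b]   # only snapshot 96 triggers a new request
```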

#### Exercise: Construct a plot of star formation rate versus redshift for this subhalo


In [None]:
### do the exercise here
#hint: pull the SubhaloSFR field from the merger tree first



In [1]:
#BRYANNE: use astropy to pull timescales and get comparable redshifts to get sfrs from?

### Particle Data
BRYANNE: introduce

In [None]:
parttype='stars'
particle_fields = 'Coordinates,Masses,GFM_Metallicity,GFM_StellarFormationTime,GFM_InitialMass,GFM_StellarPhotometrics'
#Note that for time-averaged SFR calculations, the initial mass of a star should be used
#BRYANNE: link to Donnari erratum

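#getSubcutout is assumed to live in iapi_TNG like the other helpers; sub is an integer, so convert it before building the file name
cut_file = iapi.getSubcutout(sub, parttype, particle_fields, sim=sim, snapnum=99, fName=dirc+parttype+'_'+str(sub))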

with h5py.File(cut_file,'r') as f:
    #change PartType4 to PartType0 if working with gas particles
    mass = np.asarray(f['PartType4']['Masses'])
    SF_time = np.asarray(f['PartType4']['GFM_StellarFormationTime'])
    iMass = np.asarray(f['PartType4']['GFM_InitialMass'])
    
    #in this example, we will use r-band magnitude for luminosity weighting
    #use a different band by selecting a different column from GFM_StellarPhotometrics
    #check data specifications to see which column corresponds to which band
    rmag = np.asarray(f['PartType4']['GFM_StellarPhotometrics'][:,5])
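
As noted above, a time-averaged SFR should be built from the initial masses of star particles. The sketch below shows one way such an average could be computed once stellar ages are known. It is a sketch under stated assumptions: `time_averaged_sfr` is a hypothetical helper, the ages are assumed to have been derived separately from `GFM_StellarFormationTime`, the masses are assumed to be in solar masses already, and all numbers are invented. Particles with a non-positive formation time are treated as wind rather than stars and are excluded.

```python
import numpy as np

def time_averaged_sfr(initial_mass_msun, age_gyr, formation_a, window_gyr=0.1):
    """Mean SFR in Msun/yr over the last window_gyr Gyr."""
    initial_mass_msun = np.asarray(initial_mass_msun, dtype=float)
    age_gyr = np.asarray(age_gyr, dtype=float)
    formation_a = np.asarray(formation_a, dtype=float)
    is_star = formation_a > 0                    # drop wind-phase particles
    recent = is_star & (age_gyr <= window_gyr)   # stars formed inside the window
    return initial_mass_msun[recent].sum() / (window_gyr * 1e9)

# toy particles: three stars and one wind particle (formation_a <= 0)
imass = np.array([1.0e6, 2.0e6, 3.0e6, 5.0e6])
age = np.array([0.05, 0.50, 0.09, 0.01])
a_form = np.array([0.996, 0.96, 0.993, -0.2])

sfr_100myr = time_averaged_sfr(imass, age, a_form, window_gyr=0.1)
```

The averaging window is a free parameter; choose it to match the timescale probed by the observational tracer you want to compare against.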


