# Reading Snapshots

The snapshots are stored in HDF5 format and can be read with the [h5py](http://docs.h5py.org/en/stable/) library.

In [2]:
import h5py

Each simulation has a single entry-point file. This file gives access to:

- ALL snapshots
- ALL header information
- ALL FOF and SubFind outputs

In [3]:
simulation = h5py.File('/disk1/lbignone/data/cielo/simulations/LG1/LG1.hdf5', 'r')

In [4]:
# Exploring the keys() contained in simulation we can see that there is a single entry for each snapshot (redshift)

for key in simulation.keys():
    print(key)

SnapNumber_100
SnapNumber_101
SnapNumber_102
SnapNumber_103
SnapNumber_104
SnapNumber_105
SnapNumber_106
SnapNumber_107
SnapNumber_108
SnapNumber_109
SnapNumber_110
SnapNumber_111
SnapNumber_112
SnapNumber_113
SnapNumber_114
SnapNumber_115
SnapNumber_116
SnapNumber_117
SnapNumber_118
SnapNumber_119
SnapNumber_120
SnapNumber_121
SnapNumber_122
SnapNumber_123
SnapNumber_124
SnapNumber_125
SnapNumber_126
SnapNumber_127
SnapNumber_128
SnapNumber_30
SnapNumber_31
SnapNumber_32
SnapNumber_33
SnapNumber_34
SnapNumber_35
SnapNumber_36
SnapNumber_37
SnapNumber_38
SnapNumber_39
SnapNumber_40
SnapNumber_41
SnapNumber_42
SnapNumber_43
SnapNumber_44
SnapNumber_45
SnapNumber_46
SnapNumber_47
SnapNumber_48
SnapNumber_49
SnapNumber_50
SnapNumber_51
SnapNumber_52
SnapNumber_53
SnapNumber_54
SnapNumber_55
SnapNumber_56
SnapNumber_57
SnapNumber_58
SnapNumber_59
SnapNumber_60
SnapNumber_61
SnapNumber_62
SnapNumber_63
SnapNumber_64
SnapNumber_65
SnapNumber_66
SnapNumber_67
SnapNumber_68
SnapNumber_69
SnapN
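
The listing above is in string order, so `SnapNumber_100` appears before `SnapNumber_30`. A minimal sketch for ordering the keys by snapshot number instead, assuming every key follows the `SnapNumber_<integer>` pattern seen above (the helper names are illustrative and not part of the data files):

```python
def snapshot_number(key):
    """Extract the integer snapshot number from a key such as 'SnapNumber_127'."""
    return int(key.rsplit('_', 1)[1])


def sorted_snapshot_keys(keys):
    """Return the snapshot keys ordered by snapshot number instead of alphabetically."""
    return sorted(keys, key=snapshot_number)


# Stand-in for simulation.keys(); with the real file, pass simulation.keys() instead
example_keys = ['SnapNumber_100', 'SnapNumber_127', 'SnapNumber_30', 'SnapNumber_69']
print(sorted_snapshot_keys(example_keys))
```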

In [5]:
# To access a snapshot simply select one

snapshot = simulation['SnapNumber_127']

In [6]:
# We can now explore what information each snapshot contains

for key in snapshot.keys():
    print(key)

Groups
Header
PartType0
PartType1
PartType4
PartType5
SubGroups


# Information stored in the snapshots

## Header information

The Header datagroup contains basic information about the snapshot, such as the redshift, the cosmological parameters and the MassTable.

In [7]:
Header = snapshot['Header']

for key in Header.keys():
    print(f'{key}')

BoxSize
HubbleParam
MassTable
Omega0
OmegaLambda
Redshift
Time


In general, a description of each dataset can be viewed by accessing its .attrs property.

In [8]:
for key in Header.keys():
    dataset = Header[f'{key}']
    description = dataset.attrs['description']
    
    print(f'{key}:\n\t{description}')

BoxSize:
	Spatial extent of the periodic box (in co-moving units)
HubbleParam:
	Hubble parameter
MassTable:
	The mass of each particle type. If set to 0 for a type which is present, individual particlemasses can found in the Masses dataset instead
Omega0:
	The cosmological density parameter for matter
OmegaLambda:
	 The cosmological density parameter for the cosmological constant
Redshift:
	The redshift corresponding to the current snapshot
Time:
	The scale factor a (=1/(1+z)) corresponding to the current snapshot


In [9]:
for key in Header.keys():
    dataset = Header[f'{key}']
    
    print(f'{key}:\t{dataset[()]}')

BoxSize:	100000.0
HubbleParam:	0.6711
MassTable:	[0.         0.00010843 0.         0.         0.         0.        ]
Omega0:	0.3175
OmegaLambda:	0.6825
Redshift:	2.220446049250313e-16
Time:	0.9999999999999999
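
The Time description above defines the scale factor as a = 1/(1+z). The short sketch below checks that relation against the Redshift and Time values printed above. The function names are illustrative, and the comparison uses a tolerance because the two fields only agree up to floating-point rounding.

```python
import math


def scale_factor_from_redshift(z):
    """Scale factor a = 1 / (1 + z)."""
    return 1.0 / (1.0 + z)


def redshift_from_scale_factor(a):
    """Redshift z = 1 / a - 1, the inverse of the relation above."""
    return 1.0 / a - 1.0


# Values copied from the Header output above
redshift = 2.220446049250313e-16
scale_factor = 0.9999999999999999

# Both checks compare within a tolerance rather than exactly
print(math.isclose(scale_factor_from_redshift(redshift), scale_factor))
print(math.isclose(redshift_from_scale_factor(scale_factor), redshift, abs_tol=1e-15))
```

The redshift printed for this snapshot is of the order of machine precision, so the snapshot is effectively at z = 0.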


## Particle information

Particle-level information can be accessed through the PartType0, PartType1, PartType2, PartType3, PartType4 and PartType5 datagroups.

The particle numbers correspond to:

|# | type|
|--|------------------------------|
|0 | high-resolution gas|
|1 | high-resolution dark matter|
|2 | intermediate-resolution dark matter|
|3 | intermediate-resolution dark matter|
|4 | stars|
|5 | low-resolution dark matter|

The datasets actually present depend on the particle type. Again, descriptions are available through the .attrs property.

In [10]:
# For example for stars

PartType4 = snapshot['PartType4']

for key in PartType4.keys():
    dataset = PartType4[f'{key}']
    description = dataset.attrs['description']
    
    print(f'{key}:\n\t{description}')

Abundances:
	Mass in individual elements: He, C, Mg, O, Fe, Si, H, N, Ne, S, Ca, Zi (in this order)
BindingEnergy:
	Particle binding energy
Circularity:
	The particle circularity calculated as $J_z/J(E)$, where J(E) is the maximum angular momentum of the particles at positions between 50 before and 50 after the particle in question in a list where the stellar particles are sorted by their binding energy
Coordinates:
	Spatial position within the periodic box. Co-moving coordinate
GroupNumber:
	FoF id of the Group this object belongs to. -1 if the object does not belong to any group
Masses:
	Mass of this particle
ParticleIDs:
	The unique particle ID
Potential:
	Gravitational potential energy
SpecificAngularMomentum:
	
StellarFormationTime:
	The time (given as the scale factor) when this star was formed
SubFindNumber:
	SubFind id of the object this particle belongs to. -1 if the object does not belong to any SubGroup
SubGroupNumber:
	SubGroup number this object belongs to. 0 for centrals.

## FoF data

The output of the FoF code, which describes the halos, can be found under the Groups datagroup.

In [26]:
Groups = snapshot['Groups']

for key in Groups.keys():
    dataset = Groups[f'{key}']
    try:
        description = dataset.attrs['description']
        print(f'{key}:\n\t{description}')
    except KeyError:
        # Skip members that have no 'description' attribute
        pass

GroupCM:
	Center of mass of the Group
GroupLen:
	Total number of particles in the group
GroupLenType:
	Total number of particles in the group, split by particle type
GroupMassType:
	Total mass of particles in the group, split by particle type
GroupNsubs:
	Number of SubGroups in each Group
GroupNumber:
	FoF id of the Group this object belongs to. -1 if the object does not belong to any group
GroupSFR:
	 Sum of the individual star formation rates of all gas cells in this group
Group_M_Crit200:
	Total Mass of this group enclosed in a sphere whose mean density is 200 times the critical density of the Universe, at the time the halo is considered
Group_M_Mean200:
	Total Mass of this group enclosed in a sphere whose mean density is 200 times the mean density of the Universe, at the time the halo is considered
Group_M_TopHat200:
	
Group_R_Crit200:
	Comoving Radius of a sphere centered at this Group whose mean density is 200 times the critical density of the Universe, at the time the halo is co

## SubFind data

Similarly, the SubFind output can be found under the SubGroups datagroup.

In [27]:
SubGroups = snapshot['SubGroups']

for key in SubGroups.keys():
    dataset = SubGroups[f'{key}']
    try:
        description = dataset.attrs['description']
        print(f'{key}:\n\t{description}')
    except KeyError:
        # Skip members that have no 'description' attribute
        pass

GroupNumber:
	FoF id of the Group this object belongs to. -1 if the object does not belong to any group
OpticalRadius:
	Optical radii computed as the radius that encompass 83% of stellar and star-forming gas mass belonging to the SubGroup
SubFindNumber:
	SubFind id of the object this particle belongs to. -1 if the object does not belong to any SubGroup
SubGroupHalfMass:
	Half mass of the SubGroup
SubGroupLen:
	Number of particles contained in this SubGroup
SubGroupMostBoundID:
	ParticleID of the most bound particle
SubGroupNumber:
	SubGroup number this object belongs to. 0 for centrals. -1 if the object does not belongto any SubGroup
SubGroupPos:
	Position of the most bound particle
SubGroupVel:
	Peculiar velocity of the center of mass of the SubGroup
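
The loops above skip members that have no `description` attribute. An alternative sketch collects the descriptions into a dictionary and falls back to an empty string, using the mapping-style `.get` that h5py attribute objects provide. The helper `describe` and the stand-in objects are illustrative only; with the real file one would call `describe(snapshot['SubGroups'])`.

```python
from types import SimpleNamespace


def describe(group):
    """Return {name: description} for every member of an HDF5 group.

    Members without a 'description' attribute map to an empty string.
    """
    return {name: group[name].attrs.get('description', '') for name in group.keys()}


# Stand-in objects mimicking h5py datasets, used only so the sketch runs without the file
fake_group = {
    'SubGroupLen': SimpleNamespace(attrs={'description': 'Number of particles contained in this SubGroup'}),
    'NoDescription': SimpleNamespace(attrs={}),
}

for name, description in describe(fake_group).items():
    print(f'{name}:\n\t{description}')
```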


## Offsets data

Both Groups and SubGroups also contain the offset locations of the particles belonging to each halo or subhalo. These can be used to limit the amount of data loaded into memory.

The offsets can be accessed using the key `PartType[N]/Offsets`, where [N] is the particle type number.

For example:

In [28]:
SubGroups['PartType0/Offsets']

<HDF5 dataset "Offsets": shape (6027, 2), type "<f8">

More details on using the offsets and reading only particular halos/subhalos are given in a separate notebook.
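
As a rough illustration of why offsets help, the sketch below reads the particles of a single object with one slice. It rests on an assumption that is not confirmed here: that the two columns of an Offsets row are the first index and the one-past-the-last index of that object's particles in the corresponding PartType[N] datasets. Because the dataset is stored as floats (`<f8`), the values are cast to integers first. The helper names are hypothetical.

```python
def offsets_to_slice(row):
    """Turn one (start, end) offsets row into a slice.

    ASSUMPTION: the two columns are the first index and the one-past-the-last index.
    """
    start, end = row
    return slice(int(start), int(end))


def read_object(dataset, offsets, index):
    """Read only the entries of `dataset` that belong to object number `index`."""
    return dataset[offsets_to_slice(offsets[index])]


# Plain lists stand in for HDF5 datasets so the sketch runs without the file
fake_masses = [10, 11, 12, 13, 14, 15, 16]
fake_offsets = [[0.0, 3.0], [3.0, 7.0]]
print(read_object(fake_masses, fake_offsets, 1))
```

If the columns turn out to mean something else (for example start and length), only `offsets_to_slice` would need to change.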

# Loading data

So far we have only explored the contents of the file. To actually use the data, we need to load it into memory. There are several ways to do so; see http://docs.h5py.org/en/stable/high/dataset.html#reading-writing-data

In [29]:
# Note that selecting a dataset by name

Coordinates = PartType4['Coordinates']

# does not actually load anything; it simply returns a handle to the dataset on disk

Coordinates

<HDF5 dataset "Coordinates": shape (17229508, 3), type "<f4">

In [30]:
# Only when we provide indices or slices is the data read

Coordinates = PartType4['Coordinates'][:]

Coordinates

array([[52939.516, 50580.35 , 42499.445],
       [52939.008, 50572.133, 42509.156],
       [52901.645, 50714.305, 42592.742],
       ...,
       [54639.01 , 51438.266, 45639.094],
       [54636.973, 51436.395, 45639.297],
       [54634.035, 51434.24 , 45645.77 ]], dtype=float32)

The above command reads all the Coordinates into memory. To limit the amount of data loaded, consult the other notebooks.
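
One simple way to keep memory use bounded is to read a dataset in blocks, since slicing reads only the requested rows. The sketch below works with any object that supports `len()` and slicing, which includes h5py datasets. `iter_chunks` is an illustrative helper, and a plain list stands in for the dataset.

```python
def iter_chunks(dataset, chunk_size):
    """Yield consecutive blocks of at most `chunk_size` rows from `dataset`."""
    n_rows = len(dataset)
    for start in range(0, n_rows, chunk_size):
        # Slicing an h5py dataset reads only the requested rows from disk
        yield dataset[start:start + chunk_size]


# Stand-in for PartType4['Coordinates']; ten rows of (x, y, z)
fake_coordinates = [[float(i), 0.0, 0.0] for i in range(10)]
for block in iter_chunks(fake_coordinates, 4):
    print(len(block))
```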

# Units

Units and other information needed to convert the snapshot units to physical units can also be accessed through the .attrs property.

These fields roughly follow the EAGLE convention as described in https://arxiv.org/abs/1706.09899. See section 2.3.8 for an example.

In [31]:
PartType4['Coordinates'].attrs.keys()

<KeysViewHDF5 ['a_exp', 'cgs_conversion_factor', 'cgs_units', 'description', 'description_units', 'h_exp']>
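
In the EAGLE convention, a stored value is converted to physical CGS units by multiplying it by a^a_exp, h^h_exp and the CGS conversion factor. The sketch below assumes these snapshots follow the same rule, which the text above only states roughly, so it should be checked against the `description_units` attribute. The attribute values used here are made up for illustration.

```python
def to_physical_cgs(value, attrs, a, h):
    """Convert a stored value to physical CGS units.

    ASSUMPTION: physical = value * a**a_exp * h**h_exp * cgs_conversion_factor,
    as in the EAGLE convention.
    """
    return value * a ** attrs['a_exp'] * h ** attrs['h_exp'] * attrs['cgs_conversion_factor']


# Hypothetical attribute values, for illustration only; read the real ones from .attrs
fake_attrs = {'a_exp': 1.0, 'h_exp': -1.0, 'cgs_conversion_factor': 10.0}
print(to_physical_cgs(2.0, fake_attrs, a=1.0, h=0.5))
```

With the real file, `attrs` would be `PartType4['Coordinates'].attrs`, while `a` and `h` would come from `Header['Time'][()]` and `Header['HubbleParam'][()]`.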