# Justice League Stellar Merger History

## Charlotte Christensen, June 11 2025

This Jupyter notebook runs Anna Wright's code (/home/christenc/Code/python/AnnaWrite_startrace/RomZoomSHAnalysisScripts) to trace the star particles in the simulated halos back to the halos in which they formed.

From Anna's Email (June 4, 2025)

I managed to get through the pipeline and remember what everything did well enough to comment it today, but didn't have time to test it on a new halo. However, if you'd like to try it in the next 24 hours or so, I've attached a zip file with steps 1-7 of my pipeline (there are 8 files, but TrackDownStars_rz.ipynb and FixHostIDs_rz.py are really two halves of a single step). Step 0 is creating a tangos db for the simulation you'll be working with and my pipeline uses that and the simulation itself to create an hdf5 file with relevant data for each star particle. The most important bits are a unique host ID for each star particle so that stars that formed in the same halo can be grouped together, even if that halo doesn't exist at z=0, and the orbital circularity of each star particle, which I use to identify members of the stellar halo.
I will be testing it tomorrow on one of the newer Romulus Zooms, so there's a good chance I'll be sending you an updated version very soon with any bug fixes :) 
I apologize for how many pieces the pipeline is in - it really isn't all that complicated, but it was adapted from what I did for the FOGGIE sims and I split the steps of that pipeline up so that I could do as much as possible locally (rather than on Pleiades) and so that I could sanity check as often as possible. Please let me know if you have any issues, questions, or suggestions! I'd definitely be eager to hear what Juan does differently!

In [35]:
import os
import socket
hostname = socket.gethostname()
if 'emu' in hostname:
    # os.environ['TANGOS_SIMULATION_FOLDER'] = '/home/ns1917/tangos_sims/rogue.4096g5HbwK1BH_bn/'
    os.environ['TANGOS_SIMULATION_FOLDER'] = '/home/ns1917/tangos_sims/storm.4096g5HbwK1BH_bn/'
    os.environ['TANGOS_DB_CONNECTION'] = '/home/ns1917/Databases/Marvel_BN_N10.db'
    # os.environ['TANGOS_DB_CONNECTION'] = '/home/ns1917/pynbody/Tangos/Marvel_BN_N10.db'
    os.chdir('/home/ns1917/pynbody/AnnaWright_startrace/')
else: # grinnell
    os.environ['TANGOS_SIMULATION_FOLDER'] = '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/'
    # os.environ['TANGOS_DB_CONNECTION'] = '/home/selvani/MAP/Data/Marvel_BN_N10.db'
    os.environ['TANGOS_DB_CONNECTION'] = '/home/selvani/MAP/pynbody/Tangos/Marvel_BN_N10.db'
    os.chdir('/home/selvani/MAP/pynbody/AnnaWright_startrace/')

import tangos
tangos.all_simulations()

[<Simulation("cptmarvel.4096g5HbwK1BH_bn")>,
 <Simulation("rogue.4096g5HbwK1BH_bn")>,
 <Simulation("storm.4096g5HbwK1BH_bn")>]

In [3]:
# tangos.get_simulation("cptmarvel.4096g5HbwK1BH_bn").timesteps

In [37]:
import pynbody as pb
import numpy as np
import pandas as pd
import glob
import h5py
import tangos

# For importing modules
import importlib.util
import sys
from pathlib import Path

# Import the module
# base_path = '/home/selvani/MAP/pynbody/'
if 'emu' in hostname:
    base_path = '/home/ns1917/pynbody/AnnaWright_startrace'
else:
    base_path = '/home/selvani/MAP/pynbody/AnnaWright_startrace'

for root, dirs, files in os.walk(base_path):
    if root not in sys.path:
        print("Adding to sys.path:", root)
        sys.path.append(root)

In [38]:
# Simulation name and path
if 'emu' in hostname:
    simpath = '/home/ns1917/tangos_sims/'
    outfile_dir = "/home/ns1917/pynbody/stellarhalo_trace_aw/"
else:
    simpath = '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/'
    outfile_dir = "/home/selvani/MAP/pynbody/stellarhalo_trace_aw/"

# basename = 'cptmarvel.cosmo25cmb.4096g5HbwK1BH'
# basename = 'rogue.cosmo25cmb.4096g5HbwK1BH'
basename = 'storm.cosmo25cmb.4096g5HbwK1BH'
# ss_dir = 'cptmarvel.4096g5HbwK1BH_bn'#'snapshots_200crit_h329' # same as db_sim?
# ss_dir = 'rogue.4096g5HbwK1BH_bn'
ss_dir = 'storm.4096g5HbwK1BH_bn'
sim_base = simpath + ss_dir + '/'
ss_z0 = sim_base + basename + '.004096'

#### Tangos tests

In [39]:
all_timesteps = tangos.get_simulation(ss_dir).timesteps

In [40]:
timestep = all_timesteps[-1]

In [41]:
halos = timestep.halos.all()

In [42]:
halos_with_stars = [halo for halo in halos if halo.NStar > 0]
halos_with_stars

[<Halo 'storm.4096g5HbwK1BH_bn/storm.cosmo25cmb.4096g5HbwK1BH.004096/halo_1' | NDM=11956794 Nstar=2854124 Ngas=5544225>,
 <Halo 'storm.4096g5HbwK1BH_bn/storm.cosmo25cmb.4096g5HbwK1BH.004096/halo_2' | NDM=6020296 Nstar=581636 Ngas=3055754>,
 <Halo 'storm.4096g5HbwK1BH_bn/storm.cosmo25cmb.4096g5HbwK1BH.004096/halo_3' | NDM=4023224 Nstar=143680 Ngas=1610867>,
 <Halo 'storm.4096g5HbwK1BH_bn/storm.cosmo25cmb.4096g5HbwK1BH.004096/halo_4' | NDM=2840249 Nstar=225480 Ngas=364281>,
 <Halo 'storm.4096g5HbwK1BH_bn/storm.cosmo25cmb.4096g5HbwK1BH.004096/halo_5' | NDM=1674684 Nstar=49882 Ngas=222673>,
 <Halo 'storm.4096g5HbwK1BH_bn/storm.cosmo25cmb.4096g5HbwK1BH.004096/halo_6' | NDM=1521044 Nstar=20212 Ngas=279119>,
 <Halo 'storm.4096g5HbwK1BH_bn/storm.cosmo25cmb.4096g5HbwK1BH.004096/halo_7' | NDM=1473366 Nstar=65167 Ngas=139001>,
 <Halo 'storm.4096g5HbwK1BH_bn/storm.cosmo25cmb.4096g5HbwK1BH.004096/halo_8' | NDM=1407596 Nstar=62251 Ngas=115365>,
 <Halo 'storm.4096g5HbwK1BH_bn/storm.cosmo25cmb.4096g5H

In [43]:
halo = halos_with_stars[0]

In [44]:
halo.all_properties

[<HaloProperty Mvir=6.41e+10 of <Halo 1 of ...>>,
 <HaloProperty Rvir=8.38e+01 of <Halo 1 of ...>>,
 <HaloProperty Xc=9.10e+03 of <Halo 1 of ...>>,
 <HaloProperty Yc=1.27e+04 of <Halo 1 of ...>>,
 <HaloProperty Zc=8.88e+03 of <Halo 1 of ...>>,
 <HaloProperty VXc=3.25e+00 of <Halo 1 of ...>>,
 <HaloProperty VYc=1.43e+02 of <Halo 1 of ...>>,
 <HaloProperty VZc=-1.10e+01 of <Halo 1 of ...>>,
 <HaloProperty Vmax=7.44e+01 of <Halo 1 of ...>>,
 <HaloProperty fMhires=1.00e+00 of <Halo 1 of ...>>,
 <HaloProperty M_gas=5.39e+09 of <Halo 1 of ...>>,
 <HaloProperty M_star=5.74e+08 of <Halo 1 of ...>>,
 <HaloProperty n_gas=5544225 of <Halo 1 of ...>>,
 <HaloProperty n_star=2854124 of <Halo 1 of ...>>,
 <HaloProperty n_dm=11956794 of <Halo 1 of ...>>,
 <HaloProperty npart=20355143 of <Halo 1 of ...>>]

In [45]:
for halo in halos_with_stars:
    if halo['Mvir'] > 1e8:
        print(f"Halo ID: {halo.halo_number}, M_star: {halo['M_star']:.2e}, NStar: {halo.NStar}, Mvir: {halo['Mvir']:.2e}")

Halo ID: 1, M_star: 5.74e+08, NStar: 2854124, Mvir: 6.41e+10
Halo ID: 2, M_star: 1.15e+08, NStar: 581636, Mvir: 3.25e+10
Halo ID: 3, M_star: 2.85e+07, NStar: 143680, Mvir: 2.12e+10
Halo ID: 4, M_star: 4.39e+07, NStar: 225480, Mvir: 1.42e+10
Halo ID: 5, M_star: 9.67e+06, NStar: 49882, Mvir: 8.38e+09
Halo ID: 6, M_star: 4.05e+06, NStar: 20212, Mvir: 7.68e+09
Halo ID: 7, M_star: 1.27e+07, NStar: 65167, Mvir: 7.32e+09
Halo ID: 8, M_star: 1.19e+07, NStar: 62251, Mvir: 6.97e+09
Halo ID: 10, M_star: 4.30e+05, NStar: 2032, Mvir: 6.34e+09
Halo ID: 11, M_star: 6.73e+05, NStar: 3257, Mvir: 4.92e+09
Halo ID: 13, M_star: 1.95e+06, NStar: 10030, Mvir: 4.78e+09
Halo ID: 14, M_star: 1.07e+06, NStar: 5615, Mvir: 3.00e+09
Halo ID: 15, M_star: 5.72e+02, NStar: 3, Mvir: 2.69e+09
Halo ID: 16, M_star: 4.15e+05, NStar: 1949, Mvir: 2.59e+09
Halo ID: 17, M_star: 8.20e+04, NStar: 430, Mvir: 2.63e+09
Halo ID: 20, M_star: 2.90e+06, NStar: 14754, Mvir: 8.96e+09
Halo ID: 22, M_star: 2.25e+05, NStar: 1081, Mvir: 8.8

In [33]:
s = pb.load(sim_base + basename + '.000480')
h = s.halos(halo_numbers='v1')

unique_halos, counts = np.unique(s.s['amiga.grp'][s.s['tform'] > 0], return_counts=True)
iord_counts = pd.Series(counts, index=unique_halos)
print(iord_counts)

-1       353194
 1       257970
 2        83526
 3        40725
 4         8061
 6         6073
 8          537
 9         1820
 10       16863
 11        3160
 13        4459
 14        1374
 15        3621
 16        1429
 17        3429
 20        1311
 21        5798
 23        1049
 24         358
 25          61
 26        1531
 27        2216
 28          42
 32        1285
 36         779
 38         326
 64          27
 88           2
 93          51
 113         28
 147        106
 157          5
 1895       912
dtype: int64


In [32]:
s = pb.load(sim_base + basename + '.000384')
h = s.halos(halo_numbers='v1')

unique_halos, counts = np.unique(s.s['amiga.grp'][s.s['tform'] > 0], return_counts=True)
iord_counts = pd.Series(counts, index=unique_halos)
print(iord_counts)

1       233950
2       140971
3         5322
4        30720
5        21688
6         1555
8        16341
10        1995
11        2775
12        1010
13         542
14        1614
16        1341
18        1934
19         643
20        1137
21         832
23         728
24         595
25        2228
26        1248
30          14
33        1053
38         330
39          39
73          27
84           2
97          51
106         28
113        106
173       1649
1035        70
dtype: int64


In [24]:
!tangos serve

2025-07-30 10:54:24,889 INFO  [matplotlib.font_manager:1639][MainThread] generated new fontManager
Starting server in PID 2281410.
2025-07-30 10:54:25,406 INFO  [waitress:449][MainThread] Serving on http://[::1]:6543
2025-07-30 10:54:25,406 INFO  [waitress:449][MainThread] Serving on http://127.0.0.1:6543
2025-07-30 10:54:38,125 WARNI [waitress.queue:113][MainThread] Task queue depth is 1
2025-07-30 10:54:38,275 WARNI [waitress.queue:113][MainThread] Task queue depth is 1
2025-07-30 10:54:38,461 WARNI [waitress.queue:113][MainThread] Task queue depth is 1
2025-07-30 10:54:38,598 : Tree build complete; total time 0.13s
2025-07-30 10:54:38,598 INFO  [tangos.log:72][waitress-2] Tree build complete; total time 0.13s
2025-07-30 10:54:38,598 :   Progenitor query took 0.09s
2025-07-30 10:54:38,598 INFO  [tangos.log:73][waitress-2]   Progenitor query took 0.09s
2025-07-30 10:54:38,598 :   Property query took 0.01s
2025-07-30 10:54:38,598 INFO  [tangos.log:74][waitress-2]   Property query took 

In [20]:
h[1].derivable_keys()

['HII',
 'HeIII',
 'ne',
 'hetot',
 'hydrogen',
 'feh',
 'oxh',
 'ofe',
 'mgfe',
 'nefe',
 'sife',
 'c_s',
 'c_s_turb',
 'mjeans',
 'mjeans_turb',
 'ljeans',
 'ljeans_turb',
 'U_mag',
 'U_lum_den',
 'B_mag',
 'B_lum_den',
 'V_mag',
 'V_lum_den',
 'R_mag',
 'R_lum_den',
 'I_mag',
 'I_lum_den',
 'J_mag',
 'J_lum_den',
 'H_mag',
 'H_lum_den',
 'K_mag',
 'K_lum_den',
 'u_mag',
 'u_lum_den',
 'g_mag',
 'g_lum_den',
 'r_mag',
 'r_lum_den',
 'i_mag',
 'i_lum_den',
 'z_mag',
 'z_lum_den',
 'y_mag',
 'y_lum_den',
 'r',
 'rxy',
 'vr',
 'v2',
 'vt',
 'ke',
 'te',
 'j',
 'j2',
 'jz',
 'vrxy',
 'vcxy',
 'vphi',
 'vtheta',
 'v_mean',
 'v_disp',
 'v_curl',
 'vorticity',
 'v_div',
 'age',
 'theta',
 'alt',
 'az',
 'cs',
 'mu',
 'p',
 'u',
 'temp',
 'zeldovich_offset',
 'aform',
 'tform',
 'iord_argsort',
 'smooth',
 'rho']

### 1) GrabTF_rz.py

Step 1 of stellar halo pipeline

**What it does:**
- Loads the final snapshot (z=0) of the simulation
- Extracts formation time (`tform`) and particle IDs (`iord`) for all star particles that have `tform > 0` (i.e., actual stars; in these simulations a negative `tform` marks a black hole particle, not a star)
- Converts formation times to Gyr units
- Creates a 2D array with particle IDs in the first row and formation times in the second row
- Saves this data as a `.npy` file named `<simulation_name>_tf.npy`
- Prints the total number of star particles found as a sanity check

**Input:** Simulation snapshot (specifically the final snapshot `.004096`)

**Output:** `<sim>_tf.npy` - NumPy file containing:
- Row 0: Star particle IDs (`iord`)  
- Row 1: Formation times in Gyr (`tform`)

**Purpose:** This creates the foundation dataset that subsequent steps use to trace each star particle back to the halo in which it formed.

* Usage:   `python GrabTF_rz.py <simpath> <output_directory>`
* Example: `python GrabTF_rz.py /path/to/sim/ /path/to/output/`
* Runtime:  ~1 min (cptmarvel), ~5 mins (rogue), ~3 mins (storm)
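
The array layout described above can be sketched with synthetic data. This is a minimal illustration, not the contents of `GrabTF_rz.py`: the helper name `build_tf_array`, the example values, and the output file name are hypothetical, and storing the IDs as floats alongside the times is an assumption about how the two rows share one array.

```python
import os
import tempfile

import numpy as np


def build_tf_array(iord, tform_gyr):
    """Return a 2 x N array: row 0 = particle IDs, row 1 = formation times (Gyr).

    Only particles with tform > 0 are kept. The IDs are stored as floats so
    that both rows fit in one array (an assumption made for this sketch).
    """
    iord = np.asarray(iord)
    tform_gyr = np.asarray(tform_gyr, dtype=float)
    is_star = tform_gyr > 0
    return np.vstack([iord[is_star].astype(float), tform_gyr[is_star]])


# Synthetic stand-in for the z=0 snapshot (hypothetical values).
iord = np.array([101, 102, 103, 104])
tform_gyr = np.array([0.5, -1.0, 3.2, 13.1])

tf = build_tf_array(iord, tform_gyr)
print(f"{tf.shape[1]} stars found!")

# Round trip through a .npy file, as later steps would read it.
out_path = os.path.join(tempfile.mkdtemp(), "example_tf.npy")
np.save(out_path, tf)
loaded = np.load(out_path)
```

Later steps can then read `loaded[0]` as the particle IDs and `loaded[1]` as the formation times.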

In [46]:
# # Import Anna's code, even if not along my path
# file_path = '/home/selvani/MAP/pynbody/AnnaWright_startrace/RomZoomSHAnalysisScripts/GrabTF_rz.py'
# module_name = 'GrabTF_rz'

# spec = importlib.util.spec_from_file_location(module_name, file_path)
# module = importlib.util.module_from_spec(spec)
# spec.loader.exec_module(module)
import GrabTF_rz

In [47]:
GrabTF_rz.main(ss_z0, outfile_dir)

4045429 stars found!
Save to  /home/ns1917/pynbody/stellarhalo_trace_aw/storm.cosmo25cmb.4096g5HbwK1BH_tf.npy


### 2) LocAtCreation_pool_rz.py

<!-- Step 2 of stellar halo pipeline -->
<!-- Identifies the host of each star particle in \<sim\>_tf.npy at the  -->
<!-- time it was formed.  -->
<!-- **Note that what is stored is NOT the amiga.grp 
ID, but the index of that halo in the tangos database. The amiga.grp
ID can be backed out via tangos with sim[stepnum][halonum].finder_id.** 
(CC, I believe that I edited this so now the halo_) -->

<!-- Output: <sim>_stardata_<snapshot>.h5
        where <snapshot> is the first snapshot that a given process
        analyzed. There will be <nproc> of these files generated
        and processes will not necessarily analyze adjacent snapshots -->

<!-- * Usage:   python LocAtCreation_pool_rz.py <sim> optional:<nproc>
* Example: python LocAtCreation_pool_rz.py r634 2 -->

<!-- Includes an optional argument to specify number of processes to run
with; default is 4. Note that this will get reduced if you've specified
more processes than you have snapshots to process. -->

<!-- Note that this has the name of the snapshots directory hardcoded into FindHaloStars.py (L63); it will need to be adjusted.

CC: When I did my edits, I moved code into FindHaloStars so that it can be imported for multiprocessing. -->


Step 2 of stellar halo pipeline

**What it does:**
- Loads the `<sim>_tf.npy` file created in step 1 (containing star particle IDs and formation times)
- Queries the Tangos database to get all available simulation snapshots and their cosmic times
- Determines which snapshots contain newly formed stars by binning star formation times
- For each relevant snapshot, identifies which halo each star particle belonged to at the time it formed
- Extracts additional data for each star: formation position, formation time, and host halo ID
- Converts Amiga halo group IDs to Tangos database indices for consistency
- Handles unbound particles (those not in any halo) by assigning them host ID = -1

**Detailed Process:**
1. **Data Loading**: Loads the `_tf.npy` file containing star particle IDs and formation times
2. **Snapshot Analysis**: Gets all simulation timesteps from Tangos database and sorts by cosmic time
3. **Star Distribution**: Creates histogram of star formation times to identify which snapshots contain new stars
4. **Chunk Creation**: Divides snapshots among multiple processes for parallel processing
<!-- 5. **Multiprocessing Execution**: Each process handles a subset of snapshots independently -->
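
The binning in steps 2 and 3 can be sketched as follows. This is an illustration under stated assumptions rather than the code in `LocAtCreation_pool_rz.py`: the helper name and the times are hypothetical, snapshot times are assumed to be sorted, and every star is assumed to have formed no later than the last snapshot.

```python
import numpy as np


def first_snapshot_after_formation(snap_times_gyr, tform_gyr):
    """Index of the earliest snapshot whose time is >= each formation time."""
    snap_times_gyr = np.asarray(snap_times_gyr, dtype=float)
    tform_gyr = np.asarray(tform_gyr, dtype=float)
    # side="left": a star formed exactly at a snapshot time is assigned to
    # that snapshot rather than the next one.
    return np.searchsorted(snap_times_gyr, tform_gyr, side="left")


snap_times = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical snapshot times (Gyr)
tform = np.array([0.3, 0.9, 3.0, 3.7])       # hypothetical formation times (Gyr)

snap_index = first_snapshot_after_formation(snap_times, tform)

# Number of newly formed stars per snapshot; snapshots with a zero count
# contain no new stars and do not need to be loaded.
stars_per_snapshot = np.bincount(snap_index, minlength=len(snap_times))
snapshots_with_new_stars = np.flatnonzero(stars_per_snapshot)
```

In this example the second snapshot receives no new stars, so only three of the four snapshots would need processing.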

**FindHaloStars Function (called by each process):**
- **Time Matching**: For each snapshot, finds stars that formed between the previous snapshot and current one
- **Particle Matching**: Uses `iord` (particle IDs) to match stars from step 1 with their counterparts in historical snapshots
- **Host Identification**: Determines which halo (`amiga.grp`) each star was in when it formed
- **Database Indexing**: Converts halo IDs to Tangos database indices using a lookup dictionary
- **Position/Time Extraction**: Records formation positions, times, and snapshot locations
- **Data Writing**: Periodically saves data to HDF5 files to manage memory usage
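
The particle-matching and host-lookup steps can be sketched with plain NumPy arrays. The function name, the example arrays, and the `grp_to_db_index` dictionary are hypothetical, and giving host `-1` to stars whose group has no database entry is an assumption of this sketch, not a statement about what `FindHaloStars` does.

```python
import numpy as np


def hosts_at_formation(target_ids, snap_iord, snap_grp, grp_to_db_index):
    """Look up the host of each target star in one snapshot.

    target_ids      : IDs of stars that formed since the previous snapshot
    snap_iord       : IDs of all star particles in the snapshot
    snap_grp        : halo-finder group ID of each particle in the snapshot
    grp_to_db_index : dict mapping group ID -> database index (hypothetical)

    Stars missing from the snapshot, unbound stars, and stars whose group has
    no database entry all get host -1.
    """
    target_ids = np.asarray(target_ids)
    snap_iord = np.asarray(snap_iord)
    snap_grp = np.asarray(snap_grp)

    # Sort the snapshot IDs once so each target can be found by binary search.
    order = np.argsort(snap_iord)
    sorted_iord = snap_iord[order]
    pos = np.clip(np.searchsorted(sorted_iord, target_ids), 0, len(sorted_iord) - 1)
    found = sorted_iord[pos] == target_ids

    hosts = np.full(len(target_ids), -1, dtype=int)
    grp_found = snap_grp[order][pos[found]]
    hosts[found] = np.array(
        [grp_to_db_index.get(int(g), -1) for g in grp_found], dtype=int
    )
    return hosts


snap_iord = np.array([30, 10, 20, 40])
snap_grp = np.array([2, 1, -1, 7])        # -1 marks an unbound particle
target_ids = np.array([10, 20, 40, 99, 30])
grp_to_db_index = {1: 1, 2: 5}            # group 7 has no database entry

hosts = hosts_at_formation(target_ids, snap_iord, snap_grp, grp_to_db_index)
```

Sorting once and using `np.searchsorted` keeps the lookup at O(N log N) instead of comparing every target against every particle.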

<!-- **Key Technical Details:**
- Each process loads the same `_tf.npy` data independently to avoid sharing conflicts
- Uses `np.searchsorted()` for efficient particle ID matching between snapshots
- Creates `fid` dictionary to map Amiga group IDs to Tangos database indices
- Handles missing particles gracefully (assigns host ID = -1 for unbound stars) -->

**Input:** 
- `<sim>_tf.npy` from step 1
- Simulation snapshots (all timesteps)
- Tangos database connection

**Output:** `<sim>_stardata_<snapshot>.h5` files (one per process) containing:
- `particle_IDs`: Star particle IDs (`iord`) of stars formed between snapshot and previous snapshot
- `particle_positions`: 3D positions at formation time (Mpc)
- `particle_creation_times`: Formation times (Gyr)
- `timestep_location`: Snapshot number where star was first found
- `particle_hosts`: Host halo index in Tangos database (-1 for unbound stars)
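
Because every per-process file carries the same five datasets, combining them amounts to concatenating each dataset. The sketch below uses in-memory dictionaries of NumPy arrays as stand-ins for the HDF5 files; the function name and all values are hypothetical, and the duplicate-ID check is an added sanity check rather than something the pipeline is known to do.

```python
import numpy as np

DATASETS = (
    "particle_IDs",
    "particle_positions",
    "particle_creation_times",
    "timestep_location",
    "particle_hosts",
)


def combine_chunks(chunks):
    """Concatenate per-process outputs dataset by dataset."""
    combined = {
        name: np.concatenate([chunk[name] for chunk in chunks]) for name in DATASETS
    }
    ids = combined["particle_IDs"]
    # Each star should be assigned to exactly one snapshot, hence one chunk.
    if np.unique(ids).size != ids.size:
        raise ValueError("a star particle appears in more than one chunk")
    return combined


# Two hypothetical chunks standing in for two stardata files.
chunk_a = {
    "particle_IDs": np.array([11, 12]),
    "particle_positions": np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]),
    "particle_creation_times": np.array([1.2, 1.3]),
    "timestep_location": np.array([288, 288]),
    "particle_hosts": np.array([3, -1]),
}
chunk_b = {
    "particle_IDs": np.array([21]),
    "particle_positions": np.array([[0.7, 0.8, 0.9]]),
    "particle_creation_times": np.array([2.5]),
    "timestep_location": np.array([384]),
    "particle_hosts": np.array([1]),
}

combined = combine_chunks([chunk_a, chunk_b])
```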

<!-- **Performance Features:**
- **Multiprocessing**: Uses all available CPU cores (up to 72 logical cores) for parallel processing
- **Load Balancing**: Randomly shuffles snapshot order to distribute work evenly
- **Memory Management**: Periodically writes data to disk to prevent memory overflow
- **Process Isolation**: Each process works independently to avoid conflicts -->

**Important Notes:**
<!-- - **Host IDs are Tangos database indices, NOT Amiga group IDs** -->
- Multiple output files are created (one per process); later steps combine them. To get one file per snapshot, run with more processes than snapshots (`n_processes > num_snapshots`).
<!-- - Uses multiprocessing for significant speed improvement on multi-core systems -->
<!-- - Automatically handles load balancing by shuffling snapshot order -->
<!-- - Each process creates its own output file named by the first snapshot it processes -->

**Purpose:** This step creates the detailed formation history for each star particle, linking it to its birth halo and enabling stellar halo analysis. The multiprocessing approach significantly reduces computation time for large simulations.
* Usage: `python LocAtCreation_pool_rz.py <simpath> <db_sim_name> <output_dir> [n_processes]`
* Example: `python LocAtCreation_pool_rz.py /path/to/sim/ cptmarvel.4096g5HbwK1BH_bn /output/ 36`
* Runtime: ~1 hour sequential

In [5]:
import LocAtCreation_pool_rz

In [6]:
import psutil
n_cpus = psutil.cpu_count(logical=True)  # logical core count (unused below; 128 is passed so that each snapshot gets its own process)
LocAtCreation_pool_rz.main(simpath, basename, ss_dir, outfile_dir, 128, overwrite=True)

Stars from 42 steps left to deal with
Initializing  42
Shuffled chunks: [['rogue.cosmo25cmb.4096g5HbwK1BH.000288']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.000384']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.002400']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.002880']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.002016']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.002688']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.003168']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.000960']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.000672']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.003360']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.003648']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.000768']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.004032']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.002112']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.002208']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.002784']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.002976']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.003552']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.001152']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.000576']
 ['rogue.cosmo25cmb.4096g5HbwK1BH.003744']
 ['rogue.cosmo25cmb.4096g

Processing:   0%|          | 0/42 [00:00<?, ?chunks/s]

Processing chunk 1/42: ['rogue.cosmo25cmb.4096g5HbwK1BH.000288']
MyFirstStep:  000288
/home/ns1917/tangos_sims/
HaloStarsPath: /home/ns1917/tangos_sims/rogue.4096g5HbwK1BH_bn/rogue.cosmo25cmb.4096g5HbwK1BH.000288
112341 relevant stars in rogue.cosmo25cmb.4096g5HbwK1BH.000288
Host Array: [3 3 3 ... 4 4 4]
Looking through 20837 halos
Amiga Host IDs [-1  1  2  3  4  5  7  8 10 11 13 14 16 17 18 19 20 24 25 42 55]
FID Host IDs [-1  1  2  3  4  5  7  8 10 11 13 14 16 17 18 19 20 24 25 42 55]
Process completed 1 snapshots, output: /home/ns1917/pynbody/stellarhalo_trace_aw/rogue.cosmo25cmb.4096g5HbwK1BH_stardata_000288.h5
  Completed: 000288

Processing chunk 2/42: ['rogue.cosmo25cmb.4096g5HbwK1BH.000384']
MyFirstStep:  000384
/home/ns1917/tangos_sims/
HaloStarsPath: /home/ns1917/tangos_sims/rogue.4096g5HbwK1BH_bn/rogue.cosmo25cmb.4096g5HbwK1BH.000384
312481 relevant stars in rogue.cosmo25cmb.4096g5HbwK1BH.000384
Host Array: [1 1 1 ... 3 3 3]
Looking through 27130 halos
Amiga Host IDs [  1   

### 3) writeouthosts_rz.py

<!-- Step 3 of stellar halo pipeline
For each snapshot, writes out a list of halos that formed stars
between the last snapshot and this one and the number of stars formed;
as in step 2, note that the IDs of these halos will be their index in
the tangos database, not necessarily their amiga.grp ID. This is used
to construct a unique ID for each star-forming halo in the next step.
-->
<!-- Output: <sim>_halostarhosts.txt

Usage:   python writeouthosts_rz.py <sim>
Example: python writeouthosts_rz.py r634
-->
<!-- Note that this is currently set up for MMs, but should be easily adapted
by e.g., changing the paths or adding a path CL argument. -->

Step 3 of stellar halo pipeline

**What it does:**
- Reads all the `<sim>_stardata_*.h5` files created in Step 2 
- For each simulation snapshot, identifies which halos formed new stars between that snapshot and the previous one
- Counts how many stars formed in each halo during each time interval
- Creates a comprehensive timeline of star formation activity across all halos
- Outputs a summary text file listing star-forming halos and their activity over snapshots

**Detailed Process:**
1. **File Discovery**: Locates all `*_stardata_*.h5` files from Step 2 using glob pattern matching
2. **Data Extraction**: For each HDF5 file, extracts:
   - `particle_hosts`: Host halo indices 
   - `timestep_location`: Snapshot numbers where each star particle first appears
3. **Temporal Binning**: Groups star particles by the snapshot where they formed
4. **Halo Counting**: For each snapshot, counts how many stars formed in each unique halo
5. **Output Formatting**: Creates chronologically ordered summary with format:
   ```
   <snapshot_number>    <halo_id_1>,<star_count_1>    <halo_id_2>,<star_count_2>    ...
   ```
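
The binning, counting, and formatting steps above can be sketched with `np.unique(return_counts=True)`. The function name and input arrays are hypothetical, and whether the real script zero-pads the snapshot number is not something this sketch tries to reproduce.

```python
import numpy as np


def format_host_lines(timestep_location, particle_hosts):
    """Build one tab-separated line per snapshot: step, then host,count pairs."""
    timestep_location = np.asarray(timestep_location)
    particle_hosts = np.asarray(particle_hosts)

    lines = []
    # np.unique returns sorted values, so the lines come out in snapshot order.
    for step in np.unique(timestep_location):
        in_step = timestep_location == step
        hosts, counts = np.unique(particle_hosts[in_step], return_counts=True)
        fields = [str(step)] + [f"{h},{c}" for h, c in zip(hosts, counts)]
        lines.append("\t".join(fields))
    return lines


# Hypothetical per-star data: snapshot of first appearance and host at formation.
timestep_location = np.array([3840, 3840, 3840, 96, 96, 3840])
particle_hosts = np.array([1, -1, 1, 2, 2, 3])

lines = format_host_lines(timestep_location, particle_hosts)
```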

<!-- **Key Technical Details:**
- **Vectorized Operations**: Uses `np.unique(return_counts=True)` for efficient halo counting instead of slow loops
- **Memory Efficient**: Processes one HDF5 file at a time to minimize memory usage
- **Chronological Ordering**: Sorts output by snapshot number for temporal analysis
- **Duplicate Handling**: Aggregates data from multiple stardata files that may have overlapping snapshots -->

**Input Files:**
- Multiple `<sim>_stardata_<snapshot>.h5` files from Step 2
- Each file contains star formation data for a subset of simulation snapshots

**Output File:** `<sim>_halostarhosts.txt`
- Text file with tab-separated values
- Each line represents one simulation snapshot
- Format: `<timestep>\t<halo_id>,<count>\t<halo_id>,<count>\t...`
- Example line: `3840    -1,234    1,156    3,89`
  - At snapshot 3840: 234 stars formed unbound (host ID -1), 156 formed in halo 1, and 89 formed in halo 3
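
A line in this format can be read back with a few lines of standard-library Python. The helper name `parse_host_line` is hypothetical; the sketch only assumes whitespace-separated fields of the form `<halo_id>,<count>` after the snapshot number.

```python
def parse_host_line(line):
    """Split one line into (timestep, {halo_id: star_count})."""
    fields = line.split()  # splits on tabs or runs of spaces
    timestep = fields[0]
    counts = {}
    for field in fields[1:]:
        halo_id, n_stars = field.split(",")
        counts[int(halo_id)] = int(n_stars)
    return timestep, counts


timestep, counts = parse_host_line("3840\t-1,234\t1,156\t3,89")
total_new_stars = sum(counts.values())
```

Keeping the snapshot number as a string preserves any zero padding that the file may use.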

<!-- **Data Flow:**
```
Step 2 Output: Multiple *_stardata_*.h5 files
                    ↓
Step 3: Aggregate and summarize by snapshot/halo
                    ↓
Step 3 Output: Single *_halostarhosts.txt file
``` -->

<!-- **Performance Optimizations:**
- **Batch Processing**: Handles large datasets efficiently using vectorized NumPy operations
- **String Building**: Uses `list.join()` instead of repeated string concatenation for speed
- **Efficient I/O**: Single-pass reading of HDF5 files with minimal memory footprint -->

**Purpose:** This step creates a compact summary of star formation activity that enables Step 4 to efficiently track halo merger histories and assign unique IDs to star-forming halos. 
<!-- The chronological format makes it easy to identify:
- Which halos were actively forming stars at each epoch
- How star formation rates varied over cosmic time  
- Which halos contributed most significantly to stellar mass assembly -->
<!-- 
**Example Usage:**
- Input: 40+ `cptmarvel.cosmo25cmb.4096g5HbwK1BH_stardata_*.h5` files
- Output: `cptmarvel.cosmo25cmb.4096g5HbwK1BH_halostarhosts.txt`
- Result: Timeline of ~100 snapshots showing star formation in ~1000s of halos -->

* Usage: `python writeouthosts_rz.py <sim> <output_dir>`
* Example: `python writeouthosts_rz.py cptmarvel.cosmo25cmb.4096g5HbwK1BH /output/path/`
* Runtime: instant

**Important Notes:**
<!-- - **Halo IDs are Tangos database indices**, not Amiga group IDs (consistent with Step 2)
- Output file size is typically much smaller than input HDF5 files (text vs binary format)
- This summary enables efficient processing in Step 4 without re-reading large HDF5 files -->
- Unbound stars (host ID = -1) are included in the summary for completeness

In [7]:
import writeouthosts_rz

In [8]:
writeouthosts_rz.main(basename, odir=outfile_dir)

Found 42 stardata files
Output file: /home/ns1917/pynbody/stellarhalo_trace_aw/rogue.cosmo25cmb.4096g5HbwK1BH_halostarhosts.txt


### 3.5) Find and Fill In Main Progenitors

In [None]:
!tangos serve

In [4]:
tangos.all_simulations()
import tqdm.notebook as tqdm
import halo_trace as ht

In [17]:
# Get all halos in 004096 snapshot
timestep = tangos.get_timestep("cptmarvel.4096g5HbwK1BH_bn/%4096")
all_halos = timestep.halos.all()
print("There are %d halos in the snapshot." % len(all_halos))

# Filter for halos with stars
halos_with_stars = [h for h in all_halos if h.NStar > 0]

print("\nThere are %d halos with stars." % len(halos_with_stars))
for halo in halos_with_stars:
    print("Halo ID: %s, Mass: %1.2eM☉, Stars: %d" % (halo.halo_number, halo['Mvir'], halo.NStar))

There are 12546 halos in the snapshot.

There are 16 halos with stars.
Halo ID: 1, Mass: 1.34e+10M☉, Stars: 205466
Halo ID: 2, Mass: 9.35e+09M☉, Stars: 55462
Halo ID: 4, Mass: 8.28e+09M☉, Stars: 31452
Halo ID: 5, Mass: 6.24e+09M☉, Stars: 45841
Halo ID: 6, Mass: 5.88e+09M☉, Stars: 43645
Halo ID: 7, Mass: 4.46e+09M☉, Stars: 4199
Halo ID: 8, Mass: 6.00e+10M☉, Stars: 10015
Halo ID: 10, Mass: 2.85e+09M☉, Stars: 11867
Halo ID: 11, Mass: 2.61e+09M☉, Stars: 1954
Halo ID: 13, Mass: 1.54e+09M☉, Stars: 1166
Halo ID: 14, Mass: 1.47e+09M☉, Stars: 23
Halo ID: 16, Mass: 4.04e+09M☉, Stars: 3493
Halo ID: 27, Mass: 3.09e+08M☉, Stars: 14
Halo ID: 167, Mass: 3.83e+07M☉, Stars: 24
Halo ID: 455, Mass: 1.04e+07M☉, Stars: 149
Halo ID: 1328, Mass: 3.80e+06M☉, Stars: 15


In [22]:
# Collect the halos with stars at every timestep, keyed by snapshot number
all_timesteps = tangos.get_simulation("cptmarvel.4096g5HbwK1BH_bn").timesteps
halos_stars_dict = {}
for timestep in all_timesteps:
    all_halos = timestep.halos.all()
    # print("There are %d halos in the snapshot." % len(all_halos))

    # Filter for halos with stars
    halos_with_stars = [h for h in all_halos if h.NStar > 0]
    print("Timestep: %s" % timestep.extension[-6:])
    print("There are %d halos with stars.\n" % len(halos_with_stars))
    halos_stars_dict[timestep.extension[-6:]] = halos_with_stars
    # for halo in halos_with_stars:
    #     print("Halo ID: %s, Mass: %1.2eM☉, Stars: %d" % (halo.halo_number, halo['Mvir'], halo.NStar))

Timestep: 000199
There are 17 halos with stars.

Timestep: 000291
There are 20 halos with stars.

Timestep: 000384
There are 20 halos with stars.

Timestep: 000482
There are 19 halos with stars.

Timestep: 000512
There are 19 halos with stars.

Timestep: 000640
There are 20 halos with stars.

Timestep: 000672
There are 20 halos with stars.

Timestep: 000768
There are 20 halos with stars.

Timestep: 000818
There are 20 halos with stars.

Timestep: 000896
There are 19 halos with stars.

Timestep: 001025
There are 20 halos with stars.

Timestep: 001152
There are 20 halos with stars.

Timestep: 001162
There are 20 halos with stars.

Timestep: 001280
There are 20 halos with stars.

Timestep: 001331
There are 18 halos with stars.

Timestep: 001408
There are 19 halos with stars.

Timestep: 001536
There are 18 halos with stars.

Timestep: 001543
There are 18 halos with stars.

Timestep: 001664
There are 18 halos with stars.

Timestep: 001792
There are 18 halos with stars.

Timestep: 001813
The

In [23]:
halos_stars_dict['004096']

[<Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096/halo_1' | NDM=2673328 Nstar=205466 Ngas=341360>,
 <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096/halo_2' | NDM=1809471 Nstar=55462 Ngas=531508>,
 <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096/halo_4' | NDM=1624243 Nstar=31452 Ngas=367221>,
 <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096/halo_5' | NDM=1257579 Nstar=45841 Ngas=116308>,
 <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096/halo_6' | NDM=1179048 Nstar=43645 Ngas=134861>,
 <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096/halo_7' | NDM=897819 Nstar=4199 Ngas=88876>,
 <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096/halo_8' | NDM=279916 Nstar=10015 Ngas=536699>,
 <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096/halo_10' | NDM=582175 Nstar=11867 Ngas=19174>,
 <Ha

### Parallelization Helpers

In [6]:
import math
import pynbody
from multiprocessing import Pool, Lock, Manager, Queue
import tqdm.notebook as tqdm

progress_queue = None
print_lock = None

def _init_worker(lock):
    """Initialize worker with shared lock"""
    global print_lock
    print_lock = lock

def safe_print(*args, **kwargs):
    """Thread-safe print function"""
    with print_lock:
        print(*args, **kwargs)

def safe_update_pbar(pbar, total):
    """Thread-safe update of progress bar"""
    with print_lock:
        pbar.update(total)
        pbar.refresh()

def create_thread_groups(items, n_threads):
    """
    Split a list of items into groups for parallel processing.
    
    This function divides a list of items (typically simulation timesteps) into 
    roughly equal groups that can be processed by different threads or processes
    in parallel. It ensures efficient load balancing across available threads.
    
    Parameters
    ----------
    items : list or array-like
        Items to be processed (e.g., simulation timesteps, file paths)
    n_threads : int
        Number of threads/processes to use for parallel processing
        
    Returns
    -------
    groups : list of lists
        Each sublist contains items for one thread to process. There are at
        most min(n_threads, len(items)) sublists; rounding the chunk size up
        can leave fewer (e.g. 5 items over 4 threads gives 3 groups).
    """
    items = list(items)  # Ensure it's a list
    n_items = len(items)
    
    if n_threads >= n_items:
        # If more threads than items, give each item its own thread
        return [[item] for item in items]
    
    # Calculate items per thread
    items_per_thread = math.ceil(n_items / n_threads)
    
    groups = []
    for i in range(0, n_items, items_per_thread):
        group = items[i:i + items_per_thread]
        groups.append(group)
    
    return groups

def _parallelize(func, item, index, progress_bar=None):
    """
    Worker function for parallel processing with progress tracking.
    
    This is an internal helper function used by parallelize_func() to execute
    a given function on a group of items while tracking progress. It handles
    both single items and lists of items.
    
    Parameters
    ----------
    func : callable
        Function to apply to each item. Should accept a single argument.
    item : object or list
        Single item or list of items to process
    index : int
        Index of the worker process (used for progress bar positioning)
    progress_bar : tqdm.tqdm object, optional
        Progress bar instance for tracking completion. If None and item is a list,
        creates a new progress bar.
        
    Returns
    -------
    index : int
        Returns the worker index (used for process identification) when done
    """
    if type(item) is not str:
        if progress_bar:
            global pbar
        if not progress_bar:
            pbar = tqdm.tqdm(total=len(item), position=index)
        for it in item:
            func(it)
            safe_update_pbar(pbar, 1)
        if not progress_bar:
            pbar.close()
    return index

def parallelize_func(func: callable, items, n_threads: int, group_pbar=False):
    """
    Execute a function in parallel across multiple threads with progress tracking.
    
    This function provides a high-level interface for parallel processing of items
    using multiprocessing.Pool. It automatically handles thread group creation,
    progress bar management, and process coordination.
    
    Parameters
    ----------
    func : callable
        Function to apply to each item. Must accept a single argument.
        Should be picklable for multiprocessing.
    items : list or array-like
        Collection of items to process (e.g., file paths, timesteps)
    n_threads : int
        Number of parallel processes to use
    group_pbar : bool, optional
        If True, shows one progress bar rather than individual ones per thread.
        Default is False.
        
    Returns
    -------
    None
    """
    groups = create_thread_groups(items, n_threads)
    lock = Lock()
    if group_pbar:
        global pbar
        # The bar is advanced once per item, so its total is the item count
        pbar = tqdm.tqdm(total=sum(len(g) for g in groups))
        with Pool(processes=len(groups), initializer=_init_worker, initargs=(lock,)) as pool:
            pool.starmap(_parallelize, [(func, group, i, True) for i, group in enumerate(groups)])
    else:
        with Pool(processes=n_threads, initializer=_init_worker, initargs=(lock,)) as pool:
            pool.starmap(_parallelize, [(func, group, i, None) for i, group in enumerate(groups)])

def _process_step(step):
    """
    Process a single simulation timestep to create amiga.grp halo grouping files.
    
    This function loads a simulation snapshot, identifies halos using pynbody's
    halo finder, and writes the halo group information to an amiga.grp file.
    
    Parameters
    ----------
    step : str
        Timestep identifier/extension (e.g., '004096')
        Used to construct the full file path: sim_base + step
        
    Returns
    -------
    None
        Function operates by side effect, writing amiga.grp files
    """
    path = sim_base + step
    # print(path)
    # time.sleep(1)
    safe_print('Loading <{}>'.format(step))
    f = pynbody.load(path)
    safe_print('  Loading halos for <{}>'.format(step))
    try:
        h = f.halos(halo_numbers='v1')
        safe_print('    Found halos for <{}>: {}'.format(step, h))
        # safe_print('    Writing amiga.grp for <{}>'.format(step))
        # f['amiga.grp'] = h.get_group_array()
        # f['amiga.grp'].write(overwrite=True)
        # safe_print('    Finished writing amiga.grp for <{}>'.format(step))
    except Exception as e:
        safe_print('  ERROR loading halos for <{}>: {}'.format(step, e))
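
A quick illustration of how the ceiling-based chunking in `create_thread_groups` behaves. The helper is repeated here in condensed form so the snippet runs on its own, and the item names are made-up placeholders rather than real snapshot extensions. It is only a sketch of the chunking arithmetic.

```python
import math

def create_thread_groups(items, n_threads):
    """Condensed copy of the helper above: consecutive chunks of ceil(n/k) items."""
    items = list(items)
    if n_threads >= len(items):
        return [[item] for item in items]
    per_group = math.ceil(len(items) / n_threads)
    return [items[i:i + per_group] for i in range(0, len(items), per_group)]

# Hypothetical placeholder items
steps = [f"step{i}" for i in range(9)]
groups = create_thread_groups(steps, 4)
group_sizes = [len(g) for g in groups]
print(group_sizes)
```

With 9 items and 4 requested threads the chunk size rounds up to 3, so only 3 groups are produced. The number of groups can therefore be smaller than `n_threads`.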


In [7]:
import math
import time
import pynbody
from multiprocessing import Pool, Lock
import tqdm.notebook as tqdm # Using notebook-compatible tqdm

# This should be defined in your main script execution block
# sim_base = "/path/to/your/simulation/data/snapshot_" 

# Global lock for safe printing in the multi-pbar case
print_lock = None

def _init_worker_lock(lock):
    """Makes the lock global for use in worker processes."""
    global print_lock
    print_lock = lock

def safe_print(*args, **kwargs):
    """Thread-safe print function, requires the lock to be initialized."""
    with print_lock:
        print(*args, **kwargs)

def create_thread_groups(items, n_threads):
    """
    Splits a list of items into groups for parallel processing.
    """
    items = list(items)
    n_items = len(items)
    
    if n_threads >= n_items:
        return [[item] for item in items]
    
    items_per_thread = math.ceil(n_items / n_threads)
    groups = []
    for i in range(0, n_items, items_per_thread):
        group = items[i:i + items_per_thread]
        groups.append(group)
    
    return groups

def worker_task_single_pbar(args):
    """
    Worker for the single-pbar case.
    It applies a function to a group of items and returns the number processed.
    It does NOT touch any progress bar.
    """
    func, item_group = args
    results = []
    for item in item_group:
        results.append(func(item))
    return len(item_group) # Report how many items were processed

def worker_task_multi_pbar(args):
    """
    Worker for the multi-pbar case.
    It creates and manages its own progress bar for its assigned items.
    The shared lock is installed by the Pool initializer, not passed in args.
    """
    func, item_group, index = args
    
    with tqdm.tqdm(total=len(item_group), position=index, desc=f"Process {index+1}") as pbar:
        for item in item_group:
            func(item)
            pbar.update(1)

def parallelize_func(func: callable, items, n_threads: int, group_pbar: bool = False):
    """
    Executes a function in parallel with a choice of progress bar display.

    Parameters
    ----------
    func : callable
        Function to apply to each item. Must be picklable.
    items : list
        A list of items to process.
    n_threads : int
        Number of parallel processes to use.
    group_pbar : bool
        If True, shows one progress bar for the entire task.
        If False (default), shows a separate progress bar for each process.
    """
    groups = create_thread_groups(items, n_threads)
    
    if group_pbar:
        # --- SINGLE, CENTRALIZED PBAR (RACE-CONDITION-FREE) ---
        with Pool(processes=n_threads) as pool:
            # Create the progress bar in the main process
            with tqdm.tqdm(total=len(items), desc="Overall Progress") as pbar:
                # pool.imap_unordered yields each group's result as soon as it completes,
                # which allows real-time updates to the progress bar
                args_list = [(func, g) for g in groups]
                for num_processed in pool.imap_unordered(worker_task_single_pbar, args_list):
                    pbar.update(num_processed)
    else:
        # --- MULTIPLE PBARS (ONE PER PROCESS) ---
        # A multiprocessing.Lock cannot be pickled into pool.map arguments
        # (it raises RuntimeError), so hand it to each worker through the
        # Pool initializer instead.
        lock = Lock()
        with Pool(processes=len(groups), initializer=_init_worker_lock, initargs=(lock,)) as pool:
            args_list = [(func, group, i) for i, group in enumerate(groups)]
            pool.map(worker_task_multi_pbar, args_list)

# --- Your Specific Processing Function ---
def _process_step(step):
    """
    Processes a single simulation timestep.
    NOTE: `sim_base` must be a global variable or passed differently.
          Making it global is simplest for multiprocessing if it's read-only.
    """
    try:
        path = sim_base + step
        # Use safe_print if you need console output during processing
        # safe_print(f'Loading <{step}>') 
        f = pynbody.load(path)
        h = f.halos(halo_numbers='v1')
        # f['amiga.grp'] = h.get_group_array()
        # f['amiga.grp'].write(overwrite=True)
    except Exception as e:
        # It's safer to print errors to see what failed
        print(f'ERROR processing step <{step}>: {e}')
    return step # Return something to confirm completion
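
The single-bar branch of `parallelize_func` works because only the parent process touches the progress bar: workers just report how many items they finished. A minimal sketch of that pattern follows. It swaps in `multiprocessing.pool.ThreadPool` for `Pool` so it runs anywhere without pickling concerns, and a plain integer counter stands in for the tqdm bar. The names `process_group` and `run_with_central_counter` are hypothetical and not part of the notebook.

```python
from multiprocessing.pool import ThreadPool

def process_group(args):
    """Apply func to every item in one group and report how many were handled."""
    func, group = args
    for item in group:
        func(item)
    return len(group)

def run_with_central_counter(func, groups, n_workers):
    """Only the caller updates the counter, so no lock is needed."""
    done = 0
    with ThreadPool(processes=n_workers) as pool:
        # Results arrive in completion order, one per group
        for n in pool.imap_unordered(process_group, [(func, g) for g in groups]):
            done += n  # a tqdm bar would call pbar.update(n) here
    return done

total_done = run_with_central_counter(len, [["a", "b"], ["c"], ["d", "e", "f"]], n_workers=2)
print(total_done)
```

The same `imap_unordered` call shape applies to a process-based `Pool`, provided the worker function and its arguments are picklable.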


In [8]:
steps = [step.extension for step in tangos.get_simulation("cptmarvel.4096g5HbwK1BH_bn").timesteps]

In [11]:
parallelize_func(_process_step, steps, n_threads=8, group_pbar=False)

In [9]:
# !tangos serve

### NEW

In [49]:
# Get all halos in 004096 snapshot
timestep = tangos.get_timestep("cptmarvel.4096g5HbwK1BH_bn/%4096")
all_halos = timestep.halos.all()
print("There are %d halos in the snapshot." % len(all_halos))

# Filter for halos with stars
halos_with_stars = [h for h in all_halos if h.NStar > 0]

There are 12546 halos in the snapshot.


In [86]:
import warnings
from sqlalchemy.exc import SAWarning
import tangos.examples.mergers as mergers
import pandas as pd

warnings.filterwarnings("ignore", category=SAWarning)

for halo in tqdm.tqdm(halos_with_stars):
    datab = pd.DataFrame(columns=["snapshot", "time_gyr", "main_halo_num", "merging_halo_num", "main_mstar", "merging_mstar", "main_mvir", "merging_mvir", "main_mgas", "merging_mgas", "redshift", "main_haloid", "merging_haloid"])
    save_path = outfile_dir + 'mergers/' + halo.timestep.extension + '_' + str(halo.halo_number) + '.csv'
    # print("\nMain Progenitor z=0 Halo ID: %s" % (halo.halo_number))
    redshift, ratio, progenitor_halos = mergers.get_mergers_of_major_progenitor(halo)
    # each item of progenitor_halos is a pair; the first is the major progenitor, the second is the thing merging into it
    progenitor_halos = [x for x in progenitor_halos if x[1].NStar > 0]
    print(progenitor_halos)
    # Split the already-filtered pairs without filtering again, so that
    # main_structures[i] stays paired with merging_structures[i]
    merging_structures = [x[1] for x in progenitor_halos]
    main_structures = [x[0] for x in progenitor_halos]
    if len(merging_structures) == 0:
        continue
    else:
        print("\nMain Progenitor z=0 Halo ID: %s" % (halo.halo_number))
        print("There are %d mergers into the major progenitor branch.\n" % len(merging_structures))

        for i, merging_halo in enumerate(merging_structures):
            snapshot = merging_halo.timestep.extension[-6:]
            print("Merging Halo %d/%d at Snapshot %s" % (i+1, len(merging_structures), snapshot))
            # Mvir may be missing for some halos; record NaN in that case
            try:
                merging_mvir = merging_halo['Mvir']
            except Exception:
                merging_mvir = np.nan
            try:
                main_mvir = progenitor_halos[i][0]['Mvir']
            except Exception:
                main_mvir = np.nan
            row = [snapshot, merging_halo.timestep.time_gyr, 
                   main_structures[i].halo_number, merging_halo.halo_number, 
                   main_structures[i]['M_star'], merging_halo['M_star'], 
                   main_mvir, merging_mvir, 
                   main_structures[i]['M_gas'], merging_halo['M_gas'], 
                   merging_halo.timestep.redshift,
                   main_structures[i].id, merging_halo.id]

            datab.loc[len(datab)] = row

            merging_halo_num = merging_halo.halo_number
            progenitor_halo_num = main_structures[i].halo_number
            # print(merging_halo.keys())
            # print("  Mass: %f" % merging_halo['Mvir'])
            print("  Merging Halo ID: %s, NStars: %d, NGas: %d" % (merging_halo_num, merging_halo.NStar, merging_halo.NGas))
            print("  Main Progenitor Halo ID: %s, NStars: %d, NGas: %d" % (progenitor_halo_num, main_structures[i].NStar, main_structures[i].NGas))
            # print("  Redshift: %f" % redshift[i])
            # print("  Ratio: %f" % ratio[i])
            print()
        # sort by time_gyr
        datab = datab.sort_values(by="time_gyr",ignore_index=True)
        print("Saving to %s" % save_path)
        datab.to_csv(save_path, index=False)
        print(datab)

  0%|          | 0/16 [00:00<?, ?it/s]

[(<Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384/halo_4' | NDM=119930 Nstar=1372 Ngas=105353>, <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384/halo_9' | NDM=74503 Nstar=7531 Ngas=58044>), (<Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384/halo_4' | NDM=119930 Nstar=1372 Ngas=105353>, <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384/halo_8' | NDM=78020 Nstar=2454 Ngas=66317>), (<Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384/halo_4' | NDM=119930 Nstar=1372 Ngas=105353>, <Halo 'cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384/halo_14' | NDM=23121 Nstar=233 Ngas=34652>)]

Main Progenitor z=0 Halo ID: 1
There are 3 mergers into the major progenitor branch.

Merging Halo 1/3 at Snapshot 000384
  Merging Halo ID: 9, NStars: 7531, NGas: 58044
  Main Progenitor Halo ID: 4, NStars: 1372, NGas: 105353

Merging Halo 2/3 at Snapshot 000384
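
The cell above relies on each entry of `progenitor_halos` being a (main progenitor, merging halo) pair. The sketch below shows why it matters to filter whole pairs rather than each side separately. `FakeHalo` is a hypothetical stand-in for a tangos halo and all numbers are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a tangos halo: only the fields used here
FakeHalo = namedtuple("FakeHalo", ["halo_number", "NStar"])

# (main progenitor, merging halo) pairs; numbers are made up
pairs = [
    (FakeHalo(1, 500), FakeHalo(7, 0)),    # merging halo has no stars
    (FakeHalo(2, 0), FakeHalo(9, 40)),     # main progenitor has no stars yet
    (FakeHalo(3, 800), FakeHalo(12, 25)),
]

# Filtering each side separately can shift one list relative to the other
merging_sep = [m for _, m in pairs if m.NStar > 0]
main_sep = [p for p, _ in pairs if p.NStar > 0]
misaligned = [(p.halo_number, m.halo_number) for p, m in zip(main_sep, merging_sep)]

# Filtering whole pairs once keeps each merging halo with its own main progenitor
kept = [(p, m) for p, m in pairs if m.NStar > 0]
aligned = [(p.halo_number, m.halo_number) for p, m in kept]

print(misaligned)
print(aligned)
```

In this toy case the separately filtered lists pair merging halo 9 with main halo 1, while the pair-wise filter keeps it with main halo 2.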
  

In [69]:
len(merging_structures)

0

In [60]:
merging_halo.all_properties

[<HaloProperty Xc=9.24e+03 of <Halo 32 of ...>>,
 <HaloProperty Yc=8.95e+03 of <Halo 32 of ...>>,
 <HaloProperty Zc=9.20e+03 of <Halo 32 of ...>>,
 <HaloProperty VXc=-9.98e+00 of <Halo 32 of ...>>,
 <HaloProperty VYc=-8.00e-02 of <Halo 32 of ...>>,
 <HaloProperty VZc=2.22e+01 of <Halo 32 of ...>>,
 <HaloProperty Vmax=1.26e+01 of <Halo 32 of ...>>,
 <HaloProperty fMhires=1.00e+00 of <Halo 32 of ...>>,
 <HaloProperty M_gas=1.69e+06 of <Halo 32 of ...>>,
 <HaloProperty M_star=5.34e+03 of <Halo 32 of ...>>,
 <HaloProperty n_gas=1648 of <Halo 32 of ...>>,
 <HaloProperty n_star=22 of <Halo 32 of ...>>,
 <HaloProperty n_dm=5976 of <Halo 32 of ...>>,
 <HaloProperty npart=7646 of <Halo 32 of ...>>,
 <HaloProperty shrink_center (array) of <Halo 32 of ...>>,
 <HaloProperty max_radius=1.32e+00 of <Halo 32 of ...>>]

In [74]:
progenitor_halos[i][0].timestep

IndexError: list index out of range

In [77]:
halodict['000384'][0].timestep.redshift

4.812711810726945

In [9]:
import warnings
from sqlalchemy.exc import SAWarning

warnings.filterwarnings("ignore", category=SAWarning)

halodict = {}

def safe_add(d, key, value):
    """Append value to the list stored under key, creating the list if needed."""
    if key not in d:
        d[key] = [value]
    else:
        d[key].append(value)
    return d

import tangos.examples.mergers as mergers

for halo in tqdm.tqdm(halos_with_stars):
    # print("\nMain Progenitor z=0 Halo ID: %s" % (halo.halo_number))
    redshift, ratio, progenitor_halos = mergers.get_mergers_of_major_progenitor(halo)
    # each item of progenitor_halos is a pair; the first is the major progenitor, the second is the thing merging into it
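    # Filter whole pairs so progenitor_halos[i] stays matched to merging_structures[i] below
    progenitor_halos = [x for x in progenitor_halos if x[1].NStar > 0]
    merging_structures = [x[1] for x in progenitor_halos]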
    if len(merging_structures) == 0:
        pass
        # print("There are no mergers into the major progenitor branch.\n")
    else:
        print("\nMain Progenitor z=0 Halo ID: %s" % (halo.halo_number))
        print("There are %d mergers into the major progenitor branch.\n" % len(merging_structures))

        for i, merging_halo in enumerate(merging_structures):
            snapshot = merging_halo.timestep.extension[-6:]
            print("Merging Halo %d/%d at Snapshot %s" % (i+1, len(merging_structures), snapshot))
            halodict = safe_add(halodict, snapshot, merging_halo)
            
            merging_halo_num = merging_halo.halo_number
            progenitor_halo_num = progenitor_halos[i][0].halo_number
            # print(merging_halo.keys())
            # print("  Mass: %f" % merging_halo['Mvir'])
            print("  Merging Halo ID: %s, NStars: %d, NGas: %d" % (merging_halo_num, merging_halo.NStar, merging_halo.NGas))
            print("  Main Progenitor Halo ID: %s, NStars: %d, NGas: %d" % (progenitor_halo_num, progenitor_halos[i][0].NStar, progenitor_halos[i][0].NGas))
            # print("  Redshift: %f" % redshift[i])
            # print("  Ratio: %f" % ratio[i])
            print()

# for halo in tqdm.tqdm(halos_with_stars):
#     pass
    # print_main_progenitor_info(halo)

  0%|          | 0/16 [00:00<?, ?it/s]


Main Progenitor z=0 Halo ID: 1
There are 3 mergers into the major progenitor branch.

Merging Halo 1/3 at Snapshot 000384
  Merging Halo ID: 9, NStars: 7531, NGas: 58044
  Main Progenitor Halo ID: 1, NStars: 163833, NGas: 346127

Merging Halo 2/3 at Snapshot 000384
  Merging Halo ID: 8, NStars: 2454, NGas: 66317
  Main Progenitor Halo ID: 1, NStars: 110643, NGas: 372171

Merging Halo 3/3 at Snapshot 000384
  Merging Halo ID: 14, NStars: 233, NGas: 34652
  Main Progenitor Halo ID: 1, NStars: 88821, NGas: 311008


Main Progenitor z=0 Halo ID: 2
There are 4 mergers into the major progenitor branch.

Merging Halo 1/4 at Snapshot 001280
  Merging Halo ID: 17, NStars: 1072, NGas: 45664
  Main Progenitor Halo ID: 2, NStars: 44474, NGas: 518779

Merging Halo 2/4 at Snapshot 001162
  Merging Halo ID: 14, NStars: 887, NGas: 40851
  Main Progenitor Halo ID: 2, NStars: 43626, NGas: 518262

Merging Halo 3/4 at Snapshot 001025
  Merging Halo ID: 13, NStars: 780, NGas: 40409
  Main Progenitor Halo I

In [10]:
halodict.keys()

dict_keys(['000384', '001280', '001162', '001025', '000896', '002176', '002048', '001920', '001813', '001543', '001408', '000672', '000640', '000512', '000482', '000199', '001331', '001152', '000291'])

In [11]:
all_timesteps = tangos.get_simulation("cptmarvel.4096g5HbwK1BH_bn").timesteps
all_timesteps = [ts.extension[-6:] for ts in all_timesteps]
tsteps_used = list(halodict.keys())

# maxim = '001408'
# select_timesteps = [ts for ts in all_timesteps if int(ts) <= int(maxim)]
# select_timesteps

In [12]:
# Sort the snapshot extensions numerically, latest snapshot first
tsteps_used = sorted(tsteps_used, key=int, reverse=True)
tsteps_used

['002176',
 '002048',
 '001920',
 '001813',
 '001543',
 '001408',
 '001331',
 '001280',
 '001162',
 '001152',
 '001025',
 '000896',
 '000672',
 '000640',
 '000512',
 '000482',
 '000384',
 '000291',
 '000199']
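
The two steps above, grouping merging halos by snapshot and then ordering the snapshots latest first, can be sketched in a few lines. This version uses `collections.defaultdict` in place of `safe_add`, which is a swapped-in alternative rather than what the notebook does, and the records are a small illustrative sample.

```python
from collections import defaultdict

# Illustrative (snapshot extension, halo number) records for merging halos
records = [("000384", 9), ("001280", 17), ("000384", 8), ("002176", 49)]

by_snapshot = defaultdict(list)  # plays the role of safe_add
for snapshot, halo_number in records:
    by_snapshot[snapshot].append(halo_number)

# key=int orders by numeric value; reverse=True puts the latest snapshot first
ordered = sorted(by_snapshot, key=int, reverse=True)
print(ordered)
print(by_snapshot["000384"])
```

Because the extensions are zero-padded to six digits, a plain string sort would give the same order here. `key=int` keeps the ordering correct even if the padding ever differs.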

### Halo Trace (Elaad)

In [34]:
# parallelize_func(print_main_progenitor_info, all_halos, n_threads=20, group_pbar=False)

In [13]:
# sim_base = '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/snapshots_200crit_cptmarvel'
sim_base = '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn'
# ahf_dir = None
# ahf_dir = 'ahf_200'
# grplist = None

In [14]:
def trace_step(ts):
    print(ts)
    grplist = np.array([halo.halo_number for halo in halodict[ts]])
    print(grplist)
    steplist = [int(tstep) for tstep in all_timesteps if int(tstep) <= int(ts)]
    print(steplist)
    trace = ht.tracing.trace_halos(sim_base=sim_base,
                                   grplist=grplist,
                                   steplist=steplist)
    return 0
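
The `steplist` built inside `trace_step` keeps only snapshots at or before the one where the merger was identified, so each halo is traced backward from that point. A minimal sketch of the selection follows. `steps_up_to` is a hypothetical helper name, and the list is a short subset of extensions written as zero-padded strings.

```python
# Short illustrative subset of snapshot extensions (the real list comes from tangos)
all_timesteps = ["000199", "000291", "000384", "000482", "000512", "000640"]

def steps_up_to(ts, timesteps):
    """Return integer snapshot numbers at or before `ts` (hypothetical helper)."""
    return [int(t) for t in timesteps if int(t) <= int(ts)]

steplist = steps_up_to("000482", all_timesteps)
print(steplist)
```

Comparing as integers avoids relying on the string padding, and converting to `int` also drops the leading zeros in the resulting list.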

In [15]:
for i in tqdm.tqdm(tsteps_used):
    trace_step(i)

  0%|          | 0/19 [00:00<?, ?it/s]

002176
[49]
[199, 291, 384, 482, 512, 640, 672, 768, 818, 896, 1025, 1152, 1162, 1280, 1331, 1408, 1536, 1543, 1664, 1792, 1813, 1920, 2048, 2162, 2176]
cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.002176', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.002162', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.002048', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.001920', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.00181

pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


Tracing halos:   0%|          | 0/12 [00:00<?, ?step/s]

Starting step 001152
Advanced step 001025


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
108
	done matching halos


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
108
	Zeros in Ni: 5, Zeros in Nj: 3
Finished

Starting step 001025


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
93
	Zeros in Ni: 5, Zeros in Nj: 3
	done matching halos
Cross-checking step 001025

Finished

Starting step 000896
Advanced step 000818


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 3, Zeros in Nj: 2
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 7, Zeros in Nj: 5
Finished

Starting step 000818


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 4, Zeros in Nj: 3
	done matching halos
Cross-checking step 000818

Finished

Starting step 000768
Advanced step 000672


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 2, Zeros in Nj: 0
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 10, Zeros in Nj: 3
Finished

Starting step 000672


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 9, Zeros in Nj: 3
	done matching halos
Cross-checking step 000672

Finished

Starting step 000640
Advanced step 000512


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 1, Zeros in Nj: 1
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 12, Zeros in Nj: 3
Finished

Starting step 000512


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
39
	Zeros in Ni: 11, Zeros in Nj: 2
	done matching halos
Cross-checking step 000512

Finished

Starting step 000482
Advanced step 000384


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
30
	Zeros in Ni: 6, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
30
	Zeros in Ni: 19, Zeros in Nj: 6
Finished

Starting step 000384


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
31
	Zeros in Ni: 15, Zeros in Nj: 6
	done matching halos
Cross-checking step 000384

Finished

Starting step 000291


  merit_f = mat**2/np.outer(Ni, Nj)


Advanced step 000199


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 16, Zeros in Nj: 8
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)


	Matching halos by merit function
43
	Zeros in Ni: 27, Zeros in Nj: 6
Finished

Starting step 000199
	Matching halos by merit function
36
	Zeros in Ni: 23, Zeros in Nj: 5
	done matching halos
Cross-checking step 000199

Finished



  merit_f = mat**2/np.outer(Ni, Nj)


001152
[62]
[199, 291, 384, 482, 512, 640, 672, 768, 818, 896, 1025, 1152]
cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.001152', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.001025', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000896', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000818', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000768', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5Hb

pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


Tracing halos:   0%|          | 0/11 [00:00<?, ?step/s]

Starting step 001025
Advanced step 000896


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
93
	Zeros in Ni: 5, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
93
	Zeros in Ni: 8, Zeros in Nj: 4
Finished

Starting step 000896


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 3, Zeros in Nj: 2
	done matching halos
Cross-checking step 000896

Finished

Starting step 000818
Advanced step 000768


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 4, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 6, Zeros in Nj: 3
Finished

Starting step 000768


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 2, Zeros in Nj: 0
	done matching halos
Cross-checking step 000768

Finished

Starting step 000672
Advanced step 000640


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 9, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 9, Zeros in Nj: 3
Finished

Starting step 000640


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 1, Zeros in Nj: 1
	done matching halos
Cross-checking step 000640

Finished

Starting step 000512
Advanced step 000482


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
39
	Zeros in Ni: 11, Zeros in Nj: 2
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
39
	Zeros in Ni: 15, Zeros in Nj: 1
Finished

Starting step 000482


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
30
	Zeros in Ni: 6, Zeros in Nj: 3
	done matching halos
Cross-checking step 000482

Finished

Starting step 000384
Advanced step 000291


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
31
	Zeros in Ni: 15, Zeros in Nj: 6
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
31
	Zeros in Ni: 26, Zeros in Nj: 6
Finished

Starting step 000291


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 16, Zeros in Nj: 8
	done matching halos
Cross-checking step 000291

Finished

Starting step 000199


  merit_f = mat**2/np.outer(Ni, Nj)


	Matching halos by merit function
36
	Zeros in Ni: 23, Zeros in Nj: 5
	done matching halos
Cross-checking step 000199

Finished

001025
[13 28]
[199, 291, 384, 482, 512, 640, 672, 768, 818, 896, 1025]


  merit_f = mat**2/np.outer(Ni, Nj)
  df.loc[mask, step[-6:]] = df2.loc[mask, step[-6:]]
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->integer,key->block1_values] [items->Index(['000199'], dtype='object')]

  df.to_hdf(save_file, key='ids')


cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.001025', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000896', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000818', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000768', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000672', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000640'

pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


Tracing halos:   0%|          | 0/10 [00:00<?, ?step/s]

Starting step 000896
Advanced step 000818


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 3, Zeros in Nj: 2
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 7, Zeros in Nj: 5
Finished

Starting step 000818


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 4, Zeros in Nj: 3
	done matching halos
Cross-checking step 000818

Finished

Starting step 000768
Advanced step 000672


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 2, Zeros in Nj: 0
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
42
	Zeros in Ni: 10, Zeros in Nj: 3
Finished

Starting step 000672


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 9, Zeros in Nj: 3
	done matching halos
Cross-checking step 000672

Finished

Starting step 000640
Advanced step 000512


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 1, Zeros in Nj: 1
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 12, Zeros in Nj: 3
Finished

Starting step 000512


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
39
	Zeros in Ni: 11, Zeros in Nj: 2
	done matching halos
Cross-checking step 000512

Finished

Starting step 000482
Advanced step 000384


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
30
	Zeros in Ni: 6, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
30
	Zeros in Ni: 19, Zeros in Nj: 6
Finished

Starting step 000384


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
31
	Zeros in Ni: 15, Zeros in Nj: 6
	done matching halos
Cross-checking step 000384

Finished

Starting step 000291
Advanced step 000199


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
43
	Zeros in Ni: 16, Zeros in Nj: 8
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)


	Matching halos by merit function
43
	Zeros in Ni: 27, Zeros in Nj: 6
Finished

Starting step 000199
	Matching halos by merit function
36
	Zeros in Ni: 23, Zeros in Nj: 5
	done matching halos
Cross-checking step 000199

Finished

000896
[12]
[199, 291, 384, 482, 512, 640, 672, 768, 818, 896]


  merit_f = mat**2/np.outer(Ni, Nj)


cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000896', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000818', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000768', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000672', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000640', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000512'

pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


Tracing halos:   0%|          | 0/9 [00:00<?, ?step/s]

Starting step 000818
Advanced step 000768


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
18
	Zeros in Ni: 4, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
18
	Zeros in Ni: 6, Zeros in Nj: 3
Finished

Starting step 000768


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
15
	Zeros in Ni: 2, Zeros in Nj: 0
	done matching halos
Cross-checking step 000768

Finished

Starting step 000672
Advanced step 000640


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
15
	Zeros in Ni: 9, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
15
	Zeros in Ni: 9, Zeros in Nj: 3
Finished

Starting step 000640


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
15
	Zeros in Ni: 1, Zeros in Nj: 1
	done matching halos
Cross-checking step 000640

Finished

Starting step 000512
Advanced step 000482


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
15
	Zeros in Ni: 11, Zeros in Nj: 2
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
15
	Zeros in Ni: 15, Zeros in Nj: 1
Finished

Starting step 000482


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
16
	Zeros in Ni: 6, Zeros in Nj: 3
	done matching halos
Cross-checking step 000482

Finished

Starting step 000384
Advanced step 000291


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
18
	Zeros in Ni: 15, Zeros in Nj: 6
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
18
	Zeros in Ni: 26, Zeros in Nj: 6
Finished

Starting step 000291


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
25
	Zeros in Ni: 16, Zeros in Nj: 8
	done matching halos
Cross-checking step 000291

Finished

Starting step 000199


  merit_f = mat**2/np.outer(Ni, Nj)


	Matching halos by merit function
33
	Zeros in Ni: 23, Zeros in Nj: 5
	done matching halos
Cross-checking step 000199

Finished

000672
[121]
[199, 291, 384, 482, 512, 640, 672]


  merit_f = mat**2/np.outer(Ni, Nj)
  df.loc[mask, step[-6:]] = df2.loc[mask, step[-6:]]
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->integer,key->block1_values] [items->Index(['000199'], dtype='object')]

  df.to_hdf(save_file, key='ids')


cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000672', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000640', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000512', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000482', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000291'

pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


Tracing halos:   0%|          | 0/6 [00:00<?, ?step/s]

Starting step 000640
Advanced step 000512


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
181
	Zeros in Ni: 6, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
181
	Zeros in Ni: 24, Zeros in Nj: 4
Finished

Starting step 000512


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
100
	Zeros in Ni: 11, Zeros in Nj: 2
	done matching halos
Cross-checking step 000512

Finished

Starting step 000482
Advanced step 000384


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
34
	Zeros in Ni: 6, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
34
	Zeros in Ni: 19, Zeros in Nj: 6
Finished

Starting step 000384


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
25
	Zeros in Ni: 15, Zeros in Nj: 6
	done matching halos
Cross-checking step 000384

Finished

Starting step 000291
Advanced step 000199


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
24
	Zeros in Ni: 16, Zeros in Nj: 8
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)


	Matching halos by merit function
24
	Zeros in Ni: 27, Zeros in Nj: 6
Finished

Starting step 000199
	Matching halos by merit function
22
	Zeros in Ni: 23, Zeros in Nj: 5
	done matching halos
Cross-checking step 000199

Finished

000640
[67]
[199, 291, 384, 482, 512, 640]


  merit_f = mat**2/np.outer(Ni, Nj)


cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000640', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000512', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000482', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000291', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000199'

pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


Tracing halos:   0%|          | 0/5 [00:00<?, ?step/s]

Starting step 000512
Advanced step 000482


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
100
	Zeros in Ni: 11, Zeros in Nj: 2
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
100
	Zeros in Ni: 15, Zeros in Nj: 1
Finished

Starting step 000482


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
34
	Zeros in Ni: 6, Zeros in Nj: 3
	done matching halos
Cross-checking step 000482

Finished

Starting step 000384
Advanced step 000291


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
25
	Zeros in Ni: 15, Zeros in Nj: 6
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
25
	Zeros in Ni: 26, Zeros in Nj: 6
Finished

Starting step 000291


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
24
	Zeros in Ni: 16, Zeros in Nj: 8
	done matching halos
Cross-checking step 000291

Finished

Starting step 000199


  merit_f = mat**2/np.outer(Ni, Nj)


	Matching halos by merit function
22
	Zeros in Ni: 23, Zeros in Nj: 5
	done matching halos
Cross-checking step 000199

Finished

000512
[23]
[199, 291, 384, 482, 512]


  merit_f = mat**2/np.outer(Ni, Nj)
  df.loc[mask, step[-6:]] = df2.loc[mask, step[-6:]]
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->integer,key->block1_values] [items->Index(['000199'], dtype='object')]

  df.to_hdf(save_file, key='ids')


cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000512', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000482', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000291', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000199']
File will be saved as /home/selvani/MAP/pynbody/AnnaWright_startrace/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cp

pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


Tracing halos:   0%|          | 0/4 [00:00<?, ?step/s]

Starting step 000482
Advanced step 000384


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
34
	Zeros in Ni: 6, Zeros in Nj: 3
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
34
	Zeros in Ni: 19, Zeros in Nj: 6
Finished

Starting step 000384


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
25
	Zeros in Ni: 15, Zeros in Nj: 6
	done matching halos
Cross-checking step 000384

Finished

Starting step 000291
Advanced step 000199


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
24
	Zeros in Ni: 16, Zeros in Nj: 8
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)


	Matching halos by merit function
24
	Zeros in Ni: 27, Zeros in Nj: 6
Finished

Starting step 000199
	Matching halos by merit function
22
	Zeros in Ni: 23, Zeros in Nj: 5
	done matching halos
Cross-checking step 000199

Finished

000482
[17]
[199, 291, 384, 482]


  merit_f = mat**2/np.outer(Ni, Nj)


cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000482', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000291', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000199']
File will be saved as /home/selvani/MAP/pynbody/AnnaWright_startrace/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000482.trace_back_merge.hdf5
grplist= [17]


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


Tracing halos:   0%|          | 0/3 [00:00<?, ?step/s]

Starting step 000384
Advanced step 000291


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
25
	Zeros in Ni: 15, Zeros in Nj: 6
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)
pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
25
	Zeros in Ni: 26, Zeros in Nj: 6
Finished

Starting step 000291


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
24
	Zeros in Ni: 16, Zeros in Nj: 8
	done matching halos
Cross-checking step 000291

Finished

Starting step 000199


  merit_f = mat**2/np.outer(Ni, Nj)


	Matching halos by merit function
22
	Zeros in Ni: 23, Zeros in Nj: 5
	done matching halos
Cross-checking step 000199

Finished

000384
[ 9  8 14 16]
[199, 291, 384]


  merit_f = mat**2/np.outer(Ni, Nj)
  df.loc[mask, step[-6:]] = df2.loc[mask, step[-6:]]
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->integer,key->block1_values] [items->Index(['000199'], dtype='object')]

  df.to_hdf(save_file, key='ids')


cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000291', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000199']
File will be saved as /home/selvani/MAP/pynbody/AnnaWright_startrace/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000384.trace_back_merge.hdf5
grplist= [ 9  8 14 16]


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


Tracing halos:   0%|          | 0/2 [00:00<?, ?step/s]

Starting step 000291
Advanced step 000199


pynbody.halo : Unable to load AHF substructure file; continuing without. To expose the underlying problem as an exception, pass ignore_missing_substructure=False to the AHFCatalogue constructor


	Matching halos by merit function
24
	Zeros in Ni: 16, Zeros in Nj: 8
	done matching halos


  merit_f = mat**2/np.outer(Ni, Nj)


	Matching halos by merit function
24
	Zeros in Ni: 27, Zeros in Nj: 6
Finished

Starting step 000199
	Matching halos by merit function
25
	Zeros in Ni: 23, Zeros in Nj: 5
	done matching halos
Cross-checking step 000199

Finished

000291
[12]
[199, 291]


  merit_f = mat**2/np.outer(Ni, Nj)


cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000291', '/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000199']
File will be saved as /home/selvani/MAP/pynbody/AnnaWright_startrace/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000291.trace_back_merge.hdf5
grplist= [12]
Must trace through at least 3 steps to use cross check
000199
[32]
[199]
cptmarvel.cosmo25cmb.4096g5HbwK1BH.000113 does not exist or has incorrect permissions
Steplist ['/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.000199']
File will be saved as /home/se
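
The repeated `merit_f = mat**2/np.outer(Ni, Nj)` lines in the output above appear to be the tails of NumPy runtime warnings, and the accompanying "Zeros in Ni / Zeros in Nj" messages suggest that some denominators are zero. A minimal sketch of one way to evaluate the same expression without the warning is given below. It assumes `mat` holds the number of particles shared between halo i and halo j, and `Ni`, `Nj` hold the particle counts of each halo; `safe_merit` is a hypothetical helper, not part of the pipeline, and the small arrays are made-up illustration values.

```python
import numpy as np

def safe_merit(mat, Ni, Nj):
    """Evaluate mat**2 / outer(Ni, Nj), returning 0 where the denominator is 0."""
    mat = np.asarray(mat, dtype=float)
    denom = np.outer(Ni, Nj).astype(float)
    merit = np.zeros_like(denom)
    # Only divide where the denominator is positive; other entries stay 0.
    np.divide(mat**2, denom, out=merit, where=denom > 0)
    return merit

# Illustration only: three halos in one step, two in the other.
mat = np.array([[4, 0], [0, 0], [1, 3]])
Ni = np.array([4, 0, 4])   # the second halo has no particles
Nj = np.array([5, 3])
merit_f = safe_merit(mat, Ni, Nj)
print(merit_f)
```

Setting the merit to zero for empty halos means they can never be chosen as the best match, which is the behaviour one would normally want; whether it matches what the pipeline does with the resulting NaN/inf entries is not confirmed here.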

In [44]:
filenames = glob.glob(sim_base + '/*merge.hdf5')
for file in filenames:
    print(file)
    trace = pd.read_hdf(file, key='ids')
    # Fix the index by removing the '004096' prefix
    print("Original index:", trace.index.name)
    # trace.index = trace.index.str.replace(trace.index.name, '')
    # Convert to integers if desired
    trace.index = trace.index.astype(int)
    print(trace.head())
    # trace.to_hdf(file, key='ids')

/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.001162.trace_back_merge.hdf5
Original index: 001162
        001152  001025  000896  000818  000768  000672  000640  000512  \
001162                                                                   
14          14      13      12      10      10      10      10      11   
72          62      28      29      28      29      29      26      20   

        000482  000384  000291  000199  
001162                                  
14          12      17      22      31  
72          21      29      24      10  
/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.002176.trace_back_merge.hdf5
Original index: 002176
        002162  002048  001920  001813  001792  001664  001543  001536  \
002176                                                                   
49   

In [33]:
import pandas as pd
from tangos import get_timestep, core
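from tangos.core import get_default_session

# get_default_session is a function; call it once so that `session` is the
# session object used by session.add() / session.commit() below.
session = get_default_session()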

def create_phantom(ts, halo_number):
    # Make a new Halo object with a special flag or negative halo number
    phantom = core.Halo(
        halo_number=halo_number,
        timestep=ts,
        finder_id=-halo_number,  # Use negative to distinguish phantom
        # Add other properties as needed
    )
    session.add(phantom)
    session.commit()
    return phantom

def check_and_fix_links_with_phantoms(df, sim_prefix="sim", dry_run=True, create_phantoms=False):
    for chain_idx, row in df.iterrows():
        halos = []
        for snapshot, halo_number in row.items():
            if pd.isnull(halo_number):
                halos.append(None)
            else:
                ts = get_timestep(f"{sim_prefix}/%{snapshot}")
                halo = ts.halos.filter_by(halo_number=int(halo_number)).first() if ts else None
                halos.append(halo)
        for i in range(1, len(halos)):
            descendant = halos[i-1]
            progenitor = halos[i]
            # If progenitor missing from DB and create_phantoms is True, create phantom
            if progenitor is None and not pd.isnull(row.iloc[i]):
                # Use the same '%' wildcard as the lookup above so the snapshot suffix matches
                ts = get_timestep(f"{sim_prefix}/%{row.index[i]}")
                if create_phantoms and ts is not None:
                    print(f"[CHAIN {chain_idx}] Creating PHANTOM for halo {row.iloc[i]} in snapshot {row.index[i]}")
                    progenitor = create_phantom(ts, int(row.iloc[i]))
                    halos[i] = progenitor
            if descendant is None or progenitor is None:
                continue
            found = False
            for link in descendant.previous:
                if link.halo_from == progenitor:
                    found = True
                    break
            if not found:
                print(f"[CHAIN {chain_idx}] Missing link: {descendant} (halo_number={descendant.halo_number}, snapshot={descendant.timestep.extension}) "
                      f"should have progenitor {progenitor} (halo_number={progenitor.halo_number}, snapshot={progenitor.timestep.extension})")
                if not dry_run:
                    new_link = core.HaloLink(
                        halo_from=progenitor,
                        halo_to=descendant,
                        type='progenitor',
                        weight=1.0
                    )
                    session.add(new_link)
                    print(f"  -> Link ADDED")
        if not dry_run:
            session.commit()
            print(f"[CHAIN {chain_idx}] Committed new links to DB.")


In [63]:
def get_halo(snapshot, halo_number):
    ts = tangos.get_timestep(f"cptmarvel.4096g5HbwK1BH_bn/%{snapshot}")
    # print(f"Retrieved timestep: {ts}")
    return ts.halos.filter_by(halo_number=int(halo_number)).first()

In [None]:
def process_halo_data(filename):
    df = pd.read_hdf(filename, key='ids')
    snapshot = df.index.name
    # rows = df.values
    print("Snapshot:", snapshot)
    for index, row in df.iterrows():
        # print(index)
        halo_numbers_df = row.values.tolist()
        halo_numbers_df.insert(0, index)
        snapshots_df = row.index.values.tolist()
        snapshots_df.insert(0, snapshot)
        unique_id_strings = []
        for halo_num, snap in zip(halo_numbers_df, snapshots_df):
            unique_id_string = f"{snap}_{halo_num}"
            unique_id_strings.append(unique_id_string)
        print(f"    Unique IDs: {unique_id_strings}")
        # print(halo_numbers_df)
        # print(snapshots_df)
        halo = get_halo(snapshot, index)
        print(f"  Retrieved halo: {halo.halo_number}")

        unique_id_tangos = []
        halo_numbers, dbids = halo.calculate_for_progenitors('halo_number()','dbid()')
        snapshots = [tangos.get_halo(dbid).timestep.extension[-6:] for dbid in dbids]
        print(f"    Retrieved progenitor halos: {halo_numbers}")
        # print(f"    Retrieved progenitor snapshots: {snapshots}")
        for halo_num, snap in zip(halo_numbers, snapshots):
            unique_id_string = f"{snap}_{halo_num}"
            unique_id_tangos.append(unique_id_string)
        # print(f"    Unique IDs: {unique_id_tangos}")

        # check if each one is in the tangos database
        for unique_id in unique_id_strings:
            if unique_id not in unique_id_tangos and not unique_id.endswith('-1'):
                print(f"    ID {unique_id} not found in tangos links")

        for unique_id in unique_id_tangos:
            if unique_id not in unique_id_strings:
                print(f"    ID {unique_id} not found in halo links")

In [83]:
for filename in filenames:
    process_halo_data(filename)

Snapshot: 001162
    Unique IDs: ['001162_14', '001152_14', '001025_13', '000896_12', '000818_10', '000768_10', '000672_10', '000640_10', '000512_11', '000482_12', '000384_17', '000291_22', '000199_31']
  Retrieved halo: 14
    Retrieved progenitor halos: [14 14 13 12 10 10 10 10 11 12 17 22 31]
    Unique IDs: ['001162_72', '001152_62', '001025_28', '000896_29', '000818_28', '000768_29', '000672_29', '000640_26', '000512_20', '000482_21', '000384_29', '000291_24', '000199_10']
  Retrieved halo: 72
    Retrieved progenitor halos: [72 62 28 29 28 29 29 26 20 21 29 24 10]
Snapshot: 002176
    Unique IDs: ['002176_49', '002162_48', '002048_43', '001920_43', '001813_43', '001792_43', '001664_45', '001543_42', '001536_43', '001408_42', '001331_40', '001280_42', '001162_42', '001152_43', '001025_48', '000896_51', '000818_49', '000768_44', '000672_39', '000640_37', '000512_35', '000482_37', '000384_37', '000291_37', '000199_25']
  Retrieved halo: 49
    Retrieved progenitor halos: [49 48 43 4

In [None]:
df = pd.read_hdf(filenames[0], key='ids')
# Set create_phantoms=True if you want to insert placeholder halos for missing ones
check_and_fix_links_with_phantoms(df, sim_prefix="cptmarvel.4096g5HbwK1BH_bn", dry_run=True, create_phantoms=True)

In [None]:
trace = pd.read_hdf('/home/selvani/MAP/pynbody/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096.trace_back.hdf5', key='ids')
# Fix the index by removing the '004096' prefix
trace.index = trace.index.str.replace('004096', '')

# Convert to integers if desired
trace.index = trace.index.astype(int)

#### Unused

In [None]:
main_branch = halo.calculate_for_progenitors()
print(main_branch)

In [None]:
main_branch = []
halo_current = halo

# Step 1: Walk the main progenitor branch
while True:
    main_branch.append(halo_current)
    if len(halo_current.progenitors)==0:
        break
    # The main progenitor is usually the most massive
    halo_current = max(halo_current.progenitors, key=lambda h: getattr(h, 'Mvir', 0))

# Step 2: For each snapshot (along main branch), find all progenitors
for h in main_branch:
    all_progs = h.progenitors
    if not all_progs:
        continue
    # The main progenitor (by definition, on the main branch)
    main_prog = max(all_progs, key=lambda h: getattr(h, 'Mvir', 0))
    # Others are mergers into the main branch at this step
    mergers = [prog for prog in all_progs if prog != main_prog]
    
    print(f"Snapshot: {h.timestep.extension} Halo number: {h.halo_number}")
    print(f"  Main progenitor: {main_prog.halo_number if main_prog else 'N/A'}")
    print(f"  Mergers at this step: {[m.halo_number for m in mergers]}")

### 4) IDUniqueHost_rz.py

Step 4 of stellar halo pipeline

Creates a unique ID for each host that forms a star. The format of
this ID is `<last snapshot where host was IDed>_<index at this snapshot>`.
So, if a host was halo 5 at snapshot 3552 and then merged with halo 1
before the next snapshot, its unique ID will be 3552_5. Stars that form
in its main progenitors will also be associated with this ID. These IDs
are written out to a file with a similar format to `<sim>_halostarhosts.txt`.

Output: `<sim>_uniquehalostarhosts.txt`

* Usage: `python IDUniqueHost_rz.py <sim>`
* Example: `python IDUniqueHost_rz.py r634`

Note that this is currently set up for MMs, but should be easily adapted
by e.g., changing the paths or adding a path CL argument. It is also
designed to accommodate the phantoms that rockstar generates when it
temporarily loses track of a halo, which slows it down quite a bit.
If you're only ever going to be using it with other types of merger
trees, it can be simplified.


**What it does:**
- Reads the `<sim>_halostarhosts.txt` file created in Step 3 (which lists halos that formed stars at each snapshot)
- For each star-forming halo, traces its merger history forward in time using Tangos database merger trees
- Creates a unique, persistent ID for each halo that accounts for mergers and halo evolution
- Assigns the same unique ID to stars formed in progenitor halos that later merge
- Handles "phantom" halos (temporary tracking losses in halo finders) for robust merger tree following
- Outputs a mapping file that connects local halo IDs to persistent unique IDs

**Detailed Process:**
1. **Input Parsing**: Reads the timeline of star-forming halos from `_halostarhosts.txt`
2. **Merger Tree Traversal**: For each halo, uses Tangos database to trace descendants forward in time
3. **Self-Consistency Checking**: Verifies that merger tree connections are bidirectional (descendant→progenitor matches)
4. **Unique ID Assignment**: Creates IDs in format `<last_snapshot>_<halo_index>` where the halo was last independently identified
5. **Progenitor Chain Building**: Links all progenitors in a merger chain to the same unique ID
6. **Phantom Handling**: Accommodates temporary halo finder failures using robust tree traversal algorithms

**Key Technical Details:**
- **Unique ID Format**: `SSSS_H` where `SSSS` = 4-digit snapshot number, `H` = halo index at that snapshot
- **Forward Tracking**: Uses `trackforward()` function to find the last snapshot where a halo exists independently
- **Merger Tree Validation**: Employs `checkmatch_p()` and `checkmatch_d()` to verify progenitor/descendant relationships
- **Phantom Accommodation**: Filters out phantom halos (type > 0) but maintains merger tree integrity
- **Caching System**: Uses dictionary `d` to store previously computed unique IDs for efficiency

**Example Unique ID Creation:**
```
Halo 5 at snapshot 3552 merges with halo 1 before the next snapshot
→ Unique ID: "3552_5"
→ All stars formed in this halo's progenitors get ID "3552_5"
→ Stars formed in halo 1 (after merger) get a different unique ID
```
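The ID format can be illustrated with a pair of small helpers. This is only a sketch: `make_unique_id` and `split_unique_id` are hypothetical names that do not come from `IDUniqueHost_rz.py`, and the 4-digit zero padding is assumed from the format described above.

```python
def make_unique_id(snapshot, halo_index, width=4):
    """Build '<snapshot>_<halo_index>' with a zero-padded snapshot number."""
    return f"{snapshot:0{width}d}_{halo_index}"


def split_unique_id(unique_id):
    """Recover (snapshot, halo_index) as integers from a unique ID string."""
    snapshot, halo_index = unique_id.split("_")
    return int(snapshot), int(halo_index)


print(make_unique_id(3552, 5))    # 3552_5
print(make_unique_id(192, 1))     # 0192_1
print(split_unique_id("3552_5"))  # (3552, 5)
```

Keeping the snapshot zero-padded means the IDs sort by snapshot when compared as strings, as long as every snapshot number fits in the chosen width.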

**Algorithm Workflow:**
1. **For each timestep** in the simulation:
   - **For each halo** that formed stars at that timestep:
     - Check if unique ID already computed (use cached result)
     - If not cached: trace forward to find last independent existence
     - Create unique ID based on final independent snapshot
     - Trace backward through progenitor chain
     - Assign same unique ID to all progenitors in the chain
     - Cache results for future lookups
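The workflow above can be sketched on a toy merger tree. Everything in this example is an assumption made for illustration: the tree is a plain dictionary instead of a Tangos database, `track_forward` and `assign_unique_ids` are hypothetical stand-ins for the script's functions, and phantom handling is left out.

```python
# Hypothetical toy tree: each (snapshot, halo) maps to the halo it is the MAIN
# progenitor of in the next snapshot. A halo with no entry either merged into
# a larger halo or sits in the final snapshot.
main_descendant = {
    (192, 2): (288, 3),
    (288, 3): (384, 4),
    (192, 1): (288, 1),
    (288, 1): (384, 1),
}


def track_forward(node, main_descendant):
    """Follow main-descendant links to the last snapshot where the halo exists."""
    while node in main_descendant:
        node = main_descendant[node]
    return node


def assign_unique_ids(star_forming, main_descendant):
    """Map each star-forming (snapshot, halo) to a persistent unique ID."""
    # Invert the tree so the main-progenitor chain can be walked backward.
    main_progenitor = {desc: prog for prog, desc in main_descendant.items()}
    cache = {}
    result = {}
    for node in sorted(star_forming):  # earliest snapshots first
        if node not in cache:
            last = track_forward(node, main_descendant)
            unique_id = f"{last[0]:04d}_{last[1]}"
            # Give every halo on the main-progenitor chain the same ID.
            walker = last
            while walker is not None:
                cache[walker] = unique_id
                walker = main_progenitor.get(walker)
        result[node] = cache[node]
    return result


ids = assign_unique_ids([(192, 1), (192, 2), (288, 3), (384, 1)], main_descendant)
for (snapshot, halo), unique_id in sorted(ids.items()):
    print(f"{snapshot:04d}_{halo} -> {unique_id}")
```

In this toy tree, halo 2 at snapshot 192 and halo 3 at snapshot 288 both receive `0384_4`, and the second lookup is served from the cache without walking the tree again.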

**Merger Tree Functions:**
- **`trackforward(step, halo)`**: Traces halo forward to find last independent snapshot
- **`checkmatch_p(step, halo, hid, disp)`**: Verifies progenitor relationship
- **`checkmatch_d(step, halo, hid, disp)`**: Verifies descendant relationship

**Input Files:**
- `<sim>_halostarhosts.txt` from Step 3 (timeline of star-forming halos)
- Tangos database with merger tree information

**Output File:** `<sim>_uniquehalostarhosts.txt`
- Text file with same format as input but with unique IDs replacing local halo indices
- Format: `<timestep>\t<unique_id>,<local_halo_id>[,<star_count>]\t...`
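- Example line: `3456    3552_5,15,234    3721_42,42,156`
  - At snapshot 3456: unique halo "3552_5" (local ID 15) formed 234 stars. The snapshot in a unique ID is never earlier than the snapshot the stars formed in, because it marks the last snapshot where that host was identified.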
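A line in this format could be parsed as follows. This is a sketch that assumes the tab-separated layout described above; `parse_unique_hosts_line` is a hypothetical helper, not part of the pipeline, and the sample line is made up.

```python
def parse_unique_hosts_line(line):
    """Return (timestep, [(unique_id, local_halo_id, star_count or None), ...])."""
    fields = line.strip().split("\t")
    timestep = int(fields[0])
    entries = []
    for field in fields[1:]:
        parts = field.split(",")
        unique_id = parts[0]
        local_id = int(parts[1])
        # The star count is optional in the format described above.
        star_count = int(parts[2]) if len(parts) > 2 else None
        entries.append((unique_id, local_id, star_count))
    return timestep, entries


# Made-up sample line; the second entry omits the optional star count.
timestep, entries = parse_unique_hosts_line("3456\t3552_5,15,234\t3721_42,42")
print(timestep)  # 3456
print(entries)   # [('3552_5', 15, 234), ('3721_42', 42, None)]
```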

<!-- **Performance Considerations:**
- **Caching**: Avoids recomputing unique IDs for halos already processed
- **Phantom Handling**: Designed for Rockstar halo finder but slows processing
- **Database Queries**: Intensive use of Tangos merger tree calculations
- **Memory Usage**: Stores merger tree data and caching dictionary -->

**Data Flow:**
```
Step 3 Output: *_halostarhosts.txt (local halo IDs)
                    ↓
Step 4: Merger tree analysis + unique ID assignment
                    ↓
Step 4 Output: *_uniquehalostarhosts.txt (persistent unique IDs)
```

**Special Cases Handled:**
- **Unbound Stars**: Particles with host ID = -1 get unique ID `<snapshot>_0`
- **Phantom Halos**: Temporary tracking losses in halo finder are filtered but accounted for
- **Merger Events**: Multiple local halos can map to the same unique ID if they're part of the same merger tree
- **Isolated Halos**: Halos that never merge retain their snapshot-specific unique ID

**Purpose:** Halo IDs change from snapshot to snapshot as halos grow, merge, and get renumbered, so a local halo ID alone cannot be used to follow a stellar population through time. By creating persistent unique IDs, we can:
- Group stars by their true formation halo, even after mergers
- Trace stellar populations through cosmic time
- Identify which stars belong to the main galaxy vs. accreted satellites
- Enable stellar halo analysis based on formation environment

* Usage: `python IDUniqueHost_rz.py <sim> <tangos_simulation> <input_file> <output_file>`
* Example: `python IDUniqueHost_rz.py cptmarvel.cosmo25cmb.4096g5HbwK1BH sim_object halostarhosts.txt uniquehalostarhosts.txt`
* Runtime: ~2 minutes 

<!-- **Important Notes:**
- **Computationally Intensive**: Merger tree queries can be slow for large simulations
- **Rockstar Optimized**: Phantom handling is specific to Rockstar halo finder behavior
- **Database Dependent**: Requires properly constructed Tangos merger trees
- **Bidirectional Verification**: Ensures merger tree consistency through forward/backward checking
- **Memory Scaling**: Caching dictionary grows with number of unique halos processed -->

**Output Validation:**
The output file enables Step 5 to create a comprehensive database where every star particle is assigned to a persistent halo ID, regardless of when it formed or how many mergers occurred afterward.


In [21]:
import IDUniqueHost_rz
# import unique_host_ids_modified_Version2

In [22]:
sim = tangos.get_simulation(ss_dir)
print(f"Simulation: {sim}")
import collections
d = collections.defaultdict(list)
print(d)

hsfile = os.path.join(outfile_dir, f"{basename}_halostarhosts.txt")
ofile = os.path.join(outfile_dir, f"{basename}_uniquehalostarhosts.txt")

IDUniqueHost_rz.main(sim, d, hsfile, ofile)

Simulation: <Simulation("rogue.4096g5HbwK1BH_bn")>
defaultdict(<class 'list'>, {})
------ 192
Current: 0192_1
Unique:  0768_34121
Current: 0192_2
Unique:  0384_4
Current: 0192_3
Unique:  0192_3
Current: 0192_4
Unique:  0192_4
Current: 0192_5
Unique:  0192_5
Current: 0192_6
Unique:  1824_36
Current: 0192_7
Unique:  0864_1
Current: 0192_8
Unique:  0192_8
Current: 0192_9
Unique:  0192_9
Current: 0192_10
Unique:  0384_10
Current: 0192_11
Unique:  4096_34
Current: 0192_12
Unique:  0192_12
Current: 0192_13
Unique:  0192_13
Current: 0192_14
Unique:  4096_12
Current: 0192_15
Unique:  1536_23
Current: 0192_17
Unique:  0384_18
Current: 0192_18
Unique:  0192_18
Current: 0192_21
Unique:  4096_61
Current: 0192_22
Unique:  3648_1435
Current: 0192_24
Unique:  0960_33
Current: 0192_25
Unique:  0192_25
Current: 0192_27
Unique:  0576_60
Current: 0192_36
Unique:  0192_36
Current: 0192_38
Unique:  4096_77
Current: 0192_45
Unique:  4096_123
Current: 0192_51
Unique:  4096_32
------ 288
Current: 0288_1
Found

  return type(class_name,class_base,class_attrs)


Current: 0576_2
Found key:  0864_1
Current: 0576_3
Found key:  4096_8
Current: 0576_4
Found key:  4096_12
Current: 0576_5
Found key:  4096_3
Current: 0576_8
Found key:  1152_23
Current: 0576_10
Found key:  1536_23
Current: 0576_11
Found key:  2496_27
Current: 0576_12
Found key:  1056_32
Current: 0576_13
Found key:  1920_24
Current: 0576_14
Found key:  4096_10
Current: 0576_15
Found key:  0960_33
Current: 0576_16
Found key:  4096_16
Current: 0576_17
Found key:  4096_30
Current: 0576_18
Found key:  0768_21
Current: 0576_21
Unique:  4096_11




Current: 0576_22
Found key:  1824_36
Current: 0576_23
Found key:  4096_7
Current: 0576_25
Found key:  1344_48
Current: 0576_26
Found key:  0960_101
Current: 0576_27
Found key:  0672_39
Current: 0576_28
Found key:  3648_1435
Current: 0576_29
Found key:  0864_37
Current: 0576_32
Found key:  0768_52
Current: 0576_34
Unique:  4096_18
Current: 0576_36
Found key:  4096_34
Current: 0576_49
Unique:  1536_816
Current: 0576_60
Found key:  0576_60
------ 672
Current: 0672_1
Found key:  4096_1
Current: 0672_2
Found key:  0864_1
Current: 0672_3
Found key:  4096_8
Current: 0672_4
Found key:  4096_12
Current: 0672_5
Found key:  4096_3
Current: 0672_8
Found key:  4096_7
Current: 0672_9
Found key:  1920_24
Current: 0672_12
Found key:  1152_23
Current: 0672_13
Found key:  0960_33
Current: 0672_14
Found key:  4096_10
Current: 0672_15
Found key:  1536_23
Current: 0672_16
Found key:  4096_11
Current: 0672_17
Found key:  0768_21
Current: 0672_18
Found key:  1056_32
Current: 0672_19




Unique:  0768_27
Current: 0672_20
Found key:  2496_27
Current: 0672_22
Found key:  4096_16
Current: 0672_23
Found key:  1824_36
Current: 0672_24
Found key:  4096_30
Current: 0672_25
Found key:  1344_48
Current: 0672_26
Found key:  4096_18
Current: 0672_30
Found key:  0864_37
Current: 0672_31
Found key:  0960_101
Current: 0672_34
Found key:  4096_34
Current: 0672_39
Found key:  0672_39
Current: 0672_42
Found key:  3648_1435
Current: 0672_43
Found key:  0768_52
Current: 0672_44
Found key:  1536_816
------ 768
Current: 0768_1
Found key:  4096_1
Current: 0768_2
Found key:  0864_1
Current: 0768_3
Found key:  4096_8
Current: 0768_5
Found key:  4096_7
Current: 0768_6
Found key:  4096_12
Current: 0768_8
Found key:  4096_3
Current: 0768_9
Found key:  1920_24
Current: 0768_11
Found key:  4096_11
Current: 0768_14
Found key:  0960_33
Current: 0768_15
Found key:  1152_23
Current: 0768_17
Found key:  4096_10
Current: 0768_18
Found key:  2496_27
Current: 0768_19
Found key:  1536_23
Current: 0768_20
U

  return type(class_name,class_base,class_attrs)


Unique:  2208_15587
------ 2304
Current: 2304_1
Found key:  4096_1
Current: 2304_2
Found key:  4096_3
Current: 2304_5
Found key:  4096_5
Current: 2304_7
Found key:  4096_7
Current: 2304_8
Found key:  4096_8
Current: 2304_9
Found key:  4096_10
Current: 2304_10
Found key:  4096_9
Current: 2304_11
Found key:  4096_11
Current: 2304_12
Found key:  4096_12
Current: 2304_13
Found key:  4096_14
Current: 2304_15
Found key:  2496_27
Current: 2304_17
Found key:  4096_16
Current: 2304_19
Found key:  4096_23
Current: 2304_20
Found key:  4096_36
Current: 2304_27
Found key:  4096_26
Current: 2304_34
Found key:  4096_37
Current: 2304_156
Found key:  2784_365
------ 2400
Current: 2400_1
Found key:  4096_1
Current: 2400_2
Found key:  4096_3
Current: 2400_5
Found key:  4096_5
Current: 2400_7
Found key:  4096_7
Current: 2400_8
Found key:  4096_8
Current: 2400_9
Found key:  4096_9
Current: 2400_10
Found key:  4096_10
Current: 2400_11
Found key:  4096_11
Current: 2400_12
Found key:  4096_12
Current: 2400_13

In [None]:
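# The import of this module is commented out in the earlier cell, so import it here
import unique_host_ids_modified_Version2

unique_host_ids_modified_Version2.main(sim, d, hsfile, ofile)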

Using newest snapshot: 4096, previous snapshot: 3968
------ 199
Current: 0199_1
Unique:  3968_6
Current: 0199_2
Unique:  3968_2
Current: 0199_3
Unique:  3968_4
Current: 0199_4




### 5) StoreUniqueHostID_rz.py

Step 5 of stellar halo pipeline

Stores the unique ID of each star particle's host at formation time.
Creates an hdf5 file that contains this in addition to all of the data
from the `<sim>_stardata_<snapshot>.h5` files. Note that all star particles
that don't have a host in the snapshot after they formed will be assigned
a unique ID of `<snapshot_index>_0` and a particle host (i.e., host at
formation time) of -1. It is recommended that you use the TrackDownStars
Jupyter notebook to try to manually identify hosts for these stars and then
use FixHostIDs_rz.py to amend `<sim>_allhalostardata.h5`.

Output: `<sim>_allhalostardata.h5`

* Usage: `python StoreUniqueHostID_rz.py <sim>`
* Example: `python StoreUniqueHostID_rz.py r634`

Note that this is currently set up for MMs, but should be easily adapted
by e.g., changing the paths or adding a path CL argument.




**What it does:**
- Combines all `<sim>_stardata_*.h5` files from Step 2 into a single HDF5 file
- Maps each star particle's local host ID to its unique host ID from Step 4
- Creates final dataset linking every star particle to its persistent formation halo
- Handles unbound stars (host ID = -1) by assigning unique IDs like `3840_0`

**Input:** 
- Multiple `<sim>_stardata_*.h5` files from Step 2
- `<sim>_uniquehalostarhosts.txt` from Step 4 (halo ID mapping)

**Output:** `<sim>_allhalostardata.h5` containing:
- `particle_IDs`: Star particle IDs (`iord`)
- `particle_positions`: Formation positions (Mpc)
- `particle_creation_times`: Formation times (Gyr)
- `timestep_location`: Formation snapshot numbers
- `particle_hosts`: Local halo IDs at formation
- `host_IDs`: **Unique persistent halo IDs** (e.g., "3552_5")

**Key Process:**
1. Load unique ID mapping from Step 4: `"timestep,hostid" → "unique_id"`
2. For each star particle: lookup `(timestep, local_halo_id)` → assign unique ID
3. Unbound particles get default IDs: `f"{timestep:04d}_0"`
4. Combine all data into single compressed HDF5 file
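The lookup in steps 1 to 3 can be sketched in plain Python. The helper names `load_lookup` and `assign_host_ids` are hypothetical, the input line is made up, and the file layout is assumed to be the tab-separated format described for Step 4. The real script also reads and writes HDF5, which is omitted here.

```python
def load_lookup(lines):
    """Build {"timestep,hostid": unique_id} from lines in the Step 4 format."""
    lookup = {}
    for line in lines:
        fields = line.strip().split("\t")
        timestep = int(fields[0])
        for field in fields[1:]:
            unique_id, local_id = field.split(",")[:2]
            lookup[f"{timestep},{int(local_id)}"] = unique_id
    return lookup


def assign_host_ids(timesteps, local_hosts, lookup):
    """Return one unique host ID per star particle."""
    host_ids = []
    for timestep, host in zip(timesteps, local_hosts):
        if host == -1:
            # Unbound at formation: default ID "<timestep>_0"
            host_ids.append(f"{timestep:04d}_0")
        else:
            host_ids.append(lookup[f"{timestep},{host}"])
    return host_ids


lookup = load_lookup(["3456\t3552_5,15,234\t3721_42,42,156"])
print(assign_host_ids([3456, 3456, 3840], [15, 42, -1], lookup))
# ['3552_5', '3721_42', '3840_0']
```

A bound particle whose `(timestep, local_halo_id)` pair is missing from the mapping raises a `KeyError` in this sketch, which makes gaps in the Step 4 output visible instead of silently mislabeling stars.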

**Purpose:** Creates analysis-ready dataset enabling stellar population studies with persistent halo tracking across mergers and cosmic time.

* Usage: `python StoreUniqueHostID_rz.py <sim> <output_directory>`
* Example: `python StoreUniqueHostID_rz.py cptmarvel.cosmo25cmb.4096g5HbwK1BH /output/path/`

In [23]:
import StoreUniqueHostID_rz

StoreUniqueHostID_rz.main(basename, outfile_dir)

/home/ns1917/pynbody/stellarhalo_trace_aw/rogue.cosmo25cmb.4096g5HbwK1BH_stardata_000192.h5 <KeysViewHDF5 ['particle_IDs', 'particle_creation_times', 'particle_hosts', 'particle_positions', 'timestep_location']>
/home/ns1917/pynbody/stellarhalo_trace_aw/rogue.cosmo25cmb.4096g5HbwK1BH_stardata_000288.h5 <KeysViewHDF5 ['particle_IDs', 'particle_creation_times', 'particle_hosts', 'particle_positions', 'timestep_location']>
/home/ns1917/pynbody/stellarhalo_trace_aw/rogue.cosmo25cmb.4096g5HbwK1BH_stardata_000576.h5 <KeysViewHDF5 ['particle_IDs', 'particle_creation_times', 'particle_hosts', 'particle_positions', 'timestep_location']>
/home/ns1917/pynbody/stellarhalo_trace_aw/rogue.cosmo25cmb.4096g5HbwK1BH_stardata_001152.h5 <KeysViewHDF5 ['particle_IDs', 'particle_creation_times', 'particle_hosts', 'particle_positions', 'timestep_location']>
/home/ns1917/pynbody/stellarhalo_trace_aw/rogue.cosmo25cmb.4096g5HbwK1BH_stardata_001248.h5 <KeysViewHDF5 ['particle_IDs', 'particle_creation_times', 'p

### 6a) TrackDownStars_rz.ipynb

### 6b) FixHostIDs_rz.py

Optional: Step 6b of stellar halo pipeline

Updates the host_ID values stored in the allhalostardata hdf5 file based
on user input. This is designed as a follow-up to TrackDownStars and can
be used in a couple of ways. If `ffile=True`, this script will look for
numpy files with names `<sim>_new_ID_?.npy` and will assign the particles
with the iords in a given file to new_ID. If `ffile=False`, it will assign
all particles with a host_ID in the old_ID list to new_ID.

Output: `<sim>_allhalostardata_upd.h5`

* Usage: `python FixHostIDs_rz.py <sim>`
* Example: `python FixHostIDs_rz.py r634`

The script will print out all host_IDs for which the number of assigned particles
changed and how many particles each gained/lost. If the output looks correct,
the user should manually rename `<sim>_allhalostardata_upd.h5` to `<sim>_allhalostardata.h5`.
It's often necessary to go back and forth between this and TrackDownStars, in which
case I usually move the *.npy files that have already been processed to a subfolder.
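
The two reassignment modes and the gained/lost report can be sketched on plain lists. This is an illustration under stated assumptions, not the script's code: the function names are hypothetical, the data are made up, and the real script works on the arrays stored in the HDF5 file.

```python
from collections import Counter


def reassign_by_old_id(host_ids, old_ids, new_id):
    """ffile=False analogue: particles whose host ID is in old_ids get new_id."""
    old = set(old_ids)
    return [new_id if host in old else host for host in host_ids]


def reassign_by_iord(host_ids, iords, selected_iords, new_id):
    """ffile=True analogue: particles whose iord is in selected_iords get new_id."""
    selected = set(selected_iords)
    return [new_id if iord in selected else host
            for host, iord in zip(host_ids, iords)]


def count_changes(before, after):
    """Return {host_id: change in particle count} for every ID whose count changed."""
    n_before, n_after = Counter(before), Counter(after)
    all_ids = sorted(set(n_before) | set(n_after))
    return {h: n_after[h] - n_before[h] for h in all_ids if n_after[h] != n_before[h]}


# Made-up host IDs and iords for five star particles
before = ["4096_2", "4096_2", "4096_2", "1280_17", "4096_5"]
iords = [10, 11, 12, 13, 14]

after = reassign_by_old_id(before, ["4096_2"], "1280_17")
after = reassign_by_iord(after, iords, [14], "0291_1000")

for host, delta in count_changes(before, after).items():
    verb = "gained" if delta > 0 else "lost"
    print(f"{host}: {abs(delta)} particles {verb}")
```

Python lists are used on purpose. A fixed-width NumPy string array would silently truncate a new ID that is longer than the existing ones, such as `0291_1000`.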


In [9]:
import FixHostIDs_rz

In [12]:
FixHostIDs_rz.main(outfile_dir, basename)

1280_17: 508 particles gained
1543_12: 532 particles gained
1920_17: 294 particles lost
2048_29: 295 particles lost
4096_2: 508 particles lost
4096_4: 57 particles gained
4096_5: 135 particles lost
0291_1000: 135 particles gained
---------------------------------
If this looks correct, run
mv /home/selvani/MAP/pynbody/stellarhalo_trace_aw/cptmarvel.cosmo25cmb.4096g5HbwK1BH_allhalostardata_upd.h5 /home/selvani/MAP/pynbody/stellarhalo_trace_aw/cptmarvel.cosmo25cmb.4096g5HbwK1BH_allhalostardata.h5


In [20]:
# Read in your data
with h5py.File(outfile_dir+'/'+basename+'_allhalostardata_upd.h5','r') as f:
    hostids = f['host_IDs'].asstr()[:] # unique host IDs
    partids = f['particle_IDs'][:] # iords
    pct = f['particle_creation_times'][:] # formation times
    ph = f['particle_hosts'][:] # local host IDs (i.e., host at formation time)
    pp = f['particle_positions'][:] # position at formation time
    tsloc = f['timestep_location'][:] # snapshot where star particle first appears
uIDs = np.unique(hostids)

In [14]:
df = pd.DataFrame({
    'host_IDs': hostids,
    'particle_IDs': partids,
    # 'particle_creation_times': pct,
    # 'particle_hosts': ph,
    # 'particle_positions': pp,
    # 'timestep_location': tsloc
})
# to csv
df.to_csv(outfile_dir+'/final_'+basename+'_allhalostardata.csv', index=False)

### 6c) CompTwoHalos_rz.py

Optional: Step 6c of stellar halo pipeline

Compares two halos to see how likely it is that one is
the main progenitor of the other based on how many particles
they have in common. This is particularly useful when you've
used a merger tree constructor that doesn't use phantoms or
some equivalent and may therefore fail to connect a halo at
snapshot1 to the same halo at snapshot3 if it lost track of it
at snapshot2. This information can then be used with FixHostIDs_rz
to merge two unique IDs and/or to create a new link in the
relevant tangos db.

* Usage: `python CompTwoHalos_rz.py <sim> <halo1> <halo2>`
* Example: `python CompTwoHalos_rz.py r718 0136_4 0192_3`

Output: prints out the fraction of `<halo1>`'s DM particles
that are in `<halo2>` and vice-versa.

It takes three arguments: the simulation you're working with
and the tangos IDs of the two halos you want to compare, which
are formatted as `<snapshot>_<IDatsnapshot>`. Note that this ID
is assumed to be the tangos ID, not necessarily the amiga.grp
ID.
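
The comparison itself reduces to set overlap between the two halos' DM particle IDs. A minimal sketch, assuming the iords of each halo have already been loaded into lists (the loading step and the `shared_fractions` name are hypothetical, and the iords below are made up):

```python
def shared_fractions(iords1, iords2):
    """Return (fraction of halo1's particles in halo2, fraction of halo2's in halo1)."""
    set1, set2 = set(iords1), set(iords2)
    if not set1 or not set2:
        raise ValueError("both halos need at least one particle")
    n_common = len(set1 & set2)
    return n_common / len(set1), n_common / len(set2)


# Made-up iords: a small halo whose particles mostly end up in a larger one
halo1 = [1, 2, 3, 4]
halo2 = [2, 3, 4, 5, 6, 7, 8, 9]
frac1, frac2 = shared_fractions(halo1, halo2)
print(f"{frac1:.2f} of halo1 is in halo2; {frac2:.3f} of halo2 is in halo1")
# 0.75 of halo1 is in halo2; 0.375 of halo2 is in halo1
```

A high first fraction together with a lower second one is what you would expect when the first halo is a progenitor that has since accreted more mass, but what counts as "high enough" is a judgment call.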


## Extras

In [103]:
# boxsize, mass unit, vel unit, h

import pynbody

print(os.listdir('/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/'))

s = pynbody.load('/home/selvani/MAP/Sims/cptmarvel.cosmo25cmb/cptmarvel.cosmo25cmb.4096g5HbwK1BH/cptmarvel.4096g5HbwK1BH_bn/cptmarvel.cosmo25cmb.4096g5HbwK1BH.004096')
s.properties

['cptmarvel.cosmo25cmb.4096g5HbwK1BH.000482.amiga.stat', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.001792.igasorder', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.002048.z0.824.AHF_fpos', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.000768', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.002048.amiga.grp.pynbody-meta', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.002162', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.001664.amiga.stat', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.003456.FeMassFrac', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.000291.amiga.stat', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.003968.igasorder', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.002176.z0.741.AHF_profiles', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.000512', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.003245.amiga.grp', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.001025.z1.999.AHF_halos', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.003328.OxMassFrac', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.001162.trace_back_merge.hdf5', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.002816.amiga.stat', 'cptmarvel.cosmo25cmb.4096g5HbwK1BH.0006

{'omegaM0': 0.24,
 'omegaL0': 0.76,
 'h': 0.7299490542599526,
 'boxsize': Unit("2.50e+04 kpc a"),
 'a': 1.0000000000142635,
 'time': Unit("1.40e+01 s kpc km**-1")}

In [11]:
!tangos serve

Starting server in PID 67784.
2025-07-25 15:54:10,352 INFO  [waitress:449][MainThread] Serving on http://[::1]:6543
2025-07-25 15:54:10,352 INFO  [waitress:449][MainThread] Serving on http://127.0.0.1:6543
2025-07-25 15:54:21,074 : Tree build complete; total time 1.28s
2025-07-25 15:54:21,074 INFO  [tangos.log:72][waitress-0] Tree build complete; total time 1.28s
2025-07-25 15:54:21,074 :   Progenitor query took 1.09s
2025-07-25 15:54:21,074 INFO  [tangos.log:73][waitress-0]   Progenitor query took 1.09s
2025-07-25 15:54:21,074 :   Property query took 0.01s
2025-07-25 15:54:21,074 INFO  [tangos.log:74][waitress-0]   Property query took 0.01s
2025-07-25 15:54:21,074 :   Tree post-processing took 0.18s
2025-07-25 15:54:21,074 INFO  [tangos.log:75][waitress-0]   Tree post-processing took 0.18s
2025-07-25 15:54:31,899 : Tree build complete; total time 0.31s
2025-07-25 15:54:31,899 INFO  [tangos.log:72][waitress-0] Tree build complete; total time 0.31s
2025-07-25 15:54:31,899 :   Progenitor

In [105]:
stars = s.s

In [114]:
stars['age'].in_units('Gyr')

SimArray([ 1.35174602e+01,  1.35124357e+01,  1.35082486e+01, ...,
          -6.11688358e-05, -6.11688358e-05, -6.11688358e-05],
         shape=(414599,), 'Gyr')

In [115]:
stars['tform'].in_units('Gyr')

SimArray([ 0.21365837,  0.21868286,  0.22286992, ..., 13.7311797 ,
          13.7311797 , 13.7311797 ], shape=(414599,), 'Gyr')

In [118]:
halos = s.halos()

In [119]:
halos['pos'].in_units('kpc')

AttributeError: 'SubhaloCatalogue' object has no attribute 'in_units'

In [117]:
stars['pos'].in_units('kpc')

SimArray([[  188.98196286,  -441.26994908,  1448.17242401],
          [  189.13126551,  -440.93732723,  1447.86080348],
          [  189.34544642,  -440.64465911,  1448.39063289],
          ...,
          [  566.63681754,  -234.70097222,  1668.89894756],
          [ -563.58575822,  -739.46160265,  1163.97785025],
          [-1974.30718693, -2313.09272352,   533.08499046]],
         shape=(414599, 3), 'kpc')

In [112]:
stars['vel'].in_units('km s**-1')

SimArray([[ 35.8012164 , -61.13322961,  50.41188651],
          [ 30.08138783, -47.59994133,  40.65547787],
          [ 31.14195046, -61.25638715,  41.8654215 ],
          ...,
          [ 19.43312753,   9.88569869,  44.19992337],
          [ -9.5371297 , -59.62350931,  23.09402733],
          [-89.22849431, -72.43745011,  -7.20320855]], shape=(414599, 3), 'km s**-1')

In [109]:
stars.derivable_keys()

['HII',
 'HeIII',
 'ne',
 'hetot',
 'hydrogen',
 'feh',
 'oxh',
 'ofe',
 'mgfe',
 'nefe',
 'sife',
 'c_s',
 'c_s_turb',
 'mjeans',
 'mjeans_turb',
 'ljeans',
 'ljeans_turb',
 'U_mag',
 'U_lum_den',
 'B_mag',
 'B_lum_den',
 'V_mag',
 'V_lum_den',
 'R_mag',
 'R_lum_den',
 'I_mag',
 'I_lum_den',
 'J_mag',
 'J_lum_den',
 'H_mag',
 'H_lum_den',
 'K_mag',
 'K_lum_den',
 'u_mag',
 'u_lum_den',
 'g_mag',
 'g_lum_den',
 'r_mag',
 'r_lum_den',
 'i_mag',
 'i_lum_den',
 'z_mag',
 'z_lum_den',
 'y_mag',
 'y_lum_den',
 'r',
 'rxy',
 'vr',
 'v2',
 'vt',
 'ke',
 'te',
 'j',
 'j2',
 'jz',
 'vrxy',
 'vcxy',
 'vphi',
 'vtheta',
 'v_mean',
 'v_disp',
 'v_curl',
 'vorticity',
 'v_div',
 'age',
 'theta',
 'alt',
 'az',
 'cs',
 'mu',
 'p',
 'u',
 'temp',
 'zeldovich_offset',
 'aform',
 'tform',
 'iord_argsort',
 'smooth',
 'rho',
 'mass_holder',
 'HI_frac',
 'HI_mass',
 'HI_N']

In [11]:
print(s['mass'].units)
print(s['vel'].units)

2.31e+15 Msol
6.30e+02 km a s**-1


In [14]:
result = 'cosmo25cmb.4096g5HbwK1BH_stardata_002688.h5'
result.split('.')[-2][-6:]

'002688'
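
The slice above works only while the step number occupies exactly the last six characters of the stem. A minimal sketch of a slightly more defensive variant, assuming (my assumption, not something the pipeline guarantees) that the step is always the trailing run of digits before the extension; `step_from_filename` is a hypothetical helper name:

```python
import re

def step_from_filename(fname):
    """Return the trailing digits before the extension, e.g. '002688'."""
    stem = fname.rsplit('.', 1)[0]        # drop the '.h5' extension
    match = re.search(r'(\d+)$', stem)    # trailing run of digits in the stem
    if match is None:
        raise ValueError(f'no step number found in {fname!r}')
    return match.group(1)

result = 'cosmo25cmb.4096g5HbwK1BH_stardata_002688.h5'
print(step_from_filename(result))
```

Unlike the fixed slice, this raises an error instead of silently returning the wrong characters if a filename does not end in digits.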

### Make new amiga.grp files

In [12]:
steps = [step.extension for step in tangos.get_simulation(ss_dir).timesteps]

In [14]:
def create_thread_groups(items, n_threads):
    """
    Split items into groups for parallel processing
    
    Parameters:
    -----------
    items : list or array
        Items to be processed (in your case, the steps)
    n_threads : int
        Number of threads/processes to use
        
    Returns:
    --------
    groups : list of lists
        Each sublist contains items for one thread to process
    """
    import math
    items = list(items)  # Ensure it's a list
    n_items = len(items)
    
    if n_threads >= n_items:
        # If more threads than items, give each item its own thread
        return [[item] for item in items]
    
    # Calculate items per thread
    items_per_thread = math.ceil(n_items / n_threads)
    
    groups = []
    for i in range(0, n_items, items_per_thread):
        group = items[i:i + items_per_thread]
        groups.append(group)
    
    return groups

import pynbody
import tqdm.auto as tqdm
import time

from multiprocessing import Pool

def process_step(step):
    # Accept either a single step name (str) or a group (list) of step names.
    if isinstance(step, str):
        _process_step(step)
    else:
        pbar = tqdm.tqdm(total=len(step))
        for st in step:
            _process_step(st)
            pbar.update(1)
        pbar.close()
    return 0

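def _process_step(step):
    # sim_base (defined earlier in the notebook) is the simulation directory,
    # including its trailing slash; step is the snapshot file name.
    path = sim_base + step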
    print('Loading <{}>'.format(step))
    f = pynbody.load(path)
    print('  Loading halos for <{}>'.format(step))
    try:
        h = f.halos(halo_numbers='v1')
        print('    Writing amiga.grp for <{}>'.format(step))
        f['amiga.grp'] = h.get_group_array()
        f['amiga.grp'].write(overwrite=True)
        print('    Finished writing amiga.grp for <{}>'.format(step))
    except Exception as e:
        print('  ERROR loading halos for <{}>: {}'.format(step, e))
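
Because `create_thread_groups` sizes every group with `math.ceil`, it can return fewer groups than `n_threads`: 9 items over 4 threads gives three groups of 3, leaving one worker idle. A sketch of an alternative that fills `min(n_groups, n_items)` groups whose sizes differ by at most one; `split_balanced` is a hypothetical name and is not part of the pipeline:

```python
def split_balanced(items, n_groups):
    """Split items into contiguous groups whose sizes differ by at most one."""
    items = list(items)
    if not items:
        return []
    n_groups = max(1, min(n_groups, len(items)))
    base, extra = divmod(len(items), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        size = base + (1 if i < extra else 0)   # first `extra` groups get one more item
        groups.append(items[start:start + size])
        start += size
    return groups

print([len(g) for g in split_balanced(range(9), 4)])
```

For the group sizes seen in this notebook (three steps per group across 14 workers) both approaches should give the same split, so this only matters when the step count does not divide evenly.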

In [15]:
n_threads = 14

groups = create_thread_groups(steps, n_threads=n_threads)
for group in groups:
    print(group)

['rogue.cosmo25cmb.4096g5HbwK1BH.000192', 'rogue.cosmo25cmb.4096g5HbwK1BH.000288', 'rogue.cosmo25cmb.4096g5HbwK1BH.000384']
['rogue.cosmo25cmb.4096g5HbwK1BH.000480', 'rogue.cosmo25cmb.4096g5HbwK1BH.000576', 'rogue.cosmo25cmb.4096g5HbwK1BH.000672']
['rogue.cosmo25cmb.4096g5HbwK1BH.000768', 'rogue.cosmo25cmb.4096g5HbwK1BH.000864', 'rogue.cosmo25cmb.4096g5HbwK1BH.000960']
['rogue.cosmo25cmb.4096g5HbwK1BH.001056', 'rogue.cosmo25cmb.4096g5HbwK1BH.001152', 'rogue.cosmo25cmb.4096g5HbwK1BH.001248']
['rogue.cosmo25cmb.4096g5HbwK1BH.001344', 'rogue.cosmo25cmb.4096g5HbwK1BH.001440', 'rogue.cosmo25cmb.4096g5HbwK1BH.001536']
['rogue.cosmo25cmb.4096g5HbwK1BH.001632', 'rogue.cosmo25cmb.4096g5HbwK1BH.001728', 'rogue.cosmo25cmb.4096g5HbwK1BH.001824']
['rogue.cosmo25cmb.4096g5HbwK1BH.001920', 'rogue.cosmo25cmb.4096g5HbwK1BH.002016', 'rogue.cosmo25cmb.4096g5HbwK1BH.002112']
['rogue.cosmo25cmb.4096g5HbwK1BH.002208', 'rogue.cosmo25cmb.4096g5HbwK1BH.002304', 'rogue.cosmo25cmb.4096g5HbwK1BH.002400']
['rogue.

In [16]:
with Pool(processes=n_threads) as pool:
    results = pool.map(process_step, groups)

Loading <rogue.cosmo25cmb.4096g5HbwK1BH.001632>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.001344>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.000192>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.003648>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.003936>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.002496>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.003360>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.002784>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.000768>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.001056>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.003072>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.001920>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.000480>
Loading <rogue.cosmo25cmb.4096g5HbwK1BH.002208>
  Loading halos for <rogue.cosmo25cmb.4096g5HbwK1BH.001920>
  Loading halos for <rogue.cosmo25cmb.4096g5HbwK1BH.002784>
  Loading halos for <rogue.cosmo25cmb.4096g5HbwK1BH.003360>
  Loading halos for <rogue.cosmo25cmb.4096g5HbwK1BH.003072>
  Loading halos for <rogue.cosmo25cmb.4096g5HbwK1BH.003936>
  Loading halos for <rogue.c

In [18]:
fname = sim_base + basename+ '.004096'
print(fname)
s = pb.load(fname)
unique_gp = np.unique(s.s['amiga.grp'])
print(unique_gp)

/home/ns1917/tangos_sims/rogue.4096g5HbwK1BH_bn/rogue.cosmo25cmb.4096g5HbwK1BH.004096
[   -1     1     3     5     7     8     9    10    11    12    13    14
    16    17    18    23    26    29    30    32    34    35    36    37
    61    77   123   702   848  2944  3626  4563  5225  5625  6092  7676
 10432 12440 25767]


### Image

In [None]:
!pip install imageio imageio-ffmpeg

Collecting imageio
  Downloading imageio-2.37.0-py3-none-any.whl.metadata (5.2 kB)
Collecting imageio-ffmpeg
  Downloading imageio_ffmpeg-0.6.0-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Downloading imageio-2.37.0-py3-none-any.whl (315 kB)
Downloading imageio_ffmpeg-0.6.0-py3-none-manylinux2014_x86_64.whl (29.5 MB)
Installing collected packages: imageio-ffmpeg, imageio
Successfully installed imageio-2.37.0 imageio-ffmpeg-0.6.0


In [22]:
import imageio.v2 as imageio
import numpy as np
import tqdm.auto as tqdm  # progress bar used in create_animation_with_padding
import glob
import os
import re

def create_animation_with_padding(input_folder, output_file, fps=10):
    """
    Creates a GIF or MP4 from a folder of PNG images with varying sizes.

    It finds the largest image dimensions and pads smaller images with black
    borders to create uniformly-sized frames, preventing distortion.
    
    Args:
        input_folder (str): Path to the folder containing the PNG images.
        output_file (str): Path for the output animation file. The extension 
                           (.gif or .mp4) determines the format.
        fps (int): Frames per second for the output animation.
    """
    search_path = os.path.join(input_folder, '*.png')
    filenames = glob.glob(search_path)

    if not filenames:
        print(f"Error: No PNG files found in '{input_folder}'")
        return

    # Sort files naturally to handle numbers like 'plot_2.png' before 'plot_10.png'
    def natural_sort_key(s):
        return [int(text) if text.isdigit() else text.lower() for text in re.split('([0-9]+)', s)]
    
    filenames.sort(key=natural_sort_key)

    # --- Step 1: Scan all images to find the maximum dimensions ---
    print("Scanning images to determine maximum dimensions...")
    max_height = 0
    max_width = 0
    
    # Also determine the color channel configuration (RGB vs RGBA) from the first image
    first_image = imageio.imread(filenames[0])
    print(f"First image shape: {first_image.shape}")
    channels = first_image.shape[2] if first_image.ndim == 3 else 1
    dtype = first_image.dtype

    for filename in filenames:
        img = imageio.imread(filename)
        # shape[:2] is (height, width) for both grayscale and color images
        h, w = img.shape[:2]
        max_height = max(max_height, h)
        max_width = max(max_width, w)

    print(f"All frames will be resized to {max_width}x{max_height} pixels.")

    # --- Step 2: Create the animation with padded frames ---
    print(f"Creating animation at '{output_file}'...")
    with imageio.get_writer(output_file, fps=fps) as writer:
        for filename in tqdm.tqdm(filenames, desc="Processing frames"):
            # Read the original image; atleast_3d gives grayscale images a channel axis
            img = np.atleast_3d(imageio.imread(filename))
            
            # Create a new black canvas with the target dimensions
            padded_frame = np.zeros((max_height, max_width, channels), dtype=dtype)
            
            # Calculate offsets to center the image
            h, w, _ = img.shape
            y_offset = (max_height - h) // 2
            x_offset = (max_width - w) // 2
            
            # Paste the original image onto the canvas
            padded_frame[y_offset:y_offset+h, x_offset:x_offset+w, :] = img
            
            # Append the padded frame to the video
            writer.append_data(padded_frame)

    print("Animation created successfully!")

def create_animation(input_folder, output_file, fps=10):
    """
    Creates a GIF or MP4 from a folder of PNG images.

    The function assumes filenames can be sorted naturally (e.g., frame_1.png, frame_2.png, ... frame_10.png).

    Args:
        input_folder (str): Path to the folder containing the PNG images.
        output_file (str): Path for the output animation file. The extension (.gif or .mp4) determines the format.
        fps (int): Frames per second for the output animation.
    """
    search_path = os.path.join(input_folder, '*.png')
    filenames = glob.glob(search_path)

    if not filenames:
        print(f"Error: No PNG files found in the directory: {input_folder}")
        return

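    # Sort by the trailing number in each filename, compared numerically so
    # that frame_2 precedes frame_10; the path breaks ties and handles names without digits
    def frame_number(path):
        stem = os.path.splitext(os.path.basename(path))[0]
        digits = re.search(r'(\d+)$', stem)
        return (int(digits.group(1)) if digits else -1, path)

    filenames.sort(key=frame_number)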

    print(f"Found {len(filenames)} images. Creating animation at {output_file}...")

    # Create the animation
    with imageio.get_writer(output_file, fps=fps) as writer:
        for filename in filenames:
            image = imageio.imread(filename)
            writer.append_data(image)

    print("Animation created successfully!")
    
# # --- Example Usage ---
# # Replace this with the actual path to your saved plots
# image_folder = '/home/selvani/MAP/pynbody/stellarhalo_trace_aw/merge_plots/3' 

# # Define the output file path (can be .gif or .mp4)
# output_path = os.path.join(image_folder, 'merger_animation.mp4')

# # Create the animation
# create_animation_with_padding(image_folder, output_path, fps=5)
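
The natural-sort key used in `create_animation_with_padding` matters whenever frame numbers are not zero-padded. A small standalone check of the difference, using made-up filenames purely for illustration:

```python
import re

def natural_sort_key(s):
    # Split into digit and non-digit runs; digit runs are compared as integers
    return [int(text) if text.isdigit() else text.lower()
            for text in re.split('([0-9]+)', s)]

names = ['plot_10.png', 'plot_2.png', 'plot_1.png']   # hypothetical frame names

lexicographic = sorted(names)                      # plain string comparison
natural = sorted(names, key=natural_sort_key)      # numeric-aware comparison

print(lexicographic)
print(natural)
```

With plain string sorting `plot_10.png` lands before `plot_2.png`, which would put frames out of order in the animation; the natural key keeps them in numeric order.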

In [6]:
import tqdm.auto as tqdm

In [None]:
image_folder = os.path.join(outfile_dir, 'merge_plots')

'/home/selvani/MAP/pynbody/stellarhalo_trace_aw/merge_plots'

In [None]:
folders = ['2', '2zoom', '2morezoom']
for folder in folders:
# for folder in os.listdir(image_folder):
    folder_path = os.path.join(image_folder, folder)
    base_name = os.path.basename(folder_path)
    print(f'In folder {folder}')
    output_gif_path = os.path.join(image_folder, f'4096_{base_name}_animation.gif')
    output_mp4_path = os.path.join(image_folder, f'4096_{base_name}_animation.mp4')

    create_animation(folder_path, output_gif_path, fps=12)
    create_animation_with_padding(folder_path, output_mp4_path, fps=6)

In folder 5
Found 42 images. Creating animation at /home/selvani/MAP/pynbody/stellarhalo_trace_aw/merge_plots/4096_5_animation.gif...
Animation created successfully!
Scanning images to determine maximum dimensions...
First image shape: (1841, 5390, 4)
All frames will be resized to 5390x1841 pixels.
Creating animation at '/home/selvani/MAP/pynbody/stellarhalo_trace_aw/merge_plots/4096_5_animation.mp4'...


Processing frames:   0%|          | 0/42 [00:00<?, ?it/s]



Animation created successfully!
In folder 5zoom
Found 42 images. Creating animation at /home/selvani/MAP/pynbody/stellarhalo_trace_aw/merge_plots/4096_5zoom_animation.gif...
Animation created successfully!
Scanning images to determine maximum dimensions...
First image shape: (1841, 5390, 4)
All frames will be resized to 5390x1841 pixels.
Creating animation at '/home/selvani/MAP/pynbody/stellarhalo_trace_aw/merge_plots/4096_5zoom_animation.mp4'...


Processing frames:   0%|          | 0/42 [00:00<?, ?it/s]



Animation created successfully!
In folder 5morezoom
Found 42 images. Creating animation at /home/selvani/MAP/pynbody/stellarhalo_trace_aw/merge_plots/4096_5morezoom_animation.gif...
Animation created successfully!
Scanning images to determine maximum dimensions...
First image shape: (1841, 5390, 4)
All frames will be resized to 5390x1841 pixels.
Creating animation at '/home/selvani/MAP/pynbody/stellarhalo_trace_aw/merge_plots/4096_5morezoom_animation.mp4'...


Processing frames:   0%|          | 0/42 [00:00<?, ?it/s]



Animation created successfully!
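
The reported 5390x1841 frame size is not a multiple of 16 in either dimension. As far as I know, imageio's ffmpeg writer defaults to `macro_block_size=16` and rescales frames that do not comply, so padding the canvas up to the next multiple of 16 may avoid that rescaling for the MP4 output. A sketch of the arithmetic only, with `round_up_to_multiple` as a hypothetical helper that is not part of the notebook:

```python
def round_up_to_multiple(n, block=16):
    """Smallest multiple of `block` that is greater than or equal to n."""
    return -(-n // block) * block   # ceiling division, then scale back up

height, width = 1841, 5390          # frame size reported in the output above
padded_width = round_up_to_multiple(width)
padded_height = round_up_to_multiple(height)
print(padded_width, padded_height)
```

These values could be used as the canvas size in `create_animation_with_padding` in place of the raw maxima, assuming the block-size behaviour described here holds for the installed imageio version.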
