## How to find FG candidates from a sample of halos
This is the standard approach, which uses masks. In general, all you need is `target_idx`, an array of halo indices (or a single halo index). These indices are not necessarily the same as `forest['halo_id']`, because a halo's index depends on the size of the forest being used, whereas the `forest['halo_id']` values are defined for the full Last Journey (LJ) forest. We usually look only at a subset of the full forest (i.e. halos with z=0 masses in one of our narrow mass bins), so our forest is usually much smaller. You should therefore compute the halo indices for your specific forest(s) each time you begin a new notebook.
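To illustrate why positional indices must be recomputed for each forest, here is a minimal sketch using toy arrays. The ID values, the `keep` selection, and all variable names are made up for illustration and do not come from the Last Journey data.

```python
import numpy as np

# Hypothetical IDs for a small "full" forest (made-up values).
full_halo_id = np.array([10, 11, 12, 13, 14, 15])

# A smaller forest built by keeping only some halos (made-up selection).
keep = np.array([False, False, True, True, False, True])
sub_halo_id = full_halo_id[keep]  # IDs 12, 13, 15

# The same halo (ID 13) sits at a different position in each forest.
idx_full = np.nonzero(full_halo_id == 13)[0]
idx_sub = np.nonzero(sub_halo_id == 13)[0]
print("index in full forest:", idx_full)
print("index in sub-forest: ", idx_sub)
```

The halo with ID 13 lands at a different position in the smaller forest, which is why indices saved from one forest should not be reused with another.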

In [2]:
import haccytrees.mergertrees
import numpy as np
import pandas as pd
from astropy.cosmology import FlatLambdaCDM
import astropy.units as u
%load_ext line_profiler
%reload_ext autoreload
%autoreload 1
%aimport help_func_haccytrees

The line_profiler extension is already loaded. To reload it, use:
  %reload_ext line_profiler


### Read in merger trees (may take a few minutes)

In [3]:
%%time
forest, progenitor_array = haccytrees.mergertrees.read_forest(
    '/data/a/cpac/mbuehlmann/LastJourney/forest/target_forest_aurora.hdf5',
    'LastJourney', nchunks=1, chunknum=0,
    include_fields = ["tree_node_mass", "snapnum", "fof_halo_tag", "sod_halo_cdelta", "fof_halo_center_x", "fof_halo_center_y", "fof_halo_center_z"]
)

CPU times: user 27.7 s, sys: 49.1 s, total: 1min 16s
Wall time: 1min 16s


### Pick out halos we want to examine (those in our narrow mass bins)

In [5]:
"""
About helper function: help_func_haccytrees.make_masks()

Parameters
----------
my_forest: dict
    A forest generated by `haccytrees.mergertrees.read_forest()`
bins: list of lists (opt)
    default: [[1e13, 10**13.05], [10**13.3, 10**13.35], [10**13.6, 10**13.65]]
pre_masked_z0: boolean (opt)
    default: False
    Use if my_forest is already constrained to snapnum of 100 (this is unusual)
    
Returns
-------
masks: list of lists
    default: three masks, one for each narrow mass bin
    Each list is a boolean mask of shape (nhalos,), i.e. the same length as one column of my_forest.
    You can use np.nonzero(masks[i])[0] to return the array of indices (`halo_idxs`) of halos in your sample (for the ith mass bin).
    Then, use something like `forest[key][halo_idxs]` to find the value of any key stored in `forest` for the halos in your sample.
        e.g. `forest['tree_node_mass'][halo_idxs]` returns halo masses for the halos in your sample
"""

halo_masks = help_func_haccytrees.make_masks(forest) # Forest should already only contain values in these bins
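The mask logic described in the docstring can be sketched roughly as follows. This is not the actual `help_func_haccytrees.make_masks()` implementation: the function name `make_masks_sketch`, the half-open bin edges, and the assumption that z=0 corresponds to `snapnum == 100` are illustrative choices, and the toy forest values are made up.

```python
import numpy as np

def make_masks_sketch(my_forest, bins, z0_snapnum=100):
    """Hypothetical stand-in: one boolean mask per mass bin, z=0 halos only."""
    at_z0 = my_forest["snapnum"] == z0_snapnum
    mass = my_forest["tree_node_mass"]
    # Assumption: bins are half-open, [lo, hi)
    return [at_z0 & (mass >= lo) & (mass < hi) for lo, hi in bins]

# Toy forest: a dict of equal-length arrays (made-up values).
toy_forest = {
    "snapnum": np.array([100, 99, 100, 100, 100]),
    "tree_node_mass": np.array([1.05e13, 1.05e13, 2.1e13, 4.1e13, 9.0e12]),
}
bins = [[1e13, 10**13.05], [10**13.3, 10**13.35], [10**13.6, 10**13.65]]

toy_masks = make_masks_sketch(toy_forest, bins)
for i, mask in enumerate(toy_masks):
    halo_idxs = np.nonzero(mask)[0]
    print("bin", i, "indices:", halo_idxs,
          "masses:", toy_forest["tree_node_mass"][halo_idxs])
```

Each mask has the same length as a forest column, so `np.nonzero(mask)[0]` gives indices that can be used directly with any key of that same forest.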

### Find FG candidates within this sample
**Important:** at the moment, the output `fgs` includes only "pure FG candidates," meaning those that are *not* also QH candidates. In the paper, however, the FG sample (purple lines) includes both "pure FG candidates" and QH candidates. If you are comparing statistics to something in the paper, be sure to use the *sum* of the number of `fgs` and `qhs` below.

In [8]:
%%time
# Go find yourself some fossil groups!
for i, this_halo_mask in enumerate(halo_masks):
    target_idx = np.nonzero(this_halo_mask)[0]
    mainbranch_index, mainbranch_masses = help_func_haccytrees.get_branches(target_idx, forest)
    mainbranch_mergers = help_func_haccytrees.get_mainbranch_mergers(forest, progenitor_array, mainbranch_index)
    major_mergers = help_func_haccytrees.get_major_mergers(mainbranch_mergers)
    lmm_redshifts = help_func_haccytrees.get_lmms(major_mergers)
    fgs, qhs, mrich = help_func_haccytrees.find_specials(forest, mainbranch_index, major_mergers, lmm_redshifts, target_idx)
    print("bin ", str(i))
    print(len(target_idx), " halos")
    print(len(fgs), " fossils")
    print(len(qhs), " qhs")
    print(len(mrich), " merger rich halos\n")

bin  0
269358  halos
33569  fossils
4171  qhs
0  merger rich halos

bin  1
36181  halos
1364  fossils
9  qhs
0  merger rich halos

bin  2
2454  halos
13  fossils
0  qhs
0  merger rich halos

CPU times: user 3.08 s, sys: 559 ms, total: 3.63 s
Wall time: 2.21 s
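For comparison with the paper's FG sample, the counts printed above can be combined per bin. The following sketch simply copies the printed numbers into a dictionary; the names `counts`, `paper_fg`, and `paper_fg_fraction` are illustrative and not part of the helper module.

```python
# Counts copied from the cell output above.
counts = {
    0: {"halos": 269358, "fgs": 33569, "qhs": 4171},
    1: {"halos": 36181, "fgs": 1364, "qhs": 9},
    2: {"halos": 2454, "fgs": 13, "qhs": 0},
}

# Paper-style FG sample = pure FG candidates + QH candidates.
paper_fg = {b: c["fgs"] + c["qhs"] for b, c in counts.items()}
paper_fg_fraction = {b: paper_fg[b] / c["halos"] for b, c in counts.items()}

for b in counts:
    print(f"bin {b}: {paper_fg[b]} FG candidates (FG + QH), "
          f"fraction of halos = {paper_fg_fraction[b]:.4f}")
```

Inside the loop in the cell above, the equivalent per-bin quantity would be `len(fgs) + len(qhs)`.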
