ABE is written for studying the co-evolution of supermassive black holes (SBHs) with their host galaxies, looking in particular at the effect of AGN feedback. It uses snapshots of the MassiveBlack-II (MB-II) simulation, which is based on GADGET-3, to explore SBHs and their environment. A single-redshift snapshot of this simulation is 1 TB in size. The Python scripts, implemented with the Message Passing Interface (MPI), help to handle the big data of the MB-II simulation snapshots.
will be added ...
A snapshot is the set of files saved during a Gadget simulation. It is a picture of the simulated universe at a particular cosmic time. In the MB-II simulation, a single snapshot comprises 1024 files of 1 GB each. The MB-II simulation contains particles such as dark matter, black holes (hereafter BHs), gas particles, stars and so on. If you want to know more about the snapshot format, read the Gadget-2 user guide. Each file of the snapshot contains on average 70 BHs. But if you want to find the gas particles around one of these BHs within a certain range, say 25 kpc, you need to look into all of the other 1023 files as well. In that sense, an individual snapshot file is not complete for any particular region of the simulation; only the snapshot as a whole is.
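The selection described above can be sketched for one BH and one snapshot file as follows. This is only an illustration, not ABE's actual code: the function name, the toy positions and the spherical selection are assumptions (ABE may select a cube instead), and periodic boundaries are ignored.

```python
import numpy as np

def gas_within_radius(bh_pos, gas_pos, radius=25.0):
    """Return a boolean mask of gas particles within `radius` of one BH.

    bh_pos  : (3,) array, BH position (same length unit as gas_pos)
    gas_pos : (N, 3) array, gas positions read from ONE snapshot file
    """
    # Euclidean distance from the BH to every gas particle in this file
    dist = np.linalg.norm(gas_pos - bh_pos, axis=1)
    return dist <= radius

# Toy data standing in for one snapshot file (values are made up)
bh_pos = np.array([100.0, 100.0, 100.0])
gas_pos = np.array([[101.0, 100.0, 100.0],   # 1 kpc away, selected
                    [100.0, 130.0, 100.0],   # 30 kpc away, rejected
                    [110.0, 110.0, 110.0]])  # about 17.3 kpc away, selected
mask = gas_within_radius(bh_pos, gas_pos)
```

Because the gas particles near a BH can sit in any of the 1024 files, a mask like this has to be evaluated once per file and the selected particles accumulated.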
ABE-I deals with the big data in three sequential procedures. First, it scans the entire data set, saves what it needs, and catalogues information about the saved data. Next comes the high-performance computation: using the saved information, ABE-I analyses the entire data set in depth, but this time it looks only where it needs to look. The three main scripts are CatalogMaker, CatalogAnalyser and DataAnalyser.
The other two Python scripts, which are used for formatting the outputs of the main scripts, are Combiner and DataEditor.
See the detailed Flowchart here.
CatalogMaker reads the snapshot files iteratively and saves the BHs, the surrounding gas particles, and their properties. The user has the option to exclude BHs whose mass is below a certain cutoff value. It saves the data in Python pickle binary format. The outputs of CatalogMaker are explained below.
1.data.p: Python dictionary, keys are BH IDs, and the corresponding values are NumPy arrays having the gas particles' internal energy, density, position-x, position-y, position-z, smoothing length and electron abundance.
2.mass.p: Python dictionary, keys are BH IDs and the corresponding values are the BH masses.
3.bh_cat.p: Python array having BH IDs and their positions. The data type of this array is string (to convert the data type to float, see lines 140-143 in CatalogAnalyser_MPI.py).
4.snap_cat.p: Python dictionary, keys are the filenames and the corresponding values are the arrays of BH IDs in that file.
5.acc.p: Python dictionary, keys are BH IDs and the corresponding values are their accretion rates.
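Since bh_cat.p is stored as an array of strings, it has to be converted before any arithmetic on the positions. A minimal sketch of such a conversion is shown here; the column layout (ID, x, y, z) and the toy values are assumptions, and this is not a copy of the lines in CatalogAnalyser_MPI.py.

```python
import numpy as np

# Hypothetical layout: one row per BH, columns [ID, x, y, z], all stored as strings
bh_cat = np.array([["1001", "10.5", "20.0", "30.25"],
                   ["1002", "11.0", "21.5", "31.0"]])

bh_ids = bh_cat[:, 0].astype(np.int64)      # IDs back to integers
bh_positions = bh_cat[:, 1:].astype(float)  # positions back to floats
```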
To run this code, use
mpiexec -n 8 python CatalogMaker_MPI.py
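As an illustration of how the snapshot files could be shared among the processes started by mpiexec, here is a minimal sketch of an even, contiguous split. The function name and the file names are hypothetical, and the README does not state how CatalogMaker actually assigns files; the even split is only suggested by the pro_cm comment in abe.ini, which requires the number of cores to divide the number of snapshot files.

```python
def files_for_rank(filenames, rank, size):
    """Return the contiguous slice of `filenames` handled by process `rank` of `size`."""
    if len(filenames) % size != 0:
        raise ValueError("number of files must be divisible by number of processes")
    per_rank = len(filenames) // size
    return filenames[rank * per_rank:(rank + 1) * per_rank]

# Hypothetical file names standing in for the 1024 snapshot files
filenames = ["snap.%d" % i for i in range(1024)]
chunk = files_for_rank(filenames, rank=3, size=8)
```

With mpi4py, `rank` and `size` would typically come from `MPI.COMM_WORLD.Get_rank()` and `MPI.COMM_WORLD.Get_size()`.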
The output of CatalogMaker is dumped by every processor. That is, if you use 8 processors to run CatalogMaker, there will be 8 files for each of the files listed above. To combine them into single files, you need to run Combiner.
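The idea of merging the per-processor dumps can be sketched as follows for the dictionary outputs. This is not Combiner's actual code: the function name and the dump file names are hypothetical, and the sketch assumes that each BH ID appears in only one dump.

```python
import os
import pickle
import tempfile

def combine_dumps(paths):
    """Merge per-process pickle dumps (each a dict) into one dict."""
    combined = {}
    for path in paths:
        with open(path, "rb") as reader:
            combined.update(pickle.load(reader))
    return combined

# Write two toy dumps, as two processes might have done (names are made up)
tmpdir = tempfile.mkdtemp()
parts = [{1001: 2.5e7}, {1002: 8.0e8}]
paths = []
for rank, part in enumerate(parts):
    path = os.path.join(tmpdir, "mass_%d.p" % rank)
    with open(path, "wb") as writer:
        pickle.dump(part, writer)
    paths.append(path)

mass = combine_dumps(paths)
```

An array output such as bh_cat.p would need concatenation rather than a dictionary update.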
See the detailed Flowchart here.
CatalogAnalyser reads the snapshot files iteratively and, for each file, checks which of the BHs in bh_cat.p have gas particles in it. If it finds gas particles, it updates the data for those BHs in data.p and then goes to the next file. The updated data is saved after the execution as dataUpd.p.
To run this code, use
mpiexec -n 8 python CatalogAnalyser_MPI.py
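The per-file update performed by CatalogAnalyser can be pictured with the following sketch. It is a simplified stand-in, not the script itself: the function name is hypothetical, bh_cat is reduced to a dictionary of positions, the selection is assumed to be spherical, and only two gas properties are carried.

```python
import numpy as np

def update_data(data, bh_cat, gas_pos, gas_props, radius=25.0):
    """Append gas particles from one snapshot file to every BH they lie close to.

    data      : dict, BH ID -> (n_props, n_particles) array
    bh_cat    : dict, BH ID -> (3,) position (simplified stand-in for bh_cat.p)
    gas_pos   : (N, 3) positions of the gas particles in the current file
    gas_props : (n_props, N) properties of the same particles
    """
    for bh_id, bh_pos in bh_cat.items():
        near = np.linalg.norm(gas_pos - bh_pos, axis=1) <= radius
        if near.any():
            data[bh_id] = np.concatenate([data[bh_id], gas_props[:, near]], axis=1)
    return data

# One BH that already has one particle, then two toy "files" are scanned
data = {7: np.zeros((2, 1))}
bh_cat = {7: np.array([0.0, 0.0, 0.0])}
file_1 = (np.array([[1.0, 0.0, 0.0], [100.0, 0.0, 0.0]]), np.array([[1.0, 2.0], [3.0, 4.0]]))
file_2 = (np.array([[0.0, 2.0, 0.0]]), np.array([[5.0], [6.0]]))
for gas_pos, gas_props in (file_1, file_2):
    data = update_data(data, bh_cat, gas_pos, gas_props)
```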
DataEditor finds the temperature of the gas particles using their internal energy and electron abundance. It removes the internal energy and electron abundance arrays from dataUpd.p, then adds a temperature array and saves the result as dataEdt.p.
dataEdt.p: Python dictionary, keys are BH IDs, and the corresponding values are NumPy arrays having the gas particles' temperature, density, position-x, position-y, position-z and smoothing length.
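For reference, a commonly used conversion from Gadget internal energy and electron abundance to temperature is sketched below. The README does not say which formula DataEditor uses, so treat this as an assumption: it takes the internal energy in Gadget's default unit of (km/s)^2, a hydrogen mass fraction of 0.76 and an adiabatic index of 5/3. The function name is hypothetical.

```python
import numpy as np

GAMMA = 5.0 / 3.0          # adiabatic index of a monatomic ideal gas
X_H = 0.76                 # assumed hydrogen mass fraction
M_PROTON = 1.6726e-24      # proton mass in g
K_BOLTZMANN = 1.3807e-16   # Boltzmann constant in erg/K
U_TO_CGS = 1.0e10          # (km/s)^2 -> (cm/s)^2, assuming Gadget default units

def gas_temperature(internal_energy, electron_abundance):
    """Temperature in K from internal energy per unit mass and electron abundance."""
    # Mean molecular weight in units of the proton mass
    mu = 4.0 / (1.0 + 3.0 * X_H + 4.0 * X_H * electron_abundance)
    return (GAMMA - 1.0) * internal_energy * U_TO_CGS * mu * M_PROTON / K_BOLTZMANN

u = np.array([1.0, 1.0])    # toy internal energies
ne = np.array([0.0, 1.0])   # neutral gas vs. one free electron per hydrogen atom
temperature = gas_temperature(u, ne)
```

At fixed internal energy, a higher electron abundance lowers the mean molecular weight and therefore the temperature.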
See the detailed Flowchart here.
DataAnalyser uses dataEdt.p, mass.p, and acc.p. BHs are binned according to their masses with a user-defined bin size. After binning, it finds the average flux of each BH, the average flux of the BHs in each bin, and the stacked maps of the BHs in the bins. It saves the following data files:
1.stack.p: Python dictionary, keys are bin numbers and the corresponding values are stacked map arrays of shape (100, 100).
2.tab.p: Python dictionary, keys are bin numbers and the corresponding values are arrays. Each array has the average flux, average mass, and average accretion rate of that bin.
3.bin.p: Python dictionary, keys are bin numbers and the corresponding values are the lists of BHs in the bins.
4.lum.p: Python dictionary, keys are BH IDs and the corresponding values are the average fluxes of the BHs.
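The mass binning behind bin.p can be sketched like this. It is only an illustration: the function name is hypothetical, and the choice to start bin 0 at a lower mass of 1e7 (the example lower_cutoff_value in abe.ini) is an assumption about where the bin edges sit.

```python
import numpy as np

def bin_by_mass(mass, bin_size=0.2, m_min=1.0e7):
    """Group BH IDs into bins of width `bin_size` in log10(mass).

    mass : dict, BH ID -> mass (as in mass.p)
    Returns a dict, bin number -> list of BH IDs (as in bin.p).
    """
    bins = {}
    for bh_id, m in mass.items():
        # Bin 0 covers log10(m_min) up to log10(m_min) + bin_size, and so on
        bin_no = int(np.floor((np.log10(m) - np.log10(m_min)) / bin_size))
        bins.setdefault(bin_no, []).append(bh_id)
    return bins

mass = {1: 1.1e7, 2: 1.3e7, 3: 2.0e8, 4: 5.0e9}   # toy masses
bins = bin_by_mass(mass)
```

Per-bin averages such as those in tab.p would then be taken over the BH IDs listed for each bin.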
After the execution of DataAnalyser, you need to run Combiner again to combine the lum.p dumps. Combine them using
python Combiner lum
The argument 'lum' is mandatory.
How to use ABE.
Clone this GitHub repository using,
git clone https://github.com/antolonappan/ABE-I.git
Apart from the above-mentioned scripts, you will find one more python script, initial.py and a configuration file, abe.ini.
initial.py does all the initial setup for running ABE. Inside the root folder specified in the configuration file, it creates a folder named after the date of the run, which helps the user to identify previous runs. If ABE is run multiple times in a day, it creates folders named with date+time. It also edits abe.ini to specify the output and log directories.
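The folder naming described above could look roughly like the following sketch. The function name is hypothetical, the day-month-year order is inferred from the example path Runs/9-4-2018, and the date+time format is an assumption, since the README does not specify it.

```python
import os
import tempfile
from datetime import datetime

def run_folder(root_dir, now=None):
    """Create and return a run folder: date-named, or date+time if that exists."""
    now = now or datetime.now()
    date_name = "%d-%d-%d" % (now.day, now.month, now.year)   # e.g. 9-4-2018
    path = os.path.join(root_dir, date_name)
    if os.path.exists(path):
        # Assumed date+time format; the real one is not given in this README
        path = os.path.join(root_dir, date_name + now.strftime("_%H-%M-%S"))
    os.makedirs(path)
    return path

root = tempfile.mkdtemp()               # stands in for root_dir from abe.ini
stamp = datetime(2018, 4, 9, 14, 30, 0)
first = run_folder(root, stamp)         # first run of the day
second = run_folder(root, stamp)        # second run on the same day
```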
[inputs]
snapshot = /home/snapshot        # snapshot directory path, 'don't' put '/' at the end.

[outputs]
root_dir = Runs                  # Output folder name. If it doesn't exist, initial.py creates it

[misc]
mode_run = 'automated'           # Mode of running, 'automated' or 'individual'.
agn_matrix_x = 100               # X and
agn_matrix_y = 100               # Y axis no. of pixels resolution
box = 25                         # Selection range of gas particles in Kpc.
mass_cutoff = T                  # 'T' For taking a mass cutoff, if 'F' ABE takes full BHs for analysis
lower_cutoff_value = 1e7         # If mass_cutoff is True then Specify the cutoff values here
upper_cutoff_value = 1e10
no_of_cores = 4                  # No. of cores using for running ABE.
delete_dump = T                  # If True ABE won't keep any dump files of data, bh_cat, snap_cat, mass, acc, lum
bin = 0.2                        # Binning size in log10 scale.

[live]
# This is part is edited by initial.py and other main programs. But if you are running
# ABE without abe.sh then you need to specify the output and log directories here.
cat_mak_out = Runs/9-4-2018/out/CatMak   # Output directory of CatalogMaker
cat_anl_out = Runs/9-4-2018/out/CatAnl   # Output directory of CatalogAnalyser
dat_anl_out = Runs/9-4-2018/out/DatAnl   # Output directory of DataAnalyser
cat_mak_log = Runs/9-4-2018/log/CatMak   # Log directory of CatalogMaker
cat_anl_log = Runs/9-4-2018/log/CatAnl   # Log directory of CatalogAnalyser
dat_anl_log = Runs/9-4-2018/log/DatAnl   # Log directory of DataAnalyser
pro_cm = 4                       # If no_of_cores is not a total divisor of No. of snapshot files
                                 # initial.py finds the total divisor and assign it here.
                                 # So CatalogMaker uses only this much cores. If no_of_cores is a
                                 # total divisor then pro_cm = no_of_cores
last_program = DataEditor.py     # After Executing every program ABE saves the name of the program
                                 # here. This helps the user to run scripts in the correct sequence
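A file in this style can be read with Python's standard configparser, and the pro_cm rule from the comments can be expressed in a few lines. This is a sketch under assumptions, not ABE's own code: the function names are hypothetical, and picking the largest divisor that does not exceed no_of_cores is one plausible reading of "finds the total divisor".

```python
import configparser

def read_config(text):
    """Parse ini text whose values carry trailing '# ...' comments."""
    # inline_comment_prefixes makes the parser strip the trailing comments
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    return parser

def usable_cores(n_files, no_of_cores):
    """Largest number of cores <= no_of_cores that divides n_files evenly."""
    for n in range(no_of_cores, 0, -1):
        if n_files % n == 0:
            return n

# A shortened stand-in for abe.ini, with 6 cores requested instead of 4
sample = """
[misc]
box = 25          # Selection range of gas particles in Kpc.
no_of_cores = 6   # No. of cores using for running ABE.
bin = 0.2         # Binning size in log10 scale.
"""
cfg = read_config(sample)
box = cfg.getfloat("misc", "box")
pro_cm = usable_cores(1024, cfg.getint("misc", "no_of_cores"))
```

Here 6 does not divide 1024, so the sketch falls back to 4 cores for CatalogMaker.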
After configuring abe.ini you can run ABE in two modes:
The shell script abe.sh automates the entire process. It creates a folder 'RunStatus' where it saves the outputs and standard errors of the Python scripts, so the user can see any errors that occurred while running the scripts. Finally, it tells you how much time it took to complete the analysis. To run abe.sh, use
chmod +x abe.sh   # first time only, for making it executable
./abe.sh
For running ABE in manual mode, one needs to edit abe.ini and run initial.py. After that, run the scripts in the order specified in section ABE-I.
Logs: Status of the Process
ABE logs detailed running statuses for all main scripts.
After reading and saving required information from snapshot files, it logs the filename of the finished file. All cores log separately in log files starting with 'snap'.
This script has two kinds of log files: logs starting with 'snapshot' record information about finished snapshot files, and logs starting with 'process' record the status of every core.
The log files of DataAnalyser are 'process', 'MassPopulation' and 'Luminosity'. 'Process' is logged by the parent core when it initiates different processes. 'MassPopulation' logs the details of the binning: the number of bins, the ranges of the bins that are excluded from the analysis, and so on. 'Luminosity' is logged by every core and contains the status of execution.
Untar Example.tar.gz to see an automated run of ABE. For this run, ABE used only 4 of the 1024 snapshot files. You can find all logs and data files inside. To read any data or catalog file with the extension '.p', open a Python terminal and use the following lines.
import pickle as pl

# Open the pickle file in binary mode and load the stored object
with open('dataEdt.p', 'rb') as reader:
    data = pl.load(reader)