Modeling synthetic observations of the sky is an important task in modern cosmology. The general hierarchy of necessary data products is as follows:
- Develop and run simulations of structure formation within a cosmological volume (from this we obtain the mass distribution of the contents of the universe).
- Place an "observer" at some location within the simulation volume, and construct a lightcone, which is a four dimensional hypersurface on which lie all spacetime events output by the simulation that are seen by the observer at z=0 (we then have the mass distribution that can actually be seen in the sky today).
- Finally, extract a particular field of view from this lightcone that a telescope would actually observe during an exposure.
The code distributed in this repository performs the final step above: slicing a rectangular cone, defined by some angular bounds with respect to an observer, out of a cosmological lightcone catalog.
This project utilizes GenericIO to read simulation lightcone outputs, as well as OpenMP and MPI. The code is intended to be run on the system environments at the ALCF; otherwise, its Makefile will need to be edited. Specifically, check the `INCLUDES` to ensure that GenericIO is properly linked. The next section of this readme assumes that HACC, or some other source of GenericIO, is already built. If using an ALCF machine, the Makefiles present in the repo already point to the correct locations.
## Installation

If running at ALCF, first clone the repo

```
git clone https://github.com/jhollowed/cosmo-cutout.git
```

and make the following environment changes: edit your `~/.soft` to include these lines

```
+mvapich2-2.1-intel
+intel-composer-xe-2015
@default
```

Next, compile the source code with the Makefile matching your machine:

```
make -f Makefile.mira
```

or

```
make -f Makefile.cooley
```
## Running the cutout

Notation used below:
- θ: the coaltitude coordinate
- ϕ: the azimuthal coordinate
- x, y, z: comoving Cartesian coordinates
- `lc_cutout`: the executable resulting from following the installation instructions
Two use cases are supported:
### Use case 1: Constant angular bounds

Define the θ and ϕ bounds explicitly:

```
lc_cutout <input lightcone directory> <output directory> <min redshift> <max redshift> --theta <theta_center> <d_theta> --phi <phi_center> <d_phi>
```
- `input lightcone directory`: the location of the top-level directory of a simulation lightcone. The directory structure is expected to match that described in section 4.5 (Figure 7) of *Creating Lightcones in HACC*, which contains step-wise subdirectories
- `output directory`: where to save the result of the cutout. A new subdirectory of the form `lcCutoutXXX` will be created at this location for each step `XXX` included in the calculation
- `min redshift`: allows the user to begin the cutout construction at some distance away from the observer position (the origin) in redshift space. Setting this parameter to a nonzero value is intended for the case that the user is breaking up the cutout procedure across separate jobs
- `max redshift`: controls the depth of the cutout in redshift space (limited, of course, by the maximum redshift of the input lightcone catalog)
- `theta_center`: the θ coordinate of the center of the desired cutout field of view
- `d_theta`: the distance from `theta_center` to the edge of the field of view
- `phi_center`: the ϕ coordinate of the center of the desired cutout field of view
- `d_phi`: the distance from `phi_center` to the edge of the field of view
That is, the result of the cutout will be a sky area that spans

(θ_center − dθ) < θ < (θ_center + dθ)

(ϕ_center − dϕ) < ϕ < (ϕ_center + dϕ)

and the redshift range

(min redshift) < z < (max redshift)
The expected angular units are DEGREES. The `--theta` and `--phi` flags can be replaced with `-t` and `-p`. The `<min redshift>` and `<max redshift>` arguments can also be replaced with snapshot numbers rather than redshifts. The bounds, in either case, should be commensurate with the argument order (that is, if using snapshots, `<min redshift>` should be replaced with the larger time step number; e.g. redshift bounds [0.0, 1.0] are equivalent to snapshot bounds [499, 247]). By default, if both `<min redshift>` and `<max redshift>` are larger than 20, it is assumed that the user intends the bounds to be snapshot numbers.
### Use case 2: Nonlinear angular bounds

Allow the θ and ϕ bounds to be computed internally to obtain a cutout of a certain width (box length), in arcminutes, centered on a certain Cartesian position, (x₀, y₀, z₀) Mpc/h (intended for making cutouts around specific simulation objects, like halos):

```
lc_cutout <input lightcone directory> <output directory> <min redshift> <max redshift> --halo <x_0> <y_0> <z_0> --boxLength <box length>
```
- `input lightcone directory`, `output directory`, `min redshift`, `max redshift`: see descriptions above
- `x_0`, `y_0`, `z_0`: the comoving Cartesian position, in Mpc/h, of the object on which to center the cutout
- `box length`: the angular width of the fov around the object of interest, in arcmin (letting this value be denoted B, then dθ, as defined above, is B/2)

The `--halo` and `--boxLength` flags can be replaced with `-h` and `-b`.
### Multiple objects of interest
If one has many objects of interest around which they would like lightcone cutouts, it would be inefficient to call the above command with the `-h` option once per object, since each of those runs would use resources to re-read the same input lightcone data (which has the potential to be very large). To address this, the program can be run in the following manner:
```
lc_cutout <input lightcone directory> <output directory> <min redshift> <max redshift> --haloFile <input object file> --boxLength <box length>
```
- `input lightcone directory`, `output directory`, `min redshift`, `max redshift`, `box length`: see descriptions above
- `input object file`: a plain-text file containing one line per object of interest, which includes an object identifier, a halo redshift, the snapshot/lightcone shell of that halo, a mass, optionally three SOD quantities (radius, concentration, concentration error), and finally three Cartesian comoving positions, as such:

```
123456789 0.5 1e14 0.9 3.0 1.0 50.0 55.0 20.0
987654321 0.1 1e15 1.4 2.0 0.66 10.0 0.0 30.0
192837465 1 1e13.5 0.7 4.5 1.5 110.0 35.0 20.0
...
```
In this example, the first object has an id of `123456789`, a redshift of `0.5`, an SO m200 mass of `1e14` M_sun/h, an r200 radius of `0.9` Mpc/h, a concentration of 3.0 ± 1.0, and position x=50.0, y=55.0, z=20.0 Mpc/h. All quantities except for the id must be in such a form that they can be parsed as floats. The radius and concentration quantities are optionally included, and can be removed as long as `massDef` is set to `fof` (see below).
The identifiers can be anything, and are parsed as strings (in this way, they can be used for storing other metadata if desired). Under this usage, a new subdirectory will be created under `output directory` for each object listed in the `input object file`, of the form `halo_123456789`, for example. It is then under that directory that simulation step-wise directories will be created (as in the description of the `output directory` argument). It is also within the `output directory` that a `properties.csv` file will be written, which will contain the halo redshift, mass, optional SO quantities, and information about the scale of the final cutout. Any invalid/missing quantities in that file will be recorded as `-1` (this occurs when running Use Case 2 with `-h` rather than `--haloFile`). The `--haloFile` option can also be specified with `-f` (and, as above, `--boxLength` with `-b`). The cutouts for each of these objects will now be performed serially, with the lightcone read-in happening only once.
#### Details: how exactly the cutout computation is done for Use Case 2
In making general lightcone cutouts around specific objects, a coordinate rotation is required. This is because the bounding functions which define the field of view of the observer, if they are described in constant angular terms, are in general nonlinear in Cartesian space (meaning the field of view will not appear square from the observer position). This distortion is maximized near the poles of the spherical coordinate system, and minimized at the equator, as long as the small-angle approximation holds for the size of the cutout. Areas defined by constant θ - ϕ bounds, then, appear trapezoidal to the observer when far from the equator. It is important that our cutout areas are maintained as square for at least two reasons:
- FFT restrictions of flat-sky lensing codes often require that the cutout is square
- The cutouts returned will not actually have all side lengths of `boxLength`, as the user explicitly requested, if we don't do this rotation
We want to express the positions of all of our target lightcone objects in spherical coordinates to perform the cutouts. Considering a single one of those objects, we want that coordinate system to be rotated such that the object of interest (which we will from now on assume is a halo) lies on the equator, at

(r, θ, ϕ) = (r₀, 90°, 0°)

where

r₀ = (x₀² + y₀² + z₀²)^(1/2)

Let's call the position vector of the halo before this rotation a = [x₀, y₀, z₀], and after the rotation

b = [x_rot, y_rot, z_rot] = [r₀, 0, 0]
We perform this rotation for the target halo via the Rodrigues rotation formula, which answers the following question: given a position vector v, a normalized axis of rotation k, and an angle of rotation β, what is an analytical form for a new vector v_rot, which is v rotated by an angle β about k?

First, we find k by taking the cross product of two vectors defining the plane of rotation. The obvious choice of these two vectors is a and b, as defined above:

k = (a ⨯ b) / ‖a ⨯ b‖

Then, for any other position vector v, v_rot is given by

v_rot = Rv

where R is the rotation matrix through an angle β about the axis k, given as

R = I + sin(β)K + (1 − cos(β))K²
I, here, is the 3×3 identity matrix, and K is the "cross-product matrix" for the unit vector k:

```
    ⌈  0   -kz   ky ⌉
K = |  kz   0   -kx |
    ⌊ -ky   kx   0  ⌋
```
So for each target halo, k, β, and R are each computed once, and v_rot is computed for each object (e.g. simulation particle) from the input lightcone that should fill the resultant cutout. If we'd like to cut out M target halos from a particular lightcone, and have N particles filling that lightcone, then that means computing v_rot M·N times, which may be a very large number. For efficiency, then, we follow this procedure (gory details found in code comments):
1. Define the geometry of the cutout at the equator of our spherical coordinate system, on the x-axis of the base simulation's Cartesian coordinate system. This means that dθ = dϕ = B/2, where B is the box length passed via `--boxLength`.
2. Define the vectors pointing to the four corners of the square bounded by dθ and dϕ as A, B, C, and D.
3. Move the square field of view to the position of the target halo by inverting the rotation matrix: A_rot = R⁻¹A, and similarly for B, C, and D.
4. Make an "initial guess" around the field of view by doing a cut in constant θ and ϕ bounds, given by the maximum and minimum angular coordinates of A, B, C, and D. Add a 10 arcmin buffer.
5. For all lightcone objects (e.g. particles) surviving this initial cut, perform the proper rotation v_rot = Rv. This number of objects will surely be ≪ N.
If the 5 steps above don't make much sense, please let me know (contact info below) and perhaps I can put together an explanatory animation.
## Additional Arguments and Options
- `--massDef` specifies the mass definition, in the case that `-f` is being used. Can be either `sod` or `fof`. If `sod`, then it is expected that each row of the file pointed to by `-f` has 7 elements, with the fourth being the spherical overdensity radius. If `fof`, then the radius should be omitted from that file, else everything will break :)
- `--verbose` tells the application to generate tons of output, including explicitly printing the rotation matrices and similar objects being used for Use Case 2
- `--timeit` instructs the application to report wall times for the data read, redistribution, cutout computation, and write-out
- `--overwrite` allows the program to delete any contents inside of the `output directory`, rather than crashing with a warning
- `--posOnly` will cause only particle positions, ids, and redshifts to be output. Velocities and lightcone rotation/replication information will be discarded (this should speed up both the data redistribution and the write-out)
- `--forceWriteProps` will force the `properties.csv` file to be overwritten; the default behavior is to skip the writing of this file if it is found to already exist (only applies to Use Case 2)
- `--propsOnly` will instruct the program to return after writing the `properties.csv` file, without reading any lightcone shells or performing the cutout. This is useful if cutouts have already been built, but the properties need to be updated for any reason (only applies to Use Case 2)
For example, to run multiple cutouts under Use Case 2 with an `fof` mass definition, and turn on all of the options above except the final two, one would execute

```
lc_cutout <input lightcone directory> <output directory> <min redshift> <max redshift> --haloFile <input object file> --boxLength <box length> --massDef fof --verbose --timeit --overwrite --posOnly
```
Note that Use Case 1 does not perform the coordinate rotation described in Use Case 2 (in the details above). So, cutouts returned will not necessarily be square, or symmetrical, if far from the coordinate equator. Even given Use Case 2, cutouts will not necessarily be square (though they should always be symmetrical) if the opening angle of the cutout breaks the small-angle approximation.
The parallelism in this application occurs spatially, not temporally. That is, the lightcone volume is decomposed across MPI ranks, which parallelizes the read-in, computation, and write-out. But there is no parallelism in redshift space, meaning that each lightcone "step" (the portion of the lightcone volume originating from a particular simulation snapshot) is treated serially. Further, if the option `-f` is used as described under Use Case 2, then those multiple requested cutouts are also treated serially.
The `<min redshift>` and `<max redshift>` arguments are converted to simulation step numbers assuming a simulation run that included 500 total time steps and began at a redshift of 200. At the moment, there is no way for the user to easily change this, other than modifying the calls to `getLCSteps()` in `src/main.cpp` and rebuilding (the default values and parameter names controlling this info can be seen in the `getLCSteps()` function declaration).
Argonne National Laboratory, Cosmological Physics and Advanced Computing Group (CPAC), 2018
This code is based on earlier template code written by CPAC's Steve Rangel. I also received very helpful development contributions from Patricia Larsen at CPAC. Thanks friends :)