added Big Sobol Sequence information
franciscovillaescusa committed Jan 7, 2024
1 parent ffbe5c4 commit 2a92f87
Showing 7 changed files with 96 additions and 11 deletions.
4 changes: 2 additions & 2 deletions docs/source/access.rst
@@ -4,7 +4,7 @@
Data access
***********

Quijote contains over 700 terabytes of data. Given this large size, the data is currently distributed across two different clusters in New York (Rusty cluster) and San Diego (GordonS cluster). The data can be accessed in two different ways:
Quijote contains over 850 terabytes of data. Given this large size, the data is currently distributed across two different clusters in New York (Rusty cluster) and San Diego (GordonS cluster). The data can be accessed in two different ways:

- **Globus**. A system designed to easily transfer large amounts of data in a very efficient manner.
- **Binder**. A system that allows reading and manipulating the data online, without the need to download the data.
@@ -34,7 +34,7 @@ The table below describes the data each cluster contains and provides the links
| | - The 3D density fields | |
| | - The HADES data (if available) | .. image:: https://mybinder.org/badge_logo.svg |
| | - The 3D density fields | :target: https://binder.flatironinstitute.org/~fvillaescusa/Quijote |
| | - 488 Terabytes | |
| | - 700 Terabytes | |
+-------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+


79 changes: 79 additions & 0 deletions docs/source/bsq.rst
@@ -0,0 +1,79 @@
.. _bsq:

******************
Big Sobol Sequence
******************

The Big Sobol Sequence (BSQ) is a collection of 32,768 N-body simulations designed for machine learning applications. Each simulation follows the evolution of :math:`512^3` dark matter particles in a periodic comoving volume of :math:`(1000~h^{-1}{\rm Mpc})^3`. Each simulation has a different initial random seed and its own values of the cosmological parameters :math:`\Omega_{\rm m}`, :math:`\Omega_{\rm b}`, :math:`h`, :math:`n_s`, :math:`\sigma_8`, which are arranged in a Sobol sequence within the boundaries listed below (the values of the cosmological parameters for each BSQ simulation can be found `here <https://raw.githubusercontent.com/franciscovillaescusa/Quijote-simulations/master/BSQ/BSQ_params.txt>`_):

.. math::

   \Omega_{\rm m} \in [0.10 ; 0.50]\\
   \Omega_{\rm b} \in [0.02 ; 0.08]\\
   h \in [0.50 ; 0.90]\\
   n_s \in [0.80 ; 1.20]\\
   \sigma_8 \in [0.60 ; 1.00]

The values of the other cosmological parameters are the same in all simulations: :math:`M_\nu=0.0` eV, :math:`w=-1`, :math:`\Omega_{\rm K}=0`. The initial conditions were generated at :math:`z=127` using 2LPT, and the simulations have been run using Gadget-III with slightly more stringent force accuracy parameters than the other Quijote simulations.
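
As an illustration of how points in a Sobol sequence can be mapped onto the parameter ranges quoted above, the following is a minimal sketch using ``scipy.stats.qmc``. It is not the script used to build BSQ: the choice of an unscrambled generator and the parameter ordering are assumptions, so the output is not expected to reproduce ``BSQ_params.txt``.

```python
import numpy as np
from scipy.stats import qmc

# Parameter order and bounds as quoted above: Omega_m, Omega_b, h, n_s, sigma_8
lower = np.array([0.10, 0.02, 0.50, 0.80, 0.60])
upper = np.array([0.50, 0.08, 0.90, 1.20, 1.00])

# Unscrambled 5-dimensional Sobol generator
# (assumption: the actual BSQ generator settings may differ)
sampler = qmc.Sobol(d=5, scramble=False)

# Draw 2**15 = 32,768 points in the unit hypercube
unit_points = sampler.random_base2(m=15)

# Rescale each column from [0, 1) to its parameter interval
params = qmc.scale(unit_points, lower, upper)
```

Drawing a power-of-two number of points, as done here with ``random_base2``, preserves the balance properties of the Sobol sequence, which is consistent with the BSQ size of :math:`2^{15}` simulations.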

.. Warning::

   As of January 7th 2024, 16,384 simulations have been run and are publicly available in both Globus and Binder (see :ref:`data_access`). The remaining simulations are being run and are made publicly available immediately after they finish. The full set is expected to be complete in summer 2024.


For each simulation we dump 11 snapshots at redshifts 6, 5, 4, 3, 2, 1.5, 1, 0.7, 0.5, 0.2, and 0. We then post-process the data and save halo catalogs, power spectra, bispectra, and density fields. We now describe the different data we store:


Snapshots
~~~~~~~~~

We have saved full snapshots for the initial conditions (ICs) and at redshifts 1 (``snap_006.hdf5``) and 0 (``snap_010.hdf5``). Note that the snapshots at redshifts 0 and 1 consist of a single file each, in contrast with the standard Quijote snapshots, which are split into 8 files. This data can be read in the standard way (see :ref:`snapshots`).


Halo catalogs
~~~~~~~~~~~~~

For each of the 11 snapshots per simulation we have generated both FoF and Rockstar halo catalogs. We have also run consistent trees on the Rockstar catalogs and saved the resulting merger trees. For FoF, the snapshot numbering convention is:

- 000: redshift 6
- 001: redshift 5
- 002: redshift 4
- 003: redshift 3
- 004: redshift 2
- 005: redshift 1.5
- 006: redshift 1
- 007: redshift 0.7
- 008: redshift 0.5
- 009: redshift 0.2
- 010: redshift 0

We refer the reader to :ref:`halo_catalogues` for details on how to read these files.
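
The numbering convention above can be captured in a small lookup table. The sketch below is only a convenience; the names ``snap_to_z`` and ``label_for_redshift`` are hypothetical and not part of any Quijote tool.

```python
# Output redshifts in snapshot order, as listed above
redshifts = [6, 5, 4, 3, 2, 1.5, 1, 0.7, 0.5, 0.2, 0]

# Map the zero-padded three-digit label to its redshift
snap_to_z = {f"{i:03d}": z for i, z in enumerate(redshifts)}


def label_for_redshift(z):
    """Return the three-digit snapshot label for an output redshift (hypothetical helper)."""
    return f"{redshifts.index(z):03d}"
```

For example, ``snap_to_z["006"]`` gives the redshift of snapshot 006, and ``label_for_redshift(0)`` gives the label of the final snapshot.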


Power spectra
~~~~~~~~~~~~~

For each snapshot of each simulation we have computed the matter power spectrum in real- and redshift-space and saved the results.


Bispectra
~~~~~~~~~

For each snapshot of each simulation we have computed the matter bispectrum in real- and redshift-space and saved the results. The bispectrum is computed on grids with :math:`256^3` voxels and it contains ~2000 triangles down to :math:`k\sim0.5~h{\rm Mpc}^{-1}`.


Density fields
~~~~~~~~~~~~~~

We have generated matter density fields with :math:`256^3` voxels in real- and redshift-space for all 11 available redshifts. The density fields have been generated using the Cloud-in-Cell (CIC) mass assignment scheme. The files are stored in HDF5 format and can be read as follows:

.. code:: python

   import numpy as np
   import h5py

   # open the file read-only and load the 'df' dataset into memory
   with h5py.File('df_m_CIC_z=0.00.hdf5', 'r') as f:
       df = f['df'][:]  # df contains the number of particles in each voxel

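
Since ``df`` holds the number of particles in each voxel, many applications will first convert it into an overdensity field :math:`\delta = \rho/\bar{\rho} - 1`. The following is a minimal sketch of that step. The function name ``overdensity`` is hypothetical, and a small synthetic array stands in for the data read from the file.

```python
import numpy as np


def overdensity(df):
    """Convert per-voxel particle counts into delta = rho / <rho> - 1."""
    df = np.asarray(df, dtype=np.float64)
    return df / df.mean() - 1.0


# Synthetic stand-in for the array read from the HDF5 file
# (assumption: a small 16^3 grid is used here only for illustration)
rng = np.random.default_rng(0)
df_demo = rng.poisson(lam=8.0, size=(16, 16, 16)).astype(np.float32)

delta = overdensity(df_demo)
```

By construction the resulting field has zero mean and is bounded below by -1, which corresponds to empty voxels.
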
13 changes: 7 additions & 6 deletions docs/source/features.rst
@@ -3,22 +3,23 @@ Features
********

- Simulations run with the TreePM code Gadget-III
- More than 40 Million CPU hours used
- Boxes of 1 Gpc/h. Combined total volume of more than 45,000 (Gpc/h)^3 at a single redshift
- More than 60 Million CPU hours used
- Boxes of 1 Gpc/h. Combined total volume of more than 78,000 (Gpc/h)^3 at a single redshift
- 17,100 simulations for a fiducial Planck cosmology
- Between 500 and 1,000 simulations/cosmology for 27 different cosmologies
- 1,000 Separate Universe simulations
- 8,000 simulations in different latin-hypercubes
- More than 10 trillions of particles at a single redshift from all simulations
- 32,768 simulations in a Sobol Sequence
- More than 12 trillion particles at a single redshift from all simulations
- Billions of halos and voids identified
- Full snapshots at redshifts 0, 0.5, 1, 2, 3 and 127 (initial conditions)
- More than 200,000 halo catalogues
- More than 200,000 void catalogues
- More than 300,000 halo catalogues
- More than 300,000 void catalogues
- More than 1 million power spectra
- More than 1 million bispectra
- More than 1 million correlation functions
- More than 1 million marked power spectra
- More than 1 million probability distribution functions
- More than 700 Terabytes of data publicly available
- More than 850 Terabytes of data publicly available
- All data can be downloaded via globus
- All data can be accessed and manipulated without downloading it via binder
4 changes: 2 additions & 2 deletions docs/source/goals.rst
@@ -2,14 +2,14 @@
Science
*******

The Quijote simulations is a suite of more than 45,000 full N-body simulations that have been designed to accomplish two main goals:
The Quijote simulations is a suite of more than 78,000 full N-body simulations that have been designed to accomplish two main goals:

- Quantify the information content on generic cosmological observables
- Provide enough data to train machine learning algorithms


For the first goal, Quijote provides a set of more than 35,000 simulations designed to calculate the information content on a generic cosmological observable by means of evaluating its Fisher matrix.

For the second goal, Quijote provides not only thousands of simulations on different latin-hypercubes, but the a total number of 44,100 N-body simulations, with billion of halos, galaxies, voids and millions of summary statistics such as power spectra, bispectra...et, to train machine learning algorithms, where having more data is always better.
For the second goal, Quijote provides not only thousands of simulations on different latin-hypercubes and Sobol sequences, but a total of more than 78,000 N-body simulations, with billions of halos, galaxies, and voids, and millions of summary statistics such as power spectra, bispectra, etc., to train machine learning algorithms, where having more data is always better.

The large number of simulations and data products available in Quijote allows many other scientific applications. See :ref:`publications` for a list of different scientific usages of the data.
3 changes: 2 additions & 1 deletion docs/source/index.rst
@@ -9,7 +9,7 @@ Quijote simulations
.. image:: Quijote.jpg
:width: 49 %

The Quijote simulations is a suite of 45,500 full N-body simulations designed to:
The Quijote simulations is a suite of more than 78,000 full N-body simulations designed to:

- Quantify the information content on cosmological observables
- Provide enough statistics to train machine learning algorithms
@@ -53,6 +53,7 @@ Historically, Quijote was developed from the `HADES simulations <https://francis
odd
mg
LH
bsq
Hades
ulagam

2 changes: 2 additions & 0 deletions docs/source/news.rst
@@ -1,6 +1,8 @@
News
====

**January 2024:** Data from the Big Sobol Sequence (BSQ), a collection of 32,768 N-body simulations varying 5 cosmological parameters (:math:`\Omega_{\rm m}`, :math:`\Omega_{\rm b}`, :math:`h`, :math:`n_s`, :math:`\sigma_8`), is now publicly available. See :ref:`BSQ` for details.

**November 2023:** The Sancho suite, a collection of 240,000 galaxy mock catalogs in redshift-space spanning across 11 cosmologies, 3 massive neutrino cosmologies, 6 primordial non-Gaussianity amplitudes, and 11 Halo Occupation Distribution (HOD) models (together with their corresponding power spectra and bispectra) is now publicly available. Check :ref:`Sancho` for details.

**October 2023:** Quijote now contains FoF halo catalogs that include the IDs of the particles belonging to the different halos. Check :ref:`halo_catalogues` for details.
2 changes: 2 additions & 0 deletions docs/source/types.rst
@@ -139,6 +139,8 @@ A brief description of the different cosmologies is provided in the below table.
+-------------------+-------------------------+-------------------+--------------+-------------+-------------------+---------------+---------------+------------------+------------------------------+------------------------------+-------------------------------+-------------------------------+--------------------+--------------+----------------+------------+-------------------+---------------------+
| nwLH | [0.1 - 0.5] | [0.03 - 0.07] | [0.5 - 0.9] | [0.8 - 1.2] | [0.6 - 1.0] | [0.01 - 1.0] | [-1.3 - -0.7] | 0 | 0 | 0 | 0 | 0 | 0 | 2,000 | standard | Zeldovich | 512 | 512 |
+-------------------+-------------------------+-------------------+--------------+-------------+-------------------+---------------+---------------+------------------+------------------------------+------------------------------+-------------------------------+-------------------------------+--------------------+--------------+----------------+------------+-------------------+---------------------+
| BSQ | [0.1 - 0.5] | [0.02 - 0.08] | [0.5 - 0.9] | [0.8 - 1.2] | [0.6 - 1.0] | 0 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 32,768 | standard | 2LPT | 512 | 512 |
+-------------------+-------------------------+-------------------+--------------+-------------+-------------------+---------------+---------------+------------------+------------------------------+------------------------------+-------------------------------+-------------------------------+--------------------+--------------+----------------+------------+-------------------+---------------------+

- Simulations with :math:`\delta_b \neq 0` correspond to separate universe simulations and therefore have an amplitude of the DC mode different than 0 (or equivalently, a curvature different than 0).
- Simulations with :math:`f_{\rm NL} \neq 0` correspond to simulations with primordial non-Gaussianities (Quijote-PNG). See :ref:`png` for further details on these simulations.
