# Parameter Configuration

This tutorial explains the role of each parameter in the parameter files
used to run the Python and C++ routines. The Python parameter file is in the
YAML format, whereas the C++ parameter file is in the INI format.
For each parameter mentioned below, the configurations for both formats
are listed.

```{seealso}
For the Python parameter set class
{py:class}`~triumvirate.parameters.ParameterSet`, see
[Parameter Set](./Parameters.ipynb) for more details. The equivalent
C++ class is {cpp:class}`~trv::ParameterSet`.
```

```{note}
Entry values in the parameter file snippets below are written in a
pseudo [extended Backus--Naur form](https://en.wikipedia.org/wiki/Extended_Backus–Naur_form),
where
  - angled brackets ``<`` and ``>`` denote a parameter value;
  - square brackets ``[`` and ``]`` delimit an optional entry;
  - braces ``{`` and ``}`` delimit repetitions;
  - round brackets ``(`` and ``)`` delimit a grouping;
  - a vertical bar ``|`` separates mutually exclusive options;
  - an equal sign ``=`` inside optional entries denotes the default value;
  - a colon ``:`` inside an entry is followed by the data type.
```

## System I/O

### Directories

To set the catalogue directory from which catalogue files are read and
the measurement directory to which measurement files are saved, insert the
absolute or relative (to the working directory) paths in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        directories:
          catalogues: [<directory: str>]
          measurements: [<directory: str>]

    .. code-block:: ini
        
        catalogue_dir = [<directory: string>]
        measurement_dir = [<directory: string>]

````

```{hint}
If left empty, the current working directory is assumed. The catalogue
directory can be unset if no catalogues are to be loaded.
```
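
The fallback described in the hint can be sketched in a few lines of Python. This is only an illustration of the stated behaviour, not the program's own path handling; the helper name `resolve_directory` and the example paths are hypothetical.

```python
from pathlib import Path


def resolve_directory(entry, working_dir=None):
    """Resolve a directory entry against the working directory.

    An empty or missing entry falls back to the working directory itself
    (hypothetical helper mirroring the behaviour described in the hint).
    """
    base = Path(working_dir) if working_dir is not None else Path.cwd()
    if not entry:
        return base
    path = Path(entry)
    # Absolute paths are kept as-is; relative paths are joined to the base.
    return path if path.is_absolute() else base / path


# Example paths are placeholders only.
catalogue_dir = resolve_directory("catalogues", working_dir="/home/user/run")
measurement_dir = resolve_directory("", working_dir="/home/user/run")
```

Here `catalogue_dir` is a path relative to the working directory, while the empty `measurement_dir` entry resolves to the working directory itself.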

### Files

To set the data and random catalogue files (inside the catalogue directory
specified as above) for which clustering measurements are made, insert the
file names with extension in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        files:
          data_catalogue: [<filename-stem.ext: str>]
          rand_catalogue: [<filename-stem.ext: str>]

    .. code-block:: ini
        
        data_catalogue_file = [<filename-stem.ext: string>]
        rand_catalogue_file = [<filename-stem.ext: string>]

````

```{hint}
If the data or random catalogue is not to be loaded by the program, then
the corresponding parameter has no effect and can be unset.
```

For C++ routines, the catalogue data columns need to be specified as a
non-breaking comma-separated string in the following part in the INI file.
For the Python program, <strike>this is specified as a list of strings in the
YAML file</strike> this is a future feature: it currently has no effect and can
be unset in the YAML file.

```{code-block} ini
:caption: INI
catalogue_columns = [<column-name: string>{","<column-name: string>}]
```

```{hint}
This parameter is required in the INI file when a catalogue file is to be
loaded by the C++ program<strike>, or when the Python catalogue reader
cannot infer the data columns</strike>.
```
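
As a small illustration, the comma-separated entry can be assembled from a list of column names. The sketch below assumes that 'non-breaking' means the string contains no whitespace; the helper name and the column names are placeholders rather than names required by the program.

```python
def format_catalogue_columns(names):
    """Join column names into one comma-separated string without whitespace.

    Assumption: 'non-breaking' is taken to mean no spaces in the entry.
    """
    cleaned = [name.strip() for name in names]
    if any(not name or "," in name or " " in name for name in cleaned):
        raise ValueError(
            "column names must be non-empty and contain no commas or spaces"
        )
    return ",".join(cleaned)


# Placeholder column names for illustration only.
columns_entry = format_catalogue_columns(["x", "y", "z", "nz"])
```

The resulting string is what would follow `catalogue_columns =` in the INI file.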

```{seealso}
For the Python catalogue class
{py:class}`~triumvirate.catalogue.ParticleCatalogue`, see
[Particle Catalogue](./Catalogue.ipynb) for more details. The
equivalent C++ class is {cpp:class}`~trv::ParticleCatalogue`.
```

## Mesh sampling

### Physical properties

All catalogue particles are weighted and assigned to a mesh grid as
discrete sampling of a continuous field. The mesh grid box has sizes
(typically in $h^{-1}\,\mathrm{Mpc}$ units) specified in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        boxsize:
          x: <boxsize: float>
          y: <boxsize: float>
          z: <boxsize: float>

    .. code-block:: ini
        
        boxsize_x = <boxsize: double>
        boxsize_y = <boxsize: double>
        boxsize_z = <boxsize: double>

````

```{hint}
Make sure the box sizes are large enough for the catalogue(s).
```

The number of grid cells in each dimension is specified in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        ngrid:
          x: <grid-number: int>
          y: <grid-number: int>
          z: <grid-number: int>

    .. code-block:: ini
        
        ngrid_x = <grid-number: int>
        ngrid_y = <grid-number: int>
        ngrid_z = <grid-number: int>

````

```{hint}
Mesh grid numbers should be even, and are often chosen to be a power of 2
for efficient fast Fourier transforms.
```
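
To judge whether a given box size and grid number resolve the scales of interest, it can help to compute the grid cell size together with the fundamental and Nyquist wavenumbers along one dimension. The sketch below uses the standard relations $k_\mathrm{f} = 2\pi / L$ and $k_\mathrm{Nyq} = \pi N / L$; the function names are illustrative and not part of either program.

```python
import math


def mesh_resolution(boxsize, ngrid):
    """Return the cell size, fundamental and Nyquist wavenumbers in 1-d."""
    cell_size = boxsize / ngrid
    k_fundamental = 2 * math.pi / boxsize
    k_nyquist = math.pi / cell_size
    return cell_size, k_fundamental, k_nyquist


def is_power_of_two(ngrid):
    """Check whether a positive grid number is a power of 2."""
    return ngrid > 0 and ngrid & (ngrid - 1) == 0


# Example values only: a 1000-unit box sampled with 256 cells per dimension.
cell_size, k_fund, k_nyq = mesh_resolution(boxsize=1000.0, ngrid=256)
```

With these example values the cell size is 3.90625 length units, and measurements approaching `k_nyq` are the most affected by the mesh resolution.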

If the box sizes are to be determined after loading the particle catalogue(s),
the box expansion factor can be specified such that the box size in each
dimension is the particle coordinate span multiplied by it.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        expand: <box-expansion: float>

    .. code-block:: ini
        
        expand = <box-expansion: double>

````

```{hint}
The expansion factor is typically unity for cubic-box simulation-like
catalogues whose clustering statistics are computed in the global
plane-parallel approximation, and greater than one for survey-like catalogues
in the local plane-parallel approximation.
```
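
The relation between the particle coordinate span and the resulting box size can be sketched as follows. This is a minimal illustration of the multiplication described above, not the program's own routine; the helper name and the particle positions are made up for the example.

```python
def expanded_boxsize(positions, expand=1.0):
    """Return the box size per dimension as coordinate span times `expand`.

    `positions` is an iterable of (x, y, z) tuples.
    """
    spans = [max(axis) - min(axis) for axis in zip(*positions)]
    return [expand * span for span in spans]


# Three illustrative particles with spans of 100, 50 and 40 length units.
positions = [(0.0, 10.0, 5.0), (100.0, 60.0, 25.0), (50.0, 30.0, 45.0)]
boxsize = expanded_boxsize(positions, expand=1.5)
```

For these positions an expansion factor of 1.5 gives box sizes of 150, 75 and 60 length units in the three dimensions.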

### Alignment

The particle coordinates are offset to box coordinates in length
units. The ``alignment`` of particles in the box has two options: their
mid-point either coincides with the box centre ('centre'), or their coordinate
minima are padded from the origin corner of the box ('pad').

If the padding option is used, the amount of padding is determined as
a multiple ``padfactor`` of ``padscale``, which is either the box size
('box') or the grid cell size in each dimension ('grid').

This is specified in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        alignment: [=centre | pad]
        padscale: [=box | grid]
        padfactor: [<pad-factor: float>]

    .. code-block:: ini
        
        alignment = [=centre | pad]
        padscale = [=box | grid]
        padfactor = [<pad-factor: double>]

````
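
The two alignment options amount to different coordinate shifts along each dimension. The sketch below is one reading of the description above for a single dimension; the function name is hypothetical and the program's internal conventions may differ in detail.

```python
def alignment_shift(pos_min, pos_max, boxsize, ngrid,
                    alignment="centre", padscale="box", padfactor=None):
    """Return the shift added to particle coordinates along one dimension."""
    if alignment == "centre":
        # Move the particle mid-point onto the box centre.
        return boxsize / 2 - (pos_min + pos_max) / 2
    if alignment == "pad":
        if padfactor is None:
            raise ValueError("`padfactor` is needed for 'pad' alignment")
        if padscale == "box":
            scale = boxsize
        elif padscale == "grid":
            scale = boxsize / ngrid  # grid cell size
        else:
            raise ValueError(f"unknown padscale: {padscale!r}")
        # Place the coordinate minimum at the padding distance from the origin.
        return padfactor * scale - pos_min
    raise ValueError(f"unknown alignment: {alignment!r}")


# Illustrative numbers: particles spanning [-50, 150] in a 400-unit box.
shift_centre = alignment_shift(-50.0, 150.0, boxsize=400.0, ngrid=100)
shift_pad = alignment_shift(-50.0, 150.0, boxsize=400.0, ngrid=100,
                            alignment="pad", padscale="grid", padfactor=2.0)
```

In the padded case the cell size is 4 length units, so a `padfactor` of 2 places the coordinate minimum 8 length units from the origin corner.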

### Assignment

Mesh ``assignment`` schemes from order 1 to 4 are supported: nearest grid point
('ngp'), cloud-in-cell ('cic'), triangular-shaped cloud ('tsc') and
piecewise cubic spline ('pcs').

The interlacing technique can be used to reduce aliasing in discrete Fourier
transforms for two-point clustering statistics only.

These are specified in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        assignment: [ngp | cic | =tsc | pcs]
        interlace: [(true | on | false | =off): bool]

    .. code-block:: ini
        
        assignment = [ngp | cic | =tsc | pcs]
        interlace = [(true | on | false | =off): string]

````
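
The order of an assignment scheme controls how strongly the sampled field is smoothed on the mesh. As a rough guide, the sketch below evaluates the standard one-dimensional Fourier-space window $[\sin(kH/2) / (kH/2)]^p$ for a scheme of order $p$ and cell size $H$. It illustrates the textbook result only and makes no claim about how either program compensates for the window; the names are illustrative.

```python
import math

# Order of each supported assignment scheme, as listed above.
ASSIGNMENT_ORDER = {"ngp": 1, "cic": 2, "tsc": 3, "pcs": 4}


def assignment_window_1d(k, cell_size, assignment="tsc"):
    """Evaluate the 1-d Fourier-space window of a mesh assignment scheme."""
    order = ASSIGNMENT_ORDER[assignment]
    x = k * cell_size / 2
    sinc = 1.0 if x == 0 else math.sin(x) / x
    return sinc ** order


# Window suppression at the Nyquist wavenumber for a cell size of 4 units.
k_nyquist = math.pi / 4.0
suppression = {
    scheme: assignment_window_1d(k_nyquist, 4.0, scheme)
    for scheme in ASSIGNMENT_ORDER
}
```

Higher-order schemes suppress power near the Nyquist wavenumber more strongly, which is the trade-off against reduced aliasing.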

## Measurements

### Types

To specify the statistic being measured in local/global plane-parallel
(LPP/GPP) approximations or as the window function, insert the relevant
catalogue type,

  - 'survey' for LPP,
  - 'sim' for GPP,
  - 'random' for window functions, and
  - 'none' for other statistics (e.g. mesh grid binning),

and the statistic type,

  - 'powspec' for power spectrum,
  - '2pcf' for correlation function,
  - '2pcf-win' for correlation function window,
  - 'bispec' for bispectrum,
  - '3pcf' for three-point correlation function,
  - '3pcf-win' for three-point correlation function window,
  - '3pcf-win-wa' for three-point correlation function window wide-angle terms,
  - 'modes' for the binning of wavevector modes, and
  - 'pairs' for the binning of separation pairs,

in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        catalogue_type: [survey | random | sim | none]
        statistic_type: [powspec | 2pcf | 2pcf-win | bispec | 3pcf | 3pcf-win | 3pcf-win-wa | modes | pairs]

    .. code-block:: ini
        
        catalogue_type = survey | random | sim | none
        statistic_type = powspec | 2pcf | 2pcf-win | bispec | 3pcf | 3pcf-win | 3pcf-win-wa

````

### Indexing

Each clustering statistic is indexed by (a) multipole degree(s) $L$
(and $\ell_1, \ell_2$ for three-point statistics), and wide-angle
terms are indexed by orders $i, j$. These are specified in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        degrees:
          ell1: [<degree-1: int>]
          ell2: [<degree-2: int>]  
          ELL: [<degree: int>]

        wa_orders:
          i: [<order-i: int>]
          j: [<order-j: int>]

    .. code-block:: ini
        
        ell1 = [<degree-1: int>]
        ell2 = [<degree-2: int>]
        ELL = [<degree: int>]

        i_wa = [<order-i: int>]
        j_wa = [<order-j: int>]

````

```{hint}
These indices can be unset if no actual clustering measurements are being made,
e.g. when 'statistic_type' is 'modes' or 'pairs' for obtaining the mesh grid
binning details only.
```
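
Which indices are relevant for a given `statistic_type` can be summarised in a small lookup. The sketch below encodes one reading of the text above (a single degree for two-point statistics, three degrees for three-point statistics, additional orders for wide-angle terms, and none for binning-only runs); the helper is hypothetical and not a validation routine of either program.

```python
TWO_POINT = {"powspec", "2pcf", "2pcf-win"}
THREE_POINT = {"bispec", "3pcf", "3pcf-win", "3pcf-win-wa"}
BINNING_ONLY = {"modes", "pairs"}


def relevant_indices(statistic_type):
    """List the index parameters relevant to a statistic type."""
    if statistic_type in BINNING_ONLY:
        return []
    if statistic_type in TWO_POINT:
        return ["ELL"]
    if statistic_type in THREE_POINT:
        indices = ["ell1", "ell2", "ELL"]
        if statistic_type.endswith("-wa"):
            # Wide-angle terms are additionally indexed by orders i and j.
            indices += ["i", "j"]
        return indices
    raise ValueError(f"unknown statistic type: {statistic_type!r}")


indices_bispec = relevant_indices("bispec")
indices_modes = relevant_indices("modes")
```

For instance, a bispectrum run needs `ell1`, `ell2` and `ELL`, whereas a 'modes' run needs none of the indices.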

### Choices

Three-point clustering statistic measurements have four forms:
'diag' (diagonal), 'off-diag' (off-diagonal) and 'row', which all have reduced
1-d coordinate binning, and 'full', which has flattened 2-d coordinate
binning. This is specified in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        form: [=diag | off-diag | row | full]

    .. code-block:: ini
        
        form = [=diag | off-diag | row | full]

````

```{hint}
The 'off-diag' form returns an off-diagonal in the upper triangular matrix
of the 2-d full form. The 'full' form concatenates entries of each row
in the upper triangular matrix.
```
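
The four forms can be pictured as different selections of bin index pairs from the 2-d matrix. The sketch below follows the description in the hint above; the exact ordering and extent of each form in the program output are assumptions here, in particular that the 'row' form runs over all second-coordinate bins. The function name is hypothetical, and its `idx_bin` argument mirrors the parameter of the same name described under coordinate binning.

```python
def bin_index_pairs(num_bins, form="diag", idx_bin=0):
    """Return the (first, second) bin index pairs selected by a form."""
    if form == "diag":
        return [(i, i) for i in range(num_bins)]
    if form == "off-diag":
        # The `idx_bin`-th off-diagonal of the upper triangular matrix.
        return [(i, i + idx_bin) for i in range(num_bins - idx_bin)]
    if form == "row":
        # First coordinate fixed at `idx_bin`; assumed to span all columns.
        return [(idx_bin, j) for j in range(num_bins)]
    if form == "full":
        # Rows of the upper triangular matrix, concatenated.
        return [(i, j) for i in range(num_bins) for j in range(i, num_bins)]
    raise ValueError(f"unknown form: {form!r}")


pairs_offdiag = bin_index_pairs(3, form="off-diag", idx_bin=1)
pairs_full = bin_index_pairs(3, form="full")
```

With three bins, the first off-diagonal contains two entries and the flattened upper triangular matrix contains six.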

The normalisation factor can be computed as a sum over catalogue
particles ('particle') or as a sum over the mesh grid ('mesh'), or left
unspecified ('none'), which is equivalent to unity normalisation. For
two-point clustering statistics only,
[`pypower`-like mixed-mesh normalisation](https://pypower.readthedocs.io/en/latest/api/api.html#pypower.fft_power.normalization)
is available as 'mesh-mixed'. The normalisation convention is specified in
the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        norm_convention: [=particle | mesh | mesh-mixed | none]

    .. code-block:: ini
        
        norm_convention = [=particle | mesh | mesh-mixed | none]

````

### Coordinate binning

The coordinates at which the statistics are measured and binned are determined
by a ``binning`` scheme: linear ('lin') spacing, logarithmic ('log') spacing,
linear spacing with additional lower-end padding for 5 linear intervals
('linpad'), logarithmic spacing with additional lower-end padding for 5 linear
intervals ('logpad'), and customised bins ('custom') (for which the user needs
to supply the binning class object explicitly). The padding intervals are
$10^{-3}$ (wavenumber units) in Fourier space and $10$ (length units) in
configuration space. The total number of bins, ``num_bins``, should be no
lower than 2 (or 7 with padding), and the binning ``range`` consists of the
lower and upper edges ``bin_min`` and ``bin_max`` of the bins.

In addition, for 'row' ``form`` three-point statistics, the first coordinate
is fixed with the bin index ``idx_bin`` specified; for 'off-diag' ``form``
three-point statistics, the off-diagonal index is specified by ``idx_bin``
as well.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        binning: [=lin | log | linpad | logpad | custom]
        range: "["<bin-min: float>"," <bin-max: float>"]"
        num_bins: <bin-number: int>
        idx_bin: <bin-index: int>

    .. code-block:: ini
        
        binning = [=lin | log | linpad | logpad | custom]
        bin_min = <bin-min: double>
        bin_max = <bin-max: double>
        num_bins = <bin-number: int>
        idx_bin = <bin-index: int>

````
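
For the two unpadded schemes, the bin edges follow directly from the range and the number of bins. The sketch below is a plain-Python illustration with a hypothetical function name; it deliberately leaves out the padded and custom schemes, whose exact edge placement is not spelt out here.

```python
def bin_edges(bin_min, bin_max, num_bins, binning="lin"):
    """Return the `num_bins + 1` bin edges for 'lin' or 'log' spacing."""
    if num_bins < 2:
        raise ValueError("`num_bins` should be no lower than 2")
    if binning == "lin":
        width = (bin_max - bin_min) / num_bins
        return [bin_min + i * width for i in range(num_bins + 1)]
    if binning == "log":
        if bin_min <= 0:
            raise ValueError("logarithmic bins need a positive `bin_min`")
        ratio = (bin_max / bin_min) ** (1 / num_bins)
        return [bin_min * ratio ** i for i in range(num_bins + 1)]
    raise ValueError(f"scheme not covered by this sketch: {binning!r}")


lin_edges = bin_edges(0.0, 10.0, num_bins=5)
log_edges = bin_edges(1.0, 1000.0, num_bins=3, binning="log")
```

The linear example gives edges at every 2 units between 0 and 10, and the logarithmic example gives one bin per decade between 1 and 1000 (up to floating-point rounding).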

```{seealso}
For the Python binning class {py:class}`~triumvirate.dataobjs.Binning`,
see [Binning Scheme](./Binning.ipynb) for more details.
The equivalent C++ class is {cpp:class}`~trv::Binning`.
```

## Miscellaneous

### Optimisation

To select the FFTW planner algorithm, which comes with different levels of
'rigour' and optimisation, pass a supported `fftw_scheme` value below, which
corresponds to an FFTW planner flag.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        fftw_scheme: [estimate | =measure | patient]

    .. code-block:: ini
        
        fftw_scheme = [estimate | =measure | patient]

````

```{seealso}
For more information about FFTW planner flags, see the official
[documentation page](https://www.fftw.org/fftw3_doc/Planner-Flags.html).
```

FFT operations can be sped up using FFTW wisdom, which accumulates runtime
performance feedback into pre-optimised plans tuned to specific data sizes
and hardware.

To disable FFTW wisdom, set the `use_fftw_wisdom` parameter to 'false'
or leave it empty.  To enable FFTW wisdom, pass a directory path
to `use_fftw_wisdom` so that the program either imports wisdom files from
prior runs or exports wisdom files from the current run.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        use_fftw_wisdom: [=false | off | <directory: str>]

    .. code-block:: ini
        
        use_fftw_wisdom = [=false | <directory: string>]

````

```{hint}
The filename pattern is pre-determined by the program:
`fftw[_omp]_ci<direction>_<ngrid_x>x<ngrid_y>x<ngrid_z>.wisdom`,
where the `_omp` suffix denotes OpenMP-capable routines,
`<direction>` is either 'f' (forward) or 'b' (backward),
`<ngrid_(x|y|z)>` is the mesh grid cell number, and 'c' and 'i' stand for
complex-to-complex, in-place transforms.
```
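
To locate or pre-check wisdom files, the filename pattern above can be reproduced with a short helper. The sketch below simply formats the documented pattern; the function name is hypothetical and not part of either program.

```python
def wisdom_filename(ngrid, direction, use_omp=False):
    """Format a wisdom filename following the documented pattern."""
    if direction not in ("f", "b"):
        raise ValueError("`direction` must be 'f' (forward) or 'b' (backward)")
    ngrid_x, ngrid_y, ngrid_z = ngrid
    prefix = "fftw_omp" if use_omp else "fftw"
    return f"{prefix}_ci{direction}_{ngrid_x}x{ngrid_y}x{ngrid_z}.wisdom"


forward_file = wisdom_filename((256, 256, 256), "f", use_omp=True)
backward_file = wisdom_filename((128, 256, 512), "b")
```

For example, a forward transform on a $256^3$ mesh with OpenMP-capable routines corresponds to `fftw_omp_cif_256x256x256.wisdom`.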

```{warning}
If an FFTW wisdom file is imported and used, it should have been generated
with an FFTW planner flag higher than or equal to that of `fftw_scheme`.
If a new one is generated and exported, it will inherit the FFTW planner
flag from `fftw_scheme`.  An FFTW wisdom file generated with OpenMP
multithreading can only be reused by a routine that also uses OpenMP
multithreading, and likewise for one generated without it; the thread
number does not matter.
```
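
The two reuse conditions in the warning can be written as a simple check. The sketch below assumes the planner rigour ordering estimate < measure < patient, as in the FFTW documentation; the function and its arguments are hypothetical and only restate the rule above.

```python
# Planner schemes ranked by increasing rigour.
PLANNER_RIGOUR = {"estimate": 0, "measure": 1, "patient": 2}


def wisdom_is_reusable(wisdom_scheme, fftw_scheme, wisdom_omp, routine_omp):
    """Check whether an existing wisdom file suits the current run."""
    rigorous_enough = (
        PLANNER_RIGOUR[wisdom_scheme] >= PLANNER_RIGOUR[fftw_scheme]
    )
    # Wisdom generated with OpenMP can only serve OpenMP routines,
    # and vice versa; the thread number is irrelevant.
    same_threading = wisdom_omp == routine_omp
    return rigorous_enough and same_threading


ok = wisdom_is_reusable("patient", "measure", wisdom_omp=True, routine_omp=True)
not_ok = wisdom_is_reusable("estimate", "measure", wisdom_omp=True, routine_omp=True)
```

Here wisdom generated with 'patient' can serve a 'measure' run, but wisdom generated with 'estimate' cannot.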

```{seealso}
For more information about FFTW wisdom, see the official
[documentation page](https://www.fftw.org/fftw3_doc/Wisdom.html).
```

### I/O

The C++ program can save the details of the binning of wavevector modes or
separation pairs from a mesh grid when measuring clustering statistics to
a file in the following part in the INI file. For the Python program, this is
a future feature: it currently has no effect and can be unset in the YAML file.

```{code-block} ini
:caption: INI
save_binned_vectors = [=false | true | <file-path: string>]
```

```{hint}
When `save_binned_vectors` is 'true', a default file path in the
measurement directory is used to save the binned vectors; if it is a string
(not 'false'), then that is used as the saved file path (relative to
the measurement directory).
```
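
The three cases in the hint can be sketched as a small path-resolution helper. The default file name used by the program is not stated here, so `default_name` below is a purely hypothetical placeholder, as is the helper itself.

```python
from pathlib import Path


def binned_vectors_path(setting, measurement_dir,
                        default_name="binned_vectors.dat"):
    """Return the output path implied by `save_binned_vectors`, or None.

    `default_name` is a hypothetical placeholder, not the program's default.
    """
    value = str(setting).strip()
    if value.lower() in ("", "false"):
        return None  # nothing is saved
    if value.lower() == "true":
        return Path(measurement_dir) / default_name
    # Any other string is a file path relative to the measurement directory.
    return Path(measurement_dir) / value


path_default = binned_vectors_path("true", "measurements")
path_custom = binned_vectors_path("modes/binning.dat", "measurements")
path_none = binned_vectors_path("false", "measurements")
```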

### Logging

The C++ backend/program comes with a logger ({cpp:class}`~trv::sys::Logger`)
for which the verbosity/logging level can be specified in the following parts.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        verbose: [=20 | <logging-level: int>]

    .. code-block:: ini
        
        verbose = [=20 | <logging-level: int>]

````

The C++ backend/program also comes with an optional progress bar
({cpp:class}`~trv::sys::ProgressBar`) for which the update interval can be
specified in terms of percentage points if an integer
(or integer-like string) is passed.

````{eval-rst}
.. tab-set-code::

    .. code-block:: yaml

        progbar: [false | true | =off | on | <%-points: float>]

    .. code-block:: ini
        
        progbar = [=false | true | <%-points: float>]

````
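
One way to picture the accepted `progbar` values is a parser that separates the on/off switches from a numerical update interval. The sketch below is a hypothetical reading of the snippet above, not the program's own parser; in particular, restricting the interval to between 0 and 100 percentage points is an assumption.

```python
def parse_progbar(value):
    """Return `(enabled, interval)` for a `progbar` entry.

    `interval` is None unless a number of percentage points is given.
    """
    text = str(value).strip().lower()
    if text in ("", "false", "off"):
        return (False, None)
    if text in ("true", "on"):
        return (True, None)
    interval = float(text)  # raises ValueError for unrecognised strings
    if not 0 < interval <= 100:
        # Assumed bounds for an interval in percentage points.
        raise ValueError("interval should be within (0, 100]")
    return (True, interval)


progbar_off = parse_progbar("off")
progbar_on = parse_progbar(True)
progbar_every_10 = parse_progbar("10")
```

Here the string `"10"` is read as an update interval of 10 percentage points with the progress bar enabled.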