# Broad Overview of `hmf`

This tutorial gives a broad overview of how `hmf` works and what features it offers.
If you just need a quick plot of a mass function, the best place to start is the [Quickstart](your_first_plot.html) tutorial. This tutorial goes into a little more depth, without exploring the more advanced niche features.

## The Package Layout

`hmf` is quite modular, and contains a number of sub-packages concerning each of the physical components that go into defining the halo mass function:

In [None]:
from hmf import (
    cosmology,      # Defines cosmographic parameters and growth functions
    density_field,  # Defines power spectra and transfer functions, as well as window functions/filters on those
    halos,          # Defines halo-specific forms such as mass definitions
    mass_function   # Defines routines that combine the above to obtain halo mass functions
)

While each of these modules has tools that can be useful for more advanced usage, the primary point of contact with `hmf` is the `MassFunction` object, which essentially contains all the working of the full package. This lives in the `mass_function` submodule, but can be imported from the top level:

In [None]:
from hmf import MassFunction

## Frameworks -- Caching and Updating

Each main entrypoint class in `hmf` (this includes `Cosmology`, `Transfer` and `MassFunction`) is what we call a `Framework`. This name is not particularly descriptive, but it means that each of these objects offers a number of similar points of functionality. Here, we'll demonstrate a few of these bits of functionality on the `Transfer` class, but it should be remembered that they are the same for all of these.

In [None]:
from hmf import Transfer

In [None]:
tr = Transfer()

The first common point is that each of the frameworks has defaults for all of its parameters, and a reasonable object can be created by passing no parameters, as we just did.

We can, like any object in Python, get some help with what parameters are available by using `help`:

In [None]:
help(Transfer)

That's a lot of help! You can also consult the [API documentation](../api.html). However, since many of the parameters to `Transfer` merely get passed through to `Cosmology`, they get lost in this documentation. You can get a list of all possible parameters for a framework like this:

In [None]:
Transfer.get_all_parameter_defaults()

Again, you can consult the API docs for information on each one, but you can also use this special function:

In [None]:
Transfer.parameter_info()

Almost all of the things that a framework can calculate -- whether they be transfer functions, growth factors or mass functions -- will appear to be attributes of the object. That is, you don't "call" them like functions, but instead just access them like data. In fact, they are lazily calculated as needed, and then stored in memory once calculated. So, for example, let's calculate the matter power spectrum:

In [None]:
%time tr.power.max()

This took almost 3 seconds on this system, as it called `CAMB` in the background to calculate the power spectrum.
However, it is now cached, and if we call it again:

In [None]:
%time tr.power.max()

It takes less than 1/1000 of a second, as it's just accessing memory. More than that, each (non-trivial) quantity that the power spectrum depends on is also cached, so accessing the transfer function:

In [None]:
%time tr.transfer_function.max()

also returns instantly.
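The compute-on-first-access behaviour described above can be illustrated with a toy class. This is only a minimal sketch using the standard library's `functools.cached_property`; the class `ToyTransfer` and its counter are hypothetical and are not how `hmf` implements its caching.

```python
import functools


class ToyTransfer:
    """Toy stand-in for a framework; NOT hmf's actual implementation."""

    def __init__(self):
        self.n_evaluations = 0  # counts how often the expensive step runs

    @functools.cached_property
    def power(self):
        # Pretend this is an expensive call (e.g. to a Boltzmann code).
        self.n_evaluations += 1
        return [k ** -2 for k in (1.0, 2.0, 4.0)]


toy = ToyTransfer()
first = toy.power   # computed on first access
second = toy.power  # served from the cache, no recomputation
print(toy.n_evaluations)
```

The quantity looks like plain data to the caller, yet the expensive body runs only once per object.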

### Updating Parameters

Often you'll want to compute a certain quantity over a large number of values of a given parameter (or multiple parameters). Of course, you could just create a new framework each time (e.g. a new `Transfer` object), but that is usually much slower than necessary, because many of the underlying quantities do not depend on the parameter being changed. For instance, updating the redshift doesn't change the underlying transfer function, and the power spectrum just changes by an overall factor.

Internally, each framework keeps precise track of which parameters affect each quantity, which enables robust cache invalidation -- in other words, we can keep a computed quantity cached when a parameter is updated that doesn't affect it, and when other quantities that depend on that quantity are required, they can just access it again directly. Let's see this with an example -- 20 calculations of the power spectrum at different redshifts:

In [None]:
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
redshifts = np.random.uniform(0, 3, size=20)

In [None]:
%%time

for z in redshifts:
    tr_ = Transfer(z=z)
    plt.plot(tr_.k, tr_.power)

plt.xscale('log')
plt.yscale('log');

If instead we use our original transfer object and merely update the redshift in-place:

In [None]:
%%time

for z in redshifts:
    tr.z=z
    plt.plot(tr.k, tr.power)

plt.xscale('log')
plt.yscale('log');

Note that the output plots are precisely the same -- the power spectrum is being updated for each redshift, but by using the caching mechanism we improve performance by three orders of magnitude.

You can also use the in-built `.update` method to update parameters:

In [None]:
tr.update(z=0)
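To make the idea of dependency-aware cache invalidation more concrete, here is a toy model of it. Everything in this sketch (the `ToyFramework` class, the `DEPENDS_ON` table, the placeholder formulas) is a hypothetical illustration under simplified assumptions, not the mechanism `hmf` uses internally.

```python
class ToyFramework:
    """Toy model of dependency-aware caching; hypothetical, not hmf code."""

    # Which parameters each cached quantity depends on (directly or indirectly).
    DEPENDS_ON = {
        "transfer_function": {"omega_m"},
        "power": {"omega_m", "z"},
    }

    def __init__(self, omega_m=0.3, z=0.0):
        self._params = {"omega_m": omega_m, "z": z}
        self._cache = {}
        self.n_transfer_calls = 0

    def update(self, **kwargs):
        for name, value in kwargs.items():
            if name not in self._params:
                raise ValueError(f"unknown parameter: {name}")
            if self._params[name] == value:
                continue  # nothing changed, keep every cached value
            self._params[name] = value
            # Drop only the quantities that depend on the changed parameter.
            for quantity, deps in self.DEPENDS_ON.items():
                if name in deps:
                    self._cache.pop(quantity, None)

    @property
    def transfer_function(self):
        if "transfer_function" not in self._cache:
            self.n_transfer_calls += 1  # stands in for the slow step
            self._cache["transfer_function"] = self._params["omega_m"] * 2.0
        return self._cache["transfer_function"]

    @property
    def power(self):
        if "power" not in self._cache:
            growth = 1.0 / (1.0 + self._params["z"])  # toy redshift scaling
            self._cache["power"] = (self.transfer_function * growth) ** 2
        return self._cache["power"]


toy_fw = ToyFramework()
p0 = toy_fw.power
toy_fw.update(z=1.0)            # invalidates `power` only
p1 = toy_fw.power
calls_after_z = toy_fw.n_transfer_calls
toy_fw.update(omega_m=0.25)     # invalidates both quantities
p2 = toy_fw.power
```

In this toy, changing `z` leaves the slow "transfer function" untouched, while changing `omega_m` forces it to be recomputed.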

## Components

Frameworks take many parameters. Some of these are simple numerical parameters, but many are themselves complex components tasked with computing specialized quantities. Furthermore, many of these components have several possible models that you might want to switch between. To make this easy, each of them is a formal `Component` object. One example is the transfer function model itself. We have been using CAMB to compute the transfer function, but a popular approximation is the Eisenstein-Hu model:

In [None]:
tr.transfer_model = 'EH'

For components, you can always pass a string referring to the name of the class, or the actual class itself:

In [None]:
tr.transfer_model = density_field.transfer_models.EH

The latter is useful because it gives you a lot of flexibility -- you could write your own class and pass it in! For more on that, see [Plugins and Extensions](plugins_and_extending.html).
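As a rough picture of how a setting can accept either a class name or the class itself, consider the following sketch. The names `ToyTransferModel`, `Flat` and `resolve_model`, as well as the placeholder formulas, are assumptions made purely for illustration; `hmf`'s own lookup logic may work differently.

```python
class ToyTransferModel:
    """Hypothetical base class standing in for a component type."""

    def lnt(self, lnk):
        raise NotImplementedError


class EH(ToyTransferModel):
    def lnt(self, lnk):
        return -2.0 * lnk  # placeholder formula, not the real Eisenstein-Hu fit


class Flat(ToyTransferModel):
    def lnt(self, lnk):
        return 0.0


def resolve_model(model, base=ToyTransferModel):
    """Return a model class given either the class itself or its name."""
    if isinstance(model, str):
        # Look the name up among the direct subclasses of the base class.
        by_name = {cls.__name__: cls for cls in base.__subclasses__()}
        if model not in by_name:
            raise ValueError(f"unknown model {model!r}; choose from {sorted(by_name)}")
        return by_name[model]
    if isinstance(model, type) and issubclass(model, base):
        return model
    raise TypeError("model must be a class name or a subclass of the base class")


from_name = resolve_model("EH")
from_class = resolve_model(EH)
```

Because user-defined subclasses are ordinary classes too, a scheme like this lets them be passed in directly without any registration step.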

All formal Components within `hmf` are passed to a framework via the `componentname_model` parameter. So Filters are passed as `filter_model`. The actual model instance (the thing that will do the calculations) is then available within the framework under the component name:

In [None]:
tr.transfer

With this, we can compute a bunch of stuff, like the (log) transfer function:

In [None]:
tr.transfer.lnt(np.linspace(0,1,10))

This is of course what is used to generate the transfer function accessible in the `Transfer` framework. But sometimes there are other goodies hidden away in the components that can be useful!

To update parameters of the components requires passing a dictionary of parameters to the parameter `componentname_params`. For example, the growth function is a component:

In [None]:
tr.growth_model

We can update its params like so:

In [None]:
tr.growth_params = {'dlna': 1}  # By default it is 0.01

Let's compute the growth factor:

In [None]:
tr.growth.growth_factor(z=10)

Now update back to the original value:

In [None]:
tr.growth_params = {"dlna": 0.01}

In [None]:
tr.growth.growth_factor(z=10)

Another example of a Component is the cosmology, and you can see the [Dealing with Cosmology](deal_with_cosmology.html) tutorial for more details there -- but it follows the same pattern.

A full list of the available components in `hmf` is as follows:

* `Cosmology`
* `GrowthFactor`
* `TransferModel`
* `Filter`
* `MassDefinition`
* `FittingFunction`

And each of these has several models.

## Using `hmf` Efficiently

We have already discussed caching and how it can speed up many calculations significantly. However, it only helps if used correctly. Let's say you want to calculate the transfer function for multiple values of both $\Omega_m$ and the redshift $z$. To get the speedup, the parameter that is cheaper to update must vary in the inner loop; otherwise the expensive update is repeated on every iteration.

In this case, the redshift should be the inner loop, since $\Omega_m$ affects the basic transfer function, which is typically the slowest calculation of all. Much of the time, the relevant order of parameters should be clear, but you can determine them explicitly using a helper function:

In [None]:
from hmf import get_best_param_order

In [None]:
get_best_param_order(Transfer, q='power')

This call should be interpreted as determining the best order for calculating the `power` in the `Transfer` framework, and the output is in order of fastest to slowest. We see, for example, that `cosmo_params` (where $\Omega_m$ lives) is far down the list compared to redshift. 
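The effect of loop ordering can be counted with a small toy, assuming a cache that only remembers the most recent value of the slow parameter. The helper `count_slow_steps` is hypothetical and does not use `hmf` at all; it merely counts how often the expensive step would be triggered.

```python
import itertools


def count_slow_steps(outer, inner, slow_is_outer):
    """Count how often the slow quantity must be recomputed in a nested loop."""
    n_slow = 0
    last_slow_value = object()  # sentinel: nothing has been computed yet
    for a, b in itertools.product(outer, inner):
        slow_value = a if slow_is_outer else b
        if slow_value != last_slow_value:
            n_slow += 1  # the cached slow quantity was invalidated
            last_slow_value = slow_value
    return n_slow


omega_m_values = [0.2, 0.3, 0.4]   # slow parameter (changes the transfer function)
z_values = [0, 1, 2, 3, 4, 5]      # fast parameter

# Slow parameter in the outer loop: it changes only when the outer loop advances.
slow_outer = count_slow_steps(omega_m_values, z_values, slow_is_outer=True)
# Slow parameter in the inner loop: it changes on every single iteration.
slow_inner = count_slow_steps(z_values, omega_m_values, slow_is_outer=False)
print(slow_outer, slow_inner)
```

With three values of $\Omega_m$ and six redshifts, the good ordering triggers the slow step once per $\Omega_m$ value, while the bad ordering triggers it on all 18 iterations.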

You can go even further than that and use another helper function to "just get" the output quantities over the loop for you:

In [None]:
from hmf import get_hmf

In [None]:
for power, tr, label in get_hmf(req_qauntities=['power'],
                                framework=Transfer,
                                fast_kwargs={"transfer_model": "EH"},
                                z=[0,1,2,3,4,5],
                                cosmo_params = [{'Om0': 0.3}, {'Om0': 0.2}, {'Om0': 0.4}]):
    print(tr.cosmo_params, tr.z)
    plt.plot(tr.k, tr.power)

plt.xscale("log")
plt.yscale('log')

Technically, `get_hmf` returns an iterator that yields the quantities you ask for (along with the full framework object, updated with the current parameters, and a label) on each iteration, visiting the parameter combinations in the optimal order.