# Multidimensional PDFs

This tutorial is about handling multiple dimensions when creating a custom PDF.

The differences to the one-dimensional case are marginal, since the ordering is handled automatically. It is, however, crucial to understand the concept of a `Space`, most notably `obs` and `axes`.

A user (someone who instantiates the PDF) only knows and handles observables. Their relative order does not matter: if a dataset has the observables a and b and a PDF has the observables b and a, the data will be reordered automatically. Inside a PDF, on the other hand, we do not care about observables at all, only about the ordering of the data, the *axes*. So any data tensor we have, as well as the limits for integration, normalization etc., **inside** the PDF is order based and uses *axes*.
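
To make the automatic reordering concrete, here is a minimal pure-Python sketch. It is not how zfit implements it; the helper `reorder_columns` and the sample values are hypothetical and only illustrate that columns are matched by observable name and then arranged in the order the PDF expects.

```python
def reorder_columns(rows, data_obs, pdf_obs):
    """Return rows with columns permuted from data_obs order into pdf_obs order."""
    if sorted(data_obs) != sorted(pdf_obs):
        raise ValueError("data and PDF must have the same observables")
    # For every axis of the PDF, find the data column holding that observable.
    column_for_axis = [data_obs.index(name) for name in pdf_obs]
    return [[row[col] for col in column_for_axis] for row in rows]


data_obs = ("a", "b")              # order in which the user's data is stored
pdf_obs = ("b", "a")               # order in which the PDF was instantiated
rows = [[1.0, 10.0], [2.0, 20.0]]  # each row is one event: (a, b)

reordered = reorder_columns(rows, data_obs, pdf_obs)
print(reordered)  # [[10.0, 1.0], [20.0, 2.0]]
```

After this step the code inside the PDF can rely purely on position: axis 0 holds b and axis 1 holds a.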

When passing the observables to the init of the PDF (as a user), each observable is automatically assigned to an axis according to its position in the order of the observables. The crucial point is therefore to communicate to the user which *axis* corresponds to what. The naming of the observables is completely up to the user, but the order of the observables depends on the PDF. Therefore, the correspondence of each axis to its meaning has to be stated in the docs.
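
The position-based assignment can be pictured with a small sketch. The function `assign_axes`, the observable names and the axis meanings below are all made up for illustration; the sketch only shows that the names are free while the meaning of each position is fixed by the PDF.

```python
def assign_axes(obs):
    """Map each observable name to the axis given by its position."""
    return {name: axis for axis, name in enumerate(obs)}


# Hypothetical documentation of a 3D PDF: what each axis means.
axis_meaning = ("x component", "y component", "z component")

user_obs = ("px", "py", "pz")  # names chosen freely by the user
axes = assign_axes(user_obs)
for name, axis in axes.items():
    print(f"{name!r} -> axis {axis} ({axis_meaning[axis]})")
```

Had the user passed `("pz", "py", "px")` instead, `pz` would be treated as the x component, which is why the axis meanings need to be documented.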

In [1]:
import zfit
from zfit import ztf
import numpy as np

## Axes, not obs

Since we create a PDF here, we can now completely forget about observables. We can assume that all the data is axes based (order based). We simply need to write down what each axis means.

An example PDF is implemented below. It calculates the length of a vector shifted by some number (dummy example).
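
As a reference for a single point, the quantity can be written with plain floats. This sketch assumes that the shifts are added to the x and y components and that z stays unshifted (the PDF only declares `xshift` and `yshift`); the function name is hypothetical.

```python
import math


def shifted_vector_length(x, y, z, xshift, yshift):
    """Length of the vector (x + xshift, y + yshift, z)."""
    return math.sqrt((x + xshift) ** 2 + (y + yshift) ** 2 + z ** 2)


# (2 + 1, 2 + 2, 12) = (3, 4, 12), whose length is 13
value = shifted_vector_length(2.0, 2.0, 12.0, xshift=1.0, yshift=2.0)
print(value)  # 13.0
```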

In [2]:
class AbsVectorShifted(zfit.pdf.ZPDF):
    """Length of a 3D vector whose x and y components are shifted (dummy example).

    Axis 0 is the x component, axis 1 the y component and axis 2 the z component.
    """
    _N_OBS = 3  # dimension, can be omitted
    _PARAMS = ['xshift', 'yshift']  # the name of the parameters

    def _unnormalized_pdf(self, x):
        x, y, z = ztf.unstack_x(x)  # one tensor per axis, in axis order
        xshift = self.params['xshift']
        yshift = self.params['yshift']
        return ztf.sqrt((x + xshift) ** 2 + (y + yshift) ** 2 + z ** 2)
        

Done. Now we can use our pdf already!

In [3]:
obs = zfit.Space('obs1', limits=(-3, 6))

data_np = np.random.random(size=1000)
data = zfit.data.Data.from_numpy(array=data_np, obs=obs)

Create two parameters and an instance of your own PDF.

In [4]:
mean = zfit.Parameter("mean", 1.)
std = zfit.Parameter("std", 1.)
my_gauss = MyGauss(obs='obs1', mean=mean, std=std)



In [5]:
probs = my_gauss.pdf(data, norm_range=(-3, 4))

In [6]:
probs_np = zfit.run(probs)
print(probs_np[:20])

[0.54061221 0.26987065 0.55748768 0.55287316 0.35744102 0.56147295
 0.24496626 0.27563618 0.469028   0.50871657 0.2807743  0.23819832
 0.28492248 0.56415543 0.49875545 0.25174901 0.5481325  0.21933145
 0.33557439 0.48277319]
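
What `norm_range` does can be sketched numerically with the standard library. The sketch assumes that `MyGauss` has the unnormalized shape `exp(-((x - mean) / std)**2)`; that shape, the helper names and the trapezoid integration are illustrative and not zfit's implementation.

```python
import math


def unnormalized_gauss(x, mean, std):
    """Assumed unnormalized shape of the custom Gaussian."""
    return math.exp(-((x - mean) / std) ** 2)


def integrate_trapezoid(func, lower, upper, n_steps=10_000):
    """Composite trapezoid rule on an equidistant grid."""
    step = (upper - lower) / n_steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, n_steps):
        total += func(lower + i * step)
    return total * step


def normalized_pdf(x, mean=1.0, std=1.0, norm_range=(-3.0, 4.0)):
    """Unnormalized value divided by the integral over norm_range."""
    norm = integrate_trapezoid(lambda t: unnormalized_gauss(t, mean, std), *norm_range)
    return unnormalized_gauss(x, mean, std) / norm


peak = normalized_pdf(1.0)
print(round(peak, 3))  # 0.564
```

Under this assumption the largest values printed above (about 0.564) are consistent with the peak at `mean = 1`, since the random data lie in the interval [0, 1).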


We could improve our PDF by registering an analytic integral:

In [7]:
def gauss_integral_from_any_to_any(limits, params, model):
    lower, upper = limits.limit1d
    mean = params['mean']
    std = params['std']
    # write your integral here
    return 42.  # dummy integral, must be a scalar!

In [8]:
limits = zfit.Space.from_axes(axes=0, limits=(zfit.Space.ANY_LOWER, zfit.Space.ANY_UPPER))
MyGauss.register_analytic_integral(func=gauss_integral_from_any_to_any, limits=limits)
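
For orientation, the dummy return value could be replaced by a closed-form expression. The following sketch again assumes the unnormalized shape `exp(-((x - mean) / std)**2)` and works on plain floats with `math.erf`; inside a registered zfit integral the inputs would be tensors, so the equivalent tensor operations would presumably be needed instead.

```python
import math


def gauss_integral(lower, upper, mean, std):
    """Integral of exp(-((x - mean) / std)**2) over [lower, upper]."""
    # Substituting u = (x - mean) / std gives std * integral of exp(-u**2) du.
    scaled_lower = (lower - mean) / std
    scaled_upper = (upper - mean) / std
    return 0.5 * math.sqrt(math.pi) * std * (math.erf(scaled_upper) - math.erf(scaled_lower))


integral = gauss_integral(-3.0, 4.0, mean=1.0, std=1.0)
print(integral)
```

For limits far away from the mean the result approaches `std * sqrt(pi)`, which is a quick sanity check for an implementation.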

### n-dimensional

## Advanced Custom PDF

Subclass `BasePDF`. The `_unnormalized_pdf` method has to be overridden and, in addition, the `__init__`.

Any of the public main methods (`pdf`, `integrate`, `partial_integrate` etc.) can **always** be overridden by implementing the function with a leading underscore, e.g. implement `_pdf` to directly control `pdf`. The API is the same as that of the public function, without the `name` argument. If, during execution of your own method, it turns out to be a bad idea to have overridden the default method, raising a `NotImplementedError` will restore the default behavior.
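
The override-and-fall-back mechanism can be illustrated with a toy base class. Everything here (`BaseModelSketch`, `Triangle`, `_fallback_pdf`, `_norm`) is hypothetical and much simpler than zfit's actual classes; the sketch only shows the control flow of a public method that tries the underscore method first and uses the default when `NotImplementedError` is raised.

```python
class BaseModelSketch:
    """Toy stand-in for a base class; not zfit's actual implementation."""

    def pdf(self, x):
        try:
            return self._pdf(x)           # user override, if any
        except NotImplementedError:
            return self._fallback_pdf(x)  # default behavior

    def _pdf(self, x):
        raise NotImplementedError

    def _fallback_pdf(self, x):
        return self._unnormalized_pdf(x) / self._norm()


class Triangle(BaseModelSketch):
    """Shape f(x) = x on [0, 1], zero elsewhere."""

    def _unnormalized_pdf(self, x):
        return x if 0.0 <= x <= 1.0 else 0.0

    def _norm(self):
        return 0.5  # integral of x over [0, 1]

    def _pdf(self, x):
        if not 0.0 <= x <= 1.0:
            # The shortcut does not apply here: hand control back to the default.
            raise NotImplementedError
        return 2.0 * x


model = Triangle()
print(model.pdf(0.25))  # 0.5, computed by the override
print(model.pdf(2.0))   # 0.0, computed by the default path
```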

In [9]:
# TOBEDONE