The following discussion is adapted from this notebook by Lehman Garrison.
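When computing a two-point correlation function estimator like $\xi(r) = \frac{DD}{RR} - 1$, the $RR$ term can be computed analytically if the domain is a periodic box. In 3D, this is often done as
$$RR_i = N V_i \bar\rho = N V_i \frac{N}{L^3},$$
where $RR_i$ is the expected number of random-random pairs in bin $i$, $N$ is the total number of points, $V_i$ is the volume (or area if 2D) of bin $i$, $L$ is the box size, and $\bar\rho$ is the average density in the box.
However, using $\bar\rho = \frac{N}{L^3}$ is only correct for continuous fields, not for sets of particles. When sitting on a particle, only $N - 1$ particles are available to fall in a bin at some non-zero distance. The remaining particle is the one you're sitting on, which is always at distance $0$. Thus, the correct expression is
$$RR_i = N V_i \frac{N-1}{L^3}.$$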
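See this notebook for an empirical demonstration of this effect; specifically, that computing the density with $N - 1$ is correct, and that using $N$ introduces a bias of order $1/N$ into the estimator. This is a tiny correction for large $N$, but it matters for small $N$.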
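The effect can be checked with a small brute-force experiment. The following is a minimal sketch, not Corrfunc code: the helper names (`shell_volumes`, `analytic_rr`, `mean_brute_force_pairs`) are hypothetical, pairs are counted as ordered pairs, and the bins are assumed to be 3D spherical shells with outer radius below half the box size. It averages the pair counts of many small uniform random realizations and compares them with the analytic expectation computed with $N - 1$ and with $N$.

```python
import numpy as np


def shell_volumes(edges):
    """Volume of each 3D spherical shell between consecutive bin edges."""
    edges = np.asarray(edges, dtype=float)
    return 4.0 / 3.0 * np.pi * np.diff(edges**3)


def analytic_rr(n, boxsize, edges, subtract_self=True):
    """Expected ordered random-random pair counts per bin in a periodic box.

    subtract_self=True uses (N - 1) / L^3 as the density seen by a particle;
    subtract_self=False uses the continuous-field value N / L^3.
    """
    n_other = n - 1 if subtract_self else n
    return n * shell_volumes(edges) * n_other / boxsize**3


def mean_brute_force_pairs(n, boxsize, edges, n_realizations, seed=0):
    """Average ordered pair counts (i != j) over uniform random realizations."""
    rng = np.random.default_rng(seed)
    total = np.zeros(len(edges) - 1)
    off_diagonal = ~np.eye(n, dtype=bool)  # mask that drops the i == j pairs
    for _ in range(n_realizations):
        pos = rng.uniform(0.0, boxsize, size=(n, 3))
        delta = pos[:, None, :] - pos[None, :, :]
        delta -= boxsize * np.round(delta / boxsize)  # periodic minimum image
        dist = np.sqrt((delta**2).sum(axis=-1))
        counts, _ = np.histogram(dist[off_diagonal], bins=edges)
        total += counts
    return total / n_realizations


n, boxsize = 5, 1.0
edges = np.array([0.1, 0.25, 0.4])  # outer edge stays below boxsize / 2
measured = mean_brute_force_pairs(n, boxsize, edges, n_realizations=4000)
correct = analytic_rr(n, boxsize, edges, subtract_self=True)
naive = analytic_rr(n, boxsize, edges, subtract_self=False)
print("measured mean pair counts:", measured)
print("analytic with N - 1      :", correct)
print("analytic with N          :", naive)
```

A deliberately small $N$ is used so that the factor $N / (N - 1)$ between the two analytic expressions is large enough to stand out from the sampling noise.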
Any Corrfunc function that returns a clustering statistic (not just raw pair counts) implements this correction. Currently, this includes Corrfunc.theory.xi and Corrfunc.theory.wp.
Cross-correlations of two different particle sets don't suffer from this problem; the particle you're sitting on is never part of the set of particles under consideration for pair-making.
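Corrfunc also allows bins of zero separation, in which "self-pairs" are included in the pair counting. The expected random-random count of any such bin must reflect this, which amounts to adding $N$ (one self-pair per particle) to that bin.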
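To make the self-pair bookkeeping concrete, here is a minimal sketch under stated assumptions: the function names are hypothetical (they are not part of Corrfunc), pairs are counted as ordered pairs, and a bin is taken to include its lower edge, so a bin starting at 0 contains the zero-distance self-pairs. The brute-force count shows that including self-pairs changes only the zero-separation bin, and by exactly one pair per particle.

```python
import numpy as np


def ordered_pair_counts(pos, boxsize, edges, include_self):
    """Brute-force ordered pair counts per bin with periodic wrapping."""
    delta = pos[:, None, :] - pos[None, :, :]
    delta -= boxsize * np.round(delta / boxsize)  # periodic minimum image
    dist = np.sqrt((delta**2).sum(axis=-1))
    if not include_self:
        dist = dist[~np.eye(len(pos), dtype=bool)]  # drop the i == j pairs
    counts, _ = np.histogram(dist.ravel(), bins=edges)
    return counts


def analytic_rr_with_self(n, boxsize, edges):
    """(N - 1)-corrected RR, plus N self-pairs in any bin that starts at r = 0."""
    edges = np.asarray(edges, dtype=float)
    volumes = 4.0 / 3.0 * np.pi * np.diff(edges**3)
    rr = n * volumes * (n - 1) / boxsize**3
    starts_at_zero = edges[:-1] == 0.0
    rr[starts_at_zero] += n
    return rr


rng = np.random.default_rng(42)
n, boxsize = 50, 1.0
edges = np.array([0.0, 0.1, 0.2])
pos = rng.uniform(0.0, boxsize, size=(n, 3))

with_self = ordered_pair_counts(pos, boxsize, edges, include_self=True)
without_self = ordered_pair_counts(pos, boxsize, edges, include_self=False)
print("extra pairs per bin from self-pairs:", with_self - without_self)
print("analytic RR including self-pairs   :", analytic_rr_with_self(n, boxsize, edges))
```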
We can extend the above discussion to weighted correlation functions in which each particle is assigned a weight, and the pair weight is taken as the product of the particle weights (see weighted_correlations).
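Let $w_j$ be the weight of particle $j$, and $W$ be the sum of the weights. We will define the "unclustered" particle distribution to be the case of $N$ particles uniformly distributed, where each is assigned the mean weight $\bar w = W / N$. With $V_i$ the volume of bin $i$ and $L$ the box size, we thus have
$$RR_i = \sum_{j=1}^{N} \bar w \, (W - \bar w) \frac{V_i}{L^3} = \left(W^2 - \bar w W\right) \frac{V_i}{L^3}.$$
When the particles all have $w_j = 1$, then $W = N$ and $\bar w = 1$, so $RR_i = N (N - 1) V_i / L^3$ and we recover the unweighted result from above.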
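The weighted expression is short enough to evaluate directly. The sketch below is an illustration only: `weighted_rr_mean_weight` is a hypothetical helper, not a Corrfunc function, and it assumes 3D spherical-shell bins in a periodic box. It evaluates $(W^2 - \bar w W) V_i / L^3$ and checks the unit-weight limit described above.

```python
import numpy as np


def weighted_rr_mean_weight(weights, boxsize, edges):
    """RR_i = (W^2 - wbar * W) * V_i / L^3, every particle carrying the mean weight."""
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()    # W
    mean = weights.mean()    # wbar = W / N
    volumes = 4.0 / 3.0 * np.pi * np.diff(np.asarray(edges, dtype=float) ** 3)
    return (total**2 - mean * total) * volumes / boxsize**3


n, boxsize = 100, 1.0
edges = np.array([0.1, 0.2, 0.3])
volumes = 4.0 / 3.0 * np.pi * np.diff(edges**3)

# Unit weights: W = N and wbar = 1, so the prefactor becomes N * (N - 1).
unit_weight_rr = weighted_rr_mean_weight(np.ones(n), boxsize, edges)
unweighted_rr = n * (n - 1) * volumes / boxsize**3
print("matches unweighted result:", np.allclose(unit_weight_rr, unweighted_rr))

# Rescaling every weight by a constant c rescales RR by c**2.
rng = np.random.default_rng(0)
weights = rng.uniform(0.5, 1.5, size=n)
base_rr = weighted_rr_mean_weight(weights, boxsize, edges)
scaled_rr = weighted_rr_mean_weight(3.0 * weights, boxsize, edges)
print("scales as c**2:", np.allclose(scaled_rr, 9.0 * base_rr))
```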
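There are other ways to define the unclustered distribution. If we were to redistribute the particles uniformly but preserve their individual weights, we would find
$$RR_i = \sum_{j=1}^{N} w_j \, (W - w_j) \frac{V_i}{L^3} = \left(W^2 - \sum_{j=1}^{N} w_j^2\right) \frac{V_i}{L^3}.$$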
This is not what we use in Corrfunc, but this should help illuminate some of the considerations that go into defining the "unclustered" case when writing a custom weight function (see custom_weighting).
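To see how much the choice of "unclustered" definition matters, the two prefactors can be compared numerically. This is a sketch with hypothetical helper names, not Corrfunc code. It relies on the identity $\sum_j w_j^2 - \bar w W = N \, \mathrm{Var}(w)$ (population variance), so the mean-weight prefactor exceeds the individual-weight prefactor by $N \, \mathrm{Var}(w)$, and the two agree only when all weights are equal.

```python
import numpy as np


def prefactor_mean_weight(weights):
    """W^2 - wbar * W: every particle carries the mean weight."""
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    return total**2 - weights.mean() * total


def prefactor_individual_weights(weights):
    """W^2 - sum_j w_j^2: every particle keeps its own weight."""
    weights = np.asarray(weights, dtype=float)
    return weights.sum() ** 2 - np.sum(weights**2)


rng = np.random.default_rng(1)
weights = rng.uniform(0.5, 1.5, size=1000)

mean_version = prefactor_mean_weight(weights)
individual_version = prefactor_individual_weights(weights)
difference = mean_version - individual_version

print("mean-weight prefactor      :", mean_version)
print("individual-weight prefactor:", individual_version)
print("difference                 :", difference)
print("N * Var(w)                 :", len(weights) * np.var(weights))
```

Both prefactors multiply the same geometric factor $V_i / L^3$, so the fractional difference between the two definitions of $RR_i$ is the same in every bin.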