# Out-of-core mean and variance calculation

This tutorial describes how to use the `cellxgene_census.experimental.pp` API to calculate the mean and variance of Census data out of core, that is, without loading the full matrix into memory. The variance calculation is performed using [Welford's online algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance).
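To give an intuition for why an online algorithm suits out-of-core work, here is a minimal single-pass sketch of Welford's update rule in plain Python. The function name `welford_mean_variance` is hypothetical and is not part of `cellxgene_census`; the library's actual implementation may differ in how it chunks and combines data.

```python
# Illustrative sketch only: `welford_mean_variance` is a made-up helper,
# not a function from cellxgene_census.
def welford_mean_variance(values, ddof=1):
    """Return (mean, variance) of an iterable, reading each value once."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean            # deviation from the old mean
        mean += delta / count       # update the running mean
        m2 += delta * (x - mean)    # uses both the old and the new mean
    if count - ddof <= 0:
        raise ValueError("need more than `ddof` values to compute a variance")
    return mean, m2 / (count - ddof)


# The input can be any iterator, so the data never has to fit in memory at once.
mean, variance = welford_mean_variance(iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
print(mean, variance)
```

Only three running values (`count`, `mean`, `m2`) are kept between updates, which is what allows the statistics to be accumulated while the data is streamed.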

The API exposes a single function:

`mean_variance()`: calculates the mean and the variance for an `ExperimentAxisQuery`. The following additional arguments are supported:
- `layer`: the X layer used for the calculation; defaults to `raw`
- `axis`: the axis along which the calculation is performed. Specify 0 for the `var` axis and 1 for the `obs` axis
- `calculate_mean`: if `False`, do not include the mean in the result
- `calculate_variance`: if `False`, do not include the variance in the result
- `ddof`: _Delta Degrees of Freedom_; the divisor used in the variance calculation is N - ddof, where N is the number of elements
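The effect of `axis` and `ddof` can be illustrated on a tiny in-memory matrix using only the standard library. This is a sketch under assumptions: the nested list `x` is a made-up stand-in for an X layer, and we assume (consistent with the example later in this tutorial) that `axis=1` yields one value per cell and `axis=0` one value per gene.

```python
import statistics

# Hypothetical toy data: rows are cells (obs), columns are genes (var).
x = [
    [1.0, 0.0, 3.0],
    [2.0, 0.0, 5.0],
]

# One mean per gene (assumed to correspond to axis=0).
per_gene_mean = [statistics.fmean(col) for col in zip(*x)]

# One mean per cell (assumed to correspond to axis=1).
per_cell_mean = [statistics.fmean(row) for row in x]

# ddof changes only the divisor of the variance, N - ddof.
row = x[1]
population_variance = statistics.pvariance(row)  # divisor N      (ddof=0)
sample_variance = statistics.variance(row)       # divisor N - 1  (ddof=1)

print(per_gene_mean, per_cell_mean)
print(population_variance, sample_variance)
```

With N elements the two variances differ by a factor of N / (N - 1), so the choice of `ddof` matters most when few elements are aggregated.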



In [8]:
# Import packages
import cellxgene_census
import tiledbsoma as soma  # provides soma.AxisQuery, used in the query below
from cellxgene_census.experimental.pp import mean_variance

## Calculate mean and variance

As an example, we'll calculate the mean and variance along the `obs` axis for a subset of cells from the mouse Census.

The return value is a pandas `DataFrame` indexed by `soma_joinid` (in this case, the `soma_joinid` of `obs`) with two columns, `mean` and `variance`.

In [6]:
experiment_name = "mus_musculus"
obs_value_filter = 'is_primary_data == True and tissue_general == "skin of body"'

with cellxgene_census.open_soma(census_version="stable") as census:
    with census["census_data"][experiment_name].axis_query(
        measurement_name="RNA", obs_query=soma.AxisQuery(value_filter=obs_value_filter)
    ) as query:
        mv_df = mean_variance(
            query,
            axis=1,
            calculate_mean=True,
            calculate_variance=True,
        )

mv_df

The "stable" release is currently 2023-05-15. Specify 'census_version="2023-05-15"' in future calls to open_soma() to ensure data consistency.


soma_joinid,mean,variance
1926144,15.915025,69571.774917
1926146,5.972801,9471.427044
1926150,25.169472,139042.208628
1926153,8.049836,24762.926397
1926155,17.345415,150412.440839
...,...,...
2109685,0.164319,5.339741
2109686,0.368339,24.930156
2109687,0.246049,11.886186
2109688,0.240724,10.307266
