# Example: keep a record of batch updates

By default, the `update` method of a `MeanVarBatch` object overwrites the previous values. For debugging, it can be useful to keep a record of those values. One way to do this is to append each set of updated values to a Pandas `DataFrame` using ordinary Pandas methods. Here is an example.

First let's set up the example by creating some fake data to work with:

In [4]:
import numpy as np
import stats_batch as sb
from itertools import islice

# Set up example ----------------------------------------------
# Simulate full sample
n = 10_000
a = np.random.normal(size=n, loc=0.1)

# Function to create batches from full samples
def group_elements(lst, chunk_size):
    lst = iter(lst)
    return iter(lambda: tuple(islice(lst, chunk_size)), ())
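To make the batching helper easier to follow, here is a small sketch of what `group_elements` yields on a toy input. It relies on the two-argument form of the built-in `iter`, which keeps calling the lambda until it returns the sentinel (here the empty tuple). The input `range(7)` and the chunk size of 3 are chosen only for illustration.

```python
from itertools import islice

def group_elements(lst, chunk_size):
    # Turn the input into one shared iterator so that each islice call
    # continues where the previous one stopped.
    lst = iter(lst)
    # iter(callable, sentinel) calls the lambda repeatedly and stops as soon
    # as it returns the empty tuple, i.e. once the input is exhausted.
    return iter(lambda: tuple(islice(lst, chunk_size)), ())

batches = list(group_elements(range(7), 3))
print(batches)  # [(0, 1, 2), (3, 4, 5), (6,)]
```

Note that the last batch is shorter when the input length is not a multiple of `chunk_size`.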

Now let's batch update and save each update to a `DataFrame` object called `suf_stats_df`:

In [8]:
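import pandas as pd

for i, new_list in enumerate(group_elements(a, 1_000)):
    if i == 0:
        # First batch: initialise the running statistics
        mean_var_a = sb.mean_var_batch(new_list)
        suf_stats_df = mean_var_a.to_pandas()
    else:
        # Later batches: update in place, then record the new values
        mean_var_a.update(new_list)
        # DataFrame.append was removed in pandas 2.0; use pd.concat instead
        suf_stats_df = pd.concat([suf_stats_df, mean_var_a.to_pandas()])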

suf_stats_df.head()

|   | mean | var | sum_squared_dev | sample_size |
|---|------|-----|-----------------|-------------|
| 0 | 0.096054 | 0.910451 | 910.451306 | 1000 |
| 0 | 0.081018 | 0.975073 | 1949.171564 | 2000 |
| 0 | 0.0632 | 0.99986 | 2998.58071 | 3000 |
| 0 | 0.086212 | 1.020739 | 4081.934294 | 4000 |
| 0 | 0.094086 | 1.013119 | 5064.580857 | 5000 |
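For readers curious how a running mean and sum of squared deviations can be merged with a new batch, the following is a minimal sketch in plain Python. It uses the standard pairwise-merge formula for these statistics and is not taken from `stats_batch`, which may compute them differently. The function `update_stats` and the `history` list are hypothetical names introduced for illustration; the dictionary keys simply mirror the column names in the output above, and the variance here is assumed to be the sample variance (divisor `n - 1`).

```python
import random
import statistics

def update_stats(n_a, mean_a, m2_a, batch):
    """Merge running statistics with a new batch; returns (n, mean, m2).

    m2 is the sum of squared deviations from the mean.
    """
    n_b = len(batch)
    mean_b = sum(batch) / n_b
    m2_b = sum((x - mean_b) ** 2 for x in batch)
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2

random.seed(0)  # fixed seed so the sketch is reproducible
data = [random.gauss(0.1, 1.0) for _ in range(10_000)]

n, mean, m2 = 0, 0.0, 0.0
history = []  # one record per batch update, instead of overwriting
for start in range(0, len(data), 1_000):
    n, mean, m2 = update_stats(n, mean, m2, data[start:start + 1_000])
    history.append(
        {"mean": mean, "var": m2 / (n - 1), "sum_squared_dev": m2, "sample_size": n}
    )

# After the last batch, the running values should agree with the
# full-sample statistics up to floating-point error.
print(abs(history[-1]["mean"] - statistics.fmean(data)) < 1e-9)
print(abs(history[-1]["var"] - statistics.variance(data)) < 1e-9)
```

Collecting the records in a list and converting them to a `DataFrame` once at the end would also avoid re-copying the table on every update, which can matter when there are many batches.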
