# Task Vb - blocking analysis

In this exercise you will learn how to perform blocking analysis.
You need to generate a sequence of exponentially correlated Gaussian distributed numbers and then evaluate the statistical error of finite time averages through the blocking method.

In [None]:
import numpy as np
import matplotlib.pyplot as plt 
from scipy.optimize import curve_fit


## 1. Generate the sample
Generate $N$ samples $\{A_i\}$ with correlation time $\tilde{\tau}$. This script uses the algorithm described in Sec. 5 of `documentation.pdf`.
You can specify $N$ and the correlation time $\tilde{\tau}$ using the variables `nstep` and `tau` in the code, respectively.
The code also prints out $\langle A\rangle$ and $\sigma_A$.

Verify that $\{A_i\}$ are Gaussian deviates with zero mean and unit variance, i.e., $\langle A\rangle\approx 0$ and $\sigma_A\approx 1$.

In [None]:
def generate_gaussian_distrib(nsamples,tau):
    """
    Generate a sequence of exponentially correlated Gaussian numbers with
    zero mean, unit variance and correlation time tau.
    """
    f=np.exp(-1/tau)

    Alist = np.empty(nsamples)
    Alist[0] = np.random.normal(0,1)
    for ii in range(1,nsamples):
        Alist[ii] = f*Alist[ii-1] + np.sqrt(1-f**2)*np.random.normal(0,1)

    return Alist

tau=10
nstep=600000

A=generate_gaussian_distrib(nstep,tau)

average = np.mean(A)
std_deviation = np.std(A)

print("Average:", average)
print("Standard Deviation:", std_deviation)
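
The recursion in `generate_gaussian_distrib` can be checked by hand. The short derivation below is a sketch read off from the code, not a quotation from `documentation.pdf`; the symbol $\xi_i$ is introduced here for the independent standard normal draw at step $i$. Starting from $\langle A_0\rangle=0$ and $\langle A_0^2\rangle=1$, each step preserves the mean and the variance, and the correlation decays exponentially with the lag $k$:

$$A_i = f A_{i-1} + \sqrt{1-f^2}\,\xi_i, \qquad f = e^{-1/\tilde{\tau}}$$

$$\langle A_i\rangle = f\langle A_{i-1}\rangle = 0, \qquad \langle A_i^2\rangle = f^2\langle A_{i-1}^2\rangle + (1-f^2) = 1$$

$$\langle A_i A_{i+k}\rangle = f^k = e^{-k/\tilde{\tau}}$$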


In [None]:
# Plot the distribution of A
plt.hist(A, bins=30, density=True, alpha=0.5, label='A')

# Plot a normal distribution with mean = 0 and std = 1
halfwidth=5
x = np.linspace(-halfwidth, halfwidth, 100)
y = (1 /  np.sqrt(2 * np.pi)) * np.exp(-0.5 * (x ** 2))
plt.plot(x, y, color='red', label='Normal Distribution')

plt.xlabel('Value')
plt.xlim(-halfwidth, halfwidth)
plt.ylabel('Density')
plt.title('Distribution of A and Normal Distribution')
plt.legend()

plt.show()



## 2. Evaluate the correlation
Compute $C_{AA}(k)$ for different lags $k$ and then extract the correlation time $\tilde{\tau}$ using its definition (see eq. 14 in `documentation.pdf`).

In [None]:
krange=100

caa_k=np.empty(0)
for kk in range(krange):
    ii=np.array(range(np.size(A)-kk))
    aa=A[ii]
    bb=A[ii+kk]
    caa_k=np.append(caa_k,(np.mean(aa*bb)- np.mean(aa)*np.mean(bb))/np.var(A))
    print(f"{kk}/{krange}",end='\r')

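# Correlation time from the discrete sum of C_AA(k); the k=0 term counts half.
# Note: this overwrites the input parameter tau set above.
tau = np.sum(caa_k)-0.5*caa_k[0]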
print(f"tau={tau:.2f}")

Plot the autocorrelation function and verify that it is indeed an exponential decay.

Fit the autocorrelation function with an exponential decay and verify that the fitted parameter `tau1` is close to `tau` computed using the definition of correlation time.

In [None]:
 
# ... your code here ...

# print("Decay time (tau1):", tau1)
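
One possible way to approach the fit is sketched below. It is self-contained, so it regenerates a shorter sequence with the same recursion instead of reusing `A` and `caa_k`. The helper names `ar1_sequence`, `autocorrelation` and `exp_decay`, the seed, and the initial guess `p0` are assumptions made for illustration. The autocorrelation estimator subtracts the global mean once, which differs slightly from the cell above.

```python
import numpy as np
from scipy.optimize import curve_fit

def ar1_sequence(nsamples, tau, rng):
    """Unit-variance Gaussian sequence with exponential correlation."""
    f = np.exp(-1.0 / tau)
    xi = rng.normal(0.0, 1.0, nsamples)
    x = np.empty(nsamples)
    x[0] = xi[0]
    for i in range(1, nsamples):
        x[i] = f * x[i - 1] + np.sqrt(1.0 - f**2) * xi[i]
    return x

def autocorrelation(x, kmax):
    """Normalized autocorrelation C(k) for k = 0 .. kmax-1."""
    x = x - np.mean(x)
    var = np.mean(x**2)
    return np.array([np.mean(x[:len(x) - k] * x[k:]) / var for k in range(kmax)])

def exp_decay(k, tau1):
    """Model function for the fit: C(k) = exp(-k / tau1)."""
    return np.exp(-k / tau1)

rng = np.random.default_rng(0)            # fixed seed, for reproducibility only
demo = ar1_sequence(200_000, 10.0, rng)
caa_demo = autocorrelation(demo, 100)
lags = np.arange(caa_demo.size)

popt, pcov = curve_fit(exp_decay, lags, caa_demo, p0=[5.0])
tau1 = popt[0]
print("Decay time (tau1):", tau1)
```

In the notebook itself, the same `curve_fit` call can be applied to `np.arange(krange)` and `caa_k`, and the fitted curve can be drawn on top of the data with `plt.plot`.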


## 3. Compute the statistical error
Evaluate the statistical error using eq. 18 of `documentation.pdf`: $$\sigma_I^2 = \frac{\sigma_A^2}{N}2\tau$$

In [None]:
# ... your code here ...
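
As a minimal sketch of this step, the formula above translates into a one-line function. The name `sigma_I_from_tau` and the input numbers are illustrative assumptions; in the notebook you would pass `np.var(A)`, the measured correlation time, and `nstep`.

```python
import numpy as np

def sigma_I_from_tau(var_A, tau, nsamples):
    """Statistical error of the mean: sigma_I = sqrt(var_A * 2 * tau / N)."""
    return np.sqrt(var_A * 2.0 * tau / nsamples)

# Illustrative numbers only: unit variance, tau = 10, N = 600000
sigI_example = sigma_I_from_tau(1.0, 10.0, 600_000)
print(f"sigma_I = {sigI_example:.5f}")
```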

## 4. - 5. Blocking analysis
The blocking analysis is performed by the following code. `compute_error` computes the correlation time `taum`, the statistical error for the correlated samples `sigI`, and the uncertainties of both.

`blocking_step` halves the sample size according to eq. 22 in `documentation.pdf`.

In [None]:
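def compute_error(iblock, A, varA0, nstep):
    """
    Estimate the correlation time and the statistical error after iblock blocking steps.
    A is the blocked sequence, varA0 the variance of the original data, nstep its length.
    """
    taum = np.var(A)*2**(iblock-1)/varA0      # correlation time estimate
    sigI = np.sqrt(varA0*2*taum/nstep)        # statistical error of the mean
    sd_taum = np.sqrt(2/np.size(A))*taum      # uncertainty of taum
    sd_sigI = np.sqrt(0.5/np.size(A))*sigI    # uncertainty of sigI

    return taum, sigI, sd_taum, sd_sigI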

def blocking_step(A):
    B =np.zeros(int(np.size(A)/2))
    for i in range(int(np.size(A)/2)):
        B[i] = (A[2*i]+A[2*i+1])/2
    return B
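
The pairwise averaging in `blocking_step` can also be written without an explicit loop. The function below is a sketch of an equivalent vectorized form; the name `blocking_step_vectorized` is hypothetical. Like the loop version, it drops a trailing unpaired element when the length is odd.

```python
import numpy as np

def blocking_step_vectorized(x):
    """Average neighbouring pairs; a trailing unpaired element is dropped."""
    x = np.asarray(x, dtype=float)
    n_pairs = x.size // 2
    # even-indexed and odd-indexed elements of the first 2*n_pairs entries
    return 0.5 * (x[0:2 * n_pairs:2] + x[1:2 * n_pairs:2])

example = np.array([1.0, 3.0, 2.0, 6.0, 10.0])
print(blocking_step_vectorized(example))  # prints [2. 4.]
```

For long sequences, slicing avoids the Python-level loop, which is where most of the time goes in the version above.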

In [None]:
ntrans=14

A_iter=np.array(A)
results=np.zeros((0,4))
for ii in range(ntrans+1):
    taum, sigI, sd_taum, sd_sigI = compute_error(ii, A_iter, np.var(A), nstep)
    results=np.vstack((results,np.array([taum,sd_taum,sigI,sd_sigI])))
    print(f"iblock={ii}, taum={taum:.2f} +- {sd_taum:.5f}, sigI={sigI:.5f} +- {sd_sigI:.5f}")

    A_iter = blocking_step(A_iter)

Use the blocking method and plot $\sigma_I(M)$ and $\tau(M)$ as functions of the block transformation step $M$.

Evaluate the statistical error at its plateau $\sigma_I^{\rm plateau}$ and the correlation time at its plateau $\tau^{\rm plateau}$.

Compute the ratio $\sigma_I^{\rm plateau}/\sigma_I(0)$ and check that it is approximately equal to $\sqrt{2\tau}$.


In [None]:
# ... your code here ...
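
One way to locate the plateau automatically is sketched below. It is self-contained and regenerates a shorter sequence. The helper names, the seed, and in particular the plateau criterion (the first step $M$ whose increase to $M+1$ lies within the error bar) are assumptions made for illustration, not a prescription from `documentation.pdf`. Reading the plateau off the plot by eye is equally valid.

```python
import numpy as np

def ar1_sequence(nsamples, tau, rng):
    """Unit-variance Gaussian sequence with exponential correlation."""
    f = np.exp(-1.0 / tau)
    xi = rng.normal(0.0, 1.0, nsamples)
    x = np.empty(nsamples)
    x[0] = xi[0]
    for i in range(1, nsamples):
        x[i] = f * x[i - 1] + np.sqrt(1.0 - f**2) * xi[i]
    return x

def blocking_curve(x, nsteps):
    """Return sigma_I(M) and its uncertainty for M = 0 .. nsteps."""
    n_total = x.size
    sig, sd = [], []
    for m in range(nsteps + 1):
        s = np.sqrt(np.var(x) * 2**m / n_total)  # same estimate as compute_error
        sig.append(s)
        sd.append(np.sqrt(0.5 / x.size) * s)
        n_pairs = x.size // 2
        x = 0.5 * (x[0:2 * n_pairs:2] + x[1:2 * n_pairs:2])  # one blocking step
    return np.array(sig), np.array(sd)

def first_plateau_index(sig, sd):
    """Assumed criterion: first M whose increase to M+1 lies within the error bar."""
    for m in range(len(sig) - 1):
        if sig[m + 1] - sig[m] <= sd[m + 1]:
            return m
    return len(sig) - 1

rng = np.random.default_rng(1)            # fixed seed, for reproducibility only
sample = ar1_sequence(2**17, 10.0, rng)
sig, sd = blocking_curve(sample, 12)
m_plateau = first_plateau_index(sig, sd)
ratio = sig[m_plateau] / sig[0]
tau_plateau = 0.5 * ratio**2              # since sigma_I(0)^2 = var(A) / N
print(f"plateau at M={m_plateau}: ratio={ratio:.2f}, tau={tau_plateau:.2f}")
```

In the notebook, the same criterion can be applied to the `sigI` and `sd_sigI` columns of `results`, and `plt.errorbar` is a convenient way to show the curve together with its uncertainties.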

## 6. Minimum number of samples
At fixed $\tau$, generate different data sequences with an increasing number of samples $N$.
Determine the minimum number of samples that you need for an accurate evaluation of the correlation time and the statistical error.

What is the behavior of the statistical error as a function of $N$?


In [None]:
# ... your code here ...
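
The scan over $N$ could be organized as in the following sketch. It is self-contained; the helper names, the seed, the list of sizes, and the fixed block length $2^8$ (chosen to be much larger than $\tau=10$) are assumptions for illustration. For sequence lengths that are powers of two, averaging over blocks of length $2^M$ is the same as applying $M$ blocking steps.

```python
import numpy as np

def ar1_sequence(nsamples, tau, rng):
    """Unit-variance Gaussian sequence with exponential correlation."""
    f = np.exp(-1.0 / tau)
    xi = rng.normal(0.0, 1.0, nsamples)
    x = np.empty(nsamples)
    x[0] = xi[0]
    for i in range(1, nsamples):
        x[i] = f * x[i - 1] + np.sqrt(1.0 - f**2) * xi[i]
    return x

def blocked_sigma(x, m):
    """Statistical error of the mean estimated from blocks of length 2**m."""
    block = 2**m
    n_blocks = x.size // block
    means = x[:n_blocks * block].reshape(n_blocks, block).mean(axis=1)
    return np.sqrt(np.var(means) / n_blocks)

rng = np.random.default_rng(2)            # fixed seed, for reproducibility only
tau_true = 10.0
sizes = [2**k for k in range(14, 19)]
sigmas = np.array([blocked_sigma(ar1_sequence(n, tau_true, rng), 8) for n in sizes])

# Slope of log(sigma_I) versus log(N); a value near -0.5 means sigma_I ~ 1/sqrt(N)
slope = np.polyfit(np.log(sizes), np.log(sigmas), 1)[0]
for n, s in zip(sizes, sigmas):
    print(f"N={n:7d}  sigma_I={s:.5f}  sqrt(2*tau/N)={np.sqrt(2 * tau_true / n):.5f}")
print(f"log-log slope = {slope:.2f}")
```

To look for the minimum usable $N$, the same loop can be extended to smaller sizes while watching how the estimate scatters around the prediction once only a few blocks remain.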

## 7. $\sigma_I$ vs $\tau$

Calculate $\sigma_I$ for many datasets with different correlation times (make sure that $\tau \ll 2^{M_{\rm max}}$ with $M_{\rm max} = \log_2 N$).
Plot $\sigma_I(\tau)$ as a function of the correlation time and show that $\sigma_I(\tau) = s\sqrt{\frac{2\tau}{N}}$ with $s=\sigma_A$.


In [None]:
# ... your code here ...
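
A possible structure for this last task is sketched below. It is self-contained; the helper names, the seed, the chosen correlation times, and the fixed block length $2^{10}$ (much larger than every $\tau$ used) are assumptions for illustration. The prefactor $s$ is obtained from a one-parameter least-squares fit through the origin.

```python
import numpy as np

def ar1_sequence(nsamples, tau, rng):
    """Unit-variance Gaussian sequence with exponential correlation."""
    f = np.exp(-1.0 / tau)
    xi = rng.normal(0.0, 1.0, nsamples)
    x = np.empty(nsamples)
    x[0] = xi[0]
    for i in range(1, nsamples):
        x[i] = f * x[i - 1] + np.sqrt(1.0 - f**2) * xi[i]
    return x

def blocked_sigma(x, m):
    """Statistical error of the mean estimated from blocks of length 2**m."""
    block = 2**m
    n_blocks = x.size // block
    means = x[:n_blocks * block].reshape(n_blocks, block).mean(axis=1)
    return np.sqrt(np.var(means) / n_blocks)

rng = np.random.default_rng(3)            # fixed seed, for reproducibility only
nsamples = 2**17
taus = np.array([2.0, 5.0, 10.0, 20.0])
x_pred = np.sqrt(2.0 * taus / nsamples)   # sqrt(2*tau/N), the expected scaling
sigmas = np.array([blocked_sigma(ar1_sequence(nsamples, t, rng), 10) for t in taus])

# Least-squares fit of sigma_I = s * sqrt(2*tau/N) through the origin
s_fit = np.sum(x_pred * sigmas) / np.sum(x_pred**2)
print(f"fitted prefactor s = {s_fit:.3f} (expected sigma_A = 1)")
```

Plotting `sigmas` against `taus` together with `s_fit * x_pred` then shows the square-root dependence directly.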