Lagged auto-correlations of random_normal not nul #189

Closed
yruprich opened this issue Aug 17, 2022 · 3 comments

Comments

yruprich commented Aug 17, 2022

Description of the bug

Hi, I believe there is a shortcoming with the function random_normal.
When I generate vectors of T elements with this function, I find that, on average, the lagged auto-correlation of those vectors is not 0 at nonzero lags. Instead, the auto-correlation tends to -1/(T-1).

Example:

N     = toint(10^7)
T     = 100
invT  = -1./(T-1.)
sd    = 1
av    = 0
mxlag = 10

random_setallseed(1,1) ; (36484749, 9494848)  
X = random_normal(av,sd,(/N,T/))

acf = esacr(X,mxlag)

print("mean auto-correlation of random_normal vector of length T="+T+": "+dim_avg_n_Wrap(acf,0))
print("to be compared with -1/(T-1) = "+invT)

Computing environment

I see this problem in all three environments I tried:

  1. Linux, Ubuntu 20.04.4 LTS, NCL 6.6.2, installed with apt install ncl-ncarg
  2. Linux, OpenSUSE Leap 42.3, NCL 6.3.0, installed with pre-compiled binaries "version-CentOS7.6_64bit_nodap_gnu485.tar.gz"
  3. Linux, Red Hat Enterprise Linux 8.4, NCL 6.6.2, built from sources

Additional context
The problem I am referring to might seem tiny. However, it leads to larger biases when those vectors are used as seeds to generate auto-regressive time series. It is also problematic when this function is used to build bootstrap statistical tests.

Cheers,
Yohan
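
The remark about auto-regressive series can be illustrated with a small sketch. It is not taken from the original report: the AR(1) model, the coefficient phi = 0.5, the helper names simulate_ar1, lag1_autocorr and mean_lag1_estimate, and the choice of normalization (lag-1 sum divided by T-1, variance sum divided by T) are all assumptions made for illustration. Under those assumptions, the average lag-1 estimate over many short series falls below the true coefficient.

```python
import numpy as np


def simulate_ar1(phi, n_series, T, rng):
    """Simulate n_series stationary AR(1) rows: x[t] = phi * x[t-1] + e[t], e ~ N(0, 1)."""
    e = rng.normal(0.0, 1.0, size=(n_series, T))
    x = np.empty((n_series, T))
    # Draw the first value from the stationary distribution so no burn-in is needed.
    x[:, 0] = e[:, 0] / np.sqrt(1.0 - phi ** 2)
    for t in range(1, T):
        x[:, t] = phi * x[:, t - 1] + e[:, t]
    return x


def lag1_autocorr(x):
    """Lag-1 sample auto-correlation of each row, computed about the row's own mean."""
    T = x.shape[1]
    d = x - x.mean(axis=1, keepdims=True)
    c0 = (d * d).sum(axis=1) / T                         # lag-0 covariance
    c1 = (d[:, :-1] * d[:, 1:]).sum(axis=1) / (T - 1)    # lag-1 covariance
    return c1 / c0


def mean_lag1_estimate(phi=0.5, n_series=20000, T=100, seed=0):
    """Average the lag-1 estimate over many independent series (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return lag1_autocorr(simulate_ar1(phi, n_series, T, rng)).mean()


if __name__ == "__main__":
    print("true phi              :", 0.5)
    print("mean lag-1 estimate   :", mean_lag1_estimate(phi=0.5))
    print("white noise (phi = 0) :", mean_lag1_estimate(phi=0.0))
```

Setting phi=0.0 reduces this to the white-noise case from the report, where the mean estimate is slightly negative rather than zero.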

@yruprich
Author

Actually, I am facing the same problem with Python (v2.7.9 and v3.7.4):

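import numpy as np

N     = 10000000   # number of independent series; X needs about 8 GB as float64
T     = 100        # length of each series
invT  = -1./(T-1.)
sd    = 1
av    = 0
mxlag = 10

X     = np.random.normal(av, sd, size=(N, T))
# Allocate a separate output array: X[:,0:mxlag+1] is a view, so writing into it would overwrite X
acf   = np.empty((N, mxlag+1))
for i in range(N):
    # lag 0 is 1 by definition; other lags correlate the series with its shifted copy
    acf[i,:] = [1. if l==0 else np.corrcoef(X[i,l:],X[i,:-l])[0][1] for l in range(mxlag+1)]

# Average over the N series, one value per lag
acf_mean=np.average(acf, axis=0)

print('mean auto-correlation of random_normal vector of length T=',T,' : ',acf_mean)
print('to be compared with -1/(T-1) = ',invT)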

@yruprich
Author

yruprich commented Aug 17, 2022

Actually, this is not a shortcoming of the NCL function. The problem comes from the bias in the estimate of the auto-correlation, which was already documented back in 1954.

Reference: Marriott, F. H. C., and J. A. Pope. "Bias in the estimation of autocorrelations." Biometrika 41.3/4 (1954): 390-402 (https://www.jstor.org/stable/2332719)
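
To see that the bias comes from the estimator rather than from the random number generator, here is a minimal sketch using NumPy. It is an illustration only: the helper names sample_acf and mean_acf_white_noise are hypothetical, and the normalization (lag-k sum divided by T-k, lag-0 sum divided by T, both about the sample mean of each series) is an assumption, not necessarily what esacr or np.corrcoef compute internally. For Gaussian white noise and this normalization, the mean at every nonzero lag should come out close to -1/(T-1). Subtracting the sample mean of each short series is what pushes the estimate below zero.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample auto-correlation of each row of x at lags 0..max_lag.

    Hypothetical helper. Assumed normalization: the lag-k covariance sum is
    divided by (T - k) and the lag-0 sum by T, both taken about the row's
    own sample mean.
    """
    x = np.asarray(x, dtype=float)
    T = x.shape[-1]
    d = x - x.mean(axis=-1, keepdims=True)   # remove each row's sample mean
    c0 = (d * d).sum(axis=-1) / T
    acf = np.empty(x.shape[:-1] + (max_lag + 1,))
    acf[..., 0] = 1.0
    for k in range(1, max_lag + 1):
        ck = (d[..., :-k] * d[..., k:]).sum(axis=-1) / (T - k)
        acf[..., k] = ck / c0
    return acf


def mean_acf_white_noise(n_series=20000, T=100, max_lag=10, seed=0):
    """Average the sample ACF over many independent Gaussian white-noise series."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, 1.0, size=(n_series, T))
    return sample_acf(X, max_lag).mean(axis=0)


if __name__ == "__main__":
    T = 100
    print("mean sample ACF, lags 0..10:", mean_acf_white_noise(T=T))
    print("to be compared with -1/(T-1) =", -1.0 / (T - 1.0))
```

The default n_series is kept far smaller than the 10^7 used in the report so that the sketch runs in a few seconds; the negative offset is still clearly visible at that size.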
