|
3 | 3 | <head> |
4 | 4 | <meta charset="UTF-8"> |
5 | 5 | <meta name="viewport" content="width=device-width, initial-scale=1.0"> |
6 | | - <title>Autocorrelation Test - Statistical Test Suite</title> |
| 6 | + <title>Autocorrelation / Serial Correlation Test - Statistical Test Suite</title> |
7 | 7 | <style> |
8 | 8 | * { |
9 | 9 | margin: 0; |
|
160 | 160 | </head> |
161 | 161 | <body> |
162 | 162 | <div class="container"> |
163 | | - <span class="test-category">Enhanced Statistical Test</span> |
164 | | - <h1>Autocorrelation Test</h1> |
| 163 | + <span class="test-category">Enhanced / Small-Sequence Statistical Test</span> |
| 164 | + <h1>Autocorrelation &amp; Serial Correlation Tests</h1> |
165 | 165 |
|
166 | 166 | <div class="highlight-box"> |
167 | | - <strong>Quick Summary:</strong> The Autocorrelation Test checks for correlations between bits at different positions (lag) in the sequence. It detects whether knowing past bits helps predict future bits, which would indicate the sequence is not truly random. Think of it as checking if the sequence has "memory" of its previous values. |
| 167 | + <strong>Quick Summary:</strong> Two related tests check whether values in the sequence are independent of their neighbours. The <strong>bit-level Autocorrelation Test</strong> looks at lag-1 and lag-2 bit matches; the <strong>number-level Serial Correlation Test</strong> computes the lag-1 Pearson correlation on the numbers themselves. Both should give a result close to "no correlation" for random input. |
168 | 168 | </div> |
169 | 169 |
|
170 | | - <h2>What It Tests For</h2> |
171 | | - <p>This test examines <strong>independence between bit positions</strong> separated by a fixed distance (lag). A truly random sequence should have no correlation between bits, regardless of how far apart they are. The test detects:</p> |
172 | | - <ul> |
173 | | - <li>Self-similarity in the bit sequence</li> |
174 | | - <li>Periodic patterns that repeat at regular intervals</li> |
175 | | - <li>Dependencies between bit positions</li> |
176 | | - <li>Hidden cyclic behavior in the generator</li> |
177 | | - <li>Linear feedback structures or predictable recurrence</li> |
178 | | - </ul> |
| 170 | + <p>This page covers two distinct tests in this validator. They are described separately below because they operate on different data and use different formulas.</p> |
179 | 171 |
|
180 | | - <h2>How It Works</h2> |
181 | | - <h3>Step 1: Choose a Lag (d)</h3> |
182 | | - <p>The lag determines how far apart to compare bits. Common lags tested are d = 1, 2, 4, 8, 16, etc:</p> |
183 | | - <div class="example"> |
184 | | -Lag d = 1: Compare each bit with the next bit |
185 | | -Lag d = 2: Compare each bit with the bit 2 positions ahead |
186 | | -Lag d = 8: Compare each bit with the bit 8 positions ahead |
187 | | - </div> |
| 172 | + <h2>Test A: Bit-Level Autocorrelation (Enhanced Stats)</h2> |
188 | 173 |
|
189 | | - <h3>Step 2: Compute the Autocorrelation Function</h3> |
190 | | - <p>For a bit sequence of length n, compare each bit with the bit d positions ahead:</p> |
191 | | - <div class="example"> |
192 | | -Original sequence: 1 0 1 1 0 0 1 1 0 1 ... |
193 | | -Shifted by d=2: _ _ 1 0 1 1 0 0 1 1 ... |
| 174 | + <p>Operates on the bit stream produced from the input numbers. It is intentionally simple — it only looks at lags 1 and 2.</p> |
194 | 175 |
|
195 | | -For each position i, compare bit[i] with bit[i+d] |
196 | | -Count matches and mismatches |
197 | | - </div> |
198 | | - |
199 | | - <h3>Step 3: Calculate Correlation Coefficient</h3> |
200 | | - <p>The autocorrelation coefficient measures the similarity between the original and shifted sequences:</p> |
| 176 | + <h3>How It Works</h3> |
| 177 | + <p>For each lag <em>d</em> ∈ {1, 2}, count how often <code>bits[i] == bits[i+d]</code> across the stream and convert it into a deviation from the random expectation of 0.5:</p> |
201 | 178 | <div class="example"> |
202 | | -C(d) = (1/n) × Σ [(2×b[i] - 1) × (2×b[i+d] - 1)] |
203 | | - |
204 | | -Where: |
205 | | - - b[i] is the bit at position i (0 or 1) |
206 | | - - Sum runs from i=0 to n-d |
207 | | - - Result ranges from -1 (perfect anti-correlation) to +1 (perfect correlation) |
| 179 | +match_ratio(d) = (# i where bits[i] == bits[i+d]) / (n - d) |
| 180 | +deviation(d) = | match_ratio(d) - 0.5 | |
| 181 | +statistic = max(deviation(1), deviation(2)) |
208 | 182 | </div> |
209 | 183 |
|
210 | | - <h3>Step 4: Normalize and Compute Statistic</h3> |
211 | | - <p>Convert the correlation to a standardized test statistic:</p> |
212 | | - <div class="example"> |
213 | | -Z = C(d) × √(n-d) |
214 | | - |
215 | | -For random data: |
216 | | - - Z should be close to 0 |
217 | | - - Z follows approximately normal distribution N(0,1) |
218 | | - </div> |
| 184 | + <h3>Pass / Fail Rule</h3> |
| 185 | + <p>The test passes if the statistic is strictly below 0.15, i.e. the lag-1 and lag-2 match ratios both lie strictly between 0.35 and 0.65. No p-value is reported.</p> |
219 | 186 |
|
220 | 187 | <div class="info-box"> |
221 | | - <strong>Understanding Results:</strong> |
222 | 188 | <ul> |
223 | | - <li><strong>Z ≈ 0:</strong> PASS - No correlation detected, bits are independent</li> |
224 | | - <li><strong>Z >> 0:</strong> FAIL - Positive correlation, bits tend to repeat</li> |
225 | | - <li><strong>Z << 0:</strong> FAIL - Negative correlation, bits tend to alternate</li> |
| 189 | + <li><strong>statistic &lt; 0.15:</strong> PASS - no significant lag-1/lag-2 dependency</li> |
| 190 | + <li><strong>statistic ≥ 0.15:</strong> FAIL - adjacent or near-adjacent bits are too similar (or too different)</li> |
226 | 191 | </ul> |
227 | 192 | </div> |
228 | 193 |
|
229 | | - <h2>Interpreting Results</h2> |
230 | | - |
231 | | - <h3>Pass (no correlation)</h3> |
232 | | - <p>Bits at different positions are statistically independent. Knowing the value of bit[i] provides no information about bit[i+d]. The generator produces uncorrelated output with no detectable memory or cyclic patterns.</p> |
233 | | - |
234 | | - <h3>Fail (positive correlation)</h3> |
235 | | - <p>Bits tend to match their predecessors at the given lag. Common causes include:</p> |
236 | | - <ul> |
237 | | - <li><strong>Periodic Patterns:</strong> Sequence repeats with period related to the lag</li> |
238 | | - <li><strong>Long Runs:</strong> Generator produces extended sequences of identical bits</li> |
239 | | - <li><strong>Insufficient Mixing:</strong> State transitions don't adequately randomize output</li> |
240 | | - <li><strong>Linear Recurrence:</strong> Generator uses linear feedback that creates correlations</li> |
241 | | - </ul> |
242 | | - |
243 | | - <h3>Fail (negative correlation)</h3> |
244 | | - <p>Bits tend to be opposite to their predecessors at the given lag. Common causes include:</p> |
245 | | - <ul> |
246 | | - <li><strong>Alternating Patterns:</strong> Excessive bit flipping at regular intervals</li> |
247 | | - <li><strong>Overcorrection:</strong> Generator tries too hard to balance 0s and 1s</li> |
248 | | - <li><strong>Phase Relationships:</strong> Internal state oscillates predictably</li> |
249 | | - </ul> |
| 194 | + <h3>Examples</h3> |
| 195 | + <div class="example"> |
| 196 | +Sequence 01010101... |
| 197 | +match_ratio(1) = 0.0 → deviation = 0.5 → FAIL (anti-correlated) |
250 | 198 |
|
251 | | - <h2>Common Failure Patterns</h2> |
| 199 | +Sequence 11111111... |
| 200 | +match_ratio(1) = 1.0 → deviation = 0.5 → FAIL (over-correlated) |
252 | 201 |
|
253 | | - <h3>Example 1: Perfect Period</h3> |
254 | | - <div class="example"> |
255 | | -Sequence: 10110101101011011010110101101... |
256 | | -Pattern: "1011" repeats with period 4 |
| 202 | +Random-looking bits |
| 203 | +match_ratio(1) ≈ 0.5 → deviation small → PASS |
| 204 | + </div> |
257 | 205 |
|
258 | | -At lag d=4: Perfect correlation (C=1.0) |
259 | | -Result: FAIL - sequence has period 4 |
| 206 | + <div class="warning-box"> |
| 207 | + <strong>Limitations:</strong> Only lags 1 and 2 are checked. Periodicities at longer lags (e.g. a 32-bit cycle) will not be detected by this test. For a more thorough analysis, use the NIST DFT (Spectral) Test and the Serial Test. |
260 | 208 | </div> |
261 | 209 |
|
262 | | - <h3>Example 2: Alternating Bits</h3> |
263 | | - <div class="example"> |
264 | | -Sequence: 10101010101010101010101010... |
| 210 | + <h2>Test B: Number-Level Serial Correlation (Small Sequence)</h2> |
265 | 211 |
|
266 | | -At lag d=1: Perfect anti-correlation (C=-1.0) |
267 | | -At lag d=2: Perfect correlation (C=1.0) |
268 | | -Result: FAIL - strictly alternating sequence |
269 | | - </div> |
| 212 | + <p>Operates on the input numbers as floating-point values, not on the bit stream. It computes a standard lag-1 Pearson correlation coefficient.</p> |
270 | 213 |
|
271 | | - <h3>Example 3: Long Runs</h3> |
| 214 | + <h3>How It Works</h3> |
| 215 | + <p>The numbers are first normalised to the range [0, 1] using their min and max. Then the lag-1 correlation is computed:</p> |
272 | 216 | <div class="example"> |
273 | | -Sequence: 11111111110000000000111111111... |
| 217 | +xᵢ = (numbers[i] - min) / (max - min) |
| 218 | +mean = average of all xᵢ |
274 | 219 |
|
275 | | -At lag d=1,2,3: High positive correlation |
276 | | -Result: FAIL - excessive runs of identical bits |
| 220 | + Σ (xᵢ - mean) × (xᵢ₊₁ - mean) |
| 221 | +corr = ─────────────────────────────────── |
| 222 | + Σ (xᵢ - mean)² |
277 | 223 | </div> |
| 224 | + <p>The result lies in [-1, +1]. Values near 0 indicate independence; values near +1 indicate that consecutive numbers tend to be similar; values near -1 indicate they tend to alternate.</p> |
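A minimal Python sketch of the normalisation and lag-1 correlation described above. Two details are assumptions, because the formula does not state its summation ranges: the numerator is taken over the n-1 adjacent pairs, and the denominator over all n values. The function name is hypothetical, and raising `ValueError` merely stands in for however the validator reports identical inputs.

```python
def serial_correlation(numbers):
    """Lag-1 correlation of the min-max normalised inputs."""
    lo, hi = min(numbers), max(numbers)
    if hi == lo:
        # Normalisation would divide by zero; the page documents this case as a failure.
        raise ValueError("All numbers are identical")

    x = [(v - lo) / (hi - lo) for v in numbers]
    mean = sum(x) / len(x)

    # Assumption: numerator over adjacent pairs, denominator over every value.
    numerator = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(len(x) - 1))
    denominator = sum((xi - mean) ** 2 for xi in x)
    return numerator / denominator
```

Because the correlation is unchanged by shifting and rescaling the data, the normalisation step does not affect the value of `corr`; it only keeps the intermediate values in a fixed range.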
278 | 225 |
|
279 | | - <h3>Example 4: Weak PRNG</h3> |
| 226 | + <h3>Pass / Fail Rule</h3> |
| 227 | + <p>The threshold relaxes for shorter inputs, because small samples have higher natural variance:</p> |
280 | 228 | <div class="example"> |
281 | | -Linear Congruential Generator with short period: |
282 | | -Sequence shows correlation at multiple lags |
| 229 | +n &lt; 10 → fail if |corr| > 0.6 |
| 230 | +10 ≤ n &lt; 20 → fail if |corr| > 0.5 |
| 231 | +n ≥ 20 → fail if |corr| > 0.4 |
283 | 232 |
|
284 | | -At lag d=16: C(16) = 0.45 |
285 | | -At lag d=32: C(32) = 0.38 |
286 | | -Result: FAIL - generator has linear structure |
| 233 | +p-value (heuristic) = clamp(1 - |corr| / threshold, 0, 1) |
287 | 234 | </div> |
| 235 | + <p>Edge case: if every input number is identical, the test fails immediately with the message "All numbers are identical".</p> |
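The size-dependent threshold and the heuristic p-value can be sketched as below. The function names are hypothetical. The code assumes that `corr` has already been computed and that `n` is the count of input numbers. The thresholds and the clamp formula are the ones listed above.

```python
def threshold_for(n):
    """Failure threshold on |corr| for an input of n numbers."""
    if n < 10:
        return 0.6
    if n < 20:
        return 0.5
    return 0.4


def heuristic_p_value(corr, n):
    """clamp(1 - |corr| / threshold, 0, 1); a heuristic score, not a true p-value."""
    score = 1.0 - abs(corr) / threshold_for(n)
    return min(1.0, max(0.0, score))


def serial_correlation_passes(corr, n):
    """The rule is 'fail if |corr| > threshold', so equality still passes."""
    return abs(corr) <= threshold_for(n)
```

The heuristic value reaches 0 exactly when |corr| reaches the threshold, so under this rule any failing input reports 0.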
288 | 236 |
|
289 | | - <h2>Multiple Lag Testing</h2> |
290 | | - <p>The test is typically run at multiple lags to detect different types of patterns:</p> |
| 237 | + <h3>Examples</h3> |
291 | 238 | <div class="example"> |
292 | | -Short lags (d=1,2,4): Detect local patterns |
293 | | -Medium lags (d=8,16): Detect periodic behavior |
294 | | -Long lags (d=64,128): Detect generator period issues |
295 | | - </div> |
| 239 | +Numbers: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 |
| 240 | +Strictly increasing → corr strongly positive → FAIL |
296 | 241 |
|
297 | | - <div class="warning-box"> |
298 | | - <strong>Important:</strong> A good random sequence should pass the test at ALL tested lags. Failure at any lag indicates a structural problem with the generator. |
| 242 | +Numbers: 50, 10, 50, 10, 50, 10, 50, 10 |
| 243 | +Strict alternation → corr ≈ -1.0 → FAIL |
| 244 | + |
| 245 | +Numbers: 42, 23, 7, 88, 91, 4, 60, 71 |
| 246 | +No obvious pattern → |corr| ≈ 0.11 → PASS |
299 | 247 | </div> |
300 | 248 |
|
301 | 249 | <h2>Technical Details</h2> |
302 | 250 | <div class="stat-grid"> |
303 | 251 | <div class="stat-card"> |
304 | | - <div class="stat-label">Test Type</div> |
305 | | - <div class="stat-value">Correlation</div> |
| 252 | + <div class="stat-label">Test A — Lags</div> |
| 253 | + <div class="stat-value">1, 2</div> |
306 | 254 | </div> |
307 | 255 | <div class="stat-card"> |
308 | | - <div class="stat-label">Lags Tested</div> |
309 | | - <div class="stat-value">Multiple</div> |
| 256 | + <div class="stat-label">Test A — Min Bits</div> |
| 257 | + <div class="stat-value">10</div> |
310 | 258 | </div> |
311 | 259 | <div class="stat-card"> |
312 | | - <div class="stat-label">Min Sequence</div> |
313 | | - <div class="stat-value">1000+ bits</div> |
| 260 | + <div class="stat-label">Test B — Lag</div> |
| 261 | + <div class="stat-value">1</div> |
314 | 262 | </div> |
315 | 263 | <div class="stat-card"> |
316 | | - <div class="stat-label">Distribution</div> |
317 | | - <div class="stat-value">Normal</div> |
| 264 | + <div class="stat-label">Test B — Min Numbers</div> |
| 265 | + <div class="stat-value">3</div> |
318 | 266 | </div> |
319 | 267 | </div> |
320 | 268 |
|
321 | | - <h2>Mathematical Foundation</h2> |
322 | | - <p>The autocorrelation function is a fundamental tool in signal processing and time series analysis. For a random bit sequence, the theoretical autocorrelation is:</p> |
323 | | - <div class="example"> |
324 | | -E[C(d)] = 0 for all d > 0 |
325 | | -Var[C(d)] ≈ 1/(n-d) |
326 | | - |
327 | | -Where E[] is expected value and Var[] is variance |
328 | | - </div> |
329 | | - |
330 | | - <h2>Relationship to Other Tests</h2> |
331 | | - <p>The Autocorrelation Test complements other randomness tests:</p> |
| 269 | + <h2>What These Tests Catch</h2> |
332 | 270 | <ul> |
333 | | - <li><strong>Runs Test:</strong> Detects local run patterns (similar to lag d=1)</li> |
334 | | - <li><strong>Spectral Test:</strong> Detects periodicities using Fourier analysis</li> |
335 | | - <li><strong>Serial Test:</strong> Checks 2-bit pattern distributions</li> |
336 | | - <li><strong>Autocorrelation Test:</strong> Directly measures bit-to-bit dependencies at various lags</li> |
| 271 | + <li><strong>Strict alternation</strong> like <code>0,1,0,1,...</code> or <code>5,10,5,10,...</code></li> |
| 272 | + <li><strong>Long runs</strong> of identical bits or near-identical numbers</li> |
| 273 | + <li><strong>Linear sequences</strong> like <code>1,2,3,4,...</code> (high positive correlation)</li> |
| 274 | + <li><strong>Trivial generators</strong> whose successive outputs are tightly correlated</li> |
337 | 275 | </ul> |
338 | 276 |
|
339 | | - <h2>When to Use This Test</h2> |
340 | | - <p>This test is particularly valuable for:</p> |
| 277 | + <h2>What These Tests Miss</h2> |
341 | 278 | <ul> |
342 | | - <li>Detecting periodic behavior in the generator</li> |
343 | | - <li>Identifying linear feedback structures</li> |
344 | | - <li>Validating cryptographic generators (critical for stream ciphers)</li> |
345 | | - <li>Checking for short periods in PRNGs</li> |
346 | | - <li>Complementing frequency and distribution tests</li> |
| 279 | + <li>Periodicities at lags ≥ 3 (Test A only inspects lags 1 and 2)</li> |
| 280 | + <li>Non-linear dependencies (Pearson correlation in Test B is linear-only)</li> |
| 281 | + <li>Pattern biases at the multi-bit level — use the Serial, Poker, and FFT tests for those</li> |
347 | 282 | </ul> |
348 | 283 |
|
349 | | - <h2>Real-World Example</h2> |
350 | | - <p>A flawed LFSR (Linear Feedback Shift Register) generator might produce output where every 32nd bit is correlated due to its internal structure. The autocorrelation test at lag d=32 would detect this, even though the sequence might pass simpler frequency tests.</p> |
| 284 | + <h2>Relationship to Other Tests</h2> |
| 285 | + <ul> |
| 286 | + <li><strong>NIST Runs Test:</strong> a more rigorous treatment of lag-1 bit transitions</li> |
| 287 | + <li><strong>NIST Serial Test:</strong> checks the joint frequency of all m-bit patterns</li> |
| 288 | + <li><strong>NIST DFT (Spectral) Test:</strong> detects periodicities at any frequency</li> |
| 289 | + </ul> |
| 290 | + <p>The two tests on this page are deliberately fast and shallow; treat them as a first pass, not as a substitute for the NIST suite.</p> |
351 | 291 |
|
352 | 292 | <a href="/" class="back-link">← Back to Validator</a> |
353 | 293 | </div> |
|