
Commit 7efef46

Updated documentation
1 parent c08b07b commit 7efef46

9 files changed

Lines changed: 176 additions & 361 deletions

static/tests/autocorrelation.html

Lines changed: 76 additions & 136 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
<head>
44
<meta charset="UTF-8">
55
<meta name="viewport" content="width=device-width, initial-scale=1.0">
6-
<title>Autocorrelation Test - Statistical Test Suite</title>
6+
<title>Autocorrelation / Serial Correlation Test - Statistical Test Suite</title>
77
<style>
88
* {
99
margin: 0;
@@ -160,194 +160,134 @@
160160
</head>
161161
<body>
162162
<div class="container">
163-
<span class="test-category">Enhanced Statistical Test</span>
164-
<h1>Autocorrelation Test</h1>
163+
<span class="test-category">Enhanced / Small-Sequence Statistical Test</span>
164+
<h1>Autocorrelation &amp; Serial Correlation Tests</h1>
165165

166166
<div class="highlight-box">
167-
<strong>Quick Summary:</strong> The Autocorrelation Test checks for correlations between bits at different positions (lag) in the sequence. It detects whether knowing past bits helps predict future bits, which would indicate the sequence is not truly random. Think of it as checking if the sequence has "memory" of its previous values.
167+
<strong>Quick Summary:</strong> Two related tests check whether values in the sequence are independent of their neighbours. The <strong>bit-level Autocorrelation Test</strong> looks at lag-1 and lag-2 bit matches; the <strong>number-level Serial Correlation Test</strong> computes the lag-1 Pearson correlation on the numbers themselves. Both should give a result close to "no correlation" for random input.
168168
</div>
169169

170-
<h2>What It Tests For</h2>
171-
<p>This test examines <strong>independence between bit positions</strong> separated by a fixed distance (lag). A truly random sequence should have no correlation between bits, regardless of how far apart they are. The test detects:</p>
172-
<ul>
173-
<li>Self-similarity in the bit sequence</li>
174-
<li>Periodic patterns that repeat at regular intervals</li>
175-
<li>Dependencies between bit positions</li>
176-
<li>Hidden cyclic behavior in the generator</li>
177-
<li>Linear feedback structures or predictable recurrence</li>
178-
</ul>
170+
<p>This page covers two distinct tests in this validator. They are described separately below because they operate on different data and use different formulas.</p>
179171

180-
<h2>How It Works</h2>
181-
<h3>Step 1: Choose a Lag (d)</h3>
182-
<p>The lag determines how far apart to compare bits. Common lags tested are d = 1, 2, 4, 8, 16, etc:</p>
183-
<div class="example">
184-
Lag d = 1: Compare each bit with the next bit
185-
Lag d = 2: Compare each bit with the bit 2 positions ahead
186-
Lag d = 8: Compare each bit with the bit 8 positions ahead
187-
</div>
172+
<h2>Test A: Bit-Level Autocorrelation (Enhanced Stats)</h2>
188173

189-
<h3>Step 2: Compute the Autocorrelation Function</h3>
190-
<p>For a bit sequence of length n, compare each bit with the bit d positions ahead:</p>
191-
<div class="example">
192-
Original sequence: 1 0 1 1 0 0 1 1 0 1 ...
193-
Shifted by d=2: _ _ 1 0 1 1 0 0 1 1 ...
174+
<p>Operates on the bit stream produced from the input numbers. It is intentionally simple — it only looks at lags 1 and 2.</p>
194175

195-
For each position i, compare bit[i] with bit[i+d]
196-
Count matches and mismatches
197-
</div>
198-
199-
<h3>Step 3: Calculate Correlation Coefficient</h3>
200-
<p>The autocorrelation coefficient measures the similarity between the original and shifted sequences:</p>
176+
<h3>How It Works</h3>
177+
<p>For each lag <em>d</em> ∈ {1, 2}, count how often <code>bits[i] == bits[i+d]</code> across the stream and convert it into a deviation from the random expectation of 0.5:</p>
201178
<div class="example">
202-
C(d) = (1/n) × Σ [(2×b[i] - 1) × (2×b[i+d] - 1)]
203-
204-
Where:
205-
- b[i] is the bit at position i (0 or 1)
206-
- Sum runs from i=0 to n-d
207-
- Result ranges from -1 (perfect anti-correlation) to +1 (perfect correlation)
179+
match_ratio(d) = (# i where bits[i] == bits[i+d]) / (n - d)
180+
deviation(d) = | match_ratio(d) - 0.5 |
181+
statistic = max(deviation(1), deviation(2))
208182
</div>
209183

210-
<h3>Step 4: Normalize and Compute Statistic</h3>
211-
<p>Convert the correlation to a standardized test statistic:</p>
212-
<div class="example">
213-
Z = C(d) × √(n-d)
214-
215-
For random data:
216-
- Z should be close to 0
217-
- Z follows approximately normal distribution N(0,1)
218-
</div>
184+
<h3>Pass / Fail Rule</h3>
185+
<p>The test passes if the statistic is below 0.15 (i.e. the lag-1 and lag-2 match ratios both fall in the range 0.35–0.65). There is no p-value.</p>
219186

220187
<div class="info-box">
221-
<strong>Understanding Results:</strong>
222188
<ul>
223-
<li><strong>Z ≈ 0:</strong> PASS - No correlation detected, bits are independent</li>
224-
<li><strong>Z >> 0:</strong> FAIL - Positive correlation, bits tend to repeat</li>
225-
<li><strong>Z << 0:</strong> FAIL - Negative correlation, bits tend to alternate</li>
189+
<li><strong>statistic &lt; 0.15:</strong> PASS — no significant lag-1/lag-2 dependency</li>
190+
<li><strong>statistic ≥ 0.15:</strong> FAIL — adjacent or near-adjacent bits are too similar (or too different)</li>
226191
</ul>
227192
</div>
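A minimal Python sketch of the Test A rule described above, assuming the bit stream is already available as a list of 0/1 integers. The names `match_ratio`, `autocorrelation_statistic` and `passes` are hypothetical and are not taken from the validator's source.

```python
from typing import Sequence

PASS_THRESHOLD = 0.15  # pass/fail cut-off quoted in the rule above


def match_ratio(bits: Sequence[int], d: int) -> float:
    """Fraction of positions i where bits[i] == bits[i + d]."""
    n = len(bits)
    matches = sum(1 for i in range(n - d) if bits[i] == bits[i + d])
    return matches / (n - d)


def autocorrelation_statistic(bits: Sequence[int]) -> float:
    """Largest deviation of the lag-1 and lag-2 match ratios from 0.5."""
    return max(abs(match_ratio(bits, d) - 0.5) for d in (1, 2))


def passes(bits: Sequence[int]) -> bool:
    """True when both match ratios stay within 0.15 of 0.5."""
    return autocorrelation_statistic(bits) < PASS_THRESHOLD


alternating = [0, 1] * 8
print(match_ratio(alternating, 1), match_ratio(alternating, 2))
print(autocorrelation_statistic(alternating), passes(alternating))
```

For the strictly alternating input the lag-1 ratio is 0.0 and the lag-2 ratio is 1.0, so the statistic is 0.5 and the sketch reports a failure, matching the first example on this page.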
228193

229-
<h2>Interpreting Results</h2>
230-
231-
<h3>Pass (no correlation)</h3>
232-
<p>Bits at different positions are statistically independent. Knowing the value of bit[i] provides no information about bit[i+d]. The generator produces uncorrelated output with no detectable memory or cyclic patterns.</p>
233-
234-
<h3>Fail (positive correlation)</h3>
235-
<p>Bits tend to match their predecessors at the given lag. Common causes include:</p>
236-
<ul>
237-
<li><strong>Periodic Patterns:</strong> Sequence repeats with period related to the lag</li>
238-
<li><strong>Long Runs:</strong> Generator produces extended sequences of identical bits</li>
239-
<li><strong>Insufficient Mixing:</strong> State transitions don't adequately randomize output</li>
240-
<li><strong>Linear Recurrence:</strong> Generator uses linear feedback that creates correlations</li>
241-
</ul>
242-
243-
<h3>Fail (negative correlation)</h3>
244-
<p>Bits tend to be opposite to their predecessors at the given lag. Common causes include:</p>
245-
<ul>
246-
<li><strong>Alternating Patterns:</strong> Excessive bit flipping at regular intervals</li>
247-
<li><strong>Overcorrection:</strong> Generator tries too hard to balance 0s and 1s</li>
248-
<li><strong>Phase Relationships:</strong> Internal state oscillates predictably</li>
249-
</ul>
194+
<h3>Examples</h3>
195+
<div class="example">
196+
Sequence 01010101...
197+
match_ratio(1) = 0.0 → deviation = 0.5 → FAIL (anti-correlated)
250198

251-
<h2>Common Failure Patterns</h2>
199+
Sequence 11111111...
200+
match_ratio(1) = 1.0 → deviation = 0.5 → FAIL (over-correlated)
252201

253-
<h3>Example 1: Perfect Period</h3>
254-
<div class="example">
255-
Sequence: 10110101101011011010110101101...
256-
Pattern: "1011" repeats with period 4
202+
Random-looking bits
203+
match_ratio(1) ≈ 0.5 → deviation small → PASS
204+
</div>
257205

258-
At lag d=4: Perfect correlation (C=1.0)
259-
Result: FAIL - sequence has period 4
206+
<div class="warning-box">
207+
<strong>Limitations:</strong> Only lags 1 and 2 are checked. Periodicities at longer lags (e.g. a 32-bit cycle) will not be detected by this test. For a more thorough analysis, use the NIST DFT (Spectral) Test and the Serial Test.
260208
</div>
261209

262-
<h3>Example 2: Alternating Bits</h3>
263-
<div class="example">
264-
Sequence: 10101010101010101010101010...
210+
<h2>Test B: Number-Level Serial Correlation (Small Sequence)</h2>
265211

266-
At lag d=1: Perfect anti-correlation (C=-1.0)
267-
At lag d=2: Perfect correlation (C=1.0)
268-
Result: FAIL - strictly alternating sequence
269-
</div>
212+
<p>Operates on the input numbers as floating-point values, not on the bit stream. It computes a standard lag-1 Pearson correlation coefficient.</p>
270213

271-
<h3>Example 3: Long Runs</h3>
214+
<h3>How It Works</h3>
215+
<p>The numbers are first normalised to the range [0, 1] using their min and max. Then the lag-1 correlation is computed:</p>
272216
<div class="example">
273-
Sequence: 11111111110000000000111111111...
217+
xᵢ = (numbers[i] - min) / (max - min)
218+
mean = average of all xᵢ
274219

275-
At lag d=1,2,3: High positive correlation
276-
Result: FAIL - excessive runs of identical bits
220+
Σ (xᵢ - mean) × (xᵢ₊₁ - mean)
221+
corr = ───────────────────────────────────
222+
Σ (xᵢ - mean)²
277223
</div>
224+
<p>The result lies in [-1, +1]. Values near 0 indicate independence; values near +1 indicate that consecutive numbers tend to be similar; values near -1 indicate they tend to alternate.</p>
278225

279-
<h3>Example 4: Weak PRNG</h3>
226+
<h3>Pass / Fail Rule</h3>
227+
<p>The threshold relaxes for shorter inputs, because small samples have higher natural variance:</p>
280228
<div class="example">
281-
Linear Congruential Generator with short period:
282-
Sequence shows correlation at multiple lags
229+
n &lt; 10 → fail if |corr| &gt; 0.6
230+
10 ≤ n &lt; 20 → fail if |corr| &gt; 0.5
231+
n ≥ 20 → fail if |corr| &gt; 0.4
283232

284-
At lag d=16: C(16) = 0.45
285-
At lag d=32: C(32) = 0.38
286-
Result: FAIL - generator has linear structure
233+
p-value (heuristic) = clamp(1 - |corr| / threshold, 0, 1)
287234
</div>
235+
<p>Edge case: if every input number is identical, the test fails immediately with the message "All numbers are identical".</p>
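A rough Python sketch of Test B as described above. It assumes the denominator sums over all n normalised values and the numerator over the n - 1 adjacent pairs; the page does not state the summation ranges, so that is an assumption, as are the function names. Raising `ValueError` for identical inputs is this sketch's choice; the validator reports a failed test instead.

```python
from typing import Sequence, Tuple


def threshold_for(n: int) -> float:
    """Length-dependent cut-off from the pass/fail rule above."""
    if n < 10:
        return 0.6
    if n < 20:
        return 0.5
    return 0.4


def lag1_correlation(numbers: Sequence[float]) -> float:
    """Lag-1 correlation of the min-max normalised numbers."""
    lo, hi = min(numbers), max(numbers)
    if hi == lo:
        # Edge case from the text; raising is an assumption of this sketch.
        raise ValueError("All numbers are identical")
    x = [(v - lo) / (hi - lo) for v in numbers]
    mean = sum(x) / len(x)
    # Numerator: products of adjacent deviations (n - 1 pairs).
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(len(x) - 1))
    # Denominator: assumed to cover all n values.
    den = sum((xi - mean) ** 2 for xi in x)
    return num / den


def serial_correlation_test(numbers: Sequence[float]) -> Tuple[float, float, bool]:
    """Return (corr, heuristic p-value, passed)."""
    corr = lag1_correlation(numbers)
    threshold = threshold_for(len(numbers))
    p_value = min(1.0, max(0.0, 1.0 - abs(corr) / threshold))
    return corr, p_value, abs(corr) <= threshold


print(serial_correlation_test([50, 10, 50, 10, 50, 10, 50, 10]))
```

Because the numerator has one term fewer than the denominator, short sequences cannot reach exactly ±1 under this assumption; a variant that correlates `x[:-1]` against `x[1:]` with separate means would behave differently.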
288236

289-
<h2>Multiple Lag Testing</h2>
290-
<p>The test is typically run at multiple lags to detect different types of patterns:</p>
237+
<h3>Examples</h3>
291238
<div class="example">
292-
Short lags (d=1,2,4): Detect local patterns
293-
Medium lags (d=8,16): Detect periodic behavior
294-
Long lags (d=64,128): Detect generator period issues
295-
</div>
239+
Numbers: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
240+
Strictly increasing → corr strongly positive (≈ +0.7 if the denominator sums over all n values) → FAIL
296241

297-
<div class="warning-box">
298-
<strong>Important:</strong> A good random sequence should pass the test at ALL tested lags. Failure at any lag indicates a structural problem with the generator.
242+
Numbers: 50, 10, 50, 10, 50, 10, 50, 10
243+
Strict alternation → corr ≈ -1.0 → FAIL
244+
245+
Numbers: 42, 7, 88, 23, 91, 4, 60, 71
246+
Looks unpatterned, but low and high values alternate → corr ≈ -0.67 with the formula above → FAIL (n = 8, threshold 0.6)
299247
</div>
300248

301249
<h2>Technical Details</h2>
302250
<div class="stat-grid">
303251
<div class="stat-card">
304-
<div class="stat-label">Test Type</div>
305-
<div class="stat-value">Correlation</div>
252+
<div class="stat-label">Test A — Lags</div>
253+
<div class="stat-value">1, 2</div>
306254
</div>
307255
<div class="stat-card">
308-
<div class="stat-label">Lags Tested</div>
309-
<div class="stat-value">Multiple</div>
256+
<div class="stat-label">Test A — Min Bits</div>
257+
<div class="stat-value">10</div>
310258
</div>
311259
<div class="stat-card">
312-
<div class="stat-label">Min Sequence</div>
313-
<div class="stat-value">1000+ bits</div>
260+
<div class="stat-label">Test B — Lag</div>
261+
<div class="stat-value">1</div>
314262
</div>
315263
<div class="stat-card">
316-
<div class="stat-label">Distribution</div>
317-
<div class="stat-value">Normal</div>
264+
<div class="stat-label">Test B — Min Numbers</div>
265+
<div class="stat-value">3</div>
318266
</div>
319267
</div>
320268

321-
<h2>Mathematical Foundation</h2>
322-
<p>The autocorrelation function is a fundamental tool in signal processing and time series analysis. For a random bit sequence, the theoretical autocorrelation is:</p>
323-
<div class="example">
324-
E[C(d)] = 0 for all d > 0
325-
Var[C(d)] ≈ 1/(n-d)
326-
327-
Where E[] is expected value and Var[] is variance
328-
</div>
329-
330-
<h2>Relationship to Other Tests</h2>
331-
<p>The Autocorrelation Test complements other randomness tests:</p>
269+
<h2>What These Tests Catch</h2>
332270
<ul>
333-
<li><strong>Runs Test:</strong> Detects local run patterns (similar to lag d=1)</li>
334-
<li><strong>Spectral Test:</strong> Detects periodicities using Fourier analysis</li>
335-
<li><strong>Serial Test:</strong> Checks 2-bit pattern distributions</li>
336-
<li><strong>Autocorrelation Test:</strong> Directly measures bit-to-bit dependencies at various lags</li>
271+
<li><strong>Strict alternation</strong> like <code>0,1,0,1,...</code> or <code>5,10,5,10,...</code></li>
272+
<li><strong>Long runs</strong> of identical bits or near-identical numbers</li>
273+
<li><strong>Linear sequences</strong> like <code>1,2,3,4,...</code> (high positive correlation)</li>
274+
<li><strong>Trivial generators</strong> whose successive outputs are tightly correlated</li>
337275
</ul>
338276

339-
<h2>When to Use This Test</h2>
340-
<p>This test is particularly valuable for:</p>
277+
<h2>What These Tests Miss</h2>
341278
<ul>
342-
<li>Detecting periodic behavior in the generator</li>
343-
<li>Identifying linear feedback structures</li>
344-
<li>Validating cryptographic generators (critical for stream ciphers)</li>
345-
<li>Checking for short periods in PRNGs</li>
346-
<li>Complementing frequency and distribution tests</li>
279+
<li>Periodicities at lags ≥ 3 (Test A only inspects lags 1 and 2)</li>
280+
<li>Non-linear dependencies (Pearson correlation in Test B is linear-only)</li>
281+
<li>Pattern biases at the multi-bit level — use the Serial, Poker, and FFT tests for those</li>
347282
</ul>
348283

349-
<h2>Real-World Example</h2>
350-
<p>A flawed LFSR (Linear Feedback Shift Register) generator might produce output where every 32nd bit is correlated due to its internal structure. The autocorrelation test at lag d=32 would detect this, even though the sequence might pass simpler frequency tests.</p>
284+
<h2>Relationship to Other Tests</h2>
285+
<ul>
286+
<li><strong>NIST Runs Test:</strong> a more rigorous treatment of lag-1 bit transitions</li>
287+
<li><strong>NIST Serial Test:</strong> checks the joint frequency of all m-bit patterns</li>
288+
<li><strong>NIST DFT (Spectral) Test:</strong> detects periodicities at any frequency</li>
289+
</ul>
290+
<p>The two tests on this page are deliberately fast and shallow; treat them as a first pass, not as a substitute for the NIST suite.</p>
351291

352292
<a href="/" class="back-link">← Back to Validator</a>
353293
</div>

static/tests/chisquared.html

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -274,13 +274,13 @@ <h3>Example 2: Poor Distribution</h3>
274274
Unique: 4 values
275275
Expected: 2.0 per value
276276
Observed:
277-
3: 2 times (OK)
278-
5: 4 times (2× expected!) ← problem
279-
7: 1 time (½ expected)
280-
8: 1 time (½ expected)
281-
χ²: 4.0
282-
Threshold: 8.0
283-
Result: Borderline, but value 5 shows concerning bias
277+
3: 2 times (OK, contributes (2-2)²/2 = 0.0)
278+
5: 4 times (2×!, contributes (4-2)²/2 = 2.0) ← problem
279+
7: 1 time (½×, contributes (1-2)²/2 = 0.5)
280+
8: 1 time (½×, contributes (1-2)²/2 = 0.5)
281+
χ²: 3.0
282+
Threshold: 8.0 (k × 2 with k = 4)
283+
Result: PASS, but value 5 still shows concerning bias
284284
</div>
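The arithmetic in Example 2 can be checked with a short Python sketch. It assumes the expected count is the sample size divided by the number of distinct values, and that the threshold is k × 2 as stated in the example; the `chi_squared` helper and the sample list are illustrative, not the validator's code.

```python
from collections import Counter
from typing import Sequence


def chi_squared(values: Sequence[int]) -> float:
    """Chi-squared statistic against a uniform expectation over observed values."""
    counts = Counter(values)
    expected = len(values) / len(counts)  # 8 observations / 4 values = 2.0
    return sum((obs - expected) ** 2 / expected for obs in counts.values())


# Observed counts from Example 2: 3 twice, 5 four times, 7 once, 8 once.
sample = [3, 3, 5, 5, 5, 5, 7, 8]
stat = chi_squared(sample)
threshold = 2 * len(set(sample))  # k x 2 with k = 4

print(stat, threshold, stat < threshold)
```

The per-value contributions are 0.0, 2.0, 0.5 and 0.5, so the statistic is 3.0 against a threshold of 8.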
285285

286286
<h3>Example 3: Clustered Values</h3>

static/tests/fft.html

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -222,10 +222,13 @@ <h3>Step 4: Count Peaks Above Threshold</h3>
222222
<h3>Step 5: Statistical Comparison</h3>
223223
<p>The observed percentage of peaks below the threshold is compared to the expected 95%:</p>
224224
<div class="example">
225-
N0 = number of peaks below threshold
226-
N1 = (95% of N/2) - expected count below threshold
227-
d = (N0 - N1) / sqrt(N * 0.95 * 0.05 / 4)
225+
N0 = 0.95 × (N/2) — expected count below threshold
226+
N1 = observed peaks below T — actual count below threshold
227+
d = (N1 - N0) / sqrt(N * 0.95 * 0.05 / 4)
228228
p-value = erfc(|d| / sqrt(2))
229+
230+
(NIST SP 800-22 §2.6 convention: N0 is the theoretical
231+
expectation, N1 is the observed value.)
229232
</div>
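A small Python sketch of the Step 5 computation, using the N0/N1 convention shown above. The function name `dft_p_value` and the input numbers are hypothetical; only `math.sqrt` and `math.erfc` from the standard library are used.

```python
import math


def dft_p_value(n_bits: int, observed_below: int) -> float:
    """p-value for the count of spectral peaks below the threshold T."""
    n0 = 0.95 * n_bits / 2   # N0: expected count below T
    n1 = observed_below      # N1: observed count below T
    d = (n1 - n0) / math.sqrt(n_bits * 0.95 * 0.05 / 4)
    return math.erfc(abs(d) / math.sqrt(2))


# Illustrative inputs: 1000 bits give 500 peaks, of which 475 are expected below T.
print(dft_p_value(1000, 475))  # observed equals expected, so d = 0
print(dft_p_value(1000, 440))  # far fewer peaks below T than expected
```

When the observed count equals the expectation, d is 0 and the p-value is 1; the further the observed count drifts from 95%, the smaller the p-value becomes.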
230233

231234
<div class="info-box">

static/tests/linearcomplexity.html

Lines changed: 11 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -188,9 +188,15 @@ <h2>How It Works</h2>
188188
<h3>Step 1: Divide Into Blocks</h3>
189189
<p>The sequence is divided into M non-overlapping blocks of length N bits each (typically N = 500 or N = 1000):</p>
190190
<div class="example">
191-
Block size N = 500 bits
192-
Minimum M = 200 blocks
193-
Total bits needed: 100,000 minimum
191+
Block size M = 500–5000 bits (NIST notation: M)
192+
Minimum number of blocks N ≥ 200
193+
Theoretical minimum: 200 × 500 = 100,000 bits
194+
NIST recommendation: ≥ 1,000,000 bits
195+
196+
(Note: this page later uses N to mean block length in
197+
the formulas below; NIST uses M for block length and N
198+
for the number of blocks. The formulas are correct
199+
either way.)
194200
</div>
195201

196202
<h3>Step 2: Calculate Linear Complexity for Each Block</h3>
@@ -307,8 +313,8 @@ <h2>Technical Details</h2>
307313
<div class="stat-value">500-1000</div>
308314
</div>
309315
<div class="stat-card">
310-
<div class="stat-label">Min Bits</div>
311-
<div class="stat-value">~1,000,000</div>
316+
<div class="stat-label">Recommended Bits</div>
317+
<div class="stat-value">1,000,000</div>
312318
</div>
313319
<div class="stat-card">
314320
<div class="stat-label">Significance</div>

static/tests/nonoverlappingtemplate.html

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -199,8 +199,13 @@ <h3>Step 2: Divide Sequence Into Blocks</h3>
199199
<div class="example">
200200
Sequence: [Block 1: M bits][Block 2: M bits]...[Block N: M bits]
201201

202-
Typical block size M = 968 bits for m=9 templates
203-
Number of blocks N = 8 (minimum)
202+
NIST recommends:
203+
Number of blocks N = 8
204+
Block size M = n / 8
205+
Total length n ≥ 1,000,000 bits → M ≈ 125,000
206+
207+
Minimum-data configuration (rarely sufficient in practice):
208+
N = 8, M = 968 → n = 7,744 bits
204209
</div>
205210

206211
<h3>Step 3: Count Template Matches (Non-overlapping)</h3>
