# 8 · The Appropriate Optics: Why the Logarithmic Scale Is Structural

**Observational record associated with the book**  
*Discovering Chaos in Prime Numbers — Computational Investigations through the Euler Mirror*  
© Alvaro Costa, 2025

This notebook is part of a canonical sequence of computational records.  
It introduces **no new hypotheses, conjectures, or interpretative models**.

Its sole purpose is to **record** the behaviour of arithmetic structures under an explicit,  
deterministic, and reproducible regime of observation.

The complete conceptual discussion is presented in the book.  
This notebook documents only the corresponding experiment.

**Licence:** Creative Commons BY–NC–ND 4.0  
Reading, execution, and citation are permitted.  
Modification, derivative redistribution, or independent commercial use are not permitted.


---

## 1. Distinct statistical regimes under different observation metrics

In the previous chapters, the emergence of statistics compatible with the GOE was observed for the operator $M$. A further analysis, however, reveals a crucial point: **the observed statistical regime depends on the sampling metric adopted along the number line**.

When the points $x_i$ are sampled **linearly**, the spectrum of $M$ exhibits statistics compatible with a **Poisson** regime. When sampling is performed **logarithmically**, while keeping both the construction of the operator and the arithmetic region fixed, a **clearly correlated spectral regime** emerges, compatible with **GOE** statistics.

This contrast is not a numerical artefact. It reflects the explicit dependence between the constructed operator and the observation metric employed.
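
As a minimal sketch of the two sampling rules contrasted above (not the experiment itself), the snippet below builds a linear grid and a logarithmic grid around the same centre. The values of `X0`, `N`, and `span` are illustrative assumptions, and only the standard library is used.

```python
import math

X0 = 10**6      # illustrative centre of the arithmetic region (assumption)
N = 8           # illustrative number of sample points (assumption)
span = 2.4      # illustrative window width in ln x (assumption)

# Linear sampling: consecutive integers starting at X0.
x_linear = [X0 + k for k in range(N)]

# Logarithmic sampling: points equally spaced in ln x, centred on ln X0.
lo = math.log(X0) - span / 2
step = span / (N - 1)
x_log = [math.exp(lo + k * step) for k in range(N)]

# Linear grid: constant difference. Logarithmic grid: constant ratio.
linear_steps = [b - a for a, b in zip(x_linear, x_linear[1:])]
log_ratios = [b / a for a, b in zip(x_log, x_log[1:])]

print("linear steps:", linear_steps)
print("log ratios:  ", [round(r, 4) for r in log_ratios])
print("log-grid coverage factor:", round(x_log[-1] / x_log[0], 3))
```

With these values the linear grid spans a relative width of only `N / X0`, whereas the logarithmic grid spans a factor of $e^{\text{span}}$ in $x$.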

---

## 2. Observation metric and the natural scale of the primes

The asymptotic growth of the prime-counting function is given by the prime number theorem,
$$
\pi(x) \sim \frac{x}{\ln x},
$$
so the local density of primes near $x$ is approximately $1/\ln x$. The natural scale associated with the distribution of primes is therefore **logarithmic**, rather than linear.

Consequently, the choice of sampling metric acts as a structural filter:

* metrics compatible with the logarithmic scale preserve the relevant variations of the arithmetic signal;
* incompatible metrics tend to suppress long-range correlations.

Thus, different sampling strategies correspond to different observation regimes of the *same* operator.
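
The asymptotic relation quoted at the start of this section can be checked numerically. The sketch below is a standalone illustration using only the standard library. The helper names `primes_up_to` and `prime_count` are hypothetical and are not part of the notebook's experiment cell.

```python
import bisect
import math

def primes_up_to(n):
    """Plain sieve of Eratosthenes; returns the sorted list of primes <= n."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            # Clear every multiple of p starting at p*p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(is_prime) if flag]

primes = primes_up_to(10**6)

def prime_count(x):
    """pi(x): the number of primes <= x."""
    return bisect.bisect_right(primes, x)

# Compare pi(x) with x / ln x at successive powers of ten.
ratios = []
for k in range(3, 7):
    x = 10**k
    ratio = prime_count(x) / (x / math.log(x))
    ratios.append(ratio)
    print(f"x = 10^{k}:  pi(x) = {prime_count(x):6d}   pi(x) / (x / ln x) = {ratio:.4f}")
```

Over this range the ratio stays above 1 and decreases slowly, which is consistent with the asymptotic relation.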

---

## 3. Logarithmic sampling and the correlated regime

When the points $x_i$ are distributed uniformly in $\ln x$, the sampling remains coherent with the multiplicative structure implicit in the operator $M$.

* **Effect on $\Delta_\pi(x)$:**
  The fluctuations of the signal are sampled in a balanced manner across scales, preserving their structural variability.

* **Spectral consequence:**
  The resulting matrix $M$ exhibits high internal complexity, and its spectrum displays level repulsion characteristic of correlated statistics, compatible with the **GOE (Gaussian Orthogonal Ensemble)** class.
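
For reference, the quantities named above can be written out explicitly. The formulas below are read off from the experiment cell in Section 6 of this notebook (`get_delta_pi_for_points` and `generate_cos_matrix`), not from the book, so they describe the implementation and should not be taken as the canonical definition:

$$
\Delta_\pi(x) = \pi(x) - 2\,\pi\!\left(\frac{x}{2}\right), \qquad
C_{ij} = \cos\!\bigl(\Delta_\pi(x_i)\,\ln x_j\bigr), \qquad
M = \frac{S - \overline{S}}{\sigma_S}, \quad S = C + C^{\mathsf T},
$$

where $\overline{S}$ and $\sigma_S$ denote the mean and the standard deviation taken over all entries of $S$. In the code, $\pi$ is evaluated at $\lfloor x \rfloor$ and at $\lfloor \lfloor x \rfloor / 2 \rfloor$.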

---

## 4. Linear sampling and the uncorrelated regime

When the points $x_i$ are distributed uniformly in $x$, the sampling ignores the logarithmic scale underlying the distribution of the primes.

* **Effect on $\Delta_\pi(x)$:**
  Within restricted linear windows, the signal varies slowly and exhibits approximately independent fluctuations.

* **Spectral consequence:**
  The matrix $M$ constructed in this regime has lower effective variability, and the resulting spectrum is compatible with statistics of independent events, that is, a **Poisson** regime.
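
The difference between the two regimes named in Sections 3 and 4 is most visible at small spacings. The sketch below does not use the operator $M$. It draws synthetic spacings from the two reference curves that the experiment cell plots as 'Poisson Theory' and 'GOE Theory' (the latter being the Wigner surmise) and compares the fraction of spacings below a small threshold. The seed, sample size, and threshold are illustrative assumptions.

```python
import math
import random

rng = random.Random(0)   # fixed seed so the sketch is reproducible
n = 50_000               # illustrative sample size (assumption)

# Poisson regime: unit-mean exponential spacings, P(s) = exp(-s).
poisson = [rng.expovariate(1.0) for _ in range(n)]

# GOE reference: Wigner surmise P(s) = (pi*s/2) * exp(-pi*s^2/4),
# sampled by inverting its CDF F(s) = 1 - exp(-pi*s^2/4).
wigner = [math.sqrt(-4.0 * math.log(1.0 - rng.random()) / math.pi) for _ in range(n)]

def fraction_below(spacings, threshold):
    """Fraction of spacings strictly smaller than the threshold."""
    return sum(s < threshold for s in spacings) / len(spacings)

threshold = 0.1
frac_poisson = fraction_below(poisson, threshold)
frac_wigner = fraction_below(wigner, threshold)

# Closed-form values of the same fractions, from the two CDFs.
exact_poisson = 1.0 - math.exp(-threshold)
exact_wigner = 1.0 - math.exp(-math.pi * threshold**2 / 4.0)

print(f"Poisson: sampled {frac_poisson:.4f}, exact {exact_poisson:.4f}")
print(f"Wigner:  sampled {frac_wigner:.4f}, exact {exact_wigner:.4f}")
```

Near-degenerate levels are common under the exponential law and strongly suppressed under the Wigner surmise, which is what level repulsion means in practice.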

---

## 5. Comparison between observation regimes

| Characteristic                 | Linear sampling | Logarithmic sampling   |
| ------------------------------ | --------------- | ---------------------- |
| Sampling metric                | Uniform in $x$  | Uniform in $\ln x$     |
| Compatibility with $\pi(x)$    | Low             | High                   |
| Variability of $\Delta_\pi(x)$ | Locally reduced | Structurally preserved |
| Complexity of operator $M$     | Low             | High                   |
| Spectral statistics            | Poisson         | GOE                    |

---

## 6. Local normalisation and verification of the Poisson regime

In the linear regime, small residual variations in the spectral density may obscure the identification of Poisson behaviour.

The function `local_normalize_spacings` applies a local normalisation of the spectral spacings, compensating for smooth variations in the mean density.

This procedure:

* **does not introduce correlation**;
* **does not modify the operator**;
* merely removes scale effects that could mask the underlying statistical regime.

After this normalisation, the exponential spacing distribution $P(s) = e^{-s}$ characteristic of the Poisson regime emerges clearly, confirming that it is an **observational property of the system under this metric**, rather than a computational artefact.
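
The effect of local normalisation can be illustrated on synthetic data, independently of the operator. The sketch below generates uncorrelated spacings whose mean drifts smoothly, then compares global and local normalisation. It mirrors the moving average with reflected edges used in `local_normalize_spacings`, but `moving_average` is a hypothetical standalone helper, and the drift profile, seed, and sample size are assumptions.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed so the sketch is reproducible
n = 5000                 # illustrative number of spacings (assumption)

# Uncorrelated exponential spacings whose mean drifts smoothly from 1 to 5.
scale = [1.0 + 4.0 * k / (n - 1) for k in range(n)]
raw = [rng.expovariate(1.0) * c for c in scale]

def moving_average(values, w):
    """Centred moving average with reflected edges; w must be odd."""
    pad = w // 2
    padded = values[pad:0:-1] + values + values[-2:-pad - 2:-1]
    window_sum = sum(padded[:w])
    out = [window_sum / w]
    for i in range(w, len(padded)):
        window_sum += padded[i] - padded[i - w]   # slide the window by one
        out.append(window_sum / w)
    return out

# Global normalisation: divide by a single mean.
global_mean = statistics.fmean(raw)
s_global = [s / global_mean for s in raw]

# Local normalisation: divide each spacing by its local mean (window w = 21).
local_mean = moving_average(raw, 21)
s_local = [s / m for s, m in zip(raw, local_mean)]

# A unit-mean exponential law has variance 1; density drift inflates it.
var_global = statistics.pvariance(s_global)
var_local = statistics.pvariance(s_local)
print(f"variance after global normalisation: {var_global:.3f}")
print(f"variance after local normalisation:  {var_local:.3f}")
```

In this synthetic setting the drift inflates the variance of the globally normalised spacings above the value 1 expected for an exponential law, while local normalisation removes most of that excess.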


```python
# Requirements: numpy, matplotlib, ipywidgets

import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from ipywidgets import interact

# --- Data Generation and Matrix Functions ---
def generate_pi_data(n: int) -> np.ndarray:
    """Returns an array containing all primes up to n, using a sieve over odd numbers."""
    if n < 2:
        return np.array([], dtype=np.int64)
    size = (n - 1) // 2  # index i represents the odd number 2*i + 3
    sieve = np.ones(size, dtype=bool)
    limit = int(np.sqrt(n)) // 2
    for i in range(limit):
        if sieve[i]:
            p = 2 * i + 3
            start = (p * p - 3) // 2  # index of p*p
            sieve[start::p] = False   # index step p = step 2p between odd multiples
    odd_primes = 2 * np.where(sieve)[0] + 3
    return np.concatenate((np.array([2], dtype=np.int64), odd_primes))

def get_delta_pi_for_points(x_points, primes):
    """Computes Δπ(x) = π(x) - 2π(x/2) for an array of x-points, given a sorted array of primes."""
    x_int = np.floor(x_points).astype(np.int64)
    pi_x = np.searchsorted(primes, x_int, side='right')
    pi_x_div_2 = np.searchsorted(primes, x_int // 2, side='right')
    return pi_x - 2 * pi_x_div_2

def generate_cos_matrix(fx_values, x_values):
    """Generates the matrix M from the vectors F(x) and x."""
    fx = fx_values.astype(np.float64)
    x = x_values.astype(np.float64)
    x[x <= 0] = 1e-12  # guard against the logarithm of non-positive values
    logx = np.log(x)
    C = np.cos(np.outer(fx, logx))
    M = C + C.T
    std_dev = M.std()
    if std_dev > 0:
        M -= M.mean()
        M /= std_dev
    return 0.5 * (M + M.T)  # re-symmetrise to absorb rounding asymmetry

# Fixed bulk: central 80% (alpha = 0.10 trimmed from each edge)
# Fixed local window for unfolding (not optimised): w = 21
def local_normalize_spacings(lam, alpha=0.10, w=21):
    """
    Normalises spacings by their local mean (unfolding).
    This is the key step to correctly visualise Poisson statistics.
    """
    N = lam.size
    # Extract the spectral bulk to avoid edge effects
    k0, k1 = int(alpha * N), int((1 - alpha) * N)
    lam_bulk = np.sort(lam)[k0:k1]

    s = np.diff(lam_bulk)
    s = s[s > 0]

    if s.size == 0:
        return s  # nothing to normalise
    if s.size < w:
        # Too few spacings for a moving window: fall back to the global mean
        return s / s.mean()

    # Use a moving average to estimate the local mean spacing
    w = int(w)
    if w % 2 == 0:
        w += 1  # Window length must be odd
    pad = w // 2
    s_padded = np.pad(s, (pad, pad), mode='reflect')
    local_mean = np.convolve(s_padded, np.ones(w) / w, mode='valid')

    # Avoid division by zero
    local_mean[local_mean == 0] = 1.0

    return s / local_mean

# --- Main Interactive Function ---
def scale_comparison_lab(N=2048, log_X0=8, span=2.4):

    X0 = int(10**log_X0)

    # --- Data Preparation ---
    # Sieve once, up to the largest x needed by either sampling scheme
    max_x_log = int(np.ceil(X0 * np.exp(span / 2)))
    max_x_linear = X0 + N
    max_x_needed = max(max_x_log, max_x_linear)
    pi_x_full = generate_pi_data(max_x_needed)

    fig, axes = plt.subplots(1, 2, figsize=(16, 6), sharey=True)

    # --- Left Plot: Linear Sampling (Poisson) ---
    print("\n--- Processing Linear Scale ---")
    x_linear = np.arange(X0, X0 + N)
    fx_linear = get_delta_pi_for_points(x_linear, pi_x_full)

    M_linear = generate_cos_matrix(fx_linear, x_linear)
    lam_linear = np.linalg.eigvalsh(M_linear)  # eigenvalues only
    # USE LOCAL NORMALISATION TO REVEAL POISSON
    s_unfolded_linear = local_normalize_spacings(lam_linear)

    # --- Right Plot: Logarithmic Sampling (GOE) ---
    print("\n--- Processing Logarithmic Scale ---")
    x_log = np.exp(np.linspace(np.log(X0) - span / 2, np.log(X0) + span / 2, N))
    fx_log = get_delta_pi_for_points(x_log, pi_x_full)

    M_log = generate_cos_matrix(fx_log, x_log)
    lam_log = np.linalg.eigvalsh(M_log)  # eigenvalues only
    # For GOE, global mean normalisation is sufficient
    s_log = np.diff(np.sort(lam_log))
    s_log = s_log[s_log > 0]
    s_unfolded_log = s_log / s_log.mean()

    # --- Comparative Plots ---
    s_grid = np.linspace(0, 4, 200)
    pdf_goe = (np.pi * s_grid / 2) * np.exp(-np.pi * s_grid**2 / 4)  # Wigner surmise
    pdf_poisson = np.exp(-s_grid)

    # Left plot
    ax = axes[0]
    ax.hist(s_unfolded_linear, bins='auto', density=True, alpha=0.75, label='Data (Linear)')
    ax.plot(s_grid, pdf_goe, 'r--', lw=2, label='GOE Theory')
    ax.plot(s_grid, pdf_poisson, 'g:', lw=3, label='Poisson Theory')
    ax.set_title('a) Linear Scale → Uncorrelated Regime', fontsize=14)
    ax.set_xlabel('s (Locally Normalised Spacing)')
    ax.set_ylabel('Density')
    ax.set_xlim(0, 4)
    ax.legend(loc='upper right')

    # Right plot
    ax = axes[1]
    ax.hist(s_unfolded_log, bins='auto', density=True, alpha=0.75, label='Data (Logarithmic)')
    ax.plot(s_grid, pdf_goe, 'r--', lw=2, label='GOE Theory')
    ax.plot(s_grid, pdf_poisson, 'g:', lw=3, label='Poisson Theory')
    ax.set_title('b) Logarithmic Scale → Correlated Regime', fontsize=14)
    ax.set_xlabel('s (Globally Normalised Spacing)')
    ax.set_xlim(0, 4)
    ax.legend(loc='upper right')

    fig.suptitle(
        f"Visual Comparison of the Effect of Scale at X₀ = {X0:g}",
        fontsize=18, weight='bold'
    )
    fig.tight_layout(rect=[0, 0, 1, 0.96])
    plt.show()

# --- Interactive Widget ---
interact(
    scale_comparison_lab,
    N=widgets.Dropdown(options=[512, 1024, 2048], value=2048, description='N:'),
    log_X0=widgets.IntSlider(
        min=5, max=8, step=1, value=8,
        description='X₀=10^', continuous_update=False
    ),
    span=widgets.FloatSlider(
        min=1.0, max=4.0, step=0.1,
        value=2.4, description='Span (Log):'
    )
);
```



---

## 7. Final observation

The contrast between Poisson and GOE statistics does not indicate the presence of two distinct systems. Rather, it reflects the dependence of spectral statistics on the **compatibility between the operator and the observation metric**.

In the next notebook, the mathematical structure that makes logarithmic sampling not merely convenient, but necessary for the observation of long-range spectral correlations in this operator, will be analysed in greater detail.
