# $\lambda$ and $\Delta H(p)$ calculations
For the models of relative populations, we use $p_0=37$.
The parameter $\lambda$ is the product over primes
$$\lambda(p_k) = \prod_{p=41}^{p_k} \frac{p-3}{p-2} $$
with $\lambda(37)=1$.
We create an array of values of $\lambda$ associated with the primes covered by 
$\mathcal{G}(23^\#)$ for $p_0 = 37$.  The minimum value over this range is
$$ \lambda(p_k=223092827) = 0.19416057 $$
However, the range of these primes is almost entirely covered by the horizon of survival $H(p_k)= [p_{k+1},p_{k+1}^2]$ that ends at $14929^2 = 222875041$, so the smallest value of $\lambda$ for which complete intervals of survival are available is
$$ H(p=14929) \; {\rm with} \; \lambda=0.3880218443549397$$

There are three sections to building up these results:  calculation of values of $\lambda$ across the 
range of primes; importing the curves $w_{g,1}(\lambda)$ from the previous notebook; and enumerating
the gaps across consecutive intervals of survival $\Delta H(p)$.

In [1]:
%reset -f

import pandas as pd
import numpy as np
from numpy.polynomial.polynomial import polyval
import array
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
#from plotnine import ggplot, geom_point, aes, geom_line, theme, ggsave
import gc
import psutil
import sys

import itertools
from ipywidgets import interact
import ipywidgets as widgets
from IPython.display import display
plt.ion()  # call the function; a bare `plt.ion` only echoes the function object

In [2]:
primes23 = np.load('primes23.npy')
G23 = np.load('G23uint.npy')
cG23 = np.load('cG23.npy')

In [3]:
# block to check the available system memory
gc.collect()
memory = psutil.virtual_memory()
available_memory = memory.available
del memory
print(f"Available memory: {available_memory / (1024 ** 2):.2f} MB")

Available memory: 4021.25 MB


In [4]:
lenp23 = len(primes23)
maxp23 = primes23[-1]
print(f"Length primes {lenp23} gaps {len(cG23)} maxp {maxp23}")
# These lengths= primes 12283522 gaps 12283523 maxp 223092827

Length primes 12283522 gaps 12283523 maxp 223092827


In [5]:
lambda23 = np.zeros(lenp23)

In [6]:
# primes23[2:5] are 37, 41, 43
lambda23[0] = 1
ilam = 0
ip = 2 # the offset from lambda23 into primes23 for p0=37
while (ip < (lenp23-1)):
    ilam += 1
    ip += 1
    lambda23[ilam] = lambda23[ilam-1]*(primes23[ip]-3)/(primes23[ip]-2)
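
The running product above can also be written without an explicit loop. The following is a minimal sketch, assuming a sorted array of primes that contains $p_0$; the function name `lambda_table` and the small demonstration array are hypothetical and not part of the notebook. Unlike `lambda23`, whose last two entries keep their initial value of zero after the loop, the returned array has exactly one entry per prime from $p_0$ upward.

```python
import numpy as np

def lambda_table(primes, p0=37):
    """Return lambda(p) for every prime p >= p0 in the sorted array `primes`.

    lambda(p0) = 1 and lambda(p_k) = prod_{p0 < p <= p_k} (p-3)/(p-2).
    Assumes p0 itself appears in `primes`.
    """
    primes = np.asarray(primes, dtype=np.float64)
    start = int(np.searchsorted(primes, p0))   # index of p0 in primes
    tail = primes[start + 1:]                   # primes strictly above p0
    factors = (tail - 3.0) / (tail - 2.0)
    return np.concatenate(([1.0], np.cumprod(factors)))

# Small demonstration array (hypothetical stand-in for primes23)
small_primes = np.array([29, 31, 37, 41, 43, 47])
lam = lambda_table(small_primes)
print(lam)   # entries for p = 37, 41, 43, 47
```

Floating-point rounding in `np.cumprod` may differ from the loop in the last digits, so the two results should be compared with a tolerance rather than for exact equality.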

In [7]:
# to match indexing across arrays, cG23[i]=primes23[i]-primes23[i-1]
cG23[0:10]

array([28,  2,  6,  4,  2,  4,  6,  6,  2,  6], dtype=uint16)

In [8]:
lambda23[-100:-1]

array([0.19416066, 0.19416066, 0.19416065, 0.19416065, 0.19416065,
       0.19416065, 0.19416065, 0.19416065, 0.19416065, 0.19416065,
       0.19416065, 0.19416065, 0.19416065, 0.19416064, 0.19416064,
       0.19416064, 0.19416064, 0.19416064, 0.19416064, 0.19416064,
       0.19416064, 0.19416064, 0.19416064, 0.19416064, 0.19416064,
       0.19416063, 0.19416063, 0.19416063, 0.19416063, 0.19416063,
       0.19416063, 0.19416063, 0.19416063, 0.19416063, 0.19416063,
       0.19416063, 0.19416062, 0.19416062, 0.19416062, 0.19416062,
       0.19416062, 0.19416062, 0.19416062, 0.19416062, 0.19416062,
       0.19416062, 0.19416062, 0.19416062, 0.19416061, 0.19416061,
       0.19416061, 0.19416061, 0.19416061, 0.19416061, 0.19416061,
       0.19416061, 0.19416061, 0.19416061, 0.19416061, 0.1941606 ,
       0.1941606 , 0.1941606 , 0.1941606 , 0.1941606 , 0.1941606 ,
       0.1941606 , 0.1941606 , 0.1941606 , 0.1941606 , 0.1941606 ,
       0.1941606 , 0.19416059, 0.19416059, 0.19416059, 0.19416

In [9]:
maxgap = max(cG23)
maxgap

np.uint16(248)

In [10]:
np.mean(cG23)

np.float64(18.161961352618462)

In [11]:
np.sqrt(maxp23)

np.float64(14936.292277536617)

In [12]:
# bounds on parameters for cG23
i=0
while (primes23[i] < 14936):
    i += 1
i -= 1
print(f"i {i} p23 {primes23[i]} p^2 {primes23[i]**2} maxp23 {primes23[-1]} lambda {lambda23[i-2]:.4f}")

i 1738 p23 14929 p^2 222875041 maxp23 223092827 lambda 0.3880
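
The linear scan used for these bounds can be replaced by a binary search, since the array of primes is sorted. This is a sketch only; the helper name `last_index_below` and the demonstration array are assumptions introduced here for illustration.

```python
import numpy as np

def last_index_below(primes, bound):
    """Index of the largest entry of the sorted array `primes` that is < bound."""
    # side="left" gives the first index whose entry is >= bound,
    # which is where the while-loop scan stops before stepping back by one.
    return int(np.searchsorted(primes, bound, side="left")) - 1

# Small demonstration array (hypothetical stand-in for primes23)
demo_primes = np.array([29, 31, 37, 41, 43, 47, 53])
idx = last_index_below(demo_primes, np.sqrt(2000.0))   # sqrt(2000) is about 44.7
print(idx, demo_primes[idx])
```

With the full array, `last_index_below(primes23, 14936)` would play the role of the loop in the cell above.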


In [13]:
# bounds on parameters for cG29
pbound = np.sqrt(29)*14936
i=0
while (primes23[i] < pbound):
    i += 1
i -= 1
print(f"i {i} p23 {primes23[i]} p^2 {primes23[i]**2} maxp23 {primes23[-1]} lambda {lambda23[i-2]:.4f}")

i 7863 p23 80429 p^2 6468824041 maxp23 223092827 lambda 0.3304


## Estimates from the models $w_g(p^\#)=w_g(\lambda)$
Here we prepare estimates of the expected counts of gaps, based on the models $w_g(p^\#)$.  The graph below will superimpose these estimates on the actual counts over sampled intervals of survival $\Delta H(p_k)$.

The hypothesis underlying our estimates is that the gaps in $\mathcal{G}(p^\#)$ are distributed roughly uniformly across the cycle.

The first-order estimate is that the count of a gap $g$ in an interval of survival, relative to the count of the gap $2$, 
should match that gap's relative population at that point in the sieve:
$$ n_g (\Delta H) / n_2(\Delta H) \; \approx \; w_{g,1}(\lambda)$$

A second-order improvement can be made by noting that, in the horizon of survival $[q,q^2]$,
the gaps in the interval $[q,p^2]$ for $p < q$ were fixed by previous recursions across the cycles, 
and that the gaps toward the left of this interval in particular reflect the distributions 
from much earlier in the evolution.  In the plotting code in this notebook, $p$ and $q$ are the smallest and largest primes of the displayed range, and $\Delta w$ is computed as $w_{g,1}(q^\#) - w_{g,1}(p^\#)$.
$$ n_g(\Delta H(q)) / n_2(\Delta H(q)) \; \approx \; w_{g,1}(q^\#) + \frac{p^2}{q^2-p^2} \Delta w$$
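
The two estimates can be sketched as small functions. This is an illustration under stated assumptions, not the notebook's own implementation: the names `first_order`, `second_order`, and the toy coefficient matrix are hypothetical, the coefficients are stored one polynomial per column in increasing powers of $\lambda$, and $\Delta w$ is taken as $w(q) - w(p)$, following the plotting code in this notebook.

```python
import numpy as np
from numpy.polynomial.polynomial import polyval

def first_order(coeffs, lam, n2):
    """n_g ~ n_2 * w_g(lam), evaluated for every gap column of `coeffs`."""
    return n2 * polyval(lam, coeffs)

def second_order(coeffs, lam_p, lam_q, p, q, n2):
    """n_g ~ n_2 * ( w_g(q) + p^2/(q^2 - p^2) * (w_g(q) - w_g(p)) )."""
    w_q = polyval(lam_q, coeffs)
    w_p = polyval(lam_p, coeffs)
    weight = p**2 / (q**2 - p**2)
    return n2 * (w_q + weight * (w_q - w_p))

# Toy coefficients (hypothetical): w_a(lam) = 1 and w_b(lam) = 2 - lam
toy = np.array([[1.0, 2.0],
                [0.0, -1.0]])
est1 = first_order(toy, 0.5, n2=100)
est2 = second_order(toy, lam_p=0.6, lam_q=0.5, p=3.0, q=5.0, n2=100)
print(est1, est2)
```

Passing the whole coefficient matrix to `polyval` evaluates all columns in one call, which avoids looping over gaps.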


In [14]:
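# The matrix of coefficients lj is created in the '02' notebook.  Here we read the result from file.
# These coefficients include the alternating signs.
# Column i holds the coefficients for gap 2*(i+1), in increasing powers of lambda (the order polyval expects).
ljmat = np.load('lj37.npy')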
ljmat[0:3,0:5]

array([[ 1.        ,  1.        ,  2.        ,  1.        ,  1.33333333],
       [-0.        , -0.        , -0.6513022 , -0.6513022 , -0.9769533 ],
       [ 0.        ,  0.        ,  0.        ,  0.07157346,  0.14314691]])

In [15]:
ljmat.shape

(19, 41)

## Accumulations over intervals $\Delta H(p)$
For comparison with the relative populations $w_{g,1}(p^\#)$, we accumulate the counts of gaps within the intervals of survival $\Delta H(p) = [p^2, q^2]$, where $q$ is the next prime after $p$.

In [19]:
ninterval = 1738 # max prime index for cG23
ngaps = int(maxgap/2)
DelHcounts = np.zeros((ninterval, ngaps), dtype=int)
print(f"array created {ninterval} x {ngaps}")

array created 1738 x 124


In [20]:
# NOTE on indexing:  cG23[i] = primes23[i]-primes23[i-1]
ip = 0
p = primes23[ip]
psqr = p*p
i = 0
while (primes23[i] < psqr):
    i +=1
# We're ready to start the count - i is pointing at the first gap in the first interval
while (ip < ninterval):
    print(f"ip {ip} of {ninterval} p= {primes23[ip]} at i {i}", end='\r')
    psqr = (primes23[ip+1])**2
    while (primes23[i] < psqr):
        j = int(cG23[i]/2 - 1)  # indices are half the value of the gap
        DelHcounts[ip,j] = DelHcounts[ip,j] + 1
        i += 1
    ip += 1
# Counts of DelH completed over the covered range, from p=29 to p=14929


ip 1737 of 1738 p= 14923 at i 12262909
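
The same accumulation can be sketched in vectorized form. This is a hedged alternative, not the notebook's method: the function names `count_gaps_by_interval` and `primes_upto` and the small demonstration data are assumptions. It keeps the indexing convention `gaps[i] = primes[i] - primes[i-1]` and assigns each gap to the interval $[p_k^2, p_{k+1}^2)$ that contains the prime ending the gap.

```python
import numpy as np

def primes_upto(n):
    """Simple sieve of Eratosthenes, used only to build demonstration data."""
    sieve = np.ones(n + 1, dtype=bool)
    sieve[:2] = False
    for k in range(2, int(n**0.5) + 1):
        if sieve[k]:
            sieve[k*k::k] = False
    return np.flatnonzero(sieve)

def count_gaps_by_interval(primes, gaps, ninterval, ngaps):
    """counts[k, j] = number of gaps of size 2*(j+1) whose closing prime
    lies in [primes[k]**2, primes[k+1]**2), for k = 0 .. ninterval-1."""
    edges = primes[:ninterval + 1].astype(np.int64) ** 2
    lo, hi = np.searchsorted(primes, [edges[0], edges[-1]])
    # interval index of each prime between the first and last squared edge
    interval = np.searchsorted(edges, primes[lo:hi], side="right") - 1
    gapdex = gaps[lo:hi].astype(np.int64) // 2 - 1
    counts = np.zeros((ninterval, ngaps), dtype=np.int64)
    np.add.at(counts, (interval, gapdex), 1)   # unbuffered add handles repeats
    return counts

allp = primes_upto(200)
demo_primes = allp[2:]            # 5, 7, 11, 13, ...
demo_gaps = np.diff(allp)[1:]     # demo_gaps[i] = demo_primes[i] - previous prime
counts = count_gaps_by_interval(demo_primes, demo_gaps, ninterval=3, ngaps=7)
print(counts)
```

In the demonstration the three rows cover $[25,49)$, $[49,121)$ and $[121,169)$. For the full data set the call would use `primes23`, `cG23[:lenp23]`, `ninterval` and `ngaps`.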

In [21]:
# visual checks on histogram data
DelHcounts[-11:-1,0:10]

array([[ 412,  432,  753,  327,  400,  504,  314,  243,  420,  233],
       [1278, 1272, 2217, 1011, 1273, 1676,  877,  670, 1302,  680],
       [ 879,  845, 1520,  674,  885, 1082,  605,  425,  862,  408],
       [1647, 1753, 2976, 1327, 1735, 2221, 1255,  906, 1649,  889],
       [ 216,  225,  357,  158,  209,  280,  158,  113,  222,  112],
       [1090, 1055, 1899,  796, 1073, 1358,  757,  582, 1086,  556],
       [ 854,  821, 1577,  661,  840, 1084,  553,  480,  805,  451],
       [ 438,  428,  757,  326,  425,  577,  292,  253,  368,  225],
       [ 636,  644, 1104,  521,  665,  786,  481,  350,  638,  333],
       [2768, 2804, 4911, 2174, 2839, 3509, 2022, 1551, 2707, 1471]])

In [22]:
# visual checks on histogram data: total count of each gap over all intervals
DelHcounts.sum(axis=0)

array([ 895223,  895902, 1575342,  688899,  888391, 1123493,  615227,
        455120,  818700,  435767,  377389,  558733,  260987,  283355,
        494297,  152949,  160426,  257558,  118599,  139438,  200812,
         81339,   69001,  118137,   66068,   49069,   81284,   41034,
         35534,   70642,   21417,   22202,   39803,   16013,   22638,
         22275,   10993,    9399,   18051,    8834,    6265,   12858,
          4509,    4594,    9254,    2981,    2698,    4697,    2438,
          2542,    3210,    1573,    1235,    2185,    1276,     996,
          1481,     608,     578,    1299,     390,     416,     718,
           251,     297,     406,     165,     136,     294,     188,
           115,     172,      67,      86,     150,      67,      57,
            81,      33,      38,      42,      26,      15,      43,
            22,      12,      14,      17,      14,      13,      10,
             9,       5,       1,       1,       5,       2,       3,
             6,     
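
As a rough arithmetic check on these totals, the first few counts can be normalized by the count for gap 2. The numbers in `totals` are copied from the printed sums, and `leading` is copied from the first row of `ljmat` shown earlier. Because the totals pool all intervals, each with its own value of $\lambda$, this is only an illustration of the normalization, not a test of the model.

```python
import numpy as np

# First five totals copied from the printed sums (gaps 2, 4, 6, 8, 10)
totals = np.array([895223, 895902, 1575342, 688899, 888391])
ratios = totals / totals[0]          # n_g / n_2, pooled over all intervals

# Constant terms of the model polynomials, copied from ljmat[0, 0:5]
leading = np.array([1.0, 1.0, 2.0, 1.0, 1.33333333])

for gap, r, c in zip(range(2, 12, 2), ratios, leading):
    print(f"gap {gap:2d}: pooled ratio {r:.4f}, constant term {c:.4f}")
```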

In [23]:
# Develop a master color dictionary - up to g=100
# we set colors by family, as determined by prime factors of the gap
mastercolordict={'2':'#FF0000', '4':'#FF8888', '8':'#EEBBBB', '16':'#FF99CC', '32':'#FFDDDD', '64':'#FFEEEE',
                 '6':'#0000FF', '12':'#00DDFF', '18':'#6666FF', '24':'#BBBBFF', '36':'#DDDDFF', '48':'#0000AF',
                 '54':'#44448F', '72':'#88888F',
                 '10':'#00CC00', '20':'#98FB98', '40':'#DDFFCF', '50':'#BBFFBB', '80':'#44AA66', '100':'#228F3F',
                 '30':'#FFD700', '60':'#FFB000', '90':'#FFCC33',
                 '14':'#DD00EE', '28':'#AA22CC', '56':'#660066', '98':'#662266',
                 '42':'#0088BB', '84':'#4488BB', '70':'#00BB88', 
                 '22':'#884400', '44':'#884444', '66':'#884488', '88':'#AA6644',
                 '26':'#AAAAAA', '52':'#888888', '78':'#6666AA',
                 '34':'#CC0000', '68':'#CC4444', '38':'#AA8800', '76':'#AA9944'}

In [24]:
gapnames = np.arange(2, maxgap+2, 2)

In [25]:
print(f"max {ninterval} {primes23[ninterval]}")

max 1738 14929


In [40]:
# Interactive display of a sample from the intervals of survival DelHcounts
# The master array DelHcounts is nintervals x gaps
lowgapdex = 0
highgapdex = 45  # the gap index is half the gap size:  gap = 2*(gapdex+1)
gaprange = (gapnames[lowgapdex:highgapdex]).astype(str)

def draw_DelH(numpdex, highpdex, drawestimates):   # the input parameters are indices, not the x values themselves
    if (highpdex < numpdex):
        numpdex = highpdex-1
    lowpdex = highpdex-numpdex
    primerange = (primes23[lowpdex:highpdex]).astype(str)
    partialDelHcounts = DelHcounts[lowpdex:highpdex, lowgapdex:highgapdex].copy()

    df = pd.DataFrame(partialDelHcounts, index=primerange, columns=gaprange)
    df = df.transpose()    # preparing to display by size of gap

    # create the color dictionary, using gapsizes in primes23 from lowpdex to highpdex
    i = lowpdex
    colordict = {}
    while (i <= highpdex):
        pstring = str(primes23[i])
        gapstring = str(cG23[i+1])  # cG23[i+1] = primes23[i+1]-primes23[i], the gap following primes23[i]
        if gapstring in mastercolordict:
            colorstring = mastercolordict[gapstring]
        else:
            print(f"Missing color for {gapstring} at {pstring}")
            colorstring = '#080808'
        colordict[pstring] = colorstring
        i += 1

    plt.clf()
    fig, ax = plt.subplots()
    fig.set_size_inches(12,8)
    # add a title to the plot
    middex = int((highpdex+lowpdex)/2)
    lamvalue = lambda23[max(middex-2, 0)]   # lambda23[k] belongs to primes23[k+2]
    ptitle = (primes23[middex])**2
    ax.set_title(rf"Populations of prime gaps in {numpdex} intervals of survival around {ptitle:.3e}, $\lambda=$ {lamvalue:.3f}")

    # create histogram of counts over intervals of survival
    ax2 = df.plot(kind='bar', stacked=True, ylabel='Counts', ax=ax, color=colordict)
    # reversing the order of the legend, to be decreasing
    handles, labels = ax.get_legend_handles_labels()
    ax.legend(handles[::-1], labels[::-1], ncol=int(1+numpdex/28))

    # add polyline showing expected values
    if (drawestimates):
        xvals = np.arange(41)
        estvals = np.zeros(41)
        scalefactor = sum(partialDelHcounts[:,0])
#    print(f"midp {primes23[middex]} lambda {lamvalue} scale {scalefactor}")
        i=0
        while (i < 41):
            estvals[i]  = scalefactor*polyval(lamvalue, ljmat[:,i])
            i += 1
        ax2.plot(xvals,estvals, color='#402040', lw=1, marker="^")
        # second-order estimate; lambda23[k] belongs to primes23[k+2]
        lamhigh = lambda23[max(highpdex-2, 0)]
        lamlow = lambda23[max(lowpdex-2, 0)]
        pratio = (primes23[highpdex]/primes23[lowpdex])**2
        estvalsB = np.zeros(41)
        i=0
        while (i < 41):
            whigh = polyval(lamhigh, ljmat[:,i])
            deltaw = whigh - polyval(lamlow, ljmat[:,i])
            estvalsB[i] = scalefactor * (whigh + deltaw/(pratio - 1))
            i += 1
        ax2.plot(xvals, estvalsB, color='#804080', lw=1.25, marker="^")
    
    plt.show()


xnumpSelect = widgets.IntSlider(min=1, max=100, step=1, value=25,
                  description="Num DelH", layout=widgets.Layout(width='80%'), disabled=False)
xhighpSelect = widgets.IntSlider(min=1, max=1737, step=1, value=1737, description="High_pdex", layout=widgets.Layout(width='80%'), disabled=False)
estimatesSelect = widgets.Checkbox(value=True, description='Estimates') 

interact(draw_DelH, numpdex=xnumpSelect, highpdex=xhighpSelect, drawestimates=estimatesSelect)



[Interactive figure: stacked bar chart of gap counts by size of gap, with sliders "Num DelH" and "High_pdex" and an "Estimates" checkbox]

## Intervals of Survival $\Delta H(p)$
The interactive figure above displays the aggregation of gaps among primes by size of gap, across consecutive intervals of survival $\Delta H(p_k) = [p_k^2, p_{k+1}^2]$.

The aggregations of gaps within a single interval $\Delta H(p_k)$ appear as a colored band within the total displayed aggregation.  We color-code the band for $\Delta H(p_k)$ by the size of the gap $p_{k+1}-p_k$.

Two estimated levels from the models $w_g(\lambda)$ can be displayed:  the first-order and second-order estimates,
both based on the hypothesis that the gaps are distributed roughly uniformly across the cycle.