In [11]:
import math_pi
import numpy
import pandas

from tabulate import tabulate

from Check_Numbers_Pi.Dependency import main_computation, check_occurrencies

# Introduction

The objective of this study is a simple analysis that looks for links between combinations of digits, how many digits of Pi must be read before each combination is found, and how many times it occurs. The analysis only considers combinations that can be found within the first **million digits of Pi**, because that was the best free source found online that is also available as a Python module (**math-pi**).

In [12]:
pi = math_pi.pi(1, 100)

print(pi)

3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679


The first part introduces the functions developed to check whether a given combination of digits is present in Pi and to count its occurrences. Then, a subset of these combinations is selected for the actual analysis. Finally, the results are presented and, where possible, interpreted.

# The Functions

The first function developed (*main_computation*) takes a string as an input and returns a dictionary with the following keys:
<pre>
{
    "Digits Checked": [int],    # Number of digits of Pi checked before the requested combination is found
    "Pi Until": [str],          # Digits of Pi up to and including the requested combination
    "Not Found": [bool]         # True if the combination was not found, False otherwise
}
</pre>

An example follows.

In [13]:
example = main_computation("208")

print(example)

{'Digits Checked': 76, 'Pi Until': '3141592653589793238462643383279502884197169399375105820974944592307816406286208', 'Not Found': False}
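The implementation of *main_computation* is not shown in this notebook. Purely as an illustration, a function with the same return shape could be sketched with `str.find`. The name `find_combination` and the 100-digit sample are assumptions made for this example, and the convention that the leading 3 counts as a digit is inferred from the output above rather than confirmed by the module.

```python
# Hypothetical stand-in for main_computation, for illustration only.
# The 100 decimal digits of Pi printed earlier, with the decimal point removed.
PI_SAMPLE = "3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679".replace(".", "")


def find_combination(combination: str, digits: str = PI_SAMPLE) -> dict:
    """Return a dictionary shaped like the one described above."""
    index = digits.find(combination)  # -1 when the combination is absent
    if index == -1:
        return {"Digits Checked": len(digits), "Pi Until": digits, "Not Found": True}
    return {
        "Digits Checked": index,                         # digits read before the match starts
        "Pi Until": digits[: index + len(combination)],  # up to and including the match
        "Not Found": False,
    }


sketch_result = find_combination("208")
print(sketch_result["Digits Checked"], sketch_result["Not Found"])  # prints: 76 False
```

Under these assumptions the sketch agrees with the value of 76 reported for "208", which suggests that "Digits Checked" is the zero-based position of the match in the digit string.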


The second function developed (*check_occurrencies*) takes a string and the "*Digits Checked*" value returned by *main_computation*. For efficiency, it is meant to be run only **after** a combination has been shown to be present, and it simply returns the number of occurrences of that combination within the first million digits of Pi.

An example follows.

In [14]:
example = check_occurrencies("208", main_computation("208")["Digits Checked"])

print("The number 208 has been found {Occurrencies} times.".format(Occurrencies = example))

The number 208 has been found 1032 times.
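Whether *check_occurrencies* counts overlapping matches is not stated here. The sketch below, with the hypothetical helper `count_overlapping`, shows why the distinction matters for repeated digits: Python's built-in `str.count` skips overlapping matches, while a `str.find` loop that advances one character at a time does not.

```python
def count_overlapping(digits: str, combination: str, start: int = 0) -> int:
    """Count occurrences of `combination` in `digits` from `start` on, overlaps included."""
    count = 0
    index = digits.find(combination, start)
    while index != -1:
        count += 1
        index = digits.find(combination, index + 1)  # advance one character to allow overlaps
    return count


print(count_overlapping("1111", "11"))  # 3: matches start at positions 0, 1 and 2
print("1111".count("11"))               # 2: str.count only counts non-overlapping matches
```

The `start` argument mirrors the idea of passing the "*Digits Checked*" value so that the search can skip the part of the string already known not to contain the combination.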


# The Combinations

All the combinations of up to 5 digits that can be formed from the digits 0 to 9 are taken into account.

To generate all the possible combinations, the *itertools* module can be used. The idea is to generate the combinations of 1, 2, 3, 4 and 5 digits and then collect them in an array.

An example with 2 digits follows.

**N.B.**: The combinations are generated in numerical order: each list runs from 0 to the highest number that can be written with the given number of digits.

In [15]:
import itertools

digit_values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   # Named digit_values to avoid shadowing the built-in set

# First the combinations are generated
combinations_of_2_digits = list(itertools.product(digit_values, repeat = 2))

print(combinations_of_2_digits)

print ("\n")

print (type(combinations_of_2_digits[0]))       # Showing the type of the first element (but they are all the same)

[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), (4, 7), (4, 8), (4, 9), (5, 0), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (5, 7), (5, 8), (5, 9), (6, 0), (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6), (6, 7), (6, 8), (6, 9), (7, 0), (7, 1), (7, 2), (7, 3), (7, 4), (7, 5), (7, 6), (7, 7), (7, 8), (7, 9), (8, 0), (8, 1), (8, 2), (8, 3), (8, 4), (8, 5), (8, 6), (8, 7), (8, 8), (8, 9), (9, 0), (9, 1), (9, 2), (9, 3), (9, 4), (9, 5), (9, 6), (9, 7), (9, 8), (9, 9)]


<class 'tuple'>


Since the *main_computation* function requires its input to be a string, the generated tuples cannot be used directly: each one must first be converted into a string, while preserving the order of the list.

In [16]:
void_string = ""

i = 0

for combination in combinations_of_2_digits:

    for item in combination:

        void_string = void_string + str(item)

    combinations_of_2_digits[i] = void_string

    i += 1

    void_string = ""

print(combinations_of_2_digits)

print ("\n")

print (type(combinations_of_2_digits[0]))       # Showing the final type of the first element (but they are all the same)

['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99']


<class 'str'>
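As a more compact alternative to the explicit loop, the same list of strings can be obtained by letting `itertools.product` iterate over the characters of the string `"0123456789"` and joining each tuple. This is only a sketch of an equivalent approach; the variable name `two_digit_strings` is chosen for the example.

```python
import itertools

# Iterating over the characters "0".."9" yields tuples of strings, so "".join works directly.
two_digit_strings = ["".join(p) for p in itertools.product("0123456789", repeat=2)]

print(two_digit_strings[:3], two_digit_strings[-1], len(two_digit_strings))
# prints: ['00', '01', '02'] 99 100
```

Because `itertools.product` emits tuples in the order of its input, the resulting list keeps the same numerical ordering.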


Let's now apply the same algorithm to get the lists of all combinations of 1, 2, 3, 4 and 5 digits.

In [17]:
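digit_values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

combinations = []

void_string = ""

# Starting from 0 so that combinations[n] holds the n-digit combinations (combinations[0] is a placeholder)
for digits in range (0, 6, 1):                  # First generating the tuples

    i = 0

    combinations.append([p for p in itertools.product(digit_values, repeat = digits)])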
    
    for combination in combinations[digits]:    # Then converting the elements in each tuple into strings

        for item in combination:

            void_string = void_string + str(item)

        combinations[digits][i] = void_string

        i += 1

        void_string = ""

    combinations[digits] = numpy.array(combinations[digits])

with numpy.printoptions(threshold = 10):
    print(combinations[1])
    print(combinations[2])
    print(combinations[3])
    print(combinations[4])
    print(combinations[5])

print("\n")

%store combinations

['0' '1' '2' '3' '4' '5' '6' '7' '8' '9']
['00' '01' '02' ... '97' '98' '99']
['000' '001' '002' ... '997' '998' '999']
['0000' '0001' '0002' ... '9997' '9998' '9999']
['00000' '00001' '00002' ... '99997' '99998' '99999']


Stored 'combinations' (list)


# Data Check

Before starting the analysis, we should first check our set of combinations.

We will only keep combination lengths for which every combination is found within the first million digits of Pi. In other words, we look for the largest number of digits with no missing combination.

To check that, we can use the "Not Found" key of the dictionary returned by the *main_computation* function.

<span style="color:red">**WARNING**</span>: the computation may take several minutes (around 20 minutes).

In [18]:
combinations_not_found = []

count_combinations_not_found = []

for digits in range(1, 6, 1):

    count_combinations_not_found.append(0)

    print("Checking combinations of {} digits...\n".format(digits))

    for i in range (0, len(combinations[digits]), 1):

        print("Current Iteration: {Iteration} / {Total_Iterations}". format(Iteration = i, Total_Iterations = len(combinations[digits])), end = "\r", flush = True)

        if main_computation(combinations[digits][i])["Not Found"]:

            combinations_not_found.append(str(combinations[digits][i]))

            count_combinations_not_found[(digits - 1)] += 1

combinations_not_found = numpy.array(combinations_not_found)

check_table = []

for digits in range(1, 6, 1):
    
    check_table.append(["Combination of {} digits".format(digits), count_combinations_not_found[digits - 1]])

print(tabulate(check_table, headers = ["Digits", "Not_Found"], tablefmt = "github", numalign = "center", stralign = "center"))

print("\nThe following combinations have not been found within the first million digits of Pi:")

print(combinations_not_found)

print ("\n")

%store combinations_not_found

Checking combinations of 1 digits...

Checking combinations of 2 digits...

Checking combinations of 3 digits...

Checking combinations of 4 digits...

Checking combinations of 5 digits...

|         Digits          |  Not_Found  |
|-------------------------|-------------|
| Combination of 1 digits |      0      |
| Combination of 2 digits |      0      |
| Combination of 3 digits |      0      |
| Combination of 4 digits |      0      |
| Combination of 5 digits |      8      |

The following combinations have not been found within the first million digits of Pi:
['14523' '17125' '22801' '33394' '36173' '39648' '40527' '96710']


Stored 'combinations_not_found' (ndarray)


To sum up, after running the *main_computation* function through all the combinations, it has been found that:

- All the numbers of 1, 2, 3 or 4 digits can be found within the first million digits of Pi.

- The following combinations of 5 digits cannot be found within the first million digits of Pi: 14523, 17125, 22801, 33394, 36173, 39648, 40527, 96710.
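Calling *main_computation* once per combination is what makes this check slow. As a possible alternative, sketched here under the assumption that the digits are available as a plain string, all windows of a given length can be collected into a set in a single pass and compared with the full list of combinations. The helper `missing_combinations` and the 20-digit sample are hypothetical and only serve the example.

```python
import itertools


def missing_combinations(digits: str, length: int) -> list:
    """Return the sorted combinations of `length` digits that never appear in `digits`."""
    # Every window of `length` consecutive characters, collected in a single pass.
    seen = {digits[i : i + length] for i in range(len(digits) - length + 1)}
    every = ("".join(p) for p in itertools.product("0123456789", repeat=length))
    return [c for c in every if c not in seen]


sample = "31415926535897932384"   # first 20 digits of Pi, leading 3 included
print(missing_combinations(sample, 1))  # prints: ['0']
```

On the short sample only the digit 0 is missing. Applied to the full digit string, the same idea would need one pass per combination length instead of one search per combination.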

# The Dataset

Given the limitation that emerged in the previous section, only the combinations of up to **4** digits are taken into consideration.

As a result, the dataset contains a total of 11110 combinations:

- 10 single-digit combinations

- 100 double-digit combinations

- 1000 triple-digit combinations

- 10000 quadruple-digit combinations
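These counts follow from the fact that each position can hold any of the 10 digits, so there are 10^k strings of length k. A quick check of the arithmetic:

```python
# Number of digit strings for each length from 1 to 4, and their total.
counts = {length: 10 ** length for length in range(1, 5)}

print(counts)                # prints: {1: 10, 2: 100, 3: 1000, 4: 10000}
print(sum(counts.values()))  # prints: 11110
```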

For each combination, we are interested in tracking the following pieces of information:

| Attribute | Description | Type | Range |
| --------- | ----------- | ---- | ----- |
| Digits | Number of digits of the combination | int | 1 - 4 |
| Digits_Before | Number of digits of Pi checked before the combination is found | int | 0 - 1000000 |
| Occurrencies | Number of occurrences within the first million digits of Pi | int | 1 - 1000000 |

Therefore, let's create a pandas dataframe to collect such data in a table.

<span style="color:red">**WARNING**</span>: the computation may take several minutes (around 15 minutes).

In [19]:
# Creating the list of records (one dictionary per combination) for the pandas dataframe
data = []

for digits in range(1, 5, 1):

    print("Working on combinations of {} digits...\n".format(digits))

    for i in range (0, len(combinations[digits]), 1):

        print("Current Iteration: {Iteration} / {Total_Iterations}". format(Iteration = i, Total_Iterations = len(combinations[digits])), end = "\r", flush = True)

        temp_data = main_computation(combinations[digits][i])

        data.append (
            {
                "Combination": combinations[digits][i],
                "Digits": digits,
                "Digits_Before": temp_data["Digits Checked"],
                "Occurrencies": check_occurrencies(combinations[digits][i], temp_data["Digits Checked"])
            }
        )

#Creating the dataframe
dataframe = pandas.DataFrame(data, index = pandas.RangeIndex(1, 11111, 1))

print("Dataset has been created successfully!\n")

%store dataframe

Working on combinations of 1 digits...

Working on combinations of 2 digits...

Working on combinations of 3 digits...

Working on combinations of 4 digits...

Dataset has been created successfully!

Stored 'dataframe' (DataFrame)
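The loop above calls *main_computation* and *check_occurrencies* once per combination. If the running time becomes a concern, both values could in principle be gathered in one pass per combination length. The sketch below assumes the digits are available as a plain string and that overlapping matches should be counted; `scan_digits` and the 20-digit sample are hypothetical names used only for illustration.

```python
from collections import Counter


def scan_digits(digits: str, max_length: int):
    """Collect the first index and the count of every substring of up to `max_length` digits."""
    first_index = {}
    occurrences = Counter()
    for length in range(1, max_length + 1):
        for i in range(len(digits) - length + 1):
            window = digits[i : i + length]
            first_index.setdefault(window, i)  # keeps the earliest position only
            occurrences[window] += 1           # overlapping matches are counted
    return first_index, occurrences


sample = "31415926535897932384"   # first 20 digits of Pi, leading 3 included
first_index, occurrences = scan_digits(sample, 2)
print(first_index["3"], first_index["4"], occurrences["3"])  # prints: 0 2 4
```

The two dictionaries could then be turned into the "Digits_Before" and "Occurrencies" columns, provided their conventions match those of the original functions.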


To conclude, an overview of the dataset created is provided.

In [20]:
dataframe

Unnamed: 0,Combination,Digits,Digits_Before,Occurrencies
1,0,1,32,99959
2,1,1,1,99758
3,2,1,6,100026
4,3,1,0,100230
5,4,1,2,100230
...,...,...,...,...
11106,9995,4,18680,112
11107,9996,4,13019,99
11108,9997,4,22309,103
11109,9998,4,765,90
