<h1><b>FUNDAMENTALS OF DATA ANALYTICS PROJECT: NUMPY</b></h1>

<h2>WHAT IS NUMPY?</h2>

<p><img src="https://www.freecodecamp.org/news/content/images/2020/07/numpy.png" alt="NumPy logo" style="float:right;width:300px;height:140px;">NumPy is a package for scientific computing in Python. It provides a multidimensional array object, various derived objects such as masked arrays and matrices, and an assortment of routines for fast operations on arrays.
</p>
<p>At the core of the NumPy package is the n-dimensional array object ("ndarray"). These arrays hold homogeneous data, meaning every element has the same data type, and many operations on them are performed in compiled code for performance.</p>
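<p>As a minimal sketch of the two properties described above (a single data type per array, and operations applied to the whole array at once), the variable names below are illustrative only:</p>

```python
import numpy as np

# A one-dimensional array; every element shares one data type (dtype).
values = np.array([1, 2, 3, 4])

# Mixing integers and floats upcasts every element to a single float dtype.
mixed = np.array([1, 2.5, 3])

# Arithmetic is applied element-wise to the whole array, without a Python loop.
doubled = values * 2

print(values.dtype)   # an integer type (the exact width depends on the platform)
print(mixed.dtype)    # float64
print(doubled)        # [2 4 6 8]
```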

<h2>GOALS OF THIS PROJECT</h2>
    <ol>
    <li>Show how NumPy can be used to generate random numbers.</li>
    <li>Explain how NumPy does it, referring to NumPy’s documentation.</li>
    <li>Explain how NumPy can be used to simulate rolling a standard six-sided dice.</li>
</ol>

<h2 align = "center">1. HOW TO GENERATE RANDOM NUMBERS USING NUMPY</h2>

First, we import the package.

In [2]:
import numpy as np

<p>NumPy offers several ways to generate random numbers; the one used here is the module numpy.random. This module implements pseudo-random number generators (the reason for "pseudo" is explained in section 2) with the ability to draw samples from a variety of probability distributions. We will create a "Generator" instance with "default_rng" and call its methods to obtain samples from different distributions.</p>
<p>The code below generates a random float in the half-open interval [0, 1), so 0 is a possible result but 1 is not.</p>

In [5]:
#Create a "Generator" instance and store it in the variable "rng"
rng = np.random.default_rng()

#Generate a single random float in [0, 1) using "rng"
rng.random()

0.6476816770479074

<p>Passing additional arguments produces an array of values instead of a single number.</p>
<p>Below we draw 10 samples from the standard normal distribution (mean 0, standard deviation 1) into an array using "rng".</p>

In [7]:
#Draw 10 samples from the standard normal distribution using "rng"
rng.standard_normal(10) 

array([-0.50525487,  0.29930471,  0.3546029 ,  1.63995742, -1.30728721,
       -0.32639732,  0.33680928, -0.52205783,  0.11205249,  1.8419475 ])
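<p>The "size" argument can also be a tuple, which gives a multidimensional array. The sketch below, with illustrative variable names, also draws a large standard normal sample to check that its mean and standard deviation land near 0 and 1. The exact values differ on every run because no seed is given.</p>

```python
import numpy as np

rng = np.random.default_rng()

# A tuple for "size" produces a multidimensional array: here 2 rows by 3 columns.
grid = rng.random(size=(2, 3))

# A large standard normal sample should have a mean near 0
# and a standard deviation near 1.
sample = rng.standard_normal(100_000)

print(grid.shape)      # (2, 3)
print(sample.mean())   # close to 0
print(sample.std())    # close to 1
```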

You can assign the results of these generators to a variable, as shown below. The same "rng" variable is used here to generate integers from 0 to 9, because the "high" value of 10 is exclusive by default:

In [13]:
#Generate an array of 5 integers from 0 to 9 ("high" is exclusive) using "rng"
rand_int = rng.integers(low=0, high=10, size=5)

rand_int

array([0, 9, 7, 2, 0], dtype=int64)
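<p>The bounds of "integers" are easy to get wrong, so here is a short sketch comparing the default behaviour with the "endpoint" parameter of "Generator.integers", which makes "high" inclusive. The variable names are illustrative.</p>

```python
import numpy as np

rng = np.random.default_rng()

# By default "high" is exclusive: values fall in 0..9.
exclusive = rng.integers(low=0, high=10, size=10_000)

# endpoint=True makes "high" inclusive: values fall in 0..10.
inclusive = rng.integers(low=0, high=10, size=10_000, endpoint=True)

print(exclusive.min(), exclusive.max())
print(inclusive.min(), inclusive.max())
```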

<p>In all instances, the numbers appear random within the parameters given. However, they are not truly random, because a computer program is deterministic. This is why it is called a <i>Pseudo</i> Random Number Generator.</p>
<p>In a truly random sequence, no element has any consistent, rule-based relationship to any other element. A computer program produces each number by following rules, so its output cannot meet that standard.</p>
<p>Computers can only imitate randomness, either by drawing on inputs that are hard for a person to trace, such as the exact millisecond or tick at which an operation is made, or by obscuring the process used to compute each number. Even so, this can be manipulated by human interference.</p>
<p>A fascinating case study on this is how the "randomness" of the Pokemon game series is taken advantage of.</p>
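<p>To illustrate what a rule-based sequence looks like, here is a toy linear congruential generator. This is not the algorithm NumPy uses (its documentation describes "default_rng" as using the PCG64 bit generator); the function name "lcg" and its constants are chosen for illustration only.</p>

```python
def lcg(seed, count, a=1664525, c=1013904223, m=2**32):
    """Toy linear congruential generator, for illustration only."""
    state = seed
    outputs = []
    for _ in range(count):
        # Each new state is computed from the previous one by a fixed rule.
        state = (a * state + c) % m
        # Scale the integer state into a float in [0, 1).
        outputs.append(state / m)
    return outputs

first_run = lcg(seed=42, count=3)
second_run = lcg(seed=42, count=3)

print(first_run)
print(first_run == second_run)  # True: same seed, same sequence
```

<p>The numbers look arbitrary, yet each one follows from the one before it, which is exactly the rule-based relationship described above.</p>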

<h2 align = "center">2. HOW NUMPY GENERATES RANDOM NUMBERS</h2>

<p>NumPy generates random numbers from "seeds". A seed is a number, or a collection of numbers, used to initialise the generator. It can come from a variety of sources, including a system's hardware information, the time and date, or input from a user or program.</p>
<p>In the previous section we created a Generator with "default_rng" and called its methods to obtain samples from different distributions (uniform floats, standard normal values and integers).</p>
<p>NumPy's RNGs produce deterministic sequences that can be reproduced by specifying an integer seed. Since we did not provide a seed to default_rng in the previous section, NumPy seeded the RNG with non-deterministic data from the operating system (Windows, Mac, Linux etc.), so that it can produce a potentially unique output with every execution.</p>
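<p>A small sketch of the unseeded case: two generators created without a seed are each initialised from fresh operating-system data, so their outputs should not match. The variable names are illustrative, and the bit generator name printed at the end may depend on the NumPy version installed.</p>

```python
import numpy as np

# No seed supplied: each generator is seeded from fresh operating-system entropy.
rng_a = np.random.default_rng()
rng_b = np.random.default_rng()

draws_a = rng_a.random(5)
draws_b = rng_b.random(5)

# The two streams are seeded independently, so a match is overwhelmingly unlikely.
print(np.array_equal(draws_a, draws_b))

# The name of the underlying bit generator that default_rng wraps.
print(type(rng_a.bit_generator).__name__)
```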
<p>Below are two pieces of code with two different variables, each given the same seed as input to default_rng.</p>

In [14]:
#Variable rng1 with seed "1234"
rng1 = np.random.default_rng(1234)

rng1.random()


0.9766997666981422

In [15]:
#Variable rng2 with seed "1234"
rng2 = np.random.default_rng(1234)

rng2.random()

0.9766997666981422

In [16]:
rng1.standard_normal(10) 

array([ 0.06409991,  0.7408913 ,  0.15261919,  0.86374389,  2.91309922,
       -1.47882336,  0.94547297, -1.66613546,  0.34374458, -0.51244371])

In [17]:
rng2.standard_normal(10) 

array([ 0.06409991,  0.7408913 ,  0.15261919,  0.86374389,  2.91309922,
       -1.47882336,  0.94547297, -1.66613546,  0.34374458, -0.51244371])

In [18]:
rand_int = rng1.integers(low=0, high=10, size=5)

rand_int

array([2, 6, 8, 8, 6], dtype=int64)

In [19]:
rand_int = rng2.integers(low=0, high=10, size=5)

rand_int

array([2, 6, 8, 8, 6], dtype=int64)

<p>As displayed above, both variables with the same seed produced exactly the same results for each instance of randomisation. This is deterministic sequencing: the sequence of results is determined by the seed.</p>
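<p>The same comparison can be made programmatically rather than by eye. In this sketch, "sample_with_seed" is a hypothetical helper written for illustration, and 4321 is an arbitrary second seed.</p>

```python
import numpy as np

def sample_with_seed(seed, count=5):
    """Hypothetical helper: draw `count` floats from a freshly seeded generator."""
    rng = np.random.default_rng(seed)
    return rng.random(count)

same_a = sample_with_seed(1234)
same_b = sample_with_seed(1234)
other = sample_with_seed(4321)

print(np.array_equal(same_a, same_b))  # True: identical seed, identical stream
print(np.array_equal(same_a, other))   # a different seed gives a different stream

# A generator advances its internal state with every call,
# so repeated calls keep moving along the same sequence.
rng = np.random.default_rng(1234)
first = rng.random()
second = rng.random()
print(first != second)
```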

<h2 align = "center">3. SIMULATING A SIX-SIDED DICE WITH NUMPY</h2>

A fair six-sided dice theoretically has an even chance (16.667%) of landing on each side. At first, someone might expect the following NumPy code to reproduce this behaviour.

In [20]:
rng = np.random.default_rng()

#Note: "high" is exclusive by default, so this can only return 1 to 5
result = rng.integers(low=1, high=6)

result

2

<p>However, the issue with this approach is that "high" is exclusive by default, so "high=6" can only return the values 1 to 5 and a 6 is never rolled. The "integers" method does draw from a discrete uniform distribution, so with "high=7" every side would be equally likely, but the chance of each result is implied by the method rather than stated in the code.</p>
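<p>The effect of the upper bound can be checked by rolling many times and counting each side. This is a sketch with illustrative variable names; the counts vary from run to run because no seed is given.</p>

```python
import numpy as np

rng = np.random.default_rng()

naive_rolls = rng.integers(low=1, high=6, size=60_000)
fair_rolls = rng.integers(low=1, high=7, size=60_000)

# Count how often each side 1..6 appeared (index 0 is unused, so drop it).
naive_counts = np.bincount(naive_rolls, minlength=7)[1:]
fair_counts = np.bincount(fair_rolls, minlength=7)[1:]

print(naive_counts)  # the last entry is 0: a 6 is never rolled
print(fair_counts)   # each entry is close to 10,000
```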
<p>Instead we will use "np.random.choice" and "np.arange". This lets us state two things explicitly: the range of numbers that can be produced and the weighting of each result (the probability of each result relative to the others).</p>

In [31]:
#The probability of landing on each side of a dice
odds = 1 / 6

#Choose one value from 1 to 6, with the probability of each value given by "p"
die_roll = np.random.choice(np.arange(1, 7, 1), p = [odds, odds, odds, odds, odds, odds])

die_roll

6

<p>Above, I have demonstrated a simulation of a dice in which the chance of each value is stated explicitly.</p>
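<p>One way to see what the weighting does is to roll many times and compare the observed frequencies with the probabilities passed in. The sketch below uses the "choice" method of a "Generator" instead of "np.random.choice"; the loaded dice and its probabilities are hypothetical, included only to show that the weights are respected.</p>

```python
import numpy as np

rng = np.random.default_rng()
faces = np.arange(1, 7)

# Fair dice: every side has probability 1/6.
fair_p = np.full(6, 1 / 6)
fair_rolls = rng.choice(faces, size=60_000, p=fair_p)

# Hypothetical loaded dice: a 6 comes up half the time,
# and the other sides share the remaining probability equally.
loaded_p = np.array([0.1, 0.1, 0.1, 0.1, 0.1, 0.5])
loaded_rolls = rng.choice(faces, size=60_000, p=loaded_p)

# Observed frequency of each side, to compare against the chosen probabilities.
fair_freq = np.bincount(fair_rolls, minlength=7)[1:] / fair_rolls.size
loaded_freq = np.bincount(loaded_rolls, minlength=7)[1:] / loaded_rolls.size

print(fair_freq.round(3))    # each value close to 0.167
print(loaded_freq.round(3))  # close to the values in loaded_p
```

<p>With this many rolls, the observed frequencies should sit close to the probabilities given in "p", although they will not match them exactly.</p>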