# NumPy Tutorial

https://www.w3schools.com/python/numpy/

## Random Data Distribution

A data distribution is a list of all possible values and their frequency in any given dataset.
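
As a small illustration of "values and their frequency", the sketch below counts how often each value occurs in a tiny dataset. The `data` array is made up for this example and is not part of the tutorial.

```python
import numpy as np

# Hypothetical dataset, chosen only for illustration.
data = np.array([1, 2, 2, 3, 3, 3])

# np.unique with return_counts=True gives each distinct value and its count.
values, counts = np.unique(data, return_counts=True)
relative_frequencies = counts / counts.sum()

for value, count, frequency in zip(values, counts, relative_frequencies):
    print(f'value={value} count={count} relative frequency={frequency:.3f}')
```

The relative frequencies sum to 1, which is the same constraint `choice()` places on its `p` argument.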

A random distribution is a set of randomly-generated numbers that follow a certain probability density function.

A probability density function describes a continuous probability distribution, that is, how likely each value is relative to the others (here, for all values in an array).

Using the `choice()` method from NumPy's `random` module, you can specify the probability of each value in a distribution. To do so, pass the array of values first, then pass `p`, an array of probabilities with one entry per value. Each entry of `p` must be between 0 and 1, and the entries of `p` must sum to 1.
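
The constraint on `p` can be seen in a minimal sketch. The values and probabilities here are invented for illustration. The second call is expected to fail because its probabilities sum to 0.9 rather than 1.

```python
import numpy as np

values = [1, 2, 3]

# Valid: every entry lies in [0, 1] and the entries sum to 1.
valid_p = [0.2, 0.3, 0.5]
sample = np.random.choice(values, p=valid_p, size=5)
print(sample)

# Invalid: the entries sum to 0.9, so NumPy raises ValueError.
invalid_p = [0.2, 0.3, 0.4]
try:
    np.random.choice(values, p=invalid_p, size=5)
    raised = False
except ValueError as exception:
    raised = True
    print('ValueError:', exception)
```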

Recall from 'random-intro.ipynb' that `choice()` can also randomly select a value from a distribution you send it.

In [26]:
import numpy as np
from configurations import printer, logger

array_1_to_10 = np.array([*range(1, 11)])
array_odds_to_10 = array_1_to_10[array_1_to_10 % 2 != 0]

printer(
    'Array 1 to 10 is:\n%s',
    array_1_to_10
    )
printer(
    'Array odds 1 to 10 is:\n%s',
    array_odds_to_10
    )
printer(
    'A value from array 1 to 10 is:\n%s',
    np.random.choice(array_1_to_10)
    )
printer(
    'A value from array odds 1 to 10 is:\n%s',
    np.random.choice(array_odds_to_10)
    )

logger.info(
    'Observe that one can use `choice` to first create a distribution,\n'
    'then again to sample from that distribution, as done below.'
)
printer('Making right skew array of 1000 values from skewing array_1_to_10')
right_skew = np.random.choice(array_1_to_10, p=[0.5, 0.3, 0.1, 0.04, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01], size = 1000)

printer(
    'Getting 10 values from right skew:\n%s',
    np.random.choice(right_skew, size=10)
    )

Array 1 to 10 is:
[ 1  2  3  4  5  6  7  8  9 10]
Array odds 1 to 10 is:
[1 3 5 7 9]
A value from array 1 to 10 is:
7
A value from array odds 1 to 10 is:
9

2023-08-02 07:42:21 
	Logger: numpy-tutorial Module: 3832358490 Function: <module> File: 3832358490.py Line: 24
INFO:
Observe that one can use `choice` to first create a distribution,
then again to sample from that distribution, as done below.

Making right skew array of 1000 values from skewing array_1_to_10
Getting 10 values from right skew:
[3 1 3 1 8 1 1 1 2 2]


As before with `random.choice()`, you can return the samples as a multi-dimensional array by passing a tuple to `size`.

In [1]:
import numpy as np
from configurations import printer

array_1_to_10 = np.array([*range(1, 11)])
right_skew = np.random.choice(array_1_to_10, p=[0.5, 0.3, 0.1, 0.04, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01], size = 1000)

printer(
    'A 3D array with samples pulled from right_skew:\n%s',
    np.random.choice(right_skew, size = (2, 2, 3))
)

A 3D array with samples pulled from right_skew:
[[[1 1 4]
  [2 1 2]]

 [[2 3 1]
  [2 2 3]]]


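If your array contains duplicated values and you assign them different probabilities, the probabilities of the duplicates are summed. In the example below, `1` is drawn with probability 0.1 + 0.8 = 0.9.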

In [11]:
import numpy as np
from configurations import printer

duplicated_entries_array = np.random.choice([1, 2, 1], p = [0.1, 0.1, 0.8], size = 100)
printer(
    '100 samples from a duplicated entries array:\n%s',
    duplicated_entries_array
)

100 samples from a duplicated entries array:
[1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 2
 1 1 2 1 2 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1]
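
The summing of duplicate probabilities can be checked empirically. The sketch below assumes an arbitrary seed and a larger sample size than the cell above, and compares the observed frequencies with the expected 0.9 and 0.1.

```python
import numpy as np

np.random.seed(0)  # seed chosen arbitrarily so the run is repeatable

samples = np.random.choice([1, 2, 1], p=[0.1, 0.1, 0.8], size=100_000)

# Both entries equal to 1 contribute, so P(1) should be near 0.1 + 0.8 = 0.9.
values, counts = np.unique(samples, return_counts=True)
frequencies = dict(zip(values.tolist(), (counts / samples.size).tolist()))
print(frequencies)
```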


You can send array-like objects, such as a tuple or list, to the first argument of `random.choice()`.

Unordered collections, such as a set, do not work.

Furthermore, although dictionaries are ordered, they do not work in `np.random.choice()`.
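
A plausible explanation for both failures is that NumPy does not unpack a set or a dictionary into elements when converting it to an array. The sketch below inspects the result of `np.array()` on each. The variable names are chosen for this example.

```python
import numpy as np

my_set = {1, 2, 3}
my_dictionary = {'a': 1, 'b': 2, 'c': 3}

# NumPy does not iterate over a set or dict when building an array;
# it wraps the whole object in a zero-dimensional object array.
set_as_array = np.array(my_set)
dictionary_as_array = np.array(my_dictionary)
print(set_as_array.ndim, set_as_array.dtype)
print(dictionary_as_array.ndim, dictionary_as_array.dtype)

# Converting to a list first yields the one-dimensional array choice() needs.
list_as_array = np.array(list(my_set))
print(list_as_array.ndim)
```

A zero-dimensional array is neither one-dimensional nor an integer, which is consistent with the error message `a must be 1-dimensional or an integer`.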

However, you can use a list comprehension to extract set items, dictionary keys, or dictionary values into a list, and then pass that list to `random.choice()`.
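
One caveat when unpacking a set: `p` is matched to the values by position, and a set does not guarantee an iteration order. A minimal sketch of one way to keep the pairing stable follows. The `probability_by_item` mapping is hypothetical and exists only to make the pairing explicit.

```python
import numpy as np

my_set = {'low', 'medium', 'high'}
# Hypothetical weights, keyed by item so the pairing is explicit.
probability_by_item = {'low': 0.1, 'medium': 0.2, 'high': 0.7}

# sorted() fixes the order, so values and probabilities line up on every run.
values = sorted(my_set)
probabilities = [probability_by_item[value] for value in values]

sample = np.random.choice(values, p=probabilities, size=10)
print(values, probabilities)
print(sample)
```

Sorting is only one option. Any conversion that produces a known, fixed order would serve the same purpose.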

In [24]:
import numpy as np
from configurations import printer, logger

array_from_tuple = np.random.choice((1, 2, 3), p = [0.1, 0.2, 0.7], size = 100)

printer(
    '100 samples from an array built from a tuple:\n%s',
    array_from_tuple
)

printer('Creating a set')
my_set = {1, 2, 3}
my_unpacked_set = [item for item in my_set]

printer('Trying to select randomly from the set')
try:
    array_from_set = np.random.choice(my_set, p = [0.1, 0.2, 0.7], size = 100)
except ValueError as exception: logger.error('Exception:\n%s:', exception)

printer('Trying to select randomly from the unpacked set')
array_from_unpacked_set = np.random.choice(my_unpacked_set, p = [0.1, 0.2, 0.7], size = 100)
printer(array_from_unpacked_set)

printer('Creating a dictionary')
my_dictionary = {'a': 1, 'b': 2, 'c': 3}
my_unpacked_dictionary_keys = [key for key in my_dictionary]
my_unpacked_dictionary_values = [value for value in my_dictionary.values()]

try:
    array_from_dictionary = np.random.choice(my_dictionary, p = [0.1, 0.2, 0.7], size = 100)
except ValueError as exception: logger.error('Exception:\n%s:', exception)

printer('Trying to select randomly from the unpacked dictionary keys')
array_from_unpacked_dictionary_keys = np.random.choice(my_unpacked_dictionary_keys, p = [0.1, 0.2, 0.7], size = 100)
printer(array_from_unpacked_dictionary_keys)

printer('Trying to select randomly from the unpacked dictionary values')
array_from_unpacked_dictionary_values = np.random.choice(my_unpacked_dictionary_values, p = [0.1, 0.2, 0.7], size = 100)
printer(array_from_unpacked_dictionary_values)

100 samples from an array built from a tuple:
[3 3 3 1 2 2 3 3 3 2 2 3 3 3 3 3 3 3 2 1 3 2 1 3 3 3 3 3 3 2 2 2 3 3 3 1 2
 3 3 3 3 3 3 3 3 3 2 3 3 1 2 3 3 2 2 3 1 2 3 2 3 3 3 3 3 3 3 3 1 3 2 1 2 3
 3 3 2 3 3 3 1 2 1 3 3 3 3 3 3 3 3 3 1 2 3 3 3 3 3 2]
Creating a set
Trying to select randomly from the set

2023-08-02 09:05:54 
	Logger: numpy-tutorial Module: 1298046908 Function: <module> File: 1298046908.py Line: 18
ERROR:
Exception:
a must be 1-dimensional or an integer:

Trying to select randomly from the unpacked set
[3 1 3 2 2 3 3 3 3 3 3 3 1 3 3 3 3 3 3 3 2 1 3 3 2 2 3 3 3 3 2 2 2 3 3 2 2
 2 2 3 2 3 2 3 3 3 3 3 3 3 1 3 3 3 2 3 3 1 1 1 3 3 3 3 3 1 3 3 3 3 3 2 1 3
 3 3 2 3 3 3 3 3 2 3 1 3 3 2 3 1 2 3 2 3 3 2 3 3 3 2]
Creating a dictionary

2023-08-02 09:05:54 
	Logger: numpy-tutorial Module: 1298046908 Function: <module> File: 1298046908.py Line: 31
ERROR:
Exception:
a must be 1-dimensional or an integer:

Trying to select randomly from the unpacked dictionary keys
['b' 'a' 'c' 'c' 'c'