### Intro to NumPy & SciPy

**NumPy:** A Python library for fast n-dimensional arrays and linear algebra functions; Python machine learning libraries (e.g., scikit-learn) depend on it.

An object in Python has attributes (characteristics) and methods (functions it can perform). Both follow the object's name, here a NumPy array, with similar dot syntax:

Attribute: array.attribute  
Method: array.method()
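
As a minimal sketch of the attribute/method distinction, the cell below reads two attributes and calls two methods on a small array. The variable name `arr` is only illustrative.

```python
import numpy as np

arr = np.array([11, 22, 33])

# Attributes: no parentheses, they just read a stored property of the array.
print(arr.shape)  # (3,)
print(arr.ndim)   # 1

# Methods: parentheses, they run a computation on the array.
print(arr.sum())  # 66
print(arr.max())  # 33
```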

**SciPy:** A Python library built on NumPy, commonly used for sampling from probability distributions and other statistical analyses. SciPy also contains more fully featured versions of NumPy's linear algebra modules, as well as many other numerical algorithms.

**Note:** If you are doing scientific computing with Python, you should probably install both NumPy and SciPy.

In [80]:
# Import the library/package using commonly used abbreviation "np":
import numpy as np

In [81]:
# Use NumPy to create arrays:
array = np.array([11, 22, 33]) # 1-D array with 3 elements
print(array)
print(type(array))

[11 22 33]
<class 'numpy.ndarray'>


**Using a function which is contained within a Python package/library:** 

Syntax --> Python_Package.Package_Function(Function Arguments)
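
To make the syntax concrete, here is a small sketch with two NumPy functions. The names `root` and `points` are illustrative only. Arguments can be passed by position or by keyword.

```python
import numpy as np

# Package.Function(positional argument)
root = np.sqrt(16)
print(root)  # 4.0

# Package.Function(keyword arguments): 5 evenly spaced values from 0 to 1
points = np.linspace(start=0, stop=1, num=5)
print(points)
```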

**Array arithmetic in NumPy:**

An array is a sequence of numbers. NumPy lets us perform arithmetic on arrays element by element:

In [72]:
# Create two arrays:
array_1 = np.array([9, 8, 7])
array_2 = np.array([6, 5, 4])

In [10]:
# Add the two arrays:
array_1 + array_2

array([15, 13, 11])

In [11]:
# Subtract the two arrays:
array_1 - array_2

array([3, 3, 3])

In [12]:
# Multiply the two arrays:
array_1 * array_2

array([54, 40, 28])

In [13]:
# Divide the two arrays:
array_1 / array_2

array([1.5 , 1.6 , 1.75])
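
One way to see why arrays matter here is to compare them with plain Python lists. The sketch below uses the same values as above with illustrative names: `+` concatenates lists but adds arrays element by element, and a single number is applied to every element of an array.

```python
import numpy as np

list_1 = [9, 8, 7]
list_2 = [6, 5, 4]

# With plain lists, + joins the two lists end to end.
joined = list_1 + list_2
print(joined)  # [9, 8, 7, 6, 5, 4]

# With NumPy arrays, + adds matching elements.
array_sum = np.array(list_1) + np.array(list_2)
print(array_sum)  # [15 13 11]

# A scalar is applied to every element of the array.
doubled = np.array(list_1) * 2
print(doubled)  # [18 16 14]
```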

**Generating Random Numbers & Arrays With NumPy:**

The NumPy function np.random.randint samples integers from a discrete uniform distribution (uniform means every integer in the range is equally likely to be selected). We can set the lower limit (inclusive) and upper limit (exclusive) of the distribution, as well as the shape of the array produced.

*Set the random seed to a constant to ensure that the same "random" numbers are returned on every run. Changing that constant changes the output. Leaving out the random.seed call returns different random numbers on each run.*
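
A quick way to check the seeding behaviour is to draw twice with the same seed and compare. This is a minimal sketch with illustrative variable names. The last lines show NumPy's newer `default_rng` generator interface, which is seeded in a similar way but produces different values from `np.random.seed`.

```python
import numpy as np

# Same seed, same draws.
np.random.seed(0)
first = np.random.randint(0, 100, size=5)

np.random.seed(0)
second = np.random.randint(0, 100, size=5)

print(np.array_equal(first, second))  # True

# Alternative: a seeded Generator object instead of the global seed.
rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(0)
same_with_rng = np.array_equal(
    rng_a.integers(0, 100, size=5),
    rng_b.integers(0, 100, size=5),
)
print(same_with_rng)  # True
```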

In [54]:
"""Generate a single number (integer), between 0 and 100 (including 0, not including 100)."""
np.random.seed(0)
a = np.random.randint(0, 100)
print(a)
print(type(a))

44
<class 'int'>


In [58]:
"""Generate a 1-D array with three values. The size argument sets the number of elements (or the shape) of the array."""
np.random.seed(1)
b = np.random.randint(0, 100, size=3)
print(b)
print(type(b))

[37 12 72]
<class 'numpy.ndarray'>


In [59]:
"""Generate a 2-D array with three rows and one column (a column vector)."""
np.random.seed(2)
c = np.random.randint(0, 100, size=(3, 1))
print(c)
print(type(c))

[[40]
 [15]
 [72]]
<class 'numpy.ndarray'>


In [60]:
"""Generate a 2-D array with three rows and two columns."""
np.random.seed(3)
d = np.random.randint(0, 100, size=(3, 2))
print(d)
print(type(d))

[[24  3]
 [56 72]
 [ 0 21]]
<class 'numpy.ndarray'>


In [63]:
"""Generate an n-dimensional array (here 3-D, with shape 3 x 3 x 2)."""
np.random.seed(4)
e = np.random.randint(0, 100, size=(3, 3, 2))
print(e)
print(type(e))

[[[46 55]
  [69  1]
  [87 72]]

 [[50  9]
  [58 94]
  [55 55]]

 [[57 36]
  [50 44]
  [38 52]]]
<class 'numpy.ndarray'>


A matrix (2-D array) is like an Excel worksheet, and a tensor is like an Excel workbook (a collection of sheets). A tensor can have n dimensions. Above we made a 3 x 3 x 2 tensor with 3 sheets/pages, each with 3 rows and 2 columns.
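
The workbook analogy can be sketched with indexing: the first index picks a sheet, the second a row, and the third a column. The example uses `np.arange` rather than random numbers so the values are easy to follow, and the names are illustrative.

```python
import numpy as np

# 18 consecutive integers arranged as 3 sheets x 3 rows x 2 columns.
tensor = np.arange(18).reshape(3, 3, 2)

sheet = tensor[0]       # first sheet: a 3 x 2 matrix
row = tensor[0, 1]      # second row of the first sheet
cell = tensor[0, 1, 0]  # first column of that row

print(sheet.shape)  # (3, 2)
print(row)          # [2 3]
print(cell)         # 2
```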

**Sampling from a normal distribution:**

In [68]:
"""Sample 5 elements from a normal distribution with a mean of 10 and a standard deviation of 1.
The result is a 1-D array of those values."""
mean = 10
std = 1
series_length = 5

np.random.normal(mean, std, series_length)

array([10.02693565, 10.35880669,  8.38545861, 10.82448138, 10.26043704])
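
With only 5 samples the values scatter noticeably around the mean. As a rough sanity check, drawing many samples and summarising them should land close to the requested mean and standard deviation. The sample count and seed below are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only for reproducibility

# Draw a large sample from the same distribution (mean 10, std 1).
samples = np.random.normal(10, 1, 100_000)

# The sample statistics should be close to 10 and 1.
print(samples.mean())
print(samples.std())
```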

**NumPy Attributes:**

In [83]:
# Array "e" from above:
print(e)

[[[46 55]
  [69  1]
  [87 72]]

 [[50  9]
  [58 94]
  [55 55]]

 [[57 36]
  [50 44]
  [38 52]]]


In [84]:
# Get the shape (size along each dimension) of an array:
e.shape

(3, 3, 2)

An array can have many dimensions (NumPy caps the count at a fixed maximum, 32 in NumPy 1.x). As more dimensions are added, arrays nest inside each other to represent the additional dimensions.
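
Besides `shape`, two related attributes are often useful: `ndim` (the number of dimensions) and `size` (the total number of elements). A minimal sketch on an array with the same shape as "e", filled with zeros for simplicity:

```python
import numpy as np

arr = np.zeros((3, 3, 2))

print(arr.shape)  # (3, 3, 2)
print(arr.ndim)   # 3
print(arr.size)   # 18, i.e. 3 * 3 * 2
```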

**NumPy Methods:**

In [74]:
# Array "b" from above:
print(b)

[37 12 72]


In [75]:
# Find the mean (average) of an array (equivalent to the method call b.mean()):
np.mean(b)

40.333333333333336

In [76]:
# Find the median (middle value) of an array:
np.median(b)

37.0
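
These functions also work on multi-dimensional arrays, where the `axis` argument chooses the direction to summarise along. A small sketch, with `grid` as an illustrative 2 x 3 array:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

overall_mean = np.mean(grid)      # mean of all six values: 3.5
col_means = np.mean(grid, axis=0) # one mean per column: 2.5, 3.5, 4.5
row_means = np.mean(grid, axis=1) # one mean per row: 2.0, 5.0
overall_median = np.median(grid)  # median of all six values: 3.5

print(overall_mean, col_means, row_means, overall_median)
```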

**Array stats with SciPy:**

SciPy is best known for its stats module, which provides statistical functions and probability distributions, including the mode.

In [77]:
# Find the mode using stats module from SciPy:

from scipy import stats

sample_array = [1, 1, 1, 2, 2, 3, 4, 4, 5]
stats.mode(sample_array)

ModeResult(mode=array([1]), count=array([3]))

*The mode (most frequently occurring number) is 1, and it appears 3 times. The .mode() function returns the modes along an axis (default 0) together with the count of each mode. The results are wrapped in arrays because .mode() generalizes to n-dimensional arrays, where there is one mode per slice along the chosen axis. Note that recent SciPy releases (1.11 and later) return plain scalars for 1-D input, e.g. ModeResult(mode=1, count=3), unless keepdims=True is passed.*

In [79]:
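# How many times does the mode occur?
# np.atleast_1d keeps the [0] index valid whether SciPy returns an array or a scalar count:
np.atleast_1d(stats.mode(sample_array).count)[0]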

3
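
For comparison, the mode and its count can also be found with NumPy alone. This is a sketch of an alternative approach, not what `stats.mode` does internally: `np.unique` with `return_counts=True` gives each distinct value and how often it occurs, and `np.argmax` picks the most frequent one (the smallest value wins on ties, since `np.unique` returns sorted values).

```python
import numpy as np

sample_array = [1, 1, 1, 2, 2, 3, 4, 4, 5]

# Distinct values (sorted) and how many times each appears.
values, counts = np.unique(sample_array, return_counts=True)

# Index of the largest count gives the mode.
mode_value = values[np.argmax(counts)]
mode_count = counts.max()

print(mode_value)  # 1
print(mode_count)  # 3
```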

**The End!**