# NumPy


NumPy (sometimes written Numpy) is a linear algebra library for Python. It matters for data science with Python because almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks.

NumPy is also fast: its core array operations are implemented in compiled C code, so they avoid the overhead of Python-level loops.
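As a rough illustration of why compiled array operations help, the sketch below computes the same sum of squares twice: once with a plain Python loop and once with a single array expression. The names `n`, `values`, `loop_total` and `vector_total` are made up for this example, and no timings are claimed here; you can wrap either version in `%timeit` to measure it on your own machine.

```python
import numpy as np

n = 100_000
values = list(range(n))

# Pure-Python version: one interpreter-level multiplication per element
loop_total = sum(v * v for v in values)

# NumPy version: the element-wise multiply and the sum both run in compiled code
arr = np.arange(n, dtype=np.int64)
vector_total = int((arr * arr).sum())

# Both approaches give the same answer
print(loop_total == vector_total)
```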

In [2]:
import numpy as np

In [3]:
mylist = [1, 2, 3]
np.array(mylist)

array([1, 2, 3])

In [4]:
my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
my_matrix

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [6]:
np.array(my_matrix)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

### Built-In Methods

#### arange

Return evenly spaced values within a given interval. The `stop` value itself is not included.

In [9]:
np.arange(0, 10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [10]:
np.arange(0, 11, 2)

array([ 0,  2,  4,  6,  8, 10])
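A few more `arange` calls may help make the `start`, `stop`, `step` behaviour concrete. This is a small sketch with illustrative variable names (`a`, `b`, `c`), not part of the original notebook.

```python
import numpy as np

# stop is exclusive, so 10 is not part of the result
a = np.arange(0, 10, 5)      # array([0, 5])

# A negative step counts down; stop (0) is still excluded
b = np.arange(5, 0, -1)      # array([5, 4, 3, 2, 1])

# The step can also be a float
c = np.arange(0, 1, 0.25)    # array([0.  , 0.25, 0.5 , 0.75])
```

For non-integer steps, `linspace` is often the safer choice because you state the number of points directly instead of relying on floating-point arithmetic to hit the end of the interval.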

#### zeros and ones

Generate arrays filled with zeros or ones.

In [13]:
np.zeros(3)

array([0., 0., 0.])

In [14]:
np.zeros((5, 5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [15]:
np.ones(3)

array([1., 1., 1.])

In [16]:
np.ones((3, 3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

#### linspace

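Return evenly spaced numbers over a specified interval. The third argument is the number of points to generate (not the step size), and both endpoints are included by default.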

In [17]:
np.linspace(0, 10, 3)

array([ 0.,  5., 10.])

In [18]:
np.linspace(0, 10, 50)

array([ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653,
        1.02040816,  1.2244898 ,  1.42857143,  1.63265306,  1.83673469,
        2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286,
        3.06122449,  3.26530612,  3.46938776,  3.67346939,  3.87755102,
        4.08163265,  4.28571429,  4.48979592,  4.69387755,  4.89795918,
        5.10204082,  5.30612245,  5.51020408,  5.71428571,  5.91836735,
        6.12244898,  6.32653061,  6.53061224,  6.73469388,  6.93877551,
        7.14285714,  7.34693878,  7.55102041,  7.75510204,  7.95918367,
        8.16326531,  8.36734694,  8.57142857,  8.7755102 ,  8.97959184,
        9.18367347,  9.3877551 ,  9.59183673,  9.79591837, 10.        ])
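The difference between `linspace` and `arange` is easy to mix up, so here is a short sketch of the `num`, `retstep` and `endpoint` parameters. The variable names are illustrative only.

```python
import numpy as np

# Five points from 0 to 10, both ends included
pts = np.linspace(0, 10, 5)                 # 0, 2.5, 5, 7.5, 10

# retstep=True also returns the spacing between points
pts_again, step = np.linspace(0, 10, 5, retstep=True)   # step is 2.5

# endpoint=False leaves out the stop value, similar to arange
half_open = np.linspace(0, 10, 5, endpoint=False)       # 0, 2, 4, 6, 8
```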

#### eye

Create an identity matrix: a square array with ones on the main diagonal and zeros everywhere else.

In [19]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
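To see why this is called an identity matrix, the sketch below multiplies one by another matrix and also shows the optional column count and `k` (diagonal offset) arguments of `np.eye`. The names `A`, `rect` and `upper` are illustrative.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)

# Matrix-multiplying by the identity leaves A unchanged
unchanged = np.eye(3) @ A

# eye can also build a non-square array: 2 rows, 3 columns
rect = np.eye(2, 3)       # [[1, 0, 0], [0, 1, 0]]

# k shifts the diagonal; k=1 is the first diagonal above the main one
upper = np.eye(3, k=1)    # [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
```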

### Random

NumPy also has many ways to create arrays of random numbers.

#### rand

Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).

In [21]:
np.random.rand(2)

array([0.20997168, 0.33196366])

In [22]:
np.random.rand(5, 5)

array([[0.60106488, 0.56540903, 0.67138709, 0.17284536, 0.14489534],
       [0.97637843, 0.30193324, 0.79821361, 0.27922391, 0.69900114],
       [0.15901774, 0.4675616 , 0.68886323, 0.75548019, 0.07176816],
       [0.52873007, 0.26086019, 0.22910298, 0.55424013, 0.33482392],
       [0.34714392, 0.94188602, 0.70089994, 0.80485961, 0.94467322]])

#### randn

Return a sample (or samples) from the standard normal distribution (mean 0, variance 1), unlike `rand`, which samples from a uniform distribution.

In [23]:
np.random.randn(2)

array([-0.81995798, -0.29473501])

In [24]:
np.random.randn(5, 5)

array([[ 0.49326225,  0.09288846,  1.7510038 ,  1.08830865, -0.88775447],
       [ 0.43549576,  1.93318356, -0.62773776,  0.35813101, -0.34128472],
       [ 0.79065362,  0.5483512 ,  0.14805377, -0.50661303, -0.18422757],
       [-0.76452181, -0.85761928,  1.65525465, -0.51222389,  0.14686732],
       [ 0.31033908, -0.37031584, -1.56400998,  0.40905384,  0.27334181]])

#### randint

Return random integers from `low` (inclusive) to `high` (exclusive).

In [25]:
np.random.randint(1, 100)

34

In [26]:
np.random.randint(1, 100, 10)

array([85, 37, 30, 19, 14, 54, 87,  5, 34,  9])
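The outputs above will differ every time the cells are run. If you need repeatable results, one option is NumPy's `Generator` interface created with `np.random.default_rng`, which is a different API from the `np.random.rand` style functions used in this section. The sketch below uses an arbitrary seed of 42 and illustrative variable names.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)

uniform = rng.random((2, 3))            # uniform over [0, 1), like rand
normal = rng.standard_normal(4)         # standard normal, like randn
ints = rng.integers(1, 100, size=10)    # low inclusive, high exclusive, like randint

# A second generator with the same seed reproduces the first draw
rng_again = np.random.default_rng(seed=42)
same_uniform = rng_again.random((2, 3))
print(np.array_equal(uniform, same_uniform))
```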

### Array Attributes and Methods

In [29]:
arr = np.arange(25)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

In [31]:
ranarr = np.random.randint(0, 50, 10)
ranarr

array([17, 11, 48,  7, 27, 13, 43, 10, 33, 14])

#### Reshape

Returns an array containing the same data with a new shape. The new shape must hold exactly the same number of elements as the original.

In [32]:
arr.reshape(5, 5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
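Two details of `reshape` are worth a quick sketch: one dimension can be given as `-1` so NumPy infers it, and a shape with the wrong number of elements raises a `ValueError`. The names `auto` and `message` are illustrative.

```python
import numpy as np

arr = np.arange(25)

# -1 asks NumPy to work out the missing dimension: 25 / 5 = 5
auto = arr.reshape(5, -1)
print(auto.shape)        # (5, 5)

# reshape does not modify the original array
print(arr.shape)         # (25,)

# 4 * 6 = 24 elements cannot hold 25 values
try:
    arr.reshape(4, 6)
except ValueError as err:
    message = str(err)
    print("reshape failed:", message)
```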

#### max, min, argmax, argmin

These are useful methods for finding the maximum or minimum values, or their index locations.

In [34]:
ranarr

array([17, 11, 48,  7, 27, 13, 43, 10, 33, 14])

In [35]:
ranarr.max()

48

In [36]:
ranarr.argmax()

2

In [37]:
ranarr.min()

7

In [38]:
ranarr.argmin()

3
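On a 2D array the same methods accept an `axis` argument, and `argmax` without an axis returns an index into the flattened array. The following sketch uses a small made-up array named `grid`; `np.unravel_index` converts the flat index back to a (row, column) pair.

```python
import numpy as np

grid = np.array([[3, 9, 1],
                 [7, 2, 8]])

overall = grid.max()           # 9, largest value anywhere
col_max = grid.max(axis=0)     # [7, 9, 8], largest value in each column
row_max = grid.max(axis=1)     # [9, 8], largest value in each row

flat_idx = grid.argmax()       # 1, position in the flattened array
position = np.unravel_index(flat_idx, grid.shape)   # (0, 1) as (row, column)
```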

#### Shape

`shape` is an attribute (not a method) that reports the size of the array along each dimension.

In [39]:
# Vector

arr.shape

(25,)

In [41]:
# Notice the two sets of brackets

arr.reshape(1, 25)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [43]:
arr.reshape(1, 25).shape

(1, 25)

In [45]:
arr.reshape(25, 1)

array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12],
       [13],
       [14],
       [15],
       [16],
       [17],
       [18],
       [19],
       [20],
       [21],
       [22],
       [23],
       [24]])

In [46]:
arr.reshape(25, 1).shape

(25, 1)

#### dtype

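This attribute reports the data type of the elements in the array. The default integer type depends on the platform and NumPy version, so you may see `int64` instead of the `int32` shown here.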

In [47]:
arr.dtype

dtype('int32')
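If the element type matters, it can be set explicitly rather than left to the platform default. This is a minimal sketch with illustrative names, showing the `dtype` argument and the `astype` method.

```python
import numpy as np

# Request a specific integer width when creating the array
ints = np.array([1, 2, 3], dtype=np.int64)

# astype returns a converted copy; the original keeps its dtype
floats = ints.astype(np.float64)

# Mixing ints and floats upcasts everything to float
mixed = np.array([1, 2.5])

# Converting floats to ints drops the fractional part
truncated = np.array([1.7, 2.2]).astype(np.int64)    # array([1, 2])
```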

## NumPy Indexing and Selection

In [1]:
import numpy as np

# Create a simple array

arr = np.arange(0,11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

### Indexing and Selection

NumPy array indexing is similar to indexing a Python list.

In [2]:
# Get a value at an index 
arr[8]

8

In [3]:
# Get values in a range
arr[1:5]

array([1, 2, 3, 4])

### Broadcasting

NumPy arrays can broadcast a single value across a slice on assignment, which Python lists can't do.

In [4]:
# Setting a value with index range (Broadcasting)

arr[0:5] = 100

arr

array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])
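Broadcasting is not limited to assigning a scalar. As a hedged sketch of the general idea, arrays of compatible shapes are stretched to match each other in arithmetic as well. The arrays `matrix`, `row` and `col` below are made up for the illustration.

```python
import numpy as np

matrix = np.zeros((3, 4))
row = np.array([1, 2, 3, 4])          # shape (4,)
col = np.array([[10], [20], [30]])    # shape (3, 1)

# The row is added to every row of the matrix
plus_row = matrix + row

# The column is added to every column of the matrix
plus_col = matrix + col

# A (3, 1) array and a (4,) array broadcast to a (3, 4) result
outer = row + col
print(outer.shape)    # (3, 4)
```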

In [5]:
# Reset array

arr = np.arange(0, 11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [6]:
# Important notes on Slices

slice_of_arr = arr[0:6]

slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [7]:
# Change Slice
slice_of_arr[:] = 99

slice_of_arr

array([99, 99, 99, 99, 99, 99])

In [8]:
# Note that the changes also occur in the original array!

arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

So when slicing, the actual data doesn't get copied. A slice is just a view of the original array, which avoids memory problems with large arrays.

In [9]:
# To get a copy, you need to be explicit about it.

arr_copy = arr.copy()
arr_copy

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

In [10]:
arr_copy[:] = 100

arr_copy

array([100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100])

In [11]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])
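One way to check whether two arrays share the same underlying data is `np.shares_memory`. The sketch below repeats the view and copy experiment with illustrative names (`view`, `independent`).

```python
import numpy as np

arr = np.arange(0, 11)

view = arr[0:6]                  # a slice: no data is copied
independent = arr[0:6].copy()    # an explicit copy

print(np.shares_memory(arr, view))           # True
print(np.shares_memory(arr, independent))    # False

# Writing through the view changes arr; writing to the copy does not
view[0] = -1
independent[1] = -2
print(arr[0], arr[1])    # -1 1
```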

### Indexing a 2D array (matrices)

The general format is `arr_2d[row][col]` or `arr_2d[row, col]`. Comma notation is recommended.

In [14]:
arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [16]:
# Indexing row
arr_2d[1]

array([20, 25, 30])

In [17]:
# Getting individual element value

arr_2d[1][0]

20

In [18]:
# Another way

arr_2d[1, 0]

20

In [19]:
# 2D array slicing

# Slice the (2, 2) block from the top-right corner
arr_2d[:2, 1:]

array([[10, 15],
       [25, 30]])

In [20]:
# Get bottom row
arr_2d[2]

array([35, 40, 45])

In [21]:
# Another way to do the same

arr_2d[2, :]

array([35, 40, 45])
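The same comma notation selects columns, which the row examples above don't show. A brief sketch, rebuilding `arr_2d` so that it runs on its own:

```python
import numpy as np

arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# All rows, column 1: the result is a 1D array
middle_col = arr_2d[:, 1]        # array([10, 25, 40])

# Slicing the column axis instead keeps the result 2D
middle_col_2d = arr_2d[:, 1:2]   # shape (3, 1)

# Bottom-left (2, 2) block
corner = arr_2d[1:, :2]          # [[20, 25], [35, 40]]
```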

### Fancy Indexing

Fancy indexing lets you select entire rows or columns in any order.

In [22]:
# Set up matrix

arr2d = np.zeros((10, 10))

In [24]:
# Number of rows

arr_length = arr2d.shape[0]

In [25]:
# Set up array

for i in range(arr_length):
    arr2d[i] = i

arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])

In [26]:
# Fancy indexing allows fetching rows in any order

arr2d[[6, 4, 2, 7]]

array([[6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.]])
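Columns can be picked the same way by putting the index list in the second position. The sketch below uses a separate small array named `grid` (not the `arr2d` above) and also shows that, unlike a slice, fancy indexing returns a copy.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0,  1,  2,  3],
#  [ 4,  5,  6,  7],
#  [ 8,  9, 10, 11]]

# Columns 3 and 0, in that order
cols = grid[:, [3, 0]]            # [[3, 0], [7, 4], [11, 8]]

# Two index lists pick individual elements: (0, 1) and (2, 3)
picked = grid[[0, 2], [1, 3]]     # array([ 1, 11])

# Changing the result does not touch the original array
cols[0, 0] = -1
print(grid[0, 3])                 # 3
```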

### Array Selection

In [27]:
arr = np.arange(1, 11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [30]:
bool_arr = arr > 4
bool_arr

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [31]:
arr[bool_arr]

array([ 5,  6,  7,  8,  9, 10])

In [32]:
# So we can directly do the following

x = 2

arr[arr > x]

array([ 3,  4,  5,  6,  7,  8,  9, 10])
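Conditions can be combined with `&` (and) and `|` (or). Each comparison needs its own parentheses because these operators bind more tightly than `>` and `<`. A short sketch with illustrative names, including `np.where` for replacing values based on a condition:

```python
import numpy as np

arr = np.arange(1, 11)

# Values strictly between 3 and 8
between = arr[(arr > 3) & (arr < 8)]       # array([4, 5, 6, 7])

# Values below 3 or above 8
outside = arr[(arr < 3) | (arr > 8)]       # array([ 1,  2,  9, 10])

# Replace even values with 0 and keep the odd ones
odds_only = np.where(arr % 2 == 0, 0, arr) # array([1, 0, 3, 0, 5, 0, 7, 0, 9, 0])
```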

## NumPy Operations

### Arithmetic

In [33]:
arr = np.arange(0, 10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [34]:
arr + arr

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [35]:
arr - arr

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [36]:
# 0/0 produces a warning, but not an error!
# The result is just replaced with nan
arr / arr

array([nan,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

In [37]:
# Again a warning, but no error: 1/0 is just replaced with infinity
1/arr

array([       inf, 1.        , 0.5       , 0.33333333, 0.25      ,
       0.2       , 0.16666667, 0.14285714, 0.125     , 0.11111111])
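If the warnings are expected, they can be silenced locally with the `np.errstate` context manager, or the division by zero can be skipped altogether with the `where` and `out` arguments of `np.divide`. This is one possible approach, sketched with illustrative names; choosing 0 as the fill value is an assumption for the example.

```python
import numpy as np

arr = np.arange(0, 10)

# Suppress the divide-by-zero and invalid-value warnings inside this block only
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = arr / arr      # first element is nan
    inverse = 1 / arr      # first element is inf

# Divide only where arr is non-zero; other positions keep the value from `out`
safe = np.divide(1.0, arr, out=np.zeros(arr.shape), where=arr != 0)
print(safe[0])             # 0.0
```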

In [38]:
# Power

arr ** 3

array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729], dtype=int32)

### Universal Array functions

These are mathematical functions you can apply across a whole NumPy array, mostly element by element.

In [39]:
# Taking Square Roots

np.sqrt(arr)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ])

In [40]:
# Calculating the exponential (e^x)
np.exp(arr)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03])

In [41]:
np.max(arr) # same as arr.max()

9

In [42]:
np.sin(arr)

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])

In [43]:
np.log(arr)

array([      -inf, 0.        , 0.69314718, 1.09861229, 1.38629436,
       1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458])
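Universal functions also come in two-argument forms and keep the shape of their input. A small sketch with made-up arrays `a` and `b`:

```python
import numpy as np

a = np.array([1.0, 4.0, 9.0])
b = np.array([2.0, 3.0, 10.0])

roots = np.sqrt(a)            # array([1., 2., 3.])

# Binary universal functions work pairwise on two arrays
bigger = np.maximum(a, b)     # array([ 2.,  4., 10.])
total = np.add(a, b)          # same as a + b: array([ 3.,  7., 19.])

# The output has the same shape as the input
column_roots = np.sqrt(a.reshape(3, 1))
print(column_roots.shape)     # (3, 1)
```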