# Python S5 NumPy

## My Course Notes and Code

These are my notes from the **Udemy** course available at:
https://www.udemy.com/course/python-for-data-science-and-machine-learning-bootcamp/

I'm focusing on section 5 of the course, which covers **NumPy**.

#### S5V17 Intro to NumPy

- **Linear Algebra** library for Python
- Almost the entire *PyData* ecosystem relies on it
- Incredibly **fast**
- We'll mostly use NumPy arrays
    - **Vectors**: strictly 1d arrays
    - **Matrices**: 2d arrays

#### S5V18 NumPy Arrays


- **Creating** NumPy arrays
    - By casting lists OR
    - By using built-in methods
        - `arange`, 
        - `zeros`, 
        - `ones`, 
        - `linspace`, 
        - `eye`, 
        - `random.rand`, 
        - `random.randn`, 
        - `random.randint`, etc.
            - Simplifying the code: `from numpy.random import randint`
- **Reshaping** NumPy arrays
- `max` and `min` methods

##### Creating NumPy arrays

In [6]:
my_list = [1, 2, 3]

In [8]:
import numpy as np

In [23]:
arr = np.array(my_list)
arr

array([1, 2, 3])

If we cast normal Python lists into NumPy arrays, we get **one-dimensional** NumPy arrays.

To get **two-dimensional arrays (matrices)** we need to cast a list of lists into a NumPy array.

In [22]:
my_mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
my_mat

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [24]:
np.array(my_mat)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

This is now a two-dimensional array, with 3 rows and 3 columns.

So, this is how we can cast lists (of lists) into NumPy arrays.
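To double-check the dimensionality after casting, the `ndim` and `shape` attributes can be inspected. A minimal sketch reusing the two lists above (the names `vec` and `mat_arr` are only for illustration):

```python
import numpy as np

my_list = [1, 2, 3]
my_mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

vec = np.array(my_list)      # flat list -> 1d array
mat_arr = np.array(my_mat)   # list of lists -> 2d array

print(vec.ndim, vec.shape)          # 1 (3,)
print(mat_arr.ndim, mat_arr.shape)  # 2 (3, 3)
```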

Usually, however, we will use NumPy's **built-in** array-generation functions. Here are some of the most common ones.

In [25]:
np.arange(0, 10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [26]:
np.arange(0, 11, 2)

array([ 0,  2,  4,  6,  8, 10])

In [28]:
np.zeros(3)

array([0., 0., 0.])

In [31]:
np.zeros((2,7))

array([[0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])

In [32]:
np.ones(5)

array([1., 1., 1., 1., 1.])

In [33]:
np.ones((2,7))

array([[1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.]])

In [40]:
np.linspace(0,5,101)

array([0.  , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 ,
       0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1.  , 1.05,
       1.1 , 1.15, 1.2 , 1.25, 1.3 , 1.35, 1.4 , 1.45, 1.5 , 1.55, 1.6 ,
       1.65, 1.7 , 1.75, 1.8 , 1.85, 1.9 , 1.95, 2.  , 2.05, 2.1 , 2.15,
       2.2 , 2.25, 2.3 , 2.35, 2.4 , 2.45, 2.5 , 2.55, 2.6 , 2.65, 2.7 ,
       2.75, 2.8 , 2.85, 2.9 , 2.95, 3.  , 3.05, 3.1 , 3.15, 3.2 , 3.25,
       3.3 , 3.35, 3.4 , 3.45, 3.5 , 3.55, 3.6 , 3.65, 3.7 , 3.75, 3.8 ,
       3.85, 3.9 , 3.95, 4.  , 4.05, 4.1 , 4.15, 4.2 , 4.25, 4.3 , 4.35,
       4.4 , 4.45, 4.5 , 4.55, 4.6 , 4.65, 4.7 , 4.75, 4.8 , 4.85, 4.9 ,
       4.95, 5.  ])
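The difference between `arange` and `linspace` is easy to mix up: `arange` takes a step size and excludes the stop value, while `linspace` takes the number of points and includes the stop value by default. A small sketch (the variable names are mine):

```python
import numpy as np

by_step = np.arange(0, 5, 0.5)    # step of 0.5, stop value 5 is excluded
by_count = np.linspace(0, 5, 11)  # 11 points, stop value 5 is included

print(len(by_step), by_step[-1])    # 10 4.5
print(len(by_count), by_count[-1])  # 11 5.0
```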

In [43]:
np.eye(4) # identity matrix - 2d square matrix with 1s on the diagonal and 0s elsewhere

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [44]:
np.random.rand(5) # random numbers, uniformly distributed over [0, 1)

array([0.92070185, 0.98066454, 0.1604993 , 0.22008501, 0.83191052])

In [45]:
np.random.rand(5, 5)

array([[0.42200058, 0.10279577, 0.35039785, 0.74462881, 0.09339898],
       [0.11489113, 0.56877565, 0.48988464, 0.61531849, 0.15747865],
       [0.77116048, 0.15180145, 0.25542532, 0.03922249, 0.81118121],
       [0.09400967, 0.15022566, 0.1231709 , 0.10047881, 0.02992746],
       [0.27387483, 0.35058252, 0.26574072, 0.02627654, 0.08040163]])

In [46]:
np.random.randn(5) # standard normal distribution, centered around 0

array([ 0.86690007, -0.66734194, -0.36196124, -0.14956537,  0.28451383])

In [49]:
np.random.randn(2, 4)

array([[ 1.37749909, -1.14889196, -0.96040917, -0.42274181],
       [-0.35477261, -0.32621708, -0.51961164,  0.29039617]])

In [50]:
np.random.randint(1, 100) # lowest num is inclusive, highest is exclusive

87

In [51]:
np.random.randint(1, 100, 3)

array([93, 50, 75])
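A quick sanity check of the inclusive/exclusive bounds, using the shorter import mentioned earlier. This is just a sketch; `samples` is a name I made up, and the values differ on every run:

```python
from numpy.random import randint

samples = randint(1, 100, 1000)   # 1000 integers drawn from 1..99

print(samples.min() >= 1)    # True: the low bound is the smallest possible value
print(samples.max() <= 99)   # True: the high bound (100) is never returned
```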

In [66]:
arr = np.arange(25)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

##### Reshaping arrays

In [67]:
arr.reshape(5, 5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [68]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

In [70]:
arr = arr.reshape(5, 5)
arr.shape

(5, 5)

In [71]:
arr.dtype

dtype('int32')
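Two things worth remembering about `reshape`: it returns a new array object and leaves the original shape alone (which is why `arr` was unchanged in cell 68), and the new shape must hold exactly the same number of elements. A minimal sketch, with `reshaped` as an illustrative name:

```python
import numpy as np

arr = np.arange(25)
reshaped = arr.reshape(5, 5)

print(arr.shape)       # (25,)  the original is unchanged
print(reshaped.shape)  # (5, 5)

try:
    arr.reshape(5, 10)  # 50 slots for 25 elements
except ValueError as err:
    print("reshape failed:", err)
```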

##### `max` and `min` methods

In [54]:
ranarr = np.random.randint(0, 50, 10)
ranarr

array([35, 16, 11, 12, 40, 43, 36,  5, 19, 11])

In [57]:
ranarr.max()

43

In [58]:
ranarr.min()

5

In [59]:
ranarr.argmax()

5

In [60]:
ranarr.argmin()

7
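`argmax` and `argmin` return the *index* of the largest and smallest value, not the value itself. A small sketch that rebuilds the same array by hand (so it does not depend on the random draw) and uses those indices to look the values up again:

```python
import numpy as np

# Same values as the random array above, typed in so the result is repeatable
ranarr = np.array([35, 16, 11, 12, 40, 43, 36, 5, 19, 11])

i_max = ranarr.argmax()
i_min = ranarr.argmin()

print(i_max, ranarr[i_max])  # 5 43
print(i_min, ranarr[i_min])  # 7 5
```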

#### S5V20 NumPy Array Indexing and Selection

##### Indexing 1d arrays

In [82]:
arr = np.arange(0, 11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [74]:
arr[8] # array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

8

In [77]:
arr[2:5] # array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

array([2, 3, 4])

In [78]:
arr[:6] # array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

array([0, 1, 2, 3, 4, 5])

In [79]:
arr[5:] # array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

array([ 5,  6,  7,  8,  9, 10])

In [83]:
arr[0:5] = 100
arr

array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])

In [85]:
arr = np.arange(0, 11)
slice_of_arr = arr[0:6]
slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [86]:
slice_of_arr[:] = 99
slice_of_arr

array([99, 99, 99, 99, 99, 99])

A slice is not a *copy*, but a **view** of the original array. So whatever changes we make to the slice will also affect the original array.

In [87]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

If we want a **copy**, we need to ask for one explicitly with `copy()`.

In [88]:
arr_copy = arr.copy()
arr_copy

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

In [89]:
arr_copy[:] = 100
arr_copy

array([100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100])

In [90]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])
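One way to check whether two arrays share the same data is `np.shares_memory`. A hedged sketch of the view-versus-copy behaviour (the names `view` and `independent` are mine, not from the course):

```python
import numpy as np

arr = np.arange(0, 11)
view = arr[0:6]                 # slicing gives a view
independent = arr[0:6].copy()   # explicit, independent copy

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False

view[:] = 99          # writes through to arr
independent[:] = -1   # stays local to the copy
print(arr)
```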

##### Indexing 2d arrays (matrices)

In [92]:
arr_2d = np.arange(25)    # Jose instead cast a nested list to a 3x3 NumPy array
arr_2d = arr_2d.reshape(5, 5) 
arr_2d           # the same array, quicker: `arr_2d = np.arange(25).reshape(5, 5)`

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [95]:
arr_2d[1][2] # Double-bracket format # row at position 1, column at position 2

7

In [96]:
arr_2d[1] # entire row at position 1

array([5, 6, 7, 8, 9])

In [99]:
arr_2d[1][2:4] # row at position 1, col at position 2 to 4 (not including 4)

array([7, 8])

In [100]:
# Practice: I want to grab element '17'

arr_2d[3][2]

17

In [101]:
arr_2d[3, 2] # Single-bracket comma notation # less prone to error

17

In [102]:
arr_2d[1, 2:4] # is equivalent to: `arr_2d[1][2:4]`

array([7, 8])

Some more practice with grabbing entire "chunks" of the matrix, in other words sub-matrices.

In [103]:
arr_2d[:, 2:4]

array([[ 2,  3],
       [ 7,  8],
       [12, 13],
       [17, 18],
       [22, 23]])

In [105]:
arr_2d[:2] # is the equivalent of `arr_2d[:2, :]`

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [108]:
arr_2d[1:3, 1:4]

array([[ 6,  7,  8],
       [11, 12, 13]])
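One reason I find the comma notation less error-prone: the two notations agree when the first index is an integer, but not when both are slices. In the chained form the second slice is applied to the rows again. A small sketch to illustrate (the variable names are mine):

```python
import numpy as np

arr_2d = np.arange(25).reshape(5, 5)

same = arr_2d[1][2:4]    # integer index first, then a slice of that row
also = arr_2d[1, 2:4]    # identical result

chained = arr_2d[1:3][1:4]   # second slice acts on the rows of the 2-row result
comma = arr_2d[1:3, 1:4]     # rows 1-2, columns 1-3

print(chained.shape)  # (1, 5)
print(comma.shape)    # (2, 3)
```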

##### Conditional selection

In [110]:
arr = np.arange(0,11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [112]:
arr > 4

array([False, False, False, False, False,  True,  True,  True,  True,
        True,  True])

In [111]:
arr[arr > 4]

array([ 5,  6,  7,  8,  9, 10])

In [115]:
arr[arr < 3]

array([0, 1, 2])
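The comparison produces a boolean array (a mask), and indexing with it keeps only the `True` positions. As far as I know, masks can also be combined with `&` and `|`, as long as each condition is wrapped in parentheses. A sketch, with `mask` and `middle` as illustrative names:

```python
import numpy as np

arr = np.arange(0, 11)

mask = arr > 4     # boolean array, same shape as arr
print(arr[mask])   # [ 5  6  7  8  9 10]

# Combine conditions with & (and) or | (or); parentheses are required
middle = arr[(arr > 2) & (arr < 8)]
print(middle)      # [3 4 5 6 7]
```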

#### S5V21 NumPy Operations

- Array with Array
- Array with Scalar
- Universal array Functions

##### Array with Array

In [117]:
arr = np.arange(0,11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [118]:
arr + arr # element-wise addition

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20])

In [119]:
arr - arr # element-wise subtraction

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [120]:
arr * arr # element-wise multiplication

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100])

##### Array with Scalar

In [121]:
arr * 7 # NumPy broadcasts the scalar, so this is also element-wise

array([ 0,  7, 14, 21, 28, 35, 42, 49, 56, 63, 70])

In [123]:
arr + 100 - 2

array([ 98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108])

Be careful when performing **division** on/with arrays that include **zeros**:

In [125]:
arr / arr # No error here, only a RuntimeWarning: 0 / 0 evaluates to nan

array([nan,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

In [126]:
1 / arr # 1 / 0 evaluates to inf, again with a warning rather than an error

array([       inf, 1.        , 0.5       , 0.33333333, 0.25      ,
       0.2       , 0.16666667, 0.14285714, 0.125     , 0.11111111,
       0.1       ])
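If the warnings get noisy, one option (not covered in the course, as far as I remember) is the `np.errstate` context manager, which silences them for a block of code. The `nan`/`inf` values are still produced, so they may need to be handled afterwards. A minimal sketch; `ratio`, `inverse` and `safe` are names I picked:

```python
import numpy as np

arr = np.arange(0, 11)

# Suppress the divide-by-zero and invalid-value warnings inside this block only
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = arr / arr   # 0 / 0 -> nan
    inverse = 1 / arr   # 1 / 0 -> inf

print(np.isnan(ratio[0]), np.isinf(inverse[0]))  # True True

# Replace non-finite entries with 0.0 if later code needs plain numbers
safe = np.where(np.isfinite(inverse), inverse, 0.0)
print(safe[:3])
```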

In [127]:
arr ** 3

array([   0,    1,    8,   27,   64,  125,  216,  343,  512,  729, 1000],
      dtype=int32)

##### Universal Array Functions

In [128]:
np.sqrt(arr)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ,
       3.16227766])

In [129]:
np.exp(arr)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03, 2.20264658e+04])

In [130]:
np.max(arr) # equivalent to `arr.max()`

10

In [131]:
arr.max()

10

In [132]:
np.sin(arr)

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849,
       -0.54402111])

In [134]:
np.cos(arr)

array([ 1.        ,  0.54030231, -0.41614684, -0.9899925 , -0.65364362,
        0.28366219,  0.96017029,  0.75390225, -0.14550003, -0.91113026,
       -0.83907153])

In [135]:
np.log(arr) # log(0) evaluates to -inf, with a warning

array([      -inf, 0.        , 0.69314718, 1.09861229, 1.38629436,
       1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458,
       2.30258509])

The full list of universal functions is in the NumPy documentation: https://numpy.org/doc/stable/reference/ufuncs.html

#### Section Exercises - My Solutions

- Create an array of 10 zeros

In [136]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

- Create an array of 10 ones

In [137]:
np.ones(10)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

- Create an array of 10 fives

In [139]:
np.ones(10) * 5

array([5., 5., 5., 5., 5., 5., 5., 5., 5., 5.])
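An alternative I could have used here is `np.full`, which fills an array of a given shape with a constant. Note that the result has an integer dtype when the fill value is an integer, whereas `np.ones(10) * 5` gives floats. A short sketch:

```python
import numpy as np

fives_float = np.ones(10) * 5   # float64, as in my solution above
fives_int = np.full(10, 5)      # dtype is inferred from the fill value 5

print(fives_float.dtype, fives_int.dtype)
```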

- Create an array of the integers from 10 to 50

In [141]:
np.arange(10, 51)

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
       27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
       44, 45, 46, 47, 48, 49, 50])

- Create an array of all the even integers from 10 to 50

In [142]:
np.arange(10, 51, 2)

array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42,
       44, 46, 48, 50])

- Create a 3x3 matrix with values ranging from 0 to 8

In [145]:
np.arange(0, 9).reshape(3, 3)

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

- Create a 3x3 identity matrix

In [146]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

- Use NumPy to generate a random number between 0 and 1

In [149]:
from numpy.random import rand
rand(1)

array([0.81397083])
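Note that `rand(1)` returns a one-element array. Called without arguments, `rand()` returns a single Python float instead, which may be closer to what the exercise asks for. A sketch (values change on every run):

```python
from numpy.random import rand

as_array = rand(1)   # array with one element
as_float = rand()    # no arguments: a single float

print(as_array.shape)   # (1,)
print(type(as_float))
```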

- Use NumPy to generate an array of 25 random numbers sampled from a standard normal distribution

In [150]:
from numpy.random import randn
randn(25)

array([-0.3605921 ,  0.85406461,  1.21088354,  0.4595093 ,  1.45837323,
       -0.15251046,  1.07186527,  1.02608641, -0.09203672, -1.01130303,
       -1.37184226,  0.13706692, -0.37180869, -0.04822197,  2.45967749,
        0.49301006,  1.5183935 , -0.4812883 ,  0.15099358,  0.25977565,
        0.70022757,  0.38127947,  0.05318161,  0.91377745, -0.91376368])

- Create the following matrix (10x10, values from 0.01 to 1 in steps of 0.01):

In [153]:
np.linspace(0.01, 1, 100).reshape(10, 10)

array([[0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1 ],
       [0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2 ],
       [0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3 ],
       [0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4 ],
       [0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5 ],
       [0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6 ],
       [0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.7 ],
       [0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.8 ],
       [0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.9 ],
       [0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, 1.  ]])
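Another way to get the same matrix is to build the integers 1 to 100 with `arange` and divide by 100. A sketch comparing the two approaches with `np.allclose`, since floating-point results may differ in the last digits:

```python
import numpy as np

from_linspace = np.linspace(0.01, 1, 100).reshape(10, 10)
from_arange = np.arange(1, 101).reshape(10, 10) / 100

print(np.allclose(from_linspace, from_arange))  # True
```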

- Create an array of 20 linearly spaced points between 0 and 1

In [154]:
np.linspace(0, 1, 20)

array([0.        , 0.05263158, 0.10526316, 0.15789474, 0.21052632,
       0.26315789, 0.31578947, 0.36842105, 0.42105263, 0.47368421,
       0.52631579, 0.57894737, 0.63157895, 0.68421053, 0.73684211,
       0.78947368, 0.84210526, 0.89473684, 0.94736842, 1.        ])

Now, we will use the matrix `mat` created below to practice **array indexing and selection**.

In [155]:
mat = np.arange(1,26).reshape(5,5)
mat

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

In [157]:
mat[2:, 1:]

array([[12, 13, 14, 15],
       [17, 18, 19, 20],
       [22, 23, 24, 25]])

In [158]:
mat[3, 4]

20

In [166]:
mat[:3, 1].reshape(3, 1)

array([[ 2],
       [ 7],
       [12]])

In [167]:
mat[4]

array([21, 22, 23, 24, 25])

In [168]:
mat[3:]

array([[16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

- Get the sum of all the values in `mat`

In [169]:
mat.sum()

325

- Get the standard deviation of the values in `mat`

In [173]:
mat.std() # equivalent to `np.std(mat)`

7.211102550927978
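By default `std` computes the *population* standard deviation (it divides by N, i.e. `ddof=0`). A sketch that recomputes it by hand and shows the sample version via `ddof=1`; the name `manual` is mine:

```python
import numpy as np

mat = np.arange(1, 26).reshape(5, 5)

# Population formula: square root of the mean squared deviation from the mean
manual = np.sqrt(((mat - mat.mean()) ** 2).mean())
print(np.isclose(manual, mat.std()))  # True

print(mat.std(ddof=1))  # sample standard deviation (divides by N - 1), slightly larger
```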

- Get the sum of all the columns in `mat`

- Good to know: `numpy.sum(arr, axis, dtype, out)`

In [177]:
mat.sum(0) # axis=0: sum down the rows, giving one total per column

array([55, 60, 65, 70, 75])

In [179]:
mat.sum(1).reshape(5, 1) # axis=1: sum across the columns, giving one total per row, then reshape into a column

array([[ 15],
       [ 40],
       [ 65],
       [ 90],
       [115]])
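The `axis` argument is easy to get backwards: `axis=0` collapses the rows (leaving one value per column), and `axis=1` collapses the columns (leaving one value per row). Passing `keepdims=True` keeps the collapsed axis with length 1, which avoids the manual `reshape`. A sketch with the same `mat`:

```python
import numpy as np

mat = np.arange(1, 26).reshape(5, 5)

col_totals = mat.sum(axis=0)                 # one total per column
row_totals = mat.sum(axis=1, keepdims=True)  # one total per row, shape (5, 1)

print(col_totals)        # [55 60 65 70 75]
print(row_totals.shape)  # (5, 1)
```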