### <font color="brown">NumPy - Numerical Python</font>
https://numpy.org/

#### A key feature of NumPy is the n-dimensional array object, or ndarray, which lets you perform mathematical operations on entire arrays with the same syntax you would use for scalars

---

In [16]:
import numpy as np

In [18]:
data1 = [3, 2.8, 19, 5, 17.6, 5.1]
arr1 = np.array(data1)
arr1

array([ 3. ,  2.8, 19. ,  5. , 17.6,  5.1])

In [19]:
# multiplying a Python list by an integer repeats it (just like a string)
data1*2

[3, 2.8, 19, 5, 17.6, 5.1, 3, 2.8, 19, 5, 17.6, 5.1]

In [20]:
# multiplying a numpy array multiplies all items individually 
arr1*2

array([ 6. ,  5.6, 38. , 10. , 35.2, 10.2])

In [21]:
# adding a scalar to a Python list?
data1 + 2

TypeError: can only concatenate list (not "int") to list

In [22]:
# adding a scalar to a numpy array
arr1 + 2

array([ 5. ,  4.8, 21. ,  7. , 19.6,  7.1])

In [24]:
# adding two lists in Python appends all items of second to first
data1 + data1

[3, 2.8, 19, 5, 17.6, 5.1, 3, 2.8, 19, 5, 17.6, 5.1]

In [25]:
# adding two numpy arrays does element-wise addition
arr1 + arr1

array([ 6. ,  5.6, 38. , 10. , 35.2, 10.2])
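To get element-wise behaviour from a plain Python list you have to loop over it yourself. The sketch below reuses the `data1` values from above and compares a list comprehension with the array expression; the names `doubled_list` and `doubled_arr` are only illustrative.

```python
import numpy as np

data1 = [3, 2.8, 19, 5, 17.6, 5.1]

# Plain Python: an explicit loop (here a list comprehension) is required
doubled_list = [x * 2 for x in data1]

# NumPy: the same operation written as a single array expression
doubled_arr = np.array(data1) * 2

print(doubled_list)
print(doubled_arr)
```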

**All items in an ndarray MUST BE OF THE SAME TYPE (unlike a Python list)**

In [28]:
num1 = np.array([1,2,3,4,5])
str1 = np.array(['cs112','cs210','cs211'])
bool1 = np.array([True,True,False,True])
print(num1)
print(str1)
print(bool1)

[1 2 3 4 5]
['cs112' 'cs210' 'cs211']
[ True  True False  True]


In [29]:
# try mixing ints with strings
mixedarr1 = np.array([1,'1'])
print(mixedarr1)

['1' '1']


**The example above shows that numbers mixed with strings are coerced into strings**

In [32]:
mixedarr3 = np.array([True,1,2.5,'str'])
print(mixedarr3)

['True' '1' '2.5' 'str']


In [31]:
# when int is mixed with float, all items are converted to float
mixedarr2 = np.array([1,2.5])
print(mixedarr2)

[1.  2.5]
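The coercion results above can be checked through the `dtype` attribute. This is a minimal sketch; the variable names are illustrative, and `dtype=object` is shown only as one way to keep the original Python objects when coercion is not wanted.

```python
import numpy as np

int_float = np.array([1, 2.5])    # int mixed with float -> float
bool_int = np.array([True, 2])    # bool mixed with int -> int
mixed = np.array([1, '1'])        # number mixed with string -> string

print(int_float.dtype)            # float64
print(bool_int.dtype.kind)        # 'i' (an integer type; exact width is platform dependent)
print(mixed.dtype.kind)           # 'U' (Unicode string)

# With dtype=object, each item stays the Python object it was
as_objects = np.array([1, '1'], dtype=object)
print(type(as_objects[0]), type(as_objects[1]))
```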


---

#### <font color="brown">Every ndarray has a type</font>

In [34]:
data1 = [3, 2.8, 19, 5, 17.6, 5.1]
arr1 = np.array(data1)
print(arr1.dtype)

num1 = np.array([1,2,3,4,5])
print(num1.dtype)

str1 = np.array(['cs112','cs210','cs211'])
print(str1.dtype)

bool1 = np.array([True,True,False,True])
print(bool1.dtype)

float64
int64
<U5
bool


##### U5 above means Unicode, at most 5 characters per item. The leading < indicates little-endian byte order. NumPy stores each character in 4 bytes, so each item takes 20 bytes

In [35]:
str2 = np.array(['one','three','five','eleven'])
print(str2.dtype)

<U6
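Because the string width is fixed when the array is created, it is worth checking how much room each item has. The sketch below reuses `str2` and also shows that assigning a longer string into an existing array silently truncates it.

```python
import numpy as np

str2 = np.array(['one', 'three', 'five', 'eleven'])
print(str2.dtype)      # <U6 on little-endian machines
print(str2.itemsize)   # 24 bytes per item: 6 characters x 4 bytes

# The width is fixed at 6, so a longer string is cut off on assignment
str2[0] = 'seventeen'
print(str2[0])         # sevent
```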


---

#### <font color="brown">Every ndarray has a shape</font>

In [36]:
arr1.shape

(6,)

In [37]:
arr2d = np.array([[1,2,3],[4,5,6]])  # input is nested list
print(arr2d)
print(arr2d.dtype)
print(arr2d.shape)   # 2 rows, 3 columns

[[1 2 3]
 [4 5 6]]
int64
(2, 3)


In [38]:
print(arr2d.ndim)  # ndim gives the number of dimensions (axes), not the number of rows

2


In [40]:
r,c = arr2d.shape
print(f'rows={r}, columns={c}')

rows=2, columns=3
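The difference between `ndim`, `shape`, and `size` is easier to see when the number of rows and the number of dimensions differ. A small sketch; `arr3d_demo` and `info` are illustrative names.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6]])
arr3d_demo = np.zeros((4, 2, 3))

# (number of axes, length along each axis, total number of items)
info = [(a.ndim, a.shape, a.size) for a in (arr2d, arr3d_demo)]
print(info)

# len() gives the length of the first axis only
print(len(arr2d), len(arr3d_demo))
```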


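**Nested lists of different lengths do not form a regular 2D array. Older NumPy versions (as in the output below) give an unusual object, a 1-D array of two lists; NumPy 1.24 and later raise a ValueError unless dtype=object is passed**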

In [42]:
np.array([[1,2,3],[4,5,6,7]])  

array([list([1, 2, 3]), list([4, 5, 6, 7])], dtype=object)
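If an array of lists really is what you want, asking for it explicitly with `dtype=object` should behave the same way across NumPy versions. A minimal sketch with illustrative names:

```python
import numpy as np

# Explicitly request an object array for ragged input
ragged = np.array([[1, 2, 3], [4, 5, 6, 7]], dtype=object)

print(ragged.shape)   # (2,): a 1-D array holding two Python lists
print(ragged.dtype)   # object

lengths = [len(row) for row in ragged]
print(lengths)        # [3, 4]
```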

---

#### <font color="brown">Creating boilerplate ndarrays using special NumPy functions</font>

**array initialized to zeros**

In [46]:
# array initialized to zeros
zr = np.zeros(10)
zr

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [47]:
zr2d = np.zeros((5,3))  # 5 x 3 array filled with zeros
zr2d

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [186]:
zr2d = np.zeros(5,3)   # won't work: the shape must be one argument (an int for 1-d, otherwise a tuple); the 3 is read as the dtype
zr2d 

TypeError: data type not understood
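The error arises because the second positional argument of `np.zeros` is the dtype, not a second dimension. The sketch below shows the working forms and catches the failing call; the exact error message varies between NumPy versions, so only the exception type is printed.

```python
import numpy as np

ok_2d = np.zeros((5, 3))     # shape passed as one tuple
ok_int = np.zeros(5, int)    # second positional argument is the dtype

raised = False
try:
    np.zeros(5, 3)           # 3 cannot be interpreted as a dtype
except TypeError as exc:
    raised = True
    print(type(exc).__name__)

print(ok_2d.shape, ok_int.dtype.kind)
```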

**array initialized to ones**

In [49]:
ones2d = np.ones((3,4))
ones2d

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [52]:
# use dtype argument to set type to int instead of default float
ones2d = np.ones((3,4),dtype=int)
ones2d

array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]])

**array initialized to empty (no particular value)**

In [53]:
np.empty((2,3,2))  # 3D

array([[[-1.72723371e-077, -1.72723371e-077],
        [ 6.91691904e-323,  0.00000000e+000],
        [ 0.00000000e+000,  0.00000000e+000]],

       [[ 0.00000000e+000,  0.00000000e+000],
        [ 0.00000000e+000,  0.00000000e+000],
        [ 0.00000000e+000,  0.00000000e+000]]])

*It is not safe to assume that np.empty() will give you ones, zeros, or anything specific; the contents are whatever happened to be in that memory*

**Identity matrix (square matrix with 1s on main diagonal)**

In [54]:
np.eye(3)  # single parameter because matrix is square

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

**Using the arange function, the NumPy array equivalent of Python's range function**

In [47]:
np.arange(5)

array([0, 1, 2, 3, 4])

In [55]:
np.arange(-3,3,2)

array([-3, -1,  1])

In [56]:
np.arange(5,-2,-1)

array([ 5,  4,  3,  2,  1,  0, -1])
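As a quick check of the correspondence with `range`, the sketch below compares the two directly. Unlike `range`, `arange` also accepts a non-integer step; 0.25 is chosen here because it is exactly representable in binary floating point.

```python
import numpy as np

# Same start/stop/step semantics as range (stop is excluded)
ints = np.arange(5, -2, -1)
print(ints.tolist() == list(range(5, -2, -1)))   # True

# arange also accepts float arguments, which range does not
quarters = np.arange(0, 1, 0.25)
print(quarters)
```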

**Reshaping an ndarray**

In [57]:
np.arange(15).reshape(3,5)   # reshape can be used on any ndarray

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [59]:
arr2d = np.array([[1,2,3],[4,5,6]])

In [60]:
arr2d.reshape(6)

array([1, 2, 3, 4, 5, 6])
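Two details of `reshape` are worth knowing: one dimension may be given as -1 so that NumPy infers it, and the new shape must hold exactly the same number of items. A small sketch with illustrative names:

```python
import numpy as np

flat = np.arange(12)

# -1 asks NumPy to infer that dimension: 12 items / 3 rows = 4 columns
grid = flat.reshape(3, -1)
print(grid.shape)   # (3, 4)

# A shape with a different number of items is rejected
failed = False
try:
    flat.reshape(5, 3)   # 15 slots for 12 items
except ValueError:
    failed = True
print(failed)       # True
```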

**Making a zeros, ones, or empty array with the same shape and dtype as another array**

In [61]:
arr2d

array([[1, 2, 3],
       [4, 5, 6]])

In [62]:
np.ones_like(arr2d)

array([[1, 1, 1],
       [1, 1, 1]])

In [64]:
np.empty_like(arr2d)

array([[0, 0, 0],
       [0, 0, 0]])

In [65]:
arr3d = np.arange(12).reshape(2,3,2)
print(arr3d)

[[[ 0  1]
  [ 2  3]
  [ 4  5]]

 [[ 6  7]
  [ 8  9]
  [10 11]]]


**In the above, the first argument to reshape is the number of planes (axis 0), so the result is 2 planes of 3x2**

In [67]:
# get 2nd row, 1st column of 1st plane
arr3d[0,1,0]

2

In [68]:
# alternatively, you can use this syntax
arr3d[0][1][0]

2

In [69]:
# get 3rd row, 2nd column of 2nd plane
arr3d[1,2,1]

11

In [70]:
np.zeros_like(arr3d)

array([[[0, 0],
        [0, 0],
        [0, 0]],

       [[0, 0],
        [0, 0],
        [0, 0]]])

---

#### <font color="brown">Type Casting</font>

##### You can CAST an array from one dtype to another using the astype method. With its default arguments, astype ALWAYS CREATES A NEW ARRAY, leaving the original array untouched

In [71]:
floatarr = np.array([1,2.5,3])
floatarr.dtype

dtype('float64')

In [72]:
intarr = floatarr.astype(np.int64)  # fractional part is discarded (2.5 becomes 2)
intarr, intarr.dtype

(array([1, 2, 3]), dtype('int64'))

In [74]:
# or, can just say int instead of np.int64
intarr2 = floatarr.astype(int)
intarr2, intarr2.dtype

(array([1, 2, 3]), dtype('int64'))

In [76]:
num_strings = np.array(['1.5', '3.6', '-2.9'])
narr = num_strings.astype(float)  # parse each item as a real number
narr, narr.dtype

(array([ 1.5,  3.6, -2.9]), dtype('float64'))

In [77]:
# cast intarr to another array's dtype (result is a new array; intarr is unchanged)
farr = intarr.astype(floatarr.dtype) 
farr, intarr

(array([1., 2., 3.]), array([1, 2, 3]))

In [78]:
# failure to cast will raise an error
np.array(['1.2','2.5','x.y']).astype(float)

ValueError: could not convert string to float: 'x.y'
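Casting floats to integers discards the fractional part rather than rounding. The sketch below makes this visible with negative and near-integer values, and shows `np.rint` as one option if rounding is what you need; the variable names are illustrative.

```python
import numpy as np

floats = np.array([1.9, -1.9, 2.5])

# astype(int) truncates toward zero
truncated = floats.astype(int)
print(truncated)    # [ 1 -1  2]

# np.rint rounds to the nearest integer first (ties go to the even value)
rounded = np.rint(floats).astype(int)
print(rounded)      # [ 2 -2  2]

# The source array is left as it was
print(floats)
```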

---

#### <font color="brown">Array-array and array-scalar operations</font>

##### Applying batch operations to arrays as a whole, without writing explicit loops, is called <em>vectorization</em>

In [79]:
arr = np.array([[1,2,3],[4,5,6]])
arr

array([[1, 2, 3],
       [4, 5, 6]])

In [81]:
arr * arr  # corresponding elements are multiplied

array([[ 1,  4,  9],
       [16, 25, 36]])

In [83]:
arr + arr  # corresponding elements are added

array([[ 2,  4,  6],
       [ 8, 10, 12]])

In [84]:
1/arr  # invert each element

array([[1.        , 0.5       , 0.33333333],
       [0.25      , 0.2       , 0.16666667]])

In [87]:
arr ** 2  # square each element

array([[ 1,  4,  9],
       [16, 25, 36]])

In [90]:
np.power(arr,2)

array([[ 1,  4,  9],
       [16, 25, 36]])
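For comparison, here is roughly what the vectorized expression `arr * arr` replaces. This is only an illustrative sketch; the explicit loop gives the same result but is typically much slower on large arrays.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Explicit loops over every row and column
looped = np.empty_like(arr)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        looped[i, j] = arr[i, j] * arr[i, j]

# The vectorized equivalent
vectorized = arr * arr

same = np.array_equal(looped, vectorized)
print(same)   # True
```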

---

#### <font color="brown">Indexing and Slicing</font>

In [93]:
arr = np.arange(10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [95]:
arr[5:8]

array([5, 6, 7])

In [96]:
myarr = np.arange(1,10).reshape(3,3)
myarr

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [97]:
myarr[1] = [-1,-2,-3]  # update 2nd row of array
myarr

array([[ 1,  2,  3],
       [-1, -2, -3],
       [ 7,  8,  9]])

In [98]:
myarr[:,2] = -1   # update 3rd column of array (index 2)
myarr

array([[ 1,  2, -1],
       [-1, -2, -1],
       [ 7,  8, -1]])

In [99]:
narr = np.arange(32).reshape(8,4)
narr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

**indexing several scattered items in one shot**

In [100]:
narr[[2,4,0,7],[1,2,0,3]]  # selects [2,1],[4,2],[0,0],[7,3]

array([ 9, 18,  0, 31])

In [101]:
narr[[2,4,0,7]]  # rows as specified

array([[ 8,  9, 10, 11],
       [16, 17, 18, 19],
       [ 0,  1,  2,  3],
       [28, 29, 30, 31]])

In [148]:
narr[[2,4,0,7]][:,[1,2,3,0]]  # shuffle columns

array([[ 9, 10, 11,  8],
       [17, 18, 19, 16],
       [ 1,  2,  3,  0],
       [29, 30, 31, 28]])

In [104]:
# above is equivalent to
narr_subrows = narr[[2,4,0,7]]
print(narr_subrows,'\n')
narr_subrows_shuffle = narr_subrows[:,[1,2,3,0]]
print(narr_subrows_shuffle)

[[ 8  9 10 11]
 [16 17 18 19]
 [ 0  1  2  3]
 [28 29 30 31]] 

[[ 9 10 11  8]
 [17 18 19 16]
 [ 1  2  3  0]
 [29 30 31 28]]
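The two-step selection above (rows first, then columns) can also be written as a single indexing operation using `np.ix_`, which builds index arrays for every row/column combination. A minimal sketch with illustrative names:

```python
import numpy as np

narr = np.arange(32).reshape(8, 4)

# Two steps: pick rows, then reorder columns
two_step = narr[[2, 4, 0, 7]][:, [1, 2, 3, 0]]

# One step: np.ix_ selects the full grid of the listed rows and columns
one_step = narr[np.ix_([2, 4, 0, 7], [1, 2, 3, 0])]

print(np.array_equal(two_step, one_step))   # True
```

Note that this differs from `narr[[2,4,0,7],[1,2,0,3]]`, which pairs the two lists item by item and returns a 1-D result.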


**<font color="red">An array slice is a "view" (not copy) on original array. If you modify a slice, the original array is modified!!</font>**

In [107]:
arr = np.arange(10)
arr
arr_slice = arr[5:8]
arr_slice

array([5, 6, 7])

In [109]:
arr_slice[1] = 66  # changes the original array!
arr

array([ 0,  1,  2,  3,  4,  5, 66,  7,  8,  9])

In [110]:
arr[5:8][1] = 6  # 2nd element of the slice
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [111]:
arr[5:8] = 10  # every slice item is set to 10
arr

array([ 0,  1,  2,  3,  4, 10, 10, 10,  8,  9])

In [112]:
# remember, arr_slice is a view on the original, so it reflects change as well
arr_slice  

array([10, 10, 10])

In [113]:
arr_slice[:] = 13 # every slice item is set to 13
arr

array([ 0,  1,  2,  3,  4, 13, 13, 13,  8,  9])

**You can make a copy of a slice by using copy method**

In [115]:
slice_copy = arr[5:8].copy()  # explicit copy of slice, not a view
slice_copy[1] = 66
print(arr)
print(slice_copy)

[ 0  1  2  3  4 13 13 13  8  9]
[13 66 13]
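When it is unclear whether something is a view or a copy, you can ask NumPy directly instead of modifying data to find out. The sketch below uses `np.shares_memory` and the `base` attribute; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(10)
slice_view = arr[5:8]
slice_copy = arr[5:8].copy()

# True if the two arrays overlap in memory
print(np.shares_memory(arr, slice_view))   # True
print(np.shares_memory(arr, slice_copy))   # False

# A view records the array it was taken from; a copy owns its data
print(slice_view.base is arr)              # True
print(slice_copy.base is None)             # True
```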


---

#### <font color="brown">Slicing on 2D arrays</font>

In [138]:
# 2D array
arr2d = np.array([[1,2,3],[4,5,6],[7,8,9]])
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [139]:
arr2d[2] # 3rd row

array([7, 8, 9])

In [140]:
arr2d[:,1] # all rows, 2nd column

array([2, 5, 8])

In [141]:
arr2d[1][2] 

6

In [142]:
# alternatively
arr2d[1,2]

6

**Slicing out sub 2D arrays**

In [143]:
rowslc = arr2d[1:] # 2nd and 3rd rows
print(arr2d,'\n')
print(rowslc)

[[1 2 3]
 [4 5 6]
 [7 8 9]] 

[[4 5 6]
 [7 8 9]]


In [144]:
# 1st and 3rd rows
arr2d[[0,2]]

array([[1, 2, 3],
       [7, 8, 9]])

In [145]:
# 2nd and 3rd rows can also be written like this
arr2d[[-2,-1]]

array([[4, 5, 6],
       [7, 8, 9]])

In [146]:
# shuffle rows
arr2d[[2,0,1]]

array([[7, 8, 9],
       [1, 2, 3],
       [4, 5, 6]])

In [147]:
colslc = arr2d[:, [0,2]]  # 1st and 3rd columns
colslc

array([[1, 3],
       [4, 6],
       [7, 9]])

In [148]:
colslc[:,1] = 10  # assign 10 to second column of slice
colslc

array([[ 1, 10],
       [ 4, 10],
       [ 7, 10]])

In [149]:
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

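**<font color="red">Above shows that indexing with a list of indices (fancy indexing), as in arr2d[:, [0,2]], gives a COPY, not a view. This holds for rows as well as columns<p>However, basic slicing with start:stop ranges, as in arr2d[1:], gives a VIEW, not a copy</font>**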

In [150]:
arr2d = np.arange(1,10).reshape(3,3)
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [151]:
rowslc = arr2d[1:]
rowslc

array([[4, 5, 6],
       [7, 8, 9]])

In [152]:
rowslc[0] = 10
rowslc

array([[10, 10, 10],
       [ 7,  8,  9]])

In [153]:
arr2d  # original array is modified!

array([[ 1,  2,  3],
       [10, 10, 10],
       [ 7,  8,  9]])
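What decides between view and copy is the kind of index, not whether rows or columns are selected. The sketch below checks four cases with `np.shares_memory`; note that the 1st and 3rd columns can be taken as a view by using the step slice `::2` instead of the list `[0,2]`. Variable names are illustrative.

```python
import numpy as np

arr2d = np.arange(1, 10).reshape(3, 3)

row_view = arr2d[1:]           # basic slice on rows
col_view = arr2d[:, ::2]       # basic slice on columns (1st and 3rd)
row_copy = arr2d[[0, 2]]       # list of row indices
col_copy = arr2d[:, [0, 2]]    # list of column indices

for name, sub in [('row_view', row_view), ('col_view', col_view),
                  ('row_copy', row_copy), ('col_copy', col_copy)]:
    print(name, np.shares_memory(arr2d, sub))

# Writing through the column view changes the original
col_view[:, 1] = 10
print(arr2d)
```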

**Filtering by values**

In [158]:
arr=np.arange(9)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8])

In [159]:
slc = arr[arr > 4]  # pick elements > 4
slc

array([5, 6, 7, 8])

In [160]:
slc[0] = 10
slc

array([10,  6,  7,  8])

In [161]:
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8])

**<font color="red">NOTE: Selecting with a boolean filter, like in the above example, gets a COPY, not a view</font>**
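Although reading through a boolean filter produces a copy, using the same filter on the left-hand side of an assignment does modify the original array. A minimal sketch, with `mask` and `picked` as illustrative names:

```python
import numpy as np

arr = np.arange(9)
mask = arr > 4

# Reading with the mask gives a copy, so this does not touch arr
picked = arr[mask]
picked[0] = 10
print(arr)    # [0 1 2 3 4 5 6 7 8]

# Assigning through the mask updates arr in place
arr[mask] = 0
print(arr)    # [0 1 2 3 4 0 0 0 0]
```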