# Numpy Tutorial
**Sources:**
1. Patrick Loeber [Numpy Crash Course](https://www.youtube.com/watch?v=9JUAPgtkKpI)
2. Codebasics [Numpy playlist](https://www.youtube.com/playlist?list=PLeo1K3hjS3uset9zIVzJWqplaWBiacTEU)

In [1]:
import numpy as np
np.__version__    # shows the installed version, e.g. 1.22.4

'1.22.4'

### A NumPy array (1-dimensional)

In [2]:
a = np.array([1, 2, 3])
a

array([1, 2, 3])

#### Shape of an array using `shape` attribute

In [3]:
a.shape

(3,)

#### Datatype of an array using `dtype` attribute

In [4]:
a.dtype

dtype('int32')

#### Dimension of an array using `ndim` attribute

In [5]:
a.ndim

1

#### Size of an array using `size` attribute

In [6]:
a.size

3

#### Size of each element in array using `itemsize` attribute

In [7]:
a.itemsize

4
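
The attributes above are related: the total memory used by the data is `size * itemsize`, which NumPy also exposes as `nbytes`. A minimal sketch, assuming an explicitly chosen `int32` dtype so the numbers do not depend on the platform (the name `arr_i32` is only for illustration):

```python
import numpy as np

# Fix the dtype explicitly so the results are the same on every platform.
arr_i32 = np.array([1, 2, 3], dtype=np.int32)

print(arr_i32.size)      # number of elements: 3
print(arr_i32.itemsize)  # bytes per element: 4 for int32
print(arr_i32.nbytes)    # total bytes of data: size * itemsize = 12
```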

#### Dot product of two arrays using `np.dot()` method & `@` sign
It is the sum of the element-wise products of the two arrays

In [8]:
# Normal way in Python
l1 = [1, 2, 3]
l2 = [4, 5, 6]
list_dot = 0
for i in range(len(l1)):
    list_dot += l1[i] * l2[i]
list_dot

32

In [9]:
# Using np.dot()
a1 = np.array(l1)
a2 = np.array(l2)
array_dot = np.dot(a1, a2)
array_dot

32

Using `@` sign

In [10]:
dot = a1 @ a2
dot

32
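
For 2-dimensional arrays, `@` and `np.dot()` perform matrix multiplication, which is different from the element-wise `*`. A small sketch with two hypothetical 2x2 matrices `m1` and `m2`:

```python
import numpy as np

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

elementwise = m1 * m2   # multiplies values at matching positions
matmul = m1 @ m2        # matrix product; same result as np.dot(m1, m2) for 2-D arrays

print(elementwise)
print(matmul)
```

Here `elementwise` holds `[[5, 12], [21, 32]]`, while `matmul` holds `[[19, 22], [43, 50]]`.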

## Multi-dimensional arrays

In [11]:
a = np.array([[1, 2, 3], [4, 5, 6]])
a

array([[1, 2, 3],
       [4, 5, 6]])

In [12]:
a.shape

(2, 3)

### Indexing in a multi-dimensional array can be done using ',' comma

In [13]:
print(a[0][1])
print(a[0, 1])
a[0][1] == a[0, 1]

2
2


True

### Slicing of multi-dimensional arrays can be done by putting commas
`arr[1:3, 2:4]`

-> `arr` is a 2d array

-> Slice the rows at index 1 and 2 (the 2nd & 3rd rows) from it

-> And the columns at index 2 and 3 (the 3rd & 4th columns) from it
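
The slice `arr[1:3, 2:4]` can be tried on a small array. This is a sketch using a hypothetical 4x5 array named `grid`; remember that the stop index of a slice is excluded:

```python
import numpy as np

grid = np.arange(20).reshape(4, 5)   # values 0..19 laid out in 4 rows, 5 columns

# Rows at index 1 and 2, columns at index 2 and 3 (stop indexes 3 and 4 are excluded).
sub = grid[1:3, 2:4]

print(grid)
print(sub)        # [[ 7  8]
                  #  [12 13]]
print(sub.shape)  # (2, 2)
```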

In [14]:
# if we want all the rows of a specific column
a[:,1]

array([2, 5])

In [15]:
# if we want all the columns of a specific row
a[1,:]

array([4, 5, 6])

In [16]:
# last 2 elements of last row
a[-1,-2:]

array([5, 6])

#### Boolean indexing can be done by putting the condition within square brackets

In [17]:
a

array([[1, 2, 3],
       [4, 5, 6]])

In [18]:
a>2

array([[False, False,  True],
       [ True,  True,  True]])

In [19]:
a[a>2]

array([3, 4, 5, 6])

#### Using `np.where()` method

In [20]:
np.where(a>2)

(array([0, 1, 1, 1], dtype=int64), array([2, 0, 1, 2], dtype=int64))

In [21]:
# params: condition, x, y
# Where the condition is True, the value is taken from x;
# where it is False, the value is taken from y.
# Here values greater than 2 are kept, and all other values are replaced with -1.
np.where(a>2, a, -1)

array([[-1, -1,  3],
       [ 4,  5,  6]])

In [22]:
# Keep the even values, replace the odd values with 0
np.where(a%2==0, a, 0)

array([[0, 2, 0],
       [4, 0, 6]])

#### Fancy Indexing
Passing a list or array of indexes to an array to access several of its elements at once.

In [23]:
arr = np.array([10, 20, 30, 40, 50, 60])
indexes = [1, 4, 3, 5]

In [24]:
arr[indexes]

array([20, 50, 40, 60])

#### Find indexes of an array where some condition is True, using `np.argwhere()` method

In [25]:
arr2 = np.array([1, 2, 3, 4, 5])

In [26]:
even_indexes = np.argwhere(arr2%2==0)
even_indexes

array([[1],
       [3]], dtype=int64)

In [27]:
even_indexes = even_indexes.flatten()
even_indexes

array([1, 3], dtype=int64)

In [28]:
arr2[even_indexes]

array([2, 4])

### Transpose of an array using `T` attribute
Shifting all the rows to columns and columns to rows of a matrix

In [29]:
a.T

array([[1, 4],
       [2, 5],
       [3, 6]])

#### To get diagonals of an array, we use `np.diag(arr)`

In [30]:
a

array([[1, 2, 3],
       [4, 5, 6]])

In [31]:
np.diag(a)

array([1, 5])

In [32]:
np.diag(np.diag(a))

array([[1, 0],
       [0, 5]])

In [33]:
np.diag(np.array([1, 2, 3, 4]))

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])

### Creating an array using `np.arange()` method

In [34]:
# Passing stop value, start=0 (default)
np.arange(5)

array([0, 1, 2, 3, 4])

In [35]:
# Passing start & stop value
np.arange(1, 5)

array([1, 2, 3, 4])

In [36]:
# Passing start, stop & step value
np.arange(1, 11, 2)

array([1, 3, 5, 7, 9])

In [37]:
a = np.arange(1, 11)
a

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

### Reshaping an array using `.reshape()` method
The new shape must hold exactly as many elements as the array, i.e. the product of its dimensions must equal `a.size`.
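
A short sketch of that rule, using a hypothetical 10-element array named `values`. Passing `-1` for one dimension lets NumPy infer it, and a shape whose product does not match the element count raises a `ValueError`:

```python
import numpy as np

values = np.arange(1, 11)          # 10 elements

inferred = values.reshape(2, -1)   # -1 is inferred as 5, because 2 * 5 = 10
print(inferred.shape)              # (2, 5)

failed = False
try:
    values.reshape(3, 4)           # 3 * 4 = 12, which does not match 10 elements
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```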

In [38]:
b = a.reshape([2, 5])
b

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

In [39]:
b.shape

(2, 5)

In [40]:
b.reshape([5, 2])

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [41]:
a.shape

(10,)

In [42]:
c = a[np.newaxis, :]
c

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]])

In [43]:
c.shape

(1, 10)

In [44]:
c = a[: , np.newaxis]
c

array([[ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10]])

In [45]:
c.shape

(10, 1)

### Concatenating two arrays using `np.concatenate()` method 

In [46]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
a, b

(array([[1, 2],
        [3, 4]]),
 array([[5, 6]]))

In [47]:
# Concatenate another row
c = np.concatenate((a, b), axis=0)    # axis=0, default
c

array([[1, 2],
       [3, 4],
       [5, 6]])

In [48]:
# Concatenate another column
d = np.concatenate((a, b.T), axis=1) # axis=1
d

array([[1, 2, 5],
       [3, 4, 6]])

In [49]:
# If axis=None, concatenate after flattening them
e = np.concatenate((a, b), axis=None)
e

array([1, 2, 3, 4, 5, 6])

#### Concatenating arrays using `np.hstack()` & `np.vstack()` methods
`np.hstack()` concatenates the arrays horizontally.

`np.vstack()` concatenates the arrays vertically.

In [50]:
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

In [51]:
arr3 = np.hstack((arr1, arr2))
arr3

array([[1, 2, 5, 6],
       [3, 4, 7, 8]])

In [52]:
arr4 = np.vstack((arr1, arr2))
arr4

array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

### Broadcasting
It is the ability to perform arithmetic operations on arrays of different shapes, as long as those shapes are compatible.

In [53]:
arr1 = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
arr1

array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

In [54]:
arr1 * 2

array([[ 2,  4],
       [ 6,  8],
       [10, 12],
       [14, 16]])

In [55]:
arr1 * np.array([2])

array([[ 2,  4],
       [ 6,  8],
       [10, 12],
       [14, 16]])

In [56]:
arr1 * np.array([3, 2])

array([[ 3,  4],
       [ 9,  8],
       [15, 12],
       [21, 16]])

In [57]:
arr1

array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

In [58]:
arr1 - 5

array([[-4, -3],
       [-2, -1],
       [ 0,  1],
       [ 2,  3]])

In [59]:
arr1 + 2

array([[ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [60]:
arr1 + np.array([[8, 7], [6, 5], [4, 3], [2, 1]])

array([[9, 9],
       [9, 9],
       [9, 9],
       [9, 9]])
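
The examples above work because the shapes are compatible: shapes are compared from the last dimension backwards, and two sizes match when they are equal or one of them is 1. A minimal sketch of that rule, assuming NumPy 1.20 or newer for `np.broadcast_shapes` (the names `col` and `row` are only for illustration):

```python
import numpy as np

col = np.array([[1], [2], [3], [4]])   # shape (4, 1)
row = np.array([10, 20])               # shape (2,)

# (4, 1) and (2,) are aligned from the right: the 1 stretches to 2,
# and the missing leading dimension of `row` stretches to 4.
print(np.broadcast_shapes(col.shape, row.shape))  # (4, 2)

total = col + row
print(total)

mismatch = False
try:
    np.ones((4, 2)) + np.ones(3)   # trailing sizes 2 and 3 differ and neither is 1
except ValueError as err:
    mismatch = True
    print("cannot broadcast:", err)
```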

In [61]:
a

array([[1, 2],
       [3, 4]])

### Use `.sum()` to calculate sum of all the values in an array

In [62]:
a.sum()

10

In [63]:
# Column sums (axis=0), row sums (axis=1)
a.sum(0), a.sum(axis=1)

(array([4, 6]), array([3, 7]))

### Use `.mean()` to find mean (average) of the array

In [64]:
a.mean()

2.5

In [65]:
# Column means (axis=0), row means (axis=1)
a.mean(0), a.mean(axis=1)

(array([2., 3.]), array([1.5, 3.5]))

#### There are several more methods (e.g. `min`, `max`, `std`); most of them can be used either as array (instance) methods or as NumPy functions.
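
A brief sketch of the two calling styles on a hypothetical 2x2 array `m`. Note that not every function has a method counterpart: `np.median` exists only as a function.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

print(m.max(), np.max(m))   # same result from the method and the function: 4 4
print(m.min(axis=0))        # column-wise minimum: [1 2]
print(m.std())              # standard deviation of all elements
print(np.median(m))         # 2.5; there is no m.median() method
```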

### Datatypes

In [66]:
# With this NumPy version, int32 is the default integer datatype on Windows (int64 on Linux and macOS)
np.array([1, 2]).dtype

dtype('int32')

In [67]:
# float64 is the default datatype for floats
np.array([1.141, 2.354]).dtype

dtype('float64')

In [68]:
# We can also specify the datatype of an array
# (float values are truncated when cast to an integer datatype, e.g. 6.7 -> 6)
x = np.array([1, 2, 3, 4, 5.0, 6.7], dtype=np.int64)
x, x.dtype

(array([1, 2, 3, 4, 5, 6], dtype=int64), dtype('int64'))

In [69]:
y = np.array([1, 2, 3, 4, 5.0, 6.7], dtype=np.float64)
y, y.dtype

(array([1. , 2. , 3. , 4. , 5. , 6.7]), dtype('float64'))

### Copying an array
Assigning an array to another variable does not copy it: both names refer to the same array.
For an actual copy, we need to use the **`.copy()`** method.

In [70]:
a

array([[1, 2],
       [3, 4]])

In [71]:
b = a
c = a.copy()
b, c

(array([[1, 2],
        [3, 4]]),
 array([[1, 2],
        [3, 4]]))

In [72]:
a[0][1] = 5
a, b, c

(array([[1, 5],
        [3, 4]]),
 array([[1, 5],
        [3, 4]]),
 array([[1, 2],
        [3, 4]]))
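
Whether two arrays share data can also be checked directly. The sketch below uses `is` and `np.shares_memory()`, and additionally shows that basic indexing returns a view on the same data rather than a copy (all variable names are illustrative):

```python
import numpy as np

original = np.array([[1, 2], [3, 4]])
alias = original               # same object, nothing is copied
view = original[0]             # basic indexing returns a view on the same data
independent = original.copy()  # separate data

print(alias is original)                        # True
print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False

view[0] = 99                   # writing through the view changes `original`
print(original[0, 0])          # 99
print(independent[0, 0])       # 1
```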

### Creating arrays (generating arrays)
1. Conversion from Python data structures
2. Using numpy array creation functions like arange, zeros (see "Creating an array using `np.arange()` method" above)
3. Replicating, concatenating & modifying existing arrays (see "Concatenating two arrays using `np.concatenate()` method" above)
4. Reading arrays from custom file formats like CSV
5. Using special library functions like random, pandas, scipy, OpenCV

In [73]:
# 1. Conversion from a Python data structure
a_tuple = (1, 2, 3)
np.array(a_tuple)

array([1, 2, 3])

In [74]:
# 2. Using zeros
np.zeros((2, 3))    # similarly, np.ones()

array([[0., 0., 0.],
       [0., 0., 0.]])

In [75]:
np.full((4, 2), 5)

array([[5, 5],
       [5, 5],
       [5, 5],
       [5, 5]])

In [76]:
np.full((4, 2), [1, 5])

array([[1, 5],
       [1, 5],
       [1, 5],
       [1, 5]])

In [77]:
# Identity matrix
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [78]:
np.eye(3, 5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.]])

In [79]:
# linspace returns evenly spaced samples over a given range
# start & stop are the required params (stop is included by default),
# optionally, num (default 50) sets how many samples to generate
np.linspace(1, 50)

array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
       14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26.,
       27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39.,
       40., 41., 42., 43., 44., 45., 46., 47., 48., 49., 50.])

In [80]:
np.linspace(1, 10, 5)

array([ 1.  ,  3.25,  5.5 ,  7.75, 10.  ])
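
Two optional parameters of `np.linspace()` are worth trying: `retstep=True` also returns the spacing between samples, and `endpoint=False` leaves the stop value out. A small sketch:

```python
import numpy as np

# retstep=True returns a (samples, step) tuple.
samples, step = np.linspace(0, 1, num=5, retstep=True)
print(samples)   # 0, 0.25, 0.5, 0.75, 1
print(step)      # 0.25

# endpoint=False excludes the stop value, so the spacing becomes 1 / 5.
open_end = np.linspace(0, 1, num=5, endpoint=False)
print(open_end)  # 0, 0.2, 0.4, 0.6, 0.8
```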

In [81]:
# Generate random float numbers array from 0 to 1 of provided shape
np.random.random((2, 3))

array([[0.70553633, 0.303737  , 0.45628788],
       [0.69094429, 0.12891104, 0.65341291]])

#### 4. Reading arrays from custom file formats like CSV

In [82]:
data = np.genfromtxt("./data.csv", delimiter=",", dtype=np.float32)
data.shape

(100, 28)
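
The cell above needs a `data.csv` file on disk. To try `np.genfromtxt()` without one, a self-contained sketch can read from an in-memory string instead. The CSV content below is made up for illustration, and the empty field shows how a missing value is handled:

```python
import io
import numpy as np

# Hypothetical CSV content standing in for a real file.
# The second data row has an empty "weight" field.
csv_text = "height,weight\n1.5,60\n1.8,\n1.7,72\n"

# skip_header=1 ignores the column names on the first line.
table = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(table.shape)  # (3, 2)
print(table)        # the missing field is read as nan
```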

#### 5. Using special library functions like random

In [83]:
# randint takes low, high, size (high is excluded)
# if only one bound is provided, values are drawn from 0 up to (but excluding) it
np.random.randint(low=1, high=11, size=(3, 3))

array([[ 5,  1,  5],
       [ 7,  8,  3],
       [10,  7,  6]])

In [84]:
np.random.choice([5, 4, 1], size=(3, 3))

array([[4, 4, 1],
       [4, 5, 4],
       [5, 5, 5]])
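
The random outputs above change on every run. When reproducible results are needed, one option is a seeded generator from `np.random.default_rng()` (available since NumPy 1.17). A minimal sketch, where the seed value 42 is arbitrary:

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

# integers() excludes `high` by default, like np.random.randint().
draw_a = rng_a.integers(low=1, high=11, size=(3, 3))
draw_b = rng_b.integers(low=1, high=11, size=(3, 3))

print(draw_a)
print(np.array_equal(draw_a, draw_b))   # True
```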

### To Split an array, we use `np.hsplit()` & `np.vsplit()`

`np.hsplit()` is used to split the arrays horizontally, column-wise.

`np.vsplit()` is used to split the arrays vertically, row-wise.

In [86]:
arr = np.arange(16).reshape(4, 4)
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [91]:
# Passing the array along with the number of equal splits we want
np.hsplit(arr, 2)

[array([[ 0,  1],
        [ 4,  5],
        [ 8,  9],
        [12, 13]]),
 array([[ 2,  3],
        [ 6,  7],
        [10, 11],
        [14, 15]])]

In [93]:
# Passing the array along with the indices where the splits are needed
np.hsplit(arr, [1,3])

[array([[ 0],
        [ 4],
        [ 8],
        [12]]),
 array([[ 1,  2],
        [ 5,  6],
        [ 9, 10],
        [13, 14]]),
 array([[ 3],
        [ 7],
        [11],
        [15]])]

In [95]:
np.vsplit(arr, 2)

[array([[0, 1, 2, 3],
        [4, 5, 6, 7]]),
 array([[ 8,  9, 10, 11],
        [12, 13, 14, 15]])]

In [97]:
# if a provided index is beyond the last row,
# the final piece is an empty array with 0 rows and the same number of columns
np.vsplit(arr, [1, 3, 8])

[array([[0, 1, 2, 3]]),
 array([[ 4,  5,  6,  7],
        [ 8,  9, 10, 11]]),
 array([[12, 13, 14, 15]]),
 array([], shape=(0, 4), dtype=int32)]
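
When a number of splits is passed, `np.hsplit()` and `np.vsplit()` require the array to divide evenly. A sketch of what happens otherwise, and of `np.array_split()`, which allows unequal pieces (the name `grid` is only for illustration):

```python
import numpy as np

grid = np.arange(16).reshape(4, 4)

uneven_failed = False
try:
    np.vsplit(grid, 3)   # 4 rows cannot be divided into 3 equal parts
except ValueError as err:
    uneven_failed = True
    print("vsplit failed:", err)

# array_split accepts the same arguments but permits pieces of different sizes.
parts = np.array_split(grid, 3, axis=0)
print([p.shape for p in parts])   # [(2, 4), (1, 4), (1, 4)]
```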

### `np.nditer()` an iterator to iterate over arrays efficiently
**Order:**
1. C order (row-major) iterates row by row, visiting every column within a row
2. F (Fortran) order (column-major) iterates column by column, visiting every row within a column
![C and Fortran order](attachment:image.png)

In [103]:
a

array([[1, 5],
       [3, 4]])

In [101]:
for i in np.nditer(a):    # default order="C"
    print(i)

1
5
3
4


In [102]:
for i in np.nditer(a, order="F"):
    print(i)

1
3
5
4


In [104]:
# The external_loop flag yields 1-dimensional chunks instead of single elements,
# so the loop body runs fewer times.
for i in np.nditer(a, order="F", flags=["external_loop"]):
    print(i)

[1 3]
[5 4]


In [110]:
x = np.array([np.arange(1, 5).reshape(2, 2), 
              np.arange(5, 9).reshape(2, 2),
              np.arange(9, 13).reshape(2,2),
              np.arange(13, 17).reshape(2, 2)])
x

array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]],

       [[13, 14],
        [15, 16]]])

In [115]:
# With the external_loop flag, each iteration yields a 1-dimensional array of values
for i in np.nditer(x, order="F", flags=["external_loop"]):
    print(i)

[ 1  5  9 13]
[ 3  7 11 15]
[ 2  6 10 14]
[ 4  8 12 16]


In [131]:
a = np.array([[1, 5], [3, 4]])
a

array([[1, 5],
       [3, 4]])

In [132]:
# op_flags=["readwrite"] makes each value both readable and writeable
for x in np.nditer(a, op_flags=["readwrite"]):
    x[...] = x*x
    print(x)
a

1
25
9
16


array([[ 1, 25],
       [ 9, 16]])

In [136]:
b = np.arange(10, 50, 10).reshape(2,2)
b

array([[10, 20],
       [30, 40]])

In [137]:
for x, y in np.nditer([a, b]):
    print(x, y)

1 10
25 20
9 30
16 40
