# Intro to Numpy

Numpy is a powerful package for efficient numerical computing. We will use it a lot when practicing different Machine Learning algorithms.

First of all, to actually use the package, we need to import it!

In [1]:
import numpy as np

Numpy is famous for its matrix operations. Matrices are stored as ndarrays (n-dimensional arrays). Let's create some! The code below creates matrices of shape (m, n) filled with random values in the range [0, 1).

Note:
<br>
m = # of rows
<br>
n = # of columns

In [2]:
A = np.random.rand(3, 4)
B = np.random.rand(4, 5)
print("A:\n", A)
print("B:\n", B)

A:
 [[0.44301902 0.55358705 0.62999884 0.76415369]
 [0.17919054 0.22890302 0.24190631 0.04074649]
 [0.96616876 0.98757651 0.81846418 0.6738869 ]]
B:
 [[0.90028323 0.34331779 0.98378841 0.53900232 0.10908683]
 [0.39272762 0.36251292 0.46524656 0.11750489 0.94464401]
 [0.54934626 0.67528543 0.85415538 0.1005783  0.54389411]
 [0.38812551 0.33329522 0.69353805 0.87217157 0.75455479]]


To inspect the shape of a matrix, use its shape attribute:

In [3]:
A.shape

(3, 4)
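
Besides shape, an ndarray exposes a few other attributes that are handy when inspecting data. The snippet below is a minimal sketch; the array name `M` is only illustrative and is not used elsewhere in this notebook.

```python
import numpy as np

# A small example array with a known shape (the values do not matter here).
M = np.zeros((3, 4))

print(M.shape)  # (rows, columns)
print(M.ndim)   # number of dimensions
print(M.size)   # total number of elements
print(M.dtype)  # type of each element
```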

Let's perform some basic matrix operations!

In [4]:
# Transpose
print("A transposed:\n", A.T)

# Dot product of two matrices: (m x n) . (n x p) gives shape (m x p)
C = A.dot(B)
print("Dot product of A and B:\n", C)
print("Shape of C:", C.shape)

A transposed:
 [[0.44301902 0.17919054 0.96616876]
 [0.55358705 0.22890302 0.98757651]
 [0.62999884 0.24190631 0.81846418]
 [0.76415369 0.04074649 0.6738869 ]]
Dot product of A and B:
 [[1.25892657 1.03289658 1.76147801 1.03367481 1.49051871]
 [0.39992386 0.32143602 0.51766674 0.1833498  0.39809607]
 [1.96884704 1.46701238 2.57643399 1.30687701 1.99194696]]
Shape of C: (3, 5)
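
The shape rule for the dot product can be checked directly. The sketch below uses the `@` operator, which gives the same matrix product as `.dot()` for 2D arrays, and shows what happens when the inner dimensions do not match. The names `P`, `Q`, `R` and the seeded generator are assumptions made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the example is reproducible
P = rng.random((3, 4))
Q = rng.random((4, 5))

# The @ operator computes the same matrix product as P.dot(Q).
R = P @ Q
print(R.shape)
print(np.allclose(R, P.dot(Q)))

# Swapping the operands breaks the rule: (4, 5) @ (3, 4) has inner sizes 5 and 3.
mismatch_raised = False
try:
    Q @ P
except ValueError as err:
    mismatch_raised = True
    print("Shape mismatch:", err)
```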


Numpy also has a submodule called linalg, short for Linear Algebra. It can compute complicated linear algebra operations efficiently! Let's first create a square matrix:

In [5]:
square_mat = np.random.rand(3, 3)
square_mat

array([[0.58959679, 0.30434958, 0.67469218],
       [0.84588803, 0.54229824, 0.56567866],
       [0.15492781, 0.72134633, 0.11239809]])

In [6]:
print("The Determinant of square_mat:")
np.linalg.det(square_mat)

The Determinant of square_mat:


0.1480861728057067

In [7]:
print("The Inverse of square_mat:")
np.linalg.inv(square_mat)

The Inverse of square_mat:


array([[-2.3438849 ,  3.05550756, -1.30815941],
       [-0.0502197 , -0.25835652,  1.60171415],
       [ 3.55307415, -2.55358931,  0.42064451]])
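
A quick way to sanity-check an inverse is to multiply it by the original matrix and compare the result with the identity. The sketch below uses a small hand-picked matrix (`S` and `singular` are illustrative names) so the determinant can be verified by hand. It also shows that a singular matrix has no inverse.

```python
import numpy as np

# A small, well-conditioned matrix chosen for illustration.
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
S_inv = np.linalg.inv(S)

# A matrix times its inverse should be the identity, up to floating-point
# rounding, so compare with np.allclose instead of ==.
print(np.allclose(S @ S_inv, np.eye(2)))

# For a 2x2 matrix, det = 2*3 - 1*1 = 5.
print(np.isclose(np.linalg.det(S), 5.0))

# The second row is twice the first, so this matrix is singular.
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])
inverse_failed = False
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as err:
    inverse_failed = True
    print("No inverse:", err)
```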

Now that we are more familiar with how numpy works, let's move on to slicing!

First, let's create a bigger matrix!

In [8]:
A = np.random.randint(low=0, high=200, size=(10,10))
A

array([[ 13,  61,  27,  37, 119,   1, 146,  28,  53, 168],
       [  3,  53, 159, 183,  25,  31,  48, 118, 125,  67],
       [136,   4, 104,  94,  44, 121,  39, 174, 192, 168],
       [ 85, 132, 165,  83, 187,  92,  44,  70,  30, 173],
       [ 73, 144, 128, 165, 184,  56,  82,  48, 142, 183],
       [100,  36, 172,  97, 151,  93, 117, 108,  18, 131],
       [ 85,  17, 129, 150,  75, 159,  88, 165, 107,  49],
       [186, 105, 154, 103,   1,   1, 193, 111,  82, 137],
       [ 28,  91, 174, 102,  88,  56, 148,  59, 178, 197],
       [155, 117, 163, 152, 183, 108,  32,  80,  33,  47]])

Assume that we want row 3 through row 6 (which means indices 2 to 5, since indexing starts at 0):

In [9]:
A[2:6]

array([[136,   4, 104,  94,  44, 121,  39, 174, 192, 168],
       [ 85, 132, 165,  83, 187,  92,  44,  70,  30, 173],
       [ 73, 144, 128, 165, 184,  56,  82,  48, 142, 183],
       [100,  36, 172,  97, 151,  93, 117, 108,  18, 131]])

As you can see, slicing works the same way as for other sequences (e.g. strings): [a:b] means from a (inclusive) to b (exclusive).

Now, say we want columns from column 3 to column 6:

In [10]:
A[:, 2:6]

array([[ 27,  37, 119,   1],
       [159, 183,  25,  31],
       [104,  94,  44, 121],
       [165,  83, 187,  92],
       [128, 165, 184,  56],
       [172,  97, 151,  93],
       [129, 150,  75, 159],
       [154, 103,   1,   1],
       [174, 102,  88,  56],
       [163, 152, 183, 108]])

The ":," at the beginning simply means we want all the rows of these columns.

Now, say we want rows 3-6 and columns 3-6.
You can probably guess the code for this:

In [11]:
A[2:6, 2:6]

array([[104,  94,  44, 121],
       [165,  83, 187,  92],
       [128, 165, 184,  56],
       [172,  97, 151,  93]])
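
Slicing is easier to follow on an array whose values reveal their position. The sketch below uses a small array `G` (an illustrative name) and also shows a detail worth knowing: a basic slice is a view of the original data, so writing to it changes the original array unless you take a copy.

```python
import numpy as np

# A 4x5 array holding 0..19, row by row.
G = np.arange(20).reshape(4, 5)

sub = G[1:3, 2:4]        # rows at index 1-2, columns at index 2-3
print(sub)               # [[ 7  8] [12 13]]
print(G[-1])             # negative index: the last row
print(G[:, ::2])         # a step of 2: every second column

# A basic slice is a view, so this assignment also changes G.
sub[0, 0] = -1
print(G[1, 2])           # -1

# Use .copy() when an independent array is wanted.
safe = G[1:3, 2:4].copy()
safe[0, 0] = 100
print(G[1, 2])           # still -1
```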

Next, broadcasting. This is something a plain Python list cannot do: adding a number to a list raises an error.

In [12]:
# nums = [0, 1, 2, 3, 4]
# nums + 5                Raises TypeError: a list can only be concatenated with another list

A = np.arange(0, 5)
A

array([0, 1, 2, 3, 4])

In [13]:
# This will not raise an error:
A + 5

array([5, 6, 7, 8, 9])

Needless to say, this also works on bigger, more complicated arrays:

In [14]:
A = np.random.rand(10, 10)
A

array([[0.57687038, 0.24450664, 0.67247502, 0.34450274, 0.70016254,
        0.59037796, 0.87199989, 0.55114854, 0.33954133, 0.19353015],
       [0.67926517, 0.7598611 , 0.72038702, 0.76106755, 0.74332444,
        0.25312281, 0.11831178, 0.07430064, 0.08600708, 0.99704423],
       [0.07632733, 0.05546759, 0.94494364, 0.01265978, 0.33367632,
        0.00798284, 0.54655506, 0.71992477, 0.15215962, 0.20569613],
       [0.96556732, 0.50897252, 0.7194195 , 0.62733974, 0.45076815,
        0.74702307, 0.18429445, 0.82375678, 0.55610196, 0.15959152],
       [0.95247716, 0.90977385, 0.98679708, 0.55705006, 0.44934777,
        0.74476508, 0.18018895, 0.76650888, 0.34842357, 0.04040032],
       [0.7101492 , 0.51015226, 0.75847078, 0.95292218, 0.63834956,
        0.90969519, 0.64701236, 0.42571633, 0.48230942, 0.29959739],
       [0.67570177, 0.84595048, 0.30135012, 0.88697978, 0.38779612,
        0.53661327, 0.56373406, 0.60537396, 0.25213812, 0.36152823],
       [0.84769412, 0.87494513, 0.9144249

In [15]:
A * 10.4323 - 4.342

array([[ 1.67608486, -1.79123337,  2.67346112, -0.74804412,  2.96230569,
         1.81699995,  4.75496445,  1.40774693, -0.799803  , -2.32303542],
       [ 2.744298  ,  3.58509899,  3.17329349,  3.597685  ,  3.41258358,
        -1.70134695, -3.10773602, -3.56687349, -3.44474834,  6.05946453],
       [-3.54573044, -3.76334545,  5.51593554, -4.20992937, -0.8609885 ,
        -4.25872057,  1.35982634,  3.16847122, -2.75462517, -2.19611621],
       [ 5.731088  ,  0.96775397,  3.16320007,  2.20259638,  0.36054853,
         3.45116873, -2.41938504,  4.25167788,  1.45942246, -2.6770934 ],
       [ 5.59452749,  5.14903375,  5.95256322,  1.46931339,  0.34573079,
         3.42761277, -2.46221486,  3.65445061, -0.70714075, -3.92053175],
       [ 3.0664895 ,  0.98006137,  3.57059477,  5.59917005,  2.31745412,
         5.14821314,  2.407827  ,  0.09920044,  0.68959655, -1.21651018],
       [ 2.70712362,  4.4832092 , -1.19822514,  4.91123912, -0.29639449,
         1.25611064,  1.53904281,  1.97344278
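
The examples above combine an array with a single number. Broadcasting also works between arrays of different shapes, as long as each pair of dimensions either matches or one of them is 1. Here is a minimal sketch with illustrative names (`grid`, `row`, `col`):

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])        # shape (3,): added to every row
col = np.array([[100], [200]])      # shape (2, 1): added to every column

by_row = grid + row
by_col = grid + col
print(by_row)   # [[10 21 32] [13 24 35]]
print(by_col)   # [[100 101 102] [203 204 205]]
```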

Last but not least, reshaping. This is a really important method because it makes feature engineering much easier.

In [16]:
A = np.random.rand(3,4)
A

array([[0.86256217, 0.5579376 , 0.23560901, 0.92887837],
       [0.58490036, 0.78588328, 0.85494572, 0.48587739],
       [0.36853283, 0.23802236, 0.30410236, 0.8674434 ]])

In [17]:
A.shape

(3, 4)

To reshape, simply use the reshape() method.

In [18]:
# 2D Reshape
A = A.reshape(2,6)
A

array([[0.86256217, 0.5579376 , 0.23560901, 0.92887837, 0.58490036,
        0.78588328],
       [0.85494572, 0.48587739, 0.36853283, 0.23802236, 0.30410236,
        0.8674434 ]])

In [19]:
# 1D Reshape
A = A.ravel()
A

array([0.86256217, 0.5579376 , 0.23560901, 0.92887837, 0.58490036,
       0.78588328, 0.85494572, 0.48587739, 0.36853283, 0.23802236,
       0.30410236, 0.8674434 ])

In [20]:
# 3D Reshape
A = A.reshape(2, 2, 3)
A

array([[[0.86256217, 0.5579376 , 0.23560901],
        [0.92887837, 0.58490036, 0.78588328]],

       [[0.85494572, 0.48587739, 0.36853283],
        [0.23802236, 0.30410236, 0.8674434 ]]])
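
Two details about reshape() are worth a quick sketch: one dimension can be given as -1 so that Numpy infers it from the total number of elements, and a reshape fails if the element counts do not match. The array name `V` is only illustrative.

```python
import numpy as np

V = np.arange(12)  # 12 elements

# -1 lets Numpy work out the missing dimension.
two_d = V.reshape(3, -1)
three_d = V.reshape(-1, 2, 3)
print(two_d.shape)    # (3, 4)
print(three_d.shape)  # (2, 2, 3)

# 5 * 3 = 15 does not equal 12, so this reshape is rejected.
reshape_failed = False
try:
    V.reshape(5, 3)
except ValueError as err:
    reshape_failed = True
    print("Cannot reshape:", err)
```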

There are many, many more things we can do with numpy. However, we will mostly be using these methods throughout the course. Hope you had fun! See you in the next meeting :)