# Introduction to Python  

# Introduction to [Numpy](https://numpy.org/)

Why use NumPy?

NumPy arrays are typically faster and more compact than Python lists: the elements are stored contiguously in memory and all share a single data type, so an array uses less memory than a list holding the same values.  
NumPy also lets you specify the data type explicitly, which allows the code to be optimized even further.
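
As a rough illustration of the memory claim, the sketch below compares a Python list of integers with NumPy arrays holding the same values. The names `py_list`, `arr64` and `arr16` are illustrative, and the byte count for the list depends on the Python build, so only the array sizes are stated.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr64 = np.arange(n, dtype=np.int64)   # 8 bytes per element
arr16 = np.arange(n, dtype=np.int16)   # 2 bytes per element

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

print(list_bytes)     # platform dependent
print(arr64.nbytes)   # 8000
print(arr16.nbytes)   # 2000
```

Choosing a smaller dtype only makes sense when every value fits in it; `np.int16` covers -32768 to 32767.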

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
#dir(np)

## 1. Numpy Basic Objects - Array

The array is the central data structure of the NumPy library. It is a grid of values, together with information about the raw data, how to locate an element, and how to interpret an element. The grid can be indexed in various ways, and all elements are of the same type, referred to as the dtype of the array.  


<img src="https://vitalflux.com/wp-content/uploads/2020/09/Screenshot-2020-09-14-at-2.19.00-PM.png" alt="Tensors" style="width:500px;height:300px;"> 


An array can be indexed by a tuple of nonnegative integers, by booleans, by another array, or by integers. 
+ The rank of the array is the number of dimensions.  
+ The shape of the array is a tuple of integers giving the size of the array along each dimension.  
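
The indexing styles and the rank and shape attributes listed above can be sketched on a small array. The array `m` and its values are made up for illustration.

```python
import numpy as np

m = np.array([[10, 11, 12],
              [20, 21, 22],
              [30, 31, 32]])

print(m.ndim)    # 2 (the rank)
print(m.shape)   # (3, 3)

single = m[1, 2]             # tuple of integers: row 1, column 2
masked = m[m > 21]           # boolean mask: keeps entries where the test is True
rows = m[[0, 2]]             # integer list: selects rows 0 and 2
paired = m[[0, 2], [1, 1]]   # paired indices: elements (0, 1) and (2, 1)

print(single)    # 22
print(masked)    # [22 30 31 32]
print(rows.shape)  # (2, 3)
print(paired)    # [11 31]
```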

We can initialize NumPy arrays from Python lists, using nested lists for two-, or higher-dimensional data.  

In [3]:
my_numbers = [1,2,3,4]
simple_array = np.array(my_numbers)
simple_array

array([1, 2, 3, 4])

In [4]:
type(simple_array)

numpy.ndarray

In [5]:
#dir(simple_array)

In [6]:
print(simple_array.shape)
print(simple_array.size)
print(simple_array.ndim)

(4,)
4
1


In [7]:
my_other_numbers = [[1,2,3,4],[4,5,6,7],[8,9,0,1]]
other_simple_array = np.array(my_other_numbers)
other_simple_array

array([[1, 2, 3, 4],
       [4, 5, 6, 7],
       [8, 9, 0, 1]])

In [8]:
print(other_simple_array.shape)
print(other_simple_array.size)
print(other_simple_array.ndim)

(3, 4)
12
2


### 1.1 - Operations between arrays (scalar and vectorial)

In [9]:
A = np.array([[1,2,3],[4,5,6],[8,9,0]])
B = np.array([[2,1,5],[9,2,1],[8,7,6]])

In [10]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [8, 9, 0]])

In [11]:
B

array([[2, 1, 5],
       [9, 2, 1],
       [8, 7, 6]])

In [12]:
A + 2

array([[ 3,  4,  5],
       [ 6,  7,  8],
       [10, 11,  2]])

In [13]:
A * 3

array([[ 3,  6,  9],
       [12, 15, 18],
       [24, 27,  0]])

In [14]:
A / 4

array([[0.25, 0.5 , 0.75],
       [1.  , 1.25, 1.5 ],
       [2.  , 2.25, 0.  ]])

In [15]:
A + B

array([[ 3,  3,  8],
       [13,  7,  7],
       [16, 16,  6]])

In [16]:
A * B

array([[ 2,  2, 15],
       [36, 10,  6],
       [64, 63,  0]])

In [17]:
A.dot(B)

array([[ 44,  26,  25],
       [101,  56,  61],
       [ 97,  26,  49]])
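
The difference between the elementwise product `A * B` and the matrix product `A.dot(B)` is easy to miss. A minimal sketch, assuming Python 3.5 or later so that the `@` operator is available as an alternative spelling of the matrix product:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [8, 9, 0]])
B = np.array([[2, 1, 5], [9, 2, 1], [8, 7, 6]])

elementwise = A * B    # multiplies matching entries
matrix_prod = A @ B    # rows of A times columns of B

print(elementwise[0])  # [ 2  2 15]
print(matrix_prod[0])  # [44 26 25]
print(np.array_equal(matrix_prod, A.dot(B)))   # True
```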

In [18]:
A.T
#A.transpose()

array([[1, 4, 8],
       [2, 5, 9],
       [3, 6, 0]])

In [19]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [8, 9, 0]])

### 1.2 - Creating specific arrays

+ arange
+ linspace
+ logspace
+ zeros
+ ones
+ empty
+ identity
+ eye

In [20]:
a = np.arange(20)  # ==> np.arange(0,20,1)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

In [21]:
a = np.arange(1,5,0.2)
a

array([1. , 1.2, 1.4, 1.6, 1.8, 2. , 2.2, 2.4, 2.6, 2.8, 3. , 3.2, 3.4,
       3.6, 3.8, 4. , 4.2, 4.4, 4.6, 4.8])

In [22]:
b = np.linspace(1,10,30)
#b = np.linspace(1,2*np.pi,50)
b

array([ 1.        ,  1.31034483,  1.62068966,  1.93103448,  2.24137931,
        2.55172414,  2.86206897,  3.17241379,  3.48275862,  3.79310345,
        4.10344828,  4.4137931 ,  4.72413793,  5.03448276,  5.34482759,
        5.65517241,  5.96551724,  6.27586207,  6.5862069 ,  6.89655172,
        7.20689655,  7.51724138,  7.82758621,  8.13793103,  8.44827586,
        8.75862069,  9.06896552,  9.37931034,  9.68965517, 10.        ])

In [23]:
b2 = np.logspace(1,100,30)
b2

array([1.00000000e+001, 2.59294380e+004, 6.72335754e+007, 1.74332882e+011,
       4.52035366e+014, 1.17210230e+018, 3.03919538e+021, 7.88046282e+024,
       2.04335972e+028, 5.29831691e+031, 1.37382380e+035, 3.56224789e+038,
       9.23670857e+041, 2.39502662e+045, 6.21016942e+048, 1.61026203e+052,
       4.17531894e+055, 1.08263673e+059, 2.80721620e+062, 7.27895384e+065,
       1.88739182e+069, 4.89390092e+072, 1.26896100e+076, 3.29034456e+079,
       8.53167852e+082, 2.21221629e+086, 5.73615251e+089, 1.48735211e+093,
       3.85662042e+096, 1.00000000e+100])
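
The three range functions take their arguments differently, which explains the very large values above: `logspace` treats its start and stop as exponents. A small sketch with made-up ranges:

```python
import numpy as np

# arange: the step is given and the stop value is excluded.
r = np.arange(0, 10, 2)
print(r.tolist())   # [0, 2, 4, 6, 8]

# linspace: the number of points is given and the stop value is included.
l = np.linspace(0, 1, 5)
print(l.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]

# logspace: start and stop are exponents of the base (10 by default),
# so this spans 10**0 to 10**3.
g = np.logspace(0, 3, 4)
print(g.tolist())   # [1.0, 10.0, 100.0, 1000.0]
```

With a non-integer step, `linspace` is usually the safer choice, because the number of elements `arange` returns can be affected by floating-point rounding.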

In [24]:
a1 = np.zeros((3,4))
print(a1)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


In [25]:
a2 = np.ones((2,2))
print(a2)

[[1. 1.]
 [1. 1.]]


In [26]:
a3 = np.empty((2,3))
print(a3)

[[4.65975970e-310 0.00000000e+000 6.93715209e-310]
 [            nan 6.09114680e+247 2.64988877e+180]]
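
The values printed by `np.empty` are whatever happened to be in memory, so they differ from run to run and should never be read before being assigned. A minimal sketch of the intended usage, with `np.full` as an alternative when a known fill value is wanted:

```python
import numpy as np

buf = np.empty((2, 3))   # contents are arbitrary until assigned
buf[:] = 7.0             # overwrite every element before using it

filled = np.full((2, 3), 7.0)   # allocate and initialize in one step
print(np.array_equal(buf, filled))   # True
```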


In [27]:
a4 = np.identity(3)
print(a4)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [28]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

### 1.3 - Modifying Dimensions

In [29]:
c = np.arange(10)
c

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [30]:
d = c.reshape(2,5)
#d = np.arange(10).reshape(2,5)
d

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [31]:
print(c.shape)
print(d.shape)
print(np.ndim(d))
print(d.dtype.name)

(10,)
(2, 5)
2
int64
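
Two details of `reshape` are worth knowing: one dimension can be given as `-1` and inferred, and for a contiguous array such as `np.arange(10)` the result is a view that shares data with the original. A short sketch:

```python
import numpy as np

c = np.arange(10)
d = c.reshape(2, -1)      # -1 lets NumPy infer the missing size (here 5)
print(d.shape)            # (2, 5)

# For a contiguous array like this one, reshape returns a view.
print(np.shares_memory(c, d))   # True
d[0, 0] = 99
print(c[0])               # 99, because c and d share the same data

e = c.reshape(2, 5).copy()      # an independent copy
e[0, 1] = -1
print(c[1])               # 1, unchanged
```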


In [32]:
d2 = np.arange(100).reshape(2,10,5)
d2

array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49]],

       [[50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59],
        [60, 61, 62, 63, 64],
        [65, 66, 67, 68, 69],
        [70, 71, 72, 73, 74],
        [75, 76, 77, 78, 79],
        [80, 81, 82, 83, 84],
        [85, 86, 87, 88, 89],
        [90, 91, 92, 93, 94],
        [95, 96, 97, 98, 99]]])

In [33]:
d2.ndim

3

In [34]:
d2.shape

(2, 10, 5)

### 1.4 - Slicing multidimensional arrays

In [35]:
d2

array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49]],

       [[50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59],
        [60, 61, 62, 63, 64],
        [65, 66, 67, 68, 69],
        [70, 71, 72, 73, 74],
        [75, 76, 77, 78, 79],
        [80, 81, 82, 83, 84],
        [85, 86, 87, 88, 89],
        [90, 91, 92, 93, 94],
        [95, 96, 97, 98, 99]]])

In [36]:
d2[0:1,4:6,1:3]

array([[[21, 22],
        [26, 27]]])

In [37]:
d2[d2%2==0]
print(np.ndim(d2[d2%2==0]))

1


In [38]:
#np.mask_indices?

In [39]:
d2[~(d2%2==0)]  # negation of the condition; parentheses are needed because ~ binds more tightly than % and ==

array([ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33,
       35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67,
       69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99])
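
Operator precedence matters when negating a mask: `~` binds more tightly than `%` and `==`, so an expression written as `~d2 % 2 == 0` is evaluated as `((~d2) % 2) == 0`. The sketch below shows two unambiguous ways to select the odd values, and how to combine conditions.

```python
import numpy as np

d2 = np.arange(100).reshape(2, 10, 5)

even_mask = d2 % 2 == 0
odds_a = d2[~even_mask]        # negate the whole boolean mask
odds_b = d2[d2 % 2 != 0]       # or state the opposite condition directly

print(np.array_equal(odds_a, odds_b))   # True
print(odds_a[:5])                        # [1 3 5 7 9]

# Combine conditions with & and |, wrapping each one in parentheses.
sel = d2[(d2 % 2 == 0) & (d2 > 90)]
print(sel)                               # [92 94 96 98]
```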

### 1.5 - Stacking and Concatenating Arrays

In [40]:
a = np.arange(16).reshape(4,4)
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [41]:
np.vstack([a,np.arange(4).reshape(1,4)])

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [ 0,  1,  2,  3]])

In [42]:
np.hstack([a,np.arange(4).reshape(4,1)])

array([[ 0,  1,  2,  3,  0],
       [ 4,  5,  6,  7,  1],
       [ 8,  9, 10, 11,  2],
       [12, 13, 14, 15,  3]])

You can also use the generic np.stack:

In [43]:
a1 = np.array([1, 2, 3, 4])
a2 = np.array([5, 6, 7, 8])

In [44]:
np.stack((a1,a2), axis=0)

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [45]:
np.stack((a1,a2), axis=1)

array([[1, 5],
       [2, 6],
       [3, 7],
       [4, 8]])

In [46]:
np.concatenate((a1,a2), axis=0)

array([1, 2, 3, 4, 5, 6, 7, 8])
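
The key difference between the two functions is that `np.stack` joins arrays along a new axis, while `np.concatenate` extends an existing one. Comparing the result shapes makes this visible:

```python
import numpy as np

a1 = np.array([1, 2, 3, 4])
a2 = np.array([5, 6, 7, 8])

s0 = np.stack((a1, a2), axis=0)
s1 = np.stack((a1, a2), axis=1)
cat = np.concatenate((a1, a2), axis=0)

print(s0.shape)    # (2, 4): new leading axis
print(s1.shape)    # (4, 2): new trailing axis
print(cat.shape)   # (8,): the existing axis is extended

# For 1-D inputs, vstack matches stack(axis=0) and hstack matches concatenate.
print(np.array_equal(np.vstack((a1, a2)), s0))    # True
print(np.array_equal(np.hstack((a1, a2)), cat))   # True
```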

### 1.6 - Sorting array data:

In [47]:
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])
arr

array([2, 1, 5, 3, 7, 4, 6, 8])

In [48]:
np.sort(arr)

array([1, 2, 3, 4, 5, 6, 7, 8])
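
`np.sort` returns a sorted copy and leaves its input unchanged, whereas the `sort` method sorts in place. The sketch below also shows descending order and sorting along an axis; the array `grid` is made up for illustration.

```python
import numpy as np

arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

descending = np.sort(arr)[::-1]   # reverse a sorted copy
print(descending)                 # [8 7 6 5 4 3 2 1]
print(arr[0])                     # 2, the original is untouched

grid = np.array([[3, 1, 2],
                 [9, 7, 8]])
by_row = np.sort(grid, axis=1)    # sorts within each row
print(by_row.tolist())            # [[1, 2, 3], [7, 8, 9]]

arr.sort()                        # the method sorts in place
print(arr)                        # [1 2 3 4 5 6 7 8]
```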

***

## 2. Numpy Basic Objects - Matrix

In [49]:
a = np.array([[1,2.],[4,3]])
b = np.array([[1,9],[7,5]])

In [50]:
a

array([[1., 2.],
       [4., 3.]])

In [51]:
b

array([[1, 9],
       [7, 5]])

In [52]:
#dir(a)

In [53]:
a * b

array([[ 1., 18.],
       [28., 15.]])

In [54]:
A = np.matrix(a)
B = np.matrix(b)

In [55]:
B

matrix([[1, 9],
        [7, 5]])

In [56]:
#dir(B)

In [57]:
B.I

matrix([[-0.0862069 ,  0.15517241],
        [ 0.12068966, -0.01724138]])

In [58]:
A * B

matrix([[15., 19.],
        [25., 51.]])
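
The NumPy documentation no longer recommends `np.matrix` for new code, and the same results are available with plain arrays. A minimal sketch of the equivalent operations, using `@` for the matrix product and `np.linalg.inv` as the counterpart of `B.I`:

```python
import numpy as np

a = np.array([[1, 2.], [4, 3]])
b = np.array([[1, 9], [7, 5]])

prod = a @ b                 # matrix product on ndarrays
print(prod.tolist())         # [[15.0, 19.0], [25.0, 51.0]]

b_inv = np.linalg.inv(b)     # inverse of b
print(np.allclose(b @ b_inv, np.eye(2)))   # True
print(type(prod))            # <class 'numpy.ndarray'>
```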

In [59]:
print(type(a))
print(type(A))
print(type(a * b))
print(type(A * B))
print(type(a * B))

<class 'numpy.ndarray'>
<class 'numpy.matrix'>
<class 'numpy.ndarray'>
<class 'numpy.matrix'>
<class 'numpy.matrix'>


## 3. Numpy Datatypes

In [60]:
x = np.array([1, 2])   # Let numpy choose the datatype
print(x.dtype)         # Prints "int64"

int64


In [61]:
x = np.array([1.0, 2.0])   # Let numpy choose the datatype
print(x.dtype)             # Prints "float64"

float64


In [62]:
x = np.array([1, 2], dtype=np.float64)   # Force a particular datatype
print(x.dtype)                           # Prints "float64"

float64
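
An existing array can also be converted to another datatype with `astype`. The sketch below uses made-up values to show that converting floats to integers truncates toward zero, and that the element size changes with the dtype.

```python
import numpy as np

x = np.array([1.7, 2.2, -3.9])
print(x.dtype, x.itemsize)        # float64 8

xi = x.astype(np.int32)           # conversion truncates toward zero
print(xi)                         # [ 1  2 -3]
print(xi.dtype, xi.itemsize)      # int32 4

# Mixing an integer array with a float promotes the result to float.
promoted = xi + 0.5
print(promoted.dtype)             # float64
```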


## 4. Array Math

In [63]:
a = np.array([[1,2.],[4,3]])
a

array([[1., 2.],
       [4., 3.]])

### 4.1 - Scalar operations

In [64]:
a * 2

array([[2., 4.],
       [8., 6.]])

### 4.2 - Array Methods

In [65]:
a.cumsum()

array([ 1.,  3.,  7., 10.])

#### The original array stays the same

In [66]:
a

array([[1., 2.],
       [4., 3.]])
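
Without arguments, `cumsum` works on the flattened array. Passing `axis` accumulates along one dimension instead, as this short sketch shows:

```python
import numpy as np

a = np.array([[1, 2.], [4, 3]])

flat = a.cumsum()            # flattened, row by row
down = a.cumsum(axis=0)      # running total down each column
across = a.cumsum(axis=1)    # running total along each row

print(flat.tolist())     # [1.0, 3.0, 7.0, 10.0]
print(down.tolist())     # [[1.0, 2.0], [5.0, 5.0]]
print(across.tolist())   # [[1.0, 3.0], [4.0, 7.0]]
```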

### 4.3 - Stacking

In [67]:
a = np.arange(16).reshape(4,4)
b = np.arange(4).reshape(1,4)
c = np.arange(4).reshape(4,1)

np.vstack([a,b])

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [ 0,  1,  2,  3]])

In [68]:
np.hstack([a,c])

array([[ 0,  1,  2,  3,  0],
       [ 4,  5,  6,  7,  1],
       [ 8,  9, 10, 11,  2],
       [12, 13, 14, 15,  3]])

### 4.4 - Equivalence: operators and functions

In [69]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

Elementwise sum; both produce the array  
[[ 6.0  8.0]  
 [10.0 12.0]]  

In [70]:
print(x + y)
print(np.add(x, y))

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]


Elementwise difference; both produce the array  
[[-4.0 -4.0]  
 [-4.0 -4.0]]  

In [71]:
print(x - y)
print(np.subtract(x, y))

[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


Elementwise product; both produce the array  
[[ 5.0 12.0]  
 [21.0 32.0]]  

In [72]:
print(x * y)
print(np.multiply(x, y))

[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]


Elementwise division; both produce the array  
[[ 0.2         0.33333333]  
 [ 0.42857143  0.5       ]]  

In [73]:
print(x / y)
print(np.divide(x, y))

[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]


Elementwise square root; produces the array  
[[ 1.          1.41421356]  
 [ 1.73205081  2.        ]]  

In [74]:
print(np.sqrt(x))

[[1.         1.41421356]
 [1.73205081 2.        ]]


### 4.5 - Inner product

In [75]:
x = np.matrix([[1,2],[3,4]])
y = np.matrix([[5,6],[7,8]])
v = np.array([9,10])
w = np.array([11,12])

In [76]:
print(x)

[[1 2]
 [3 4]]


In [77]:
print(y)

[[5 6]
 [7 8]]


In [78]:
print(v)

[ 9 10]


In [79]:
print(w)

[11 12]


Inner product of Arrays; both produce 219

In [80]:
print(v.dot(w), '\n')
print(np.dot(v, w))

219 

219


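Matrix / Array product. Because x is an `np.matrix`, both produce the 1 x 2 matrix [[29 67]]; with a plain ndarray in place of x, the result would be the rank 1 array [29 67]  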

In [81]:
print(x.dot(v), '\n')
print(np.dot(x, v))

[[29 67]] 

[[29 67]]
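
For comparison, the sketch below computes the same product once with a plain `ndarray` and once with `np.matrix`. The names `x_arr` and `x_mat` are introduced here only to keep the two versions apart.

```python
import numpy as np

x_arr = np.array([[1, 2], [3, 4]])
x_mat = np.matrix([[1, 2], [3, 4]])
v = np.array([9, 10])

r_arr = x_arr.dot(v)   # ndarray result: rank 1
r_mat = x_mat.dot(v)   # matrix result: always 2-D

print(r_arr, r_arr.shape)   # [29 67] (2,)
print(r_mat, r_mat.shape)   # [[29 67]] (1, 2)
```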


Matrix / matrix product; both produce the rank 2 result  
[[19 22]  
 [43 50]]  

In [82]:
print(x.dot(y), '\n')
print(np.dot(x, y))

[[19 22]
 [43 50]] 

[[19 22]
 [43 50]]


### 4.6 - Row-wise and Column-wise operations

In [83]:
x = np.array([[1,2],[3,4]])
print(x)

[[1 2]
 [3 4]]


Compute sum of all elements; prints "10"

In [84]:
print(np.sum(x), '\n')

10 



Compute sum of each column; prints "[4 6]"

In [85]:
print(np.sum(x, axis=0))  

[4 6]


Compute sum of each row; prints "[3 7]"

In [86]:
print(np.sum(x, axis=1)) 

[3 7]
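
Reductions along an axis drop that axis from the result. Passing `keepdims=True` keeps it with size 1, which is convenient when the result is combined with the original array again, for example to normalize each row. A minimal sketch:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

row_sums = np.sum(x, axis=1, keepdims=True)   # shape (2, 1) instead of (2,)
print(row_sums.shape)                          # (2, 1)

# The kept axis lets each row be divided by its own sum.
row_fractions = x / row_sums
print(np.allclose(row_fractions.sum(axis=1), 1.0))   # True

col_means = x.mean(axis=0)                     # one mean per column
print(col_means.tolist())                      # [2.0, 3.0]
```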


### 4.7 - Transposing

In [87]:
print(x, '\n')
print(x.T)

[[1 2]
 [3 4]] 

[[1 3]
 [2 4]]


## 5 - Vectorized Operations and Broadcasting

In [88]:
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # Create an uninitialized array with the same shape and dtype as x

In [89]:
print(x)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [90]:
print(v)

[1 0 1]


In [91]:
print(y)

[[0 0 0]
 [0 0 0]
 [0 0 0]
 [0 0 0]]


We will add the vector v to each row of the matrix x, storing the result in the matrix y

In [92]:
%%time

for i in range(4):
    y[i, :] = x[i, :] + v
print(y)

[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]
CPU times: user 207 µs, sys: 66 µs, total: 273 µs
Wall time: 240 µs


This works; however, when the matrix x is very large, an explicit loop in Python can be slow. Note that adding the vector v to each row of the matrix x is equivalent to forming a matrix vv by stacking multiple copies of v vertically, then performing elementwise summation of x and vv. We could implement this approach like this:

In [93]:
vv = np.tile(v, (4, 1))   # Stack 4 copies of v on top of each other
print(vv)

[[1 0 1]
 [1 0 1]
 [1 0 1]
 [1 0 1]]


Add x and vv elementwise

In [94]:
y = x + vv  
print(y)

[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


Numpy broadcasting allows us to perform this computation without actually creating multiple copies of v. Consider this version, using broadcasting:

In [95]:
%%time

y = x + v  # Add v to each row of x using broadcasting
print(y)

[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]
CPU times: user 157 µs, sys: 49 µs, total: 206 µs
Wall time: 201 µs


The line y = x + v works even though x has shape (4, 3) and v has shape (3,) due to broadcasting; this line works as if v actually had shape (4, 3), where each row was a copy of v, and the sum was performed elementwise.
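
Broadcasting aligns shapes from the right, so adding one value per row instead of one per column needs an extra axis. The sketch below assumes a hypothetical vector `w` with one entry per row of x, and also uses `np.broadcast_to` to show the virtual stacked array without copying data.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
v = np.array([1, 0, 1])           # shape (3,): one value per column
w = np.array([10, 20, 30, 40])    # shape (4,): one value per row (hypothetical)

# (4, 3) + (3,) aligns directly.
print((x + v).shape)              # (4, 3)

# (4, 3) + (4,) does not align; give w a trailing axis of size 1 first.
col = w[:, np.newaxis]            # shape (4, 1)
by_row = x + col
print(by_row[0])                  # [11 12 13]

# broadcast_to exposes the stacked copies of v as a read-only view.
vv = np.broadcast_to(v, x.shape)
print(np.array_equal(x + v, x + vv))   # True
```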

#### More on Broadcasting:

In [96]:
x = np.ones((3,4))
y = np.random.random((5,1,4))

In [97]:
print(x)
print(x.shape)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
(3, 4)


In [98]:
print(y)
print(y.shape)

[[[0.70015144 0.39863124 0.46129372 0.19509156]]

 [[0.43304475 0.36417879 0.98433437 0.88063144]]

 [[0.70198761 0.98933084 0.22090772 0.86894756]]

 [[0.45433769 0.3976259  0.85585213 0.95674322]]

 [[0.44146603 0.99240751 0.0165041  0.8475694 ]]]
(5, 1, 4)


Add `x` and `y`

In [99]:
print(x + y)

[[[1.70015144 1.39863124 1.46129372 1.19509156]
  [1.70015144 1.39863124 1.46129372 1.19509156]
  [1.70015144 1.39863124 1.46129372 1.19509156]]

 [[1.43304475 1.36417879 1.98433437 1.88063144]
  [1.43304475 1.36417879 1.98433437 1.88063144]
  [1.43304475 1.36417879 1.98433437 1.88063144]]

 [[1.70198761 1.98933084 1.22090772 1.86894756]
  [1.70198761 1.98933084 1.22090772 1.86894756]
  [1.70198761 1.98933084 1.22090772 1.86894756]]

 [[1.45433769 1.3976259  1.85585213 1.95674322]
  [1.45433769 1.3976259  1.85585213 1.95674322]
  [1.45433769 1.3976259  1.85585213 1.95674322]]

 [[1.44146603 1.99240751 1.0165041  1.8475694 ]
  [1.44146603 1.99240751 1.0165041  1.8475694 ]
  [1.44146603 1.99240751 1.0165041  1.8475694 ]]]


Even though x and y have different numbers of dimensions, the two can be added together.  
That is because their shapes are compatible in every dimension:

    Array x has dimensions     3 X 4,
    Array y has dimensions 5 X 1 X 4

Shapes are compared starting from the last dimension. Two dimensions are compatible if they are equal or if one of them is 1, and a missing leading dimension is treated as 1, so x behaves as if it had shape 1 X 3 X 4. These two arrays are therefore good candidates for broadcasting.  

In the dimension where y has size 1 and x has size 3, y behaves as if it were copied along that dimension. Likewise, x behaves as if it were copied 5 times along the leading dimension.  

The shape of the resulting array is the maximum size along each dimension of x and y: here the result has shape (5,3,4).  

In short, whether broadcasting applies depends entirely on the shapes and dimensions of the arrays you are working with.  
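
The result shape can be checked without performing the addition, and incompatible shapes raise a `ValueError`. A small sketch using `np.broadcast`:

```python
import numpy as np

x = np.ones((3, 4))
y = np.ones((5, 1, 4))

# x is treated as shape (1, 3, 4); sizes are compared from the right:
#   4 vs 4 -> 4,  3 vs 1 -> 3,  1 vs 5 -> 5
result_shape = np.broadcast(x, y).shape
print(result_shape)               # (5, 3, 4)
print((x + y).shape)              # (5, 3, 4)

# Shapes (3, 4) and (3,) are incompatible: trailing sizes 4 and 3 differ.
try:
    x + np.ones(3)
    compatible = True
except ValueError:
    compatible = False
print(compatible)                 # False
```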

## 6 - Other useful functions:

In [100]:
grades1 = np.array([0,3,5,7,9,2,4,6])
grades2 = np.array([0,3,4.9,7,9,4,4,6])

In [101]:
grades1

array([0, 3, 5, 7, 9, 2, 4, 6])

In [102]:
grades2

array([0. , 3. , 4.9, 7. , 9. , 4. , 4. , 6. ])

In [103]:
np.where(grades1 > 4)

(array([2, 3, 4, 7]),)

In [104]:
np.where(grades1 > 4, 'bigger', 'lower')

array(['lower', 'lower', 'bigger', 'bigger', 'bigger', 'lower', 'lower',
       'bigger'], dtype='<U6')

In [105]:
grades1.argmin() #Position of min

0

In [106]:
grades1.argmax() #Position of max

4

In [107]:
grades1.argsort()

array([0, 5, 1, 6, 2, 7, 3, 4])
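
The indices returned by `argsort` and `np.where` are mostly useful for reordering or selecting from a second, related array. In the sketch below, `names` is a hypothetical array of labels with one entry per grade.

```python
import numpy as np

grades = np.array([0, 3, 5, 7, 9, 2, 4, 6])
names = np.array(list("ABCDEFGH"))   # hypothetical labels, one per grade

order = grades.argsort()
print(grades[order])                 # [0 2 3 4 5 6 7 9]

top3 = names[order][-3:]             # labels of the three highest grades
print(top3.tolist())                 # ['H', 'D', 'E']

passed = names[np.where(grades > 4)] # same selection as names[grades > 4]
print(passed.tolist())               # ['C', 'D', 'E', 'H']
```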

In [108]:
np.intersect1d(grades1,grades2)

array([0., 3., 4., 6., 7., 9.])

In [109]:
np.allclose(grades1, grades2, rtol=0.1)  # Returns True if two arrays are element-wise equal within a tolerance; the third positional argument is the relative tolerance rtol

False

In [110]:
np.allclose(grades1, grades2, rtol=0.5)  # Test applied: absolute(a - b) <= (atol + rtol * absolute(b))

True

In [111]:
np.isclose(grades1, grades2, rtol=0.1)  # Element-wise version of allclose

array([ True,  True,  True,  True,  True, False,  True,  True])
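
The comparison used by `allclose` and `isclose` is `absolute(a - b) <= atol + rtol * absolute(b)`, so the relative part scales with the values being compared while the absolute part is a fixed margin. A minimal sketch with two made-up pairs of values:

```python
import numpy as np

a = np.array([5.0, 2.0])
b = np.array([4.9, 4.0])

# Relative tolerance scales with |b|: 0.1 * 4.9 = 0.49 and 0.1 * 4.0 = 0.4.
rel = np.isclose(a, b, rtol=0.1)
print(rel.tolist())          # [True, False]

# Absolute tolerance is a fixed margin, independent of the magnitudes.
tight = np.isclose(a, b, rtol=0, atol=0.15)
loose = np.isclose(a, b, rtol=0, atol=2.0)
print(tight.tolist())        # [True, False]
print(loose.tolist())        # [True, True]
```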

In [112]:
np.equal(grades1, grades2)

array([ True,  True, False,  True,  True, False,  True,  True])

In [113]:
np.any(grades1)   # Test whether any array element along a given axis evaluates to True.

True

In [114]:
np.all(grades1) # Test whether all array elements along a given axis evaluate to True.

False