# Recalled by Hariz

# What is NumPy?

NumPy (short for Numerical Python) is the fundamental package for scientific computing in Python. It supports large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions that operate on these arrays.

# Why use NumPy?

- Faster than Python lists for numerical work
- Supports basic matrix operations
- Supports statistical operations
- Supports broadcasting
- Supports vectorization

## Get started with NumPy

In [1]:
# Let's import the NumPy library and create our first NumPy array.
import numpy as np

In [4]:
# create 1-D arrays
my_arr = np.array([1, 2, 3, 4, 5])
print(my_arr)
print(type(my_arr))
print(my_arr.shape)
print(my_arr.size)
print(my_arr.ndim)
# ndim is an ndarray attribute that gives the number of dimensions (axes) of the array

[1 2 3 4 5]
<class 'numpy.ndarray'>
(5,)
5
1


In [7]:
# create 2-D arrays
my_arr2D = np.array([[1, 2, 3], [4, 5, 6]])
print(my_arr2D)
print(type(my_arr2D))
print(my_arr2D.shape)
print(my_arr2D.size)
print(my_arr2D.ndim)

[[1 2 3]
 [4 5 6]]
<class 'numpy.ndarray'>
(2, 3)
6
2


In [8]:
# create 3-D arrays
my_arr3D = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(my_arr3D)
print(type(my_arr3D))
print(my_arr3D.shape)
print(my_arr3D.size)
print(my_arr3D.ndim)

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]
<class 'numpy.ndarray'>
(2, 2, 3)
12
3
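
The three attributes printed above are related: `ndim` is the length of `shape`, and `size` is the product of the entries in `shape`. A minimal sketch that checks this for the 3-D array above (the variable names `ndim_matches` and `size_matches` are illustrative):

```python
import numpy as np

# Same 3-D array as in the cell above
my_arr3D = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

# ndim is the number of axes, i.e. the length of the shape tuple
ndim_matches = my_arr3D.ndim == len(my_arr3D.shape)

# size is the total number of elements, i.e. the product of the shape entries
size_matches = my_arr3D.size == int(np.prod(my_arr3D.shape))

print(ndim_matches, size_matches)
```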


## Indexing

Indexing a NumPy array works much like indexing a Python list.

In [13]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [10]:
a[0]

0

In [11]:
a[-1]

9

In [12]:
a[-5] = 50
a

array([ 0,  1,  2,  3,  4, 50,  6,  7,  8,  9])
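
Indexing extends to more than one dimension by giving one index per axis, separated by commas. A small sketch, using a hypothetical 2-D array `grid` that is not part of the original notebook:

```python
import numpy as np

# Hypothetical 2-D array for illustration: 3 rows, 4 columns
grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

first_row = grid[0]          # a single index selects along the first axis (a row)
element = grid[1, 2]         # row 1, column 2
last_element = grid[-1, -1]  # negative indices count from the end on each axis

print(first_row, element, last_element)
```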

## Slicing

var[lower:upper:step]

- Lower boundary is inclusive
- Upper boundary is exclusive
- Step specifies the stride between elements

In [14]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [15]:
a[1:3]

array([1, 2])

In [16]:
# negative indices do work
a[1:-1]

array([1, 2, 3, 4, 5, 6, 7, 8])

In [17]:
# omitting boundaries
a[:3]

array([0, 1, 2])

In [18]:
a[3:]

array([3, 4, 5, 6, 7, 8, 9])

In [19]:
# every other element (step of 2)
a[::2]

array([0, 2, 4, 6, 8])
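
One difference from Python lists is worth noting: a basic slice of a NumPy array is a view onto the same data, not a copy, so writing through the slice changes the original array. A minimal sketch (the names `view`, `independent` and `reversed_a` are illustrative):

```python
import numpy as np

a = np.arange(10)

view = a[2:5]        # basic slicing returns a view that shares memory with a
view[0] = 99         # this also changes a[2]

independent = a[2:5].copy()  # .copy() makes a separate array
independent[1] = -1          # a is unaffected by this assignment

reversed_a = a[::-1]  # a negative step walks the array backwards

print(a)
print(reversed_a)
```

When an independent array is needed, calling `.copy()` on the slice avoids accidental changes to the source.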

## Fancy Indexing

In [21]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [22]:
mask = a > 5
mask

array([False, False, False, False, False, False,  True,  True,  True,
        True])

In [23]:
a[mask]

array([6, 7, 8, 9])

In [24]:
a[a%2==0]

array([0, 2, 4, 6, 8])
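
The cells above index with a boolean mask. Indexing with a list or array of integer positions is the other common form of fancy indexing. A short sketch of both, with a hypothetical `positions` list; masks can be combined with `&` and `|`, as long as each comparison is wrapped in parentheses:

```python
import numpy as np

a = np.arange(10)

# Integer-array indexing: pick elements by position, in any order, with repeats
positions = [0, 3, 3, -1]  # hypothetical positions chosen for illustration
picked = a[positions]

# Combined boolean mask: even numbers greater than 5
combined = a[(a > 5) & (a % 2 == 0)]

# Unlike basic slicing, fancy indexing returns a copy, not a view
picked[0] = 100

print(picked, combined, a[0])
```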

## Element-wise operations

- Add
- Subtract
- Multiply
- Divide
- Square root
- Power
- Exponential
- Logarithm

In [25]:
a = np.arange(1,6)
a

array([1, 2, 3, 4, 5])

In [26]:
# Add
a + 5

array([ 6,  7,  8,  9, 10])

In [27]:
# Subtract
a - 5

array([-4, -3, -2, -1,  0])

In [28]:
# Multiply
a * 5

array([ 5, 10, 15, 20, 25])

In [29]:
# Divide
a / 5

array([0.2, 0.4, 0.6, 0.8, 1. ])

In [30]:
# Floor division (rounds the quotient down to an integer)
a // 5

array([0, 0, 0, 0, 1], dtype=int32)

In [31]:
# square root
np.sqrt(a)

array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798])

In [33]:
# power
a ** 5

array([   1,   32,  243, 1024, 3125], dtype=int32)

In [34]:
# exponential
np.exp(a)

array([  2.71828183,   7.3890561 ,  20.08553692,  54.59815003,
       148.4131591 ])

In [35]:
# logarithm
np.log(a)

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791])

In [36]:
# log10
np.log10(a)

array([0.        , 0.30103   , 0.47712125, 0.60205999, 0.69897   ])
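
The examples above combine an array with a scalar. The same operators also work element by element between two arrays of the same shape. A brief sketch, with a second array `b` introduced only for illustration:

```python
import numpy as np

a = np.arange(1, 6)                 # [1 2 3 4 5]
b = np.array([10, 20, 30, 40, 50])  # illustrative second operand

total = a + b      # element-wise sum
product = a * b    # element-wise product, not a matrix product
ratio = b / a      # true division always yields floats
floor = b // a     # floor division keeps the integer dtype

print(total, product, ratio, floor)
```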

## Matrix Operations

- Arithmetic
- Dot product
- Cross product
- Inverse
- Transpose

In [37]:
a = np.array([[1,2,3],[1,2,3]])
b = np.array([[4,5,6],[4,5,6]])
print(a)
print(b)

[[1 2 3]
 [1 2 3]]
[[4 5 6]
 [4 5 6]]


In [38]:
a+b

array([[5, 7, 9],
       [5, 7, 9]])

In [39]:
a-b

array([[-3, -3, -3],
       [-3, -3, -3]])

In [40]:
print(a.dot(b.T))
print(a @ b.T)

[[32 32]
 [32 32]]
[[32 32]
 [32 32]]


In [41]:
np.cross(a,b)

array([[-3,  6, -3],
       [-3,  6, -3]])

In [42]:
np.linalg.inv(np.array([[1,1,1],[0,2,5],[2,5,-1]]))

array([[ 1.28571429, -0.28571429, -0.14285714],
       [-0.47619048,  0.14285714,  0.23809524],
       [ 0.19047619,  0.14285714, -0.0952381 ]])

In [43]:
np.linalg.pinv(a)

array([[0.03571429, 0.03571429],
       [0.07142857, 0.07142857],
       [0.10714286, 0.10714286]])
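
A quick way to sanity-check an inverse is to multiply it by the original matrix and compare the result with the identity. The sketch below does that for the 3 x 3 matrix used above (named `m` here for convenience) and also shows the transpose and a basic property of the pseudo-inverse. `np.allclose` is used because floating-point results are rarely exact:

```python
import numpy as np

# Same matrix as in the np.linalg.inv cell; the name m is ours
m = np.array([[1, 1, 1], [0, 2, 5], [2, 5, -1]])
m_inv = np.linalg.inv(m)

# m @ m_inv should equal the identity matrix up to rounding error
is_identity = np.allclose(m @ m_inv, np.eye(3))

# Transpose swaps rows and columns: shape (2, 3) becomes (3, 2)
a = np.array([[1, 2, 3], [1, 2, 3]])
a_t = a.T

# For a non-square matrix, the pseudo-inverse satisfies a @ pinv(a) @ a == a
a_pinv = np.linalg.pinv(a)
pinv_ok = np.allclose(a @ a_pinv @ a, a)

print(is_identity, a_t.shape, pinv_ok)
```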

## Broadcasting

NumPy arrays with different numbers of dimensions can be combined in the same expression. Arrays with fewer dimensions are broadcast to match the larger arrays without copying the data.

In [44]:
a = np.array([0,10,20,30])
b = np.array([4,5,6])
a = a[:, np.newaxis]
print(a)
print(b)

[[ 0]
 [10]
 [20]
 [30]]
[4 5 6]


In [45]:
a+b

array([[ 4,  5,  6],
       [14, 15, 16],
       [24, 25, 26],
       [34, 35, 36]])

In [46]:
# Note: shapes are compared from the trailing axis backward; each pair of sizes must be equal, or one of them must be 1, for broadcasting to work
error_a = np.arange(25).reshape(5,5) # 5 x 5 
error_b = np.arange(24).reshape(6,4) # 6 x 4
print(error_a)
print(error_b)
print(error_a+error_b)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]


ValueError: operands could not be broadcast together with shapes (5,5) (6,4) 
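
Shape compatibility can be checked without building any arrays: `np.broadcast_shapes` (available in NumPy 1.20 and later) returns the resulting shape, or raises `ValueError` when the shapes are incompatible. A small sketch; the helper `can_broadcast` is a hypothetical name, not a NumPy function:

```python
import numpy as np

def can_broadcast(*shapes):
    """Return the broadcast shape, or None if the shapes are incompatible.

    Hypothetical helper for illustration; it wraps np.broadcast_shapes.
    """
    try:
        return np.broadcast_shapes(*shapes)
    except ValueError:
        return None

# (4, 1) and (3,): trailing sizes 1 and 3 are compatible because one of them is 1;
# the missing leading axis of (3,) is treated as size 1
ok_shape = can_broadcast((4, 1), (3,))

# (5, 5) and (6, 4): trailing sizes 5 and 4 differ and neither is 1
bad_shape = can_broadcast((5, 5), (6, 4))

print(ok_shape, bad_shape)
```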

## Vectorization

In [47]:
my_list = list(range(100000))
my_array = np.array(my_list)

In [48]:
%%timeit
new_list = [i**50  for i in my_list]

88.9 ms ± 2.92 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [49]:
%%timeit
# Caution: integer arrays overflow silently here, so the values differ from the list version
new_array = my_array**50

533 µs ± 12.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
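
The two timed cells do not compute the same values: Python integers have arbitrary precision, while a NumPy integer array has a fixed width and wraps around without warning when `i**50` exceeds it, which happens for almost every element. A fairer comparison uses an operation whose results fit in the dtype. The sketch below is one way to do that, assuming 64-bit integers and a square instead of the 50th power; it times both versions with the standard `timeit` module. Timings depend on the machine, so none are shown:

```python
import timeit
import numpy as np

my_list = list(range(100000))
my_array = np.array(my_list, dtype=np.int64)  # explicit 64-bit integers

# 99999**2 is about 1e10, which fits comfortably in int64
squared_list = [i**2 for i in my_list]
squared_array = my_array**2

# Both approaches now give identical values
same_values = squared_array.tolist() == squared_list

# Time each version; number=10 keeps the run short
list_time = timeit.timeit(lambda: [i**2 for i in my_list], number=10)
array_time = timeit.timeit(lambda: my_array**2, number=10)

print(same_values)
print(f"list: {list_time:.4f} s, array: {array_time:.4f} s")
```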
