![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)

<hr style="margin-bottom: 40px;">

<img src="https://user-images.githubusercontent.com/7065401/39118381-910eb0c2-46e9-11e8-81f1-a5b897401c23.jpeg"
    style="width:300px; float: right; margin: 0 40px 40px 40px;"></img>

# NumPy: Numeric computing library

NumPy (Numerical Python) is one of the core packages for numerical computing in Python. Pandas, Matplotlib, Statsmodels and many other scientific libraries rely on NumPy.

NumPy's major contributions are:

- Efficient numeric computation with C primitives
- Efficient collections with vectorized operations
- An integrated and natural Linear Algebra API
- A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.

Let's expand on efficiency. In Python, **everything is an object**, which means that even a simple int is an object, with all the machinery required to make objects work. We call these "boxed ints". In contrast, NumPy uses primitive numeric types (floats, ints), which makes both storage and computation efficient.


<img src="https://docs.google.com/drawings/d/e/2PACX-1vTkDtKYMUVdpfVb3TTpr_8rrVtpal2dOknUUEOu85wJ1RitzHHf5nsJqz1O0SnTt8BwgJjxXMYXyIqs/pub?w=726&h=396" />
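
To make the boxed-versus-primitive difference concrete, here is a minimal sketch comparing the approximate footprint of a Python list of ints with an equivalent NumPy array. The variable names and the element count of 1,000 are illustrative assumptions, and the byte counts for Python objects depend on the interpreter build, so treat the comparison as indicative rather than exact.

```python
import sys
import numpy as np

n = 1000  # illustrative size, not taken from the notebook

# A list stores pointers to separate int objects ("boxed" ints).
py_list = list(range(n))
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# A NumPy array stores the raw 8-byte values in one contiguous buffer.
np_array = np.arange(n, dtype=np.int64)
array_bytes = np_array.nbytes

print(f"list of {n} ints: ~{list_bytes} bytes")
print(f"int64 array of {n}: {array_bytes} bytes")
```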


![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Hands on!


In [1]:
import sys
import numpy as np

## Basic NumPy Arrays


In [2]:
np.array([1, 2, 3, 4])

array([1, 2, 3, 4])

In [9]:
a = np.array([10, 2, 3, 4])

In [10]:
b = np.array([0, .5, 1, 1.5, 2])

In [12]:
a[0], a[2]

(np.int64(10), np.int64(3))

In [7]:
# Print each element of a by looping over its indices
for i in range(a.shape[0]):
    print(a[i], end=" ")

10 2 3 4 

In [13]:
a[0:]

array([10,  2,  3,  4])

In [16]:
a[1:3]

array([2, 3])

In [15]:
a[1:-1]

array([2, 3])

In [14]:
a[::2] # Take every second element (step of 2)

array([10,  3])

In [17]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [18]:
b[0], b[2], b[-1]

(np.float64(0.0), np.float64(1.0), np.float64(2.0))

In [12]:
b[[0, 2, -1]]

array([0., 1., 2.])

In [19]:
# Multi-indexing
print(b[[0, 2, -1]]) # Pass a list containing the indices to select

[0. 1. 2.]


In [23]:
l = np.array([10, 20, 30, 40, 50, 50])
print(l[[0, 3]]) # Selecting multiple indices at once is possible with a NumPy array

[10 40]


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Array Types


In [24]:
a

array([10,  2,  3,  4])

In [25]:
a.dtype

dtype('int64')

In [26]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [27]:
b.dtype

dtype('float64')

In [34]:
# Converting int array to float array
i = np.array([1, 2, 3, 4], dtype=np.float16)
i

array([1., 2., 3., 4.], dtype=float16)

In [28]:
# Converting int array to float array
x = np.array([1,2,3], dtype = np.float32)
print(x)

[1. 2. 3.]


In [29]:
# Store the ints in a smaller integer type (int8)
np.array([1, 2, 3, 4], dtype=np.int8)

array([1, 2, 3, 4], dtype=int8)

In [37]:
# Float to int array: the fractional part is truncated, not rounded
np.array([2.4, 5.6, 6.7], dtype = np.int64)

array([2, 5, 6])

In [42]:
# np.round rounds to the nearest integer, but the result keeps the float dtype
arr = np.array([3.4, 5.6, 9.7])
np.round(arr)

array([ 3.,  6., 10.])
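
The two conversions above behave differently, so here is a small sketch that puts them side by side. It uses `astype`, which does not appear in the cells above, and adds a negative value (an illustrative assumption) to show that truncation goes toward zero.

```python
import numpy as np

arr = np.array([3.4, 5.6, 9.7, -2.7])

# astype(int64) drops the fractional part (truncates toward zero).
truncated = arr.astype(np.int64)

# Round first, then convert, to get the nearest integers as ints.
rounded = np.round(arr).astype(np.int64)

print(truncated)  # [ 3  5  9 -2]
print(rounded)    # [ 3  6 10 -3]
```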

In [43]:
c = np.array(['a', 'b', 'c'])

In [46]:
print(c)

['a' 'b' 'c']


In [44]:
c.dtype

dtype('<U1')

In [47]:
d = np.array([{'a': 1}, sys])

In [49]:
d

array([{'a': 1}, <module 'sys' (built-in)>], dtype=object)

In [48]:
d.dtype

dtype('O')

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Dimensions and shapes


In [2]:
# 2x3 array
import numpy as np
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

In [15]:
A.dtype

dtype('int64')

In [3]:
# Check the shape of the array: (rows, columns)
A.shape

(2, 3)

In [4]:
M = np.array([
    [1,2,3,4],
    [5,6,7,8],
    [9,8,7,6]
])

In [5]:
# shape returns a tuple with the length of each dimension
M.shape

(3, 4)

In [7]:
# Returns the number of dimensions (axes) of the array
A.ndim # 2D array

2

In [8]:
# 3x3x4 array
N = np.array([
    [[1,2,3,4],
    [5,6,7,8],
    [9,8,7,6]],
    [[1,2,3,4],
    [5,6,7,8],
    [9,8,7,6]],
    [[1,2,3,4],
    [5,6,7,8],
    [9,8,7,6]]
    ])

In [16]:
A.dtype

dtype('int64')

In [11]:
N.shape
# Dimension of the array
N.ndim # 3D array

3

In [14]:
N.size

36

In [13]:
# Total number of elements
A.size

6

In [25]:
B = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4],
        [3, 2, 1]
    ]
])

In [26]:
B.dtype

dtype('int64')

In [27]:
B

array([[[12, 11, 10],
        [ 9,  8,  7]],

       [[ 6,  5,  4],
        [ 3,  2,  1]]])

In [29]:
B.shape

(2, 2, 3)

In [30]:
B.ndim

3

In [31]:
B.size

12

If the shape isn't consistent, recent NumPy versions (1.24 and later) raise a `ValueError`. Older versions fell back to an array of regular Python objects instead:


In [32]:
C = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4]
    ]
])

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

In [33]:
C.dtype

NameError: name 'C' is not defined

In [34]:
C.shape

NameError: name 'C' is not defined

In [35]:
C.size

NameError: name 'C' is not defined

In [36]:
type(C[0])

NameError: name 'C' is not defined
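
If an array of Python objects is really what you want, one option is to request it explicitly. The sketch below assumes NumPy 1.24 or later and reuses the ragged nested lists from the failing cell; the name `ragged` is introduced here for illustration.

```python
import numpy as np

# Nested lists with inconsistent lengths (a "ragged" structure).
ragged = [
    [[12, 11, 10], [9, 8, 7]],
    [[6, 5, 4]],
]

# Asking for dtype=object explicitly stores the two outer elements
# as plain Python lists instead of raising ValueError.
C = np.array(ragged, dtype=object)

print(C.dtype)     # object
print(C.shape)     # (2,)
print(type(C[0]))  # <class 'list'>
```

Note that such an array loses the benefits described earlier: its elements are boxed Python objects, so vectorized numeric operations no longer apply.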

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Indexing and Slicing of Matrices


In [37]:
# Square matrix
A = np.array([
#    0  1  2
    [1, 2, 3], # 0
    [4, 5, 6], # 1
    [7, 8, 9]  # 2
])

In [41]:
A[1] # Row at index 1 (the second row)

array([4, 5, 6])

In [40]:
A[1][0]

np.int64(4)

In [None]:
# A[d1, d2, d3, d4]

In [42]:
# Same result as A[1][0], using a single multi-dimensional index
A[1, 0]

np.int64(4)

In [43]:
# Rows 0 up to, but not including, 2
A[0:2]

array([[1, 2, 3],
       [4, 5, 6]])

In [46]:
A[-3:]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [47]:
A[0:, 0:2]
# 0:  -> select all rows
# 0:2 -> from each row, take columns 0 up to, but not including, 2

array([[1, 2],
       [4, 5],
       [7, 8]])

In [49]:
A[0: , 1:3]

array([[2, 3],
       [5, 6],
       [8, 9]])

In [50]:
A[:2, :2]

array([[1, 2],
       [4, 5]])

In [51]:
A[:2, 2:]

array([[3],
       [6]])

In [52]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [53]:
A[1] = np.array([10, 10, 10])

In [46]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [ 7,  8,  9]])

In [54]:
A[2] = 99 # Set every element of row 2 to 99

In [55]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [99, 99, 99]])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Summary statistics


In [56]:
a = np.array([1, 2, 3, 4])

In [57]:
a.sum()

np.int64(10)

In [58]:
a.mean()

np.float64(2.5)

In NumPy, the `.std()` method computes the standard deviation of the elements in an array. The standard deviation is a statistical measure that quantifies how much a set of values varies around its mean.


In [59]:
a.std() # Standard deviation

np.float64(1.118033988749895)

In [62]:
a.var()

np.float64(1.25)
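
To see what `.std()` and `.var()` compute, the following sketch rebuilds both by hand for the same array. By default NumPy divides by the number of elements (the population formula); the `ddof=1` call at the end is an extra illustration of the sample version and is not used elsewhere in this notebook.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# Population variance: mean of the squared deviations from the mean.
manual_var = np.mean((a - a.mean()) ** 2)
# The standard deviation is the square root of the variance.
manual_std = np.sqrt(manual_var)

print(manual_var, a.var())
print(manual_std, a.std())

# ddof=1 switches to the sample estimate (divides by n - 1).
sample_std = a.std(ddof=1)
print(sample_std)
```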

In [63]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [64]:
A.sum()

np.int64(45)

In [65]:
A.mean()

np.float64(5.0)

In [66]:
A.std()

np.float64(2.581988897471611)

In [67]:
A.sum(axis=0) # Sum of each column (collapses the rows)

array([12, 15, 18])

In [68]:
A.sum(axis=1) # Sum of each row (collapses the columns)

array([ 6, 15, 24])

In [69]:
A.mean(axis=0) # Mean of each column

array([4., 5., 6.])

In [88]:
A.mean(axis=1)

array([2., 5., 8.])

In [89]:
A.std(axis=0)

array([2.44948974, 2.44948974, 2.44948974])

In [90]:
A.std(axis=1)

array([0.81649658, 0.81649658, 0.81649658])
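
The `axis` argument is easy to mix up, so here is a short sketch that checks what each axis does on the same 3x3 matrix. The names `col_sums` and `row_sums` are illustrative.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

# axis=0 collapses the rows, leaving one result per column.
col_sums = A.sum(axis=0)
# axis=1 collapses the columns, leaving one result per row.
row_sums = A.sum(axis=1)

print(col_sums)  # [12 15 18]
print(row_sums)  # [ 6 15 24]

# Each entry matches the sum of the corresponding column or row slice.
assert col_sums[0] == A[:, 0].sum()
assert row_sums[0] == A[0, :].sum()
```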

And [many more](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html#array-methods)...


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Broadcasting and Vectorized operations


In [70]:
a = np.arange(4) # Creates an array from 0 up to, but not including, 4

In [71]:
a

array([0, 1, 2, 3])

In [72]:
a + 10 # Adds 10 to every element of the array

array([10, 11, 12, 13])

In [73]:
a * 10

array([ 0, 10, 20, 30])

In [74]:
a

array([0, 1, 2, 3])

In [75]:
a += 100

In [76]:
a

array([100, 101, 102, 103])

In [77]:
l = [0, 1, 2, 3]

In [78]:
[i * 10 for i in l]

[0, 10, 20, 30]

In [79]:
M = [1,2,3,4]

In [80]:
[i + 5 for i in M]

[6, 7, 8, 9]

In [81]:
a = np.arange(4)

In [82]:
a

array([0, 1, 2, 3])

In [83]:
b = np.array([10, 10, 10, 10])

In [84]:
b

array([10, 10, 10, 10])

In [86]:
a + b # Element-wise addition

array([10, 11, 12, 13])

In [87]:
a * b # Element-wise multiplication (not matrix multiplication)

array([ 0, 10, 20, 30])
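
The cells above combine an array with a scalar or with an array of the same shape. Broadcasting also covers arrays of different but compatible shapes, as in this minimal sketch; the arrays `M`, `row`, and `col` are made up for illustration.

```python
import numpy as np

M = np.array([
    [1, 2, 3],
    [4, 5, 6]
])                               # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)
col = np.array([[100], [200]])   # shape (2, 1)

# The row is stretched across both rows of M.
print(M + row)
# The column is stretched across all three columns of M.
print(M + col)
```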

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Boolean arrays

_(Also called masks)_


In [3]:
import numpy as np
a = np.arange(4)

In [4]:
a

array([0, 1, 2, 3])

In [5]:
a[0], a[-1]

(np.int64(0), np.int64(3))

In [6]:
a[[0, -1]]

array([0, 3])

In [7]:
a[[True, False, False, True]] # Select multiple elements with a boolean list
# Positions marked True are kept; positions marked False are dropped

array([0, 3])

In [8]:
a

array([0, 1, 2, 3])

In [9]:
a >= 2 # True where the element is greater than or equal to 2


array([False, False,  True,  True])

In [11]:
# Keep only the elements greater than or equal to 2
a[a >= 2] 

array([2, 3])

In [15]:
# ~ negates the mask: keep the elements that are NOT greater than or equal to 2
a[~(a >= 2)]

array([0, 1])

In [12]:
a.mean()

np.float64(1.5)

In [13]:
a[a > a.mean()] # Keep only the elements greater than the mean

array([2, 3])

In [16]:
# Keep the elements that are NOT greater than the mean
a[~(a > a.mean())]

array([0, 1])

In [18]:
a[(a == 0) | (a == 1)]

array([0, 1])

In [19]:
a[(a <= 2) & (a % 2 == 0)]

array([0, 2])

In [25]:
A = np.random.randint(100, size=(3, 3))

In [26]:
A

array([[88, 54, 30],
       [47, 12, 25],
       [18, 24, 28]], dtype=int32)

In [27]:
A[np.array([
    [True, False, True],
    [False, True, False],
    [True, False, True]
])]

array([88, 30, 12, 18, 28], dtype=int32)

In [28]:
A > 30

array([[ True,  True, False],
       [ True, False, False],
       [False, False, False]])

In [29]:
A[A > 30]

array([88, 54, 47], dtype=int32)
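
Masks are useful for more than selection. The sketch below, which hard-codes the sample matrix shown above instead of drawing new random numbers, counts the matches and then assigns through the mask. The name `capped` and the cap value of 30 are illustrative assumptions.

```python
import numpy as np

A = np.array([
    [88, 54, 30],
    [47, 12, 25],
    [18, 24, 28]
])

mask = A > 30

# True counts as 1, so summing a mask counts the matching elements.
print(mask.sum())  # 3

# Assigning through a mask changes only the selected elements.
capped = A.copy()
capped[mask] = 30
print(capped)
```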

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Linear Algebra


In [30]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [31]:
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])

In [103]:
A.dot(B) # Matrix multiplication

array([[20, 14],
       [56, 41],
       [92, 68]])

In [32]:
A @ B # Matrix multiplication

array([[20, 14],
       [56, 41],
       [92, 68]])

In [33]:
B.T # Transpose of B

array([[6, 4, 2],
       [5, 3, 1]])

In [34]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [35]:
B.T @ A

array([[36, 48, 60],
       [24, 33, 42]])
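
Matrix multiplication only works when the inner dimensions agree, which is why `B.T @ A` is valid above. This sketch checks the shape rule on the same matrices; the flag `shapes_compatible` is a hypothetical helper variable.

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)       # shape (3, 3)
B = np.array([[6, 5], [4, 3], [2, 1]])   # shape (3, 2)

# (3, 3) @ (3, 2): inner dimensions match, result has shape (3, 2).
print((A @ B).shape)

# (3, 2) @ (3, 3): inner dimensions 2 and 3 differ, so NumPy raises.
try:
    B @ A
    shapes_compatible = True
except ValueError as exc:
    shapes_compatible = False
    print("ValueError:", exc)
```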

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Size of objects in Memory


### Int, floats


In [38]:
import sys
# A Python int takes more than 24 bytes

sys.getsizeof(1)

28

In [39]:
# Large integers take even more space
sys.getsizeof(10**100)

72

In [40]:
# A NumPy integer is much smaller
np.dtype(int).itemsize

8

In [41]:
# An int8 takes a single byte
np.dtype(np.int8).itemsize

1

In [42]:
np.dtype(float).itemsize

8

### Lists are even larger


In [43]:
# A one-element list
sys.getsizeof([1])

64

In [44]:
# An array of one element in numpy
np.array([1]).nbytes

8

### And performance is also important


In [46]:
l = list(range(100000))

In [48]:
a = np.arange(100000)

In [53]:
%time np.sum(a ** 2)

CPU times: total: 0 ns
Wall time: 0 ns


np.int64(333328333350000)

In [54]:
%time sum([x ** 2 for x in l])

CPU times: total: 15.6 ms
Wall time: 6.52 ms


333328333350000
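
A single `%time` run can be too short to measure reliably, as the 0 ns reading above suggests. As a rough alternative, this sketch averages several runs with the standard-library `timeit` module. The run count of 10 is an arbitrary assumption, and the timings will vary from machine to machine.

```python
import timeit
import numpy as np

n = 100_000
l = list(range(n))
# Explicit int64 avoids overflow where the default integer is 32-bit.
a = np.arange(n, dtype=np.int64)

numpy_total = int(np.sum(a ** 2))
python_total = sum(x ** 2 for x in l)

# Average several runs instead of timing a single one.
runs = 10
numpy_time = timeit.timeit(lambda: np.sum(a ** 2), number=runs) / runs
python_time = timeit.timeit(lambda: sum([x ** 2 for x in l]), number=runs) / runs

print(f"NumPy:  {numpy_time * 1e3:.3f} ms per run")
print(f"Python: {python_time * 1e3:.3f} ms per run")
```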

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Useful NumPy functions


### `random`


In [None]:
np.random.random(size=2)

In [None]:
np.random.normal(size=2)

In [None]:
np.random.rand(2, 4)
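
Since the cells above have no recorded output, here is a brief sketch of what each call returns. The values differ on every run, so only the shapes and ranges are shown; the variable names are illustrative.

```python
import numpy as np

uniform = np.random.random(size=2)   # uniform samples in [0, 1)
normal = np.random.normal(size=2)    # standard normal samples (mean 0, std 1)
grid = np.random.rand(2, 4)          # uniform samples in [0, 1), shape (2, 4)

print(uniform.shape, normal.shape, grid.shape)  # (2,) (2,) (2, 4)
```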

---

### `arange`


In [None]:
np.arange(10)

In [None]:
np.arange(5, 10)

In [None]:
np.arange(0, 1, .1)

---

### `reshape`


In [None]:
np.arange(10).reshape(2, 5)

In [None]:
np.arange(10).reshape(5, 2)
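
`reshape` requires the new shape to hold exactly the same number of elements. The sketch below shows that rule, along with the `-1` placeholder that lets NumPy infer one dimension; the flag `reshaped` is a hypothetical helper variable.

```python
import numpy as np

values = np.arange(10)

# -1 asks NumPy to infer that dimension from the total size.
two_rows = values.reshape(2, -1)
print(two_rows.shape)  # (2, 5)

# 10 elements cannot fill a 3 x 4 grid, so this raises ValueError.
try:
    values.reshape(3, 4)
    reshaped = True
except ValueError:
    reshaped = False
```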

---

### `linspace`


In [None]:
np.linspace(0, 1, 5)

In [None]:
np.linspace(0, 1, 20)

In [None]:
np.linspace(0, 1, 20, endpoint=False)
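
The fourth argument of `linspace` is `endpoint`, which controls whether the stop value is included. A small sketch with 5 points (chosen here to keep the output short) makes the difference visible:

```python
import numpy as np

# endpoint=True (the default): 5 points from 0 to 1, both ends included.
closed = np.linspace(0, 1, 5)
# endpoint=False: 5 points starting at 0 and stopping before 1.
half_open = np.linspace(0, 1, 5, endpoint=False)

print(closed)     # [0.   0.25 0.5  0.75 1.  ]
print(half_open)  # [0.  0.2 0.4 0.6 0.8]
```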

---

### `zeros`, `ones`, `empty`


In [None]:
np.zeros(5)

In [None]:
np.zeros((3, 3))

In [None]:
np.zeros((3, 3), dtype=int)

In [None]:
np.ones(5)

In [None]:
np.ones((3, 3))

In [None]:
np.empty(5)

In [None]:
np.empty((2, 2))
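
One difference between these constructors is worth spelling out: `np.empty` only allocates memory and does not initialize it, so its contents are arbitrary until you assign them. The following sketch illustrates this with made-up names and a fill value of 7.0.

```python
import numpy as np

zeros = np.zeros((2, 3))               # float64 by default
int_ones = np.ones((2, 3), dtype=int)  # platform default integer type

# np.empty only allocates memory; fill it before reading from it.
buffer = np.empty((2, 3))
buffer[:] = 7.0

print(zeros.dtype, int_ones.dtype)
print(buffer)
```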

---

### `identity` and `eye`


In [None]:
np.identity(3)

In [None]:
np.eye(3, 3)

In [None]:
np.eye(8, 4)

In [None]:
np.eye(8, 4, k=1)

In [None]:
np.eye(8, 4, k=-3)
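
The `k` argument of `eye` shifts the diagonal of ones: positive values move it above the main diagonal, negative values below. A minimal sketch on a 3x3 matrix, with illustrative names `upper` and `lower`:

```python
import numpy as np

# identity(n) equals eye(n): ones on the main diagonal.
print(np.array_equal(np.identity(3), np.eye(3)))  # True

# k=1 moves the ones to the first diagonal above the main one.
upper = np.eye(3, k=1)
# k=-1 moves them to the first diagonal below it.
lower = np.eye(3, k=-1)

print(upper)
print(lower)
```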

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)
