![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)
<hr style="margin-bottom: 40px;">

<img src="https://user-images.githubusercontent.com/7065401/39118381-910eb0c2-46e9-11e8-81f1-a5b897401c23.jpeg"
    style="width:300px; float: right; margin: 0 40px 40px 40px;"></img>

# Numpy: Numeric computing library

NumPy (Numerical Python) is one of the core packages for numerical computing in Python. Pandas, Matplotlib, Statsmodels and many other scientific libraries rely on NumPy.

NumPy's major contributions are:

* Efficient numeric computation with C primitives
* Efficient collections with vectorized operations
* An integrated and natural Linear Algebra API
* A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.

Let's expand on efficiency. In Python, **everything is an object**, which means that even simple ints are objects, with all the machinery required to make objects work. We call them "Boxed Ints". In contrast, NumPy uses primitive numeric types (floats, ints), which makes storage and computation efficient.

<img src="https://docs.google.com/drawings/d/e/2PACX-1vTkDtKYMUVdpfVb3TTpr_8rrVtpal2dOknUUEOu85wJ1RitzHHf5nsJqz1O0SnTt8BwgJjxXMYXyIqs/pub?w=726&h=396" />
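
As a rough illustration of the boxed-versus-primitive difference, the sketch below compares the size of a single Python int object with the per-element storage of a NumPy array. The exact size of the Python int assumes 64-bit CPython and may differ elsewhere; the variable names are only illustrative.

```python
import sys
import numpy as np

# A Python int is a full object: the value plus a type pointer and a reference count.
boxed_size = sys.getsizeof(1)

# A NumPy int64 array stores only the raw 8-byte values, side by side.
arr = np.array([1, 2, 3, 4], dtype=np.int64)
per_element = arr.itemsize   # bytes used by each stored element
total = arr.nbytes           # itemsize * number of elements

print(boxed_size, per_element, total)
```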


![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Hands on!

In [178]:
import sys  # used for system-level operations, e.g. sys.getsizeof
import numpy as np

## Basic Numpy Arrays

In [81]:
np.array([1, 2, 3, 4])  # elements should be numeric and share the same type

array([1, 2, 3, 4])

In [179]:
a = np.array([1, 2, 3, 4])

In [180]:
b = np.array([0, .5, 1, 1.5, 2])

In [84]:
a[0], a[1]

(np.int64(1), np.int64(2))

In [85]:
a[0:]

array([1, 2, 3, 4])

In [86]:
a[1:3]

array([2, 3])

In [87]:
a[1:-1]

array([2, 3])

In [88]:
a[::2]

array([1, 3])

In [89]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [90]:
b[0], b[2], b[-1]

(np.float64(0.0), np.float64(1.0), np.float64(2.0))

In [91]:
b[[0, 2, -1]]  # indexing with a list of positions returns a new array

array([0., 1., 2.])
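
One practical difference between the two styles of indexing above: a basic slice is a view onto the original data, while indexing with a list builds an independent array. A minimal sketch, using `np.shares_memory` to check (the names `sliced` and `picked` are just for illustration):

```python
import numpy as np

b = np.array([0, .5, 1, 1.5, 2])

sliced = b[1:3]          # basic slice: a view onto b's memory
picked = b[[0, 2, -1]]   # list of indices: a new, independent array

print(np.shares_memory(b, sliced))
print(np.shares_memory(b, picked))

picked[0] = 99.0         # does not touch b
sliced[0] = -1.0         # writes through to b[1]
print(b)
```

This matters when modifying results: writing into a slice changes the original array, so call `.copy()` on the slice when an independent array is needed.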

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Array Types

In [92]:
a

array([1, 2, 3, 4])

In [93]:
a.dtype  # the element type lives in .dtype; type(a) only says numpy.ndarray

dtype('int64')

In [94]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [95]:
a.var()

np.float64(1.25)

In [96]:
a.std()

np.float64(1.118033988749895)

In [97]:
b.dtype  # tip: in Jupyter, `a.mean?` shows a method's docstring

dtype('float64')

In [98]:
np.array([1, 2, 3, 4], dtype=float)  # np.float was removed in NumPy 1.24; use the builtin float or np.float64

array([1., 2., 3., 4.])

In [181]:
np.array([1, 2, 3, 4], dtype=np.int8)

array([1, 2, 3, 4], dtype=int8)

In [182]:
c = np.array(['a', 'b', 'c'])

In [None]:
c.dtype  # dtype describes the elements (type(c) is numpy.ndarray); the 'U' stands for Unicode string

In [None]:
d = np.array([{'a': 1}, sys])

In [None]:
d.dtype
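
To make the role of `dtype` more concrete, here is a small sketch showing how it fixes the storage size of each element and how `astype` converts between types. The variable names are illustrative only.

```python
import numpy as np

small = np.array([1, 2, 3, 4], dtype=np.int8)
big = small.astype(np.int64)      # astype returns a converted copy
as_float = small.astype(float)    # the builtin float maps to float64

print(small.itemsize, big.itemsize)   # bytes per element
print(as_float.dtype)

# Mixing ints and floats upcasts everything to a common type.
mixed = np.array([1, 2.5])
print(mixed.dtype)
```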

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Dimensions and shapes

In [99]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

In [100]:
A.shape  # (number of rows, number of columns)

(2, 3)

In [101]:
A.ndim  # number of dimensions (axes)

2

In [102]:
A.size  # total number of elements in the array

6

In [103]:
B = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4],
        [3, 2, 1]
    ]
])

In [104]:
B

array([[[12, 11, 10],
        [ 9,  8,  7]],

       [[ 6,  5,  4],
        [ 3,  2,  1]]])

In [105]:
B.shape  # 2 tables (blocks), each with 2 rows and 3 columns

(2, 2, 3)

In [106]:
B.ndim

3

In [107]:
B.size

12

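If the shape isn't consistent, NumPy can't build a regular multi-dimensional array. With `dtype=object` it falls back to a one-dimensional array of regular Python objects (recent NumPy versions raise an error for ragged input unless `dtype=object` is given):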

In [108]:
C = np.array([          # data type - object
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4]
    ]
], dtype=object)

In [109]:
C.dtype

dtype('O')

In [110]:
C.shape

(2,)

In [111]:
C.size

2

In [112]:
type(C[0])

list
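
What happens without `dtype=object` depends on the NumPy version. The sketch below assumes a recent release (1.24 or later), where ragged input raises `ValueError`; older releases emitted a deprecation warning and built an object array instead, so the code handles both cases. The names `ragged`, `outcome` and `obj` are illustrative.

```python
import warnings
import numpy as np

ragged = [[1, 2, 3], [4, 5]]   # rows of different lengths

# Without an explicit dtype, recent NumPy refuses to guess.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")   # older versions only warn
    try:
        np.array(ragged)
        outcome = "object array (older NumPy)"
    except ValueError:
        outcome = "ValueError (NumPy 1.24+)"
print(outcome)

# Asking for dtype=object explicitly works in both cases:
# the result is a 1-D array holding two Python lists.
obj = np.array(ragged, dtype=object)
print(obj.shape, type(obj[0]))
```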

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Indexing and Slicing of Matrices

In [113]:
# Square matrix
A = np.array([
#    0  1  2   (column indexes)
    [1, 2, 3], # 0
    [4, 5, 6], # 1
    [7, 8, 9]  # 2
])

In [114]:
A[1]  # a single index selects a whole row

array([4, 5, 6])

In [115]:
A[1][0]  # chain a second index to get an individual element

np.int64(4)

In [116]:
# A[d1, d2, d3, d4]

In [117]:
A[1, 0]  # row and column in a single indexing operation: row 1, column 0

np.int64(4)

In [118]:
A[0:2]  # slicing: rows 0 and 1

array([[1, 2, 3],
       [4, 5, 6]])

In [119]:
A[:, :2]  # before the comma: rows (all of them); after the comma: columns (the first two)

array([[1, 2],
       [4, 5],
       [7, 8]])

In [120]:
A[:2, :2]

array([[1, 2],
       [4, 5]])

In [121]:
A[:2, 2:]

array([[3],
       [6]])

In [122]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [123]:
A[1] = np.array([10, 10, 10])

In [124]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [ 7,  8,  9]])

In [125]:
A[2] = 99  # a single value is broadcast to every element of the row

In [126]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [99, 99, 99]])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Summary statistics

In [127]:
a = np.array([1, 2, 3, 4])


In [128]:
b = [1, 2, 3, 4]  # a plain Python list; NumPy functions accept it too

In [129]:
np.mean(b)

np.float64(2.5)

In [130]:
a.sum()

np.int64(10)

In [131]:
a.mean()

np.float64(2.5)

In [132]:
a.std()

np.float64(1.118033988749895)

In [133]:
a.var()

np.float64(1.25)
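
Note that `var()` and `std()` divide by the number of elements by default (the population formulas). As a hedged aside, the sample versions, which divide by N - 1, are available through the `ddof` parameter:

```python
import numpy as np

a = np.array([1, 2, 3, 4])

pop_var = a.var()             # divides by N (ddof=0, the default)
sample_var = a.var(ddof=1)    # divides by N - 1
sample_std = a.std(ddof=1)    # square root of the sample variance

print(pop_var, sample_var, sample_std)
```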

In [134]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [135]:
A.sum()

np.int64(45)

In [136]:
A.mean()

np.float64(5.0)

In [137]:
A.std()

np.float64(2.581988897471611)

In [138]:
A.sum(axis=0)  # axis=0 collapses the rows: one sum per column

array([12, 15, 18])

In [139]:
A.sum(axis=1)  # axis=1 collapses the columns: one sum per row

array([ 6, 15, 24])

In [140]:
A.mean(axis=0)

array([4., 5., 6.])

In [141]:
A.mean(axis=1)

array([2., 5., 8.])

In [142]:
A.std(axis=0)

array([2.44948974, 2.44948974, 2.44948974])

In [143]:
A.std(axis=1)

array([0.81649658, 0.81649658, 0.81649658])
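
The `axis` argument names the dimension that gets collapsed, which is easy to mix up. The sketch below restates the reductions above and adds `keepdims=True`, an option not used elsewhere in this notebook, to show how a reduced result can be combined with the original matrix.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

col_sums = A.sum(axis=0)   # collapse rows: one value per column
row_sums = A.sum(axis=1)   # collapse columns: one value per row

# keepdims=True keeps the reduced axis with length 1, so the result
# lines up with A's rows when the two are combined.
row_means = A.mean(axis=1, keepdims=True)   # shape (3, 1)
centered = A - row_means                    # each row now has mean 0

print(col_sums, row_sums)
print(centered)
```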

And [many more](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html#array-methods)...

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Broadcasting and Vectorized operations

In [144]:
a = np.arange(4)  # arange creates an array of evenly spaced numbers, much like range()

In [145]:
a

array([0, 1, 2, 3])

In [146]:
a + 10

array([10, 11, 12, 13])

In [147]:
a * 10

array([ 0, 10, 20, 30])

In [148]:
a

array([0, 1, 2, 3])

In [149]:
a += 100  # modifies a in place; `a + 100` alone returns a new array and leaves a unchanged

In [150]:
a

array([100, 101, 102, 103])

In [151]:
l = [0, 1, 2, 3]

In [152]:
[i * 10 for i in l]

[0, 10, 20, 30]

In [153]:
number = [2,3,4,5,6]
square_num =[]
for num in number:
    square_num.append(num**2)

square_num

[4, 9, 16, 25, 36]

In [154]:
[num ** 2 for num in number]  # the same loop as a list comprehension; the result is always a list

[4, 9, 16, 25, 36]

In [155]:
a = np.arange(4)

In [156]:
a


array([0, 1, 2, 3])

In [157]:
b = np.array([10, 10, 10, 10])

In [158]:
b

array([10, 10, 10, 10])

In [159]:
a + b

array([10, 11, 12, 13])

In [160]:
a * b

array([ 0, 10, 20, 30])

In [161]:
a[2] = 10 # to change one element in an array

In [162]:
a + np.array([0, 0, 0, 15])  # element-wise sum of two arrays of the same shape

array([ 0,  1, 10, 18])
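
The examples above combine an array with a scalar or with another array of the same shape. Broadcasting also covers arrays of different but compatible shapes. A minimal sketch, with illustrative names `row`, `col` and `grid`:

```python
import numpy as np

row = np.arange(3)                    # shape (3,)
col = np.array([[0], [10], [20]])     # shape (3, 1)

# Shapes are compared from the right; a dimension of size 1 (or a missing
# one) is stretched to match the other operand.
grid = col + row                      # shape (3, 3)
print(grid)

# Incompatible shapes raise ValueError instead of guessing.
try:
    np.arange(3) + np.arange(4)
except ValueError as exc:
    print("cannot broadcast:", exc)
```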

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Boolean arrays
_(Also called masks)_

In [None]:
a = np.arange(4)

In [None]:
a

In [None]:
a[0], a[-1]

In [None]:
a[[0, -1]]  # select by position: the first and last elements

In [None]:
a[[True, False, False, True]]

In [None]:
a

In [None]:
a >= 2

In [None]:
b = np.array([77, 8, 9, 80])


In [None]:
b[a >= 2]  # a mask built from a can select elements of b (same shape)

In [183]:
a[a >= 2]

array([2, 3])

In [184]:
a.mean()

np.float64(1.5)

In [None]:
a[a > a.mean()]

In [None]:
a[~(a > a.mean())]  # ~ inverts the mask: True becomes False and vice versa

In [None]:
a[(a == 0) | (a == 1)]

In [None]:
a[(a <= 2) & (a % 2 == 0)]

In [185]:
A = np.random.randint(100, size=(3, 3))

In [None]:
A

In [None]:
A[np.array([
    [True, False, True],
    [False, True, False],
    [True, False, True]
])]

In [None]:
A > 30

In [None]:
A[A > 30]
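
Masks are useful for more than selection. The sketch below, which uses a fixed matrix instead of random values so the results are reproducible, shows counting matches, assigning through a mask, and `np.where` for building a new array. The names `mask`, `count`, `capped` and `B` are illustrative.

```python
import numpy as np

A = np.array([
    [12, 45, 7],
    [88, 30, 61],
    [5, 33, 90]
])

mask = A > 30
count = mask.sum()                # True counts as 1, so this counts the matches
capped = np.where(mask, 30, A)    # new array: 30 where the mask is True, else A's value

B = A.copy()
B[mask] = 0                       # in-place assignment through the mask

print(count)
print(capped)
print(B)
```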

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Linear Algebra

In [None]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [None]:
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])

In [None]:
A.dot(B)  # matrix multiplication: each row of A is multiplied by each column of B and summed

In [None]:
A @ B  # the @ operator is equivalent to .dot for 2-D arrays

In [None]:
B.T  # .T is the transpose: rows become columns

In [None]:
A

In [None]:
B.T @ A
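
The shape rule behind these products can be checked directly. In this sketch, the matrices are the ones defined above; `product`, `same` and `first` are illustrative names.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # shape (3, 3)
B = np.array([[6, 5], [4, 3], [2, 1]])            # shape (3, 2)

product = A @ B                                    # (3, 3) @ (3, 2) gives (3, 2)
same = np.array_equal(product, A.dot(B))           # @ and .dot agree

# Entry [0, 0] is row 0 of A times column 0 of B, summed.
first = (A[0] * B[:, 0]).sum()

# B @ A fails: the inner dimensions (2 and 3) do not match.
try:
    B @ A
except ValueError as exc:
    print("shape mismatch:", exc)

print(product)
```

Note that `A * B` would be something else entirely: element-wise multiplication with broadcasting, which also fails for these two shapes.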

In [None]:
np.random.seed(42)  # fix the seed so the random matrix is reproducible
A = np.random.randint(100, size=(3, 3))

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Size of objects in Memory

### Int, floats

In [None]:
# An integer in Python takes more than 24 bytes
sys.getsizeof(1)

In [None]:
# Large integers take even more space
sys.getsizeof(10**100)

In [None]:
# Numpy size is much smaller
np.dtype(int).itemsize

In [None]:
# An explicit 8-bit dtype is smaller still
np.dtype(np.int8).itemsize

In [None]:
np.dtype(float).itemsize

### Lists are even larger

In [None]:
# A one-element list
sys.getsizeof([1])

In [None]:
# An array of one element in numpy
np.array([1]).nbytes

### And performance is also important

In [None]:
l = list(range(100000))

In [None]:
a = np.arange(100000)

In [None]:
%time np.sum(a ** 2)

In [None]:
%time sum([x ** 2 for x in l])
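
`%time` is an IPython magic and only works inside Jupyter or IPython. As a rough equivalent for a plain Python script, the sketch below uses the standard `timeit` module. The run count of 20 is an arbitrary choice, and the explicit `int64` dtype is an assumption made to avoid overflow on platforms whose default integer is 32 bits.

```python
import timeit
import numpy as np

n = 100_000
l = list(range(n))
a = np.arange(n, dtype=np.int64)

# Both approaches compute the same sum of squares.
numpy_result = int(np.sum(a ** 2))
python_result = sum([x ** 2 for x in l])

# Time 20 repetitions of each version.
numpy_time = timeit.timeit(lambda: np.sum(a ** 2), number=20)
python_time = timeit.timeit(lambda: sum([x ** 2 for x in l]), number=20)

print(f"NumPy:  {numpy_time:.4f} s for 20 runs")
print(f"Python: {python_time:.4f} s for 20 runs")
```

Actual timings depend on the machine, so no figures are quoted here.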

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Useful Numpy functions

### `random`

In [None]:
np.random.random(size=2)

In [163]:
np.random.normal(size=2)

array([-0.22307036,  1.50766425])

In [164]:
np.random.rand(2, 4)

array([[0.9309029 , 0.5348184 , 0.18756978, 0.6064234 ],
       [0.71388899, 0.42823859, 0.62522173, 0.18094171]])
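
The functions above use NumPy's legacy global random state. Newer NumPy code often uses a `Generator` created with `np.random.default_rng`, which makes seeding explicit. A minimal sketch (the seed value 42 and the variable names are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)        # seeded generator
same_rng = np.random.default_rng(42)   # same seed, same stream of numbers

uniform = rng.random(size=2)           # floats in [0, 1)
repeat = same_rng.random(size=2)       # identical to uniform

normal = rng.normal(size=2)                  # standard normal samples
ints = rng.integers(0, 100, size=(3, 3))     # integers in [0, 100)

print(np.array_equal(uniform, repeat))
```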

---
### `arange`

In [165]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [166]:
np.arange(5, 10)

array([5, 6, 7, 8, 9])

In [167]:
np.arange(0, 1, .1)

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

---
### `reshape`

In [168]:
np.arange(10).reshape(2, 5)

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [169]:
np.arange(10).reshape(5, 2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
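
Two details of `reshape` that the examples above do not show: one dimension can be given as `-1` and NumPy infers it, and for a contiguous array such as the output of `arange` the result is a view of the same data. A short sketch with illustrative names:

```python
import numpy as np

flat = np.arange(10)

two_rows = flat.reshape(2, -1)     # -1 lets NumPy infer the 5
five_rows = flat.reshape(-1, 2)    # inferred as 5 rows

# The element count must match, otherwise reshape raises ValueError.
try:
    flat.reshape(3, 4)
except ValueError as exc:
    print("cannot reshape:", exc)

# Writing through the reshaped view changes the original array.
two_rows[0, 0] = 99
print(flat[0])
```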

---
### `linspace`

In [170]:
np.linspace(0, 1, 5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [171]:
np.linspace(0, 1, 20)

array([0.        , 0.05263158, 0.10526316, 0.15789474, 0.21052632,
       0.26315789, 0.31578947, 0.36842105, 0.42105263, 0.47368421,
       0.52631579, 0.57894737, 0.63157895, 0.68421053, 0.73684211,
       0.78947368, 0.84210526, 0.89473684, 0.94736842, 1.        ])

In [172]:
np.linspace(0, 1, 20, endpoint=False)  # endpoint=False excludes the stop value (1) from the result

array([0.  , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 ,
       0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95])
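
`linspace` and `arange` overlap: with `linspace` you choose the number of points, with `arange` the step. The sketch below relates the two and uses `retstep=True` to recover the spacing that `linspace` computed; the variable names are illustrative.

```python
import numpy as np

points = np.linspace(0, 1, 5)                       # 5 points, stop included
half_open = np.linspace(0, 1, 20, endpoint=False)   # 20 points, stop excluded

# retstep=True also returns the spacing between consecutive points.
_, step = np.linspace(0, 1, 5, retstep=True)

# With endpoint=False the values match a step of 1/20.
matches = np.allclose(half_open, np.arange(20) * 0.05)

print(step, matches)
```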

---
### `zeros`, `ones`, `empty`

In [174]:
np.zeros(5)  # zeros creates an array filled with zeros (floats by default)

array([0., 0., 0., 0., 0.])

In [175]:
np.zeros((3, 3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [177]:
np.zeros((3, 3), dtype=int)  # dtype=int makes the values integers (np.int was removed in NumPy 1.24)

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [None]:
np.ones(5)

In [None]:
np.ones((3, 3))

In [None]:
np.empty(5)

In [None]:
np.empty((2, 2))
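
Unlike `zeros` and `ones`, `np.empty` only allocates memory, so its contents are arbitrary until something is written to them. The sketch below shows that, along with a few related constructors (`zeros_like`, `ones_like`, `full`) that are not used elsewhere in this notebook. Names such as `template` and `buf` are illustrative.

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]])

zeros = np.zeros_like(template)     # same shape and dtype as template
ones = np.ones_like(template)
sevens = np.full((2, 2), 7)         # constant fill with any value

# np.empty does not initialize its contents: fill before reading.
buf = np.empty(3)
buf[:] = [1.0, 2.0, 3.0]

print(zeros)
print(sevens)
print(buf)
```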

---
### `identity` and `eye`

In [None]:
np.identity(3)  # 3x3 identity matrix (always square)

In [None]:
np.eye(3, 3)  # eye is similar to identity, but you specify the shape: identity always creates
              # a square matrix, while eye's result doesn't have to be square

In [None]:
np.eye(8, 4)

In [None]:
np.eye(8, 4, k=1)  # k shifts the diagonal of ones: positive k moves it above the main diagonal, negative k below

In [None]:
np.eye(8, 4, k=-3)

In [None]:
np.eye(8, 4, k=3)

In [None]:
np.eye(8, 4, k=2)

In [None]:
np.eye(8, 4, k=-2)
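
On a small matrix the effect of `k` is easier to read. The sketch below also checks two properties that follow from the definitions: `eye(n)` with default arguments equals `identity(n)`, and multiplying by the identity leaves a matrix unchanged. The variable names are illustrative.

```python
import numpy as np

I3 = np.identity(3)
same = np.array_equal(I3, np.eye(3))   # eye(n) with defaults is the identity

upper = np.eye(3, k=1)    # ones on the first diagonal above the main one
lower = np.eye(3, k=-1)   # ones on the first diagonal below it

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
unchanged = np.array_equal(A @ I3, A)  # A times the identity is A

print(upper)
print(lower)
```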

In [None]:
"Hello World"[6]

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)