-----
# Python NumPy
-----


## What is NumPy?

NumPy is the fundamental package for scientific **numerical** computing with Python.

# Why use NumPy?




*   NumPy provides an array object that can be up to 50x faster than traditional Python lists. Unlike lists, NumPy arrays are stored in one contiguous block of memory, so they can be accessed and manipulated very efficiently.
*   Arrays are very frequently used in data science
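
The memory layout described above can be inspected on any array. The following is a minimal sketch; the array `arr` and its length are illustrative choices, not part of the original notebook.

```python
import numpy as np

# Illustrative array: 1000 integers stored as 64-bit values
arr = np.arange(1000, dtype=np.int64)

# Every element has the same fixed size in bytes
print(arr.itemsize)               # 8

# Total size of the data buffer: 1000 elements * 8 bytes
print(arr.nbytes)                 # 8000

# A freshly created array occupies one contiguous block of memory
print(arr.flags['C_CONTIGUOUS'])  # True
```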



# Importing NumPy

In [None]:
import numpy as np

# NumPy ndarray (N-dimensional array) vs. list 




In [None]:
py_list = [2, 3, 4, 6]
np_array = np.array(py_list)

In [None]:
print(type(py_list), py_list)
print(type(np_array), np_array)

<class 'list'> [2, 3, 4, 6]
<class 'numpy.ndarray'> [2 3 4 6]
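
The printed forms look similar, but the two types behave differently. As a small sketch of two differences (the variable names `list_sum`, `array_sum`, and `mixed` are introduced here for illustration):

```python
import numpy as np

py_list = [2, 3, 4, 6]
np_array = np.array(py_list)

# "+" concatenates lists but adds arrays element by element
list_sum = py_list + py_list
array_sum = np_array + np_array
print(list_sum)     # [2, 3, 4, 6, 2, 3, 4, 6]
print(array_sum)    # [ 4  6  8 12]

# A list can mix types; an array converts everything to one dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64
```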


# Array shape and dtype


In [None]:
print(np_array.dtype)
print(np_array.shape)

int64
(4,)
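
The dtype above was inferred from the input data, and on some platforms the inferred integer type may be `int32` rather than `int64`. It can also be requested explicitly. A brief sketch, with `float_array` as an illustrative name:

```python
import numpy as np

# Request 32-bit floats instead of the inferred integer type
float_array = np.array([2, 3, 4, 6], dtype=np.float32)

print(float_array)        # [2. 3. 4. 6.]
print(float_array.dtype)  # float32
print(float_array.shape)  # (4,)
print(float_array.size)   # 4
```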


# Dimensions in Arrays

0-D Arrays


In [None]:
np_array = np.array(42)

In [None]:
print(np_array)
print(np_array.shape)

42
()


1-D Arrays

In [None]:
np_array = np.array([1,2,3,4,5])

In [None]:
print(np_array)
print(np_array.shape)
print(np_array.ndim)

[1 2 3 4 5]
(5,)
1


2-D Arrays

In [None]:
np_array = np.array([[1, 2, 3], [4, 5, 6]])

In [None]:
print(np_array)
print(np_array.shape)
print(np_array.ndim)

[[1 2 3]
 [4 5 6]]
(2, 3)
2


3-D Arrays

In [None]:
np_array = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

In [None]:
print(np_array)
print(np_array.shape)
print(np_array.ndim)

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]
(2, 2, 3)
3
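
The four cases above follow one pattern: `ndim` is the length of the `shape` tuple, and `size` is the product of its entries. A short sketch that checks this for the same arrays (the dictionary `arrays` is only a convenience for the loop):

```python
import numpy as np

arrays = {
    '0-D': np.array(42),
    '1-D': np.array([1, 2, 3, 4, 5]),
    '2-D': np.array([[1, 2, 3], [4, 5, 6]]),
    '3-D': np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]]),
}

for name, arr in arrays.items():
    # ndim equals the number of entries in the shape tuple
    print(name, arr.ndim, arr.shape, arr.size)
```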


# Creating ndarrays

In [None]:
nds_array = np.zeros((2,3))
print(nds_array)

[[0. 0. 0.]
 [0. 0. 0.]]


In [None]:
nds_array = np.ones((4,2))
print(nds_array)

[[1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]]


In [None]:
nds_array = np.array([[0,1,2],[2,3,4]])


In [None]:
print(nds_array)
print(nds_array.shape)

[[0 1 2]
 [2 3 4]]
(2, 3)
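
Besides `np.zeros`, `np.ones`, and `np.array`, NumPy has several other constructors. The sketch below shows a few of them; the shapes and fill values are arbitrary examples.

```python
import numpy as np

# 2x3 array filled with a chosen constant
sevens = np.full((2, 3), 7)

# 3x3 identity matrix (ones on the diagonal)
identity = np.eye(3)

# Integers from 0 up to, but not including, 6
counting = np.arange(6)

# Array of zeros with the same shape and dtype as an existing array
template = np.array([[0, 1, 2], [2, 3, 4]])
zeros_like_template = np.zeros_like(template)

print(sevens)
print(identity)
print(counting)             # [0 1 2 3 4 5]
print(zeros_like_template)
```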


# Array Indexing and Slicing

Indexing

In [None]:
py_list = [2, 3, 4, 6]
np_array = np.array(py_list)
py_list[1:3]

[3, 4]

In [None]:
np_array[1:3]

array([3, 4])

In [None]:
print(np_array[1])

3


In [None]:
np_array = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('2nd element on 1st row: ', np_array[0, 1])
print('4th element on 2nd row: ', np_array[1, 3])

2nd element on 1st row:  2
4th element on 2nd row:  9
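
Indices can also be negative, and a single index on a 2-D array selects a whole row. A small sketch on the same array (`last_element` and `first_row` are illustrative names):

```python
import numpy as np

np_array = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# Negative indices count from the end of each axis
last_element = np_array[-1, -1]
print(last_element)   # 10

# A single index on a 2-D array selects a whole row
first_row = np_array[0]
print(first_row)      # [1 2 3 4 5]
```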


Slicing

In [None]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])

[2 3 4 5]


In [None]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[1, 1:4])

[7 8 9]
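
Two details of slicing are easy to miss: slices accept a step, and a basic slice is a view on the original data rather than a copy. A minimal sketch, using the value `99` only as a visible marker:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# start:stop:step, here every second element
every_second = arr[::2]
print(every_second)   # [1 3 5 7]

# A slice is a view: writing to it changes the original array
view = arr[1:4]
view[0] = 99
print(arr)            # [ 1 99  3  4  5  6  7]

# .copy() gives independent data
independent = arr[1:4].copy()
independent[0] = -1
print(arr[1])         # 99
```

Calling `.copy()` is the usual choice when the slice will be modified and the original must stay intact.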


# Array computing

In [None]:
np_array = np.array(py_list)  # restore the 1-D array; np_array was reassigned to a 2-D array above
np_array[np_array>3]

array([4, 6])

In [None]:
py_list * 5

[2, 3, 4, 6, 2, 3, 4, 6, 2, 3, 4, 6, 2, 3, 4, 6, 2, 3, 4, 6]

In [None]:
np_array * 5

array([10, 15, 20, 30])
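
The filter `np_array[np_array>3]` works in two steps: the comparison builds a boolean array, and indexing with that array keeps the elements where it is `True`. The sketch below spells this out and shows the list equivalents for comparison (`mask`, `filtered_list`, and `scaled_list` are illustrative names):

```python
import numpy as np

py_list = [2, 3, 4, 6]
np_array = np.array(py_list)

# The comparison produces a boolean array of the same shape
mask = np_array > 3
print(mask)              # [False False  True  True]

# Indexing with the mask keeps the elements where it is True
print(np_array[mask])    # [4 6]

# With a list, the same operations need explicit loops
filtered_list = [x for x in py_list if x > 3]
scaled_list = [x * 5 for x in py_list]
print(filtered_list)     # [4, 6]
print(scaled_list)       # [10, 15, 20, 30]
```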

In [None]:
py_list ** 2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [None]:
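# Performance test: element-wise addition of two vectors
def pure_python_version(size_of_vec=1000):
    # Plain Python: add the elements one by one in a list comprehension
    X = range(size_of_vec)
    Y = range(size_of_vec)
    Z = [X[i] + Y[i] for i in range(len(X))]
    return Z

def numpy_version(size_of_vec=1000):
    # NumPy: a single vectorized addition over whole arrays
    X = np.arange(size_of_vec)
    Y = np.arange(size_of_vec)
    Z = X + Y
    return Z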


In [None]:
%%timeit -n 10000
pure_python_version()

In [None]:
%%timeit -n 10000
numpy_version()
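
`%%timeit` is an IPython cell magic, so it is not available in a plain Python script. A rough equivalent using the standard-library `timeit` module might look like the sketch below. The repetition count of 1000 is an arbitrary choice, and the measured times depend on the machine, so no timings are shown.

```python
import timeit

import numpy as np


def pure_python_version(size_of_vec=1000):
    X = range(size_of_vec)
    Y = range(size_of_vec)
    return [X[i] + Y[i] for i in range(len(X))]


def numpy_version(size_of_vec=1000):
    X = np.arange(size_of_vec)
    Y = np.arange(size_of_vec)
    return X + Y


# Both versions must produce the same values
same_result = pure_python_version() == numpy_version().tolist()
print(same_result)   # True

# Total time in seconds for 1000 calls of each version
python_seconds = timeit.timeit(pure_python_version, number=1000)
numpy_seconds = timeit.timeit(numpy_version, number=1000)
print(f'pure Python: {python_seconds:.4f} s')
print(f'NumPy:       {numpy_seconds:.4f} s')
```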

In [None]:
py_list

In [None]:
np_array

In [None]:
np_array ** 2

In [None]:
matrix = [[1, 2, 4], 
          [3, 1, 0]]
np_matrix = np.array(matrix)

In [None]:
np_matrix.shape

In [None]:
matrix[1][2]

In [None]:
np_matrix[1][2]

In [None]:
np_matrix[:,0]
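
The nested list and the array both support `[1][2]`, but only the array accepts a comma-separated index such as `[:, 0]`; the same expression on the nested list raises a `TypeError`. A short sketch of the difference (`list_column`, `first_column`, and `second_row` are illustrative names):

```python
import numpy as np

matrix = [[1, 2, 4],
          [3, 1, 0]]
np_matrix = np.array(matrix)

# [1][2] first extracts row 1, then indexes it; [1, 2] does it in one step
print(np_matrix[1][2], np_matrix[1, 2])   # 0 0

# A nested list needs a loop to pull out a column
list_column = [row[0] for row in matrix]
print(list_column)    # [1, 3]

# The array selects a column or row directly; ":" means "everything"
first_column = np_matrix[:, 0]
second_row = np_matrix[1, :]
print(first_column)   # [1 3]
print(second_row)     # [3 1 0]
```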

In [None]:
np.random.rand()

In [None]:
np.random.randn()

In [None]:
np.random.randn(4)

In [None]:
np.random.randn(4, 5)
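
`np.random.rand` draws from the uniform distribution on [0, 1), and `np.random.randn` draws from the standard normal distribution, so the values change on every run. For reproducible results, one option is a seeded generator created with `np.random.default_rng`, which the NumPy documentation recommends for new code. A minimal sketch, where the seed `42` is an arbitrary choice:

```python
import numpy as np

# Seeded generator; the seed value 42 is an arbitrary choice
rng = np.random.default_rng(42)

uniform_sample = rng.random()                  # one float in [0, 1)
normal_samples = rng.standard_normal(4)        # 4 draws from N(0, 1)
normal_matrix = rng.standard_normal((4, 5))    # 4x5 array of draws

print(uniform_sample)
print(normal_samples.shape)   # (4,)
print(normal_matrix.shape)    # (4, 5)

# Re-creating the generator with the same seed repeats the sequence
rng_again = np.random.default_rng(42)
print(rng_again.random() == uniform_sample)   # True
```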

In [None]:
np.arange(0, 2, 0.1)

In [None]:
range(0, 8, 0.1)  # raises TypeError: range() only accepts integer arguments
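
The built-in `range` fails here, while `np.arange` handles fractional steps. When the number of points matters more than the exact step, `np.linspace` is a common alternative. A small sketch (`range_failed` and `points` are illustrative names):

```python
import numpy as np

# range() rejects a non-integer step
try:
    range(0, 8, 0.1)
    range_failed = False
except TypeError as error:
    range_failed = True
    print('TypeError:', error)

# np.linspace takes the number of points instead of a step size
points = np.linspace(0, 2, 20, endpoint=False)
print(range_failed)    # True
print(points.shape)    # (20,)
```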

# NumPy Data Types

In [None]:
arr = np.array(['apple', 'banana', 'cherry'])

print(arr.dtype)

<U6


In [None]:
arr = np.array([2, 5, 8])

print(arr.dtype)

int64
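
The string dtype `<U6` stands for Unicode strings of at most 6 characters (the length of the longest entry); the `<` marks little-endian byte order, which is what typical machines report. Existing arrays can be converted with `astype`, which returns a new array. A minimal sketch with illustrative values:

```python
import numpy as np

fruits = np.array(['apple', 'banana', 'cherry'])
print(fruits.dtype.kind)     # U  (Unicode string)

# astype returns a converted copy; float -> int truncates toward zero
floats = np.array([1.7, 2.2, -0.5])
as_ints = floats.astype(np.int64)
print(as_ints)               # [1 2 0]

# Strings holding numbers can be converted as well
numeric_strings = np.array(['2', '5', '8'])
as_numbers = numeric_strings.astype(np.int64)
print(as_numbers)            # [2 5 8]
```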


# Broadcasting

In [None]:
a = np.array([[1, 2, 3], [1, 2, 3]])
print(a)

b = 3
print(b)

# b is a scalar, so it is broadcast: added to every element of a
c = a + b
print(c)

Array Shape & Reshape

In [None]:
a.shape

In [None]:
a.reshape(-1,2)

In [None]:
a.reshape(1,6)

In [None]:
a.reshape(3,2)
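
In the reshape calls above, `-1` asks NumPy to infer that dimension from the total number of elements. Shapes also decide whether two arrays can be combined by broadcasting: dimensions are compared from the right and must either match or be 1. The sketch below illustrates both points; `row` and `column` are hypothetical arrays introduced for the example.

```python
import numpy as np

a = np.array([[1, 2, 3], [1, 2, 3]])   # shape (2, 3)

# -1 asks NumPy to infer that dimension: 6 elements / 2 columns = 3 rows
inferred = a.reshape(-1, 2)
print(inferred.shape)      # (3, 2)

# A shape (3,) row is broadcast across both rows of a
row = np.array([10, 20, 30])
print(a + row)

# A shape (2, 1) column is broadcast across the three columns of a
column = np.array([[100], [200]])
print(a + column)

# Incompatible shapes raise a ValueError
try:
    a + np.array([1, 2])
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)   # False
```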

Array Join

In [None]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)

[1 2 3 4 5 6]
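
For arrays with more than one dimension, `np.concatenate` takes an `axis` argument that selects the direction of the join. A short sketch with two illustrative 2x2 arrays, `m1` and `m2`:

```python
import numpy as np

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

# axis=0 appends rows, axis=1 appends columns
rows_joined = np.concatenate((m1, m2), axis=0)
cols_joined = np.concatenate((m1, m2), axis=1)
print(rows_joined.shape)   # (4, 2)
print(cols_joined.shape)   # (2, 4)

# np.stack joins along a new axis instead of an existing one
stacked = np.stack((m1, m2))
print(stacked.shape)       # (2, 2, 2)
```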


Array Split

In [None]:
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
print(newarr)

[array([1, 2]), array([3, 4]), array([5, 6])]
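
`np.array_split` also works when the array cannot be divided evenly, which is where it differs from `np.split`. A minimal sketch using a 7-element array chosen for illustration:

```python
import numpy as np

arr = np.arange(7)   # [0 1 2 3 4 5 6]

# array_split tolerates uneven division: earlier pieces get the extra items
pieces = np.array_split(arr, 3)
print([piece.tolist() for piece in pieces])   # [[0, 1, 2], [3, 4], [5, 6]]

# np.split insists on equal-sized pieces
try:
    np.split(arr, 3)
    split_ok = True
except ValueError:
    split_ok = False
print(split_ok)   # False
```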


Array Search

In [None]:
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x = np.where(arr == 4) # returns the indices where arr equals 4
print(x)

(array([3, 5, 6]),)
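
The result is a tuple because `np.where` returns one index array per dimension; for a 1-D array the indices are in element `[0]`. `np.where` also has a three-argument form that chooses between two values element by element. A small sketch (`indices`, `labels`, and `count` are illustrative names):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

# np.where(condition) returns a tuple with one index array per dimension
indices = np.where(arr == 4)[0]
print(indices)        # [3 5 6]

# With three arguments, np.where picks element-wise between two values
labels = np.where(arr % 2 == 0, 'even', 'odd')
print(labels)

# Counting the matches
count = np.count_nonzero(arr == 4)
print(count)          # 3
```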


----
### Summary

The benefits of using NumPy are:
- Size - NumPy data structures take up less space than Python lists.
- Performance - NumPy operations are faster than the equivalent operations on Python lists.
- Functionality - SciPy and NumPy have optimized functions, such as linear algebra operations, built in.
---
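
As a brief illustration of the built-in linear algebra mentioned in the summary, the sketch below multiplies a matrix by a vector and solves a small linear system. The matrix `A` and vector `b` are made-up example values.

```python
import numpy as np

# Illustrative 2x2 system A @ x = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Matrix-vector product with the @ operator
product = A @ np.array([1.0, 1.0])
print(product)                 # [3. 4.]

# Solve the linear system without forming the inverse explicitly
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))   # True
```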

# Exercises
(1) Create a NumPy array of zeros with the shape (3,4). Hint: check the `np.zeros` function https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.zeros.html

(2) Replace each element in the first column of the array from the first exercise with `1`.  
Task:
```python
[[0. 0. 0. 0.]    [[1. 0. 0. 0.] 
 [0. 0. 0. 0.] ->  [1. 0. 0. 0.] 
 [0. 0. 0. 0.]]    [1. 0. 0. 0.]]   
```

(3) Print out the shape of the above array.

(4) Select the fourth element in the second row of the following array.
```python
arr = np.array([[1, 2, 3, 9], [4, 5, 6, 12]])
```
(5) Find every occurrence of 'apple' in the following array and return the indices.
```python
arr = np.array(['apple', 'pear', 'banana', 'apple', 'cherry'])
```
(6) Using `np.random.randn` create an array of size `30000` and find the mean value of it. 

Solutions

In [None]:
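# Solutions
# (1) 3x4 array of zeros
zeros = np.zeros((3, 4))
print(zeros)

# (2) set every element of the first column to 1
zeros[:, 0] = 1
print(zeros)
# (3) shape of the array
print(zeros.shape)

# (4) fourth element in the second row
arr = np.array([[1, 2, 3, 9], [4, 5, 6, 12]])
print(arr[1, 3])

# (5) indices of every 'apple'
arr = np.array(['apple', 'pear', 'banana', 'apple', 'cherry'])
print(np.where(arr == 'apple'))

# (6) mean of 30000 samples from the standard normal distribution
N = np.random.randn(30000)
m = N.mean()
print(m)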