# Learning NumPy
NumPy is a Python library for working with arrays. The name stands for "Numerical Python".

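## Why do we use NumPy?
Python lists are slow for numerical work because every element is a separate Python object. NumPy arrays store elements of a single type in one contiguous block of memory, which makes element-wise operations much faster.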
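
As a rough illustration of the difference in usage, the sketch below applies the same arithmetic to a plain list and to an array. The variable names are made up for this example, and no timings are shown because they depend on the machine.

```python
import numpy as np

values = list(range(5))

# With a list, element-wise arithmetic needs an explicit loop or comprehension.
doubled_list = [v * 2 for v in values]

# With an ndarray, the operation is applied to every element at once.
doubled_arr = np.array(values) * 2

print(doubled_list)  # [0, 2, 4, 6, 8]
print(doubled_arr)   # [0 2 4 6 8]
```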

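# Installation of NumPy
NumPy can be installed with `pip install numpy`. Once installed, it is conventionally imported under the alias `np`.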

In [36]:
import numpy as np

# Creating arrays
The array object in NumPy is called `ndarray`. We can create a NumPy `ndarray` object by using the `array()` function.

In [37]:
arr = np.array([1, 2, 3, 4, 5])
print(arr)

[1 2 3 4 5]


We can pass a list, a tuple or any array-like object into the `array()` function, and it will be converted into an `ndarray`.

In [38]:
ex1 = np.array((1, 2, 3, 4, 5, 6, 7))
print(ex1)

[1 2 3 4 5 6 7]


## Make an array full of zeros

In [39]:
ex = np.zeros(3)
ex1= np.zeros((3,2))

print(ex)
print(ex1)

[0. 0. 0.]
[[0. 0.]
 [0. 0.]
 [0. 0.]]


## Make an array full of ones

In [40]:
ex = np.ones(3)
ex1 = np.ones((3,5))

print(ex)
print(ex1)

[1. 1. 1.]
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]


## Make an identity matrix

In [41]:
ex = np.eye(5)
print(ex)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


## Array of x evenly divided values from y to z (`np.linspace(y, z, x)`)

In [42]:
np.linspace(0, 5, 40)

array([0.        , 0.12820513, 0.25641026, 0.38461538, 0.51282051,
       0.64102564, 0.76923077, 0.8974359 , 1.02564103, 1.15384615,
       1.28205128, 1.41025641, 1.53846154, 1.66666667, 1.79487179,
       1.92307692, 2.05128205, 2.17948718, 2.30769231, 2.43589744,
       2.56410256, 2.69230769, 2.82051282, 2.94871795, 3.07692308,
       3.20512821, 3.33333333, 3.46153846, 3.58974359, 3.71794872,
       3.84615385, 3.97435897, 4.1025641 , 4.23076923, 4.35897436,
       4.48717949, 4.61538462, 4.74358974, 4.87179487, 5.        ])

## Array of values from x to less than y with step z (`np.arange(x, y, z)`)

In [43]:
np.arange(0, 400, 100)

array([  0, 100, 200, 300])
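
The two functions above differ in how they treat the end point, which is easy to mix up. A minimal sketch contrasting them on the same interval (the values are chosen only for illustration):

```python
import numpy as np

# linspace takes the number of samples and includes the end point by default.
a = np.linspace(0, 1, 5)

# arange takes the step size and excludes the end point.
b = np.arange(0, 1, 0.25)

print(a)  # [0.   0.25 0.5  0.75 1.  ]
print(b)  # [0.   0.25 0.5  0.75]
```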

## MxN array with all values X (`np.full((M,N), X)`)

In [44]:
np.full((2, 9), 6)

array([[6, 6, 6, 6, 6, 6, 6, 6, 6],
       [6, 6, 6, 6, 6, 6, 6, 6, 6]])

## MxN array with random numbers
- Random floats: `np.random.rand(M, N)`.
- Random integers between 0 and (x-1): `np.random.randint(x, size=(M,N))`.

In [45]:
np.random.rand(5, 6)
np.random.rand(5, 6) *100
np.random.randint(6, size=(4, 5))

array([[2, 2, 5, 5, 1],
       [3, 1, 3, 0, 4],
       [0, 5, 3, 1, 3],
       [5, 5, 0, 4, 2]])
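
The random values above change on every run. If repeatable numbers are needed, one option is NumPy's `Generator` interface with a fixed seed. The sketch below assumes a seed of 0, which is an arbitrary choice, and the variable names are illustrative.

```python
import numpy as np

# The seed value 0 is arbitrary; any fixed integer gives repeatable draws.
rng1 = np.random.default_rng(0)
rng2 = np.random.default_rng(0)

floats = rng1.random((2, 3))             # floats in [0, 1)
ints = rng1.integers(0, 6, size=(4, 5))  # integers from 0 to 5

# A second generator with the same seed reproduces the same sequence.
same_floats = rng2.random((2, 3))
print(np.array_equal(floats, same_floats))  # True
```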

## Dimensions in arrays
A dimension in an array is one level of array depth, that is, one level of nesting.

### 0-D Arrays

In [46]:
zerod_array = np.array(42)
print(zerod_array)

42


### 1-D Arrays

In [47]:
oned_array = np.array([1, 2, 3, 4, 5])
print(oned_array)

[1 2 3 4 5]


### 2-D Arrays

In [48]:
twod_array = np.array([[1, 2], [3, 4]])
print(twod_array)

[[1 2]
 [3 4]]


## Check number of dimensions
The attribute `ndim` returns an integer that tells us how many dimensions the array has.

In [49]:
print(zerod_array.ndim)
print(oned_array.ndim)
print(twod_array.ndim)

0
1
2


## Higher dimensional arrays
When creating an array, we can set the minimum number of dimensions with the `ndmin` argument of the `array()` function.

In [50]:
higher_da = np.array([1, 2, 3, 4], ndmin=5)
print(higher_da.ndim)

5


# Indexing
An array element can be accessed by referring to its index number. 

## Access 2-D arrays
To do this, we use comma-separated integers: the first selects the row (the index along the first dimension) and the second selects the element within that row.

In [51]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print(arr.ndim)
print(arr[1, 0])

2
6


## Access 3-D arrays

In [52]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr[0, 1, 2])

6
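
Two related indexing behaviours are worth knowing: negative indexes count from the end, and supplying fewer indexes than dimensions returns a sub-array. A short sketch reusing the 3-D array from above:

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr[-1, -1, -1])  # 12: last block, last row, last element
print(arr[1].shape)     # (2, 3): one index returns a 2-D sub-array
print(arr[1, 0])        # [7 8 9]: two indexes return a 1-D sub-array
```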


# Slicing
This works as in regular Python:
- [start:end]
- [start:end:step]

In [53]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:4])
print(arr[4:])
print(arr[:4])
print(arr[-3:-1])
print(arr[1:5:2])
print(arr[::2])

[2 3 4]
[5 6 7]
[1 2 3 4]
[5 6]
[2 4]
[1 3 5 7]


In [54]:
arr2 = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr2[1, 1:4])
print(arr2[0, 0:2])
print(arr2[0:2, 2])
print(arr2[0:2, 1:4])

[7 8 9]
[1 2]
[3 8]
[[2 3 4]
 [7 8 9]]


# Data Types
NumPy has some extra data types, and each one can be referred to with a single character:
- i, integer
- b, boolean
- u, unsigned integer
- f, float
- c, complex float
- O, object
- S, byte string
- U, unicode string

To check the data type of an array, use its `dtype` property.

In [55]:
arr = np.array([1, 2, 3, 4])
print(arr.dtype)

arr2 = np.array(['apple', 'banana', 'cherry'])
print(arr2.dtype)

int32
<U6


## Creating arrays with a defined data type
The function `array()` can take an additional argument to set the data type (`dtype=`).

In [56]:
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)

[b'1' b'2' b'3' b'4']
|S1
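
The default integer type depends on the platform and the NumPy version, so the `int32` shown earlier may appear as `int64` elsewhere. When the width matters, it can be requested explicitly. A minimal sketch (variable names are illustrative):

```python
import numpy as np

# Request a specific width instead of relying on the platform default.
arr64 = np.array([1, 2, 3, 4], dtype=np.int64)
arr_f32 = np.array([1, 2, 3, 4], dtype='float32')

print(arr64.dtype, arr64.itemsize)      # int64 8
print(arr_f32.dtype, arr_f32.itemsize)  # float32 4
```

Here `itemsize` is the number of bytes used by each element.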


## Converting data type on existing arrays
The best way to change the data type of an existing array is to make a copy of the array with the `astype()` method.

In [57]:
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype("i")
print(arr.dtype)
print(newarr.dtype)

float64
int32
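
Note that converting floats to integers with `astype()` discards the fractional part rather than rounding. A small sketch showing the difference, with values picked only for illustration:

```python
import numpy as np

arr = np.array([1.9, -1.9, 2.6])

truncated = arr.astype(int)          # fractional part is discarded
rounded = np.round(arr).astype(int)  # round first, then convert

print(truncated)  # [ 1 -1  2]
print(rounded)    # [ 2 -2  3]
```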


# Array Copy vs View
- `copy`: a new array. It owns its data, so changes made to the copy do not affect the original array, and changes made to the original array do not affect the copy.
- `view`: a view of the original array. It does not own the data, so changes made to the view affect the original array, and changes made to the original array affect the view.

In [58]:
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42

print(arr)
print(x)

[42  2  3  4  5]
[1 2 3 4 5]


In [59]:
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42

print(arr)
print(x) 

[42  2  3  4  5]
[42  2  3  4  5]


## Check if array owns its data
Every array has the attribute `base`, which is `None` if the array owns its data. Otherwise, `base` refers to the original object.

In [60]:
arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()

print(x.base)
print(y.base) 

None
[1 2 3 4 5]
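
The copy/view distinction also applies to indexing: basic slicing returns a view, while indexing with a list of positions returns a copy. The following sketch checks this with `base` and `np.shares_memory()`; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

part = arr[1:4]          # basic slicing returns a view
picked = arr[[1, 2, 3]]  # indexing with a list of positions returns a copy

part[0] = 99    # writes through to arr
picked[1] = -1  # does not affect arr

print(arr)                           # [ 1 99  3  4  5]
print(part.base is arr)              # True
print(np.shares_memory(arr, picked)) # False
```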


# Array shape
The shape of an array is the number of elements in each dimension. NumPy arrays have the attribute `shape`, which returns a tuple in which each entry is the number of elements along the corresponding dimension.

In [61]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape) # 2 dimensions: the first has 2 elements, the second has 4

(2, 4)


In [62]:
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print(arr.shape)

[[[[[1 2 3 4]]]]]
(1, 1, 1, 1, 4)


# Array reshaping
Reshaping means changing the shape of an array. With this we can add or remove dimensions or change the number of elements in each dimension, as long as the total number of elements stays the same.

## Reshape from 1-D to 2-D

In [63]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3) # 4 arrays, each with 3 elements

print(newarr)
print(newarr.shape)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
(4, 3)


## Reshape from 1-D to 3-D

In [64]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(2, 3, 2) # 2 arrays that contain 3 arrays, each with 2 elements

print(arr)
print(newarr)

[ 1  2  3  4  5  6  7  8  9 10 11 12]
[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]
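
Two details of `reshape()` that the examples above rely on: one dimension may be given as `-1` so that NumPy infers it, and the new shape must hold exactly the same number of elements. A minimal sketch:

```python
import numpy as np

arr = np.arange(1, 13)  # 12 elements

# -1 asks NumPy to infer that dimension from the total size.
shape_a = arr.reshape(3, -1).shape
shape_b = arr.reshape(2, 2, -1).shape
print(shape_a)  # (3, 4)
print(shape_b)  # (2, 2, 3)

# The total number of elements must stay the same.
reshape_failed = False
try:
    arr.reshape(5, 3)  # 15 != 12
except ValueError as err:
    reshape_failed = True
    print("cannot reshape:", err)
```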


In [65]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

## Flattening the arrays
This means converting a multidimensional array into a 1-D array. This is done using `reshape(-1)`.

In [66]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
newarr = arr.reshape(-1)

print(arr)
print(newarr)

[[1 2 3]
 [4 5 6]]
[1 2 3 4 5 6]
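
Besides `reshape(-1)`, NumPy also offers `flatten()` and `ravel()`. They differ in whether the result shares memory with the original, which the sketch below demonstrates (variable names are illustrative):

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()  # always returns a copy
flat_view = arr.ravel()    # returns a view when the memory layout allows it

flat_copy[0] = 100  # arr is unchanged
flat_view[1] = 200  # arr[0, 1] changes as well

print(arr)
# [[  1 200   3]
#  [  4   5   6]]
```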


# Array Iterating
Iterating works as in regular Python: a `for` loop over an array goes through its first dimension.

## Iterating arrays using nditer()
The function `nditer()` is a helper that visits every scalar element of an array, whatever its number of dimensions, so nested `for` loops are not needed.

In [67]:
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
for x in arr:
    print(x)

for x in np.nditer(arr):
    print(x)

[[1 2]
 [3 4]]
[[5 6]
 [7 8]]
1
2
3
4
5
6
7
8
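
When the position of each element is needed as well as its value, `np.ndenumerate()` is one option. A short sketch on a small 2-D array:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

# ndenumerate yields (index tuple, value) pairs.
for idx, value in np.ndenumerate(arr):
    print(idx, value)
# (0, 0) 1
# (0, 1) 2
# (1, 0) 3
# (1, 1) 4
```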


## Iterating array with different data types
We can use the `op_dtypes` argument to pass the expected data type and convert the elements while iterating.

Beware: NumPy does not change the data type of the elements in place, so it needs extra space to perform the conversion. That extra space is called a buffer, and to enable it in `nditer()` we pass `flags=['buffered']`.

In [68]:
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
    print(x)

b'1'
b'2'
b'3'
b'4'
b'5'
b'6'
b'7'
b'8'


# Joining arrays
Joining means putting contents of two or more arrays in a single array. In SQL we join tables based on a key, whereas in NumPy we join arrays by axes. 

We pass a sequence of arrays that we want to join to the `concatenate()` function, along with the axis.

In [69]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2), axis=0)
print(arr)

[1 2 3 4 5 6]


In [70]:
# Join along axis 1, so each row of arr2 is appended to the matching row of arr1:
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2), axis=1)
print(arr)

[[1 2 5 6]
 [3 4 7 8]]
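
`concatenate()` joins along an existing axis. To join along a new axis, or to use a shorthand for the common cases, NumPy also provides `stack()`, `vstack()` and `hstack()`. The sketch below compares the resulting shapes for two 1-D arrays:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

print(np.stack((arr1, arr2), axis=0).shape)  # (2, 3): new leading axis
print(np.stack((arr1, arr2), axis=1).shape)  # (3, 2): new trailing axis
print(np.vstack((arr1, arr2)).shape)         # (2, 3): stacked as rows
print(np.hstack((arr1, arr2)).shape)         # (6,): joined end to end
```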


# Splitting arrays
Splitting breaks one array into multiple arrays. We use `array_split()`, passing it the array we want to split and the number of splits.

In [74]:
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 4)

print(arr)
print(newarr)

[1 2 3 4 5 6]
[array([1, 2]), array([3, 4]), array([5]), array([6])]
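
NumPy also has a stricter `split()` function. Unlike `array_split()`, it raises an error when the array cannot be divided into equal parts, as this sketch shows:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

equal_parts = np.split(arr, 3)  # three equal parts of two elements each
print([part.tolist() for part in equal_parts])  # [[1, 2], [3, 4], [5, 6]]

split_failed = False
try:
    np.split(arr, 4)  # 6 elements cannot be divided into 4 equal parts
except ValueError as err:
    split_failed = True
    print("split failed:", err)

# array_split accepts the uneven case and makes the leading parts larger.
uneven_parts = [part.tolist() for part in np.array_split(arr, 4)]
print(uneven_parts)  # [[1, 2], [3, 4], [5], [6]]
```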


We can also specify the axis along which to split:

In [77]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.array_split(arr, 3, axis=1)

print(newarr) 

[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


# Searching arrays
You can search an array for a certain value and return the indexes where it matches. For this, we use the `where()` function.

Find the indexes where the value is 4:

In [80]:
arr = np.array([1,2,3,4,5,4,4])

x = np.where(arr==4)
print(x)

(array([3, 5, 6], dtype=int64),)


Find the indexes where the values are even:

In [81]:
arr = np.array([1,2,3,4,5,6,7,8])


x = np.where(arr%2==0)
print(x)

(array([1, 3, 5, 7], dtype=int64),)
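
If the matching values are wanted rather than their indexes, the boolean condition can be used directly as a mask. `where()` also accepts two extra arguments to choose between values element by element. A minimal sketch:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

mask = arr % 2 == 0
evens = arr[mask]                # the matching values themselves
zeroed = np.where(mask, arr, 0)  # keep even values, replace the rest with 0

print(evens)   # [2 4 6 8]
print(zeroed)  # [0 2 0 4 0 6 0 8]
```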


## Search sorted (assumes the array is already sorted)
The function `searchsorted()` performs a binary search* on the array and returns the index where the specified value would be inserted to maintain the sort order.

*binary search: an efficient algorithm for finding an item from a sorted list of items.

In [85]:
arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7.8)
print(x)

# The number 7.8 should be inserted at index 2 to preserve the sort order.
# The search starts from the left and returns the first index
# at which 7.8 is no longer larger than the existing value.

2


We can pass `side='right'` to return the rightmost index instead. For a value that already occurs in the array, this is the index just after its last occurrence:

In [88]:
arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7, side='right')
print(x) 

2


## Multiple values
To search for more than one value, use an array with the specified values.

In [89]:
arr = np.array([1, 3, 5, 7])

x = np.searchsorted(arr, [2, 4, 6])
print(x) 

[1 2 3]
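
The indexes returned by `searchsorted()` can be passed to `np.insert()` to merge new values into a sorted array while keeping it sorted. The sketch below assumes the new values are themselves given in ascending order; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 3, 5, 7])
new_values = [2, 4, 6]

# Positions are relative to the original array, which is what np.insert expects.
positions = np.searchsorted(arr, new_values)
merged = np.insert(arr, positions, new_values)

print(merged)  # [1 2 3 4 5 6 7]
```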


# Sorting arrays
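To sort an array, we use the function `sort()`. It returns a sorted copy and leaves the original array unchanged. For a 2-D array, each row is sorted separately.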

In [93]:
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))

arr = np.array(['banana', 'cherry', 'apple'])
print(np.sort(arr)) 

arr = np.array([True, False, True])
print(np.sort(arr)) 

arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr))

[0 1 2 3]
['apple' 'banana' 'cherry']
[False  True  True]
[[2 3 4]
 [0 1 5]]
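
A few common variations on sorting are sketched below: descending order, the indexes that would sort the array, sorting in place, and sorting along a different axis. The variable names are illustrative.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

print(np.sort(arr))        # [0 1 2 3]: sorted copy
print(arr)                 # [3 2 0 1]: original untouched
print(np.sort(arr)[::-1])  # [3 2 1 0]: descending order
print(np.argsort(arr))     # [2 3 1 0]: indexes that would sort arr

arr.sort()                 # sorts in place and returns None
print(arr)                 # [0 1 2 3]

grid = np.array([[3, 2, 4], [5, 0, 1]])
by_column = np.sort(grid, axis=0)  # sort each column instead of each row
print(by_column)
# [[3 0 1]
#  [5 2 4]]
```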
