<a href="https://colab.research.google.com/github/LuisR-jpg/School/blob/master/Machine%20Learning%20I/NumPy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# NumPy tutorial

> NumPy is a Python library.
>
> NumPy is used for working with arrays.
>
> NumPy is short for "Numerical Python".

See the reference at [W3Schools](https://www.w3schools.com/python/numpy/default.asp).

### NumPy Introduction

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. This is achieved by storing the data in one contiguous block of memory, unlike lists, which hold references to separately allocated Python objects.
The library is written partially in Python, and the parts that require fast computation are written in C or C++.
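
The speed difference can be checked with a rough timing sketch. The one below compares a sum of squares over a Python list and over an array. The workload, the size n and the two helper function names are assumptions chosen for illustration, and the measured ratio depends on the machine and the operation, so it should not be read as a confirmation of the 50x figure.

```python
import timeit

import numpy as np

n = 100_000  # assumed problem size, picked only to keep the run short
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)  # int64 so the squares cannot overflow


def list_sum_of_squares():
    # Pure Python: one interpreter step per element
    return sum(x * x for x in py_list)


def numpy_sum_of_squares():
    # NumPy: the multiplication and the sum run in compiled code
    return int(np.sum(np_arr * np_arr))


list_time = timeit.timeit(list_sum_of_squares, number=20)
numpy_time = timeit.timeit(numpy_sum_of_squares, number=20)
print(f"list: {list_time:.4f}s  numpy: {numpy_time:.4f}s")
```

Both functions return the same number; only the time taken differs.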

### NumPy getting started

    import numpy as np
*In Python, an alias is an alternate name for referring to the same thing.*
    
    print(np.__version__)
*The version string is stored in the \_\_version\_\_ attribute.*

### NumPy Creating Arrays

We can create a NumPy ***ndarray*** object by using the array() function.
To create an ndarray, we can pass a list, tuple or any array-like object to array(), and it will be converted into an ndarray.

#### Dimension in Arrays
- 0-D arrays are scalars.
- 1-D arrays are called uni-dimensional (one-dimensional) arrays.
- 2-D arrays are used to represent matrices, or 2nd-order tensors.
- 3-D arrays have matrices as their elements and are used to represent 3rd-order tensors.

The ***ndim*** attribute is an integer that tells us how many dimensions the array has.

*When the array is created, you can set the minimum number of dimensions with the ndmin argument.*

In [None]:
import numpy as np

#Creating 
uno = np.array([1, 2, 3, 4, 5])
print(uno)
print(type(uno))
dos = np.array((1, 2, 3, 4, 5))

#Dimensions
zeroD = np.array(42)
oneD = np.array([1, 2, 3, 4, 5])
twoD = np.array([[1, 2, 3], [4, 5, 6]])
threeD = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
fiveD = np.array([1, 2, 3, 4], ndmin=5)
print(zeroD.ndim)
print(oneD.ndim)
print(twoD.ndim)
print(threeD.ndim)
print(fiveD.ndim)

[1 2 3 4 5]
<class 'numpy.ndarray'>
0
1
2
3
5


### NumPy Array Indexing
#### Access Array Elements
Array indexing means accessing an array element by its index number. Indexes start at 0.

To access elements of a multi-dimensional array, use comma-separated integers, one index per dimension.

#### Negative Indexing
Use negative indexes to access elements from the end of an array; -1 refers to the last element.

In [None]:
import numpy as np

oneD = np.array([1, 2, 3, 4])
twoD = np.array([[1,2,3,4,5], [6,7,8,9,10]])
threeD = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])


#Access
print(oneD[0])
print('2nd element on 1st row: ', twoD[0, 1])
print(threeD[0, 1, 2])

#Negative indexing
print('Last element from 2nd dim: ', twoD[1, -1])

1
2nd element on 1st row:  2
6
Last element from 2nd dim:  10


### NumPy Array Slicing
#### Slicing Arrays
Slicing in Python means taking elements from one given index to another given index, *[start, end)*: the start index is included and the end index is excluded.

- We pass a slice instead of an index, like this: ***[start:end]***.

- We can also define the step, like this: ***[start:end:step]***.

*If we don't pass start, it is considered 0.*

*If we don't pass end, it is considered the length of the array in that dimension.*

*If we don't pass step, it is considered 1.*

#### Negative Slicing
Use the minus operator to refer to an index from the end.

#### Step
Use the step value to determine the stride of the slice, for example every second element.

#### Slicing Multi-dimensional Arrays
Slicing can be applied to any dimension, with one slice or index per dimension separated by commas.

In [None]:
import numpy as np

oneD = np.array([1, 2, 3, 4, 5, 6, 7])
twoD = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

#Slicing arrays
print(oneD[1:5])
print('Slice elements from index 4 to the end of the array:', oneD[4:])
print('From [beginning to end)', oneD[:4])

#Negative slicing
print('Slice from the end [third element, last element)', oneD[-3:-1])

#Step
print(oneD[1:5:2])
print(oneD[::2])

#Slicing 2-D
print(twoD[1, 1:4])
print(twoD[0:2, 2])
print(twoD[0:2, 1:4])

[2 3 4 5]
Slice elements from index 4 to the end of the array: [5 6 7]
From [beginning to end) [1 2 3 4]
Slice from the end [third element, last element) [5 6]
[2 4]
[1 3 5 7]
[7 8 9]
[3 8]
[[2 3 4]
 [7 8 9]]


### NumPy Data Types
#### Data types
##### Python
By default, Python has these data types:

- strings - used to represent text data; the text is written inside quotation marks. e.g. "ABCD"
- integer - used to represent integer numbers. e.g. -1, -2, -3
- float - used to represent real numbers. e.g. 1.2, 42.42
- boolean - used to represent True or False.
- complex - used to represent complex numbers. e.g. 1.0 + 2.0j, 1.5 + 2.5j

##### NumPy
Below is a list of all data types in NumPy and the characters used to represent them.

- i - integer
- b - boolean
- u - unsigned integer
- f - float
- c - complex float
- m - timedelta
- M - datetime
- O - object
- S - string
- U - unicode string
- V - fixed chunk of memory for other type (void)

#### Checking the Data Type of an Array
The NumPy array object has a property called ***dtype*** that returns the data type of the array.

#### Creating Arrays With a Defined Data Type
The array() function can take an optional argument, ***dtype***, that allows us to define the expected data type of the array elements.

*For i, u, f, S and U we can define size as well.*
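
As a small sketch of sized types: the size is written right after the type character, counting bytes for i, u and f and characters for S and U. The variable names below are illustrative.

```python
import numpy as np

small_int = np.array([1, 2, 3], dtype='i2')      # 2-byte signed integers
wide_float = np.array([1, 2, 3], dtype='f8')     # 8-byte floats
short_text = np.array(['a', 'bc'], dtype='U2')   # unicode strings of up to 2 characters

print(small_int.dtype, small_int.itemsize)          # int16 2
print(wide_float.dtype, wide_float.itemsize)        # float64 8
print(short_text.dtype.kind, short_text.itemsize)   # U 8 (4 bytes per character)
```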

#### What if a Value Can Not Be Converted?
If a value cannot be converted to the requested type, NumPy raises a **ValueError**.

#### Convert data type of existing array
The best way to change the data type of an existing array is to make a copy of the array with the astype() method.

The astype() method creates a copy of the array and takes the target data type as a parameter. The type can be given as a string, such as 'f' for float or 'i' for integer, or as a Python type, such as float or int.

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4])

#Checking data type
print(arr.dtype)

#Creating arrays with defined data type
arr = np.array([1, 2, 3, 4], dtype='S')
arr = np.array([1, 2, 3, 4], dtype='i2') 
print(arr.dtype)

#What if a value can't be converted?
try:
  arr = np.array(['a', '2', '3'], dtype='i')
except ValueError:
  print("See? I'm here")

#Converting Data Type on Existing Arrays
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype('i')
print(newarr, newarr.dtype)
newarr = arr.astype(int)
print(newarr, newarr.dtype)

int32
int16
See? I'm here
[1 2 3] int32
[1 2 3] int32


### The Difference Between Copy and View
- **Copy** is a duplicate of the array; it owns its data and is independent of the original.
- **View** is a reference to the original array's data; the two are connected, so a change made through one is visible through the other. The view does not own the data.

#### Check if Array Owns Its Data
Every NumPy array has the attribute ***base***, which is **None** if the array owns its data. Otherwise, the base attribute **refers** to the original object.
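
The practical consequence shows up when one of the arrays is modified. A minimal sketch, reusing the names arr, copy and view; the values 42 and 31 are arbitrary:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
copy = arr.copy()
view = arr.view()

arr[0] = 42    # change the original
view[1] = 31   # change the data through the view

print(arr.tolist())   # [42, 31, 3, 4, 5]
print(copy.tolist())  # [1, 2, 3, 4, 5], the copy is unaffected
print(view.tolist())  # [42, 31, 3, 4, 5], the view follows the original
```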


In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
copy = arr.copy()
view = arr.view()

#Check property
print(copy.base)
print(view.base)

None
[1 2 3 4 5]


### NumPy Array Shape
#### Shape of an array
The shape of an array is the number of elements in each dimension.

#### Get the Shape of an Array
NumPy arrays have an attribute called ***shape*** that returns a tuple in which each index holds the number of elements in the corresponding dimension.


In [None]:
import numpy as np

twoD = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
fiveD = np.array([1, 2, 3, 4], ndmin=5)

print(twoD.shape)
print(fiveD.shape)

(2, 4)
(1, 1, 1, 1, 4)


### NumPy Array Reshaping
#### Reshaping Arrays
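By reshaping we can add or remove dimensions, or change the number of elements in each dimension.
#### Can We Reshape Into any Shape?
Yes, as long as the total number of elements is the same in both shapes. For example, 12 elements can be reshaped into 4 x 3 but not into 3 x 3.
#### Returns Copy or View?
In the example here, the base attribute of the reshaped array is the original array, so reshape() returned a view. NumPy returns a view whenever the memory layout allows it, and a copy otherwise.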
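
A short sketch of both outcomes follows. It uses np.shares_memory(), which is not part of the original notes, to test whether two arrays use the same data; the names reshaped and flat are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

reshaped = arr.reshape(4, 3)
print(reshaped.base is arr)             # True
print(np.shares_memory(arr, reshaped))  # True

reshaped[0, 0] = 99                     # writing through the view...
print(arr[0])                           # 99, ...changes the original

# A transposed array is not laid out contiguously, so flattening it needs a copy
flat = reshaped.T.reshape(-1)
print(np.shares_memory(arr, flat))      # False
```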
#### Unknown Dimension
You are allowed to have one "unknown" dimension, meaning that you do not have to specify an exact number for one of the dimensions in the reshape method.

Pass -1 as the value, and NumPy will calculate this number for you.
#### Flattening the arrays
Flattening an array means converting a multidimensional array into a 1-D array.

We can use ***reshape(-1)*** to do this.



In [11]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

print('1D to 2D', '\n', arr.reshape(4, 3))
print('1D to 3D', '\n', arr.reshape(2, 3, 2))

#Can We Reshape Into any Shape?
try:
  print(arr.reshape(3, 3))
except ValueError:
  print("See? I'm here")

#Returns a view or a copy?
print(arr.reshape(4, 3).base)

#Unknown dimension
print(arr.reshape(2, 2, -1))

#Flattening the arrays
print(arr.reshape(-1))

1D to 2D 
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
1D to 3D 
 [[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]
See? I'm here
[ 1  2  3  4  5  6  7  8  9 10 11 12]
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
[ 1  2  3  4  5  6  7  8  9 10 11 12]


### NumPy Array Iterating
#### Iterating Arrays
Iterating means going through the elements one by one. With multi-dimensional arrays in NumPy, we can do this using the basic for loop of Python.

If we iterate over an n-D array, the loop yields its (n-1)-D sub-arrays one by one. To reach the scalars, we need one nested loop per dimension.
#### Iterating Arrays Using nditer()
The ***nditer()*** function is a helper that solves some basic issues we face in iteration, such as visiting every scalar without one nested loop per dimension.
##### Iterating on Each Scalar Element
    for x in np.nditer(arr):
      print(x)
##### Iterating Array With Different Data Types
We can use the ***op_dtypes*** argument and pass it the expected data type to convert the elements while iterating.

NumPy does not change the data type of the elements in place, so it needs extra space for the conversion. That extra space is a buffer, which is enabled with ***flags=['buffered']***.
##### Iterating With Different Step Size
We can slice the array with a step first and then iterate over the result.
#### Enumerated Iteration Using ndenumerate()
Sometimes we need the index of each element while iterating; the ***ndenumerate()*** function can be used for those use cases.

In [13]:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

#Iterating arrays
for x in arr:
  print(x)
for x in arr:
  for y in x:
    for z in y:
      print(z)
    
##nditer()
#Iterating on Each Scalar Element
for x in np.nditer(arr):
  print(x)
#Iterating Array With Different Data Types
for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
  print(x)
#Iterating With Different Step Size
for x in np.nditer(arr[:, :, ::2]):
  print(x)

#Enumerated Iteration Using ndenumerate()
for idx, x in np.ndenumerate(arr):
  print(idx, x)

[[1 2 3]
 [4 5 6]]
[[ 7  8  9]
 [10 11 12]]
1
2
3
4
5
6
7
8
9
10
11
12
1
2
3
4
5
6
7
8
9
10
11
12
b'1'
b'2'
b'3'
b'4'
b'5'
b'6'
b'7'
b'8'
b'9'
b'10'
b'11'
b'12'
1
3
4
6
7
9
10
12
(0, 0, 0) 1
(0, 0, 1) 2
(0, 0, 2) 3
(0, 1, 0) 4
(0, 1, 1) 5
(0, 1, 2) 6
(1, 0, 0) 7
(1, 0, 1) 8
(1, 0, 2) 9
(1, 1, 0) 10
(1, 1, 1) 11
(1, 1, 2) 12


### Joining NumPy Arrays 
Joining means putting the contents of two or more arrays into a single array.

We pass a sequence of arrays that we want to join to the ***concatenate()*** function, along with the axis.

*If axis is not explicitly passed, it is taken as 0.*

#### Joining Arrays Using Stack Functions

Stacking is the same as concatenation; the only difference is that stacking is done along a new axis.

*If axis is not explicitly passed, it is taken as 0.*
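
The difference is easiest to see in the resulting shapes. A minimal sketch with two 1-D arrays (the names a, b and the result variables are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

joined = np.concatenate((a, b))          # joins along the existing axis
stacked_rows = np.stack((a, b))          # new axis 0: each input becomes a row
stacked_cols = np.stack((a, b), axis=1)  # new axis 1: each input becomes a column

print(joined.shape)        # (6,)
print(stacked_rows.shape)  # (2, 3)
print(stacked_cols.shape)  # (3, 2)
```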

##### Stacking Along Rows
    hstack()
##### Stacking Along Columns
    vstack()
##### Stacking Along Height (depth)
    dstack()

In [5]:
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
print(np.concatenate((arr1, arr2), axis = 0))
print(np.concatenate((arr1, arr2), axis = 1))
print(np.stack((arr1, arr2), axis=1))
print(np.hstack((arr1, arr2)))
print(np.vstack((arr1, arr2)))
print(np.dstack((arr1, arr2)))

[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[1 2 5 6]
 [3 4 7 8]]
[[[1 2]
  [5 6]]

 [[3 4]
  [7 8]]]
[[1 2 5 6]
 [3 4 7 8]]
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[[1 5]
  [2 6]]

 [[3 7]
  [4 8]]]


### NumPy Splitting Array
#### Splitting NumPy Arrays
Splitting breaks one array into multiple.

We use ***array_split()*** for splitting arrays; we pass it the array we want to split and the number of splits.

*The return value is a list containing the resulting sub-arrays.*

*If the elements cannot be divided evenly among the splits, the sub-arrays at the end are made one element shorter.*
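
A sketch of the uneven case, splitting six elements four ways. For contrast it also calls np.split(), which is not covered in these notes and requires an even division.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

parts = np.array_split(arr, 4)
print([part.tolist() for part in parts])  # [[1, 2], [3, 4], [5], [6]]

# split() insists on equal sizes and raises an error otherwise
try:
    np.split(arr, 4)
except ValueError as error:
    print('split() refused:', error)
```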

#### Splitting 2-D Arrays

Use the same syntax when splitting 2-D arrays.

In addition, you can specify the axis along which to split.

Pass **axis=1** to split along the second axis, so that each sub-array holds a group of columns.

For 2-D arrays, an alternative is ***hsplit()***, which splits along that same axis.
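
A quick sketch comparing the two calls on a 6 x 3 array; the names by_axis and by_hsplit are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9],
                [10, 11, 12], [13, 14, 15], [16, 17, 18]])

by_axis = np.array_split(arr, 3, axis=1)
by_hsplit = np.hsplit(arr, 3)

# Both calls produce the same three single-column arrays
for left, right in zip(by_axis, by_hsplit):
    print(np.array_equal(left, right))  # True

print(by_hsplit[0].shape)  # (6, 1)
```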


In [4]:
import numpy as np

#Splitting NumPy Arrays
arr = np.array([1, 2, 3, 4, 5, 6])
print(np.array_split(arr, 3))

#Splitting 2-D Arrays
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
print(np.array_split(arr, 3))
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
print(np.array_split(arr, 3))
print(np.array_split(arr, 3, axis=1))

[array([1, 2]), array([3, 4]), array([5, 6])]
[array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]]), array([[ 9, 10],
       [11, 12]])]
[array([[1, 2, 3],
       [4, 5, 6]]), array([[ 7,  8,  9],
       [10, 11, 12]]), array([[13, 14, 15],
       [16, 17, 18]])]
[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


### NumPy Searching Arrays
#### Searching Arrays
You can search an array for a certain value, and return the indexes that get a match.

To search an array, use the ***where()*** function. It returns a tuple of index arrays, one per dimension.

#### Search Sorted
There is a function called searchsorted() which performs a binary search in the array and returns the index where the specified value would be inserted to maintain the sort order.

*searchsorted() assumes that the array is already sorted.*

By default the leftmost index is returned, but we can pass side='right' to return the rightmost index instead.

##### Multiple Values
To search for more than one value, pass an array with the specified values; the result is an array of insertion indexes.

In [9]:
import numpy as np

#Searching Arrays
arr = np.array([1, 2, 3, 4, 5, 4, 4])
print(np.where(arr == 4))
print(np.where(arr % 2 == 0))

#Search sorted
arr = np.array([6, 7, 8, 9])
print(np.searchsorted(arr, 7))
print(np.searchsorted(arr, 7, side='right'))

##Multiple values
print(np.searchsorted(arr, [2, 4, 6]))

(array([3, 5, 6]),)
(array([1, 3, 5, 6]),)
1
2
[0 0 0]


### NumPy Sorting Arrays
#### Sorting Arrays
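NumPy provides a function called ***sort()*** that sorts a given array.

*np.sort() returns a sorted copy of the array, leaving the original array unchanged.*

#### Sorting a 2-D Array
If you sort a 2-D array, each row is sorted independently, because the sort runs along the last axis by default.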
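
The sketch below contrasts np.sort() with the in-place ndarray method arr.sort(), and shows the axis argument on a 2-D array. The name grid is illustrative.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

sorted_copy = np.sort(arr)
print(sorted_copy.tolist(), arr.tolist())  # [0, 1, 2, 3] [3, 2, 0, 1]

arr.sort()           # the ndarray method sorts in place and returns None
print(arr.tolist())  # [0, 1, 2, 3]

grid = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(grid).tolist())          # [[2, 3, 4], [0, 1, 5]], each row sorted
print(np.sort(grid, axis=0).tolist())  # [[3, 0, 1], [5, 2, 4]], each column sorted
```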

In [12]:
import numpy as np

#Sorting Arrays
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
arr = np.array(['banana', 'cherry', 'apple'])
print(np.sort(arr))
arr = np.array([True, False, True])
print(np.sort(arr))

#Sorting 2D arrays
arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr))

[0 1 2 3]
['apple' 'banana' 'cherry']
[False  True  True]
[[2 3 4]
 [0 1 5]]


### NumPy Filter Array
#### Filtering Arrays

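Filtering means getting some elements out of an existing array and creating a new array from them.

*In NumPy, you filter an array using a boolean index list: an element is kept if the value at its index is True and dropped if it is False.*

#### Creating the Filter Array
Any boolean array of the same length works, whether it is hard-coded or built from a condition.

We can apply the condition directly to the array instead of looping over its elements, and it will work just as we expect: the comparison ***arr > 42*** yields one boolean per element.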
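
For comparison, here is a sketch of the loop that the direct condition replaces. The names filter_list and mask are illustrative.

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

# Loop version: test each element and collect the booleans by hand
filter_list = []
for element in arr:
    filter_list.append(bool(element > 42))

# Direct version: the comparison is applied to every element at once
mask = arr > 42

print(filter_list)                # [False, False, True, True]
print(arr[filter_list].tolist())  # [43, 44]
print(arr[mask].tolist())         # [43, 44]
```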

In [13]:
import numpy as np

arr = np.array([41, 42, 43, 44])

#Filtering Arrays
x = [True, False, True, False]
print(arr[x])

#Creating the filtering array
filter_arr = arr > 42
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)

[41 43]
[False False  True  True]
[43 44]


# NumPy Random

### Random Numbers in NumPy
#### Generate Random Number
NumPy offers the random module to work with random numbers. Its ***randint()*** method returns a random integer; randint(100) draws from 0 to 99.

#### Generate Random Float
The random module's ***rand()*** method returns a random float in the interval [0, 1).

#### Generate Random Array
##### Integers
The ***randint()*** method takes a size parameter where you can specify the shape of an array.

##### Floats
The ***rand()*** method also allows you to specify the shape of the array by passing the dimensions as arguments.

#### Generate Random Number From Array
The ***choice()*** method takes an array as a parameter and randomly returns one of the values.

Add a size parameter to return an array of that shape filled with values drawn from the input.
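
Because these calls return different values on every run, it can help to fix the seed while experimenting. The sketch below uses random.seed(), which is not covered in these notes; the helper draw_samples and the seed value 0 are hypothetical names and choices for illustration.

```python
from numpy import random


def draw_samples(seed):
    # Hypothetical helper: reseed the generator, then draw one value of each kind
    random.seed(seed)
    number = random.randint(100)                      # integer from 0 to 99
    fraction = random.rand()                          # float in [0, 1)
    table = random.randint(100, size=(3, 5))          # 3 x 5 array of integers
    picks = random.choice([3, 5, 7, 9], size=(3, 5))  # 3 x 5 array of chosen values
    return number, fraction, table, picks


first = draw_samples(0)
second = draw_samples(0)

# The same seed reproduces the same draws
print(first[0] == second[0], first[1] == second[1])  # True True
print(first[2].shape, first[3].shape)                # (3, 5) (3, 5)
```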

In [18]:
from numpy import random

#Random number
print(random.randint(100))

#Random float
print(random.rand())

#Generate Random Array
print(random.randint(100, size=(5)))
print(random.randint(100, size=(3, 5)))
print(random.rand(5))

#Random from array
print(random.choice([3, 5, 7, 9]))
print(random.choice([3, 5, 7, 9], size=(3, 5)))

50
0.6288326957631435
[92 61  9 23 92]
[[91 36 22 52 49]
 [20 21 77  8 74]
 [ 9 69 90 74 70]]
[0.8005961  0.61209764 0.04841144 0.82765982 0.45989815]
7
[[9 9 5 3 9]
 [3 5 5 9 3]
 [3 7 9 9 5]]


### Random Data Distribution
#### Random Distribution
A random distribution is a set of random numbers that follow a certain probability density function.

The **choice()** method lets us specify the probability of each value through the ***p*** parameter.

Each probability is a number between 0 and 1, where 0 means that the value will never occur and 1 means that the value will always occur.

*The probabilities must sum to 1.*
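
One way to check that the probabilities are respected is to draw a large sample and count how often each value appears. This is a sketch: the sample size and the seed are assumptions, and np.unique() with return_counts=True is used only for counting.

```python
import numpy as np
from numpy import random

random.seed(0)  # assumed seed, only to make the run repeatable
sample = random.choice([3, 5, 7, 9], p=[0.1, 0.3, 0.6, 0.0], size=10_000)

# Observed share of each value; 9 has probability 0, so it never shows up
values, counts = np.unique(sample, return_counts=True)
for value, count in zip(values, counts):
    print(value, count / sample.size)

# Probabilities that do not sum to 1 are rejected
try:
    random.choice([3, 5, 7, 9], p=[0.5, 0.3, 0.6, 0.0])
except ValueError as error:
    print('Rejected:', error)
```

The observed shares should land close to 0.1, 0.3 and 0.6, with small deviations that shrink as the sample grows.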



In [19]:
from numpy import random

#Random distribution
print(random.choice([3, 5, 7, 9], p=[0.1, 0.3, 0.6, 0.0], size=(100)))

[5 3 5 7 5 7 5 7 7 7 7 7 5 5 7 7 3 7 5 5 7 7 5 5 7 3 7 7 5 5 7 7 7 7 7 7 7
 5 7 7 5 5 5 7 7 7 7 3 7 5 7 7 7 3 7 7 5 7 3 5 7 5 5 5 7 7 5 7 7 7 7 5 7 5
 7 5 3 5 3 7 7 7 7 3 7 7 5 7 5 7 7 5 7 7 7 5 7 3 7 7]


### Random Permutations
#### Random Permutations of Elements
A permutation refers to an arrangement of elements.

The NumPy Random module provides two methods for this: shuffle() and permutation().

##### Shuffling Arrays
The shuffle() method rearranges the elements in place, changing the original array.

##### Permuting Arrays
The permutation() method returns a rearranged array and leaves the original unchanged.
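
The difference can be verified directly. In this sketch the names permuted, after_permutation and result are illustrative, and no seed is set, so the actual order varies between runs.

```python
import numpy as np
from numpy import random

arr = np.array([1, 2, 3, 4, 5])

permuted = random.permutation(arr)
after_permutation = arr.tolist()
print(after_permutation)          # [1, 2, 3, 4, 5], the original is untouched
print(sorted(permuted.tolist()))  # [1, 2, 3, 4, 5], same elements in some order

result = random.shuffle(arr)
print(result)                     # None, the shuffle happened inside arr
print(sorted(arr.tolist()))       # [1, 2, 3, 4, 5]
```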


In [20]:
from numpy import random
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

#Shuffle
random.shuffle(arr)
print(arr)

#Permutation
print(random.permutation(arr))

[4 3 2 1 5]
