<a href="https://colab.research.google.com/github/Afferent-Learning/Intro-to-the-Machine-Learning-Pipeline/blob/master/(C5)%202D%20Data%20Management%20in%20Numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 2-Dimensional Data Management in NumPy
This notebook serves as an introduction to *NumPy* and *matrix mathematics*.

## What's 2D Data Anyways?
2D data is simply data that is referenced by row and column. Think of it like values inside a spreadsheet, or pixels inside an image. Storing data in 2 dimensions organizes information in a way that supports the analysis techniques we will be using in machine learning.

### Common Interchangeable Terms

- 1D Data = Vector = Array = List = Rank-1 Data

- 2D Data = Matrix = 2D Array = 2D List = Rank-2 Data



## Python List Operations
Before we jump straight into 2D data, let's make sure we fully understand 1D.

### Concatenation
The `+` operator does not add lists element by element; instead it combines them into a single longer list.

In [0]:
l1 = [1,2,3,4]
l2 = [5,6,7,8]

l1 + l2

### Addition
To perform addition we have to reference each element in each list. Create l3 by adding the elements of l1 and l2 together: inside the loop, use the `append` method to add to l3 the sum of the elements of l1 and l2 at the same index.

*Hint: output should be [6,8,10,12]*

In [0]:
l3 = []
for i in range(0,len(l1)):
    l3.append()

l3
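
One possible way to complete the exercise above is sketched here. It uses fresh names (`first`, `second`, `summed`), which are not part of the notebook, so that it does not overwrite l1, l2, or l3.

```python
# Hypothetical names; the notebook itself uses l1, l2 and l3.
first = [1, 2, 3, 4]
second = [5, 6, 7, 8]

summed = []
for i in range(len(first)):
    # Add the two elements that share index i and append the result.
    summed.append(first[i] + second[i])

print(summed)  # [6, 8, 10, 12]
```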

### Quick Modulus Review
The modulus operator `%` returns the remainder left over after dividing the left number by the right. For two integers the result is also an integer.

In [0]:
10 % 3

In [0]:
10 % 2

In [0]:
10 % 5

In [0]:
10 % 6

In [0]:
100 % 22
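
As a quick check on the cells above, the remainder can be tied back to floor division. This is a small sketch; the names `quotient` and `remainder` are chosen here for illustration only.

```python
a, b = 100, 22

quotient = a // b    # floor division: how many whole times b fits into a
remainder = a % b    # modulus: what is left over

# The two results always rebuild the original number.
assert quotient * b + remainder == a
print(quotient, remainder)  # 4 12
```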

### List Comprehension
The following line builds a list using a loop and a conditional expression contained between brackets. The loop iterates from 1 (inclusive) to 10 (exclusive). For each value `x`, the condition `x % 2` is evaluated: if the remainder of `x` divided by 2 is nonzero, then `x` itself is added to the list. This works because Python, like C, lets a number stand in for a truth value: 0 is treated as False and any nonzero number is treated as True. Therefore every odd number from 1 to 9 is added unchanged, while the else branch adds each even number multiplied by 100. Because the line uses `+=`, the new values are appended to the end of the existing l3 rather than replacing it.

In [0]:
l3 += [ x if (x % 2) else (x * 100) for x in range(1, 10) ]
l3
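
To make the comprehension easier to read, here is a sketch of the same logic written as an ordinary loop. It collects into a separate list named `values` (a name introduced here, not used elsewhere in the notebook) so that l3 is left alone.

```python
values = []
for x in range(1, 10):
    if x % 2:                    # remainder 1 is truthy, so x is odd
        values.append(x)
    else:                        # remainder 0 is falsy, so x is even
        values.append(x * 100)

print(values)  # [1, 200, 3, 400, 5, 600, 7, 800, 9]
```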

## Using NumPy
"Python’s lists are efficient general-purpose containers. They support efficient for insertion, deletion, appending, and concatenation, and Python’s list comprehensions make them easy to construct and manipulate. However, they have certain limitations: they don’t support “vectorized” operations like elementwise addition and multiplication, and the fact that they can contain objects of differing types mean that Python must store type information for every element, and must execute type dispatching code when operating on each element. This also means that very few list operations can be carried out by efficient C loops – each iteration would require type checks and other Python API bookkeeping." - [Shivam Kohli](https://www.quora.com/What-are-some-advantages-of-numpy-over-regular-lists-in-Python)


### NumPy Array Creation

In [0]:
import numpy as np

arr1 = np.array([1, 2, 3])
arr1

In [0]:
arr3 = np.zeros(5)
arr3
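
To make the "vectorized" operations mentioned in the quote at the start of this section concrete, the sketch below contrasts `+` on lists with `+` on arrays. The variable names are illustrative and are not used elsewhere in the notebook.

```python
import numpy as np

list_a = [1, 2, 3, 4]
list_b = [5, 6, 7, 8]
print(list_a + list_b)   # lists concatenate: [1, 2, 3, 4, 5, 6, 7, 8]

arr_a = np.array(list_a)
arr_b = np.array(list_b)
arr_sum = arr_a + arr_b  # arrays add element by element, no loop needed
print(arr_sum)           # [ 6  8 10 12]
```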

### Shape

In [0]:
arr1.shape

### List to Array Conversion

In [0]:
arr2 = np.array(l3)
arr2

### Array Indexing and Modifications

In [0]:
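print(arr2[0], arr2[3], arr2[6])   # read single elements by index
del_idx = [1, 2, 3, 4]             # indices of the elements to remove
arr2 = np.delete(arr2, del_idx)    # np.delete returns a new array
arr2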

### 1D to 2D Conversion

In [0]:
arr_2d = np.reshape(arr2, (-1,3))
arr_2d
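
In the reshape call above, the `-1` asks NumPy to work out that dimension from the length of the array. The sketch below shows this on a small stand-in array (`flat` and `grid` are names made up for this example). The length must divide evenly by the fixed dimension, otherwise NumPy raises a `ValueError`.

```python
import numpy as np

flat = np.arange(6)               # [0 1 2 3 4 5]
grid = np.reshape(flat, (-1, 3))  # 3 columns; the row count is inferred

print(grid.shape)  # (2, 3)
```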

### Adding Rows

In [0]:
arr_2d = np.vstack((arr_2d, arr1))
arr_2d

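### Transposing an Array
Transposing a 1D array has no effect, so the array must be made 2D first. Here it is converted to a matrix, which is always 2D, before calling the transpose function.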

In [0]:
arr1 = np.append(arr1, 4)
mtx1 = np.matrix(arr1)
mtx1 = mtx1.transpose()
mtx1
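
The NumPy documentation no longer recommends the `np.matrix` class for new code, so it may help to see the same idea with plain arrays. This is an alternative sketch, not the approach the notebook uses; `row` and `column` are illustrative names.

```python
import numpy as np

row = np.array([1, 2, 3, 4])
print(row.T.shape)            # (4,): transposing a 1D array changes nothing

column = row.reshape(-1, 1)   # make it 2D with a single column
print(column.shape)           # (4, 1)
```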

### Adding Columns

In [0]:
arr_2d = np.hstack((arr_2d, mtx1))
arr_2d

## Matrix Mathematics with Numpy

### Dot Product

The dot product is what most people mean when they say matrix multiplication. In this type of multiplication, each row in the left matrix is multiplied element by element with each column in the right matrix, and the products are added together. For two matrices to be multiplied, the left matrix must have the same number of columns as the right matrix has rows.

![Dot Product](https://drive.google.com/uc?export=view&id=1ieCBD4WQeMIHkDwvZzDa2W1k-q2xFeLY)

Let's try performing the dot product on a pair of vectors.

In [0]:
a = np.array([1,2,3])
b = np.array([4,5,6])
c = np.dot(a,b)
c
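
To see what `np.dot` did with the two vectors, the sketch below computes the same value by hand as a sum of elementwise products. The names `u`, `v`, and `by_hand` are introduced only for this example.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# Multiply matching elements, then add: 1*4 + 2*5 + 3*6
by_hand = sum(x * y for x, y in zip(u, v))

print(by_hand, np.dot(u, v))  # 32 32
```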

### Transposing 2D Data
Notice that **a** and **b** were both 1D arrays, which have no row or column orientation. For a pair of 1D arrays of equal length, `np.dot` computes the sum of the elementwise products, so no transposing is needed. Matrices are stricter. Run the cell below: you should see an error saying that the shapes are not aligned. Try **transposing** matrix **b** to align the shapes.

In [0]:
a = np.matrix([[1,2,3],[4,5,6]])
b = np.matrix([[6,5,4],[3,2,1]])
c = np.dot(a,b)
c
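
One way to align the shapes is sketched below. It uses plain 2D arrays rather than `np.matrix`, and the names `left`, `right`, and `product` are chosen here so that the notebook's own a, b, and c are not overwritten.

```python
import numpy as np

left = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)
right = np.array([[6, 5, 4], [3, 2, 1]])   # shape (2, 3)

# np.dot(left, right) would fail: 3 columns on the left, 2 rows on the right.
# Transposing the right operand gives shape (3, 2), so the inner sizes match.
product = np.dot(left, right.T)            # (2, 3) dot (3, 2) gives (2, 2)

print(product)
# [[28 10]
#  [73 28]]
```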

### Hadamard Product

The Hadamard product is a much simpler concept than the dot product. The matrices must be the **same** shape, and you simply multiply each element of the left matrix by the element in the corresponding location of the right matrix.

![hadamard_product.png](https://drive.google.com/uc?export=view&id=1NLeBfJII1acRSeAx38U51XR9ay63aTlS)

Use the **multiply** function from NumPy to multiply the two matrices declared in the previous cell

In [0]:
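# Undo the transpose applied to b in the previous exercise so that a and b
# are both 2x3 again. Skip this line if b was never reassigned.
b = b.transpose()
c = np.multiply(a, b)
c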
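
With plain NumPy arrays, the `*` operator gives the same elementwise result as `np.multiply`. On `np.matrix` objects, `*` means matrix multiplication instead, which is one reason the cell above calls `np.multiply` explicitly. A short sketch with illustrative names `p`, `q`, and `hadamard`:

```python
import numpy as np

p = np.array([[1, 2, 3], [4, 5, 6]])
q = np.array([[6, 5, 4], [3, 2, 1]])

hadamard = p * q   # elementwise on arrays, same as np.multiply(p, q)

print(hadamard)
# [[ 6 10 12]
#  [12 10  6]]
```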

Now make two arrays *a* and *b* using NumPy, multiply them using the Hadamard product, and store the result in *c*.

In [0]:
a = 
b = 
c = 
c