## NumPy
NumPy is a Python library that provides an array data structure designed to make mathematical computations faster.

##### Array declaration

A regular Python list can be written as follows:
```
import numpy as np

arr = [1, 2, 3, 4, 5, 6]
# We can convert the list to a NumPy array like this:
np_arr = np.array(arr)
```
Doesn't look like anything special?
Read on!

NumPy arrays are meant for numerical and matrix operations, and they get their speed from [vectorisation](https://en.wikipedia.org/wiki/Vectorization). Looping over a list element by element in Python can take a lot of time. A vectorised operation is applied to the whole array at once, so matrix operations run a lot faster. This is useful in scientific computing, machine learning and data processing.
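To make the idea concrete, here is a minimal sketch that computes the same result twice, once with an explicit loop and once with a single vectorised expression. The values and variable names are made up for illustration.

```python
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]  # illustrative data, not from the benchmark

# Loop version: handle one element at a time in Python.
loop_result = []
for v in values:
    loop_result.append(v * v + 1)

# Vectorised version: one expression applied to the whole array.
values_np = np.array(values)
vector_result = values_np * values_np + 1

print(loop_result)             # [2.0, 5.0, 10.0, 17.0]
print(vector_result.tolist())  # [2.0, 5.0, 10.0, 17.0]
```

Both versions give the same numbers; the difference is that the vectorised one does its looping inside NumPy rather than in Python.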

In [1]:
# Import numpy as follows to get started.
import numpy as np

#### Speed
As mentioned, speed is a major advantage of NumPy arrays. Let us verify this by doubling every value, first in a regular Python list with a loop and then in a NumPy array with a single vector operation.

In [20]:
# We define a list which contains the numbers from 1 to 10,000,000.
arr = []
for i in range(1, 10000001):
    arr.append(i)

In [17]:
# We create a NumPy array from the same list.
numpy_arr = np.array(arr)

In [18]:
# We are going to double the values in both the list and the NumPy array and time each operation.
import time

In [29]:
# For a regular Python list

start_time = time.time()

# Note: this loop doubles only the first 1,000,000 of the 10,000,000 elements.
for i in range(0, 1000000):
    arr[i] = 2*arr[i]

total_time_norm = time.time()-start_time
print('Ending time = {}'.format(total_time_norm))

Ending time = 0.11598682403564453


In [30]:
# For a numpy array.

start_time = time.time()

numpy_arr = 2*numpy_arr

total_time_np = time.time()-start_time
print('Ending time = {}'.format(total_time_np))

Ending time = 0.033846378326416016


In [31]:
total_time_norm/total_time_np

3.4268607092038716

##### Result
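As we can see, the vectorised operation took roughly 3.4 times less time in this run. Note that the loop only doubled the first 1,000,000 elements of the list while the NumPy operation doubled all 10,000,000, so the real gap is larger than this ratio suggests. It makes sense to process numerical data with NumPy instead of a regular list.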
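For a like-for-like comparison, here is a minimal sketch that doubles the same number of elements on both sides. The size `n` and the variable names are assumptions chosen for illustration, and it uses `time.perf_counter()` instead of `time.time()` because it is better suited to short measurements. Timings vary by machine, so no output is shown.

```python
import time

import numpy as np

n = 1_000_000  # illustrative size; both sides process all n elements

py_list = list(range(1, n + 1))
np_arr = np.arange(1, n + 1)

# Double every element of the list with an explicit loop.
start = time.perf_counter()
for i in range(n):
    py_list[i] = 2 * py_list[i]
list_seconds = time.perf_counter() - start

# Double every element of the array with one vectorised operation.
start = time.perf_counter()
np_arr = 2 * np_arr
numpy_seconds = time.perf_counter() - start

print('list loop = {:.4f} s, numpy = {:.4f} s'.format(list_seconds, numpy_seconds))
```

Running it a few times and comparing the two numbers gives a fairer picture than a single run.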

#### Other features

Besides vector math, the other main feature of NumPy is its built-in functions for computing statistical values like the mean, median, sum, etc.

In [34]:
#Computing the mean
numpy_arr.mean()

160000016.0
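The other statistics work the same way. Here is a small sketch on a six-element array, which is easier to check by hand than the large one; the array and variable names are only for illustration.

```python
import numpy as np

small_arr = np.array([1, 2, 3, 4, 5, 6])

mean_value = small_arr.mean()        # average of all elements
median_value = np.median(small_arr)  # middle value (average of 3 and 4 here)
total = small_arr.sum()              # sum of all elements

print(mean_value, median_value, total)  # 3.5 3.5 21
```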

You can read more in the [NumPy documentation](https://docs.scipy.org/doc/). If that looks too overwhelming, you can refer to the free Udacity course I mentioned earlier, which gives a very detailed overview of NumPy. The rest comes with practice.

## Pandas

NumPy just made computation faster. Can life get any easier? Yes, it can, and that is where pandas comes into play.
Pandas provides a data structure called the DataFrame for storing spreadsheet-style data. It is built on top of NumPy, so it keeps much of that speed while giving much more functionality and an easy syntax. How cool is that? Let's dive in.
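As a first taste, here is a minimal sketch of a DataFrame built by hand. The column names and values are made up for illustration.

```python
import pandas as pd

# A tiny spreadsheet-style table: one row per student (hypothetical data).
df = pd.DataFrame({
    'name': ['a', 'b', 'c'],
    'score': [10, 20, 30],
})

mean_score = df['score'].mean()    # column statistics, as in NumPy
high_scores = df[df['score'] > 15]  # keep only rows matching a condition

print(mean_score)                  # 20.0
print(list(high_scores['name']))   # ['b', 'c']
```

Each column behaves much like a NumPy array, and the table as a whole can be filtered without writing a loop.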

#### Loading CSVs and making our lives easier

The last notebook was all about opening and reading the document and processing it with loops after loops after loops!
Well, in this notebook, we will show you how you can do that in a single line.
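As a rough sketch of what that single line can look like, `pd.read_csv` parses CSV data straight into a DataFrame. The CSV text below is hypothetical and is read from an in-memory string so the example runs on its own; with a real file you would pass its path instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "name,score\nalice,10\nbob,20\n"

# The single line that replaces the manual open/read/loop code.
df = pd.read_csv(io.StringIO(csv_text))

print(list(df.columns))   # ['name', 'score']
print(df['score'].sum())  # 30
```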