# Jupyter Notebook and NumPy introduction

Jupyter Notebook is widely used by data scientists who work in Python. It is loosely based on Mathematica and combines code, text, and visual output on one page.

## 1. Basic Jupyter Notebook commands

Some relevant shortcuts:
* ```SHIFT + ENTER``` executes one block of code, called a cell
* Tab completion is available for a package once the cell that imports it has been executed
* ```SHIFT + TAB``` shows extra information on the parameters a function takes
* Pressing ```SHIFT + TAB``` repeatedly shows even more information

To get used to these shortcuts, try them out on the cell below.

In [None]:
print('Hello world!')
print(list(range(5)))

## 2. Parts to be implemented

In cells like the following example you are expected to implement some code. The remainder of the tutorial won't work if you skip these.

Sometimes assertions are added as a check.

In [None]:
### BEGIN SOLUTION
three = 3
### END SOLUTION
# three = ?
assert three == 3
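For readers new to `assert`: the statement does nothing when its condition is true and raises an `AssertionError` otherwise. A minimal sketch, where the variable `four` and the message text are made up for illustration:

```python
four = 2 + 2

# Passes silently because the condition is true.
assert four == 4

# A failing assertion raises AssertionError; the optional message
# after the comma is attached to the error.
try:
    assert four == 5, "four should equal 5"
except AssertionError as err:
    message = str(err)
    print("Assertion failed:", message)
```

In the exercise cells, a failing assertion therefore stops the cell with an error, which signals that the solution still needs work.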

## 3. NumPy arrays

We'll often be working with NumPy arrays, so here's a very short introduction.

In [None]:
import numpy as np

# This is a two-dimensional NumPy array
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr)

# The shape is a tuple describing the size of each dimension
print("shape=" + str(arr.shape))
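Besides `shape`, a few other attributes are handy when inspecting an array. A short sketch, with the same array redefined so the snippet runs on its own:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr.ndim)   # number of dimensions
print(arr.size)   # total number of elements
print(arr.shape)  # size of each dimension
print(arr[1, 2])  # element in row 1, column 2 (indices are zero-based)
```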

In [None]:
# np.reshape changes the shape of an array while keeping the underlying data.
# One dimension can be left unspecified by passing -1; it is then inferred from the size of the data.

print("As 4x2 matrix")
print(np.reshape(arr, (4, 2)))

print("\nAs 8x1 matrix")
print(np.reshape(arr, (-1, 1)))

print("\nAs 2x2x2 array")
print(np.reshape(arr, (2, 2, -1)))
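A quick way to check what `-1` resolved to is to look at the shape of the result. The sketch below also confirms that reshaping keeps the elements in the same order; the names `column` and `cube` are chosen here for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

column = np.reshape(arr, (-1, 1))   # 8 elements / 1 column -> 8 rows
cube = np.reshape(arr, (2, 2, -1))  # 8 elements / (2 * 2) -> last dimension is 2

print(column.shape)
print(cube.shape)

# Flattening the reshaped array gives back the original element order.
print(np.array_equal(cube.ravel(), arr.ravel()))
```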

Basic arithmetic operations on arrays of the same shape are performed elementwise:

In [None]:
x = np.array([1.,2.,3.])
y = np.array([4.,5.,6.])

print(x + y)
print(x - y)
print(x * y)
print(x / y)
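The same rule holds for arrays with more than one dimension, as long as the shapes match. In particular, `*` multiplies matching entries and is not a matrix product. A small sketch with made-up values:

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])
b = np.array([[10., 20.], [30., 40.]])

total = a + b    # each entry of a is added to the matching entry of b
product = a * b  # entry-by-entry product, not matrix multiplication

print(total)
print(product)
```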

## 4. Data import and inspection (optional)

[Pandas](http://pandas.pydata.org/) is a popular library for data wrangling. We'll use it to load and inspect a CSV file that contains the historical web request rates and CPU usage of a web server:

In [None]:
import pandas as pd

# Use the first column as the index and parse it as dates.
data = pd.read_csv("data/request_rate_vs_CPU.csv", index_col=0, parse_dates=True)

The `head()` method gives a quick view of the structure of the loaded data:

In [None]:
data.head()

We can select the CPU column and plot the data:

In [None]:
data.plot(figsize=(13,8), y="CPU")

To show the plot, we need to import `matplotlib.pyplot` and call its `show()` function.

In [None]:
import matplotlib.pyplot as plt
plt.show()

Next we plot the request rates, leaving out the CPU column because it is measured in a different unit:

In [None]:
data.drop('CPU', axis=1).plot(figsize=(13, 8))
plt.show()

To start modeling the data, we'll work with plain NumPy arrays. In doing so we also drop the time information that was visible in the plots above.

We extract the column labels as `request_names` for later reference:

In [None]:
request_names = data.drop('CPU', axis=1).columns.values
request_names

We extract the request rates as a two-dimensional NumPy array:

In [None]:
request_rates = data.drop('CPU', axis=1).values
request_rates

And the CPU usage as a one-dimensional NumPy array:

In [None]:
cpu = data['CPU'].values
cpu
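To see what shapes these three objects have without the CSV file at hand, the sketch below repeats the extraction on a tiny hand-made DataFrame. The column names `login` and `search` and all the numbers are invented for illustration and do not come from the real data set.

```python
import pandas as pd

# Hypothetical stand-in for the loaded data: two request types plus CPU.
toy = pd.DataFrame({
    "login": [10, 12, 9],
    "search": [100, 140, 90],
    "CPU": [20.0, 27.5, 18.0],
})

toy_names = toy.drop("CPU", axis=1).columns.values
toy_rates = toy.drop("CPU", axis=1).values
toy_cpu = toy["CPU"].values

print(toy_names)        # labels of the request columns
print(toy_rates.shape)  # (number of rows, number of request types)
print(toy_cpu.shape)    # (number of rows,)
```

Each row of the rates array lines up with the entry at the same position in the CPU array, which is what a later model of CPU usage as a function of request rates would rely on.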