## Part I: Basics
In Python, the package NumPy (`numpy`) is used for handling arrays of numeric values.

In [None]:
import numpy as np

We can use IPython interactively:

In [None]:
1 + 1

Assigning variables (note that numpy ranges are half-open intervals, so the stop value is excluded):

In [None]:
x = np.arange(1, 11)
x
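
To make the half-open convention concrete, the following minimal sketch checks which values `np.arange` actually produces. The variable names are illustrative only.

```python
import numpy as np

# The start value (1) is included, the stop value (11) is not.
a = np.arange(1, 11)
first, last, count = a[0], a[-1], len(a)
print(first, last, count)
```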

Computations on numpy arrays are vectorized:

In [None]:
x * 2

Sequences can be created using the functions `tile` or `repeat`:

In [None]:
y = np.tile([1, 2], 5)
x * y

In [None]:
y = np.repeat([1, 2], 5)
x * y
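
The difference between the two functions is easiest to see side by side. This is a small sketch with a shorter repeat count than above; the variable names are illustrative only.

```python
import numpy as np

tiled = np.tile([1, 2], 3)       # whole sequence repeated: 1 2 1 2 1 2
repeated = np.repeat([1, 2], 3)  # each element repeated:   1 1 1 2 2 2
print(tiled)
print(repeated)
```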

There are multiple ways to select elements of a vector. When indexing using numbers, indexes start at 0.

In [None]:
x[0]

Index ranges are half-open:

In [None]:
x[1:4]

Negative index values are used to index from the end of the array:

In [None]:
x[-1] # last element

In [None]:
x[0:-1] # all but the last element

It is also possible to use boolean masks to select elements of a vector:

In [None]:
mask = np.ones(len(x), dtype=bool)
mask[1:4] = False  # deselect the elements at positions 1 to 3
x[mask]

Note the use of the numpy function `ones` to create a constant array.

In [None]:
x[np.tile([True, False], 5)]
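
In practice, boolean masks are often built from a comparison rather than written out by hand. A minimal sketch, assuming we want the values greater than 5 (the threshold and variable names are chosen purely for illustration):

```python
import numpy as np

values = np.arange(1, 11)
is_large = values > 5            # boolean array, True where the condition holds
large_values = values[is_large]  # keeps only the elements marked True
print(large_values)
```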

When slicing, we can also specify a step size:

In [None]:
x[0::2]

If you work on very large vectors, you might want to define them as a masked array so that elements can be excluded in place:

In [None]:
x = np.ma.arange(1, 11)
for i in range(1,4):
    x[i] = np.ma.masked
x[~x.mask]
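
The same masking can probably be written more compactly with a slice assignment. The sketch below is one possible variant, not the only way to do it. It also uses `compressed()` to extract the unmasked values and shows that reductions such as `sum()` skip masked entries.

```python
import numpy as np

m = np.ma.arange(1, 11)
m[1:4] = np.ma.masked      # mask positions 1, 2 and 3 in a single assignment
visible = m.compressed()   # plain ndarray containing only the unmasked values
total = m.sum()            # reductions skip masked entries
print(visible, total)
```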

Two-dimensional matrices can be created by reshaping an array to the matrix dimensions. `order='F'` means that Fortran-like index order is used (the array values are filled in column-wise, like in R); otherwise the array elements are arranged row-wise.

In [None]:
x = np.arange(1, 13).reshape([3, -1], order='F')
x
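
To see what the `order` argument changes, the following sketch reshapes the same values both ways. `order='C'` is numpy's default row-wise layout; the variable names are illustrative only.

```python
import numpy as np

values = np.arange(1, 13)
col_wise = values.reshape((3, -1), order='F')  # fill columns first
row_wise = values.reshape((3, -1), order='C')  # fill rows first (the default)
print(col_wise)
print(row_wise)
```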

To select a single element from a matrix, specify its row and column:

In [None]:
x[1, 2]

Entire rows or columns can be selected by using `:` in the other dimension:

In [None]:
x[0, :]

In [None]:
x[:, [1,2]]

The package `pandas` provides the most commonly used data structure for data analysis in Python: the DataFrame.

In [None]:
import pandas as pd
chol_df = pd.read_csv('cholesterol.csv')
chol_df

An overview of the data can be printed using `info`:

In [None]:
chol_df.info()

`pandas` offers multiple ways to select elements of a DataFrame. Indexing by numeric position, by boolean mask, and by column name are all possible:

In [None]:
chol_df[1:10]

In [None]:
chol_df[chol_df.columns[1:3]]

In [None]:
chol_df.loc[:, 'Time']

In [None]:
chol_df[['Time', 'Cholesterol']]

In [None]:
chol_df.Time

In [None]:
chol_df['Time']

In [None]:
chol_df[chol_df.Time == 1]
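
The selection styles above can be tried without the data file. The sketch below uses a small made-up DataFrame: the values and the `Patient` column are hypothetical and do not come from `cholesterol.csv`. It also uses `iloc`, the explicit position-based counterpart to `loc`.

```python
import pandas as pd

# Made-up toy data, NOT the contents of cholesterol.csv.
toy_df = pd.DataFrame({
    'Patient': [1, 1, 2, 2],  # hypothetical column
    'Time': [1, 2, 1, 2],
    'Cholesterol': [180.0, 175.0, 210.0, 205.0],
})

by_position = toy_df.iloc[0:2]                     # rows 0 and 1 by integer position
by_label = toy_df.loc[:, ['Time', 'Cholesterol']]  # columns by name
by_mask = toy_df[toy_df.Time == 1]                 # rows where the condition is True
print(by_mask)
```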

DataFrames can be written to disk using the method `to_csv`:

In [None]:
chol_df.to_csv('temp.csv', index=False)
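
As a quick check that writing and reading are symmetric, the following sketch round-trips a small made-up DataFrame through an in-memory buffer instead of a file. The values are hypothetical, and `io.StringIO` is used here only to avoid touching the disk.

```python
import io
import pandas as pd

# Made-up values for illustration only.
toy_df = pd.DataFrame({'Time': [1, 2], 'Cholesterol': [180.0, 175.0]})

buffer = io.StringIO()
toy_df.to_csv(buffer, index=False)  # index=False omits the row labels
buffer.seek(0)                      # rewind before reading the text back
restored = pd.read_csv(buffer)
print(restored.equals(toy_df))
```

Without `index=False`, the row labels would be written as an extra unnamed column and read back as data.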