# ERDA Jupyter Setup

In this step you will become familiar with ERDA (the Electronic Research Data Archive at KU Science), where you can run Python in Jupyter notebooks that mix text and executable Python code.

## Editing a Jupyter notebook

A Jupyter notebook is a collection of text and executable code that works like an interactive document. The cells can be evaluated in arbitrary order. A _kernel_ running Python executes the code (see "Kernel" in the top menu bar), and remembers results and e.g. variable definitions between cells. Often the notebook is intended to be run from top to bottom, but if you, for example, play with changing parameters in a setup, it can be useful to edit a small section of cells and re-evaluate the code.
Reading this on ERDA or in a copy on your laptop you can edit the document. The three most important things to know about editing are that
 - a cell can be formatted as `Code`, `Markdown` (think simplified HTML, like this cell), or `raw` text. You can select the format in the notebook menu-bar.
 - `shift+Enter` will evaluate the contents of a cell, formatting the Markdown text or executing the Python code.
 - if you would like to add a new cell, you can do that by marking a cell and pressing the `+` sign in the notebook menu-bar. Using the mouse, you can change the order of cells by dragging them using the blue ribbon on the left side.

# Python basics

This section, and the next, summarize _and exemplify_ some of the basic elements of Python that we are using in the course. To check that you are on top of these things, there is a small Absalon assignment at the end.

#### Tutorials, Documentation and Online Help

Arguably, the fastest and best way to become proficient in Python programming (or in any programming language) -- without having to read long-winded tutorials and lessons -- is by learning from examples.  Code constructs used in context are much more helpful than ever so detailed descriptions of language elements. Nevertheless, here are a few links to a small selection of tutorials and books:

* [Learn Python in 10 minutes](https://www.stavros.io/tutorials/python/) (free on the web) -- brief and very useful
* [Think Python (2nd ed.)](https://greenteapress.com/wp/think-python-2e/) (free on the web) -- more long-winded
* [Introduction to Computation and Programming Using Python](https://mitpress.mit.edu/books/introduction-computation-and-programming-using-python-third-edition) (book) -- comprehensive, as expected from a book
* ... or just Google, e.g. "the best Python tutorials" or "the best short Python tutorial" or ...

If you want to accomplish a specific thing with Python, here are three good approaches to get help:
* Google is your friend. You will often find links to StackOverflow, which is a community-driven site.
* Look up the reference documentation. This is particularly useful for plotting with Matplotlib or for Numpy routines (see below).
* Use your peers: you are 28 students this year, and you are welcome to help each other with coding issues.

## Basic computations in Python
To evaluate an expression, just type it in normal mathematical notation (exponentiation is `**`).

In [None]:
4*69.23*(17.04-9.86)**2

Note that division of integers results in a float.  To get integer division, use the double division sign `//`, which rounds the result down (also for negative numbers):

In [None]:
5/2

In [None]:
5//2

In [None]:
-5//2
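As a small sketch of what "rounding down" means in practice: `//` rounds towards minus infinity, which is why `-5//2` gives `-3` rather than `-2`. The remainder operator `%` and the built-in `divmod()` are not used above, but follow the same convention. The variable names are only for illustration.

```python
# Floor division rounds down, also for negative numbers
q_pos = 5 // 2        # 2
q_neg = -5 // 2       # -3, not -2
r_neg = -5 % 2        # 1, so that 2*q_neg + r_neg == -5
print(q_pos, q_neg, r_neg)

# int() truncates towards zero instead
print(int(-5 / 2))    # -2

# divmod() returns quotient and remainder together, as a tuple
print(divmod(-5, 2))  # (-3, 1)
```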

## Using standard mathematical functions
Standard math functions, such as `exp()` and `sin()`, are not available by default, so in a fresh session the following cell raises a `NameError`:

In [None]:
exp(1.)+cos(pi)

For interactive use, you can get access to standard math and plotting with the "magic command" %pylab, which also defines some common variable names, such as "pi". This is handy for small computations, but maybe not recommended for larger notebooks, as discussed below:

In [None]:
# This works but also populates the namespace of the session
# with a lot of global variables, so is commented out
#%pylab inline
#exp(1.)+cos(pi)
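A minimal sketch of an alternative that avoids `%pylab`: import `numpy` under a short name and reach the functions and the constant `pi` through that name. The variable name `result` is only for illustration.

```python
import numpy as np

# exp(1) + cos(pi) = e - 1
result = np.exp(1.) + np.cos(np.pi)
print(result)
```

Only the single name `np` is added to the session, so it stays clear where `exp`, `cos`, and `pi` come from.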

## Basic Language Elements

In addition to normal variables (_scalars_), we will be using _lists_, _tuples_, _arrays_, _dictionaries_, and in particular _classes_. They are all examples of _objects_.

## Lists
_Lists_ are just a series of things, not necessarily numbers:

In [None]:
mylist=[1,2,'a','b']
mylist

In [None]:
mylist[3]

In [None]:
mylist.append('c')
mylist

Here `append()` is a _method_ that all lists are born with. In addition to `append`, lists have many other useful built-in methods; cf. the output from

In [None]:
help(list)

The `__name__` entries with double underscores are "hidden" methods, which Python calls behind the scenes; for example, `len(mylist)` calls `mylist.__len__()`.

Lists are typically used (also by us) to collect and arrange things.
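As a sketch of collecting and arranging, here are a few of the other list methods and operations listed by `help(list)`. The list name `measurements` and its values are made up for illustration.

```python
measurements = []               # start with an empty list
for value in [3.2, 1.5, 2.7]:
    measurements.append(value)  # collect one item at a time

measurements.sort()             # arrange in place, in ascending order
print(measurements)             # [1.5, 2.7, 3.2]
print(len(measurements))        # number of items: 3
print(measurements[0:2])        # slice with the first two items: [1.5, 2.7]
print(measurements[-1])         # negative indices count from the end: 3.2
```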

## Tuples
_Tuples_ are similar to lists, but are _immutable_; i.e., their length, order, and elements cannot be changed after creation:

In [None]:
mytuple=(1,2,3,'a','b',2)

A tuple has just two built-in methods: _count(item)_ counts how many times _item_ occurs, and _index(item)_ gives the (first) location. Notice that Python is zero-based: the first element has index 0:

In [None]:
mytuple.count(2)

In [None]:
mytuple.index(2)

In [None]:
mytuple[3]

Tuples are used as argument lists to functions, and to return values from functions.
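A minimal sketch of both uses, with a hypothetical helper function `min_max` that is not part of the course material. It also shows what immutability means in practice.

```python
def min_max(values):
    """Return the smallest and the largest value as a tuple."""
    return min(values), max(values)  # the comma makes this a tuple

result = min_max([4, 1, 7])
print(result)                        # (1, 7)

lo, hi = min_max([4, 1, 7])          # unpack the tuple into two variables
print(lo, hi)

# A tuple can also be unpacked into the argument list of a function
args = (2, 10)
print(list(range(*args)))            # same as list(range(2, 10))

# Tuples are immutable: assigning to an element raises a TypeError
try:
    result[0] = 0
except TypeError as err:
    print('TypeError:', err)
```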

In [None]:
help(tuple)

## Arrays

_Arrays_ are what we normally think of as arrays in math; things to use in computations.  If you have not used the `%pylab inline` command above, then to get access to arrays, `pi`, and standard mathematical functions, you first need to import `numpy`:

In [None]:
import numpy as np

In [None]:
a=np.array([12.,25.,14])

In [None]:
a/2

In [None]:
a**2

In [None]:
np.sin(a*np.pi/180)

When passed to `np.array`, a list of lists becomes a two-dimensional (2-D) array. While lists can contain anything, they are also slower and less convenient for computations. Therefore we will often convert lists that represent vectors and arrays, and that contain only numbers, to `numpy` arrays with `np.array`. This gives the convenience of being able to multiply, add, and do other mathematical manipulations with them, and the code executes much faster:

In [None]:
b=np.array([[1,2,3],[4,5,6]])

In [None]:
b.shape

In [None]:
b[0,2]

In [None]:
b[0]

In [None]:
b[:,0]

In [None]:
b.max()

In [None]:
print(b)
print(b*2)
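To illustrate the difference between lists and arrays, the sketch below applies the same operators to a plain list and to a `numpy` array. The names `v_list`, `v_array`, and `m` are only for illustration.

```python
import numpy as np

v_list = [1, 2, 3]
v_array = np.array(v_list)

print(v_list * 2)         # repeats the list: [1, 2, 3, 1, 2, 3]
print(v_array * 2)        # multiplies each element: [2 4 6]

print(v_list + v_list)    # concatenates the lists
print(v_array + v_array)  # adds element by element: [2 4 6]

# a list of lists becomes a 2-D array
m = np.array([v_list, [4, 5, 6]])
print(m.shape)            # (2, 3)
```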

## Objects
_Objects_ actually need no introduction, since everything in Python is an object.   We have already seen objects exemplified by _lists_, _tuples_, and _numpy arrays_, and even just the numbers we started with are objects.

Objects have _attributes_ and _methods_.  Think of attributes as scalars, which represent a property of the object.  One example is the `.shape` attribute of an array, which gives its "shape" ("dimension").  Think of _methods_ as functions, which operate on the object. One example is the `.max()` method of an array that we just used to return the largest element.

To make a new type of object in Python we define a _class_. This example shows a new class which initially only contains the attribute `attr` and the function `print_attribute`:

In [None]:
class myclass():
    def __init__(self):
        self.attr = 'a string'
    def print_attribute(self):
        print(self.attr)

To create an _instance_ of the class (a new object of that type) we have to call the class, which will create the object and execute the hidden `__init__` method:

In [None]:
myobject=myclass()

We can add new attributes to the object on the fly. This is a way to attach data to the object:

In [None]:
myobject.a=14
myobject.b=11

To see the contents of an object:

In [None]:
print(vars(myobject))

We can get information about the content of the _attr_ attribute by using the method `print_attribute` defined by the class:

In [None]:
myobject.print_attribute()

Objects defined by classes are good for describing and manipulating properties of things -- this is exactly why _attributes_ and _methods_ have their names. We will use classes extensively in the course, since they map very nicely to the core objective, namely exploring and learning numerical methods, and the theory behind them. Because classes are used so much throughout the exercises, please make sure you have grasped the basic idea behind classes and objects in Python.
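A sketch of a slightly richer class, where `__init__` takes arguments and a method changes the attributes. The class `Particle`, its attributes, and its method `advance` are hypothetical and only meant to illustrate the pattern.

```python
class Particle():
    def __init__(self, position, velocity):
        # attributes: data stored on the object
        self.position = position
        self.velocity = velocity

    def advance(self, dt):
        # method: a function that operates on the object's own data
        self.position = self.position + self.velocity * dt

p = Particle(position=0.0, velocity=2.0)
p.advance(dt=0.5)
p.advance(dt=0.5)
print(p.position)  # 2.0
print(vars(p))     # {'position': 2.0, 'velocity': 2.0}
```

Each instance carries its own copy of the attributes, so several `Particle` objects can be advanced independently.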

## Dictionaries
_Dictionaries_ are a way to index a collection of things with almost anything: numbers, text, and other (hashable) objects.  Curly brackets are used to create a dictionary, which could be empty:

In [None]:
mydict={}

Now we can enter whatever we want, with whatever "index" we want:

In [None]:
mydict[1]='a text thing in mydict'
mydict[2]=23
mydict['a']=2
print(mydict)

Or, it could be initialized from the start:

In [None]:
mydict = {1:'a text thing in mydict', 2:23, 'a':2}

Retrieving "things" is done with [] brackets, just as with _lists_, _tuples_, and _arrays_:

In [None]:
mydict[2]

In [None]:
mydict['a']

The "argument" (here `2` and `'a'`) is called the _key_, and the right-hand side is called the _value_.

Dictionaries are typically used to collect and retrieve information, and we will sometimes be using them to store information about experiments and exercises.
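A short sketch of some common ways to retrieve information from a dictionary. The dictionary `settings` and its keys are made up for illustration.

```python
settings = {'solver': 'euler', 'steps': 100}

# loop over keys and values together
for key, value in settings.items():
    print(key, '->', value)

# test whether a key exists before using it
print('steps' in settings)      # True
print('tolerance' in settings)  # False

# get() returns a default instead of raising a KeyError
tol = settings.get('tolerance', 1e-6)
print(tol)

print(list(settings.keys()))    # ['solver', 'steps']
```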

### __Task 1:__

Make a _dictionary_ named `experiment`, which contains four items, illustrating how one could collect information related to a scientific experiment:
   1. Your name, as a text string (key: 'name')
   2. A tuple, with the current year, month, and date (key: 'date')
   3. A list with, say, times ('hr:min') when some lab measurement was made (key: 'times')
   4. A numpy 2x3 array, which we imagine to be a table of fixed input values needed (key: 'table')

Once you have done that, make a _class_ `experiment_class` with four attributes (name, date, times, table) and a function, `print_vars`, to print the content of the class.

In [None]:
import numpy as np
experiment={}
experiment['name'] = ...
...

The dictionary is meant as an example of how one can collect a number of different pieces of information into one single entity -- in Python this is typically either a _dictionary_ or a _class_.

To print the content of the dictionary and of the class, do:

In [None]:
print(experiment)
expc = experiment_class()
expc.print_vars()

# Python Control Constructs

A basic set of constructs in a programming language are control statements, which can change or repeat the flow of a program. In this section we introduce the basic control statements in Python.

## For loops

For loops have this syntax -- note the colon and the (automatic) indentation:

In [None]:
n=10
for i in range(n):
    print(i,i**0.2)

`range()` is a built-in that works like a _generator_: a _thing_ that can produce a range of numbers on demand (but that _isn't_ an actual stored list of numbers).  Hence you can ask for a billion numbers to be generated in a for loop, without actually using the memory that would be needed to hold a billion numbers.

In [None]:
range(2,n)
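The memory argument can be checked directly. The sketch below uses `sys.getsizeof`, which is not otherwise used in this notebook. The exact byte counts depend on the Python implementation, so none are quoted here.

```python
import sys

big = range(10**9)         # created instantly; no numbers are stored
print(len(big))            # 1000000000
print(big[123456789])      # elements are computed on demand
print(sys.getsizeof(big))  # small, and independent of the length

small = list(range(10))    # a list stores every element
print(sys.getsizeof(small))
```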

By itself, `range()` is just an expression (a "generator"). But the generator can also produce an actual list, or an array:

In [None]:
list(range(2,n))

In [None]:
from numpy import array
array(range(3,n,2))

The `arange()` function, which is part of `numpy`, returns an array directly and also accepts floating point arguments. Note that, as with `range()`, the 2nd range limit is not included; this is consistent with the [a,b) math notation, and in this example gives (2.0 - 1.0) / 0.1 = 10 elements:

In [None]:
from numpy import arange
frange = array(arange(1.0,2.0,0.1))
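With a floating point step, the number of elements from `arange()` depends on rounding. As a hedged alternative, `numpy.linspace` (used later in this notebook) takes the number of points instead of the step. The sketch below compares the two; the names `fr_arange` and `fr_linspace` are only for illustration.

```python
import numpy as np

fr_arange = np.arange(1.0, 2.0, 0.1)
fr_linspace = np.linspace(1.0, 2.0, 10, endpoint=False)

# both give 10 values, 1.0, 1.1, ..., 1.9
print(len(fr_arange), len(fr_linspace))
print(np.allclose(fr_arange, fr_linspace))

# by default linspace includes the end point
print(np.linspace(1.0, 2.0, 11))
```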

In fact, any such "iterable" -- a generator, a list, a tuple, an array, etc. -- can stand after the __in__ in a __for__ loop. This is very useful if we would like to operate on a range of objects, and it makes the notation compact and clean:

In [None]:
for f in frange:
    print(f)

It is also possible to iterate over several lists in a single loop using `zip`:

In [None]:
for r,f in zip(range(10,20), frange):
    print(r,f)

In the common case of iterating over both the index number and the elements of a list, `enumerate` is useful:

In [None]:
for idx,f in enumerate(frange):
    print(idx,f)

## While loops

A __while__ loop keeps running until the condition after the `while` becomes false:

In [None]:
n=1
while n < 1100:
    print(n)
    n *= 2

## Breaking out of loops

Sometimes one wants to break out of a loop before it is finished, based on some test.  This is done with a __break__ instruction.  Here's an example, giving the first factorial (N!) larger than a hundred thousand:

In [None]:
a=1
a_max=10**5
for i in range(2,109):
    a=a*i
    if a>a_max:
        break
print(i,a)

Surprisingly few iterations, right?

## Conditional constructs

The syntax of __if statements__ in Python is

In [None]:
condition=[False,False,False,False,True]

if condition[0]:
    print('code 0')
elif condition[1]:
    print('code 1')
elif condition[2]:
    print('code 2')
else:
    print('code 3')

As always in Python, conditional blocks begin after a colon at the end of a line, and are set aside by indentation only -- no  _begin/end_ pairs or curly brackets are used. _It is therefore extremely important to keep track of the indentation of your code, because results depend on it._

A condition may also be used in direct assignments:

In [None]:
a=12 if condition[1] else 11
print(a)

## Comprehension

A _comprehension_ is a compact way to make a new _list_, which is a function of or selection from an existing _list_ (or _iterator_).  It looks like this:

In [None]:
a = [1,3,4,8,5,3,2,4,6,3]
b = [x**2 for x in a]
print(b)

That example produces a one-to-one mapping of a list to a new one of the same length. A _selection_ is made by adding an `if` expression:

In [None]:
c = [x for x in a if 2*(x//2)==x]
print(c)
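Mapping and selection can be combined in one comprehension. The sketch below squares only the even numbers, and writes out the equivalent explicit loop for comparison. The names `d` and `d_loop` are only for illustration, and `x % 2 == 0` is used as an alternative test for even numbers.

```python
a = [1, 3, 4, 8, 5, 3, 2, 4, 6, 3]

# comprehension: square the even numbers only
d = [x**2 for x in a if x % 2 == 0]

# the same thing written as an explicit loop
d_loop = []
for x in a:
    if x % 2 == 0:
        d_loop.append(x**2)

print(d)            # [16, 64, 4, 16, 36]
print(d == d_loop)  # True
```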

### __Task 2:__

The task is to make a for-loop construct that produces a list of all primes smaller than a thousand, using for example code like this:

```
primes=[]   # avoid the name "list", which would hide the built-in list()
for i in range(2,1000):
    ok=True
    ...
    if ok:
        primes.append(i)
print(primes)
``` 

Hints:
 1. A number is a prime if it is not divisible by any prime smaller than itself
 2. In Python 3, the expression `11/2` gives `5.5`, while `11//2` (integer arithmetic) gives `5`

# Python Libraries

In the course, and in your general (data analytics and visualization) work with Python, you will typically be using the __numpy__ and __matplotlib.pyplot__ Python 'modules'.  A Python _module_ is a "library" with useful procedures (in practice either a file `library.py` or a folder `library/`).  To use procedures in a library, one can refer to them by their full name, after an `import` statement, such as in

#### import

In [None]:
import numpy
x = numpy.arange(0.0,1.0,0.05)
print(x)
import matplotlib.pyplot
matplotlib.pyplot.plot(numpy.sin(x*2.0*numpy.pi),'-+')

This gets pretty heavy and unreadable, though, so one can use short-cut names for the libraries:

#### import as

In [None]:
import numpy as np
x = np.arange(0.0,1.0,0.05)
print(x)
import matplotlib.pyplot as plt
plt.plot(np.sin(x*2.0*np.pi),'-+');

### %pylab inline and from ... import

It is possible to import all procedures from a library, using this syntax:

```
from numpy import *
from matplotlib.pyplot import *
```

Since using the `numpy` and `matplotlib.pyplot` procedures in interactive work is extremely common and basic, there is a "magic command" (one among several), `%pylab inline`, that accomplishes this and also chooses (via the argument) the type of graphics backend to use.

### Don't do this in the general case!

For several reasons this is not recommended in the general case (it leads to so-called _namespace pollution_):
   1. The number of imported procedures can be huge, and libraries may contain names that overlap with variable names, or with names in other libraries, which can cause confusion.
   2. Procedures may mask (shadow) built-in Python procedures (such as `abs()`, `sum()`, `int()`, and so on), which can also cause confusion.
   3. Many Python editors have automatic syntax checking, and will not detect missing or mismatched procedure names in this case.
   4. Not much typing is gained by omitting `np.` and `plt.`, and one quickly gets used to typing the abbreviations.
   5. It is good practice to keep the `name.` syntax in front of the procedure, because it informs the reader (yourself in a couple of months!) in which library the procedure is defined. 

So, when using other libraries, and also when using `numpy` and `matplotlib.pyplot` in files with pure Python code (perhaps a personal library), it is better to type names that include the (abbreviated) library name!
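As a sketch of how masking can silently change results: after `from numpy import *`, the name `sum` would typically refer to `numpy.sum` rather than to the built-in. The example below avoids the star import and instead calls both functions explicitly, via the standard `builtins` module and via `np`, with the same arguments.

```python
import builtins
import numpy as np

values = [0, 1, 2, 3, 4]

# built-in sum: the second argument is a start value added to the total
print(builtins.sum(values, -1))  # 10 - 1 = 9

# numpy sum: the second argument is the axis to sum over
print(np.sum(values, -1))        # 10
```

The same call gives two different answers depending on which `sum` is in the namespace, which is hard to spot when the `np.` prefix is missing.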

#### __Task 3:__

Let's take a look at the `numpy.roll` function, which we will need in the next Jupyter Notebook:

In [None]:
import numpy as np
help(np.roll)

Suppose we have a list `f=[0,1,2,1,0,-1,-2,-1]`, which we would refer to as $f_i$ in math notation.

What would be the expression that produces the list that corresponds to $f_{i+1}$ ?

# Embedded animation with matplotlib

There exists a large variety of powerful visualization tools in Python. Arguably, the most popular package is `matplotlib`, and in particular the subpackage `matplotlib.pyplot`. We will use it in the exercises to explore our data. You will learn by example, and I assume you have a basic idea of how to produce plots. Plots are really good for analysing data.

An alternative, less quantitative but more intuitive, approach is to display animations. Our eyes and brains can comprehend more data if it is changing, compared to e.g. making hundreds of panel plots. Here is a basic example of how to construct an embedded animation in a Jupyter notebook. You will encounter more in the exercises in the coming weeks.

There are (at least) two ways to embed animations. The most popular, it seems, is to use HTML5 to embed a pre-rendered movie. The movie is encoded with an external tool. This method does not currently work on ERDA because the program to encode movies, `ffmpeg`, is not installed. An alternative method that does work, and is actually better for exploration, is to use JavaScript to produce a widget that can be viewed frame by frame or as an animation.

#### Animation based on a list of precomputed images

Often we will make a simulation where there is some time advancement and more complicated logic. In this example we will build the animation by storing a list of scenes and then passing them to the animation library to be rendered. Each scene consists of a single plot with an image, but it could also be more complicated.

The example is adapted from the matplotlib reference documentation: https://matplotlib.org/stable/gallery/animation/dynamic_image.html

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation
from IPython.display import HTML

In [None]:
# define the figure to animate (it will show below as a static frame)
fig, ax = plt.subplots()

# simple function to make a 2D dataset
def f(x, y):
    return np.sin(x) + np.cos(y)

# set up coordinate arrays. y is reshaped to become a (100,1) array
# NumPy broadcasting aligns axes from right to left, so x is implicitly a (1,120) array
# When used together the result is broadcast to a (100,120) 2D array
x = np.linspace(0, 2 * np.pi, 120)
y = np.linspace(0, 2 * np.pi, 100).reshape(-1, 1)

# ims is a list of lists, each row is a list of artists to draw in the
# current frame; here we are just animating one artist, the image, in
# each frame
ims = []

# make a time-loop
for i in range(60):
    x += np.pi / 15.
    y += np.pi / 20.
    im = plt.imshow(f(x, y), animated=True)
    ims.append([im])

# create animation object
anim = animation.ArtistAnimation(fig, ims, interval=50, blit=True,
                                 repeat_delay=1000)

# render animation javascript widget
HTML(anim.to_jshtml())
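The broadcasting described in the comments of the cell above can be checked in isolation. The sketch below rebuilds the same coordinate arrays without any plotting, and prints the resulting shapes. The name `z` is only for illustration.

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 120)                 # shape (120,)
y = np.linspace(0, 2 * np.pi, 100).reshape(-1, 1)  # shape (100, 1)

z = np.sin(x) + np.cos(y)         # broadcast to a common 2-D shape
print(x.shape, y.shape, z.shape)  # (120,) (100, 1) (100, 120)

# element [j, i] combines y[j] and x[i]
print(np.isclose(z[3, 7], np.sin(x[7]) + np.cos(y[3, 0])))
```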

### __Absalon turn in:__

This exercise is submitted entirely as a text file or as a PDF with the text.
  1. From task 1: Paste in the code for making the dictionary and the class together with the output from your dictionary and your class:
```
print(experiment)
expc = experiment_class()
expc.print_vars()
```
  2. From task 2: Paste in the code and the printout of the list
  3. From task 3: Type the expression for rolling into the Absalon text field