<h1><center>Python in a Nutshell</center></h1>
<h2><center>A (very) brief introduction for Matlab users</center></h2>
<br/>
<br/>
<center>Simon Pezold,</center>
<center>December 11, 2018</center>

# Goal of the next two sessions

## Get to know …

* the Python programming language
* similarities and differences between Python and Matlab
* standard Python libraries for numerics, data I/O, image processing, data visualization
* tools to write, execute (and debug) Python code

**Side note**: This tutorial covers the most recent version of Python as of writing, which is 3.X. Jumping from Python 2 to Python 3, some changes were made to the language that make these versions sometimes behave differently. Differing Python 2 behavior will *not* be pointed out here.

# Prerequisites
## What you need

* for now, just an up-to-date browser
* go to https://github.com/spezold/python-intro
* click on *launch binder*

# Your (inevitable) first line of code
## Hello, world!
Write `print("Hello, world!")` in the field below, then press `Shift`+`Enter`.

# The big picture
## Python vs. Matlab: What is *similar*?

Python is a programming language. Similar to Matlab's programming language, we have
* a clearly defined syntax
* different data types
* variables
* operators
* means of control flow (loops, `if` statements, …)
* functions (and classes and methods …)
* …

# The big picture
## Python vs. Matlab: What is *different*?

* Python's syntax is different from Matlab's
  * Some code looks quite different
  * Some code may look familiar, but has a different meaning
* Python is *only* a programming language
  * We have multiple ways to write and execute Python code
* Python is a *general purpose* programming language
  * Functionality like numerics and plotting needs to be explicitly included in a project
* Python libraries are developed by multiple *(third) parties*
  * For certain functionality, there might be (a) multiple implementations or (b) no implementations at all
  * Documentation of these third-party libraries is very heterogeneous
* Python is *free* software
  * No cost
  * You could (theoretically) create/adapt your own version of Python
  * Still, Python may be used to write commercial software

# Further reading
## General Python introductions

Jake VanderPlas: *A Whirlwind Tour of Python*

* A brief general introduction to the Python programming language
* [Free PDF version](http://www.oreilly.com/programming/free/files/a-whirlwind-tour-of-python.pdf) (ca. 100 pages)
* [Github repository](https://github.com/jakevdp/WhirlwindTourOfPython)

Jake VanderPlas: *Python Data Science Handbook*

* Building on *A Whirlwind Tour of Python*
* Special focus on Python use in science (data input/output, numerics, visualization)
* [Free HTML version](https://jakevdp.github.io/PythonDataScienceHandbook/)
* [Github repository](https://github.com/jakevdp/PythonDataScienceHandbook)

This tutorial borrows heavily from both!

# Further reading
## Python for Matlab users

* *Enthought*'s webinar: [*Python for MATLAB Users, What You Need to Know*](https://www.youtube.com/watch?v=YkCegjtoHFQ) (YouTube video, ca. 45 minutes)
* Official documentation of the NumPy/SciPy packages: [*NumPy for Matlab users*](https://docs.scipy.org/doc/numpy-1.15.0/user/numpy-for-matlab-users.html)
* Scott Sievert: [*Stepping from Matlab to Python*](https://stsievert.com/blog/2015/09/01/matlab-to-python/) (extensive blog post with some practical guidance)
* *Mathesaurus*: [*NumPy for MATLAB users*](http://mathesaurus.sourceforge.net/matlab-numpy.html) (extensive comparison table, but a bit outdated)

# Python syntax overview

Let's look at the following bit of code (and run it by pressing `Shift`+`Enter`):

In [None]:
i = "Hello!"
print("Before the loop:", i)
for i in range(3):  # `i` will iterate over 3 values: 0, 1, 2
    if i == 1:
        print("This is special - iteration", i)
    else:
        print("Inside the loop - iteration", i)
print("After the loop!")

### Things to note:
* Comments start with `#`
* There is no special end-of-line marker
* Indentation matters!

### End of line

In general, the end of a line of Python code is the end of the line of text.

In [None]:
x = 1
y = x + 1

To continue one statement across two consecutive lines, either wrap the expression in parentheses `()` or end the first line with a backslash `\`.

In [None]:
m = (1 + 1 +
     1 + 1)
n = 1 + 2 + \
    3 + 4

To put two lines of Python code onto the same line of text, use a semicolon `;`.

In [None]:
a = 1; b = 2
# This is equivalent to
a = 1
b = 2
# We can do the same with function calls:
print(a); print(b)

Unlike Matlab, there is *no* distinction between "silent" and "verbose" code lines (using `;`). All lines are silent by default. To produce an output, use the `print()` function.

In [None]:
print(x)

### Indentation

In [None]:
for i in range(3):  # `i` will iterate over 3 values: 0, 1, 2
    if i == 1:
        print("This is special - iteration", i)
    else:
        print("Inside the loop - iteration", i)
print("After the loop!")

* Look at the `for` loop or the `if`-`else` statements: There is no such thing as "end", "endif", or "endfor" in Python.
* What belongs to the loop and to the parts of the condition really only depends on the level of indentation.
* For indentation, use either tabs or spaces (usually 4 spaces). Both work, but don't mix them.
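
As a minimal sketch of how indentation alone decides what belongs to a loop, compare the two loops below. The variable names `inside` and `outside` are made up for this illustration; the only difference between the two versions is the indentation of the last `append` call.

```python
# Both `append` calls are indented under the `for`, so both run in every iteration.
inside = []
for i in range(3):
    inside.append(i)
    inside.append("still inside")

# Here the last call is not indented, so it runs only once, after the loop has finished.
outside = []
for i in range(3):
    outside.append(i)
outside.append("after the loop")

print(inside)   # [0, 'still inside', 1, 'still inside', 2, 'still inside']
print(outside)  # [0, 1, 2, 'after the loop']
```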

# Variables
## Types

### Dynamic typing
Python, like Matlab, is a *dynamically typed* language. This means that a variable can change its type over time. The following code is perfectly fine.

In [None]:
x = 1          # now, `x` is an integer
x = "hello"    # now, `x` is a string
x = [0, 1, 2]  # now, `x` is a list

### Basic data types
Python has built-in versions of all the basic types that are necessary for day-to-day general-purpose programming. Here is an overview:

| Type       | Name                  | Explanation/Examples |
| ---------- | --------------------- | ------------------- | 
| `int`      | integer               | …, $-3$, $-2$, $-1$, $0$, $1$, $2$, $3$, …
| `bool`     | Boolean               | `True`, `False`
| `float`    | floating-point number | $0.5$, $6.62\cdot10^{-34}$, `inf`, `NaN`, … 
| `complex`  | complex number        | `2 + 1j`
| `str`      | string                | All kinds of text; also used for single characters
| `NoneType` | -                     | void/nothing

### A quick comment on the *NoneType*
The `NoneType` only has one possible value – `None`. It does not have an equivalent in Matlab and is mainly used in two situations:
1. as the return value of functions that do not explicitly return anything,
2. as a placeholder for function parameters or variables to signify that they have not been set with an actual, meaningful value.
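
Both situations can be sketched with built-in functionality only. The example assumes nothing beyond the standard `list.sort()` method, which sorts in place and does not return anything; the variable name `best_score` is a made-up placeholder.

```python
numbers = [3, 1, 2]
result = numbers.sort()  # `sort()` reorders the list in place and returns nothing
print(result)            # None
print(numbers)           # [1, 2, 3]

best_score = None           # placeholder: no score has been recorded yet
print(best_score is None)   # True
```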

### Everything is an object

In Python, unlike in many other programming languages, instances of even the most basic types (such as integers or floating point numbers) are *objects*. What does that mean? In programming, an object is an entity that encapsulates both *data* and the *functionality* that operates on it.

For example, a string (or `str` in Python) brings along its own functionality to get its uppercase representation.

In [None]:
print("hello".upper())

Likewise, a floating point number (or `float`) may return itself as a ratio of integers.

In [None]:
print(0.25.as_integer_ratio())

In both cases, we called a *method* on an *object*. How do we know which methods an object provides? Usually, we can directly call the `help()` function on an object.

In [None]:
help(0.25)

If that does not work, we first have to find out the type of the object, and then call `help()`.

In [None]:
help(type("hello"))

# Variables as pointers
What do you think will be the output of `print(b)`? Press `Shift`+`Enter` to find out. What would have happened in Matlab?

In [None]:
a = [1, 2, 3]  # Create a list that contains the integers 1, 2, 3
b = a
a.append(27)   # Append the integer 27 at the end of `[1, 2, 3]`
print(a)
print(b)

* Variable `b` is the same list as variable `a`! In Matlab, we would have gotten `[1, 2, 3]` for `b` instead!
* Rather than thinking of Python variables as "buckets for values" we should think of them as "pointers to values":
  * Both `a` and `b` refer to the *same chunk of computer memory* that actually contains our integers 1, 2, 3.
  * Once we append 27 to the list, the change is reflected in both variables.
  * We would have achieved the same result, writing `b.append(27)` rather than `a.append(27)`.
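
If we actually want the Matlab-like behavior, one option is to copy the list explicitly. The following is a small sketch using the list's own `copy()` method and the identity check `is`; the variable name `c` is only chosen for illustration.

```python
a = [1, 2, 3]
b = a          # `b` refers to the same list as `a`
c = a.copy()   # `c` refers to a new list with the same content

a.append(27)

print(b)       # [1, 2, 3, 27]
print(c)       # [1, 2, 3]
print(b is a)  # True: same object
print(c is a)  # False: separate object
```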

What do you think will now be the output of `print(b)`? Press `Shift`+`Enter` to find out. Does the result still fit the "pointers vs. buckets" idea?

In [None]:
a = [1, 2, 3]
b = a
a = [1, 2, 3, 27]
print(a)
print(b)

* Although we wrote `b = a`, printing the variables `a` and `b` shows different values in the end.
* This is because of the line `a = [1, 2, 3, 27]`:
  * It creates a new list in a separate chunk of computer memory …
  * … and immediately makes `a` refer to the new chunk.
  * At the same time, `b` still refers to the original chunk of memory with the original list.

What do you think will now be the output of `print(b)`?

In [None]:
a = 1
b = a
a = b + 1
print(a)
print(b)

* Again, the third line `a = b + 1` is essential for the different outputs:
  * It reads the value to which `b` refers (which is 1), …
  * … it adds 1, puts the result into a new chunk of computer memory …
  * … and immediately makes `a` refer to the new chunk.
  * At the same time, `b` still refers to the original chunk of memory with the original value.
* We can see this by printing, after each line, the `id` of our variables (which, in the standard CPython interpreter, is the memory address of our "chunks of memory"):

In [None]:
a = 1       # Line 1
print("a is located at", id(a))
b = a      # Line 2
print("b is located at", id(b))
a = b + 1  # Line 3   
print("a is located at", id(a))
print("b is located at", id(b))

* In `Line 1`, we reserve a new chunk of memory, put 1 there, and let `a` refer to it.
* After `Line 2`, both `a` and `b` point to the same chunk of memory – they have the same memory address.
* After `Line 3`, `a` points to a *new* chunk of memory (new memory address) that now contains the result of our addition.
* At the same time, `b` still points to the original chunk of memory (old memory address) that still contains the old value (1).

**Bonus question**: What happens if you type `print(id(1))` and `print(id(2))`? Does that make sense?

For a more detailed explanation, see the Python FAQ: [*Why did changing list ‘y’ also change list ‘x’?*](https://docs.python.org/3/faq/programming.html#why-did-changing-list-y-also-change-list-x)

# Operators

## Arithmetic operators on numbers

Python defines all standard and some non-standard arithmetic operators on `int` and `float` data:

| Operator | Name | Explanation |
| -------- | ---- | ----------- | 
| a + b    | addition
| a - b    | subtraction
| a * b    | multiplication
| a / b    | (regular) division
| a // b   | floor division | divide regularly, then round down the result
| a % b    | modulo operator | remainder after floor division
| a ** b   | exponentiation | $a^b$
| -a       | negation

While most of them behave in a familiar way, the ones related to division deserve a closer look. Let's compare the following:

In [None]:
print(10 / 3)

This is *true division*.
* This is division as we know it: $10 / 3 = 3 \frac{1}{3}$.
* Note that the result ends with $5$ rather than $3$. This is because floating point numbers cannot be stored with arbitrary precision on the computer (but that is a topic beyond the focus of this tutorial).
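
The limited precision can be made visible with a short sketch. It assumes nothing beyond the standard library's `math.isclose()` function, which compares two floating point numbers with a tolerance instead of exactly.

```python
import math

third = 10 / 3
print(third)                 # 3.3333333333333335

exact_match = (0.1 + 0.2 == 0.3)
print(exact_match)           # False: the sum is stored as 0.30000000000000004

close_match = math.isclose(0.1 + 0.2, 0.3)
print(close_match)           # True: equal up to a small tolerance
```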

In [None]:
print(10 // 3)

This is *floor division*.
* We divide as before, then round down.
* This is the kind of division that many programming languages perform with `/` when given integer values.<br/>
  Python 3, like Matlab, uses true division with `/` for all number types. Floor division requires explicit use of `//`.

In [None]:
print(10 % 3)

This is the *modulo operation*.

* It produces the remainder after floor division.
* For positive numbers, this probably matches the kind of division that you first learned in primary school: $3$ completely fits $3$ times into $10$. But as $3\cdot3=9$, a remainder of $1$ remains.

We can combine the last two operations to recover the original dividend:

In [None]:
dividend = 10
divisor  = 3
print((dividend // divisor) * divisor + (dividend % divisor))

Given the last statement, what will be the results of the following lines of code, using negative numbers?

In [None]:
dividend = 10
divisor  = -3

print(dividend // divisor)
print(dividend % divisor)
print((dividend // divisor) * divisor + (dividend % divisor))
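
The same identity also holds when the *dividend* is negative. The sketch below uses a different sign combination than the cell above and additionally shows the built-in `divmod()` function, which returns the results of `//` and `%` together. The key point is that floor division rounds towards negative infinity, not towards zero.

```python
dividend = -10
divisor = 3

quotient = dividend // divisor   # -3.33… is rounded *down* to -4
remainder = dividend % divisor   # what is missing from -4 * 3 = -12 to reach -10

print(quotient)                         # -4
print(remainder)                        # 2
print(quotient * divisor + remainder)   # -10
print(divmod(dividend, divisor))        # (-4, 2)
```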

## Arithmetic operators on other types
Some types other than numbers also bring about meaningful functionality for arithmetic operators.

For example, both *lists* and *strings* use `+` for concatenation and `*` for repetition.

In [None]:
print("apple" + "pie")
print("bla" * 3)

print([1, 2] + [3, 4] + [5])
print([1, 0] * 4)

## Assignment and shorthand operators

By now, we have implicitly used the assignment operator `=` quite a bit:

In [None]:
counter = 13  # initialize variable `counter` with 13
print(counter)

Very often in programming, we want to combine an assignment with an operation on a variable's previous value.

For example, we want to *increment* or *decrement* a counter variable.

In [None]:
counter = 43  # initialize variable `counter` with 43

counter = counter - 2  # decrement `counter` by 2
print(counter)

counter = counter + 1  # increment `counter` by 1
print(counter)

Python, unlike Matlab, provides shorthand notations for this.

In [None]:
counter = 43  # initialize variable with 43

counter -= 2  # decrement counter by 2
print(counter)

counter += 1  # increment counter by 1
print(counter)

Such a shorthand notation is provided for most binary operators and types. Its result is usually the same as that of the full-length notation.

In other words, generally speaking, the operation `a = a ∘ b` with the binary operator `∘` can be replaced by `a ∘= b`.

What is the result of `a /= 3` and `a **= 2` below?

In [None]:
a = 6
print(a)
a /= 3
print(a)
a **= 2
print(a)

There is a subtle catch with *mutable* types though, i.e. types whose instances can be altered after they have been created, such as *lists*. (Remember the example from above?) Here,
* the shorthand notation `a ∘= b`  is an *in-place operation*, i.e. the original value is altered, 
* the full-length notation `a = a ∘ b` returns a new instance of the type. See here:

In [None]:
l1 = [1, 2, 3]
l2 = l1
l1 += [4, 5]  # Append `[4, 5]` to the list that both `l1` and `l2` refer to
print(l1)
print(l2)

In [None]:
l1 = [1, 2, 3]
l2 = l1
l1 = l1 + [4, 5]
print(l1)  # `l1` now refers to a new list that results from concatenation …
print(l2)  # ... while `l2` still refers to the original list, which remains unaltered
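
The difference between the two notations can also be checked directly with the identity operator `is`. This is only a sketch; the names `same_after_shorthand` and `same_after_full` are made up for the illustration.

```python
l1 = [1, 2, 3]
l2 = l1

l1 += [4]                        # in-place: the shared list is extended
same_after_shorthand = l1 is l2
print(same_after_shorthand)      # True

l1 = l1 + [5]                    # full-length: a new list is created for `l1`
same_after_full = l1 is l2
print(same_after_full)           # False

print(l1)                        # [1, 2, 3, 4, 5]
print(l2)                        # [1, 2, 3, 4]
```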

## Comparison
Python, like Matlab, provides all usual comparison operators:

| Operator | Name / Explanation |
| -------- | ------------------ | 
| a == b   | equals
| a != b   | does not equal
| a < b    | less than
| a > b    | greater than
| a <= b   | less than or equal
| a >= b   | greater than or equal

Note that in Python the $\neq$ ("does not equal") operator is written as `!=` (rather than `~=` as in Matlab).

Complete the following piece of code, replacing `...` by the correct comparisons. It should print the correct message, depending on what value we enter for `a`.

**Bonus question**: Do you think the `else` part can ever be reached? If so, how?

Side note: even without filling in `...`, we can run the following code without error (albeit with meaningless output). This is because `...` is *also* an object in Python – the `Ellipsis` object – and thus the code below has valid Python syntax. While `...` is used as a placeholder below, we will see its actual meaning later.

In [None]:
a = ...  # We should be able to write any number instead of `...` here

if ...:
    print("a is zero")
elif ...:
    print("a is negative")
elif ...:
    print("a is positive")
else:
    print("a is very strange")

### Truth values
The type of such a comparison's result is perhaps not surprising. Do you remember how to find out the type of a value?

Print (1.) the *result* of comparing `1` to `2` for equality and (2.) the *type* of this comparison below.

The result of such a comparison is either `True` or `False`. We sometimes call this a Boolean, or `bool` in Python. This data type corresponds to the `logical` data type in Matlab.

In calculations, we can use `True` like the value `1` and `False` like the value `0`:

In [None]:
print(True + 1)  # Same as 1 + 1
print(False + 1)  # Same as 0 + 1

Conversely, in `if` statements we can use `0` like `False` and non-zero values like `True`:

In [None]:
for i in range(-3,3):
    if i:
        print(i, "I am True")
    elif not i:
        print(i, "I am False")
    else:
        print(i, "I should never happen")

### Boolean operators
We can combine Boolean data with the boolean operators `and`, `or`, `not`:

**Bonus question**: How could you write `not i == 2` more simply?

In [None]:
for i in range(-10, 10):  # iterate i over 20 values: -10, -9, ..., 8, 9
    if i > 0 and not i == 2:
        print(i, "I am positive, but not two!")
    else:
        print(i, "I am either negative, zero, or two!")

Try to fill in the `...`, so that the `if` statement correctly produces the printout for even positive numbers.

Hint: you might want to use the modulo operator `%` here.

In [None]:
for i in range(-10, 10):
    if ...:
        print(i, "I am positive and even!")

### More operators

Apart from the arithmetic operators and boolean operators that we have seen above, there are more operators in Python, such as the check for *object identity* with `is`, the check for *membership* with `in`, and the *bitwise* operators (`&`, `|`, `^`, `<<`, `>>`, `~`). You can find more information on them e.g. in the book *A Whirlwind Tour of Python* (recommended for further reading above).
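
As a brief, non-exhaustive sketch of these operators: `==` compares *values*, whereas `is` checks whether two names refer to the very same object; `in` checks for membership; the bitwise operators work on the binary representation of integers. The variable names are only chosen for illustration.

```python
x = [1, 2, 3]
y = [1, 2, 3]

print(x == y)            # True: same content
print(x is y)            # False: two separate list objects

print(2 in x)            # True
print(5 not in x)        # True
print("py" in "python")  # True: `in` also works for substrings

print(6 & 3)             # 2   (binary 110 AND 011 = 010)
print(6 | 3)             # 7   (binary 110 OR  011 = 111)
print(1 << 4)            # 16  (shift 1 by four binary digits to the left)
```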

### Apart from *truth*, there is *truthiness*

In Python, we can create `bool` values from other values by calling `bool()`. Surprisingly, this does not only work for numbers in a meaningful way, but also for other types, such as strings or lists:

In [None]:
a = 0
print(a, "results in", bool(a))
b = 1
print(b, "results in", bool(b))
c = []  # An empty list
print(c, "results in", bool(c))
d = [1, 2, 3]
print(d, "results in", bool(d))
e = ""  # An empty string
print(e, "results in", bool(e))
f = "False"  # Although we write "False" here, this results in `True`. It's just because it is not an empty string!
print(f, "results in", bool(f))

We can see that when we convert an empty list or string to a `bool`, we get `False`. On the other hand, if the list or string is not empty, we get `True`. Such objects are sometimes called *truthy* or *falsy*, as they do not actually have the Boolean value `True` or `False`, but can be interpreted as such. As a rule of thumb:

* All kinds of numbers that are zero are *falsy*,
* all kinds of collections (such as lists or strings) that are empty are *falsy*,
* `None` is *falsy*,
* everything else is *truthy*.
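
The rule of thumb can be checked for a few more types. The selection of values below is arbitrary and only meant as an illustration; note that a string containing a single space and a list containing a zero are *not* empty.

```python
values = [0.0, (), {}, None, -1, " ", [0]]

results = []
for value in values:
    results.append(bool(value))
    print(repr(value), "results in", bool(value))

print(results)  # [False, False, False, False, True, True, True]
```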

The same behavior is shown if we use *truthy/falsy* objects in an `if` statement. This can be very helpful, for example, if we want to do something with a list only if it actually contains something:

In [None]:
list_of_lists = [[], [1, 2, 3], [1, 2]]  # A list that contains one empty and two non-empty lists
for current_list in list_of_lists:  # `current_list` will iterate over all items in `list_of_lists`
    if current_list:
        print(current_list)
    else:
        print("The current list is empty -- nothing to be seen here!")

## Data types for collections
We have already seen basic data types, such as `int`, `bool`, `float`, and `str`, which (apart from `str`) can hold *single* values only. However, Python also provides data types to hold *collections* of values. Of these, we have already used lists above, but here is a more complete overview:

| Name    | Example                           | Properties |
| ------- | --------------------------------- | ---------- |
| `list`  | `[1, 1, 2]`                       | ordered collection, mutable
| `tuple` | `(1, "two", 3)`                   | ordered collection, immutable
| `set`   | `{1, 2, 3}`                       | unordered collection of unique elements
| `dict`  | `{"name": "John", "height": 180}` | collection of key-value pairs
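
The properties in the table can be tried out with a short sketch. It uses a `try`/`except` block only to keep the cell running when the tuple refuses to be changed; the flag `tuple_is_immutable` is a made-up name for this illustration.

```python
my_list = [1, 1, 2]
my_list[0] = 99              # lists are mutable: this works
print(my_list)               # [99, 1, 2]

my_tuple = (1, "two", 3)
tuple_is_immutable = False
try:
    my_tuple[0] = 99         # tuples are immutable: this raises a TypeError
except TypeError as error:
    tuple_is_immutable = True
    print("Cannot change a tuple:", error)

my_set = {1, 2, 2, 3, 3, 3}  # duplicates are dropped
print(len(my_set))           # 3
```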

### Data access: indexing
We can access the elements of `list`, `tuple`, and `dict` (i.e. *dictionary*) objects by putting the respective index (or, for dictionaries, the key) in square brackets `[]` after the object or its name (e.g. `my_list[5]`).

Try to access the first element in the given list and print it:

In [None]:
my_list = ["a", "b", "c"]
print(...)

We note a big difference to Matlab: Python indexing is *zero-based*, i.e. the "first" element of a list/tuple is at index 0!

Python indexing of ordered collections is best visualized with the following scheme:
```
 0  —   1  —   2  —   3   <- positive index
 |      |      |      |
 | "a"  | "b"  | "c"  |   <- actual content
 |      |      |      |
-3  —  -2  —  -1  — None  <- negative index
```

The scheme shows two things:
1. We have *two* ways of indexing a list:
   * one from the *start* with *positive* numbers or 0 (e.g. "a" is at index 0, "c" is at index 2),
   * one from the *end* with *negative* numbers (e.g. "c" is at index -1, "a" is at index -3)
2. If we want to get a sub-collection of a collection, we have to *exclude* the upper bound:

In [None]:
my_list = ["a", "b", "c"]

print(my_list[0:2])  # This will give us the *first two* elements
print(my_list[:2])   # Same as above, but shorter

print(my_list[-2:None])  # This will give us the *last two* elements
print(my_list[-2:])      # Same as above, but written shorter

Try to print
* the last element of the given list …
  * … as the actual element
  * … as a (sub)-list containing only the element
* the last three elements of the given list
* all elements of the list, except for the first and last ones

In [None]:
l = ["a", "b", "c", "d", "e", "f", "g"]
print(...)  # the last element itself
print(...)  # a list containing the last element only
print(...)  # the last three elements
print(...)  # all except for the first and last

### Data access in dictionaries
In a dictionary, we can access all *values* by using their *keys* as index. Dictionaries are mutable, so we can also change the value for each key:

In [None]:
person = {"last_name": "Doe", "first_name": "Jane", "height": 170, "unit": "cm"}

print(person["first_name"], person["last_name"])
print(person["height"], person["unit"])

person["height"] /= 100
person["unit"] = "m"

print(person["height"], person["unit"])

To know which keys are valid, we can ask for all of them by calling `keys()` on the dictionary:

In [None]:
person = {"last_name": "Doe", "first_name": "John", "height": 170, "unit": "cm"}
print(person.keys())
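
A few more dictionary operations, as a rough sketch: `in` checks whether a key exists, `get()` returns `None` instead of raising an error for a missing key, assigning to a new key adds an entry, and `items()` lets us iterate over key-value pairs. The `"age"` entry is a made-up value for illustration.

```python
person = {"last_name": "Doe", "first_name": "John", "height": 170, "unit": "cm"}

print("height" in person)    # True
print("age" in person)       # False
print(person.get("age"))     # None: no error for a missing key

person["age"] = 42           # assigning to a new key adds an entry (made-up value)
print(len(person))           # 5

for key, value in person.items():  # iterate over all key-value pairs
    print(key, "->", value)
```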

## Data type missing: how about vectors and matrices?

So far, you might have wondered how to do matrix math in Python. Should we use lists for this?

The answer is: no!

**Python, unlike Matlab, does *not* have built-in types for vectors and matrices**

But:

**There is a quasi-standard package for this, which we can install: it is called *Numpy***

We will see how to work with *Numpy* in the next lesson.
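
To illustrate why plain lists are a poor substitute for vectors, here is a small sketch that uses built-in lists only. The operators `+` and `*` concatenate and repeat rather than calculate element-wise, so even a simple vector addition needs an explicit loop.

```python
v = [1, 2, 3]
w = [10, 20, 30]

print(v + w)  # [1, 2, 3, 10, 20, 30]: concatenation, not addition
print(v * 2)  # [1, 2, 3, 1, 2, 3]: repetition, not scaling

# Element-wise addition with lists needs an explicit loop
elementwise_sum = []
for i in range(len(v)):
    elementwise_sum.append(v[i] + w[i])
print(elementwise_sum)  # [11, 22, 33]
```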

# Control flow

## Conditional statements
In Python, `if` statements work pretty much as in Matlab. Key differences:

* There is no `end`. The end of an `if` block is defined by *indentation*
* `elseif` (Matlab) becomes `elif` (Python)
* There must be a colon `:` at the end of each `if`/`elif`/`else` line

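There were *no* `switch`-`case` statements in Python at the time of writing (a comparable `match`-`case` statement was only added later, in Python 3.10). We use `if`-`elif` chains in such situations.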

In [None]:
x = input("Please enter an integer:")
x = int(x)

if x < 0:
    print("input is negative")
elif x == 0:
    print("input is zero")
else:
    print("input is positive")

## `for` loops
With `for` loops, we can iterate over a predefined collection of items:

In [None]:
my_list = ["a", "b", "c"]
for item in my_list:  # Iterate over all items in my_list
    print(item)
for value in range(3):  # Iterate over 3 values: 0, 1, 2
    print(value)

Note that, although it behaves just like a list in this loop, `range(3)` is not a `list`:

In [None]:
print(type(range(3)))

`range(3)` is a `range` object. These objects produce their values "on the fly", rather than storing them all in computer memory. If we want to iterate over *a lot of integer values* in consecutive order, this is extremely memory-efficient.
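
The memory argument can be checked with a short sketch. It assumes the standard library's `sys.getsizeof()` function, which reports the size of an object in bytes (for a list, this covers the list itself but not the items it refers to). The exact numbers depend on the Python version and platform, so only the comparison is shown.

```python
import sys

values = range(1000000)   # produces its values on the fly
as_list = list(values)    # stores references to all one million values

small_list = list(range(3))
print(small_list)         # [0, 1, 2]

range_is_smaller = sys.getsizeof(values) < sys.getsizeof(as_list)
print(range_is_smaller)   # True

# A `range` still supports indexing and `len()`, like a list
print(values[10], len(values))  # 10 1000000
```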

## `while` loops
A `while` loop iterates as long as its condition is fulfilled (we exploit the "truthiness" of non-empty lists here):

In [None]:
my_list = ["a", "b", "c"]
while my_list:
    last_element = my_list.pop()  # Remove and return the last element
    print("Last element was:", last_element, "-- what remains is:", my_list)

Increment variable `v` in steps of 2 until it reaches the value of 10. Print out the value of `v` each time before incrementing it.

In [None]:
v = -4
while False:  # Replace `False` by a suitable condition ...
    ...       # ... then do the work here

# TODO, next lecture
* defining functions (differences from Matlab), lambda functions
* error handling
* Numerics with Numpy
* Setting up Python with Anaconda
* Writing and debugging Python with Spyder
* Plotting with matplotlib
* Image I/O with skimage
* Data I/O with pandas
* …