# Programming in Python: Basic Concepts
Tutor: Jan Philipp Albrecht (j.p.albrecht@fu-berlin.de), 
Jaime Rodríguez-Guerra (jaime.rodriguez@charite.de) 

## 1. Aims of this talktorial/session
This notebook teaches the basic concepts you need to understand and write simple Python code.

***



## 2. Learning goals

In general, the goal is to understand the concepts behind the material, as other lectures depend on it. The goal is not to know the answer to every question right away. Try to understand as much as possible and ask as many questions about your problems as you need.

### 2.1 Theory
- python and python notebooks
- variables
- flow control
- functions
- modules
- imports

### 2.2 Practical
- run cells separately
- assigning variables
- perform operations on variables
- index a list
- create and alter dictionaries
- understand and define if-else conditions
- understand and define while and for loops
- defining and calling a function
- import modules

<font color=green>Don't hesitate to ask if concepts remain unclear or tasks are not understood!</font>

***

## 3. References
- https://docs.python.org/3/
- https://docs.python.org/3/tutorial/modules.html

***

## 4. Theory and Practice

This is a learning-by-doing notebook alternating between theory and practice. First, a short introduction to a particular topic is given, showing examples with code. You are then asked to perform tasks similar to the examples in the theory.

You can check your results by viewing the sample solution.


### 4.1 python and python notebooks

#### What is Python?

Python is a widely used general-purpose high-level programming language.
The term "high-level" means that the language strongly abstracts from the details of the computer.
Like everything in computer science, Python keeps evolving, so several versions of Python are available.

In this course we will teach python 3.6 (or newer).

#### Where are we here?

This environment is a so called **Jupyter notebook**, a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. 

It allows you to define so called *cells* of different format:
- Markdown cells (this cell for example is a markdown cell)
- Code cells

You can *run* (execute) each cell separately by pressing <font color=brown>Shift + Enter</font>. Variables, functions etc. (you will shortly learn more about this) are then available in all **following** cells.

Some code cells additionally produce an output (text, images, etc.). The output will appear underneath the code cell.


> *TASK:* <br>
*Try to execute the following cell and make sure the output appears.*

In [1]:
print('This is a cell with code.')
# This is a comment line. 
# Lines in a code cell starting with the '#' symbol are ignored in the interpretation of the code

# print('This sentence will NOT appear underneath the cell!')

This is a cell with code.


### 4.2 Variables and Operations

There are several <font color=blue>types</font> of variables. The type determines how the value is internally stored and handled, and what can be done with its content.

In the following we will introduce you to:
- elemental variables:
    - integers (<font color=blue>int</font>)
    - boolean (<font color=blue>bool</font>)
    - floating-point numbers (<font color=blue>float</font>)
    - strings (<font color=blue>str</font>)
- collection variables:
    - lists (<font color=blue>list</font>)
    - dictionaries (<font color=blue>dict</font>)

An assignment of a variable (regardless of the type) is written in the following way: <br>
<code>**NAME** = **VALUE**</code>

You can think of a variable as a reference to a certain value, similar to a name (the reference) you give to your newborn child (the value). Whenever you call the name of the child, you have an exact reference and address your newborn child.

The type of the variable is implicitly inferred from the value. <br>
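
As a small sketch of this type inference, the built-in function <code>type()</code> can be used to ask Python which type it inferred for a value. The variable name <code>example_value</code> is only an illustration and is not used elsewhere in this notebook.

```python
# type() is a built-in function that reports the type of a value.
print(type(True))    # <class 'bool'>
print(type(3))       # <class 'int'>
print(type(1.5))     # <class 'float'>
print(type("text"))  # <class 'str'>

# The type belongs to the value, not to the name:
# the same name can later refer to a value of a different type.
example_value = 3
example_value = "three"
print(type(example_value))  # <class 'str'>
```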


### Boolean
Boolean variables can take exactly two values: <code>True</code>  and <code>False</code>.
Assigning one of these values to a variable would then look like the following:
```python
var_bool = True
```

> *TASK:* <br>
*Try to assign the value <code>False</code> to the variable from the example above.*

### Integers
Integers (<font color=blue>int</font> for short) are whole numbers, i.e. numbers without a decimal point. In the following example, you see the definition of a variable named <code>var1</code>. After its assignment, the variable is altered twice. After each alteration the new value is shown with the function <code>print()</code>. Everything inside the <code>()</code> is shown as output.

```python
# assigning the name 'var1' to a value of '3'.
var1 = 3
# printing a variable will show its content underneath the cell
print(var1)

# assigning a different value to the variable named 'var1', thereby OVERWRITING the previous definition:
var1 = 4
print(var1)

# assigning a new value to the variable named 'var1', thereby setting a new value using the old one.
var1 = var1 + 1
print(var1)
    
```  

Output are the following three lines:
```python
    3
    4
    5
```

 ### Strings
 
Strings are words, sentences or symbols written in quotes (<code>""</code> or <code>''</code>). It is therefore possible to store a number as a string by writing its value inside the quotes. The following shows an example:

```python
# assigning the value "This is a string" to the variable with the name 'my_string'.
my_string = "This is a string"

also_a_string = "1"  # I am a string due to the "", not a number!
i_am_an_int = 1  # I am an int due to the missing ""

# this cell will have no output as the code above does not mention WHAT to do with the variables!    

```

###  Float
Variables of type <font color=blue>float</font> are numbers with a decimal point. Whenever you write a number using a <code>.</code>, the type of the variable is inferred to be <font color=blue>float</font>.

```python
    var_float1 = 1.5  # a typical float
    
    # The following is also a float. 
    # Although this number mathematically has no fractional part, it is written with a "."
    var_float2 = 1.0  
```



### Operations
Every type of variable has **operations** doing basic tasks. <br>
For example a variable of type <font color=blue>int</font> has among others the following operations:
- Addition <code>+</code>
- Subtraction <code>-</code>
- Multiplication <code>*</code>
- Division <code>/</code>


Division is allowed, but its result is always of type <font color=blue>float</font>, even when both operands are of type <font color=blue>int</font>.


```python
# applying basic operations on the variable named 'var1'
print(var1 + var1)  # type int
print(var1 - var1)  # type int
print(var1 * var1)  # type int
print(var1 / var1)  # type float


    10
    0
    25
    1.0

```
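
The effect of division on the type can be checked with the built-in function <code>type()</code>. The following is a minimal sketch; the variable names <code>count</code>, <code>parts</code>, <code>total</code> and <code>quotient</code> are made up for this illustration.

```python
count = 6
parts = 3

total = count + parts     # int + int gives an int
quotient = count / parts  # '/' gives a float, even if the division has no remainder

print(total, type(total))        # 9 <class 'int'>
print(quotient, type(quotient))  # 2.0 <class 'float'>

# The original variable is untouched; only the result is a float.
print(type(count))               # <class 'int'>
```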

> *TASK:* <br>
*Now try to use the addition-operation for the variable named <code>my_string</code> already defined in the cell below. <br>
Save the addition in a variable called <code>res1</code> and print the result.*


In [None]:
my_string = "This is a string"
# your lines of code here



As you might already suspect, depending on the type of the variable these operations are **contextually** defined. <br>

<font color=red> Mixing two types of variables with an operation can result in an error!</font>

For example the following is forbidden:<br>
```python
var1 = 5
res1 = "test"
# this should produce an error, since the operator '+' cannot combine 'int' and 'str' variables! 
var1 + res1

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-05639e44053b> in <module>()
      1 # this should produce an error, since the operator '+' cannot combine 'int' and 'str' variables!
----> 2 var1 + res1

TypeError: unsupported operand type(s) for +: 'int' and 'str'
```
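
One common way around this error is to convert one of the values first, so that both sides of the <code>+</code> have the same type. The sketch below uses the built-in functions <code>str()</code> and <code>int()</code>; the variable names are hypothetical and chosen only for this example.

```python
apples = 3
label = "apples: "

# str() converts the int to a string, so '+' joins two strings.
message = label + str(apples)
print(message)  # apples: 3

# int() converts a string of digits to an int, so '+' adds two numbers.
digits = "12"
total = apples + int(digits)
print(total)  # 15
```

Which conversion is appropriate depends on whether you want text to be joined or numbers to be added.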

### Lists

Python has a datatype called <font color=blue>list</font>, where other variables (regardless of their type) can be positionally stored. <br>

![def_list.png](attachment:def_list.png)

The code <code> my_list = [5, 10] </code> would therefore store <code>5</code> at first position, and <code>10</code> at second position. 

  
> *TASK:*<br>
 *Try to:*
 - *define a list called <code>list1</code>*
 - *at the first* **position** *save the string <code>"horse"</code>.*
 - *store in the list the string <code>"spider"</code> TWICE at* **position** *2 and 3.*
 

Once defined a list is not carved in stone! It can be altered and its values accessed. The **index** of a list starts to count from <code>0</code> and is therefore always the **position - 1**.

![pos_ind_list.png](attachment:pos_ind_list.png)

Once stored, the **values** of the stored variables are accessible by **indexing** the list: <code>my_list[INDEX] </code>, where <code>*INDEX*</code> is an integer value. <br>

Note: <code>INDEX</code> itself could be a variable of the type <font color=blue>int</font>!
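
As a short illustration of indexing and altering, here is a sketch with a hypothetical list called <code>colors</code> (it is not part of the tasks in this notebook). It reads entries with a number and with a variable as index, and then overwrites one entry.

```python
colors = ["red", "green", "blue"]  # hypothetical example list

# Reading with an integer index: position 1 has index 0.
print(colors[0])  # red

# Reading with a variable of type int as index.
position = 3
index = position - 1
print(colors[index])  # blue

# Altering an entry: assign a new value to an existing index.
colors[1] = "yellow"
print(colors)  # ['red', 'yellow', 'blue']
```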



> *TASK:*<br>
 *Now print out your favorite animal in the list in two ways:*
 - *by using an integer number corresponding to the <font color=brown>index</font> of your favorite animal.*
 - *by using a variable called <code>idx</code> to which you first assign the correct <font color=brown>index</font>. You then print out the entry of the list <code>list1</code> using your defined variable.*
 
 
 <font color=green> **CAUTION: The first position in the list has index <code>0</code>!** </font>

<font color=red> Setting or accessing a value at an index of a list which is not defined results in an error!</font>
```python
list1[3]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-9-831b15cbf272> in <module>()
----> 1 list1[3]

IndexError: list index out of range

```

### Slicing (range-indexing) a list

In the following you find a definition of a list called <code>lst_new</code>. By giving a **range** of indices you have access to several elements in the list. A range can be set in the following way:
```python
lst_new = ["horse", "human", "spider", "millipede"]
lst_new[start:stop]
```
where <code>start</code> and <code>stop</code> are numbers (integers). Thus, <code> lst_new[1:3] </code> would give an output of <code>["human", "spider"]</code>. Notice that the <code>stop</code> index is <font color=red>not</font> included in the output.
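
The following sketch shows a few more slices of the same list. It also illustrates one detail worth noting: a slice always returns a list, even if it contains only one entry, whereas a single index returns the entry itself.

```python
lst_new = ["horse", "human", "spider", "millipede"]

print(lst_new[1:3])  # ['human', 'spider']
print(lst_new[2:4])  # ['spider', 'millipede']

# A slice with one entry is still a list ...
print(lst_new[1:2])  # ['human']
# ... while a single index gives the entry itself (here a string).
print(lst_new[1])    # human
```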

> *TASK:*<br>
*print out all strings in the list <code>lst_new</code> beginning with an <code>"h"</code> by using slicing.*



In [None]:
lst_new = ["horse", "human", "spider", "millipede"]

Since the <font color=blue>str</font> datatype behaves similarly to a list, we can apply the same syntax to get a **substring** of our string. Keep that in mind, it will become necessary later on.

```python
my_str = "brothers"
print(my_str[2:7])
    
    other
```

### Dictionaries

Python has a handy datatype called <font color=blue>dict</font>. Each **key** gets a **value** assigned. The values can be of any type. The keys can also have different datatypes, but it is strongly advised to stick with one type.


![def_dict.png](attachment:def_dict.png)


Look at the following assignment of a dictionary:
```python
num_legs = {"horse": 4, "human": 2, "spider": 8, "millipede": 1000}
```
The keys in this example are "horse", "human", "spider" and "millipede". <br>
The values are 4, 2, 8, 1000 respectively.

Values can be accessed by referring to the specific **key** using <code>[]</code> brackets.

```python
num_legs["horse"]

    4
```

Variables of type <font color=blue>dict</font> can easily be extended by adding a **key**-**value** pair to the dictionary. This looks like the following:
```python
num_legs["ants"] = 6
```

In the same way, already existing **key**-**value** pairs can be altered.
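
A minimal sketch of both cases, using a hypothetical dictionary <code>num_wheels</code> that is not part of the tasks: assigning to a new key adds a pair, assigning to an existing key overwrites its value.

```python
num_wheels = {"bicycle": 2, "car": 4}  # hypothetical example dictionary

# The key "tricycle" does not exist yet, so a new key-value pair is added.
num_wheels["tricycle"] = 3

# The key "car" already exists, so its value is overwritten.
num_wheels["car"] = 6

print(num_wheels["car"])  # 6
print(num_wheels)         # {'bicycle': 2, 'car': 6, 'tricycle': 3}
```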

>*TASK:*
*The dictionary defined below contains an error: a typical millipede has far fewer than <code>1000</code> legs.<br> Correct the **key**-**value** pair by setting its value to <code>150</code> without overwriting the entire variable <code>num_legs</code>. Verify your correction by printing the new dictionary value.*


In [None]:
num_legs = {"horse": 4, "human": 2, "spider": 8, "millipede": 1000}

# Your lines of code here


### Comparison operations
An **equality operation** (a comparison operation) can be performed by using the <code>==</code> operator. The result is always of type <font color=blue>bool</font>, having either the value <code>**True**</code> or <code>**False**</code>, depending on whether the contents of the two variables are equal or not. Besides equality, we can also check for inequality by using the <code>!=</code> operator.

Imagine this as a question you ask to the computer. "Is the content of var1 **equal** to the content of var2?"

- Equality <code>==</code>
- Inequality <code>!=</code>

```python
print(var1==var1)  # type bool

    True
    
    
print(var1!=var1)  # type bool

    False
```
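
The example above compares a variable with itself. As a sketch of the more typical case, the following compares two different values and stores the answer in a variable; the names <code>first</code>, <code>second</code> and <code>is_equal</code> are assumptions made for this illustration.

```python
first = 5
second = 7

print(first == second)  # False
print(first != second)  # True

# The result of a comparison is a bool and can be stored like any other value.
is_equal = first == 5
print(is_equal)  # True

# Comparing values of different types is allowed and does not raise an error.
print(1 == 1.0)  # True, an int and a float with the same numeric value
print("1" == 1)  # False, a string is never equal to a number
```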

### 4.3 Flow control

Like any programming language, Python offers flow control statements: <font color=orange> if, else, for, while </font> <br>

#### if, elif, else

Flow control lets you easily define routines. Suppose we need to decide, based on a number, whether or not to execute a particular command.

![def_if.png](attachment:def_if.png)


Additionally, if not, we would like to run another command. This can be realized with the following code:

```python
status = 1
if status == 1:
    print("This line gets executed if and only if variable 'status' has the value 1 assigned.")
    # here, more lines could follow
else:
    print("This line gets executed whenever variable 'status' has NOT the value 1 assigned.")
    # here, more lines could follow
    
# The following lines are not indented any more.
# They will be executed regardless of the value of variable 'status'. 
print("status: " + str(status))
# str() converts type 'int' to type 'str' such that an '+' operation is possible
    
```
Note: an <font color=orange>if</font> does not need an <font color=orange>else</font>. If no alternative is given, nothing is executed when the condition of the <font color=orange>if</font> statement evaluates to <code>False</code>.

Notice the syntax: the keyword <code>if</code> indicates the start of an <font color=orange> if</font>-statement. A **condition** follows, which has to evaluate to a <font color=blue>bool</font>-type value, thus either <code>True</code> or <code>False</code>. A <code>:</code> closes the line of the <font color=orange> if</font>-statement. All **indented** lines below it are only executed if the condition is <code>True</code>.
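
As a small sketch of an <font color=orange>if</font> without an <font color=orange>else</font>, consider the following lines. The variables <code>message</code> and <code>is_three</code> are hypothetical names used only here. The second part shows that the condition can also be a <font color=blue>bool</font> variable computed beforehand.

```python
status = 3
message = "status is not 1"

# The condition is False, so the indented line is skipped and nothing else happens.
if status == 1:
    message = "status is 1"

print(message)  # status is not 1

# A condition can be stored in a bool variable first and used afterwards.
is_three = status == 3
if is_three:
    print("status is 3")
```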

Now further think of a scenario, where we have to decide whether to run a certain code based on 3 different values of a variable: 
- when set to <code>1</code>, option A should be executed
- when set to <code>2</code>, option B should be executed
- else we want option C to happen.

This scenario can be realized using the keyword <code>elif</code> as the example shows:

```python
status = 2
if status == 1:
    print("This line gets executed if and only if variable 'status' has the value 1 assigned.")
elif status == 2:
    print("This line gets executed if and only if variable 'status' has the value 2 assigned.")
else:
    print("This line gets executed whenever variable 'status' has NOT the value 1 OR the value 2 assigned.")
    
```
***



### for-loop

Loops over the content of a list (or similar variable types) are quite common. Therefore, Python offers a so-called <font color=orange>for</font>-loop to simplify such code. <br> The <font color=orange>for</font>-loop automatically executes the indented lines of code once for **each** entry in the list (or for each entry of another collection type).

![def_for.png](attachment:def_for.png)
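
In code, a <font color=orange>for</font>-loop of the kind described above can be sketched as follows. The list <code>fruits</code> and the names <code>fruit</code>, <code>count</code> and <code>letter</code> are assumptions made for this illustration; the name after <code>for</code> can be chosen freely and refers to the current entry in each round.

```python
fruits = ["apple", "banana", "cherry"]  # hypothetical example list
count = 0

for fruit in fruits:
    # These indented lines run once per entry; 'fruit' is the current entry.
    print("I like " + fruit)
    count = count + 1

# This line is not indented, so it runs once after the loop has finished.
print(count)  # 3

# A string behaves similarly to a list: the loop visits one character at a time.
for letter in "abc":
    print(letter)
```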

> *TASK:*<br>
*Use the <font color=orange>for</font>-loop to print out all entries in <code>lst_new</code>*

### while
In a <font color=orange>while</font> loop, all lines of code which are *indented* get executed **while** a certain condition is <code>True</code>.

![def_while.png](attachment:def_while.png)

> *TASK:* <br>
*Explain why the code above should <font color=red>NOT</font> be executed:*
*Correct the <font color=orange>while</font> loop from above, such that there is no problem any more. 
Write your solution in the cell below. Hint: The output <code>"Houston, we have a problem!"</code> should appear once.*





Instead of equality (<code>==</code>), there are other comparison operations which return a <font color=blue> bool</font>-type value. Think of it as other questions to the computer than only asking it for equality. These "questions" can be used in a <font color=orange>while</font> loop:
- <code>&lt;</code> strictly less than
- <code>&gt;</code> strictly greater than
- <code>&lt;=</code> less than or equal
- <code>&gt;=</code> greater than or equal
- <code>!=</code> not equal
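
Each of these operators yields a <font color=blue>bool</font> value, which can be printed directly. A minimal sketch with two hypothetical variables <code>small</code> and <code>large</code>:

```python
small = 3
large = 10

print(small < large)   # True
print(small > large)   # False
print(small <= 3)      # True, because 3 is equal to 3
print(large >= 11)     # False
print(small != large)  # True
```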

> *TASK:*<br>
*Look at the following code and try to explain what it does. Check your answer by copying the code into the cell below and executing the cell.*

```python
inc_num = 0
while inc_num < 10:
    inc_num = inc_num + 2
    print(inc_num)
    
```

### optional: while loop tasks

Imagine a list of length 4 with the following entries:
<code>num_legs_list = [4,2,8,150]</code> <br>
Now suppose for example the following simple task: Adding up all entries in the list.

We could get the result by the following lines:

```python
idx_count = 0  # first index is 0
list_sum = 0 # we start with a 0
while idx_count < 4:
    list_sum += num_legs_list[idx_count] # 'x += 1' is short form for 'x = x + 1'
    idx_count += 1  
print(list_sum)
```

>*TASK:*
*Before you continue to read, answer the following questions for yourself:*
- *Why do we increase the <code>idx_count</code> **after** we increase the <code>list_sum</code>?*
- *Could we use a **different** comparison operator, other than <code>&lt;</code>?*

<font color=green>If you are unsure, don't hesitate to ask! This is a lot to understand. This is not easy!</font>


With a for loop, the rather complex code from above can be written in just a few lines:
```python
list_sum = 0 # again, we start with an empty sum
for entry in num_legs_list:
    list_sum += entry
print(list_sum)
    
```


### 4.4 Introduction to a function

A task like "summing up all entries in a list" is certainly not new. For common tasks it is always worth checking (for example with a web search) whether the task is generally known and a **function** for it already exists.
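
For this particular task, Python does ship a ready-made solution: the built-in functions <code>sum()</code> and <code>len()</code> return the sum and the number of entries of a list. The sketch below uses them on the list from above; the names <code>total_legs</code> and <code>num_entries</code> are chosen only for this example.

```python
num_legs_list = [4, 2, 8, 150]

# sum() adds up all entries of the list.
total_legs = sum(num_legs_list)
print(total_legs)  # 164

# len() returns the number of entries, so the length need not be hard-coded.
num_entries = len(num_legs_list)
print(num_entries)  # 4
```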

Functions (also called subroutines) are sequences of instructions that perform a specific task. They can be used (called) in other parts of a program whenever that task should be executed. Everything you pass to a function is called an argument.

Think of it as a task which you give to somebody, who then gives you back the result. Imagine a task for a cook like "cutting vegetables". The function would contain the instructions for **how** to move the knife, but <font color="red">not **what** </font> to cut, since there is more than one vegetable for which these instructions are valid. You then give, for example, a carrot to the cook (call the function with a "carrot" as argument) and get the cut carrot back.

Calling a routine with the name <code>sum_list</code> would look like this:

![funct_call_easy.png](attachment:funct_call_easy.png)


In this example nothing happens with the answer of the call. The returning value will simply be ignored. Think of the cook you asked to cut the carrot. He did it properly and hands you the cut carrot, but you don't take it.

Thus, we need to **store** the returning value in a variable to "take" it.

```python
num_legs_list = [4,2,8,150]
sum_num_legs_list = sum_list(num_legs_list)
```
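
The call above only runs if <code>sum_list</code> has been defined beforehand. The following self-contained sketch therefore repeats the definition given in section 4.6 and contrasts a call whose return value is ignored with a call whose return value is stored.

```python
# Definition of sum_list as given in section 4.6 of this notebook.
def sum_list(any_list):
    list_sum = 0
    for entry in any_list:
        list_sum += entry
    return list_sum

num_legs_list = [4, 2, 8, 150]

# The function runs, but its return value is not stored anywhere and is lost.
sum_list(num_legs_list)

# Storing the return value in a variable "takes" the result.
sum_num_legs_list = sum_list(num_legs_list)
print(sum_num_legs_list)  # 164
```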

Unfortunately, in order to use a function, you first have to know the function **exists**, what you have to **pass** (give) to the function and what it **returns** (gives back).


>*TASK:*
*You already know and even used one particular function. Do you know how it is named and what it does?*

Variables of nearly every datatype (you will learn more about other types in the next class) have defined **functions** and **attributes**. The latter are (here for simplicity) other variables that always have the same name for every variable of a certain type. Functions and attributes can be accessed by writing a <code>.</code> after the **NAME** of the variable.

They help the programmer performing basic operations and tasks.


A <font color=blue>float</font> datatype for example has an attribute called <code>real</code> and a function called <code>is_integer()</code>.
```python

x = 1.2  # a float value
print(x.real)  # accessing the attribute "real" of variable "x". Notice: no () brackets!
print(x.is_integer())  # calling the function is_integer() of variable "x" with no argument.

    1.2
    False
```

The **attribute** <code>real</code> contains the real part of the floating point number (in case of a complex number this becomes important).<br>
The **function** <code>is_integer()</code> does not need any arguments and returns a <font color=blue>bool</font> type having the value <code>True</code> if <code>x</code> can be written as <font color=blue>int</font> type.
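
To make the behaviour of <code>is_integer()</code> concrete, the sketch below calls it on two different floats. It also calls <code>upper()</code>, a function defined for the <font color=blue>str</font> type, to show that other types come with their own functions. All variable names are hypothetical.

```python
whole = 2.0
fraction = 1.2

print(whole.is_integer())     # True, 2.0 can be written as the int 2
print(fraction.is_integer())  # False, 1.2 has a fractional part

# Strings have their own functions, for example upper().
word = "python"
shouted = word.upper()
print(shouted)  # PYTHON
print(word)     # python (the original string is unchanged)
```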


<font color=green> Make sure you understood this concept before you move on. Other lectures depend on these concepts! </font>


### 4.5 modules and imports

If you quit this program, the definitions you have made (functions and variables) are lost. Therefore, if you want to write a somewhat longer program, you are better off using a text editor to prepare the python code and execute the defined code directly from the file. This is known as creating a script. As your program gets longer, you may want to split it into several files for easier maintenance. You may also want to use a handy function that you’ve written in several programs without copying its definition into each program.

To support this, Python has a way to put definitions in a file and use them in a script. Such a file is called a module; definitions from a module can be imported into other modules or into the main module (e.g. this notebook-file).

A module is therefore a file containing Python definitions and statements.

This definition is taken from the [Python documentation](https://docs.python.org/3/tutorial/modules.html) and was slightly altered.

The python [import system](https://docs.python.org/3/reference/import.html) is rather complex. We therefore only tell you the very basic information you need to know to follow the course. 

As there is a large number of available modules, it would be infeasible to load all definitions directly when starting Python (it would simply take too long and consume too much memory). Therefore, we have to tell Python which modules (and their definitions) we would like to use. <br>
That can be achieved with the **import** command.


Two of the most important modules are **numpy** and **pandas**. An import looks like the following:
```python
import numpy
import pandas
```

In order to use a module, you need to **know** that it exists and what it contains. Unfortunately, there is no other way than checking the documentation of the corresponding module. Functions from the module can then be used with the following syntax:
```python
# calling the function "mean" from the numpy module, giving the list "[1,2,3,4]" as argument.
numpy.mean([1,2,3,4])

```

Since module names are sometimes long (and computer scientists are lazy), we can give them a different name by using the keyword <code>as</code>.

```python
import numpy as np
import pandas as pd
```

Whenever we need a function/definition from the <code>numpy</code> module we can further refer to the module as <code>np </code>.
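
The same import syntax works for every module. As a sketch that runs without installing anything, the following uses two modules from Python's standard library, <code>math</code> and <code>statistics</code>, instead of numpy and pandas. The short name <code>stats</code> is an arbitrary choice for this example.

```python
import math                 # plain import: functions are used as math.NAME
import statistics as stats  # import with a shorter name chosen by us

# Calling the function "sqrt" from the math module.
root = math.sqrt(16)
print(root)  # 4.0

# Calling the function "mean" from the statistics module via its short name.
average_value = stats.mean([1, 2, 3, 4])
print(average_value)  # 2.5
```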


***

### 4.6 Optional: Defining your own function

What if you have a task but you can not find a function for it which is already defined? Then you can write your own instructions to solve a particular task!

The general structure of a function definition in Python follows this syntax:

![def_function.png](attachment:def_function.png)

The keyword <code>def</code> indicates the beginning of the definition. <code>ARG_1, ARG_2, ..., ARG_N</code> are arbitrarily many **arguments** expected to be passed to the function when calling it. Try to understand the following definition of a function:

![funct_sum_list.png](attachment:funct_sum_list.png)

The names of the arguments and their values only exist in the scope of this function. Outside, these definitions are <font color=red>NOT</font> available.
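
The following sketch illustrates the scope rule with a hypothetical function <code>double</code>. Inside the function, the name <code>result</code> refers to a local value; the variable with the same name outside the function is a separate variable and is not changed by the call.

```python
# A function with one argument, called 'number' inside the function.
def double(number):
    result = number * 2  # 'result' here only exists inside the function
    return result

result = "outside"   # a separate variable that happens to share the name
doubled = double(21)

print(doubled)  # 42
print(result)   # outside (not overwritten by the function)
```

The argument name <code>number</code> is likewise unknown outside the function; using it there without defining it first would raise a <code>NameError</code>.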

> *TASK:*<br>
*Use this knowledge to define a function which computes the **average** of a list, using the definition of the function <code>sum_list()</code> from this cell. Test your implementation with the list called <code>num_legs_list</code>.*

In [None]:
# This function sums up all entries in a list.
# It expects exactly 1 argument, which is called in the scope of the function 'any_list'
# It returns (to the calling routine) the sum of the list, called 'list_sum'. (only in the scope of the function.)
def sum_list(any_list):
    list_sum = 0 
    for entry in any_list:
        list_sum += entry
    return list_sum

# define the desired number of arguments (if any necessary) and give them reasonable names
def average():
    # here you should probably call the function sum_list() and do something with the return value
    
    return avrg

# here you should probably call your average function with your list 'num_legs_list'
num_legs_list = [4,2,8,150]

*TASK:*<br>
*Call the function named <code>average</code> again. This time, pass <code>new_list</code> defined in the cell below as an argument. Check the correctness of the result.*

In [None]:
new_list = [1,4,10,25,10]

# call the function "average" using "new_list" as an argument.


## 5. Discussion




***

## 6. Quiz

- What output will the following code produce?
```python
x = ""
for i in "python_is_amazing!":
    if i == "_":
        x += " "
    else:
        x += i
print(x)
```

- When executing the following code, will there be an error?

```python
x = 5
y = "5"

print(x + y)

```

- What do we need to change in order to get the correct output "10"?
***

In [1]:
# copy the code from the quiz and run the cell to check your answers!