# `python` crash course

**April 23, 2020**  
*Ingo Scholtes*  

We provide a brief introduction to the core `python` concepts that we will use throughout the session. We start with basic `python` language constructs and data types, which we will complement with more advanced data science concepts introduced in the packages `numpy`, `scipy`, `matplotlib` and `pandas`. If you are interested in a more comprehensive introduction, you can find a number of good introductory tutorials on the web, e.g. the [official python tutorial](https://docs.python.org/3/tutorial/) or the [W3Schools python tutorial](https://www.w3schools.com/python/default.asp).

## Importing modules

Many functions that we will use are provided via `python` modules such as `numpy`, `scipy`, `scikit-learn`, `matplotlib` or `pandas`. We can import such modules using the `import` keyword.

In [1]:
import numpy

After the import, all classes, functions, and variables defined by `numpy` are accessible in the namespace `numpy`, i.e. as `numpy.X`. Let us try this by using the code completion functionality integrated into code editors like `jupyter` lab and Visual Studio Code. Rather than importing into a namespace that matches the module name, we can use `import MODULE as ALIAS` to define a local alias.

In [2]:
import numpy as np

We can use the `from` keyword to import only certain parts of a module, e.g. a single class or function. We can also use it to import all symbols from a module or submodule. If we wanted to import all symbols in `numpy` into python's root namespace (which is not recommended), we could do so with the following statement (which we will not execute in this example):

```
from numpy import *
```
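
As a minimal sketch of importing a single function with `from`, consider the following example. It uses the standard-library module `math` instead of `numpy` (an assumption made only so that the example runs without third-party packages); the same syntax applies to `numpy`.

```python
# Import only the function sqrt from the standard-library module math
# (math stands in for numpy here so that no extra package is needed).
from math import sqrt

# Import the whole module under a local alias, as in "import numpy as np".
import math as m

# sqrt is now available directly, without a namespace prefix.
root = sqrt(16)
print(root)

# The same function is also reachable through the alias.
print(m.sqrt(16) == root)
```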

## Basic data types

Python is a dynamically-typed language, i.e., we do not have to declare the types of variables, as they are determined automatically at runtime. This means that we can simply assign values of different types to variables without having to explicitly cast them. We can check the type of a variable using the `type` function. Let us try this by assigning the integer 42 to a variable `i` and then printing the type:

In [3]:
i = 42
print(type(i))

<class 'int'>


If we want to store the number 42 as a float value, we can make an explicit cast using the class constructor `float`. We can cast the integer 42 to a float, assign it to `i` and print the type as follows:

In [4]:
i = float(42)
print(type(i))

<class 'float'>


As mentioned above, python will automatically infer the type of a variable based on its value. If we specify the number 42 with a decimal point, it will automatically store a floating point number. Let us assign the number 42 with a decimal point to variable `i` and print the type:

In [5]:
i = 42.
print(type(i))

<class 'float'>


Python comes with an integrated string class and we can cast values to strings using the class constructor `str`. Let us cast the variable `i` to a string and `print` the value:

In [6]:
i = str(i)
print(i)

42.0
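
The casts shown so far can be combined in both directions. The following is a small sketch with hypothetical variable names; it also shows that casting a float to an integer truncates the fractional part.

```python
# Cast a string that contains digits to an integer and to a float.
n = int("42")
f = float("42")

# Cast the numbers back to strings.
s = str(n)
t = str(f)

print(type(n), type(f), type(s))
print(s, t)

# Casting a float to an int discards the fractional part.
k = int(42.9)
print(k)
```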


We can define string literals using single or double quotes. This comes in handy if we want to define strings that contain quotes:

In [7]:
s1 = 'some text'
s2 = "some text with a 'quote'"

We can also create multi-line string literals by using the triple-quote. This is often used to create so-called docstrings to document a class or function and we will see this below.

In [8]:
s3 = """ This is a very long string 
that spans
multiple
lines
"""

Boolean (i.e. logical) values are defined using the keywords `True` and `False`. Common comparison operators that yield Boolean results are `==` (equal), `!=` (not equal), `<` (less than), `>` (greater than), `<=` (less than or equal) and `>=` (greater than or equal). They can be combined into logical expressions using the binary Boolean operators `or` and `and`. Let us try this:

In [9]:
b = False
print(b)

x = 42
print(x > 20 and x < 45)

False
True
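
The following sketch extends the example above with hypothetical variable names. It additionally uses the unary operator `not` and a chained comparison, neither of which is mentioned above; both are standard python.

```python
x = 42

# The expression from above, combined with "and".
in_range = x > 20 and x < 45

# Chained comparisons read like the mathematical notation 20 < x < 45.
in_range_chained = 20 < x < 45

# "or" is True if at least one operand is True; "not" negates a Boolean value.
outside = x <= 20 or x >= 45
negated = not in_range

print(in_range, in_range_chained, outside, negated)
```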


## Functions

We can use the `def` keyword to define functions that can be called with different arguments. We can assign default values to optional parameters, which are used if the respective argument is omitted.

The `return` keyword can be used to return a value. Since python uses neither braces (like C/C++/C#/Java) nor begin/end keywords (like Pascal/Basic) as scope delimiters, the statements that belong to the function have to be indented consistently. Let us define a function with three parameters, of which two are assigned default values.

In [2]:
def some_function(x, y=42, z='message'):
    # outputs several printable variables using the print method.
    # By default a space is used to separate the individual values
    print("Parameter x =", x)
    print("Parameter y =", y)
    print("Parameter z =", z)
    return x+y

`python` has an integrated documentation and help system, which is also used by (i) tools that automatically generate reference manuals for software packages, and (ii) IDEs like VSCode that show the documentation of functions as you type. We can call the integrated `help` function to output the documentation of a function or class.

In [3]:
help(some_function)

Help on function some_function in module __main__:

some_function(x, y=42, z='message')



For the function above, the help function only returns the function signature, because we have not included a so-called `docstring`. We can add documentation by placing a multi-line string literal immediately after the function header. It is common to use e.g. [numpy-style docstrings](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html), which can be parsed by automatic documentation tools like Sphinx.

In [4]:
def some_function(x, y=42, z='message'):
    """Outputs several printable variables using the print method.
    
    By default a space is used to separate the individual values. 
    
    Parameters
    ----------
    x : int
        The first parameter.
    y : int
        The second parameter.
    z : str
        The third parameter

    Returns
    -------
    int
        Sum of x and y
    
     """
    print("Parameter x =", x)
    print("Parameter y =", y)
    print("Parameter z =", z)
    return x+y

If we now call the help function, we get the full documentation in addition to the signature:

In [5]:
help(some_function)

Help on function some_function in module __main__:

some_function(x, y=42, z='message')
    Outputs several printable variables using the print method.
    
    By default a space is used to separate the individual values. 
    
    Parameters
    ----------
    x : int
        The first parameter.
    y : int
        The second parameter.
    z : str
        The third parameter
    
    Returns
    -------
    int
        Sum of x and y
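
Besides `help`, the docstring can also be read programmatically, since python stores it in the `__doc__` attribute of the function object. The following is a minimal sketch; the function `add` is a hypothetical example and not part of the session.

```python
def add(x, y=42):
    """Return the sum of x and y."""
    return x + y

# The docstring is stored in the __doc__ attribute of the function object.
print(add.__doc__)
```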



Note that python is a whitespace-sensitive language, i.e., **wrong whitespace characters can break your code or give it a different meaning!** As an example, changing the indentation of the function definition above results in a syntax error:

In [10]:
def some_function(x, y=42, z='message'):
 print("Parameter x =", x)
print("Parameter y =", y)
 print("Parameter z =", z)
 return x+y

IndentationError: unexpected indent (<ipython-input-10-4729bbbdf948>, line 4)

When calling a function, we must provide arguments for all parameters that have not been assigned a default value. By specifying named arguments (i.e. stating "`z = `"), we can also set only particular arguments, in any order. Let us call the function above by specifying (i) only the first argument, (ii) the first argument and the named argument `y`, and (iii) the first argument and the named argument `z`.

In [15]:
result = some_function(0)
print(result)

result = some_function(42, y=45)
print(result)

result = some_function(42, z="some other message")
print(result)

Parameter x = 0
Parameter y = 42
Parameter z = message
42
Parameter x = 42
Parameter y = 45
Parameter z = message
87
Parameter x = 42
Parameter y = 42
Parameter z = some other message
84
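
To illustrate that named arguments can be given in any order, here is a small sketch. The function `describe` is a hypothetical helper with the same signature as `some_function`; it returns its arguments instead of printing them.

```python
def describe(x, y=42, z='message'):
    # Hypothetical helper: returns its arguments as a tuple.
    return (x, y, z)

# Named arguments can be given in any order after the positional argument.
a = describe(1, z='hello', y=2)
b = describe(1, y=2, z='hello')
print(a == b)

# A required parameter can also be passed by name.
c = describe(z='hello', x=1)
print(c)
```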


Importantly, while we can change the order of named arguments, the order of unnamed arguments is fixed. These unnamed arguments are called **positional arguments**, because they are matched to parameters based on their order. This implies that we cannot pass a named argument before a positional argument. Let's try to call the function from above by passing a named argument as the first and only argument, which leaves the required parameter `x` without a value:

In [16]:
some_function(y=45)

TypeError: some_function() missing 1 required positional argument: 'x'
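
Placing a positional argument after a named one is rejected even earlier, namely when the code is parsed. The sketch below demonstrates this with the built-in `compile` function; the call is kept in a string only so that the rest of the cell can still run.

```python
# The call is kept in a string, because a syntax error would otherwise
# prevent the whole cell from being executed.
source = "some_function(y=45, 0)"

try:
    # compile only parses the expression; some_function is never called.
    compile(source, "<example>", "eval")
    raised = False
except SyntaxError as error:
    raised = True
    print("SyntaxError:", error.msg)
```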

## Lambda expressions

Rather than defining functions that consist of multiple expressions, and which are assigned a name, we often encounter situations where we e.g. want to pass a single expression as an argument to another function. As an illustrative example, consider the following function. It performs a generic, yet to be specified `test`, which returns `True` or `False` on the parameter `x`.

In [19]:
def perform_test(x, test):
    if test(x): 
        print('x passes the test')
    else:
        print('x does not pass the test')

We could call this function by passing the name of another function as parameter `test`. But this would require us to define and name this function, while we often simply want to pass a simple expression that returns a boolean value. This can be achieved by so-called [lambda expressions](https://docs.python.org/3/reference/expressions.html#lambda), a special type of anonymous function. The general syntax is: 

```
lambda params: expression
```

With this, we can for instance pass an expression that checks whether the parameter is a string. We can do this using the `isinstance(x, y)` function, which checks whether `x` is an instance of the class `y`:

In [20]:
x = 42
perform_test(x, lambda y: isinstance(y, str))
x = '42'
perform_test(x, lambda y: isinstance(y, str))

x does not pass the test
x passes the test
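
Lambda expressions are also commonly passed to built-in functions. As a sketch under the assumption of a hypothetical list `words`, the following uses a lambda as sort key for `sorted` and as a test for `filter`.

```python
words = ['pandas', 'numpy', 'scipy', 'matplotlib']

# Sort by word length instead of alphabetically; the lambda computes the sort key.
by_length = sorted(words, key=lambda w: len(w))
print(by_length)

# Keep only the words that start with the letter 's'.
starts_with_s = list(filter(lambda w: w.startswith('s'), words))
print(starts_with_s)
```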


## Loops and if-clauses

We can use for-loops to execute statements for a given number of elements. Instead of the syntax `for (int i=0; i<10; i++)` which you probably know from languages like Java or C, in `python` the syntax of for-loops is

`for i in SEQUENCE:`

where `SEQUENCE` can be any iterable object. For the common case that we want to iterate over increasing or decreasing numbers, we can use the `range(start, stop)` function. It generates a sequence of numbers in a range that we can then iterate over. The `stop` argument is a **non-inclusive** upper bound of the sequence, the `start` argument is an **inclusive** lower bound. If `start` is omitted, i.e. if we call `range(stop)`, it defaults to zero, so a call to `range(10)` generates a sequence consisting of the 10 numbers [0,1,2,3,4,5,6,7,8,9].

Let us try this by printing the integers from 1 to 15:

In [21]:
for x in range(1, 16):
    print("x =", x)

x = 1
x = 2
x = 3
x = 4
x = 5
x = 6
x = 7
x = 8
x = 9
x = 10
x = 11
x = 12
x = 13
x = 14
x = 15
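
Since a for-loop accepts any iterable object, we are not limited to `range`. The following sketch, with hypothetical variable names, iterates over the elements of a list and over the characters of a string.

```python
# Iterate over the elements of a list.
total = 0
for value in [1, 2, 3]:
    total = total + value
print(total)

# Iterate over the characters of a string.
letters = []
for character in 'abc':
    letters.append(character)
print(letters)
```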


As we see in the following cell, the `range` function returns an iterable `range` object:

In [22]:
type(range(1, 16))

range

We can generate the actual ordered list of numbers by passing this iterable object to the constructor of the class `list` as follows:

In [23]:
list(range(1, 16))

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
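
The `range` function also accepts an optional third argument, the step size, which is not used above. As a short sketch, a positive step skips numbers and a negative step generates a decreasing sequence:

```python
# With two arguments, range counts from start (inclusive) to stop (exclusive).
window = list(range(3, 7))
print(window)

# A third argument sets the step size.
evens = list(range(0, 10, 2))
print(evens)

# A negative step generates a decreasing sequence.
countdown = list(range(5, 0, -1))
print(countdown)
```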

The syntax of `while` loops is more familiar:

```
while EXPRESSION:
    STATEMENT
```

The following thus prints the integers from 1 to 15:

In [24]:
x = 1
while x <= 15:
    print("x =", x)   
    x = x+1

x = 1
x = 2
x = 3
x = 4
x = 5
x = 6
x = 7
x = 8
x = 9
x = 10
x = 11
x = 12
x = 13
x = 14
x = 15
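
A `while` loop can also accumulate a result instead of printing. The sketch below sums the integers from 1 to 15 and compares the result with the built-in `sum` applied to the corresponding `range`; the variable names are chosen for illustration only.

```python
total = 0
x = 1
while x <= 15:
    # Add the current number, then move on to the next one.
    total = total + x
    x = x + 1
print(total)

# The same sum computed from the equivalent range.
print(total == sum(range(1, 16)))
```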


As in other languages, `if`-clauses allow us to execute statements conditionally. Their syntax is similar to that of loops, i.e. a colon followed by one or more indented statements. The `else` and `elif` branches are optional and can be omitted. `elif` is short for `else if`, i.e. it is an additional condition that is checked only if the previous one does not apply. This is illustrated in the following example:

In [23]:
x = 3
y = 5
if x < y:
    print("x is smaller than y")
elif x > y:
    print("x is larger than y")
else: 
    print("x and y are equal")

x is smaller than y
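
To see all three branches in action, we can wrap the if-clause in a function and call it with different values. This is a minimal sketch; the function `compare` is a hypothetical helper and not part of the session.

```python
def compare(x, y):
    # Hypothetical helper that wraps the if-clause from above in a function.
    if x < y:
        return "x is smaller than y"
    elif x > y:
        return "x is larger than y"
    else:
        return "x and y are equal"

print(compare(3, 5))
print(compare(5, 3))
print(compare(4, 4))
```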
