<img align="center" style="max-width: 1000px" src="https://raw.githubusercontent.com/HSG-AIML-Teaching/AI2024-Lab/main/banner.png?raw=1">

<img align="right" style="max-width: 200px; height: auto" src="https://raw.githubusercontent.com/HSG-AIML-Teaching/AI2024-Lab/main/hsg_logo.png?raw=1">

##  Lab 01 - "Python 101: Jupyter Notebooks and Python Basics"

Artificial Intelligence (Spring 2024), University of St. Gallen

In this first lab, we will touch on the basic concepts and techniques of Notebooks and the Python programming language in general.

## Objectives
* learn how to use a Jupyter Notebook
* learn about the basics of the Python programming language
    * the very basics
    * basic data types and data containers
    * fundamental programming structures
    * Python functions
    * Python modules


## Jupyter Notebooks

The lab environment of this course builds on Jupyter Notebooks (https://jupyter.org), which provide a full-fledged Python environment for writing Python code. 

Jupyter Notebooks run on a **server** - either locally or in the cloud - allowing different users to access and use a pre-defined environment with pre-installed modules and packages. 

We will use Jupyter Notebooks with a **Python kernel**, meaning that our Notebooks will utilize the Python programming language. However, Notebooks can be utilized with a wide range of programming languages.

The main advantage of Jupyter Notebooks is their unique character that mixes code elements with extensive documentation. This combination utilizes the concept of **cells** that either contain static content (like markdown text or other media) or code. The nature of a cell can be chosen from the corresponding menu (depends on the interface you are using).

#### Cells

This cell is a **markdown cell**. It only contains static text. You can modify the text by clicking into this cell (maybe twice, depending on the interface that you are using). Once you're done, "run" the cell by clicking the corresponding button or by hitting `shift`+`enter` to render the output.

The following cell is a **code cell** that contains Python code. To evaluate a code cell, hit the "run" button or press `shift`+`enter` on your keyboard. Evaluating a cell evaluates each line of code in sequence, and prints the result of the last line (if any) below the cell.

In [None]:
40 + 2

A code cell can contain several lines of code, but keep in mind that only the result of the last line will be displayed.

In [None]:
2 + 2
3 + 3
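If both results should be visible, each value can be passed to the `print` function (introduced in the next section). A minimal sketch; the variable names `first` and `second` are illustrative only:

```python
# Only the value of the last expression in a cell is displayed automatically,
# so print() is used here to show both results.
first = 2 + 2
second = 3 + 3
print(first)   # prints 4
print(second)  # prints 6
```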

Finally, you can create and insert new cells (of either type). This process depends on where your Notebook is running. In Google Colab, you can simply move your mouse cursor between two cells and two buttons will appear that will add a code cell or a markdown cell.

# Python

## The very basics

Writing Python **code** is as simple as writing text. 

Let's start with the typical Hello-World example:

In [None]:
print('hello world!')

This code cell consists of a **function** call to the `print` function, which receives one **argument** (`'hello world!'`). The purpose of the `print` function is to print the argument on the screen. 

In order to support more or less complex calculations, Python supports **variables** to which values can be assigned:

In [None]:
a = 3

The value of the **variable** `a` is now 3; whenever the variable is evaluated, it is replaced by this value:

In [None]:
a

This can be exploited for mathematical expressions:

In [None]:
a + 1

But note that the value of `a` does not change:

In [None]:
a

The value of `a` only changes when the assignment operator `=` is involved:

In [None]:
a = a + 1

In [None]:
a

#### Comments

The `#` character has a special meaning in Python. Whenever it appears, all following characters on the same line are considered a comment and are ignored. You can use comments to explain what certain parts of your code do.

In [None]:
# this is a comment

In [None]:
print(1) # comments can be on the same line as code, too

## Basic Data Types

The data type defines how data is stored in your computer's memory. This has implications for the format of the data and can set upper and lower bounds on the values that can be represented.

In Python, we do not need to declare a variable by explicitly mentioning the data type (as, e.g., in C or other programming languages). This feature is known as **dynamic typing**.

Python determines the type of a literal (an element of code that has a fixed value) directly from the syntax at **runtime**. For example, numbers written without a decimal point are stored as integers, whereas those written with a decimal point are stored as floats (see below for definitions). The idea is to use the datatype that can store the data without loss in the most efficient way (i.e., using the least complex datatype).

There are a number of basic data types in the Python programming language, the most important of which are:

> * **Integers** - represent positive or negative whole numbers with no decimal point.
> * **Floats** - represent positive or negative real numbers and are written with a decimal point.
> * **Strings** - represent sequences of Unicode characters.
> * **Booleans** - represent constant objects that are either `False` or `True`. 

### Numerical Data Types: Integer and Float

**Whole numbers** (positive or negative) are represented in Python by **Integers** or **int**s. In Python 3, there is effectively no limit to how big an integer value can be. 

The **Float** datatype represents **real numbers** with a precision of roughly 15 to 17 significant decimal digits (counted across the whole number, not only after the decimal point). 
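The difference between the two numerical types can be made visible with a short sketch. It uses the standard-library `sys` module, which is not otherwise part of this lab, and the variable names `big` and `total` are illustrative:

```python
import sys

# Integers grow as needed; this value needs far more than 64 bits.
big = 2 ** 100
print(big)  # 1267650600228229401496703205376

# Floats are binary approximations of decimal values,
# so small representation errors can appear.
total = 0.1 + 0.2
print(total)         # 0.30000000000000004
print(total == 0.3)  # False

# Number of decimal digits that a float can always represent faithfully.
print(sys.float_info.dig)  # 15
```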

As a result of *dynamic typing*, Python will decide whether to store a value as an integer or a float: 

In [None]:
x = 3
x

`x` is most likely an integer (as it contains a whole number), but we can check with the `type()` function [we will learn later that functions always have a set of brackets that contain the function's arguments]:

In [None]:
type(x)

Yes, it is an integer! So, if we assign a real number, will the result be a float?

In [None]:
y = 3.14159
type(y)

It certainly is. You can also enforce a specific data type. For instance, we can convert our integer value (`x`) into a float value: 

In [None]:
float(x)

Note the change in notation: instead of `3`, the output is now `3.0`, indicating the value's float nature. 

Of course, we can also convert our float value (`y`) into an integer value:

In [None]:
int(y)

Be aware of what is happening here: the real value `3.14159` cannot be represented as an integer without loss of information. The `int()` function does not round; it simply discards the fractional part (truncation toward zero). To reflect its integer nature, the resulting value has no decimal point.
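How `int()` treats the fractional part is easiest to see with a negative number. The following sketch compares it with the built-in `round()` function; the variable `negative` is an illustrative value that is not used elsewhere in this lab:

```python
y = 3.14159
negative = -3.7  # illustrative value

print(int(y))           # 3
print(int(negative))    # -3: the fractional part is cut off (toward zero)
print(round(negative))  # -4: rounded to the nearest whole number
print(round(y, 2))      # 3.14: rounded to two decimal places
```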

#### Mathematical Operations

Variables are important to store data and to perform mathematical operations on them. Such operations are defined following the usual algebraic notation with:
* `+` for addition
* `-` for subtraction
* `*` for multiplication
* `/` for division
* `**` for exponentiation
* `%` for the modulo operator

In [None]:
2*y/(x+1)

Please note how the typical rules involving brackets are followed. More complex mathematical functions are available in Python, but we will introduce them later when we start using the NumPy package.
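The operators listed above can be tried out on two small integers. This is only a sketch: the variables `p` and `q` are illustrative, and the floor-division operator `//` is shown in addition to the operators from the list:

```python
p = 7
q = 2

print(p + q)   # 9
print(p - q)   # 5
print(p * q)   # 14
print(p / q)   # 3.5: division always returns a float
print(p // q)  # 3: floor division drops the fractional part
print(p ** q)  # 49: p to the power of q
print(p % q)   # 1: remainder of the division p / q
```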

For now, let's make another observation: the values of `x` and `y` have not changed.

In [None]:
x, y

[this notation may look weird for now; we created a tuple, which we introduce below formally; for now, this is just an efficient way to display more than one value]

Of course, we can assign the result of our computation to another variable and store its value:

In [None]:
z = 2*y/(x+1)
z

Great, now we can use Python as a sophisticated calculator!

#### Strings

A sequence of characters enclosed within either single quotes `'` or double quotes `"` is a **string** in Python. Any letter, number or symbol can be part of a string. All the characters between the opening delimiter and the matching closing delimiter are part of the string:

In [None]:
s1 = 'hello'    # string literals can use single quotes
s2 = "gruezi"  # or double quotes; it does not matter.
s1, s2

Let us check that the datatype of either variable is what we expect:

In [None]:
type(s1)

Strings are sequences, so they have a specific length. Using the `len()` function, we can get the length of the strings:

In [None]:
len(s1), len(s2)

#### Indexing and Slicing

Like most programming languages, Python allows accessing individual characters of a string based on their "index", which indicates the position in the string. The process of "getting" a specific element of the sequence is symbolized by a pair of square brackets (`[]`) and is typically referred to as **indexing**.

**Important Python Rule No. 1: Python starts counting indices at 0.**

The index of the first character is 0, the index of the second character is 1, and so on:

In [None]:
s1[0], s1[1], s1[2] # getting the first, second and third character of s1

Reaching the end of a string seems tedious? Not really, since Python also supports negative indices. Index `-1` represents the last character of the string. Similarly using `-2`, we can access the penultimate character of the string and so on:

In [None]:
s1[-1], s1[-2] # getting the ultimate and penultimate character of s1

Similar to indexing, which only extracts individual characters, we can also extract parts of strings. This process is called **slicing** and uses a very similar notation:

In [None]:
s1[1:3] # slice from the second to the fourth (exclusive) character

Here, we create a slice that starts with the second character (index `1`) and ends before reaching the fourth character (index `3`). The range between those indices is indicated by the colon symbol. 

Python is smart about this notation. Leaving out the first (start) index will create a slice that starts at the very beginning of the string and leaving out the second (stop) index will create a slice that reaches the very end of the string: 

In [None]:
s1[:3], s1[3:]

Naturally, a slice without a start and stop index will return the full string:

In [None]:
s1[:]

One more detail: slices can also have a step size, which defaults to one. If a different step size is desired, it is simply added after another colon:

In [None]:
s1[::2]

This will return only every second character. Very useful is a step size of `-1`, which will simply reverse the string:

In [None]:
s1[::-1]

A very common error that occurs when indexing is the `IndexError`. It happens when the user tries to index an element that does not exist (index > length-1):

In [None]:
#s1[10]
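To observe the error without interrupting the notebook, it can be caught with a `try`/`except` block. This construct is not covered in this lab, so the following is only a sketch; the names `caught` and `partial` are illustrative:

```python
s1 = 'hello'

caught = False
try:
    s1[10]  # 'hello' only has the indices 0 to 4
except IndexError as error:
    caught = True
    print('IndexError:', error)

# In contrast to indexing, slicing is forgiving: a stop index beyond
# the end of the string is silently limited to the string's length.
partial = s1[2:10]
print(partial)  # llo
```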

#### String manipulations

We can concatenate two strings with the `+` symbol [addition makes no sense for strings, so the symbol refers to concatenation in this case]:

In [None]:
s3 = s1 + ', ' + s2 + ' and guten tag'  # string concatenation
s3

[in case you are wondering what `', '` and `' and guten tag'` are: those are string literals]


Strings are implemented as a class (see below), and a number of useful methods exist for working with them. For instance, you can replace every occurrence of a character with the `replace()` method:

In [None]:
s1.replace('e', 'a')

You can strip a string of leading and/or trailing symbols like whitespaces with the `strip()` method:

In [None]:
'  there are too many whitespace around this       '.strip()

Another useful method splits a string into a list (see below) based on some delimiter symbol; this can be achieved with the `split()` method:

In [None]:
'John,12.5,New_York,#FF0000'.split(',')

Finally, strings can be formatted nicely with their `format()` method (see [here](https://docs.python.org/3/library/string.html#format-string-syntax) for details):

In [None]:
temp = 12.123425624523

print("The current temperature is {:.1f}C.".format(temp))
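Since Python 3.6, the same format specification can also be written as a formatted string literal ("f-string"), which places the variable directly inside the curly braces. A short sketch for comparison; the variable names `with_format` and `with_fstring` are illustrative:

```python
temp = 12.123425624523

# str.format(): the value is passed as an argument
with_format = "The current temperature is {:.1f}C.".format(temp)

# f-string: the variable name appears inside the braces
with_fstring = f"The current temperature is {temp:.1f}C."

print(with_fstring)                 # The current temperature is 12.1C.
print(with_format == with_fstring)  # True
```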

### Booleans

Python provides a Boolean data type. Objects of the Boolean type may have one of two values, `True` or `False`:

In [None]:
a = True
b = False
a, b

Booleans are often used in Python to test conditions. For example, variables can be compared to values:

In [None]:
x > 1  # read: is x greater than 1?

In [None]:
x > 3  # read: is x greater than 3?

In [None]:
x == 2  # read: is x equal to 2?

Other available comparison operators are `<` (less than), `>=` (greater than or equal to), `<=` (less than or equal to) and `!=` (not equal to).

We can combine logical expressions:

In [None]:
(x > 2) and (y < 4)

Available logic includes `and`, `or` and `not`.
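A brief sketch of all three logical operators, using the same values for `x` and `y` as above (the result names `both`, `either` and `negated` are illustrative):

```python
x = 3
y = 3.14159

both = (x > 2) and (y < 4)   # True only if both conditions hold
either = (x > 3) or (y > 4)  # True if at least one condition holds
negated = not (x > 3)        # inverts the result of the condition

print(both)     # True
print(either)   # False
print(negated)  # True
```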

Booleans and logical expressions will be important in the context of `if` statements, which we will introduce later.

## Data Containers

Python provides a number of collection data types to store more complex data; the three most important data containers are:

> * **List** - a collection that is ordered and mutable (changeable). 
> * **Tuple** - a collection that is ordered and immutable (unchangeable).
> * **Dictionary** - a collection of key-value pairs that is mutable (changeable) and indexed by key. Keys must be unique; since Python 3.7, dictionaries preserve insertion order.


### Lists

A list is a collection or sequence of basic Python data types in which "elements" are ordered in a mutable (elements can be changed) sequence.

In Python, lists are written with square brackets. Python lists allow duplicate elements, are resizeable and can contain elements of different data types. Lists can be initialized like this:

In [None]:
l = [42, 5, 128, 5, 97, 208]
l

Since we already introduced strings, which are sequences of characters, you will see that many properties also apply to lists. 

Lists have a length:

In [None]:
len(l)

Lists support indexing:

In [None]:
l[2]

Lists support slicing:

In [None]:
l[1:4]  # second to fourth element

Lists are their own data type:

In [None]:
type(l)

Lists are mutable, so we can change individual elements...

In [None]:
l[1] = 0
l

... and we can append an element to the end of a list:

In [None]:
l.append('test')    # add a new element to the end of the list
l

Note that lists do not care about the datatypes of their elements: we can easily append a string to a list of integers.

### Tuples

Tuples are basically immutable lists. They use parentheses instead of square brackets as a visual distinction:

In [None]:
t = (1, "test", 56)
t

Most list functions also work on tuples:

In [None]:
len(t)

But keep in mind that tuples cannot be modified (because they are immutable):

In [None]:
#t[0] = 5
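The attempt to modify a tuple raises a `TypeError`. The sketch below catches it with a `try`/`except` block (not covered in this lab) and shows one possible way to obtain a changed tuple by building a new one; the names `failed` and `t_new` are illustrative:

```python
t = (1, "test", 56)

failed = False
try:
    t[0] = 5  # tuples do not support item assignment
except TypeError as error:
    failed = True
    print('TypeError:', error)

# The original tuple is unchanged; a modified copy has to be built anew.
t_new = (5,) + t[1:]
print(t)      # (1, 'test', 56)
print(t_new)  # (5, 'test', 56)
```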

### Dictionaries

In Python, dictionaries are used to store associations similar to actual dictionaries that, e.g., translate between two different languages.

A dictionary is a collection in which "elements" consist of (key, value) pairs; it is mutable and indexed by key, and since Python 3.7 it preserves the order in which elements were inserted. Every key is only allowed once in the dictionary (since it serves as the index), but the same value can occur many times. Dictionaries are written with curly brackets and can be initialized like this:

In [None]:
d = {'cat': 'Katze', 'flower': 'Blume', 'house': 'Haus', 'owl': 'Eule'}
d

Retrieve and print the value corresponding to the key "cat":

In [None]:
d['cat']

We can add new dictionary elements:

In [None]:
d['pumpkin'] = 'Kürbis'
d

And we can modify existing dictionary elements by overwriting them (since every key is only allowed once):

In [None]:
d['owl'] = 'Kauz'
d

We can retrieve all keys from a dictionary:

In [None]:
d.keys()

We can retrieve all values from a dictionary:

In [None]:
d.values()

Try to retrieve a dictionary value that is not contained in the dictionary (this will result in a `KeyError`): 

In [None]:
#print(d['apple'])
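A `KeyError` can be avoided by checking for the key first with the `in` operator, or by using the dictionary's `get()` method, which returns a default value instead of raising an error. A minimal sketch with a shortened version of the dictionary from above; the names `has_apple`, `missing` and `fallback` are illustrative:

```python
d = {'cat': 'Katze', 'owl': 'Eule'}

has_apple = 'apple' in d              # membership test on the keys
missing = d.get('apple')              # None if the key does not exist
fallback = d.get('apple', 'unknown')  # custom default value

print(has_apple)  # False
print(missing)    # None
print(fallback)   # unknown

# items() yields (key, value) pairs in insertion order
for key, value in d.items():
    print(key, '->', value)
```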

## Fundamental Programming Structures

Now that we introduced all the necessary datatypes, we can have a closer look at the basic programming structures of Python:

> * **For-Loops** - used to iterate over a block of code.
> * **If-Statements** - used to branch between different code blocks based on conditions.

### Loop structures: `for` loops

`for` loops enable your program to iterate over a block of code, solving repetitive tasks and problems.

The idea behind `for` loops is simple as shown in this example:

In [None]:
for s in ['cat', 'night', 'pumpkin']:
    print(s)

What happens here can be easily translated into human language: `for` each element (we name it `s`) in a list (`['cat', 'night', 'pumpkin']`), print `s` on the screen (`print(s)`).

**Important Python Rule No. 2: Python relies on the concept of indentation, using whitespace (or tabs), to define a block of code**. This means that all code lines that use the same level of indentation belong to the same block. This is relevant for for-loops, if-statements, function definitions, etc. Other programming languages such as Java or C# often use brackets or curly-brackets for this purpose. 

Consider the following examples:

In [None]:
for s in ['cat', 'night', 'pumpkin']:
    print(s)
    print("I'm here")

In [None]:
for s in ['cat', 'night', 'pumpkin']:
    print(s)
print("I'm here")

The outcome differs, depending on whether the statement `print("I'm here")` is inside the indented code block, or not. So please be aware!

There are two important keywords to control `for` loops: `continue` and `break`. The following examples will showcase what they do. We will make use of the `range()` function here, which produces a sequence of integers from a given start value up to, but not including, a given stop value:

In [None]:
for i in range(1, 10):
    print(i)
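In Python 3, `range()` does not build a list; it returns a lightweight `range` object that generates its values on demand. The sketch below converts such an object into a list to make its contents visible and also shows the optional third argument, the step size (the names `numbers` and `stepped` are illustrative):

```python
numbers = range(1, 10)
print(numbers)        # range(1, 10)
print(list(numbers))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]: the stop value 10 is excluded

# An optional third argument sets the step size.
stepped = list(range(0, 10, 3))
print(stepped)        # [0, 3, 6, 9]
```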

We can use the `break` keyword to stop a loop from iterating any further [note that the `if` statement is introduced below]:

In [None]:
for i in range(1, 10):
    if i == 4:
        break
    print(i)

In contrast, the `continue` keyword is used to tell Python to skip the remainder of the code block and to continue with the next iteration of the loop:

In [None]:
for i in range(1, 10):
    if i == 4:
        continue
    print(i)

### Decision structures: `if` statements

Decision structures evaluate expressions that produce Booleans (**True** or **False**) as an outcome and enable your program to use different branches based on these outcomes.

For instance, we can use `if` statements to compare numbers:

In [None]:
a = 4
b = 7

if b > a:
    print("b is greater than a")

If the condition (`b > a`) is `True`, the following code block will be executed. Note the same use of indentation to define this code block as we saw in the discussion of the `for` loop.

We can easily increase the number of possible branches by adding additional conditions using the `elif` keyword. The `elif` keyword is Python's way of saying "if the previous conditions were not true, then try this condition":

In [None]:
a = 4
b = 4

# test condition 1
if b > a:
    print("b is greater than a")

# test condition 2
elif a == b:
    print("a and b are equal")

elif a != b:
    print("test check and so on... ")

Finally, we can use the `else` keyword to catch any case which isn't found by the preceding conditions:

In [None]:
a = 8
b = 4

# test condition 1
if b > a:
    print("b is greater than a")

# test condition 2
elif a == b:
    print("a and b are equal")

# all other cases
else:
    print("a is greater than b")

In the example above, the value assigned to the variable `a` is greater than the value assigned to `b`, so the first `if` condition is not true. The `elif` condition is not true either, so we ultimately reach the `else` branch and print that "a is greater than b".

## Python Functions

A function is a block of organized, reusable code that is used to perform a single, well-defined action. Functions provide better modularity for your application and allow for a high degree of code reuse. As you already saw, Python provides you with many built-in functions such as `print()`, but you can also create your own functions. These functions are called **user-defined functions**.

A function is a block of code that only runs when it is called. You can pass data, known as arguments or parameters, into a function. A function can return data as a result. Python functions are defined using the `def` keyword.

Let's recycle the code comparing two numbers from the previous section and put it into a function:

In [None]:
def compare(a, b):
    """This function compares two numbers and outputs its verdict on the screen."""
    if b > a:
        print("b is greater than a")
    elif a == b:
        print("a and b are equal")
    else:
        print("a is greater than b")

First, notice the use of indentation here: all code lines use indentation since they are part of the function definition. Those lines that are part of the if-structure use additional indentation. 

The line enclosed in triple double quotes is called a **docstring**. It is common to include a docstring that briefly outlines what the function does.

Let's evaluate this function:

In [None]:
compare(1, 2)

In [None]:
compare(100, 2)

This does the trick, but the more "pythonic" way would be not to print the result on the screen, but to return it to the user:

In [None]:
def which_number_is_greater(a, b):
    """This function compares two numbers and returns the larger one."""
    if a > b:
        return a
    elif b > a:
        return b
    else:
        return None

In [None]:
which_number_is_greater(100, 99)

Note that the function call gets replaced with the result, which can be used right away in mathematical expressions:

In [None]:
which_number_is_greater(100, 99) + 1
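One caveat: when both numbers are equal, the function returns `None`, and adding `1` to `None` would raise a `TypeError`. The sketch below repeats the function definition so that it runs on its own and checks the result before using it; the names `result` and `message` are illustrative:

```python
def which_number_is_greater(a, b):
    """This function compares two numbers and returns the larger one."""
    if a > b:
        return a
    elif b > a:
        return b
    else:
        return None

result = which_number_is_greater(5, 5)
print(result)  # None

# Check for None before doing arithmetic with the result.
if result is not None:
    message = result + 1
else:
    message = "both numbers are equal"
print(message)  # both numbers are equal
```

For the common case of simply picking the larger of two values, Python's built-in `max()` function can be used instead.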

Finally, there are two types of arguments: positional arguments, which are identified by their position in the function call, and keyword arguments, which are passed by name and typically have default values:

In [None]:
def some_math(a, b, c=0, d=1):
    """A generic function that does some random math."""
    return (a + 2*b + c) * d

In [None]:
some_math(1, 2)

In [None]:
some_math(1, 2, d=4, c=2)
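The two calls above can be checked by hand against the formula `(a + 2*b + c) * d`. The following sketch repeats the function definition so that it runs on its own and adds two further ways of calling it (the result names are illustrative):

```python
def some_math(a, b, c=0, d=1):
    """A generic function that does some random math."""
    return (a + 2*b + c) * d

defaults = some_math(1, 2)               # c=0 and d=1 are used: (1 + 4 + 0) * 1
by_keyword = some_math(1, 2, d=4, c=2)   # keyword order does not matter: (1 + 4 + 2) * 4
by_position = some_math(1, 2, 2, 4)      # c and d can also be given by position
all_named = some_math(b=2, a=1)          # even a and b can be passed by name

print(defaults)     # 5
print(by_keyword)   # 28
print(by_position)  # 28
print(all_named)    # 5
```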

## Python Modules

Python provides a huge list of "built-in" functionality, things that it can do out-of-the-box. However, for specific use cases - like training deep neural networks - it requires functionality from external modules that can be readily "imported" into your Python runtime environment.

We will learn about the NumPy package next, so let's see how we can import it.

In [None]:
import numpy

That's it. Now presume that we would like to use its `sqrt` function to calculate the square-root of 9:

In [None]:
numpy.sqrt(9)

So, in order to use a function from NumPy, we have to specify that we would like to use the `sqrt` from NumPy by typing `numpy.sqrt`. This is necessary to resolve possible confusion with other functions that might carry the same name. 

Since typing `numpy.` every time we want to use a NumPy function quickly becomes tedious, you can give the NumPy module a different name (`np` is the usual choice) so you have to type less:

In [None]:
import numpy as np

np.sqrt(9)
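The same import mechanics apply to every Python module. As a sketch that runs without any additional installation, the following uses the standard-library `math` module instead of NumPy (this substitution is an assumption made for illustration; the alias `m` is an arbitrary choice) and also shows how a single function can be imported directly:

```python
import math                 # import the whole module
import math as m            # import the module under a shorter alias
from math import sqrt       # import a single function by name

root_full = math.sqrt(9)    # module name as prefix
root_alias = m.sqrt(9)      # alias as prefix
root_direct = sqrt(9)       # no prefix needed

print(root_full, root_alias, root_direct)  # 3.0 3.0 3.0
```

Importing a function directly saves typing, but it gives up the prefix that distinguishes functions of the same name from different modules.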