# The Programming Language Python

## Output

- the upper, grey-shaded part of a "code cell" contains the input (meaning the code that you write) while the lower, white-shaded part contains the output
- in addition to code cells, this "notebook" also contains text cells (like this one)

<br>

- it is possible to generate output from code cells in *interactive mode* where only the last output is printed to the output field 
- to output more than one value, the ```print(<message>)``` function must be used (as in *script mode*) &rarr; it accepts any kind of printable object, e.g. strings, lists, numbers

In [None]:
"Moin Kiel!" # this isn't printed
"Hello World!"

In [None]:
print("Moin Kiel!")
print("Hello World!")  # the famous "Hello World" program

## Variables

### Definition

- are symbolic names for addressing an area of the main memory &rarr; placeholders used to store values
- have *name*, *value*, and *data type* &rarr; variable is *declared* and value is *assigned*

#### Types

- data type of variable determines
    - what values variable can hold,
    - how variable is stored in main memory of computer
- type of variable does not have to be declared like in other programming languages (e.g. Java: ```int mynumber;```) &rarr; type is inferred from value of variable
- type can be read using the ```type(<var>)``` function

In [None]:
# str (string): "Hi"
print(type("Hi"))

# int (integer -> whole number): 5
print(type(5))

# float (floating-point number -> numbers with decimal point): 3.141
print(type(3.141))

# bool (boolean -> True or False): True
print(type(True))

#### Illegal Names

Some variable names are not allowed:
- starting with a number
- using illegal characters like '@'
- keywords like ```for```

In [None]:
1st_illegal_variable = 1

In [None]:
illeg@l_variable = 1

In [None]:
for = 1

#### Case sensitivity

- Python variable names are case sensitive

In [None]:
Bob = "Bob"
bob = "bob"

print(Bob == bob) # is 'bob' the same as 'Bob'?

### Typecasting

- process of converting type of variable - all built-in types in Python can be found [here](https://www.w3schools.com/python/python_datatypes.asp)
- done by enclosing variable inside variable type function
    - ```str(<var>)``` &rarr; conversion to string
    - ```int(<var>)``` &rarr; conversion to integer
    - ```float(<var>)``` &rarr; conversion to floating-point
    - ```bool(<var>)``` &rarr; conversion to boolean

<br/>

- sometimes necessary as certain operations only work with certain variable types &rarr; e.g. only strings can be concatenated, not strings and integers

In [None]:
"This is number" + 3

In [None]:
"This is number " + str(3)

### User input

Variable values can be assigned by user input:

- input into program can be provided using the ```input(<message>)``` function
- input is string datatype
- input can be assigned to variable: ```var = input(<message>)```

In [None]:
input_variable = input("What is your name?")

print(input_variable)
print(type(input_variable))  # str-type
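Because ```input()``` always returns a string, numerical input has to be typecast before calculating with it. A minimal sketch, in which the fixed string ```raw_age``` stands in for the value a user would type (the variable names are illustrative):

```python
raw_age = "27"  # stands in for: raw_age = input("How old are you? ")

age = int(raw_age)  # convert the string to an integer before doing arithmetic

print(type(raw_age))
print(type(age))
print(age + 1)  # works only after the conversion
```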

## Operators

### Mathematical operators

- to execute numerical calculations

In [None]:
4 + 2  # Addition

In [None]:
4 - 2  # Subtraction

In [None]:
4 * 2  # Multiplication

In [None]:
9 ** 4  # Exponent

In [None]:
9 / 4  # Division: creates floating-point number

In [None]:
9 // 4  # Integer Division: Division with an integer result. The remainder is of no interest.

In [None]:
9 % 4  # Modulus: remainder of an integer division
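Integer division and modulus belong together: the quotient times the divisor plus the remainder gives back the original number. A short sketch with illustrative variable names; the built-in ```divmod()``` function returns both results at once:

```python
a = 9
b = 4

quotient = a // b   # how often b fits into a completely
remainder = a % b   # what is left over

print(quotient, remainder)
print(quotient * b + remainder == a)  # the two results reconstruct a
print(divmod(a, b))                   # both values as a tuple
```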

### Assignment (in-place) operators

- mathematical operators to update existing variable values
- generate another number
- an overview of (additional) in-place operators can be found [in the official Python documentation](https://docs.python.org/3/library/operator.html#in-place-operators)

In [None]:
n = 10
n += 2  # Addition: replaces n = n + 2

print(n)

n = "Hello"  # with strings, "+" and "+=" concatenate (most other arithmetic operators are not defined for strings)
n += " World"

print(n)

In [None]:
n = 10
n -= 2  # Subtraction

print(n)

In [None]:
n = 10
n *= 2  # Multiplication

print(n)

In [None]:
n = 10
n **= 2  # Power

print(n)

In [None]:
n = 10
n /= 2  # Division

print(n)

In [None]:
n = 10
n //= 2  # Integer Division

print(n)

In [None]:
n = 10
n %= 2  # Modulus

print(n)

### Relational operators

- generate a boolean value
- can also be chained

In [None]:
print(4 > 2)  # greater: also greater-equal '>=', smaller '<', and smaller-equal '<='
print(4 > 2 > 0)  # this is equal to '4 > 2 and 2 > 0' -> this chaining principle also works for other operators

In [None]:
my_int_1 = 4
my_int_2 = 4
my_int_3 = 2

my_float_1 = 4.0
my_float_2 = 4.0
my_float_3 = 2.0

# comparing ints and floats:
print(my_int_1 == my_int_2)
print(my_float_1 == my_float_2)
print()

print(my_int_1 == my_float_1)
print(my_int_3 == my_float_1)

In [None]:
# equality vs being the same:
print(my_int_1 is my_int_2)  # 'is' checks identity (same object in memory), not equality: CPython caches small ints, so do not rely on 'is' for comparing values
print(hex(id(my_int_1)), hex(id(my_int_2)))
print()

print(my_float_1 is my_float_2)
print(hex(id(my_float_1)), hex(id(my_float_2)))
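The difference between equality (```==```) and identity (```is```) is easiest to see with mutable objects. The following sketch uses lists (introduced later) with illustrative names: two separately created lists are equal but not the same object, while a second name for an existing list refers to the very same object.

```python
list_a = [1, 2, 3]
list_b = [1, 2, 3]  # a second, separately created list with equal content
list_c = list_a     # no copy: just another name for the same object

print(list_a == list_b)  # equal values
print(list_a is list_b)  # but two different objects
print(list_a is list_c)  # same object

list_c.append(4)         # changing the object via one name ...
print(list_a)            # ... is visible via the other name, too
```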

In [None]:
4 != 2  # unequal

### Logical Operators

In [None]:
not True  # inversion with 'not' (NOT)

In [None]:
# combination with 'and' - both conditions must be fulfilled (AND):

print(True and True)
print(False and True)
print(False and False)

In [None]:
# combination with 'or' - at least one condition must be met (OR):

print(True or True)
print(False or True)
print(False or False)

In [None]:
# exclusive or with '^' - exactly one condition must be fulfilled (XOR):

print(True ^ True)
print(False ^ True)
print(False ^ False)

- booleans are a subclass of integers, ```True``` is equivalent to ```1``` and ```False``` equivalent to ```0``` 
- therefore, ```True``` and ```False``` can be calculated with like integers 

In [None]:
print(int(True))
print(int(False))

In [None]:
print(True + True)
print(True * 10)

- with typecasting, empty objects (like empty strings), numeric zeros (```0```, ```0.0```), and ```None``` result in ```False```, all others in ```True```

In [None]:
print(bool("Moin"))
print(bool(""))  # besides empty strings same result for empty lists, tuples, sets, dicts, None, ...
print(bool(None))
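As a quick overview, the sketch below runs ```bool()``` over a handful of values that evaluate to ```False```. Note that numeric zeros count as false as well, and that a non-empty string is true regardless of its content. The list name is illustrative.

```python
falsy_values = ["", 0, 0.0, None, [], (), {}, set()]

for value in falsy_values:
    print(repr(value), "->", bool(value))

print(bool("False"))  # a non-empty string is True, whatever it says
print(bool(-1))       # any non-zero number is True
```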

## Strings

### String formatting & concatenation

- can be nested by alternating double quotes ```"``` and single quotes ```'```
- can be concatenated using different methods. f-strings (with leading ```f"```) are the most elegant method, so please don't use the ```format``` and ```%``` methods!

In [None]:
print("This is a string using " + str(3) + " and " + str(4) + " as numbers.")  # using +-signs
print()

In [None]:
print("This is a string using", str(3), "and", str(4), "as numbers.")  # using commas -> adds a space in between
print()

In [None]:
print("This is a string using %s and %s as numbers." % (3, 4))  # using %s-Operator (strings or any object with a string representation)
print("This is a string using %d and %d as numbers." % (3, 4))  # using %d-Operator (integers)
print("This is a string using %.2f and %.3f as numbers." % (3, 4))  # using %.<no of digits>f-Operator (floating-point values)
print()

In [None]:
print("This is a string using {} and {} as numbers.".format(3, 4))  # using format()-function
print()

In [None]:
print(f"This is a string using {3} and {4} as numbers.")  # using f-string
print(f"This is a string using {3:.2f} and {4:.3g} as numbers.")  # using f-string with format

print(f"This is a string using {123:>3} and {34:>2} as numbers.")  # using f-string with right alignment
print(f"This is a string using {3:>3} and {4:>2} as numbers.")

print(f"This is a string using {3:e} and {4:e} as numbers.")  # using f-string with exponential notation

string_integer_1 = 3
string_integer_2 = 4
print(f"This is a string using {string_integer_1 = } and {string_integer_2 = } as numbers.")  # f-string with debugging feature

- triple-quoted strings, enclosed in ```"""``` or ```'''``` (and also used for docstrings), allow for multi-line strings

In [None]:
print("""This is a
multi-line
string""")

- special characters "\n" for linebreak and "\t" for tab &rarr; "\n" allows building multi-line strings in single-quotes

- the ```repr()``` function shows us these special characters
- alternatively, you can provide a *raw string* using the r-notation ```r"<WHATEVER>"``` (a prefix, similar to f-strings), in which backslashes are kept literally instead of being interpreted as special characters

In [None]:
string_with_special_chars = "This is a message with a tab \t to demonstrate special characters. \n That's it."

print(string_with_special_chars)
repr(string_with_special_chars)
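Raw strings are handy wherever backslashes are meant literally. A small sketch with a made-up Windows-style path: in the normal string, ```\n``` and ```\t``` are interpreted as linebreak and tab, whereas the raw string keeps every backslash.

```python
normal_string = "C:\new\table.txt"  # "\n" and "\t" are interpreted as special characters
raw_string = r"C:\new\table.txt"    # backslashes are kept literally

print(normal_string)
print(raw_string)

print(repr(normal_string))
print(len(normal_string), len(raw_string))  # the raw string is two characters longer
```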

### Manipulation of existing strings

- special characters and whitespaces at the beginning and end of strings can be removed using the ```<string>.strip()``` function

In [None]:
string_with_whitespaces = "\n This is a string with leading and trailing whitespaces. \t "

print(repr(string_with_whitespaces))
print(repr(string_with_whitespaces.strip()))
print(repr(string_with_whitespaces.rstrip()))  # remove only at the end
print(repr(string_with_whitespaces.lstrip()))  # remove only at the beginning

- converting strings between different cases is possible using the ```<string>.upper()``` and ```<string>.lower()``` function

In [None]:
print("Hello!".upper())
print("Hello!".lower())

character (sequence) instances in a string can be 
- counted using the ```<string>.count()``` function &rarr; adds up the number of times they appear in the string
- found using the ```<string>.find()``` function &rarr; returns the position number of the first character of the sequence in the string
- replaced using the ```<string>.replace()``` function

In [None]:
print("Hello!".count("l"))
print("Hello!".find("e"))
print("Hello!".replace("o", "ouu"))
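Two details of these methods are worth a quick check: ```find()``` returns ```-1``` when the sequence does not occur, and ```replace()``` returns a new string instead of changing the existing one (strings are immutable). A short sketch with illustrative names:

```python
greeting = "Hello!"

print(greeting.find("l"))  # index of the first "l" (counting starts at 0)
print(greeting.find("x"))  # -1 signals "not found"

shouted = greeting.replace("!", "!!!")
print(shouted)
print(greeting)            # the original string is unchanged
```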

- other string-methods can be found [here](https://www.w3schools.com/python/python_ref_string.asp)

## Decision structures

The execution of statements can be linked to conditions:
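- If the condition is ```True```, the ```if``` statement block is executed
- If the condition is ```False```, the conditions of the ```elif``` blocks are checked one after another, and only the block of the first condition that is ```True``` is executed
- If none of the conditions is ```True```, the ```else``` block is executed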

<br/>

- If-else-trees can be nested
- writing ```if <CONDITION> is True:``` is not necessary - ```if <CONDITION>:``` suffices

<br/>

- Python relies on indentation (whitespace at beginning of line) to define scope in code

In [None]:
condition = True  # False
other_condition = True  # False

if condition:
    print("This is the 'then' block of the first tree.")
    
    if not condition:
        print("This is the 'then' block of the second tree.")  # this won't be executed
    elif not other_condition:
        print("This is the first (and only) 'elif' block of the second tree.")
    else:
        print("This is the 'else' block of the second tree.")
    
elif other_condition:
    print("This is the first (and only) 'elif' block of the first tree.")
else:
    print("This is the 'else' block of the first tree.")
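A flat chain with several ```elif``` branches may make the order of evaluation clearer: the conditions are tested from top to bottom, and only the block of the first condition that holds is run. The temperature value and the thresholds are made up for illustration.

```python
temperature = 18  # hypothetical value in degrees Celsius

if temperature < 0:
    verdict = "freezing"
elif temperature < 15:
    verdict = "cold"
elif temperature < 25:
    verdict = "mild"  # first condition that is True -> all later branches are skipped
elif temperature < 100:
    verdict = "hot"   # also True for 18, but never reached
else:
    verdict = "boiling"

print(verdict)
```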

## Errors and Exceptions

- when you know that something may go wrong, you can catch the exception with a try-statement 

<br/>

- the ```try``` clause is executed &rarr; in case of an exception the ```except``` clause is executed
- the ```else``` clause is executed when there is no error
- in any case (so regardless of the outcome) the ```finally``` clause is executed in the end

<br/>

- the ```except``` condition should always include a specific exception so that it does not catch unexpected errors

In [None]:
integer = 5  # str(5)

try:
    print("It is not possible to concatenate this string with the integer " + integer + ".")

except TypeError as e: 
    print(f"Gotcha! It seems like there is a type error with the message '{e}'.")

else:
    print("No error found.")

finally:
    print("I'm the finally clause which is always executed.")

- exceptions are raised using the ```raise``` keyword and can include a message for the user
- a list of exceptions can be found in the [Python Documentation](https://docs.python.org/3/library/exceptions.html)

In [None]:
raise Exception("Die!")
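Raising and catching usually work together: one piece of code raises a specific exception, another one catches exactly that type. A minimal sketch, assuming a hypothetical helper ```safe_divide``` (functions are introduced later):

```python
def safe_divide(numerator, denominator):
    """Divide two numbers, refusing a zero denominator with a clear message."""
    if denominator == 0:
        raise ValueError("The denominator must not be zero.")
    return numerator / denominator


try:
    result = safe_divide(1, 0)
except ValueError as e:  # catch only the specific exception we expect
    result = None
    print(f"Could not divide: {e}")

print(result)
print(safe_divide(9, 3))
```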

## External Files

Many more file operations can be found [in this article](https://www.programiz.com/python-programming/file-operation).

### Reading from other files

- the built-in Python ```open("<storage path>")``` function takes the storage path of a file (as a string) as parameter and returns a file object, which can be assigned to a variable ```<filename>``` to read the file contents
- when the file is not where Python is asked to look for it, a ```FileNotFoundError``` is raised &rarr; wrapping the ```open("<storage path>")``` function in a ```try``` statement can catch this case

<br/>

- after use, a file should be closed again with ```<filename>.close()``` to release it and avoid write conflicts in the operating system
- alternatively, the ```with``` statement can be used which automatically closes the file

<br/>

- having opened the file:
    - single lines can be read via the function ```<filename>.readline()```
        - this function moves on one line whenever it is called
        - when there are no lines left in the file, it returns an empty string
    - the pointer can be set to a specific position *n* in the file using ```<filename>.seek(n)```, and the current position of the pointer can be read using ```<filename>.tell()```
    - all remaining lines after the current cursor position can be obtained as a list with the ```<filename>.readlines()``` method

In [None]:
f = open("customers.txt", encoding="utf-8")  # fix output with proper "encoding"-parameter
line = f.readline()  # read the first line
print(line)
f.close()  # don't forget to close the file again!
    
# --- alternative:
with open("customers.txt", encoding="utf-8") as f:
    line = f.readline()  # no closing statement is required minimising the risk for file corruption
    print(line)

# --- catching exceptions:
try:  # if the 'with' statement is omitted the file has to be closed in the 'else' statement
    with open("customer_names.txt", encoding="utf-8") as f:
        line = f.readline()
        print(line)

except FileNotFoundError:
    print("File not found. Skipping read process...")
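The interplay of ```readline()```, ```tell()```, ```readlines()```, and ```seek()``` can be tried out on a throwaway file. The sketch below first creates such a file in a temporary directory (so it does not depend on ```customers.txt```); the file name and its content are made up for illustration.

```python
import os
import tempfile

# create a small throwaway file to read from
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, mode="w", encoding="utf-8") as f:
    f.write("first\nsecond\nthird\n")

with open(path, encoding="utf-8") as f:
    first_line = f.readline()   # reads "first\n" and moves the pointer to the next line
    position = f.tell()         # current position of the pointer
    remaining = f.readlines()   # list of all lines after the pointer
    end_of_file = f.readline()  # nothing left -> empty string
    f.seek(0)                   # jump back to the beginning of the file
    again = f.readline()        # reads the first line once more

print(repr(first_line), position)
print(remaining)
print(repr(end_of_file), repr(again))
```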

### Writing to other files

- the file must be opened in *writing mode* using the ```mode="w"``` flag
- a file object can be written to using the ```<filename>.write(<content string>)``` function
- sometimes the encoding of the file has to be specified in the ```open()``` method using the ```encoding=<ENCODING>``` argument

In [None]:
to_be_saved = "Please save me!"

with open('output.txt', mode='w', encoding="utf-8") as f:  # can also be 'output.csv' and such
    f.write(f"{to_be_saved}\n")  # add line break character
    f.write("Alrighty, I'm on my way!")  
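One thing to keep in mind is that opening an existing file with ```mode="w"``` empties it first. The sketch below demonstrates this on a throwaway file in a temporary directory and, as an addition not covered above, shows the append mode ```mode="a"```, which adds to the end of the file instead. File name and contents are illustrative.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output_demo.txt")  # throwaway file

with open(path, mode="w", encoding="utf-8") as f:
    f.write("first version\n")

with open(path, mode="w", encoding="utf-8") as f:  # "w" empties the file before writing
    f.write("second version\n")

with open(path, mode="a", encoding="utf-8") as f:  # "a" appends to the existing content
    f.write("appended line\n")

with open(path, encoding="utf-8") as f:
    content = f.read()  # read the whole file as one string

print(content)
```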

## Loops

- execute instruction blocks more than once – also called iterations

Common elements of all loop types:
1. Initialization of a counter variable (often ```i```, then ```j```, ...) &rarr; nested loops need different counter variables, while loops that follow one another can reuse the same one
2. counting function
3. termination condition

### The for-loop

- fixed number of iterations
    - in range of numbers ```<start>``` and ```<end>``` using ```range``` function &rarr; starting with ```<start>``` value and ending with ```<end>-1``` (!) value
    - in *iterable objects* like strings, lists, tuples, dictionaries

In [None]:
print(type(range(1)))  # 'range' function returns object of type 'range' (and not list -> introduced later)

In [None]:
for i in range(0, 5):
    print(i)
print("---")
    
# range function also supports step sizes:
for i in range(0, 10, 2):
    print(i)
print("---")

# in the range function the start value defaults to 0 and can be omitted in this case:
for i in range(5):
    print(i)
print("---")

# range function can also run backwards using a negative step size:
for i in range(10, 5, -1):
    print(i)

In [None]:
for letter in "Hallelujah!":  # looping through a string
    print(letter)

### The while-loop

- Loop is executed as long as the condition is true &rarr; condition can be changed inside the loop to break it
- ```while True``` loops run indefinitely (until interrupted, e.g. by a ```break``` statement)

In [None]:
n = 10

while n > 0:
    print(n)
    n -= 1
print("Lift off!")

In [None]:
with open("customers.txt", encoding="utf-8") as file:
    line = file.readline()
    while line:                 # readline() returns an empty string at the end of the file -> evaluates to False, ending the loop
        print(line)             # print each line's entry. 
        line = file.readline()  # read the next line

### break and continue statements

- a ```break``` statement, when used inside the loop, will terminate the loop and exit. If used inside nested loops, it will break out from the current loop.
- a ```continue``` statement, when used inside a loop, will stop the current execution, and the control will go back to the start of the loop.

In [None]:
for i in range(10):
    print(f"{i = }")
    
    for j in range(10):
        
        if j > i:
            print("\t Let's get out of here...")
            break  # break also makes 'else'-statement redundant
    
        print(f"\t {j = }")
    
    continue  # continue with next iteration here: the code behind this statement is unreachable
    print("I am invisible.")
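A smaller, single-loop sketch of the same two statements: negative numbers are skipped with ```continue```, and the loop stops entirely at the first zero with ```break```. The list of numbers is made up for illustration (lists are introduced in the next section).

```python
numbers = [4, -1, 7, 0, 12, 3]
positives_before_zero = []

for number in numbers:
    if number == 0:
        break      # leave the loop: 12 and 3 are never looked at
    if number < 0:
        continue   # skip the rest of this iteration and go on with the next number
    positives_before_zero.append(number)

print(positives_before_zero)
```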

## Lists

- data structure consisting of a sequence of values called elements or items 

### Basic operations

- empty lists are initiated by brackets ```my_list = []``` or via the ```my_list = list()``` function

In [None]:
my_list = ["Hi", 3, [1.5, "Bye"]]  # list storing a string, an integer, and another list

In [None]:
# get the length of the list:
print(len(my_list))
print()

In [None]:
# print the whole list:
print(my_list)
print()

- elements can be all types of objects, and also of mixed type
- elements can be addressed by list name and their index: ```<listname>[<index>]``` &rarr; lists are an ordered collection of elements

<br/>

- lists are zero-indexed (!) meaning that the first element is element number zero
- negative indices start from the back of the list &rarr; ```<listname>[-1]``` accesses the last element of the list

In [None]:
# print (or access) a single list element:
print(my_list[0])
print()

hello_statement = my_list[0]

- lists are *mutable* &rarr; they (and their elements) can be modified after being created
- new items can be added to the list:
    - using the ```<listname>.append(<item>)``` function at the end of the list
    - using ```<listname>.insert(<index>, <item>)``` at a certain position &rarr; all elements after ```<item>``` are shifted to the right

<br/>

- items can be deleted in different ways:
    - if the index is known: ```del <listname>[<index>]``` or ```<listname>.pop(<index>)``` &rarr; ```pop``` also returns the value, allowing it to be stored in a variable
    - if the item is known: ```<listname>.remove(<item>)``` &rarr; searches from beginning to end, removing the first entry

In [None]:
# modify the list (in a Jupyter Notebook like this one, this yields different results each time the cell is executed):
my_other_list = [3.141, 2.718]

my_list.append(my_other_list)  # add numbers as list element to list
my_list.extend(my_other_list)  # add numbers elementwise as single elements to list

first_deleted_element = my_list.pop(0)
print(first_deleted_element)

print(my_list)

- strings can be split into lists using the ```<string>.split(<delimiter>)``` function
- lists of strings can be joined using the ```<delimiter>.join(<listname>)``` function 

In [None]:
string_to_be_split = "Simply split this sentence into its words"

splitted_string_list = string_to_be_split.split(" ")
concatenated_string = " - ".join(splitted_string_list)

print(splitted_string_list)
print(type(splitted_string_list))

print()

print(concatenated_string)
print(type(concatenated_string))

### List slicing

Slicing describes taking a part of the list.

- omitting the first index, the slice starts at the beginning
- omitting the second index, the slice ends at the end

In [None]:
list_to_slice = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(list_to_slice[5:8])
print(list_to_slice[:8])  # everything up until (not including) index 8
print(list_to_slice[8:])  # everything from (including) index 8 on
print(list_to_slice[:])   # take whole list

- a slice operator on the left side of an assignment can update multiple elements

In [None]:
list_to_slice = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
list_to_slice[1:3] = ['x', 'y']

print(list_to_slice)
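Beyond start and end, a slice accepts an optional third value, the step size: ```<listname>[<start>:<end>:<step>]```. This is an addition to the cases shown above; the list of letters is illustrative.

```python
letters = ["a", "b", "c", "d", "e", "f"]

print(letters[::2])    # every second element of the whole list
print(letters[1:5:2])  # every second element between index 1 and 4
print(letters[::-1])   # a negative step walks backwards -> reversed copy
print(letters[-2:])    # negative indices work in slices, too: the last two elements

print(letters)         # slicing returns new lists, the original is unchanged
```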

### List combination

- lists can be concatenated with the ```+``` operator &rarr; other operations like subtraction do not work this way (even when both lists only contain numerical values)

- alternatively the ```.extend()``` method could be used

In [None]:
list_1_to_combine = [1, 2, 3]
list_2_to_combine = ["A", "B", "C"]

print(list_1_to_combine + list_2_to_combine)
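The two approaches differ in one respect: ```+``` builds a new list and leaves both operands untouched, while ```.extend()``` modifies the list it is called on. A brief sketch with illustrative names:

```python
first = [1, 2, 3]
second = ["A", "B", "C"]

combined = first + second  # creates a new list
print(combined)
print(first)               # unchanged

first.extend(second)       # changes 'first' in place
print(first)
print(second)              # the extending list stays as it was
```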

### Looping over lists

#### ... the classical, multi-line way

In [None]:
loop_list = [12, 3, 1.5]

# looping directly over items:
for item in loop_list:
    print(item)
print()

# looping over indices:
for index in range(0, len(loop_list)):
    print(loop_list[index])
print()

- a loop over an empty list never runs the instructions

In [None]:
for i in []:
    print("This never happens")
print("Done.")

#### ... using list comprehensions

- *list comprehensions* allow creating a new list from an old one (and other iterables) within one line of code
- similar comprehensions exist for dictionaries and sets (both introduced later); the parenthesised form creates a generator rather than a tuple

*Syntax*: ```new_list = [this_expression if condition else other_expression for item in iterable]```

- is theoretically also possible with nested expressions but much more difficult to decipher

In [None]:
loop_list = [12, 3, 1.5, 6, 9, 100]

# double items in value that are below 10, else triple them:

doubled_loop_list = [2*item if item < 10 else 3*item for item in loop_list]  
print(doubled_loop_list)

# this is identical to (but much shorter than):

doubled_loop_list = []
for item in loop_list:
    if item < 10:
        doubled_loop_list.append(2*item)
    else:
        doubled_loop_list.append(3*item)

print(doubled_loop_list)
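A second common form of the list comprehension puts the condition at the end, where it acts as a filter: ```new_list = [expression for item in iterable if condition]```. Items that do not meet the condition are left out entirely. A sketch with illustrative names:

```python
values = [12, 3, 1.5, 6, 9, 100]

# keep only the items below 10:
small_values = [item for item in values if item < 10]
print(small_values)

# filter and transform in one step:
doubled_small_values = [2 * item for item in values if item < 10]
print(doubled_small_values)
```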

### Other useful list methods

- summing up numerical values in a list using the ```sum(<listname>)``` function

In [None]:
summed_list = [2, 3.414, 5]
print(sum(summed_list))

# only works when the list contains numerical values exclusively (the last line raises a TypeError):
summed_list.append("Hi")
print(summed_list)
print(sum(summed_list))

- sorting values in a list using the ```<listname>.sort()``` method (sorts in place, returns ```None```) or the ```sorted(<listname>)``` function (returns a new list)

In [None]:
# Numerical sorting:

unsorted_list = [10.8, 1, 3.414]

print(unsorted_list.sort())  # in-place operation: returns None but updates list as can be seen in next print
print(unsorted_list)
print()

# ---

unsorted_list = [10.8, 1, 3.414]

print(sorted(unsorted_list))  # returns list but does not update variable as can be seen in next print
print(unsorted_list)

In [None]:
# String sorting:
unsorted_list = ["C", "A", "T", "S"]

print(sorted(unsorted_list, reverse=True))  # also reverse sorting possible

- testing for values in a list using the ```in``` operator

In [None]:
5 in [1, 2, 3, 4, 5]

## Tuples

- are similar to lists: are also ordered and elements can be accessed by index
- difference being that they are *immutable* &rarr; can't be modified once they have been created

<br/>

- allow collecting multiple values in one single variable

<br/>

- an empty tuple is initiated with parentheses ```my_tuple = ()``` or with the ```my_tuple = tuple()``` function

In [None]:
# modifying a tuple raises an error:

my_tuple = (1, 2, 3)

my_tuple.append(4)

In [None]:
my_tuple[0] = "A"

In [None]:
del my_tuple[0]

In [None]:
# tuples with single values are initiated with a comma:

tuple_var = ("A")
real_tuple = ("A",)

print(type(tuple_var))
print(type(real_tuple))
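Collecting several values in one tuple goes hand in hand with *unpacking*: assigning the elements of a tuple to several variables at once. The coordinate values below are made up for illustration.

```python
coordinates = (54.32, 10.13)  # illustrative latitude/longitude pair

latitude, longitude = coordinates  # unpacking: one variable per tuple element
print(latitude)
print(longitude)

# the same mechanism allows swapping two variables without a helper variable:
first, second = 1, 2
first, second = second, first
print(first, second)
```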

## Sets

- they are unordered
- once a set is created, it is not possible to change items, but they can be removed and new ones can be added
- their perk: they contain a collection of unique elements &rarr; duplicates are automatically deleted

<br/>

- an empty set is initiated with the ```my_set = set()``` function (but not with braces)

In [None]:
my_set = {"A", "A", "B", "C"}

print(my_set)  # the second, non-unique element is removed

# items can be added and removed:

my_set.remove("B")
print(my_set)

my_set.add("D")
print(my_set)

In [None]:
# sets are unordered:

print(my_set[0])

- allow for set-operations like determining intersections, unions, [etc.](https://www.programiz.com/python-programming/methods/set/intersection) &rarr; can be computationally slow with large sets

In [None]:
set_a = {1, 2, 3, 4}
set_b = {4, 5, 6, 7}

print(set_a.isdisjoint(set_b))

In [None]:
print(set_a.union(set_b))
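The common set operations are also available as operators. The sketch below uses two illustrative sets with the same values as above; ```&``` corresponds to ```.intersection()```, ```|``` to ```.union()```, ```-``` to ```.difference()```, and ```^``` to ```.symmetric_difference()```.

```python
left = {1, 2, 3, 4}
right = {4, 5, 6, 7}

print(left & right)  # intersection: elements in both sets
print(left | right)  # union: elements in at least one set
print(left - right)  # difference: elements only in the left set
print(left ^ right)  # symmetric difference: elements in exactly one set
```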

## Dictionaries

- provide a special kind of collection that can use (almost) any type of value as an index (not just integers like in lists) &rarr; maps **key-value pairs**
- are mutable and (since Python 3.7) keep the order in which entries were inserted

### Basic operations

- add new entries to the dictionary via square brackets, mapping the key to the value: ```<dict>[<key>] = <value>```, look up value via square brackets or with the ```<dict>.get(<key>)``` method
    - assigning to a key that already exists overwrites its previous value

<br/>

- an empty dictionary is initiated with braces ```my_dict = {}``` or with the ```my_dict = dict()``` function

In [None]:
adresses = {
    "David Smith": "Marketstreet 3",
    "Julia Horn": "Bridgeway 5",
    "Adam Mouse": "Main Lane 5"
}

# adding values:

adresses["Don Joe"] = "Princess Street 1"
print(adresses)

### Retrieving values

In [None]:
# getting existing values:

print(adresses["Don Joe"])
print(adresses.get("Julia Horn"))

- the bracket notation and the ```get``` method behave differently for keys that do not exist in the dict: brackets raise a ```KeyError```, ```get``` returns a default value

In [None]:
print(adresses["Max Hinte"])

In [None]:
print(adresses.get("Max Hinte"))  # returns None by default which can be changed
print(adresses.get("Max Hinte", "No idea who that is."))
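The default value of ```get``` is useful for counting: a key that has not been seen yet simply starts at zero. A minimal sketch that counts words in a made-up sentence; the ```in``` operator checks whether a key exists.

```python
words = "the quick fox jumps over the lazy dog the end".split(" ")

word_counts = {}
for word in words:
    # unknown words start at 0, known words continue from their current count
    word_counts[word] = word_counts.get(word, 0) + 1

print(word_counts)
print(word_counts["the"])
print("cat" in word_counts)  # test for a key without raising an error
```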

### Looping over dictionaries

- looping directly over a dictionary iterates over its keys; the methods ```.items()```, ```.keys()```, and ```.values()``` make explicit which part of the dictionary is being looped over
- entries are retrieved in insertion order (guaranteed since Python 3.7)

In [None]:
for key, value in adresses.items():  # returns key-value tuple which can be unpacked in loop
    print(f"{key} lives in {value}")
print()

for key in adresses.keys():
    print(key)
print()

for value in adresses.values():
    print(value)

## Functions

- “takes” an argument and “returns” a result
    - functions have a *name*, *arguments* (also called *parameters*), and a *code body*
    - not all functions need arguments – can be called with empty arguments, too
    - not all functions return a value as their result – “void” functions have an effect, such as printing something
    
<br/>

- since Python executes all statements in one script from top to bottom any function needs to be either defined or imported before it can be called
- when the end of the function code is reached, the flow of execution returns to the line in the original program where it was called

<br/>

- Knowing how to access existing functions is a prerequisite to using external Python modules
    - Much given functionality comes in... functions
    - You need to know how to use a function by providing the right input and expecting the correct output
    - Opening a function’s code and reading it, without changing it, can help you understand to what extent this module does what you need

<br/>

- Knowing how to organise your own code in functions lets you
    - re-use it in other areas of the program without having to type the commands out again
    - avoid typing the same code multiple times to reduce errors
    - think systematically about the required and optional input for a piece of code and the resulting output

Writing or reading a program without exactly following the flow of execution through every function is termed a **leap of faith**. This means that we believe that there will be a function that can create the result that is needed because:
- it is already written and we do not want to check it now
- we will write it later and want to continue building the code that will use it
- we assume that Python provides some similar function and we will Google it later

Python also uses the leap of faith when running code that defines a function that relies on another function, which has not been defined yet (statements in the function are not run on definition, but only when the function is called) &rarr; missing or corrupt functions can create runtime errors
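The points above can be sketched with three small, made-up functions: one with a required and an optional argument that returns a value, one helper that is defined only after the function relying on it (which works because the body is not run until the call), and one "void" function that returns ```None```. All names and the rounded value of pi are illustrative.

```python
def describe_circle(radius, unit="cm"):  # 'unit' is optional and defaults to "cm"
    """Return a short description of a circle with the given radius."""
    area = circle_area(radius)  # defined further down: fine, this line only runs on call
    return f"A circle with radius {radius} {unit} has an area of {area:.2f} {unit}^2."


def circle_area(radius):
    """Return the (approximate) area of a circle."""
    return 3.14159 * radius ** 2


def greet():
    """'Void' function: has an effect (printing) but no explicit return value."""
    print("Moin!")


print(describe_circle(2))            # uses the default unit
print(describe_circle(2, unit="m"))  # overrides the optional argument

result = greet()
print(result)  # functions without a return statement return None
```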

### Imports

#### General remarks

- when calling a function from an imported module, that function runs code from outside the script, as defined inside the function, from top to bottom
- importing creates a module object which can be accessed

<br/>

- modules not included in the standard Python library need to be downloaded, e.g. using ```pip``` ("pip installs packages") via ```pip install <module name>```
- in non-local Jupyter Notebooks you can write ```!pip install <MODULE NAME>``` into a code cell

In [None]:
import math
import numpy as np  # modules can be imported using an alias

In [None]:
math.pi  # names defined in the module (functions, constants) can be accessed via dot notation

In [None]:
# generate normally distributed random numbers using numpy:

rng = np.random.normal(size=5)
print(rng)

In [None]:
# single functions can also be loaded into the main namespace, allowing them to be called without the dot notation
# also works with an alias 'as <xyz>'

from pandas import DataFrame, Series

print(DataFrame, Series)

#### Special package *numpy*

- Using fixed-type arrays (vectors and matrices) is more efficient for large data sets
    - fixed-type arrays cannot store arbitrary kinds of data as lists do (where every element is an object with its own type), but need less memory and computation
    - the numpy package includes the data structure ndarray for this purpose

- is convenient and efficient for mathematical operations
- vectorised computations are much faster than iterating over the array with a for-loop &rarr; this is also very handy when dealing with high-dimensional arrays as every new dimension would require an additional iteration

##### Initialisation - object creation

- arrays can be initialised from specific values
- arrays can be inspected without actually looking at the specific values

In [None]:
import numpy as np

a = np.array([1, 2, 3])  # 1D array
b = np.array([[1.5, 2, 3], [4, 5, 6]])  # 2D array
c = np.array([[[1.5, 2, 3], [4, 5, 6]], [[3, 2, 1], [4, 5, 6]]])  # 3D array

print(c.shape)  # no of elements in each dimension
print(c.ndim)  # no of dimensions
print(c.size)  # total no of elements in array
print(c.dtype)  # data type of values in array

- a number of functions allow for the facilitated creation of arrays

In [None]:
print(np.zeros((2, 4))) # an array of zeros of specified shape
print()

print(np.ones((2, 4), dtype=np.int16))  # an array of ones of specified shape and data type
print()

print(np.full((2, 2), 7))  # a constant array
print()

print(np.eye(3))  # a 3x3 identity matrix
print()

In [None]:
print(np.arange(10, 25, 5))  # an array of evenly spaced values (step value)
print()

print(np.linspace(0, 2, 9))  # an array of evenly spaced values (number of samples)
print()

- the random module allows working with (pseudo-)randomly generated numbers
- random generations can be fixed by setting a seed via the ```np.random.seed(<seed>)``` function

In [None]:
print(np.random.random((2, 2)))  # an array with random float values between 0 and 1
print()

print(np.random.randint(low=1, high=11, size=5))  # an array with random integer values
print()

print(np.random.normal(loc=1, scale=1, size=5))  # an array from a normal distribution
print()

print(np.random.uniform(low=1, high=11, size=5))  # an array from a uniform distribution
print()

##### Arithmetic operations

- element-wise arithmetic operations

In [None]:
print(a + a)  # element-wise addition
print()

print(np.add(a, a))  # alternative for element-wise addition
print()

print(a.dot(a.T))  # dot product with transposed array of a
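
The difference between element-wise operations and a matrix product is easier to see on a 2D array. The following sketch uses a small made-up matrix ```m```; the ```@``` operator is numpy's matrix multiplication.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

elementwise = m * m      # multiplies matching entries
matrix_product = m @ m   # row-by-column matrix product
scaled = m * 10          # the scalar is applied to every element

print(elementwise)
print(matrix_product)
print(scaled)
```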

##### Selecting & aggregating data

- you can select data from arrays via their index
- replace array values using the ```np.where(<condition>, <array>, <replacement value>)``` function &rarr; wherever the condition is false, the replacement value is used
- conditions arise from element-wise comparisons

In [None]:
array_to_select_from = np.array([1, 7, 4, 12, 18, 2])

print(array_to_select_from < 10)  # selection condition -> array of booleans
array_to_select_from = array_to_select_from[array_to_select_from < 10]  # select values from array based on boolean array
print(array_to_select_from)

In [None]:
a = np.random.rand(2, 2)
cond_a = a > 0.5

print(a)
print()

print(cond_a)
print()

print(np.where(cond_a, a, 0))

- slicing in numpy uses the same syntax as with lists
- unlike list slices, array slices return views rather than copies of the data &rarr; changing a slice changes the original array

In [None]:
original_array = np.array([1, 2, 3, 4, 5])

sliced_array = original_array[3:]
print(sliced_array)

sliced_array[1] = 6
print(sliced_array)
print(original_array)
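
If the original array must stay untouched, the slice can be copied explicitly. A minimal sketch, assuming the same example array as above; ```np.shares_memory()``` is used here only to make the difference visible.

```python
import numpy as np

original_array = np.array([1, 2, 3, 4, 5])

view_slice = original_array[3:]            # a view: shares memory with the original
copied_slice = original_array[3:].copy()   # an independent copy

copied_slice[0] = 99   # does not affect the original
view_slice[1] = 6      # also changes the last element of original_array

print(original_array)
print(copied_slice)
print(np.shares_memory(original_array, view_slice))    # True
print(np.shares_memory(original_array, copied_slice))  # False
```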

- aggregations over multidimensional arrays, like ```np.sum(<array>)``` or ```np.count_nonzero(<array>)```, require specifying an axis when only a part of the array is supposed to be aggregated

In [None]:
a = np.random.rand(2, 2)

print(a)
print()

print(np.count_nonzero(a, axis=1))
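
With random values the effect of the axis argument is hard to verify by eye. The sketch below repeats the idea on a small fixed array (the values are made up) so that the results can be checked by hand.

```python
import numpy as np

grid = np.array([[1, 0, 3],
                 [0, 0, 6]])

total = np.sum(grid)                 # no axis: aggregate over all elements
column_sums = np.sum(grid, axis=0)   # collapse the rows -> one value per column
row_sums = np.sum(grid, axis=1)      # collapse the columns -> one value per row
nonzero_per_row = np.count_nonzero(grid, axis=1)

print(total)
print(column_sums)
print(row_sums)
print(nonzero_per_row)
```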

#### Special package *pandas*

- pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive

- the two primary data structures of pandas are *Series* (1-dimensional) and *DataFrames* (2-dimensional)

- DataFrames can be regarded as a specialised dictionary which is a collection of multiple Series (comparable to an Excel table)
    - every column has its own title
    - every row is indexed by an index

- DataFrames rely on numpy arrays for efficient data storage
    - every column in a DataFrame is a Series that can be referenced via the column name
    - every set of values in a Series is a numpy array

##### Initialisation - object creation

- external data can be read from different formats, e.g. using the ```pd.read_csv()``` or the ```pd.read_excel()``` functions
- data can be written to external files in different formats, e.g. using the ```DataFrame.to_csv()``` or the ```DataFrame.to_excel()``` methods

In [None]:
import numpy as np
import pandas as pd

df = pd.read_csv("customers.txt", header=None, delimiter=";", names=["first_name", "last_name", "birthday", "value_of_bought_items"])

print(df)
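
Writing works the other way round. The following sketch writes a small DataFrame to CSV and reads it back; the sample data is hypothetical (it is not the content of "customers.txt"), and an in-memory ```io.StringIO``` buffer stands in for a file on disk.

```python
import io
import pandas as pd

# hypothetical sample data for illustration
customers = pd.DataFrame(
    {
        "first_name": ["Ada", "Alan"],
        "last_name": ["Lovelace", "Turing"],
        "value_of_bought_items": [120.5, 80.0],
    }
)

buffer = io.StringIO()                          # stands in for a file on disk
customers.to_csv(buffer, sep=";", index=False)  # a method of the DataFrame

buffer.seek(0)                                  # rewind before reading
restored = pd.read_csv(buffer, sep=";")

print(restored)
```

Passing a file name such as ```"customers_out.csv"``` instead of the buffer would create the file in the working directory.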

- from existing data inside the python script one can create a Series by passing a list of values, letting pandas create a default *integer index*

In [None]:
s = pd.Series([1, 3, 5, np.nan, 6, 8])
print(s)

- creating a DataFrame is possible by passing a NumPy array -- here with a *datetime index* using ```date_range()``` and labeled columns
- a DataFrame can have more than one index

<br>

- creating a DataFrame is also possible by passing a dictionary of objects that can be converted into a series-like structure

In [None]:
dates = pd.date_range("20230101", periods=6)
print(dates)
print()

df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
print(df)
print()

df2 = pd.DataFrame(
    {
        "A": 1.0,
        "B": pd.Timestamp("20130102"),
        "C": pd.Series(1, index=list(range(4)), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),
        "E": pd.Categorical(["test", "fail", "test", "train"]),
        "F": "foo",
    }
)
print(df2)

- the columns of the resulting DataFrame have different data types

In [None]:
print(df2.dtypes)

- use ```DataFrame.head()``` and ```DataFrame.tail()``` to view the top and bottom rows of the frame respectively

In [None]:
print(df.head(3))  # standard value is first five rows

- display the index or columns of a DataFrame

In [None]:
print(df.index)
print()
print(df.columns)

##### Sorting, selecting & updating data

- you can sort a DataFrame by index or by column values

In [None]:
df = df.sort_index(axis=1, ascending=False)
print(df)
print()

df = df.sort_values(by="B")
print(df)

- data from a DataFrame can be selected by column name, row position, index label, or boolean condition

In [None]:
print(df["A"])  # select single column -> returns Series
print()

print(df[0:3])  # select rows by position (slice of row numbers)

In [None]:
# get DataFrame subset ...
print(df.loc["20230102":"20230104", ["A", "B"]])  # ... by index values (inclusive) and column names
print()

print(df.iloc[3:5, 0:2])  # ... by numeric value positions

In [None]:
# get single value from DataFrame ...
print(df.at["20230102", "A"])  # ... by index value and column name
print()

print(df.iat[1, 1])  # ... by numeric value position

In [None]:
# boolean indexing
condition_1 = df["A"] > 0

print(condition_1)  # boolean Series
print()

print(df[condition_1])  # filtered DataFrame
print()

print(df[condition_1 & (df["B"] < -1)])  # multiple conditions connected with logical operators -> enclose in parentheses
print()

print(df2[df2["E"].isin(["test", "fail"])])  # special functions are available for boolean indexing

- values of a DataFrame can also be updated

In [None]:
print(df)
print()

df.iat[0, 1] = 0
df.loc[:, "D"] = np.array([5] * len(df))

print(df)

##### Statistics

- pandas offers a variety of statistics functions

In [None]:
print(df.describe())  # overview function
print()

print(df["A"].mean())  # example of a statistics function

##### Applying functions to values individually

- ```DataFrame.apply()``` applies a user-defined function along an axis of the DataFrame (by default once per column)

In [None]:
print(df.apply(lambda x: x.max() - x.min()))
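
The sketch below contrasts the three granularities: per column, per row, and per value. The DataFrame ```scores``` is made up for illustration; ```Series.map()``` is used for the value-by-value case.

```python
import pandas as pd

scores = pd.DataFrame({"A": [1, 4, 2], "B": [10, 20, 40]})

column_ranges = scores.apply(lambda col: col.max() - col.min())       # one result per column
row_ranges = scores.apply(lambda row: row.max() - row.min(), axis=1)  # one result per row
doubled_a = scores["A"].map(lambda value: value * 2)                  # one result per value

print(column_ranges)
print(row_ranges)
print(doubled_a)
```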

##### String manipulation & string selection

- *string methods* allow for manipulations and selections of string values

In [None]:
s = pd.Series(["A", "B", "C", "Aaba", "Baca", np.nan, "CABA", "dog", "cat"])

print(s.str.lower())
print()

print(s[s.str.contains("a", na=False)])

##### Merging data

- multiple DataFrames can be combined with SQL-like merge operations

In [None]:
df3 = pd.DataFrame(np.random.randn(10, 4))

# concatenating pandas objects together along an axis
print(pd.concat([df3[:1], df3[3:7], df3[9:]]))

In [None]:
left = pd.DataFrame(
    {
        "key": "foo",
        "lval": [1, 2]
    }
)
right = pd.DataFrame(
    {
        "key": "foo",
        "rval": [4, 5]
    }
)

print(left)
print()

print(right)
print()

print(pd.merge(left, right, on="key"))  # inner join
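
Because both frames above contain the key "foo" twice, the inner join returns every combination of matching rows. The join type is chosen with the ```how``` parameter of ```pd.merge()```; the following sketch compares an inner and a left join on hypothetical customer and order data.

```python
import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ann", "Ben", "Cem"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3, 4], "amount": [20, 35, 15, 50]})

# only customer ids that occur in both frames
inner = pd.merge(customers, orders, on="customer_id", how="inner")

# all customers, with NaN where no order exists
left = pd.merge(customers, orders, on="customer_id", how="left")

print(inner)
print()
print(left)
```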

##### Grouping data

- By *Grouping* we are referring to a process involving one or more of the following steps:
    - *Splitting* the data into groups based on some criteria
    - *Applying* a function to each group independently
    - *Combining* the results into a data structure

In [None]:
df4 = pd.DataFrame(
    {
        "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
        "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
        "C": np.random.randn(8),
        "D": np.random.randn(8),
    }
)

print(df4)
print()

print(df4.groupby(["A", "B"]).sum())
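
The three steps can also be written out separately. A minimal sketch on made-up sales data, using ```agg()``` to apply more than one function per group:

```python
import pandas as pd

sales = pd.DataFrame(
    {
        "shop": ["north", "south", "north", "south", "north"],
        "revenue": [10, 20, 30, 40, 50],
    }
)

grouped = sales.groupby("shop")["revenue"]  # splitting by shop
summary = grouped.agg(["sum", "mean"])      # applying two functions and combining the results

print(summary)
```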

##### Generating plots

- pandas can use matplotlib to plot data from DataFrames

In [None]:
import matplotlib.pyplot as plt

df5 = pd.DataFrame(
    np.random.randn(1000, 4),
    index=pd.date_range("1/1/2000", periods=1000),
    columns=["A", "B", "C", "D"]
)

df5 = df5.cumsum()

df5.plot()
plt.legend(loc='best')

plt.show()
plt.close()

#### Special package *scipy*

- scipy includes algorithms for optimization, integration, interpolation, eigenvalue problems, algebraic equations, differential equations and many other classes of problems
- scipy has too many functions to show here, so take this example as a teaser of what the library may be used for

In [None]:
from scipy.signal import find_peaks
from scipy.datasets import electrocardiogram
import matplotlib.pyplot as plt

# select the atypical heart beats from an electrocardiogram based on minimum peak widths and prominences

x = electrocardiogram()[17_000:18_000]
peaks, properties = find_peaks(x, prominence=1, width=20)

plt.plot(x)
plt.plot(peaks, x[peaks], "x")
plt.vlines(x=peaks, ymin=x[peaks] - properties["prominences"],
           ymax = x[peaks], color = "C1")
plt.hlines(y=properties["width_heights"], xmin=properties["left_ips"],
           xmax=properties["right_ips"], color = "C1")

plt.show()
plt.close()

#### Special package *matplotlib*

##### Types of plots

- matplotlib is a library for making 2D plots
- it is designed with the philosophy that you should be able to create simple plots with just a few commands

In [None]:
# initialise
import numpy as np
import matplotlib.pyplot as plt

# prepare
x = np.linspace(0, 4 * np.pi, 1000)
y = np.sin(x)

# render
plt.plot(x, y)

# observe
plt.show()
plt.close()

- matplotlib offers several kinds of plots

In [None]:
x = np.random.uniform(0, 1, 100)
y = np.random.uniform(0, 1, 100)

plt.scatter(x, y)

plt.show()
plt.close()

In [None]:
x = np.arange(10)
y = np.random.uniform(1, 10, 10)

plt.bar(x, y)

plt.show()
plt.close()

In [None]:
z = np.random.uniform(0, 1, (8, 8))

plt.imshow(z)

plt.show()
plt.close()

In [None]:
z = np.random.uniform(0, 1, (8, 8))

plt.contourf(z)

plt.show()
plt.close()

In [None]:
z = np.random.uniform(0, 1, 4)

plt.pie(z)

plt.show()
plt.close()

In [None]:
z = np.random.uniform(0, 1, 100)

plt.hist(z)

plt.show()
plt.close()

In [None]:
x = np.arange(5)
y = np.random.uniform(0, 1, 5)

plt.errorbar(x, y, y/4)

plt.show()
plt.close()

In [None]:
z = np.random.normal(0, 1, (100, 3))

plt.boxplot(z)

plt.show()
plt.close()

##### Make plots prettier

- you can modify pretty much anything in a plot, including limits, colors, markers, line width and styles, ticks and ticks labels, titles, etc.

In [None]:
x = np.linspace(0, 10, 50)
y = np.sin(x)

plt.plot(x, y, color="red", linestyle="--", linewidth=2, marker="o")

plt.show()
plt.close()

##### Plot multiple data

- You can plot several data series on the same figure, but you can also split a figure into several subplots (named *axes*):

In [None]:
x = np.linspace(0, 10, 100)
y1, y2 = np.sin(x), np.cos(x)

plt.plot(x, y1)
plt.plot(x, y2)

plt.show()
plt.close()

In [None]:
x = np.linspace(0, 10, 100)
y1, y2 = np.sin(x), np.cos(x)

fig, (ax1, ax2) = plt.subplots(1, 2)

ax1.plot(y1, x, color="C1")
ax2.plot(y2, x, color="C0")

plt.show()
plt.close()

- you can annotate a plot with labels

In [None]:
x = np.linspace(0, 10, 100)
y1, y2 = np.sin(x), np.cos(x)

fig, ax = plt.subplots()

ax.plot(x, y1)
ax.plot(x, y2)

ax.set_title("Sine and Cosine waves")

ax.set_ylabel(None)
ax.set_xlabel("Time")

plt.show()
plt.close()

- matplotlib allows you to export figures as bitmap or vector images

In [None]:
x = np.linspace(0, 10, 100)
y1, y2 = np.sin(x), np.cos(x)

fig, ax = plt.subplots()

ax.plot(x, y1)
ax.plot(x, y2)

ax.set_title("Sine and Cosine waves")

ax.set_ylabel(None)
ax.set_xlabel("Time")

fig.savefig("fig.png", dpi=300)
fig.savefig("fig.pdf", bbox_inches="tight")

### Defining own Functions

#### General remarks

Being able to create one's own building blocks is one of the key tenets of programming: write a function once, use it and use it again whenever and wherever you need it.
- functions are defined using the ```def``` statement

<pre><code>def function_name(argument_name_1, argument_name_2 = default_value_2, ...):
    """docstring"""
    
    &lt;code body&gt;</code></pre>
    
- *Docstrings* are optional and describe what the function does. There are different styles for docstrings that are currently used, for example [Google](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings) has defined their own style

In [None]:
# a function is an object, so it has a type like any other object
print(type(np.random.normal))
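
As an illustration of the template and the docstring remark above, here is a small sketch of a function with a default argument and a docstring that follows the Google style. The function ```rectangle_area``` is a made-up example.

```python
def rectangle_area(width, height=1.0):
    """Compute the area of a rectangle.

    Args:
        width: Length of the horizontal side.
        height: Length of the vertical side. Defaults to 1.0.

    Returns:
        The area as the product of width and height.
    """
    return width * height


print(rectangle_area(3, 2))
print(rectangle_area.__doc__)  # the docstring is stored on the function object
```

Calling ```help(rectangle_area)``` shows the same text in a formatted way.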

- functions define variables locally meaning that they can't be accessed outside the function
- inside the function variables from outside the function can, however, be accessed
- functions can be nested which leads to nested name spaces

In [None]:
def local_variable_function():
    local_variable_function_variable = 5
    
local_variable_function_variable  # raises a NameError: the variable only exists inside the function

In [None]:
global_variable = 3

def outer_function():
    outer_function_variable = 12
    
    def inner_function():
        inner_function_variable = 6
        
        print(global_variable * outer_function_variable * inner_function_variable)
        
    inner_function()
    print(global_variable * outer_function_variable * inner_function_variable)  # this throws an error
    
outer_function()

- one function can also call another function, even if that function is defined further down in the script, as long as it has been defined by the time the call is actually executed

In [None]:
def function_1():
    print("Foo")
    
def function_2():
    function_1()
    print("Faa")
    
function_2()

# ---

def function_3():
    function_4()  # only defined later -> still works since the name is looked up when function_3 is called, and by then function_4 exists
    print("Fii")
    
def function_4():
    print("Fee")
    
function_3()

- functions can recursively call themselves
- recursion is not particularly efficient in Python, and the interpreter limits the recursion depth (1000 nested calls by default, see ```sys.getrecursionlimit()```)

In [None]:
def recursive_countdown(n):
    if n <= 0:
        print("Lift off!")
    else:
        print(n)
        recursive_countdown(n-1)
        
recursive_countdown(5)
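
The recursion limit can be inspected and observed directly. In this sketch, ```endless_recursion``` is a deliberately broken, made-up function without a stopping condition.

```python
import sys

print(sys.getrecursionlimit())  # the currently configured limit


def endless_recursion(n):
    return endless_recursion(n + 1)  # no stopping condition


hit_limit = False
try:
    endless_recursion(0)
except RecursionError as error:
    hit_limit = True
    print(f"Recursion stopped by Python: {error}")
```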

#### Arguments

- a function can take any number and type of variables as arguments (also called parameters) &rarr; what the function needs has to be described when defining it
- the type of an argument does not have to be (but can be) declared when defining the function

In [None]:
def function_name(argument_1, argument_2):
    pass  # don't do anything ("placeholder" keyword)
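
Declaring argument types is done with type annotations. A minimal sketch with a made-up function ```repeat_text```; note that Python itself does not enforce the annotations at runtime.

```python
def repeat_text(text: str, times: int = 2) -> str:
    # the annotations document the intended types but are not checked when the function runs
    return text * times


print(repeat_text("ab", 3))         # ababab
print(repeat_text.__annotations__)  # the annotations are stored on the function object
```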

In [None]:
# calling a function that requires an argument without providing one leads to a TypeError

def call_me_maybe(my_name):
    print(f"I'll maybe call you, {my_name}.")
    
call_me_maybe()

- by defining a default value for the argument when setting up the function, an argument can be made optional (or "default")
- when the function is called without values for optional arguments then the defaults are assumed, otherwise the values are overwritten
- optional arguments all have to be listed behind the mandatory arguments in the function definition

In [None]:
def compute_random_numbers(number_of_numbers = 5):
    print(np.random.rand(number_of_numbers))
    
compute_random_numbers()  # 5 numbers are created by default
compute_random_numbers(3)  # but 3 is also possible

- by providing *argument keywords* when calling a function, it is possible to call a function without having to stick to the order of arguments given in the definition
- adding a bare ```*``` to the parameter list of the function definition forces the use of keywords for all following parameters

In [None]:
def interest(capital, rate=0.1):
    print(f"Interest is {capital*rate}")
    
interest(100)  # omitting default argument value
interest(200, 0.4)  # specifying both without argument names -> values have to be in order as defined in function
interest(rate=0.5, capital=500)  # providing argument names allows switching the sequence of arguments

def interest_with_keywords(*, capital, rate=0.1):
    print(f"Interest with keywords is {capital*rate}")

interest_with_keywords(capital=500, rate=0.5)
interest_with_keywords(200, 0.4)  # raises a TypeError: positional arguments are not accepted

- default values are evaluated only once, when the function is defined, and the same object is reused in every call &rarr; a mutable default value keeps its changes between calls, which leads to unexpected function behaviour &rarr; *do not* use mutable default arguments in Python

In [None]:
def mutable_arguments_are_forbidden(mutable_argument = []):
    mutable_argument.append("some stuff")
    print(f"A list based on {', '.join(mutable_argument)}.")
    
mutable_arguments_are_forbidden()
mutable_arguments_are_forbidden()  # the last state of the argument 'mutable_argument' is preserved 
mutable_arguments_are_forbidden()
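
A common way around this problem is to use ```None``` as the default and to create the mutable object inside the function. The function ```collect_items``` below is a made-up example of this idiom.

```python
def collect_items(new_item, collected=None):
    if collected is None:  # create a fresh list on every call
        collected = []
    collected.append(new_item)
    return collected


first = collect_items("a")
second = collect_items("b")

print(first)   # ['a']
print(second)  # ['b'] -> no leftovers from the first call
```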

- as Python does not define the type of an argument when defining the function (as in "this function wants two numeric values") the type of the argument has to be checked inside the function
- the type of a variable can be checked using the ```isinstance(<variable>, <type>)``` function or by catching an exception

In [None]:
def addition(summand_1, summand_2):
    print(summand_1 + summand_2)

addition(summand_1="100", summand_2=50)  # adding a string and a number raises a TypeError

In [None]:
def multiplication(factor_1, factor_2):
    if isinstance(factor_1, (int, float)) and isinstance(factor_2, (int, float)): 
        print(factor_1 * factor_2)
    else:
        print("You can only multiply two numbers.")
        
multiplication(factor_1="100", factor_2=50)  # if it was not for the test, multiplying the string with the number would concatenate the string "100" 50 times
multiplication(factor_1=100, factor_2=50)
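
The second option, catching an exception, could look like the following sketch. The function ```safe_division``` and its messages are made up for illustration.

```python
def safe_division(dividend, divisor):
    try:
        return dividend / divisor
    except TypeError:
        print("You can only divide two numbers.")
    except ZeroDivisionError:
        print("You cannot divide by zero.")
    return None  # reached only if one of the exceptions was caught


print(safe_division(100, 50))
print(safe_division("100", 50))
print(safe_division(1, 0))
```

Here the function simply tries the operation and reacts only if it fails, instead of checking the types beforehand.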

#### Functions with and without returns

- so far, the functions had an effect by printing some statements, but it was not possible to take the result out of the function
- to write a function that returns a value (which can then be assigned to a variable), include a ```return``` statement
- none of the statements after the ```return``` statement are executed!

In [None]:
def double_number(n):
    return 2*n
    print("I'm invisible!")
    
doubled_number = double_number(2)
print(doubled_number)

- generator functions look and act just like regular functions, but with one defining characteristic: they use the ```yield``` keyword instead of ```return```
- ```yield``` indicates where a value is sent back to the caller, but unlike ```return```, you don’t exit the function afterward &rarr; instead, the state of the function is remembered
- generators are a great way to save memory as they produce their values one at a time instead of storing all of them in memory at once

In [None]:
gen_1 = (num**2 for num in range(4))  # a generator expression: creates a generator without defining a function
print(type(gen_1))

for i in gen_1:
    print(i)

- instead of using a ```for``` loop, you can also call ```next()``` on the generator object directly to get the next output from the generator

In [None]:
def sequence():
    num = 0
    while num < 3:
        yield num
        num += 1

gen_2 = sequence()

print(next(gen_2))
print(next(gen_2))
print(next(gen_2))

- generators raise a ```StopIteration``` exception when they run out of values to create
- ```next(<generator>, <default>)``` accepts an optional default value that is returned in this case instead

In [None]:
print(next(gen_2))
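
The cell above ends with a ```StopIteration``` exception because ```gen_2``` is already exhausted. The following sketch shows both ways of handling this, using a fresh generator ```gen_3``` and the arbitrary default value "exhausted".

```python
def sequence():
    num = 0
    while num < 3:
        yield num
        num += 1


gen_3 = sequence()
values = [next(gen_3, "exhausted") for _ in range(4)]
print(values)  # [0, 1, 2, 'exhausted']

try:
    next(gen_3)  # without a default, the exhausted generator raises an exception
except StopIteration:
    print("No values left.")
```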

### Storing a set of functions or parameters in a module

- any python script ```<module_name>.py``` can be considered as a module
- to store functions or parameters in a module that can be re-used
    - Define them in a separate script file, which should be stored where you also store the script you are currently working with
    - Import the module via the import statement referencing the script name
- access imported functions or parameters as previously discussed, e.g. ```<module_name>.do_something()```
- all functions in the imported module are defined in the course of the import statement – this may trigger errors if the function definitions are flawed!
- more on absolute and relative imports can be read in [PEP 328](https://peps.python.org/pep-0328/)

<br/>

- use this functionality to split the code of your project into different files to prevent single files from becoming overly cluttered (more on this [here](https://teclado.com/30-days-of-python/python-30-day-21-multiple-files/))

In [None]:
import my_functions  # import functions from file "my_functions.py" which is located in the same folder as the executed script

my_functions.hello()
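
A minimal sketch of what the imported file could contain. The content is an assumption for illustration: only the names ```my_functions``` and ```hello``` come from the cell above, while ```GREETING``` and the body of ```hello()``` are hypothetical.

```python
# hypothetical content of the file "my_functions.py"

GREETING = "Hello from my_functions!"  # a parameter stored in the module


def hello():
    print(GREETING)


if __name__ == "__main__":
    # runs only when the file is executed directly, not when it is imported
    hello()
```

After ```import my_functions```, the parameter would be available as ```my_functions.GREETING``` in the same way as the function.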

## Classes

### General remarks

Python supports:
- *imperative programming* – by implementing algorithms as step-by-step changes in a single state (flow of execution from top to bottom with small detours for functions)
- *procedural programming* – by letting the user define functions
- *functional programming* – particularly via a module termed ```functools``` (not covered here)
- *object-oriented programming* – letting the user define **classes** as templates for objects
    
<br/>

Object-oriented programming aggregates data and functions into **classes of real-world objects**. Classes organise objects that:
- have the same *attributes*
- can use the same *methods*

They combine the best of two worlds:
- storing data that relates to one thing in one place
- making a set of functionality as relevant to one thing available together
- thing = object

<br/>

Classes are defined using the ```class``` statement

<pre><code>class ClassName:
    """Optional docstring describing class"""
    
    def __init__(self, attribute_1, attribute_2 = default_value_2, ...):
        &lt;code block&gt;
        
    def method_1(self):
        &lt;code block&gt;</code></pre>
       
Classes are factories for objects:
- Creating a new object is called instantiation: any object is an instance of a class – one variant of parametrising the class template
- To create an object (an *instance* of a class), call the class as if it was a function, e.g.: ```new_instance = ClassName()```

<br/>

Naming conventions: 
- the names of classes are written in "camel case": each word is capitalised and all words are written as one (e.g. "CamelCase")
- the names of variables and objects are written in "snake case": with lower case letters separated by underscores (e.g. "snake_case")

The class of an object can be obtained using the ```type()``` function

In [None]:
# class creation:
class Point:
    """Represents a point in 2D-space."""
    
# instance creation:
new_point = Point()

# attribute definition and value assignment:
new_point.x_coordinate = 15.5
new_point.y_coordinate = 66.5

# retrieving attribute values: 
print(f"The point is at x={new_point.x_coordinate} and y={new_point.y_coordinate}")

- it is not necessary to pre-define attributes in the class definition to store them in an object
- attempting to access an attribute that was not defined before will result in an ```AttributeError```

In [None]:
new_point_2 = Point()

# the attributes 'x_coordinate' and 'y_coordinate' were only defined for the instance 'new_point' of the class 'Point'
# but not for the instance 'new_point_2'
print(f"The point is at x={new_point_2.x_coordinate} and y={new_point_2.y_coordinate}")

As any new class defines a type like any other (integer, float, ...), an instance of one class can also be assigned as an attribute value of an instance of another class.

- by simply assigning an object to a new variable, it is NOT copied but only aliased: the two variables will still refer to the same instance &rarr; this applies to all objects and can be annoying when overlooked
       
- this can be avoided by using the ```copy()``` function of the ```copy``` module

In [None]:
class Article:
    """"""
    
class Customer:
    """"""
    
banana = Article()
banana.price = 5
banana.comments = ["Yellow", "Fruity"]

new_customer = Customer()
new_customer.favourite_article = banana


# testing the identity of the attributes of the two classes:
print(hex(id(new_customer.favourite_article.price)), hex(id(banana.price)))
print(new_customer.favourite_article.price is banana.price)

# testing the effect of an alteration of the attribute of one class to the other:
banana.price = 6
print(new_customer.favourite_article.price)

banana.comments.append("Soft")
print(new_customer.favourite_article.comments)

In [None]:
# aliasing has no visible effect for immutable types (integers, floats, strings): re-assigning int_1 binds the name to a new object
int_1 = 5
int_2 = int_1

print(int_2)
int_1 = 6
print(int_2)
print()

# aliasing does apply to lists and dictionaries!
list_1 = [1, 2, 3]
list_2 = list_1

print(list_2 is list_1)
list_1.append(4)
print(list_1)
print(list_2)
print()

# copying prevents accidental changes:
from copy import copy

list_2 = copy(list_1)

print(list_2 is list_1)
list_1.append(5)
print(list_1)
print(list_2)
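
Note that ```copy()``` creates a shallow copy: nested objects are still shared. For nested structures, the ```copy``` module also offers ```deepcopy()```. A minimal sketch with a made-up nested list:

```python
from copy import copy, deepcopy

nested = [[1, 2], [3, 4]]

shallow = copy(nested)    # new outer list, but the inner lists are shared
deep = deepcopy(nested)   # the inner lists are copied as well

nested[0].append(99)

print(shallow[0])  # [1, 2, 99] -> the change is visible through the shallow copy
print(deep[0])     # [1, 2]     -> the deep copy is unaffected
```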

### Modifier functions

When taking (references to) objects as arguments, functions can also work as modifiers, adjusting values within the object.
- This can be efficient, but can also make programs harder to track and debug.
- Anything that can be done in this way can also be achieved by handling return statements.
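
The two approaches can be compared side by side. In this sketch, the class ```Account``` and both functions are hypothetical examples.

```python
class Account:
    """Hypothetical class holding a single balance."""


def deposit_in_place(account, amount):
    account.balance += amount  # modifier: changes the object that was passed in


def balance_after_deposit(account, amount):
    return account.balance + amount  # leaves the object untouched and returns the result


savings = Account()
savings.balance = 100

new_balance = balance_after_deposit(savings, 50)
print(savings.balance, new_balance)  # 100 150

deposit_in_place(savings, 50)
print(savings.balance)  # 150
```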

### Methods: class functions

- by defining functions as part of the object, so-called *methods*, the function can access all information that is already there
- any object instantiated from a class with that method (and only those) has access to that method

<br/>

The ```__init__``` method:
- by implementing an init method and relying on it in the further code, one can avoid building inconsistent instances of a class and making mistakes in attribute assignment
- this predefined type of method gets automatically invoked whenever an instance is created

When a class has an ```__init__``` method defined, objects can be instantiated from it
- without arguments, so that attributes get assigned the standard values
- with arguments, so that attributes get assigned the values handed to the method

<br/>

The ```__str__``` method:
- invoked when the instance is printed
- by defining this method for a specific class, it can be defined what is printed when an object from that class is printed

In [None]:
import math


class Circle:
    def __init__(self, diameter):
        self.diameter = diameter  # the provided value 'diameter' is bound to the instance attribute 'self.diameter'
        self.area = None  # not strictly necessary to initialise the attribute with None before the calculation method is called, but recommended
        
    def __str__(self):
        return f"This is a circle with the diameter {self.diameter}."
    
    def calculate_area(self):  # calculation could also be done directly in __init__
        self.area = (math.pi/4) * self.diameter ** 2
        

small_circle = Circle(3)

print(small_circle.area)
small_circle.calculate_area()  # Note: methods are called with opening and closing parentheses...
print(small_circle.area)       #       ... and attributes without those!
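
The ```Circle``` class above requires a diameter. Instantiation without arguments only works if ```__init__``` defines default values, as in the following sketch with a made-up ```Rectangle``` class. It also shows ```__str__``` being invoked by ```print()```.

```python
class Rectangle:
    def __init__(self, width=1, height=1):
        self.width = width
        self.height = height

    def __str__(self):
        return f"Rectangle of {self.width} x {self.height}"


unit_square = Rectangle()   # no arguments -> the default values are used
poster = Rectangle(50, 70)  # arguments overwrite the defaults

print(unit_square)  # Rectangle of 1 x 1
print(poster)       # Rectangle of 50 x 70
```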

# Coding Concepts

## Development approach: Prototype and Patch

1. First, write code that runs without errors but does not yet need to deliver the result you want in the end
2. In later steps of development, you add code to “patch” the outcome from the first prototype, so that it achieves what you want
3. In a final stage, you can clean up the code so that it achieves the desired results efficiently.

- consider listing individual steps as comments (everything after a ```#``` in a code block), so you don’t “forget” anything &rarr; consider using the ```TODO``` keyword
- write the first lines and see whether the program is working - it does not need to do everything you want it to do yet!
- add and change the existing code incrementally - use variables to hold intermediary values to be able to keep track of them
- Test after each incremental step

*Advantages*:
- testing after incremental steps shows where an error is likely to come from
- writing code incrementally reduces large and complex tasks into manageable steps

*Disadvantages*:
- incrementally grown code often contains unnecessary steps or inefficiencies &rarr; once the program works, consider reworking it to remove them and to improve computational efficiency

## Testing and Debugging

### System validation

- System validation aims to ensure system reliability and to recognise and correct flaws in design or code
- Professional projects allocate the same amount of time to testing as to implementing (50-50-rule)
- Difficult decision: When to stop testing?
    - Only the most creative (destructive) personnel should be testing
    - Programmers should never test their own code
    - Test cases should be developed by well-trained users during or after the definition of requirements
    - Testing can show the presence of bugs, but never show their absence (E.W. Dijkstra)

### Formal Structure

1. *Test Case Identifier*: can be a number or a unique name
2. *Test Case Description*: so that whoever does the testing knows what the test is for
3. *Test Data*: needs to allow for efficient testing but also include special cases
4. *Expected Result*: comparing expected and actual outcomes
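
The four elements can be written down directly as data. The following sketch is one possible layout, not a standard format: each tuple holds identifier, description, test data and expected result for a made-up function ```double_number```.

```python
def double_number(n):
    return 2 * n


# each test case: identifier, description, test data, expected result
test_cases = [
    ("TC-01", "doubles a positive integer", 4, 8),
    ("TC-02", "doubles zero (special case)", 0, 0),
    ("TC-03", "doubles a negative float", -1.5, -3.0),
]

results = {}
for identifier, description, test_data, expected in test_cases:
    actual = double_number(test_data)
    results[identifier] = actual == expected  # compare expected and actual outcome
    status = "passed" if results[identifier] else "FAILED"
    print(f"{identifier} ({description}): {status}")
```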

### Test-Driven Design

1. Write the test of the desired functionality.
2. Run it and confirm that it fails as expected.
3. Then amend the code so that the test will succeed.

- Do nothing until you have written a test for it.
- Relies on functional, black-box, end-to-end, acceptance tests: What the user expects to see.
- Relies on automated testing for efficiency.
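
A minimal sketch of what automated tests could look like with the standard library module ```unittest```. The function ```is_leap_year``` is a made-up example; in test-driven design the test class would be written first and the function body afterwards.

```python
import unittest


def is_leap_year(year):
    # written only after the tests below existed and had failed
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class TestIsLeapYear(unittest.TestCase):
    def test_regular_leap_year(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_not_leap_year(self):
        self.assertFalse(is_leap_year(1900))

    def test_every_400_years_is_leap_year(self):
        self.assertTrue(is_leap_year(2000))


# run the tests programmatically so that the sketch also works inside a notebook
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestIsLeapYear)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```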

### Types of Bugs

- *Specification* – wrong model: e.g., incorrect formula
- *Omission* – model misses something: e.g., did not take learning into account
- *Data* – faulty parameters: e.g., mis-estimated willingness to pay
- *Syntactical* – wrong format of statement: e.g., a missing ";" at the end of a command in Java
- *Logical* – wrong conditioning: e.g., created an endless loop

### Types of Errors

*Syntax error*:
- problems in the structure of a program (comparable to spelling and grammatical errors in writing)
- shows up as soon as trying to run the program
- e.g. ```number = 8+2)``` (unmatched closing parenthesis)

*Runtime error*:
- error does not appear until after the program has started running
- code is syntactically correct but tries things that are not allowed or defined
- also called exception
- can be caught by ```try```/```except``` blocks

*Semantic error*:
- related to meaning
- program runs without generating error messages but does not compute the right values
- does not do “the right thing”
- can be tricky to find – needs meaningful test cases and plausibility checks
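
The runtime and the semantic error can be reproduced in a few lines. Both snippets in this sketch are made-up examples; a syntax error is left out because it would prevent the whole cell from running.

```python
# runtime error (exception): syntactically correct, but the operation is not defined
try:
    result = 10 / 0
except ZeroDivisionError as error:
    result = None
    print(f"Caught a runtime error: {error}")


# semantic error: runs without any message but computes the wrong value
def mean_of_two(a, b):
    return a + b / 2  # bug: operator precedence, intended was (a + b) / 2


print(mean_of_two(2, 4))  # 4.0 instead of the expected 3.0
```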

#### Errors when a function is not working

- Something is wrong with the arguments the function receives
- Something is wrong within the function
- Something is wrong with the return value or the way it is being used

&rarr; Insert print statements in relevant areas of code\
&rarr; Insert checks (if statements) to ensure valid values

#### Errors from copying and aliasing

- recognise aliased structures by printing their memory address: ```print(hex(id(<var>)))```
- When you want to adjust a list (by removing items or sorting it), it can make sense to store a copy to not lose the original information (using the ```copy()``` function)

#### Errors when reading and writing files

- can result from whitespace, as spaces, tabs and newlines are normally invisible
- the function ```repr(<var>)``` can help, as it represents whitespace characters with the backslash sequences that denote them in strings (e.g. ```\t``` or ```\n```)
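
A short sketch of the difference between ```print()``` with and without ```repr()```. The string ```line_from_file``` is a hypothetical line as it might be read from a file.

```python
line_from_file = "42\t \n"  # hypothetical line containing a tab, a space and a newline

print(line_from_file)        # the whitespace is invisible
print(repr(line_from_file))  # '42\t \n'

cleaned = line_from_file.strip()  # removes leading and trailing whitespace
print(repr(cleaned))              # '42'
```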

#### Avoiding errors when using objects

- use ```isinstance(<obj>, <class>)``` to check whether ```<obj>``` is an instance of ```<class>```, e.g. to check the data type stored in a variable
- ```hasattr(<obj>, <attribute name>)``` checks whether ```<obj>``` has an attribute with the given name
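
Both checks applied to the ```Point``` class from the section on classes, redefined here so that the sketch runs on its own:

```python
class Point:
    """Represents a point in 2D-space."""


new_point = Point()
new_point.x_coordinate = 15.5

print(isinstance(new_point, Point))                      # True
print(isinstance(new_point.x_coordinate, (int, float)))  # True
print(hasattr(new_point, "x_coordinate"))                # True
print(hasattr(new_point, "y_coordinate"))                # False

# guard an attribute access instead of risking an AttributeError
y_value = new_point.y_coordinate if hasattr(new_point, "y_coordinate") else None
print(y_value)  # None
```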

#### Helpful information to be found in error messages

Where did the error occur?
- Which script file?
- Which module?
- Which line?

What type of error was it?

&rarr; read them carefully but do not assume that they tell you everything or that everything they tell you is correct

### Tips on Debugging

1. **Find ways to duplicate** the bug
    - Find a parameter setting where the bug occurs every time
    - Fix random seeds to avoid variation


2. **Describe** the bug
    - The act of stating the problem often brings its source to surface


3. Always assume it is a bug **in your code**
    - It is not a marvellous new finding
    - It is not a problem with the language or the framework


4. **Divide and conquer**
    - Find out which parts of the program are not buggy
    - Build the simplest model and setting that still duplicates the error
    

5. **Be creative**
    - Try out alternative ways of doing what you want to do, not to solve, but to eliminate the bug


6. **Leverage tools**
    - Use the environment’s step-by-step features, variable monitors, stop points etc.


7. Take notes and **go step by step**
    - Keep track of variable values 
    - Close the doors and shut out every distraction
    - If it helps, do it in pairs


8. **Verify** that the bug is fixed
    - Try out whether you find new settings to bring the bug back

### Things to try when debugging

- *Reading*: read the code back to yourself
- *Running*: make experiments with different versions of the code
- *Ruminating*: What kind of error is it? What information does the error message provide? What did you do before the error started to appear?
- *Rubberducking*: Explain the problem to someone else – or to a rubber duck.
- *Retreating*: Undo recent changes until the code is working again and then start rebuilding

<br>

- use small test cases to avoid long run times
- write checks that summarise the data, e.g. instead of printing out the list, try checking for the number of items to see whether it is complete
- Write checks in the code
    - “sanity check”: check for values that should never occur
    - “consistency check”: compute the same result through two ways and check whether the results actually match
- Format your print statements to let you quickly see if something is wrong
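
The checks listed above can be sketched with ```assert``` statements, which stop the program with an error message when a condition does not hold. The list ```prices``` is made-up data.

```python
prices = [19.99, 5.49, 102.0, 0.99]  # hypothetical data

# summarising check: look at the size instead of printing the whole list
print(f"{len(prices)} prices loaded")

# sanity check: a value that should never occur
assert all(price >= 0 for price in prices), "negative price found"

# consistency check: compute the same result in two ways
total_builtin = sum(prices)
total_loop = 0.0
for price in prices:
    total_loop += price

# compare with a small tolerance, as floating-point sums may differ in the last digits
assert abs(total_builtin - total_loop) < 1e-9, "totals do not match"
print(f"total: {total_builtin:10.2f}")  # fixed-width formatting makes outliers easy to spot
```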