# Introduction and Recap of Python

This tutorial is part of the course "Business Analytics: Technologies, Methods, and Concepts"

Instructor: [Dr. Konstantin Hopf](https://www.uni-bamberg.de/eesys/team/konstantin-hopf/)

The script was initially created by [Prof. Dr. Mathias Kraus](https://www.uni-regensburg.de/informatik-data-science/nachvollziehbare-ki/mitarbeiter/prof-dr-mathias-kraus/index.html) and [Prof. Dr. Patrick Zschech](https://www.wifa.uni-leipzig.de/institut-fuer-wirtschaftsinformatik/professuren/professur-fuer-anwendungssysteme/team).

## Programming Expectations
All assignments for this class will use Python and the browser-based Jupyter notebook format you are currently viewing. We will refer to the Python 3 [documentation](https://docs.python.org/3/) in this exercise and throughout the course. If you are not yet experienced in Python programming, we recommend using Google Colab, which we also use to test our Jupyter notebooks.

## Learning Goals 
The goal of the Business Analytics exercise is to **teach all steps necessary to solve a predictive data analytics task** using machine learning/neural networks. We deliberately favor **depth over breadth**, so that you are able to train state-of-the-art machine learning models after only a few exercises.

This introductory exercise is a condensed introduction to and recap of Python programming. By the end of this exercise, you will feel more comfortable:

- Writing short Python code using variables, loops, and lists.
- Writing your own Python functions.

## Variables

Python variables are used to store values. Each variable has a data type. Common basic data types that are built into Python are Booleans, integers, floats, lists, and strings:

- Booleans can only store two states: True or False
- Integers are whole numbers: -10, -4, 0, 49
- Floats are real numbers (numbers with a decimal point): -2.39, 0.42, 1009023.1
- Lists are sequences of other objects: [True, -19, 0.32, False]
- Strings are texts, i.e., sequences of characters: 'this is a string'

There are also other data types which we don't need in this exercise. 

Variable names should be lower-cased, cannot start with a number, and cannot contain special characters (just don't use anything funky and you will be fine).
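
The naming rules can be checked directly in Python. The following is a minimal sketch that uses the string method `isidentifier`; the candidate names are made up for illustration. Note that `isidentifier` only checks the syntax rules, so the lower-case recommendation remains a convention and is not enforced.

```python
# Hypothetical candidate names, chosen only for illustration
valid_name = "start_int".isidentifier()    # letters, digits, underscore: allowed
starts_with_digit = "2x".isidentifier()    # starts with a number: not allowed
has_hyphen = "my-var".isidentifier()       # contains a special character: not allowed

print(valid_name)
print(starts_with_digit)
print(has_hyphen)
```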

You can use the function `print` which outputs the stored values to the screen.

You can use the function `type` which returns the data type of the variable. 

`print` in combination with `type` allows you to output the type of a variable.

In [20]:
# This is a comment, we will use it to specify what is going on in a line of code
a = 4   # This creates a variable a which holds the value 4

In [22]:
print(a) # Using print to output the value stored in a

4


In [24]:
type_of_a = type(a) # Store the data type of a in variable type_of_a
print(type_of_a)  # Print the variable type which is stored in type_of_a

<class 'int'>


In [28]:
# The previous two lines can be put into one using the following. Thereby the returned value from the function type is directly forwarded to the function print.
print(type(a))
print(type("3"))

<class 'int'>
<class 'str'>


In [30]:
a = 2 # We can reset the value stored in variable a
print(a)

2


In [32]:
b = 3.4 # This creates a variable b and initializes it with the value 3.4
c = 'hello' # This creates a variable c and initializes it with the string 'hello'

In [34]:
print(b)
print(c)

3.4
hello


In [36]:
print(type(b)) # See what type the variable storing 3.4 has
print(type(c)) # See what type the variable storing 'hello' has

<class 'float'>
<class 'str'>


In [38]:
d = [1, 2, 'hello', 2.3] # This creates a list d and initializes it with the sequence 1,2, 'hello', 2.3. The square brackets tell that this is a list. 
# Note that the data types of the elements in the sequence are different.
print(d)
print(type(d))

[1, 2, 'hello', 2.3]
<class 'list'>


In [42]:
print(d[0]) # We can also access single positions in the list using square brackets. As you can see, this accesses the first position. 
# Watch out: some programming languages start indexing at 1, Python starts counting at 0 (i.e., the first position is accessed by 0).

1


In [50]:
print(type(d[2])) # We can also print the data type of the element which is at the third position in the sequence
print(type(d))

<class 'str'>
<class 'list'>


In [52]:
d[0] = 5.2 # We can set positions in the list to new values. 
print(d) 

[5.2, 2, 'hello', 2.3]


In [54]:
print(len(d)) # The length of the list can be obtained using the function len

4


In [80]:
d.append(10) # We can also call the method append, which creates another position at the end of the list and puts 10 there.
print(d)
print(len(d))

[5.2, 2, 'hello', 2.3, 10]
5


In [64]:
# The following creates more lists
empty_list = []
int_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
mixed_list = [1, 2., 3, 4., 5]
print(empty_list)
print(int_list)
print(mixed_list)

[]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2.0, 3, 4.0, 5]


## Exercises

1. Create two integers `start_int`, `stop_int` and initialize them with 100 and 1000.
2. Create a float `x` and initialize it with 2.3
3. Multiply `start_int` with `x` and write the result into a variable `res`. 
4. Print the variable `res`.
5. Also print the data types of `start_int`, `x`, and `res`. 
6. Create a list `e` with the sequence 1, 2, 3, 4, 5.
7. Replace the '3' with the string 'test'. Which index did you use?
8. What happens if you write 2.1 to the -1-th position (i.e.,  e[-1] = ?). Print the list.
9. Add another position at the end of the list with the value 12345.

In [1]:
#task 1
start_int = 100
print(type(start_int))
stop_int = 1000

#task 2
x = 2.3
print(type(x))

#task 3 and 4
res = start_int * x
print(res) # The result is not exactly 230 because computers cannot represent all decimal numbers exactly as floats. One could use the decimal module or round the number
print(round(res))

#task 5 (the types of start_int and x were already printed above)
print(type(res))

#task 6
e = [1,2,3,4,5]
print(e)

#task 7 (the value 3 is stored at index 2)
e[2] = "test"
print(e)

#task 8 (the index -1 addresses the last element)
e[-1] = 2.1
print(e)

#task 9
e.append(12345)
print(e)

<class 'int'>
<class 'float'>
229.99999999999997
230
<class 'float'>
[1, 2, 3, 4, 5]
[1, 2, 'test', 4, 5]
[1, 2, 'test', 4, 2.1]
[1, 2, 'test', 4, 2.1, 12345]
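
The floating-point effect seen for `res` can be handled in two ways that the comment in the solution hints at. The following is a minimal sketch, assuming that two decimal places are sufficient for rounding; the variable names are chosen for illustration only.

```python
from decimal import Decimal

res_float = 100 * 2.3               # binary floats cannot store 2.3 exactly
res_rounded = round(res_float, 2)   # option 1: round to two decimal places
res_decimal = 100 * Decimal("2.3")  # option 2: exact decimal arithmetic, created from a string

print(res_float)    # 229.99999999999997
print(res_rounded)  # 230.0
print(res_decimal)  # 230.0
```

Creating the `Decimal` from the string `"2.3"` matters here: `Decimal(2.3)` would inherit the inexact float value.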


### Basic Math Computations
The `+`, `-`, and `*` operators are overloaded in Python: their behavior depends on the data types of the operands. Let's have a look.

In [94]:
# Let's create some variables of different data types
a = 2
b = 3.0
c = 'text'
d = [1, 2, 'a']

# Now we can add some
print(a + a)
print(a + b)
print(c + c)
print(d + d)

4
5.0
texttext
[1, 2, 'a', 1, 2, 'a']


### Exercise
1. Try what happens when you add (`+`), subtract (`-`), and multiply (`*`) variables of various data types (Boolean, Integer, Float, Lists). Not all combinations work. What error message do you see when it doesn't work? Can you read the error message?

In [96]:
print(b*a)
print(25+d)

6.0


TypeError: unsupported operand type(s) for +: 'int' and 'list'
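
A few more combinations from this exercise that do work can be tried out as follows. This is only a small sketch; the variable names are chosen for illustration.

```python
bool_sum = True + True        # Booleans behave like the integers 1 and 0
repeated_text = 'text' * 2    # multiplying a string by an integer repeats it
repeated_list = [1, 2] * 2    # the same holds for lists
mixed_difference = 3 - 1.5    # an integer and a float give a float

print(bool_sum)          # 2
print(repeated_text)     # texttext
print(repeated_list)     # [1, 2, 1, 2]
print(mixed_difference)  # 1.5
```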

### Additional Information: Error Messages and Debugging

In the previous exercises, you most likely tried something that didn't work on the first try. There can be two reasons: 

1. You programmed something that runs from the interpreter's perspective (it doesn't crash) but yields a wrong result. These mistakes are particularly painful to find. For me, it helps to go through the code line by line and simply `print` the values of variables. Using this technique, you quickly see at which point a wrong value is stored in a variable. This is also the main reason why we later pick PyTorch over TensorFlow, as PyTorch always allows you to `print` the values of variables.

2. The interpreter threw an error. This happens if you try to execute a line of code that doesn't make sense or is illegal in Python. Examples are
```
a = 2 + 'hallo'
b = [1, 2, 3] + 4
```
Python is very good at describing its errors. The first line throws 
```
TypeError: unsupported operand type(s) for +: 'int' and 'str'
```
In these cases, read the error message closely. A line number is always shown as well, so go to that line and try to figure out what went wrong based on the error message. Are you dividing by zero? Are there data types that you didn't expect?

These two kinds of problems usually take up 90% of the time spent programming. In the beginning, each error message is a pain and takes a long time to fix. After a while, you will have seen many of them and will know what is going on before you have even looked at the line where the error is thrown.
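
The `print` technique for the first kind of mistake can be sketched as follows. The scenario (a price calculation with a wrongly stored tax rate) and all variable names are made up for illustration.

```python
# Hypothetical example: the code runs without an error but the result is wrong
net_price = 100
tax_rate = 25               # intended as 25 percent, but stored as the number 25

tax = net_price * tax_rate
print(tax)                  # 2500: far too large, so the wrong value enters here

tax_rate = 25 / 100         # fix: store the percentage as a fraction
tax = net_price * tax_rate
print(tax)                  # 25.0

gross_price = net_price + tax
print(gross_price)          # 125.0
```

Printing the intermediate value `tax` shows at which line the wrong value first appears, which narrows down where to look.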

### Comparison between Variables

Quite often, it is handy to evaluate if two variables have the same value. Especially when later looking at `if` and `else` statements, this is crucial. Let's have a look at a couple of examples:


In [104]:
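a = 2
b = 3
print(a == b)
print(a == 2)
c = (a == 2)          # The result of a comparison can be stored in a variable
type(c)               # c is a Boolean (nothing is shown because the result is not printed)
d = (a == 2) & False  # & combines two Booleans with a logical "and": True and False gives False
d = (a == 2) | False  # | combines two Booleans with a logical "or": True or False gives True (this overwrites d)
print(d)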

False
True
True


Note the two equal signs ```==```. As the single equal sign assigns a value to a variable, we need two equal signs to compare two values. Other comparisons are

- greater ```x > 5```
- greater or equal ```x >= 2```
- smaller ```x < 3```
- smaller or equal ```x <= 7```
- unequal ```x != 12```

Additional Information: A condition always boils down to an (internal) Boolean variable that is either True or False. If the Boolean variable is True, the conditional code block is executed. If it is False, the block is not executed.  
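
The comparisons listed above all return Booleans, which can be stored and combined. The following is a minimal sketch with an illustrative value for `x`; it uses the keywords `and` and `or`, which are the more common way to combine Booleans in Python than the `&` and `|` operators.

```python
x = 5

is_greater = x > 5           # False, because 5 is not greater than 5
is_greater_equal = x >= 5    # True
is_unequal = x != 12         # True

both = is_greater and is_unequal     # True only if both sides are True
either = is_greater or is_unequal    # True if at least one side is True

print(is_greater, is_greater_equal, is_unequal)
print(both, either)
```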

### Slicing of Lists

Quite often, you want to extract a sub-list of your original list. This can be done with the syntax ```list[beginning_index:ending_index]```, where the ```beginning_index``` is included and the ```ending_index``` is excluded. Let's have a look at a couple of examples.

In [136]:
l = [10, 923, 12, 23, 1, 2, 35, 12, 34] # We first create a list
print(l)
print(l[2])

[10, 923, 12, 23, 1, 2, 35, 12, 34]
12


In [138]:
# Now we want to create a new sub-list that only contains the first three elements:
sub_list = l[0:3] # We start at index 0 and want to end at index 2 (Remember that the ending_index is excluded)
print(sub_list)

[10, 923, 12]


In [140]:
# Let's create a sub-list from the element at index 2 to the element at index 6.
sub_list = l[2:7]
print(sub_list)

[12, 23, 1, 2, 35]


Note that the description of the position can be unclear. Do we say that 923 is the second element (although its index is 1), or do we say that 12 is the second element (because its index is 2, although it sits at the third position)? Try to be precise here and always use the index of an element.

In [156]:
# We have seen before that we can also access the last element with the index -1. Let's see what happens if we use that when slicing
print(l[-3: -1])
print(l[-3: -4])
print(l[0:100])

[35, 12]
[]
[10, 923, 12, 23, 1, 2, 35, 12, 34]


Python lists can be accessed from the end of the list using negative indices. ```l[-3:-4]``` returns an empty list because the beginning_index lies after the ending_index. 
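
One way to read negative indices is as a shorthand for counting back from the length of the list. The following sketch reuses the list `l` from above and checks this equivalence; the helper variable `n` is introduced only for illustration.

```python
l = [10, 923, 12, 23, 1, 2, 35, 12, 34]
n = len(l)

same_element = l[-1] == l[n - 1]             # the last element
same_slice = l[-3:-1] == l[n - 3:n - 1]      # the same sub-list, written with positive indices
last_three = l[-3:]                          # omitting the ending_index runs until the end

print(same_element)  # True
print(same_slice)    # True
print(last_three)    # [35, 12, 34]
```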

In [134]:
# We can also not specify the beginning_index or the ending_index and the list starts at the beginning or runs until the end of the list, respectively
print(l[2:])
print(l[:4])

[12, 23, 1, 2, 35, 12, 34]
[10, 923, 12, 23]


### Exercise

1. Create a list my_list with a sequence of length 11 with booleans, integers, floats, and strings
2. Replace the element with index 3 by 'moin'
3. Replace the second last element by 0 and the element with index 4 with 15
4. Add 2 to the second last element
5. Multiply the element with index 4 by 3
6. Can you print the center element of the list?
7. Can you also print the center element without specifying the index? Hint: Use the length of the list. You will probably run into an error. Have a look at https://www.w3schools.com/python/python_casting.asp and try to fix the error. 

In [164]:
#task 1
my_list = [1, 3, 2, True, 3.4, 'hallo', 5, 4, 2, 3, False]
print(my_list)

#task 2 
my_list[3] = 'moin'
print(my_list)

#task 3 
my_list[-2] = 0
my_list[4] = 15
print(my_list)

#task 4
my_list[-2] = my_list[-2] + 2 #(short version: my_list[-2] += 2)
print(my_list)

#task 5
my_list[4] = my_list[4] * 3 #(short version: my_list[4] *= 3)
print(my_list)

#task 6
print(my_list[5])

my_list.append(4) # We extend the list to show that the solution of task 7 adapts to the new length
my_list.append(5)

#task 7
print(len(my_list) / 2) # The division returns a float, which cannot be used as an index
#print(my_list[len(my_list) / 2]) #leads to an error
print(my_list[int(len(my_list) / 2)]) # Casting to int cuts off the decimal places
print(int(3.2))

[1, 3, 2, True, 3.4, 'hallo', 5, 4, 2, 3, False]
[1, 3, 2, 'moin', 3.4, 'hallo', 5, 4, 2, 3, False]
[1, 3, 2, 'moin', 15, 'hallo', 5, 4, 2, 0, False]
[1, 3, 2, 'moin', 15, 'hallo', 5, 4, 2, 2, False]
[1, 3, 2, 'moin', 45, 'hallo', 5, 4, 2, 2, False]
hallo
6.5
5
3


## Conditions

Above, we have seen how logical comparisons can be made with `==` (equality), `!=` (inequality), `<` (less than), `<=` (less than or equal), ... 
Now, we can use these logical comparisons to alter the program flow. 

In [21]:
a = 33
b = 200

if b > a:
  print("b is greater than a")
print("This is always executed")

#this condition can also be extended with `elif` or `else` statements

a = 199
if b > a:
  print("b is greater than a")
elif b == a:
  print("equal")
else:
  print("a is greater than b")

b is greater than a
This is always executed
b is greater than a


Please note that Python relies on **indentation** (whitespace at the beginning of a line) to define scope in the code. Other programming languages often use curly brackets for this purpose. 

In [23]:
#this will not work:
if b > a:
print("b is greater than a")

IndentationError: expected an indented block after 'if' statement on line 2 (3106660785.py, line 3)

## For Loops

For loops are handy to iterate through a list. Iterating through a list means that a block of code is executed repeatedly: in the first round, a variable stores the first value of the list, in the second round it stores the second value, and so on, until the last value of the list is reached. It's easier to show it using an example :-)

In [27]:
list_to_iterate = [1, 2, 3, 4, 5] # We first create a list containing a couple of values
print(list_to_iterate)

[1, 2, 3, 4, 5]


In [29]:
# We now create the for loop.
for variable_to_store_values_in_each_iteration in list_to_iterate:
  print(variable_to_store_values_in_each_iteration)
  print('Next iteration')

# Of course, you would never name your variable with such a long name

1
Next iteration
2
Next iteration
3
Next iteration
4
Next iteration
5
Next iteration


In [31]:
# For loops are nice because you access all elements in the list:

for x in list_to_iterate:
  print(x * 2)

2
4
6
8
10


As you probably noticed, the print lines in the past two code blocks are indented. This is used by the Python programming language to put blocks of code together. In other programming languages, this is usually done by some kind of brackets. See the following example to understand the behavior:

In [33]:
for x in list_to_iterate:
  print('hello')
print('world')

hello
hello
hello
hello
hello
world


As you can see, 'hello' is indented and, thus, is in the for loop block. It gets executed multiple times. 'world' is outside (after) the for loop block and, thus, only gets executed once. 

## Exercises

1. Create a list list_2 with the values 5 to 10.
2. Create a for loop which iterates over the *positions* of the list (not the values) and sets all values of list_2 to 2 (i.e., the resulting list should store the sequence 2, 2, ..., 2). Hint: Create another list which helps you to access the positions in list_2
3. Have a look at the function range (for instance at https://pynative.com/python-range-function/). Can you solve the problem from 2 also using the range function?
4. Difficult: Print the following using a for-loop. Hint: Use `print('string', end=' ')` to print 'string' without having a new line afterwards. Use `print()` to end the current line. 
```
1
1 2 
1 2 3 
1 2 3 4 
1 2 3 4 5
```

In [41]:
#task 1
list_2 = [5, 6, 7, 8, 9, 10]

#task 2 - initial solution, using an additional (manually created) list of indices
list_index = [0, 1, 2, 3, 4, 5]
for x in list_index:
    list_2[x] = 2
print(list_2)

#task 3 - alternative solution: using range()
list_2 = [5, 6, 7, 8, 9, 10] # reset the list so that we see the effect again
list_index = range(0, len(list_2)) # len gives the length of the list and range creates the integer sequence 0, 1, ..., 5
for x in list_index:
    list_2[x] = 2
print(list_2)

#task 2/3 - alternative solution, using the built-in function enumerate, which yields index and value together
list_2 = [5, 6, 7, 8, 9, 10] # reset the list once more
for idx, x in enumerate(list_2):
    print(x)
    list_2[idx] = 3 # we write 3 instead of 2 here to tell this solution apart from the previous ones
print(list_2)

#task 4
list_3 = [1,2,3,4,5]
list_3idx = [1,2,3,4,5]

for x in list_3idx:
    print(list_3[0:x]) # this gives us the output with the list-brackets

for x in list_3idx:
    for y in list_3[0:x]: # we can use a nested loop to come to the solution
      print(y, end = " ")
    print() # end the current line

[2, 2, 2, 2, 2, 2]
[2, 2, 2, 2, 2, 2]
5
6
7
8
9
10
[3, 3, 3, 3, 3, 3]
[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, 4]
[1, 2, 3, 4, 5]
1 
1 2 
1 2 3 
1 2 3 4 
1 2 3 4 5 

### Loops combined with If-Else Statements

If and else statements can be used if you only want to run code when a condition is fulfilled. Similar to the for-loop, Python uses indented code to tell where the block starts and ends:

```
if condition_1:
  print('condition_1 fulfilled')
  print('we are still in the if-clause')
elif condition_2:
  print('condition_2 fulfilled')
else:
  print('no condition fulfilled')

print('we are out of the if-else statement')
```

In [43]:
l = ['eva', 'markus', 'sandra', 'mathias'] # Let's create a list that we can iterate through

for name in l:
  if name == 'eva':
    print('hello eva')
  elif name == 'markus':
    print('moin markus')
  else:
    print('name not in if-else conditions')

hello eva
moin markus
name not in if-else conditions
name not in if-else conditions


### Exercise

1. Create a list l with the numbers from 0 to 10.
2. Iterate through the list and print 'even' whenever the element at the iteration is even. Otherwise, print 'not even'.

In [48]:
l = list(range(0, 11)) # range creates the numbers 0 to 10 and list turns them into a list

for x in l:
    #print(x)
    if x % 2 == 0:   # % is the modulo operator: it returns the remainder of a division
      print(x, "is even")
    else:
      print(x, "is not even")

0 is even
1 is not even
2 is even
3 is not even
4 is even
5 is not even
6 is even
7 is not even
8 is even
9 is not even
10 is even
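
The modulo operator used in the solution can also be looked at in isolation. The following is a small sketch; the integer division operator `//` does not appear elsewhere in this exercise and is used here only to show how quotient and remainder fit together.

```python
remainder = 7 % 2     # 7 divided by 2 is 3 with a remainder of 1
quotient = 7 // 2     # integer division drops the remainder

print(remainder)                   # 1
print(quotient)                    # 3
print(quotient * 2 + remainder)    # 7, the original number
print(10 % 5)                      # 0, because 10 is divisible by 5
```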


## Functions

A *function* is a reusable block of code that does a specific task. Functions are commonplace in Python, either on their own or as part of other objects. To invoke a function `func`, you call it as `func(arguments)`.

We've seen built-in Python functions and methods. For example, `len()` and `print()` are built-in Python functions.

## User-Defined Functions

We now learn to write our own functions. Below is the syntax for defining a basic function with one input argument and one output. You can also define functions with no input or output arguments, or multiple input or output arguments. As you can see, Python uses indented blocks to tell where the function code starts and where it ends.

```
def name_of_function(arg):
  ...
  return(output)
```

Here are two such functions with one input and one output argument.

In [50]:
def square(x):
  x_sqr = x*x
  return(x_sqr)

def cube(x):
  x_cub = x*x*x
  return(x_cub)

We can now call these two functions with arguments.

In [52]:
square_return = square(5)
print(square_return)
cube_return = cube(90) # We store the result of cube in its own variable
print(cube_return)

25
729000
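
Functions with no input and no output argument were mentioned above as well. The following is a minimal sketch with a hypothetical function name; a function without a `return` statement hands back the special value `None`.

```python
def print_greeting():    # no input argument
  print('hello')         # no return statement

result = print_greeting()
print(result)            # None
```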


What if you want to return two variables at a time? Simply separate them with a comma.

Additional Information: The returned value (125, 25) (with the round brackets) is of data type `tuple`. It is not as common as a list, so we skipped it previously.

In [62]:
def square_and_cube(x):
  x_cub = x*x*x
  x_sqr = x*x
  return x_cub, x_sqr

a, b = square_and_cube(5)
print(a)
print(b)
print(square_and_cube(5))

x = square_and_cube(5)
print(type(x))

125
25
(125, 25)
<class 'tuple'>


And if you want to have two inputs?

In [65]:
def elemt_wise_sum(x, y):
  result_sum = [] #This is an empty list

  for index in range(len(x)):                 # We now iterate through the list
    result_sum.append(x[index] + y[index])    # and add the respective values from the two lists
      
  return result_sum

print(elemt_wise_sum([1, 2, 3], [3, 4, 5]))

[4, 6, 8]
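
As an alternative to iterating over the indices, the two lists can be walked through in parallel with the built-in function `zip`. This is a sketch of a variant, not the solution used in this script; the function name `elemt_wise_sum_zip` is made up for illustration.

```python
def elemt_wise_sum_zip(x, y):
  result_sum = []

  # zip pairs the first elements, then the second elements, and so on
  for x_value, y_value in zip(x, y):
    result_sum.append(x_value + y_value)

  return result_sum

print(elemt_wise_sum_zip([1, 2, 3], [3, 4, 5]))  # [4, 6, 8]
```

Note that `zip` stops at the end of the shorter list, so lists of unequal length do not raise an error here.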


## Methods
A function can also belong to an object (if you don't know what objects and classes are, please refer to https://www.geeksforgeeks.org/python-classes-and-objects/ or other sources). When a function belongs to an object, it is called a *method*. By "object," we mean an "instance" of a class (e.g., list, integer, or floating point variable).

For example, when we invoke `append()` on an existing list, `append()` is a method.

**Please note:** There are functions that belong to objects (called methods) and functions that are independent of objects (simply called functions).
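
The difference shows in the way the call is written: a function receives the object as an argument, whereas a method is called on the object with a dot. A short sketch with illustrative variable names:

```python
numbers = [3, 1, 2]

length = len(numbers)      # function: the list is passed as an argument
numbers.append(4)          # method: called on the list object with a dot
numbers.sort()             # another list method, which sorts the list in place
shout = 'hello'.upper()    # strings have methods too; upper returns a new string

print(length)   # 3
print(numbers)  # [1, 2, 3, 4]
print(shout)    # HELLO
```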

## Exercises

1. Create a function `elemt_wise_product` which takes three lists as an input and returns a list in which each position is equal to the product of the three values from the three lists, 
e.g., `elemt_wise_product([0, 1, 2], [4, 5, 6], [2, 3, 4])` should return `[0, 15, 48]`. You can assume that the lists have equal length.
2. Create a function `hello_name` which takes an input `name` and prints 'hello' if the name equals 'eva', 'sandra', or 'markus'. Else it doesn't print anything. 

In [71]:
#task 1
def elemt_wise_product(list_1, list_2, list_3):
    #check if the length of the given lists is equal (not required for the task but good programming behavior)
    if not (len(list_1) == len(list_2) and len(list_1) == len(list_3)):
        #this command produces an error with the given message
        raise ValueError("list lengths are not equal")

    #create a list with the indices
    list_idx = range(0, len(list_1))

    #create a list for results
    result = []

    #iterate over the lists
    for x in list_idx:
        result.append(list_1[x] * list_2[x] * list_3[x])
    return(result)

print(elemt_wise_product([0, 1, 2], [4, 5, 6], [2, 3, 4]))

#task 2
def hello_name(name):
    if name == "eva" or name == "sandra" or name == "markus":
        print("hello " + name)

hello_name('Hans')
hello_name("sandra")

[0, 15, 48]
hello sandra
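
The chained `or` comparisons in `hello_name` can also be written with the `in` operator, which checks whether a value occurs in a list. This is a sketch of a variant under a hypothetical function name, not a replacement for the solution above.

```python
def hello_name_in(name):
    # in returns True if name equals one of the list elements
    if name in ['eva', 'sandra', 'markus']:
        print('hello ' + name)

hello_name_in('Hans')    # prints nothing
hello_name_in('sandra')  # prints: hello sandra
```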
