<div style="text-align: center;" >
<h1 style="margin-top: 0.2em; margin-bottom: 0.1em;">Python and Basics in Python</h1>
<h4 style="margin-top: 0.7em; margin-bottom: 0.3em; font-style:italic">The full Python Basics Notebook.</h4>
</div>
<br>

*partly adapted from: Prof. Dr. Karsten Donnay, Stefan Scholz (see the [github repo](https://github.com/stefan-scholz/python-block-course-2019))*

### __Structure__

1. What is Python and Why Python?


2. Where Python?


3. How to write Python Code?
    
    Style Guidelines


4. Data Structures in Python
    
    4.1 Primitive Data Types in Python
    
    4.2 Type Conversions in Python
    
    4.3 Abstract Data Structures in Python
    
    
5. Advanced Techniques
    
    5.1 Functions
    
    5.2 Loops and Control Flow
    
    5.3 Methods
    
    5.4 List Comprehension
    
    5.5 Lambda Functions
    


### __1. What is Python and Why Python?__

Advantages of Python:
- Fast to learn
- Easy to write and read code
    - High-level language that abstracts away from the machine instructions executed by the CPU
    - Automatically manages memory and type declarations for the user
    - Formatting/indentations of the code have a meaning 
- Flexible
    - Widely used in Data Science and Statistical Learning
- Extensive library support 
- Uses an Interpreter
    - Can be used as a calculator
- Allows for Object Oriented Programming and functional programming

### __2. Where Python?__

Of course, you could also write (Python) code in a text editor (not Microsoft Word, but e.g. Sublime Text, Notepad++ or Atom) and later run it with an external interpreter. This, however, is not really recommended, as you can't check or debug the code by executing it immediately while working on it. 

You should rather use an IDE (Integrated Development Environment), which has an integrated interpreter that executes your code. By highlighting and formatting the written code, IDEs further support you while coding.

![title](source/Differences_TextEditor_IDE.png)

IDEs you could use are:
- Visual Studio Code
    - Intelligent Code Editor (one of the most popular ones)
    - Built-In Developer Tools
    - Provides debugging and git support
    - Lightweight with options to extend it
    - Supports almost every major programming language
    - Download and Install: [here](https://code.visualstudio.com/)


- PyCharm
    - Intelligent Code Editor
    - Built-In Developer Tools
    - Cross-Platform Support
    - Provides nice Python Courses in the Educational Edition
    - Free Student Licenses for Professional Edition
    - Download and Install: [here](https://www.jetbrains.com/pycharm/)
    
    
- Jupyter Notebook (what you are looking at, right now)
    - Interactive IDE in Web Application 
    - Sharable Notebooks (containing Code, Visualizations, Text)
    - Free and Open Source
    - Supports over 40 Programming Languages
    - Download and Install: [here](https://jupyter.org/install)

### __3. How to write Python Code?__

Within a notebook, the code is structured in boxes, the cells. 
When running a code snippet (by pressing the "Play" button or "Shift" + "Enter"), the whole cell is executed and the result is printed below. 
This way, you can immediately check, and if necessary adapt, your code. 

You can use Python just like a regular calculator with the operators `+`, `-`, `*`, `/` and parentheses `()`. 
For some operations there are special operators, e.g. power is `**`.

Although it might seem unnecessary in some situations, it is good practice to write comments in your code. 
A comment always starts with a hash sign `#`.

**For example:** 
- What is the result of $\sqrt{5-3}$?

In [1]:
# Calculate result 
(5-3) ** (1/2)

1.4142135623730951

Apart from the basic arithmetic operators, there are many functions in Python.
Functions are indicated by round brackets appearing after the function's name.
The function's inputs (if required) are passed within the brackets. 

While there are built-in functions in Python you can of course also define your own functions. 
But more on that later.

For now, we are going to look at some very basic functions, e.g. the `print()` function.
Above, we saw that Python prints the output of a cell below the cell. 
But have a look at the following:

In [2]:
# Calculate first result 
(5-3) ** (1/2)

# Calculate second result 
(5-3) ** (2)

4

Apparently, only the output from the last line of code is printed. 
If you want to force Python to print both outputs, you need the `print()` function and hand over the input:

In [3]:
# Calculate and print first result 
print((5-3) ** (1/2))

# Calculate and print second result 
print((5-3) ** (2))

1.4142135623730951
4


Now, we get both results printed out (note that it's not necessary, but also not harmful, to add `print()` to the last calculation).

However, the print function can do more than simply output the result.
You can also format numbers (e.g. round them to a certain number of decimal places) or combine text with calculated numbers:

In [8]:
# Calculate, print and format first result 

print("The first result (no rounding) is: {}".format((5-3)**(1/2)))
print("The first result (rounded) is: {:.2f}".format((5-3)**(1/2)))

# Calculate and print second result 
print("The second result is: ", (5-3) ** (2))

The first result (no rounding) is: 1.4142135623730951
The first result (rounded) is: 1.41
The second result is:  4


Another helpful function is the `help()` function. 
Wrapping it around another function gives you an explanation of what the function does, which inputs it takes and what it returns, e.g. `help(print)`.
Or you just google stuff ;)

Other useful functions you might need:

| Function | Purpose |
| -------- | ------- |
| `abs()` | absolute value of the argument |
| `dir()` | list of attributes and methods of an object |
| `len()` | number of items in a container |
| `max()` | with a single iterable argument, return its biggest item |
| `min()` | with a single iterable argument, return its smallest item |
| `open()` | open file and return a stream |
| `range()` | produces a sequence of integers from start (inclusive) to stop (exclusive) by step |
| `round()` | round a number to a given precision in decimal digits (default 0 digits) |
| `sorted()` | new list containing all items from the iterable in ascending order |
| `sum()` | sum of an iterable of numbers |
| `type()` | type of an object |
| `zip()` | iterator of tuples, where the i-th tuple contains the i-th element of each iterable argument |

For a complete list of functions, please have a look at the [Python documentation](https://docs.python.org/3/library/functions.html).
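
As a minimal sketch, a few of the built-in functions from the table can be tried out as follows. The list `numbers` and its values are assumptions chosen only for illustration.

```python
# Example data (hypothetical values, chosen only for illustration)
numbers = [3, -7, 10, 4]

print(abs(-7))            # absolute value: 7
print(len(numbers))       # number of items: 4
print(max(numbers))       # biggest item: 10
print(min(numbers))       # smallest item: -7
print(sum(numbers))       # sum of all items: 10
print(sorted(numbers))    # new sorted list: [-7, 3, 4, 10]
print(round(3.14159, 2))  # rounded to two decimal places: 3.14
print(type(numbers))      # <class 'list'>

# range() and zip() are lazy, so we wrap them in list() to see their content
print(list(range(0, 10, 2)))          # [0, 2, 4, 6, 8]
print(list(zip([1, 2], ["a", "b"])))  # [(1, 'a'), (2, 'b')]
```

Note that `sorted()` returns a new list, so `numbers` itself stays unchanged.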

As your code becomes more and more complex, it might be helpful to store numbers as variables, which you can reuse and recall at different points in your code. 
Variable names can be any combination of letters, underscores and numbers, but they can't start with a number. 
For readability, you usually use lower case letters and underscores to separate words.
The equal sign `=` is used to assign a value to a variable name. 

A basic example would be:

In [10]:
# define variables
base = 5 - 3
exponent = 1 / 2

# calculate result
result = base ** exponent

# print result
print(result)

1.4142135623730951


<div class="alert alert-block alert-info">
    <b>Exercise 1 </b>: Compute the sum and average of all integers from 0 to 1,000. Use variables for intermediate results. Print your results
</div>

In [8]:
# Let's get active:

print(1+1)  # ignore please... its a github error test

<div class="alert alert-block alert-info">
    <b>Exercise 2</b>: Compute the following. Store your results in variables named a1 and a2 and print your results.
    <br>1. (2<sup>3</sup> + 3<sup>2</sup>) / 2
    <br>2. (3<sup>2</sup> - 2<sup>4</sup>)<sup>2</sup>
</div>

In [7]:
# Let's get active:



While you are generally free in choosing your variable names, there are a few keywords, which have a fixed meaning for the Python interpreter and can therefore not be used as variable names:

In [6]:
import keyword

# print keywords occupied by interpreter
print(keyword.kwlist)

['False', 'None', 'True', '__peg_parser__', 'and', 'as', 'assert', 'async', 'await', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']


**Style Guidelines**

- Use lower case letters (only) for variable and function names and add underscores for readability
    - e.g. my_function()
    - round_no, personal_id
- Comment your code (to explain what you are doing)
    - use '#' to add a comment to a line (everything behind the '#' will not be evaluated)
- When defining functions yourself (see later), these should have a header/description
    - See 5.1 Functions in this notebook

### __4. Data Structures in Python__

In Python, every variable we assign has a certain data type/structure. 


> **Data structures** are a way of organizing and storing data so that they can be accessed and worked with efficiently. They define the *relationship between the data*, and the *operations* that can be performed on the data ([Source](https://www.datacamp.com/tutorial/data-structures-python)).

![title](source/Python_Data_Structures.png)


There are two kinds of data types/structures, primitive and non-primitive ones. 
- **Primitive data structures**
    - simplest form of representing data
    - contain pure, simple values
    - in general, the different primitive data types are: ![title](source/Primitive_Data_Types.png)
    - however, Python only distinguishes:
        - Boolean
        - Integer
        - Float
        - String



- **Non-primitive (abstract) data structures**
    - more advanced/complex structures to serve special purposes
    - within these structures, they contain primitive data types
    - in Python the abstract data types are
        - List (linear vs. non-linear)
        - Array
        - Dictionary
        - Tuples
        - Set
        - File


### __4.1 Primitive Data Types in Python__

Python features four basic variable types, namely booleans, numbers, strings and collections. 

Variables are assigned by using the "=" operator. It is even possible to assign multiple variables at once, 

> e.g. 
> - ``x = 1; y = "a"; z = [1, "a"]``
> - ``x, y, z = 1, "a", [1, "a"]``
> - ``x = y = z = 1``

*Note:* In Python you don't have to explicitly declare the data type when initializing a variable. Python automatically infers the type from the assigned value (unlike, e.g., Java).  
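
To make the note on dynamic typing concrete, here is a small sketch. The variable name `value` is an assumption for illustration; the last part shows a caveat of chained assignment with mutable objects.

```python
# The same name can refer to values of different types over time
value = 1
print(type(value))   # <class 'int'>

value = "a"
print(type(value))   # <class 'str'>

value = [1, "a"]
print(type(value))   # <class 'list'>

# Chained assignment binds all names to the SAME object.
# For mutable objects such as lists, a change is visible through every name.
x = y = z = [1, 2]
x.append(3)
print(y)             # [1, 2, 3]
```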


In [2]:
# assigning multiple variables
x, y, z = 1, "a", [1, "a"]

# printing them individually 
print(x)
print(y)
print(z)

1
a
[1, 'a']


In [3]:
# assigning the variables again
x = y = z = 1

# printing them individually 

print(x)
print(y)
print(z)

1
1
1


**Booleans**

Booleans are logical values. They can either be ``True`` or ``False`` and are for example returned by comparative operators.

*Logical Operators*:
- Logical AND: ``<Boolean> and <Boolean>`` (``&`` also works on booleans)
- Logical OR: ``<Boolean> or <Boolean>`` (``|`` also works on booleans)
- Logical exclusive OR: ``<Boolean> ^ <Boolean>``
- Negation: ``not <Boolean>``


*Comparators*:

Comparators are used to compare inputs and return booleans. This way, booleans are widely used in flow control or conditional statements.
- Equality: ``<Input> == <Input>``
- Not equal: ``<Input> != <Input>``
- Greater (or equal): ``<Input> > <Input>`` (``<Input> >= <Input>``)
- Lower (or equal): ``<Input> < <Input>`` (``<Input> <= <Input>``)

> e.g. 
> - ``x = 2``,  ``y = 4``
> - ``x == y`` = False
> - ``x < y`` = True
> - ``z = (x == y)`` = False
> - ``if z: print("True")`` followed by ``else: print("False")`` prints "False"


Due to their binary nature, booleans can also be expressed as 1 (True) and 0 (False).

If you are working with binary numbers:
- Bitwise Operators (only for integers)
    - ``~`` bitwise complement
    - ``&`` bitwise *and* (1&1 = 1, else 0)
    - ``|`` bitwise *or* (0|0 = 0, else 1)
    - ``^`` bitwise exclusive *or* (1^0 = 1, 0^1 = 1, else 0)
    - ``<<`` bitwise shift to the left (0010 << 2 = 1000)
    - ``>>`` bitwise shift to the right (0110 >> 2 = 0001)
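
The logical and bitwise operators listed above can be tried out as in the following sketch. The variables `a` and `b` and the binary literals are assumptions chosen for illustration; `bin()` is only used to display results in binary notation.

```python
a, b = True, False

# Logical operators on booleans
print(a and b)   # False
print(a or b)    # True
print(not a)     # False
print(a ^ b)     # True (exclusive or)

# Booleans behave like the integers 1 and 0
print(True + True)           # 2

# Bitwise operators on integers, written as binary literals
print(bin(0b0010 << 2))      # 0b1000
print(bin(0b0110 >> 2))      # 0b1
print(bin(0b0110 & 0b0011))  # 0b10
print(bin(0b0110 | 0b0011))  # 0b111
print(bin(0b0110 ^ 0b0011))  # 0b101
print(~0b0110)               # -7 (complement of 6)
```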


In [54]:
1 == 1

True

In [55]:
2 > 3

False

In [57]:
x = True
#x = False

if x:
    print("X was True")
else:
    print("X was False")

X was True


<div class="alert alert-block alert-info">
    <b>Exercise 3</b>: In Exercise 2 you computed a1 and a2. Now check the following and print the results.
    <br>1. a1 equal to  a2
    <br>2. a1 smaller than a2
    <br>3. a1 greater than a2
</div>

In [9]:
# Let's get active!




**Numbers**

Python features two types of numbers (important for us), integers and floats.

- **Integers** are whole numbers, e.g. ``x = 1``
    - you can cast other data types to integers by using ``int()``, e.g. ``int("1")`` = 1
- **Floats** (floating point numbers) are real numbers, e.g. ``x = 1.0`` or ``y = 3.1415``
    - you can cast other data types to floating point numbers by using ``float()``, e.g. ``float(2)`` = 2.0
    
*Arithmetic Operators*:
- Basic Operators: ``+, -, *, /``
- Raise to the power of: ``<Number> ** <Number>``
- Modulo: ``<Number> % <Number>``
- Float division: ``<Number> / <Number>``, integer (floor) division: ``<Number> // <Number>``
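
A short sketch of how the division-related operators differ; the numbers are arbitrary examples.

```python
print(7 / 2)         # 3.5   true division always returns a float
print(7 // 2)        # 3     floor division drops the fractional part
print(7 % 2)         # 1     modulo returns the remainder
print(2 ** 10)       # 1024  power
print(-7 // 2)       # -4    floor division rounds towards negative infinity
print(divmod(7, 2))  # (3, 1) quotient and remainder in one call
```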




In [46]:
3+2

5

In [45]:
3-2

1

In [48]:
# Note that the result is automatically a floating point number
5 / 2 

2.5

In [52]:
# Parentheses work as you would expect them to work:
print(2 + 3 * 4)
print((2 + 3) * 4)

14
20


In [53]:
# Floating point numbers
1.1 + 1.1

2.2

**Strings**

Strings are sequences of characters, e.g. ``x = "abc"`` (you can use single ``'abc'`` or double ``"abc"`` quotation marks). Other data types can be cast to strings by using ``str(<Input>)``, e.g. ``str(1)`` = "1".

For strings over multiple lines, you use three quotation marks.

```python
sentence = '''I love Python!
It is so much fun.'''
```

*Operations*:
- Slicing and indexing
    - ``"abc"[1]`` = "b"
    - ``"abc"[1:]`` = "bc"
- Concatenation (combining strings): ``<String> + <String>``
    - ``"a" + "b"`` = "ab"
    - ``"Hello" + " " + "World" + "!"`` = "Hello World!"
    - Note the blank space, we had to include.
- Repeating: ``<String> * <Number>``
    - ``"a" * 3`` = "aaa"
- ``in`` to query for a character within a string
    - ``"a" in "Daniel"`` = True
- String comparison
    - ``s1 = "Hello World"`` ; ``s2 = "Hello World"``, ``s1 == s2`` = True

Additionally, Python has many built-in functions and methods for strings. For example, with ``string_1 = "Hello World"``:
- Capitalize: ``str.capitalize("hello")`` = "Hello"
- Get the length: ``len(string_1)`` = 11
- Check if a string contains only digits: ``string_1.isdigit()`` = False
- Replace parts of a string: ``string_1.replace("World", "Elena")`` = "Hello Elena"
- Split strings: ``string_1.split(" ")`` = ["Hello", "World"]
- Find substrings within strings: ``string_1.find("llo")`` = 2 (because "llo" starts at index position 2)


Be aware that strings can also contain numeric characters, e.g. ``x = "1"``, ``y = "4"``; then ``z = x + y`` is the string "14", not the number 5.
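
The string operations above can be checked with a short sketch. It reuses the example value ``"Hello World"`` from the list; the substring ``"xyz"`` is an assumption added to show what ``find()`` returns when nothing is found.

```python
string_1 = "Hello World"

print(string_1[1])                         # e
print(string_1[6:])                        # World
print("World" in string_1)                 # True
print(string_1.split(" "))                 # ['Hello', 'World']
print(string_1.find("llo"))                # 2
print(string_1.find("xyz"))                # -1 (substring not found)
print(string_1.replace("World", "Elena"))  # Hello Elena
print("ab" * 3)                            # ababab

# "+" concatenates strings; convert to int first to add the numbers
print("1" + "4")                           # 14
print(int("1") + int("4"))                 # 5
```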

In [5]:
first_string = "I love Python!"

# is the same as 
second_string = 'I love Python!'

print(first_string)
print(second_string)

I love Python!
I love Python!


In [8]:
first_name = "eric"
print(first_name)

# Change the first letter to upper case:
first_name = first_name.title()
print(first_name)

# capitalize does the same for a single word
first_name = first_name.capitalize()
print(first_name)

eric
Eric
Eric


In [9]:
first_name = "eric"
print(first_name)

# changing all letters to upper case
first_name = first_name.upper()
print(first_name)

# and to lower again
first_name = first_name.lower()
print(first_name)

eric
ERIC
eric


In [25]:
# Quotations
quote = "Eric said, 'I love Python!'"

print(quote)

Eric said, 'I love Python!'


In [27]:
# Creating strings with the .format() - method

favorite_language = "Python"
person = "Eric"

# create a new string by using .format() method
quote = "{} is the new favorite language of {}".format(favorite_language, person)
print(quote)

Python is the new favorite language of Eric


In [28]:
# Slicing strings
string = "I love Python!"


# Extracting "Python": it starts at index position 7 (the start is inclusive).
# Note that blank spaces count as well.
# Its last character is at index 12; since the end of a slice is exclusive, we slice until 13.

print(string[7:13])

Python


In [16]:
# More string methods
print(string.startswith("I"))

# note that it is case sensitive (upper or lower case)
print(string.startswith("i"))

print(string.endswith("thon!"))

True
False
True


In [19]:
print(string.replace("Python", "R"))

print(string.replace("love", "hate"))

I love R!
I hate Python!


Note how the change of "Python" to "R" is not permanent: ``replace()`` returns a new string and leaves the original untouched, because strings are immutable. We did not replace the old string with the new one.

Therefore, when changing "love" to "hate", the original string is used, and the programming language is again Python.


In [22]:
# overwriting the old string with the new one
string = string.replace("Python", "R")

string = string.replace("love", "hate")

# now it works
print(string)

I hate R!


<div class="alert alert-block alert-info">
    <b>Exercise 4</b>: Find and store a sentence you like into a variable. Store a person (first and last name), who said the sentence.
    <br>1. Print the sentence in the format "X once said, 'your_sentence'."
    <br>2. Store the first and last name of the person in two single variables (split and extract the words).
    <br>3. Change first and last name both to lower case letters.
</div>

In [10]:
# Let's get active!




In [11]:
# Let's get active! 




In [12]:
# Let's get active!




**Checking the Data Type with: type()**

Calling the function `type()` on any variable will tell you the data type of the object.

- `type(2)` = int
- `type("Hello")` = str



### __4.2 Type Conversions in Python__


> You can turn expressions/variables from one type into other types. This is done by so called type conversions. 

In general, conversions to "larger" data types can be done without any loss of information.
Conversions to "smaller" data types will usually lead to a loss of information (e.g. precision). If you, for example, convert a floating point number to an integer, the decimal places simply get cut off (= loss of precision). 
In statically typed languages such as Java, the general type hierarchy is:
- byte $\subset$ short = char $\subset$ int $\subset$ long 
- float $\subset$ double 

Python itself only distinguishes ``int`` (whole numbers of arbitrary size) and ``float`` here.

Such type castings can either be done *implicitly* or *explicitly*. 

- **Implicit type casting**:
    - Assume you have a float ``x = 6.4`` and an integer ``y = 2``. If you want to divide them, ``z = x/y`` works without problems and gives you ``z = 3.2`` (a float): the integer is implicitly converted to a float. 
- **Explicit type casting:**
    - If you have an integer ``x = 2`` and a string ``y = "Hello u "``, the following won't work: ``z = y + x``. This is because Python does not know how to combine an integer and a string. To make it work, you would need to do ``z = y + str(x)`` = "Hello u 2". This way, the 2 is first converted into a string and then the two strings can be concatenated.
    
To check the type of a variable in Python, simply use ``type(<Variable>)``.
 

Luckily, Python "hides" most of this complexity and most of the time automatically takes care of the proper conversion of types. In other languages (like Java) you might have to do such conversions explicitly. 
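
The difference between implicit and explicit casting can be sketched as follows. The variable names `greeting`, `number` and `message` are assumptions for illustration; the values follow the examples in the text.

```python
# Implicit casting: the integer is converted to a float automatically
x = 6.4
y = 2
z = x / y
print(z, type(z))   # 3.2 <class 'float'>

# Explicit casting: str + int raises a TypeError ...
greeting = "Hello u "
number = 2
try:
    message = greeting + number
except TypeError as error:
    print("TypeError:", error)

# ... so the integer has to be converted to a string first
message = greeting + str(number)
print(message)      # Hello u 2

# int() cuts off the decimal places (truncation towards zero)
print(int(3.99))    # 3
print(int(-3.99))   # -3
```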

In [25]:
# define float
result = 1.4142

# convert float to integer
result = int(result)

# print integer
print(result)

1


In [28]:
# define string
result = "1.4142"

# convert string to float
result = float(result)

# print float
print(result)
print(type(result))

1.4142
<class 'float'>


In [29]:
# define string
sentence = "I love Python!"

# convert string to list
sentence = list(sentence)

# print list
print(sentence)

['I', ' ', 'l', 'o', 'v', 'e', ' ', 'P', 'y', 't', 'h', 'o', 'n', '!']


**Overflows**

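Data types limit the value range (in terms of memory storage). If the maximum is reached, an arithmetic overflow occurs. In Python, this only affects floats: integers can grow arbitrarily large, whereas a float that becomes too large either turns into ``inf`` or raises an ``OverflowError``.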
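
A small sketch of how Python behaves at the limits of its number types. The concrete numbers are assumptions chosen to exceed the float range; the exact text of the error message may differ between platforms.

```python
import sys

# Integers have arbitrary precision, so they do not overflow
big = 2 ** 100
print(big)                  # 1267650600228229401496703205376

# Floats have a maximum value ...
print(sys.float_info.max)

# ... beyond which multiplication yields infinity
print(1e308 * 10)           # inf

# ... and exponentiation raises an OverflowError
try:
    2.0 ** 1024
except OverflowError as error:
    print("OverflowError:", error)
```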



### __4.3 Abstract Data Types in Python__

Non-Primitive/Abstract data structures do not just store a value, but rather are a collection of different values in various formats. 


**Basic Collection: List**

Lists in Python are ordered collections of items, separated by commas. These items can be of different types, and lists have variable lengths. 
You can even have lists as items within other lists (= nested lists). Since lists are ordered, you access the elements within the list using their index position in square brackets. Note that the index starts at 0.

Additionally, lists are mutable. This means you can initialize a list with a certain content and later delete or modify the content/parts of the content. Later on you will get to know other data types where this is not possible.  

*Basic Operations*:
- Generate an empty list called l: ``l = []``
- Adding a single value to the list: ``l.append(<Value>)``
- Insert a value at a specific position x: ``l.insert(x, <Value>)``
- Remove the value from a certain position: ``l.pop(<IndexPos>)``
- Remove a certain value: ``l.remove(<Value>)``
    - Removes the first occurrence of the specified value (in case multiple are present) 
- Concatenating two lists: ``<list> + <list>``
- Extract or slice lists: ``<List>[<IndexPos>]`` or ``<List>[<Slice>]``
    - For Example: ``l = [2, 1, 4, 3]``
    - ``l[1]`` = 1
    - ``l[-1]`` = 3
    - ``l[:3]`` = [2, 1, 4]
    - ``l[1:3]`` = [1, 4]
    - ``l[-2:]`` = [4, 3]
    - ``l[:-2]`` = [2, 1]
- Change a value at a certain position: ``l[0] = 99`` results in ``l`` being [99, 1, 4, 3]
- Nested List: ``nl = [[1, "Hello"], [3.4, True, 8]]``
    - Note that the lists within lists may be of different length.
    - If you need to index them, ``nl[0]`` = [1, "Hello"] and ``nl[1]`` = [3.4, True, 8]. Therefore, if you want to access the 3.4 you need to type: ``nl[1][0]``
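- Sort lists: ``l.sort()`` sorts the list in place, so ``l`` becomes [1, 2, 3, 4] (the method itself returns ``None``)
- Reverse lists: ``l.reverse()`` reverses the list in place, so [2, 1, 4, 3] becomes [3, 4, 1, 2]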
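
Because lists are mutable, it matters whether an operation changes the list in place or returns a new one. The following sketch illustrates this; the names `original`, `alias` and `independent` are assumptions for illustration.

```python
l = [2, 1, 4, 3]

# sort() changes the list in place and returns None
result = l.sort()
print(result)             # None
print(l)                  # [1, 2, 3, 4]

# sorted() and slicing return new lists and leave the original untouched
original = [2, 1, 4, 3]
print(sorted(original))   # [1, 2, 3, 4]
print(original[::-1])     # [3, 4, 1, 2]
print(original)           # [2, 1, 4, 3]

# Assignment does not copy a list; use .copy() for an independent list
alias = original
independent = original.copy()
original[0] = 99
print(alias)              # [99, 1, 4, 3]
print(independent)        # [2, 1, 4, 3]
```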

Lists can be linear or non-linear. A linear list means that the elements are ordered sequentially and traversed one after another. In a non-linear list, one element can be connected to multiple other ones, or to none at all. The connections themselves reflect a specific relation. Examples of such non-linear lists are trees or graphs. But we will learn more about them later on this semester. Stay tuned ;)


In [70]:
# define a first list
list_1 = ["Claire", "Andri"]
print(list_1)

# append a new tutor
list_1.append("Elena")
print(list_1)

['Claire', 'Andri']
['Claire', 'Andri', 'Elena']


In [71]:
# insert at specified index position

list_1.insert(2, "Max")
print(list_1)

['Claire', 'Andri', 'Max', 'Elena']


In [72]:
# remove an element by index position
list_1.pop(2)
print(list_1)

# or by the specified element
list_1.remove("Elena")
print(list_1)

['Claire', 'Andri', 'Elena']
['Claire', 'Andri']


In [73]:
# concatenate two lists
list_2 = ["Student_2", "Student_1"]

final_list = list_1 + list_2
print(final_list)

['Claire', 'Andri', 'Student_2', 'Student_1']


In [76]:
# Sorting for strings is alphabetic
final_list.sort()
print(final_list)

# And for numbers it is numeric
list_numbers = [3, 2, 566, 39]
list_numbers.sort()
print(list_numbers)

['Andri', 'Claire', 'Student_1', 'Student_2']
[2, 3, 39, 566]


In [80]:
# Indexing and slicing
print(final_list)

# The Tutors are on index 0 and 1. When slicing, the starting index is inclusive, the ending exclusive
print(final_list[0:2])

# The students: if you want to go up until the end, you don't have to specify the end explicitly
print(final_list[2:])

# The same holds for the beginning, if you want to start at the front
print(final_list[:2])

['Andri', 'Claire', 'Student_1', 'Student_2']
['Andri', 'Claire']
['Student_1', 'Student_2']
['Andri', 'Claire']


In [86]:
# You can access lists from the back
print(final_list[-1])

# the three last entries
print(final_list[-3:])

Student_2
['Claire', 'Student_1', 'Student_2']


In [90]:
# a nested list
people = [['Andri', 'Claire'], ['Student_1', 'Student_2']]

# now, simple indexing delivers the sublists
print(people[0])
print(people[1])

# to get "Claire" we need the first sublist (index 0) and within there the second name (index 1)
print(people[0][1])

['Andri', 'Claire']
['Student_1', 'Student_2']
Claire


**Arrays**

Arrays are also mutable collections of data elements. Compared to lists, however, an array can only store one data type (e.g. only integers, only floats or only strings), and in a multidimensional array all rows must have the same length. Additionally, the array's structure/size needs to be defined when initializing it.
In the simplest case, an array is just one sequence of elements. But arrays can also be multidimensional. 

Python's standard library only offers a basic ``array`` module, so we recommend working with the NumPy package. You can install it from the command line with ``pip install numpy`` and then import it in Python with ``import numpy as np``.

While arrays are somewhat more rigid in their structure (predefined size, same type and length of elements), they offer other benefits. Arrays are optimized for numerical operations, and NumPy in particular provides vectorized (element-wise) operations.

*Operations*:

- Initialize an array: ``x = np.array([3, 6, 9])``
- Access element at certain index position: ``x[0]`` = 3
- Vectorized (element-wise) operations are possible, e.g. ``y = x/3`` = [1.0, 2.0, 3.0]
- Initialize an array of ones with size 4: ``np.ones(4)``

Arrays can also be **multidimensional**. 

For 2D-arrays, you can think of them as a table with rows and columns.
- Initialize a 2D-array of ones with three rows and four columns: ``np.ones((3, 4))``
- The above contains the same values as ``np.array([[1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.]])``

A 3D-array can be thought of as a collection of several tables of the same layout. Such an array would be initialized like

```python
np.array([[[1, 2, 3, 4],
           [5, 6, 7, 8]],
          [[1, 2, 3, 4],
           [9, 10, 11, 12]]])
```

It can be read as two tables, stored behind each other. On the "front" table (layer) we have a table of two rows and four columns with the numbers 1-8. The second table *has to have* the same shape (2 rows, 4 columns) but this time with the values 1-4 and 9-12. 
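
The array operations described in this section can be sketched as follows, assuming NumPy is installed. The variable names `ones` and `cube` are assumptions for illustration.

```python
import numpy as np

# 1D array and a vectorized (element-wise) operation
x = np.array([3, 6, 9])
print(x[0])           # 3
print(x / 3)          # [1. 2. 3.]

# 2D array of ones with 3 rows and 4 columns
ones = np.ones((3, 4))
print(ones.shape)     # (3, 4)

# 3D array: two tables with 2 rows and 4 columns each
cube = np.array([[[1, 2, 3, 4],
                  [5, 6, 7, 8]],
                 [[1, 2, 3, 4],
                  [9, 10, 11, 12]]])
print(cube.shape)     # (2, 2, 4)

# Indexing order: table, row, column
print(cube[1, 1, 0])  # 9
```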


In [None]:
# Array




**List vs. Array**

![title](source/List_vs._Array.png)


**Tuple**

Tuples are immutable data structures. Once defined, their content cannot be changed anymore. Like lists, tuples can store heterogeneous items, and they are initialized by comma-separated elements in round brackets (or without brackets at all). A use case would be, e.g., passing your data to someone else when you want to make sure that the other person can use the data (for calculations etc.) but cannot change it.

*Operations*

- Initialization:
    - ``x_tuple = 1, 2, 3, 4, 5``
    - ``y_tuple = ("t", "e",  "s", "t", [1, 2, 3])``
- Indexing and Slicing
    - ``x_tuple[3]`` = 4
    - ``y_tuple[0:3]`` = ('t', 'e', 's')
- Iterable
    - ``for x in x_tuple: print(x)`` = 1, 2, 3, 4, 5
- Modification of values won't work!
    - ``x_tuple[0] = 100`` raises a ``TypeError``
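
A brief sketch of tuple immutability, reusing ``x_tuple`` and ``y_tuple`` from the list above. The names `first` and `rest` are assumptions; the last part shows a caveat for mutable objects stored inside a tuple.

```python
x_tuple = 1, 2, 3, 4, 5

# Item assignment is not allowed for tuples
try:
    x_tuple[0] = 100
except TypeError as error:
    print("TypeError:", error)

# Tuples can be unpacked into several variables
first, *rest = x_tuple
print(first, rest)   # 1 [2, 3, 4, 5]

# The tuple itself is fixed, but a mutable object inside it can still change
y_tuple = ("t", "e", "s", "t", [1, 2, 3])
y_tuple[4].append(4)
print(y_tuple)       # ('t', 'e', 's', 't', [1, 2, 3, 4])
```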



In [92]:
# Tuples behave like lists in some cases 

colors = ('red', 'green', 'blue')
print("The first color is: " + colors[0])

print("The available colors are:")
for color in colors:
    print("- " + color)

The first color is: red
The available colors are:
- red
- green
- blue


In [93]:
# ... but they can not be changed (adding, removing, replacing values)

colors = ('red', 'green', 'blue')
colors.append('purple')

AttributeError: 'tuple' object has no attribute 'append'


**Dictionary**

Dictionaries work similarly to, e.g., a telephone book. Given a person's name, you can look up his/her phone number. Therefore, the name has to be a unique key, and the value is identified via this unique key. 

Dictionaries are mutable collections of key-value pairs (since Python 3.7 they preserve the insertion order). They are written with curly brackets ({}), the key-value pairs are separated by commas and the key is separated from the value using ":".
Note that a key can be of any immutable (hashable) type (strings, integers, even mixed types within one dictionary), and the values can themselves be arbitrarily complex (just an integer, a list, a nested list or another dictionary).

Throughout this course you will also learn how to work with JSON-files. These special file-types have such a dictionary-structure, with nested dictionaries within others... Stay tuned. 

*Operations*
- Initialization
    - ``dict_1 = {'Andri' : 123456789, 'Claire': 987654321, 'Elena': 24681012}``
    - ``dict_2 = {1: 'Hallo', 'String': [33, 22, 11], 3: {'key': 'value'}}`` (mixing key and value-types: a value can also be a list or another dictionary) 
- Accessing elements works by using square brackets and the specific key inside:
    - ``dict_1['Andri']`` = 123456789
    - ``dict_2[1]`` = 'Hallo'
    - ``dict_2['String']`` = [33, 22, 11]
    - ``dict_2[3]`` = {'key': 'value'}
- Delete elements using the key:
    - ``del dict_2['String']`` leaves dict_2 as {1: 'Hallo', 3: {'key': 'value'}}
    - ``dict_2.pop('String')`` returns [33, 22, 11], and dict_2 becomes {1: 'Hallo', 3: {'key': 'value'}}. ``.pop()`` deletes the specified key-value pair from the dictionary and returns the value of the specified key
- Get all the keys of a dictionary:
    - ``dict_1.keys()`` = dict_keys(['Andri', 'Claire', 'Elena'])
- Get all the values of a dictionary:
    - ``dict_1.values()`` = dict_values([123456789, 987654321, 24681012])
- Get all the key-value pairs of a dictionary:
    - ``dict_1.items()`` 
- Overwrite an existing key's value
    - ``dict_1['Elena'] = 100100100`` results in dict_1 being {'Andri' : 123456789, 'Claire': 987654321, 'Elena': 100100100}
- Append a new key-value pair:
    - ``dict_1['Max'] = 123123123`` Simply use a new unused key and the "="-assignment
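
As a small sketch, the following shows how to loop over the key-value pairs and how to handle keys that do not exist. It reuses ``dict_1`` from above; the use of ``get()`` with a default value is an addition not covered in the list.

```python
dict_1 = {"Andri": 123456789, "Claire": 987654321, "Elena": 24681012}

# items() yields (key, value) pairs, which can be unpacked in a loop
for name, number in dict_1.items():
    print(name, number)

# Check whether a key exists before accessing it
print("Andri" in dict_1)      # True
print("Max" in dict_1)        # False

# dict_1["Max"] would raise a KeyError; get() returns a default instead
print(dict_1.get("Max"))      # None
print(dict_1.get("Max", 0))   # 0
```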



In [108]:
# Dictionaries

dict_1 = {'Andri' : 123456789, 'Claire': 987654321, 'Elena': 24681012}
dict_2 = {1: 'Hallo', 'String': [33, 22, 11], 3: {'key': 'value'}}

# The dictionary keys
print(dict_1.keys())

# The dictionary values
print(dict_1.values())

# Andris student id
dict_1['Andri']

dict_keys(['Andri', 'Claire', 'Elena'])
dict_values([123456789, 987654321, 24681012])


123456789

In [109]:
# Remove from a dictionary

print(dict_1.pop('Elena'))
print(dict_1)

24681012
{'Andri': 123456789, 'Claire': 987654321}


In [112]:
# Add a new entry

dict_1['Elena'] = '100100100'
print(dict_1)

{'Andri': 123456789, 'Claire': 987654321, 'Elena': '100100100'}


**Sets**

Sets are collections of distinct/unique objects. This is helpful if you, e.g., want to find the unique values within a list. Sets are mutable and unordered.

*Operations*:

- Initialization
    - ``set_1 = set("test")``
    - ``set_2 = set("hello")``
    - ``print(set_1)`` = {"t", "s", "e"} Unordered output. Upper and lower case letters would be distinct (i.e. T $\neq$ t) 
- (Difference): Elements from `set_1` without those that are also in ``set_2`` 
    - ``print(set_1 - set_2)`` = {"t", "s"}
- (Union): Elements from either ``set_1`` or ``set_2`` (or both)
    - ``print(set_1 | set_2)`` = {"t", "e", "s", "h", "l", "o"}
- (Intersection): Elements that are in both sets
    - ``print(set_1 & set_2)`` = {"e"}
- (Symmetric difference): Elements that are in exactly one of the two sets
    - ``print(set_1 ^ set_2)`` = {"t", "s", "h", "l", "o"}
- (Subset): Are all elements of one set also contained in the other?
    - ``set_3 = {1, 2, 3}`` and ``set_4 = {2, 3}`` then: ``set_4 < set_3`` = True
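
A short sketch of the subset and symmetric difference operations. Sets of several numbers are written with curly brackets (or created from a list), since ``set()`` accepts only a single iterable argument. The name `unique` is an assumption; `sorted()` is used only to get a predictable output order.

```python
set_3 = {1, 2, 3}
set_4 = {2, 3}

# Subset checks: "<" is a strict subset, "<=" also allows equality
print(set_4 < set_3)            # True
print(set_4 <= set_4)           # True
print(set_4 < set_4)            # False

# Symmetric difference: elements that are in exactly one of the sets
print(sorted(set_3 ^ {3, 4}))   # [1, 2, 4]

# Unique values of a list
unique = set([2, 5, 3, 8, 3, 8, 2])
print(sorted(unique))           # [2, 3, 5, 8]

# Note: {} creates an empty dictionary, an empty set is created with set()
print(type({}))                 # <class 'dict'>
print(type(set()))              # <class 'set'>
```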

In [96]:
# unordered but unique values:

set_1 = set([2, 5, 3, 8, 3, 8, 2])
print(set_1)

{8, 2, 3, 5}


In [101]:
set_2 = set([4, 7, 9, 3, 9])
print(set_2)

# Difference
print(set_1 - set_2)

# Union
print(set_1 | set_2)

# Intersection
print(set_1 & set_2)

{9, 3, 4, 7}
{8, 2, 5}
{2, 3, 4, 5, 7, 8, 9}
{3}


**File**

The last data structure to be mentioned here are files. Of course, you need to be able to read and modify or write files (from your desktop, a database etc) within your data science code. 

*Operations*:
- Open file and store it in variable f
    - ``f = open('filename', 'w')``
    - the second argument defines the mode: 'r' read only, 'w' write (an existing file is overwritten), 'a' append, or 'r+' read and write
- Read the entire file
    - ``f.read()``
- Read a single line only
    - ``f.readline()``
- Write a string to a file
    - ``f.write('String to be written to file f.')``
- Close a file
    - ``f.close()``
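A minimal sketch of these operations. It assumes a throwaway file named `example.txt` (a hypothetical name) inside a temporary directory, so that nothing on your machine is overwritten. The `with` statement is not covered above; it is a common alternative to calling `f.close()` yourself, because it closes the file automatically when the block ends.

```python
import os
import shutil
import tempfile

# Hypothetical file inside a temporary directory, so no real file is touched
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.txt")

# 'w' creates the file (or overwrites an existing one)
f = open(path, "w")
f.write("first line\n")
f.write("second line\n")
f.close()

# 'a' appends to the end of the existing file; the file is closed automatically
with open(path, "a") as f:
    f.write("third line\n")

# 'r' opens for reading: readline() returns one line, read() everything that is left
with open(path, "r") as f:
    first = f.readline()
    rest = f.read()

print(repr(first))
print(repr(rest))

# remove the temporary directory again
shutil.rmtree(tmp_dir)
```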

### __5. Advanced Techniques__
### __5.1 Functions__

Functions are routines of commands that are executed for a set of input variables.
We already saw a couple of basic built-in functions, like `print()` or `len()`.

Of course, you can also define your own functions.
The basic syntax is

``def <Function> (<Input>):
    <Expression>
    return <Output>``
    
One example is a function that takes a number as input and returns it raised to the power of two, e.g.

``def power_of_two (x):
    y = x * x
    return y``
    
Functions
- can take none to many different Inputs
- can return none or many outputs
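A short sketch of both points. The function names `say_hello` and `min_and_max` are made up for illustration: the first takes no input and returns nothing, the second returns two outputs at once, which Python packs into a tuple.

```python
def say_hello():
    # no input and no return statement: the function returns None
    print("Hello!")


def min_and_max(numbers):
    # several outputs are separated by commas and returned together as a tuple
    return min(numbers), max(numbers)


result = say_hello()
smallest, largest = min_and_max([4, 1, 7])

print(result)             # None
print(smallest, largest)  # 1 7
```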

It is good practice to define a **docstring** (the red-colored explanation), which explains the inputs the function takes, what the function computes and what it returns.

In [119]:
# Define a greeting-function

def greet(person, location):
    
    """
    Function that greets people
    
    Parameters
    ----------
    person: String
        Name of person
        
    location: String
        Name of location
    
    Returns
    -------
    None
        The welcome message is printed, not returned
    
    """
    
    print("Hello {} in {}! Nice to see you.".format(person, location))


In [120]:
greet("Hanna", "Uni")
greet("Paul", "Cafe")

Hello Hanna in Uni! Nice to see you.
Hello Paul in Cafe! Nice to see you.


### __5.2 Loops and Control Flow__

Often, we don't want to execute all lines of code for all inputs.
Certain commands might need to be repeated, others skipped, based on certain criteria.
Control flow statements help us define such a structure.


In case of repeated executions, loops are the way to go.
In a loop, a block of operations is performed repeatedly, either a defined number of times or as long as a condition holds. There are different types of loops.

**for-Loops:**
- repeat a block of operations n times

``for <Iterator> in range(n):
            <Expression>``

`` x = 1
for i in range(10):
    x += i
``

$\to$ i runs from 0 to 9 and is added to x in every repetition. 

- for loops can also run over lists, applying the operations to every element in the list, e.g.

``for <Item> in L:
    <Expression>``
    
``l = ["a", "b", "c"]
for w in l:
    print(w)``

$\to$ iterates over every item in the list and prints it
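Both patterns written out as a runnable sketch. The list name `letters` and the helper list `collected` are additions for illustration:

```python
# Loop over a range: i takes the values 0, 1, ..., 9
x = 1
for i in range(10):
    x += i
print(x)  # 1 + (0 + 1 + ... + 9) = 46

# Loop over a list: w is each element in turn
letters = ["a", "b", "c"]
collected = []
for w in letters:
    collected.append(w.upper())
print(collected)  # ['A', 'B', 'C']
```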


In [113]:
# define ages
ages = range(14, 19)

# print ages
print(list(ages))

# for loop over ages
for age in ages:
    # check conditions and print results
    if age < 16:
        print(f'I am {age} years old and allowed to get water!')
    elif age < 18: 
        print(f'I am {age} years old and allowed to get water and beer!')
    else:
        print(f'I am {age} years old and allowed to get water, beer and spirits!')

[14, 15, 16, 17, 18]
I am 14 years old and allowed to get water!
I am 15 years old and allowed to get water!
I am 16 years old and allowed to get water and beer!
I am 17 years old and allowed to get water and beer!
I am 18 years old and allowed to get water, beer and spirits!


**while-Loops:**
- repeat a certain block of operations while a condition is True (until the condition is False)

``while <Condition>:
    <Expression>``
    
``x = 2
while x < 1000:
    x **= 2``

$\to$ x starts at 2. While it is smaller than 1000, x is squared in every repetition (2, 4, 16, 256, 65536).

- you can also call a break statement from within the operations block:

``while <Condition>:
    <Expression>
    if <Condition>:
        break``
        
``x = 2
while True:
    x **= 2
    if x > 1000:
        break``

$\to$ does the same as the loop above. Here, however, the while condition is always True. Inside the loop, an if-clause checks whether x is larger than 1000. Once that is the case, ``break`` is called and the loop stops.
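To compare the two versions, the following sketch records every intermediate value. The lists `steps_1` and `steps_2` and the second variable `y` are added for illustration only:

```python
# Version 1: the condition is checked before every repetition
x = 2
steps_1 = []
while x < 1000:
    x **= 2
    steps_1.append(x)

# Version 2: endless loop that is left via break
y = 2
steps_2 = []
while True:
    y **= 2
    steps_2.append(y)
    if y > 1000:
        break

print(steps_1)  # [4, 16, 256, 65536]
print(steps_2)  # [4, 16, 256, 65536]
```

The two versions differ only in an edge case: if the start value were already 1000 or larger, the first loop would not run at all, while the second would still square the value once before the check.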




In [31]:
# define starting age
age = 0

# while loop over ages
while age < 19:
    # check conditions and print results
    if age < 16:
        print("I am {} years old and allowed to get water!".format(age))
    elif age < 18: 
        print("I am {} years old and allowed to get water and beer!".format(age))
    else:
        print("I am {} years old and allowed to get water, beer and spirits!".format(age))
    # increment age
    age += 1

I am 0 years old and allowed to get water!
I am 1 years old and allowed to get water!
I am 2 years old and allowed to get water!
I am 3 years old and allowed to get water!
I am 4 years old and allowed to get water!
I am 5 years old and allowed to get water!
I am 6 years old and allowed to get water!
I am 7 years old and allowed to get water!
I am 8 years old and allowed to get water!
I am 9 years old and allowed to get water!
I am 10 years old and allowed to get water!
I am 11 years old and allowed to get water!
I am 12 years old and allowed to get water!
I am 13 years old and allowed to get water!
I am 14 years old and allowed to get water!
I am 15 years old and allowed to get water!
I am 16 years old and allowed to get water and beer!
I am 17 years old and allowed to get water and beer!
I am 18 years old and allowed to get water, beer and spirits!


**Conditional Statements:**

Besides loops, we can also use conditional statements.
The most basic one is the if-statement.
Its block is only executed if the condition evaluates to `True`.

If the if-condition evaluates to `False`, the else-block is executed instead.
If you need to check further criteria, `elif` is the (optional) command for it.

- when defining conditional statements, you use the following syntax:

``if <Condition>:
    <Expression>
elif <Condition>: # else if (not necessary)
    <Expression>
else:
    <Expression>``
    
Example: you can combine several ``elif`` statements. These are checked sequentially, one after another. The first condition that is met determines the operations that are performed. The ``return`` statement within each ``elif``-block ensures that the function exits as soon as that case applies. 

![title](source/Conditions_Example_Factorial.png)
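The figure is not reproduced in text form here. As a rough sketch of the pattern it describes (several conditions checked in order, each branch ending in `return`), a factorial function could look like this. The function name `factorial` and the exact branches are assumptions, not a copy of the figure:

```python
def factorial(n):
    """Return n! for a non-negative integer n, or None for invalid input."""
    if not isinstance(n, int):
        return None          # not an integer: leave the function immediately
    elif n < 0:
        return None          # negative numbers have no factorial
    elif n <= 1:
        return 1             # 0! and 1! are both 1
    else:
        result = 1
        for i in range(2, n + 1):
            result *= i      # multiply 2 * 3 * ... * n
        return result


print(factorial(5))   # 120
print(factorial(-3))  # None
```

Only the first matching branch is executed, so the order of the conditions matters: the type check has to come before the comparisons with numbers.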

In [30]:
# define age
age = 17

# check conditions and print results
if age < 16:
    print("I am allowed to get water!")
elif age < 18: 
    print("I am allowed to get water and beer!")
else:
    print("I am allowed to get water, beer and spirits!")

I am allowed to get water and beer!


### __5.3 Methods__

Methods are similar to functions, but they are bound to certain types of objects and can only be applied to them.
For example, there are methods that only work on strings (the object) and do something (e.g. convert to upper case) to every character within the string:

In [1]:
# initialize string
sentence = "I love Python!"

# convert string to upper case
sentence = sentence.upper()

# print string
print(sentence)

I LOVE PYTHON!


In [13]:
# initialize string
sentence = "I love Python!"

# replace values by other values
sentence = sentence.replace("Python", "R").replace("love", "hate")

# print string
print(sentence)

I hate R!


There are also methods that only work on lists:

In [14]:
# initialize list
points = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# add element to list
points.append(11)

# print list
print(points)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]


To list all methods (and other attributes) available for a certain object, you can apply the function `dir()` to the variable:

In [19]:
# define a list
points_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# print the methods for lists
print("List Methods: ", dir(points_list))

# define a string
string = "I love Python!"

# print the methods of strings
print("String Methods: ", dir(string))

List Methods:  ['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
String Methods:  ['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr_
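Most names in these listings start and end with double underscores; these are special methods that Python calls internally. A small sketch, with the list name `public_list_methods` chosen for illustration, that keeps only the names without a leading underscore:

```python
# define a list
points_list = [0, 1, 2, 3]

# collect only the names that do not start with an underscore
public_list_methods = []
for name in dir(points_list):
    if not name.startswith("_"):
        public_list_methods.append(name)

# print the remaining method names, e.g. 'append', 'pop', 'sort'
print(public_list_methods)
```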

### __5.4 List Comprehension__

List comprehensions are a useful and elegant technique if you need to create a new list based on another list. 
Imagine you have a list of integers and want to raise each of them to the power of two. 
Or you have a list of fruits and want to keep only the fruits that contain the letter "a". 
You could write such code using loops and conditions, as learned above. 
List comprehensions do exactly the same, but in a more compact way.

The general syntax is:
- ``[<Expression> for <Item> in <List> (if <Condition>)]``

Instead of ``<List>`` you often have range-objects, e.g.:
- ``[<Expression> for <Item> in range(n, m) (if <Condition>)]``

Examples would be:
- ``[2 ** x for x in range(11)]`` creates a list with the values $2^0$, $2^1$, $2^2$, ... up to $2^{10}$ $\to$ [1, 2, 4, 8, 16, ..., 512, 1024]
- ``[x for x in range(11) if x%2 == 0]`` takes all the even numbers between 0 and 10, because these have a modulo 2 of zero (i.e. no remainder after the division by two).
- ``[x for x in "abcd"]`` creates a list of the single letters ["a", "b", "c", "d"]
- ``[x*2 for x in [1, 2, 3, 4]]`` multiplies each integer by two, [2, 4, 6, 8]
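To make the equivalence with loops concrete, the following sketch builds the same list twice, once with a loop and a condition and once with a list comprehension. The list `fruits` is made up for illustration:

```python
fruits = ["apple", "cherry", "banana", "kiwi", "mango"]

# Loop version: keep all fruits that contain the letter "a"
with_a_loop = []
for fruit in fruits:
    if "a" in fruit:
        with_a_loop.append(fruit)

# List comprehension version: same result in one line
with_a_comprehension = [fruit for fruit in fruits if "a" in fruit]

# Expression and condition can be combined: squares of the even numbers 0..10
even_squares = [x ** 2 for x in range(11) if x % 2 == 0]

print(with_a_comprehension)  # ['apple', 'banana', 'mango']
print(even_squares)          # [0, 4, 16, 36, 64, 100]
```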

In [121]:
# define list of names
names = ["Ebo", "Jia", "Erik", "Ji-Woo", "Anna", "Eli", "Rümeysa"]

# use list comprehension to create a list consisting only of names that start with "E"
selection = [x for x in names if x.startswith("E")]

# print list with E-names
print(selection)

['Ebo', 'Erik', 'Eli']


In [122]:
# use list comprehension to create a new list where every item of the old names list is replaced with its uppercase equivalent
upper = [x.upper() for x in names]

# print list with uppercase names
print(upper)

['EBO', 'JIA', 'ERIK', 'JI-WOO', 'ANNA', 'ELI', 'RÜMEYSA']


<div class="alert alert-block alert-info">
    <b>Exercise 5 </b>: Using list comprehension, create a list with all integers from 1 to 8. 
</div>

In [14]:
# Let's get active!




<div class="alert alert-block alert-info">
    <b>Exercise 6 </b>: Take every value in that list to the power of two. 
</div>

In [13]:
# Let's get active!





### __5.5 Lambda Functions__

Lambda functions are small, anonymous functions that can take multiple inputs but consist of only one expression.
They are anonymous because they are unnamed, unlike functions defined with ``def my_function(<Input>)``.

The basic syntax is:
- ``lambda <Input>: <Expression>``

You could, for example, write a function that adds 1 to the input x:
- ``lambda x: x + 1``
- if you wrap this anonymous function in parentheses and put an input in parentheses behind it, the function is applied immediately, e.g. ``(lambda x: x + 1)(2)`` evaluates ``2 + 1`` = 3

Lambda functions can also take multiple arguments. These are listed without parentheses, separated by commas. When calling the function, however, the inputs have to be passed in parentheses.
- ``lambda a, b : a * b``
- ``(lambda a, b : a * b)(5, 6)`` evaluates ``5 * 6`` = 30

The power of lambda functions shows best when they are passed to other functions as small, unnamed helper functions.

You can use them together with the ``map()`` function. ``map(<Operation>, <List>)`` applies an operation to every item of a list.
- ``list(map(lambda x: x**2, range(11)))``: ``map`` applies the lambda function to every element of ``range(11)``. This means every integer 0, 1, ..., 10 is passed once as x to the lambda function and raised to the power of two. Finally, ``list()`` makes a list out of the result. This could equivalently be done using a list comprehension. 

Combined with ``filter()``, you can filter items from a list based on a lambda function:
- ``list(filter(lambda x: x < 50, range(100)))`` returns a list of all numbers between 0 and 99 that are smaller than 50 (the condition defined by the lambda function).

[Source](https://realpython.com/python-lambda/)
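The `map()` and `filter()` examples can be compared with their list comprehension counterparts. A minimal sketch; the variable names are chosen for illustration:

```python
# immediately applied lambda functions
plus_one = (lambda x: x + 1)(2)           # 3
product = (lambda a, b: a * b)(5, 6)      # 30

# map() with a lambda function vs. list comprehension
squares_map = list(map(lambda x: x ** 2, range(11)))
squares_comprehension = [x ** 2 for x in range(11)]

# filter() with a lambda function vs. list comprehension
small_filter = list(filter(lambda x: x < 50, range(100)))
small_comprehension = [x for x in range(100) if x < 50]

print(plus_one, product)
print(squares_map == squares_comprehension)  # True
print(small_filter == small_comprehension)   # True
```

Which form to use is mostly a matter of readability; both produce the same lists here.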

In [123]:
# define list of dates
dates = ["13/02/2017", "28/07/2016", "02/04/2013", "30/09/2018", "01/05/2018"]

# sort by default
dates = sorted(dates)

# print sorted dates
print(dates)

['01/05/2018', '02/04/2013', '13/02/2017', '28/07/2016', '30/09/2018']


The problem with this sorting is that `sorted()` compares the strings character by character from the left, so the dates end up sorted by day.
If we want to sort by year, lambda functions can help:

In [125]:
# define list of dates
dates = ["13/02/2017", "28/07/2016", "02/04/2013", "30/09/2018", "01/05/2018"]

# sort by year
dates = sorted(dates, key=lambda x: x.split('/')[-1])

# here the lambda function specifies that we split at every slash 
# and take the last element of the resulting list, meaning the year.
# The sorting is then done based on these years.
# The result, however, still contains the original, unsplit strings.

# print dates sorted by year
print(dates)

['02/04/2013', '28/07/2016', '13/02/2017', '30/09/2018', '01/05/2018']
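Sorting by year alone leaves dates within the same year in their original order (Python's sort is stable), which is why `30/09/2018` still comes before `01/05/2018`. As a possible extension, not part of the original example, the key can return year, month and day together so that the list becomes fully chronological. The variable name `chronological` is chosen for illustration:

```python
# define list of dates
dates = ["13/02/2017", "28/07/2016", "02/04/2013", "30/09/2018", "01/05/2018"]

# split "dd/mm/yyyy" at the slashes and reverse the parts to [yyyy, mm, dd];
# because all parts are zero-padded, comparing these lists gives chronological order
chronological = sorted(dates, key=lambda x: x.split("/")[::-1])

# print dates sorted by year, then month, then day
print(chronological)
# ['02/04/2013', '28/07/2016', '13/02/2017', '01/05/2018', '30/09/2018']
```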


In [126]:
# Another example: filtering

# define list of numbers
numbers = [3, 5, 16, 4, 26, 35, 7, 14]

# filter list to only keep even numbers (where the modulo is zero, no remainder)
filtered = list(filter(lambda x: (x % 2 == 0), numbers))

# print list with even numbers
print(filtered)

[16, 4, 26, 14]


<div class="alert alert-block alert-info">
    <b>Exercise 7 </b>: Create a list with all odd integers from 1 to 30 (using lambda functions). 
</div>

In [5]:
# Let's get active

odd_numbers  = list(filter(lambda x: (x % 2 == 1),range(1,31)))
odd_numbers


[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29]

<div class="alert alert-block alert-info">
    <b>Exercise 8 </b>: Create a lambda expression that filters and returns values > 20.
</div>

In [7]:
# Let's get active

larger_20 = list(filter(lambda x: (x > 20),range(1,31)))
larger_20


[21, 22, 23, 24, 25, 26, 27, 28, 29, 30]

<div class="alert alert-block alert-info">
    <b>Exercise 9 </b>: Create a lambda expression that performs integer division by 3.
</div>

In [16]:
# Let's get active


