# Introduction to Programming in Python
# Session 1-2: Data Types and Methods

Jack Krüger, Sebastian Staab

WS 20/21

In this session we will explore the fundamental data types that exist in Python. We will also look at how to convert between them. Furthermore, by doing so we will see how basic **functions** are executed, how information is stored in **variables** and how code is structured with **conditional** and **control flow statements**.

## 1.6 Data Types

In Python, every value has a certain **data type**. In fact, every data type is a **class** and every **value** is an **instance** of one of these classes. When we assign a value to a variable, we do not need to explicitly declare the data type, because Python determines it from the value. This feature is known as **dynamic typing**.
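
As a small illustration of dynamic typing, the following sketch binds the same variable name first to a whole number and then to a text. The variable names `value`, `first_type` and `second_type` are only chosen for this example.

```python
# bind the name to an integer
value = 5
first_type = type(value)
print(first_type)   # <class 'int'>

# bind the same name to a string, no declaration needed
value = "five"
second_type = type(value)
print(second_type)  # <class 'str'>
```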

We will discuss the **standard data types** in Python step by step. But first let us get an **overview** of them all:

| Data Type | Description |
| -------- | ------- |
| `Integer` | integer number |
| `Float` | floating point number |
| `Boolean` | truth value (either true or false) |
| `String` | text |
| `Array` | mutable sequence of values with same type |
| `Tuple` | immutable sequence of values of any types |
| `List` | mutable sequence of values of any types |
| `Dictionary` | associative mapping with keys and values |
| `Set` | unordered set of distinct values |

### Integer and Float

We start with the **numerical types** because we have already got to know them.

The first is `Integer` and can hold **whole numbers**, **positive** or **negative**. `Integer` cannot hold decimals. In order to use **decimals**, a `Float` must be used. `Float` can also hold positive and negative numbers.

Let us try this with an **example**. To verify the **type** of any object, we use the function `type()`.

In [1]:
# define integer number
base = 5

# check type of integer number
print(type(base))

<class 'int'>


In [31]:
# define decimal number
result = 1.4142

# check type of decimal number
print(type(result))

<class 'float'>


### Boolean

The next data type is of great importance in programming, as we will see with the **conditional** and **control flow statements** today. `Boolean` can hold **truth values**, which are either `True` or `False`. In a numerical context they behave like `1` and `0`.

Let us initialize our **first boolean**. To verify the **type** of any object, we use the function `type()`.

In [33]:
# initialize boolean
difficult = False

# check type of boolean
print(type(difficult))

<class 'bool'>


In [34]:
# initialize boolean
understood = True

# check type of boolean
print(type(understood))

<class 'bool'>
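
To make the remark about the numerical context concrete, here is a minimal sketch that uses truth values in a calculation. The variable `total` is introduced only for this illustration.

```python
# initialize booleans
understood = True
difficult = False

# in arithmetic, True counts as 1 and False as 0
total = understood + understood + difficult
print(total)                  # 2

# True compares equal to 1, and bool is a special kind of int
print(True == 1)              # True
print(isinstance(True, int))  # True
```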


### String

We have already seen the next data type too, as we printed text with the function `print()`. In general, **text** can be held in `String`, with the text surrounded either by **single quotation marks** or **double quotation marks**. Accordingly `"Hello world"` is the same as `'Hello world'`. To write a `String` over **multiple lines**, you can use **three quotation marks**.

Let us write some **first texts**. To **display** the texts, we use the function `print()`. To verify the **type** of any object, we use the function `type()`.

In [35]:
# initialize string
sentence = "I love Python!"

# print string
print(sentence)

# check type of string
print(type(sentence))

I love Python!
<class 'str'>


In [39]:
# initialize multiline string
sentences = """I love Python!
This is very easy!"""

# print multiline string
print(sentences)

# check type of multiline string
print(type(sentences))

I love Python!
This is very easy!
<class 'str'>


Because a `String` is a **sequence** of **characters**, the **individual characters** can also be accessed using **indexing** and **slicing**. Each letter, number, whitespace or symbol gets its own index. With a **positive index**, counting starts at `0` from the **beginning** of the `String`. With a **negative index**, counting starts at `-1` from the **end** of the `String`, which is especially helpful if you want to access its **ending**.

Let us have a look at the **index breakdown** of a string.

| Character | I | | l | o| v| e |  | P | y | t | h | o | n | !   
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
| Positive Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13
| Negative Index | -14 | -13 | -12 | -11 | -10 | -9 | -8 | -7 | -6 | -5 | -4 | -3 | -2 | -1

To access a **single character** within a `String`, we simply take its index and write it in **square brackets** behind the **variable name** where the `String` is stored. 

Let us try to get the first **letter `o`** out of the previous `String`, once with a positive and once with a negative index.

In [12]:
# initialize string
sentence = "I love Python!"

# print character with positive index
print(sentence[3])

# print character with negative index
print(sentence[-11])

o
o


To access **multiple** (but not all) **characters** in a `String`, you can use the same indexes to make **slices**. With slices, a **range** of **characters** is taken from the actual `String` based on their index numbers. Note that the **first character** at the slice start is **inclusive** and the **last character** at the slice end is **exclusive**. For slicing, the slice start and slice end are again written in **square brackets** behind the **variable name**, where both are separated by a **colon**. If you omit the slice start or the slice end but keep the colon, the slice simply runs from the beginning or to the end of the `String`.

You can do the same with **negative indexes**. The slice still runs from left to right, with the start **inclusive** and the end **exclusive**; only the counting of the positions starts from the end. For example, `sentence[-7:-1]` yields `Python`. We will not go into detail here, just try it yourself.

Let us try to extract **some parts** of the previous `String`.

In [13]:
# initialize string
sentence = "I love Python!"

# extract inside characters
print(sentence[7:13])

# extract beginning characters
print(sentence[:7])

# extract ending characters
print(sentence[7:])

Python
I love 
Python!
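
Slicing with negative indexes can be sketched in the same way. The names `word`, `last_char` and `without_last` are only used for this illustration.

```python
# initialize string
sentence = "I love Python!"

# extract inside characters with negative indexes
word = sentence[-7:-1]
print(word)          # Python

# extract only the last character
last_char = sentence[-1:]
print(last_char)     # !

# extract everything except the last character
without_last = sentence[:-1]
print(without_last)  # I love Python
```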


<div class="alert alert-block alert-info">
    <b>Exercise</b>: Create 3 new objects. <b>object_1</b> should be of type <b>int</b>, <b>object_2</b> should be of type <b>bool</b>, and <b>object_3</b> should be of type <b>str</b>.
</div>

<div class="alert alert-block alert-info">
    <b>Exercise</b>: Check the type of each of your new objects.
</div>

### Array

In addition to the primitive data types that we have dealt with so far, there are also data types which are rather a collection of these. Traditionally, the `Array` is the first **non-primitive data type** that is dealt with. In general, an `Array` in Python is a compact way of **collecting basic data types**, where all the entries must be of the **same data type**. However, unlike in other programming languages, `Array` is not widely used in Python and is not a built-in data type. To work with it you need to `import` an **additional module or library**.

In general, when people talk of an `Array` in Python, they are actually referring to `List`. However, when we work with the `numpy` library, you will see that there are **fundamental differences** between them. But at this point, we will first discuss what a `List` is.
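
As one possible illustration of the same-type requirement, the following sketch uses the `array` module from the Python standard library. This module is an assumption of this example and is not the `numpy` array used later in the course; the variable `rejected` is only introduced to record what happens.

```python
from array import array

# define an array of integers (type code "i" stands for signed int)
numbers = array("i", [0, 1, 2, 3])

# appending another integer works
numbers.append(4)
print(numbers)  # array('i', [0, 1, 2, 3, 4])

# appending a float is rejected because all entries must be integers
rejected = False
try:
    numbers.append(1.5)
except TypeError as error:
    rejected = True
    print("TypeError:", error)
```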

### Tuple

But before we actually get to `List`, we will discuss another data type that exists in Python. `Tuple` is another **standard sequence data type**. The main difference between `Tuple` and `List` is that `Tuple` is **immutable**, which means once defined you cannot delete, add or edit any values inside it. Due to this property, `Tuple` is mainly suitable for collections which are fixed and will not change. 

`Tuple` is typed with **round brackets**, `(` and `)`, and its **elements** are separated by a comma `,`. To access certain **values** inside your `Tuple`, you can use their **indexes** again. 

Let us create our first `Tuple`. To verify the **type** of any object, we use the function `type()`.

In [3]:
# define tuple
points = (2, 1, 8, 3, 4, 5, 6, 12, 8, 0, 10)

# check type of tuple
print(type(points))

<class 'tuple'>
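
The following sketch shows both properties described above: elements of a `Tuple` can be read with indexes and slices, but not overwritten. The names `first`, `first_three` and `changed` are only used for this illustration.

```python
# define tuple
points = (2, 1, 8, 3, 4, 5, 6, 12, 8, 0, 10)

# read elements with an index and a slice
first = points[0]
first_three = points[:3]
print(first)        # 2
print(first_three)  # (2, 1, 8)

# try to overwrite an element, which raises a TypeError
changed = True
try:
    points[0] = 99
except TypeError as error:
    changed = False
    print("TypeError:", error)
```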


### List

A `List` is a data structure that contains **multiple values** in an **ordered sequence**. A `List` is **mutable**, which means that you can change its content without changing its identity. You can recognize a `List` by its **square brackets** `[` and `]` that hold **elements**, separated by a **comma** `,`. `List` is built into Python: you do not need to import it separately.

Let us create our first list. To verify the **type** of any object, we use the function `type()`.

In [50]:
# define list of integers
points = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# check type of list with integers
print(type(points))

<class 'list'>


A `List` can contain **elements** of **different data types**, including duplicates and even other `List` objects.

Let us look at a **worst-case example** to illustrate what is **possible in principle**.

In [9]:
# define list with different data types
messy_points = [False, 1, [2, 3, 4, 5, [6, 7, 8, 9]], "ten"]

# check type of list with different data types
print(type(messy_points))

<class 'list'>


Whenever we want to access an individual **element** of a `List` we can do so by typing the **list name** and the **index** of the element in **square brackets**. And if we want to access **multiple elements** in a `List`, then it also remains the same, we can select with **slicing** a start and stop index. Again, indexes start at zero, and may be negative. 

In [52]:
# define list
points = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# extract one element
print(points[-1])

# extract multiple elements
print(points[:-1])

10
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [21]:
messy_points = [False, 1, [2, 3, 4, 5, [6, 7, 8, 9]], "ten"]

<div class="alert alert-block alert-info">
    <b>Exercise</b>: How many elements are in the list <b>messy_points</b>? What are the data types of each of these elements?
</div>

In [19]:
len(messy_points)

4

In [20]:
print(type(messy_points[0]))
print(type(messy_points[1]))
print(type(messy_points[2]))
print(type(messy_points[3]))

<class 'bool'>
<class 'int'>
<class 'list'>
<class 'str'>


In contrast to a `Tuple`, the **elements** in a `List` can also be **changed afterwards**. We will learn how to add and delete elements later in the methods section. But we can already discuss here how **elements** inside a `List` can be **changed**. For this, the elements which are supposed to be changed are addressed with their **index** or **slice** and then assigned **new values**.

In [53]:
# define list
points = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# overwrite element
points[10] = 11
points[9] = 8

# print list
print(points)

# modify elements
points[9:] = [9, 10]

# print list
print(points)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 8, 11]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


Another characteristic of `List` is that it cannot simply be copied from variable to variable. Only the **reference** to where the `List` is saved is stored in the **variable**. For example, if you assign `list1` to `list2`, both names refer to the same `List`, so all changes made through `list1` will also show up in `list2`, and vice versa. To actually create a new `list2`, you can pass the original `list1` into the function `list()` to **force** a new `List`.

Let us take a look at how `list()` avoids this **common problem** with **references**.

In [59]:
# define list
points = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# create new list
new_points = list(points)

# modify new list
new_points[10] = 11

# print original list
print(points)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
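
For comparison, here is a minimal sketch of the reference problem itself, next to the copy made with `list()`. The names `same_points` and `copied_points` are only used for this illustration.

```python
# define list
points = [0, 1, 2]

# assign the list to a second name (no copy is made)
same_points = points
same_points[0] = 99

# the original list has changed too
print(points)                 # [99, 1, 2]
print(same_points is points)  # True

# create an independent copy and modify it
copied_points = list(points)
copied_points[0] = 0

# the original list stays as it was
print(points)                 # [99, 1, 2]
print(copied_points)          # [0, 1, 2]
```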


### Dictionary

Like a `List`, a `Dictionary` is a **collection** of **many values**. Unlike `List`, which is indexed by a range of numbers, `Dictionary` is indexed by keys. **Keys** for `Dictionary` can use many different data types, not just integers. It is best to think of a dictionary as a **collection** of **key-value-pairs**, with the requirement that the **keys** are **unique** (within one dictionary). Since Python 3.7 a `Dictionary` keeps its pairs in the order in which they were inserted, but the values are still accessed by key, not by position. A pair of **curly brackets** `{}` creates an empty `Dictionary`. In them you can add as many **key-value-pairs** as you like, each written in the notation `key: value`, and separated by **commas** `,`.

`Dictionary` is exactly what you need if you want to implement something similar to a **telephone book**. None of the data structures that you have seen before are suitable for a telephone book.

Let us create our first `Dictionary`. To verify the **type** of any object, we use the function `type()`.

In [60]:
# define dictionary
happy = {"no": [0, 1, 2, 3, 4], "yes": [5, 6, 7, 8, 9, 10]}

# check type of dictionary
print(type(happy))

<class 'dict'>


If we want to **access** a certain **value** inside a `Dictionary`, then we can access it by typing the **dictionary name** followed by the corresponding **key** inside **square brackets**. 

Let us try to get the **value** in a `Dictionary` based on its **key**.

In [61]:
# define dictionary
happy = {"no": [0, 1, 2, 3, 4], "yes": [5, 6, 7, 8, 9, 10]}

# print value
print(happy["yes"])

[5, 6, 7, 8, 9, 10]


Just like in a `List`, you can change and add elements in a `Dictionary` afterwards. To change the value of an **existing key**, the position of the **key** in the `Dictionary` can be **overwritten**. We have already seen how this works with a `List`. In addition, **new keys** can be added by simply adding a new key to the dictionary using the **key** as **index** and assigning it to the corresponding **value**. 

Let us try to **change** an **existing dictionary**. 

In [64]:
# define dictionary
happy = {"no": [0, 1, 2, 3, 4], "yes": [5, 6, 7, 8, 9, 10]}

# modify existing key-value-pair
happy["yes"] = [8, 9, 10]

# add new key-value-pair
happy["not quite"] = [5, 6, 7, 11]

# print dictionary
print(happy)

{'no': [0, 1, 2, 3, 4], 'yes': [8, 9, 10], 'not quite': [5, 6, 7, 11]}


As with a `List`, a `Dictionary` cannot simply be copied from variable to variable. The reason is that only a **reference** is stored in the **variable**. To actually create a new `Dictionary`, you can pass the original `Dictionary` into the function `dict()` to **force** a new `Dictionary`.

Let us demonstrate how to make an **independent copy** of a `Dictionary`. 

In [None]:
# define dictionary
happy = {"no": [0, 1, 2, 3, 4], "yes": [5, 6, 7, 8, 9, 10]}

# initialize copy from the original dictionary
new_happy = dict(happy)

# modify copy
new_happy["yes"] = [8, 9, 10]

# print original dictionary
print(happy)

### Set

A `Set` is an **unordered** and **mutable collection** of **distinct** (unique) **elements**. It is useful to create something like a `List` that can only hold unique values. This is particularly helpful when going through a huge dataset. `Set` objects also support mathematical operations like union `|`, intersection `&`, difference `-`, and symmetric difference `^`. Within a pair of **curly brackets** `{` and `}`, you can add as many **elements** to the `Set` as you like, each separated by a **comma** `,`. 

Note, however, that if you want to create an **empty** `Set`, you do **not** use the **curly brackets** because they will create a dictionary, but you rather call the function `set()` explicitly.

In contrast to a `List` or `Tuple`, a `Set` **cannot** be accessed with **indexing** or **slicing** because it is an unordered data type. A `Set` can still be edited, but only with methods, which we will introduce later.

Let us create a **first set** and show that the elements are **deduplicated** automatically.

In [10]:
points = {0,1,2,3}

In [12]:
type(points)

set

In [65]:
# define set
points = set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10])

# print set
print(points)

# check type of set
print(type(points))

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
<class 'set'>
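
The mathematical operations mentioned above, and the special case of the empty `Set`, can be sketched as follows. The two sets and the result names are only chosen for this illustration.

```python
# define sets
happy_no = {0, 1, 2, 3, 4}
even = {0, 2, 4, 6, 8, 10}

# union: elements that are in at least one of the sets
union = happy_no | even

# intersection: elements that are in both sets
intersection = happy_no & even

# difference: elements of the first set that are not in the second
difference = happy_no - even

# symmetric difference: elements that are in exactly one of the sets
symmetric = happy_no ^ even

print(union, intersection, difference, symmetric)

# empty curly brackets create a dictionary, set() creates an empty set
print(type({}))     # <class 'dict'>
print(type(set()))  # <class 'set'>
```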


<div class="alert alert-block alert-info">
    <b>Exercise</b>: A small zoo currently only has 5 animals - a lion, a giraffe, an elephant, a monkey, and a snake. Store this info in the most appropriate data type.
</div>

In [13]:
animals = ["Lion", "Giraffe", "Elephant", "Monkey", "Snake"]
print(animals)

['Lion', 'Giraffe', 'Elephant', 'Monkey', 'Snake']


<div class="alert alert-block alert-info">
    <b>Exercise</b>: Suppose you want to be able to find out students' current assignment scores based on their student ID. Create student Hans, whose student ID is 01/99999 and assignment scores are 9, 8, 10 points. Choose the most appropriate data type. 
</div>

In [14]:
Hans = {"01/99999": [9,8,10]}
print(Hans["01/99999"])

[9, 8, 10]


## 1.7 Type Conversion

Now that we can determine the most appropriate data type for our use case, nothing can go wrong anymore, right?

Not quite. What we do not know yet is how the **different data types** can be **combined**, e.g. in calculations. Which data types can be combined with which others is mostly a matter of **experience**. But luckily Python offers some support here: **implicit type conversion**. When operands of different numerical types appear in an expression, Python **automatically converts** them to a **common data type** that can represent both, for example from `Integer` to `Float`.
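
A minimal sketch of implicit type conversion, and of a case where Python does not convert automatically. The names `total`, `count` and `failed` are only used for this illustration.

```python
# integer + float gives a float
total = 5 + 1.4142
print(type(total))  # <class 'float'>

# boolean + integer gives an integer
count = True + 5
print(count)        # 6
print(type(count))  # <class 'int'>

# string + integer is not converted implicitly and raises a TypeError
failed = False
try:
    "5" + 5
except TypeError as error:
    failed = True
    print("TypeError:", error)
```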

For all other cases, there is the **explicit type conversion**: Using the **functions** for the different **data types**, certain types can be converted into others. 

Here is a brief **overview** of which **functions** can be called to explicitly convert to the corresponding data type, and with which **data types** it is compatible.

| Function | Compatible Data Types |
| -------- | ------- |
| `bool()` | `Boolean`, `Integer`, `Float`, `String`, `Tuple`, `List`, `Dictionary`, `Set` |
| `float()` | `Boolean`, `Integer`, `Float`, `String` (if the text represents a number) |
| `int()` | `Boolean`, `Integer`, `Float`, `String` (if the text represents a whole number) |
| `str()` | `Boolean`, `Integer`, `Float`, `String`, `Tuple`, `List`, `Dictionary`, `Set` |
| `tuple()` | `String`, `Tuple`, `List`, `Dictionary`, `Set` |
| `list()` | `String`, `Tuple`, `List`, `Dictionary`, `Set` |
| `dict()` | `Dictionary`, or a sequence of key-value pairs |
| `set()` | `String`, `Tuple`, `List`, `Dictionary`, `Set` |

Let us do a couple of **examples** with **explicit type conversion**. 

In [74]:
# define float
result = 1.4142

# convert float to integer
result = int(result)

# print integer
print(result)

1


In [75]:
# define string
result = "1.4142"

# convert string to float
result = float(result)

# print float
print(result)

1.4142


In [76]:
# define string
sentence = "I love Python!"

# convert string to list
sentence = list(sentence)

# print list
print(sentence)

['I', ' ', 'l', 'o', 'v', 'e', ' ', 'P', 'y', 't', 'h', 'o', 'n', '!']


In [77]:
# define dictionary
happy = {"no": [0, 1, 2, 3, 4], "yes": [5, 6, 7, 8, 9, 10]}

# convert dictionary to list
happy = list(happy)

# print list
print(happy)

['no', 'yes']
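
Not every explicit conversion succeeds or behaves as one might first expect. The following sketch collects a few cases worth trying; the names `failed` and `whole` are only used for this illustration.

```python
# a string with decimals cannot be converted to an integer directly
text = "1.4142"
failed = False
try:
    int(text)
except ValueError as error:
    failed = True
    print("ValueError:", error)

# converting via float works; int() cuts off the decimals
whole = int(float(text))
print(whole)      # 1
print(int(1.99))  # 1

# bool() is False for empty values and True for non-empty ones
print(bool(""))       # False
print(bool("False"))  # True
print(bool([]))       # False
print(bool([0]))      # True
```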


<div class="alert alert-block alert-info">
    <b>Exercise</b>: Change the type of one of the 3 objects you created earlier. <b>object_1</b> should be changed to type <b>str</b>. Once you have done this, check that you were successful, and then change it back to type <b>int</b>.
</div>

## 1.8 Methods

Besides functions, there are also methods in Python. Unlike functions, **methods** are associated with **certain objects**. A method is called on a specific object, usually works on the data of that object, and then either returns a result or modifies the object itself.

The **syntax** of a method call is the **name** of the **instance object** followed by the respective **method name**, separated by a **dot** `.`, and then the arguments in round brackets. Hopefully, the concept of methods will become clear with the help of the upcoming examples.

Let us try out some **simple methods**.

In [83]:
# initialize string
sentence = "I love Python!"

# convert string to upper case
sentence = sentence.upper()

# print string
print(sentence)

I LOVE PYTHON!


In [81]:
# initialize string
sentence = "I love Python!"

# replace values by other values
sentence = sentence.replace("Python", "R").replace("love", "hate")

# print string
print(sentence)

I hate R!


In [3]:
# initialize list
sample_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# add element to list
sample_list.append(11)

# print list
print(sample_list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
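
One difference between the examples above is worth a closer look: the `String` methods return a **new** `String` and leave the original untouched, which is why the result was assigned back to `sentence`, whereas `append()` changes the `List` itself and returns `None`. A small sketch, with the names `shouted`, `short_list` and `result` introduced only for this illustration:

```python
# string methods return a new string
sentence = "I love Python!"
shouted = sentence.upper()
print(sentence)  # I love Python!
print(shouted)   # I LOVE PYTHON!

# append changes the list in place and returns None
short_list = [0, 1, 2]
result = short_list.append(3)
print(result)      # None
print(short_list)  # [0, 1, 2, 3]
```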


As we saw earlier, **lists** can also contain elements of **different type**. Our current list contains only elements of type **int**, but we can use our append method to append elements of type **boolean** or **string** as well.

In [8]:
# add element to list
sample_list.append(True)

# add element to list
sample_list.append("Hello!")

# print list
print(sample_list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, True, 'Hello!']


Some methods only work for specific data types. For example, if we try to **append** an element to a **boolean** or a **dictionary**, we will get an error.

In [6]:
# create a boolean
sample_boolean = True

# try to append
sample_boolean.append(False)

AttributeError: 'bool' object has no attribute 'append'

In [7]:
# create a dictionary
sample_dictionary = {"A": ["Apples", "Avocados", "Apes"], "B": ["Bananas", "Beans", "Beer"]}

# try to append
sample_dictionary.append(11)

AttributeError: 'dict' object has no attribute 'append'

In [85]:
# define sets
points = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
happy_no = {0, 1, 2, 3, 4}

# get difference of sets
happy_yes = points.difference(happy_no)

# print set
print(happy_yes)

{5, 6, 7, 8, 9, 10}


To get a list of **all attributes** and **methods** a variable has, we use the function `dir()`.

Let us see what other **methods** a **list** has.

In [9]:
# create a list
points = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# print methods
print(dir(points))

['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
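
Most of the names in this output start and end with two underscores; they are used by Python internally. To see only the methods meant to be called directly, the output of `dir()` can be filtered, as in this sketch. The name `public_methods` is only chosen for this illustration.

```python
# create a list
points = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# keep only the names that do not start with two underscores
public_methods = [name for name in dir(points) if not name.startswith("__")]

# print remaining methods
print(public_methods)
```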
