In [1]:
%load_ext lab_black

## Datatypes

In the following, we introduce the concept of data types, especially in Python. As explained in the previous chapters, a data type describes the kind of information that is processed or stored in a variable. We have already learned that Python (like other programming languages) differentiates between numbers and characters, and there is an appropriate data type for each kind of information. Based on the *type of data*, the following atomic built-in data types are available in Python:

Kind of information | Data type |
:- | :-: | 
text and characters | `str` | 
numbers **without** decimal point | `int` |
numbers **with** decimal point | `float` | 
complex numbers in a mathematical meaning | `complex` |
logical values (`True`, `False`; numerically 1, 0) | `bool` |

> **Note:** Python determines the data type of a variable automatically from the value that is assigned to it.

> **Note:** The datatype of a variable is retrieved by using the `type()` function.

> **Note:** All these data representations are objects in Python. Although there are a few functions that work on these datatypes, you will find a lot more helpful associated object properties and methods.
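
As a small illustration of the table and the notes above, the following sketch applies `type()` to one value of each kind. The variable names are chosen for illustration only and are not used elsewhere in this chapter.

```python
# One example value per atomic data type (names are illustrative only)
text_value = "soga"
whole_number = 42
decimal_number = 3.14
complex_number = 2 + 3j
logical_value = True

# Python determines the data type from the value itself
print(type(text_value))
print(type(whole_number))
print(type(decimal_number))
print(type(complex_number))
print(type(logical_value))
```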

## Simple built-in data types

We want to have a closer look at the atomic data types and how they are used in Python.

### Numbers

#### Integer values

Numbers without a decimal point are automatically stored as `int` (integer):

In [2]:
int_number = 33

In [3]:
type(int_number)

int

#### Float values

In contrast to `int` values, numbers with a decimal point are stored as `float`:

In [4]:
float_number = 33.33

In [5]:
type(float_number)

float

> **Note:** Writing the decimal point is all Python needs to treat the value as a `float`. For example, `33.0` is a `float`, whereas `33` is an `int`.

There is also the possibility to change the data type of a variable manually. For instance:

In [6]:
int(float_number)

33

In [7]:
type(int(float_number))

int

> **Note:** Explicitly changing a data type by hand is referred to as *type casting*. Be careful when you cast one data type into another, since not every type cast is supported. Casting can also go along with information loss: `int()`, for example, simply cuts off the decimal part instead of rounding. You should therefore check the result of a type cast carefully.
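
The following sketch illustrates both points of the note: information loss and unsupported casts. The `try` / `except` construct has not been introduced in this tutorial yet; it is used here only so that the failing cast does not stop the cell.

```python
# Casting a float to int cuts off the decimal part (no rounding)
truncated = int(33.99)
truncated_negative = int(-33.99)
print(truncated, truncated_negative)

# round() is the alternative if rounding is intended
rounded = round(33.99)
print(rounded)

# Casting text to a number only works if the text looks like a number
number_from_text = int("29") + 1
print(number_from_text)

# Unsupported casts raise an error instead of returning a value
cast_failed = False
try:
    int("soga")
except ValueError as error:
    cast_failed = True
    print("Cast failed:", error)
```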

### Character chains - `Strings`

Besides numbers, we also get to know chains of characters or letters. The associated data type is called `str`, short for string. Every value is automatically interpreted as a string if it is enclosed in `""` or `''`:

In [8]:
a_word = "soga"
a_sentence = "I enjoy reading soga"

In [9]:
type(a_word)

str

In [10]:
type(a_sentence)

str

There is also the possibility to type cast any data type into a `str` by using the `str()` function. As mentioned before, be aware of casting a variable into a string object. For instance:

In [11]:
type(str(29))

str

The `print()` function is used to display the value of a variable:

In [12]:
print(a_sentence)

I enjoy reading soga


It can also display text that is given directly:

In [13]:
print("This is the soga chapter about atomic data types.")

This is the soga chapter about atomic data types.


> **Note:** The `""` indicate that the value is a string. You could also use `''` instead. In Python both kinds of quotes are equivalent: special characters such as `\n` are interpreted in the same way in either form. Using one kind of quote simply makes it easy to include the other kind inside the string, e.g. `"it's"`. We do not want to go into detail here.

The `print()` function can also be used to output several objects at once:

In [14]:
firstname = "Francis Ford"
lastname = "Coppola"

In [15]:
print(firstname, lastname)

Francis Ford Coppola


> **Note:** By default, the arguments of the `print()` function are separated by a space in the output.
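
As a brief sketch of how this default can be changed: `print()` accepts the keyword argument `sep`, and the `+` operator joins strings without adding any separator. The variable `full_name` is introduced here for illustration only.

```python
firstname = "Francis Ford"
lastname = "Coppola"

# Default: the arguments are separated by a single space
print(firstname, lastname)

# The keyword argument sep replaces the default separator
print(lastname, firstname, sep=", ")

# The + operator concatenates strings without adding a separator,
# so the space has to be added explicitly
full_name = firstname + " " + lastname
print(full_name)
```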

### String methods

Besides the functions introduced above, there is a variety of very useful `str` methods. The `format()` method is a nice extension for the `print()` statement:

In [16]:
chapter = "ninth"
soga = "soga"

In [17]:
print("This is the {} chapter of the {} tutorial".format(chapter, soga))

This is the ninth chapter of the soga tutorial


> **Note:** The curly braces `{}` are filled in the order in which the arguments are passed to the `format()` method.
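
If the order of the placeholders should differ from the order of the arguments, the placeholders can be numbered or named. The following is a minimal sketch; the names `part` and `title` as well as the variables `by_position` and `by_name` are chosen for illustration only.

```python
chapter = "ninth"
soga = "soga"

# Numbered placeholders refer to the arguments by their position (starting at 0)
by_position = "This is the {1} tutorial, {0} chapter".format(chapter, soga)
print(by_position)

# Named placeholders refer to keyword arguments
by_name = "This is the {part} chapter of the {title} tutorial".format(
    part=chapter, title=soga
)
print(by_name)
```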

> **Note:** Please remember the different syntax between a *function* and a *method* call, which is described in detail in the previous chapter.

In addition, there are many methods that manipulate a given string for you:

In [18]:
a_sentence = "I enjoy reading soga"

In [19]:
a_sentence.title()

'I Enjoy Reading Soga'

In [20]:
a_sentence.upper()

'I ENJOY READING SOGA'

In [21]:
a_sentence.lower()

'i enjoy reading soga'

In [22]:
a_sentence.replace("a", "--")

'I enjoy re--ding sog--'
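
Note that none of these methods changes the original string. Each call returns a new string, which can be stored in a variable or passed directly to the next method. A short sketch with illustrative variable names:

```python
a_sentence = "I enjoy reading soga"

# The method returns a new string ...
shouted = a_sentence.upper()
print(shouted)

# ... while the original string stays unchanged
print(a_sentence)

# Method calls can be chained because each one returns a string
chained = a_sentence.replace("soga", "books").title()
print(chained)
```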

A full and nicely formatted list of all built-in `str` methods is given at [W3schools](https://www.w3schools.com/python/python_ref_string.asp).

> **Exercise:** Use the defined variables below to reproduce the given output! Combine the defined variables with the `print()` function and do the calculation with the help of operators!

In [23]:
name1 = "Alan"
age1 = 20
name2 = "Marie"
age2 = 19

In [24]:
### your solution

### Logical values - `bool`

Logical values are a kind of information that consists of only two opposing values:

- `False` (numerically `0`), meaning *no* or *false*
- `True` (numerically `1`), meaning *yes* or *true*

> **Note:** The benefit of this kind of data is usually not easy for programming beginners to grasp. So please do not despair if the purpose of these values is not yet clear.

Logical values result from comparisons between variables, so-called logical expressions. The result of a logical expression is either `True` (1) or `False` (0). This makes them essential for controlling the program flow. Although this concept will be explained in the next chapters, let's have a look at an example of the use of logical values:

In [25]:
x = 100
y = 200

In [26]:
# is x lower than y?
x < y

True

In [27]:
# is x greater than y?
x > y

False

In [28]:
# is x = y?
x == y

False

> **Note:** The `=` operator is reserved for assigning a value to a variable. If you want to compare two variables or values with one another, you have to use the `==` operator!
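
The result of a comparison is an ordinary value of type `bool` and can be stored in a variable. The sketch below also shows how Python's `True` and `False` relate to the numbers 1 and 0. The variable name `x_is_smaller` is chosen for illustration only.

```python
x = 100
y = 200

# The result of a comparison can be stored like any other value
x_is_smaller = x < y
print(x_is_smaller, type(x_is_smaller))

# True and False behave like the integers 1 and 0
print(int(True), int(False))
print(True + True)

# Casting to bool: zero and the empty string become False,
# other numbers and non-empty strings become True
print(bool(0), bool(17), bool(""), bool("soga"))
```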

## Indexing and Slicing

Finally, we want to introduce the concept of indexing a specific value out of a chain of values. The only chains of values that you know up to this point are `str` objects. In the next chapter you will also get to know other common data structures that combine the introduced atomic data types to create new data objects, which are widely used for data science and statistical modelling. To use these data structures properly, you need to understand how to retrieve one or more specific values out of a chain of values. We want to explain this with the help of a `str` object:

> **Note:** The term *indexing* refers to addressing the position of a value in a chain, whereas the term *slicing* refers to extracting one or more values out of a chain of values.

In [2]:
a_sentence = "I enjoy reading soga"

In [3]:
len(a_sentence)

20

To slice values out of this chain of letters, we first must think about the associated positional numbers of each letter in this character chain:

<center><img src="figures/Indexing_and_slicing.png" alt="Image to demonstrate how positional numbers in Pythons are connected with values in value chains" style="width: 700px;"/></center>

As you can see, the first letter in this string is associated with the place marker $0$, the second letter with $1$, and so on.

> **Note:** Spaces and special characters (e.g. `(`, `)`, or `*`) are also counted as single characters! Be aware of this for indexing purposes.

> **Note:** It is important to know that Python starts counting at $0$. That means the first character always has the positional number $0$, whereas the last character has the position $n - 1$, in this case 19.

To access one or more specific letters in a value chain, we make use of the `[<position>]` brackets. For instance, if we want to access the first letter in a string:

In [31]:
a_sentence[0]

'I'

The last letter is accessed by:

In [32]:
a_sentence[19]

'a'

We could also make use of the `len()` function which returns the length of a value chain. Beware that you need to subtract 1:

In [33]:
a_sentence[len(a_sentence) - 1]

'a'

Or even more convenient:

In [34]:
a_sentence[-1]

'a'

Accordingly:

In [35]:
a_sentence[-2]

'g'

In [36]:
a_sentence[-3]

'o'
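
The relation between negative and non-negative positions can be sketched as follows: for a chain of length `n`, the index `-k` addresses the same character as the index `n - k`. The `try` / `except` construct is used here only to show what happens for a position outside the valid range without stopping the cell.

```python
a_sentence = "I enjoy reading soga"
n = len(a_sentence)

# A negative index -k addresses the same character as the index n - k
print(a_sentence[-1], a_sentence[n - 1])
print(a_sentence[-3], a_sentence[n - 3])

# The most negative valid index, -n, addresses the first character
print(a_sentence[-n])

# Positions outside the range -n ... n - 1 raise an IndexError
index_failed = False
try:
    a_sentence[n]
except IndexError as error:
    index_failed = True
    print("Invalid index:", error)
```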

To access multiple values out of a character chain, we use the `:` operator within the `[]` brackets. The position given after the `:` is itself not included in the result. To retrieve the first 7 values:

In [37]:
a_sentence[0:7]

'I enjoy'

The more convenient way:

In [38]:
a_sentence[:7]

'I enjoy'

To get the last 4 letters:

In [39]:
a_sentence[-4:]

'soga'
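
A slice `[start:stop]` includes the position `start` but excludes the position `stop`, so it contains `stop - start` characters. The following sketch illustrates this; the variable names `first_part`, `rest`, and `middle` are chosen for illustration only.

```python
a_sentence = "I enjoy reading soga"

# Positions 0 to 6 are returned, position 7 is excluded
first_part = a_sentence[0:7]
print(first_part, len(first_part))

# Because stop is excluded, two slices that meet at the same position
# split the string without overlap
rest = a_sentence[7:]
print(first_part + rest == a_sentence)

# A slice from the middle: the characters at positions 8 to 14
middle = a_sentence[8:15]
print(middle)
```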

The `:` operator is really powerful. For example, if you want to get every second letter, starting with the first one:

In [40]:
a_sentence[::2]

'Iejyraigsg'

You could also use the `::` syntax with a negative step to get all letters in reversed order:

In [41]:
a_sentence[::-1]

'agos gnidaer yojne I'
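
The examples above are special cases of the general form `[start:stop:step]`, in which each of the three parts may be omitted. A short sketch with illustrative variable names:

```python
a_sentence = "I enjoy reading soga"

# Every second character, starting with the second one (position 1)
every_second_from_1 = a_sentence[1::2]
print(every_second_from_1)

# Every third character among the first seven (positions 0, 3, and 6)
every_third = a_sentence[0:7:3]
print(every_third)

# A negative step walks backwards: the last four characters in reversed order
last_four_reversed = a_sentence[:-5:-1]
print(last_four_reversed)
```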

We want to conclude this chapter with an exercise:

> **Exercise:** Check if the value stored in the variable `word` is a [palindrome](https://en.wikipedia.org/wiki/Palindrome). Make use of logical expressions as well as the index operator!

In [42]:
word = "kayak"

In [43]:
### your solution

In [1]:
from IPython.display import IFrame

IFrame(
    src="../../citations/citation_Soga.html",
    width=900,
    height=200,
)