# 01_datatypes

Open on binder:

[![Binder](https://binder.pangeo.io/badge_logo.svg)](https://mybinder.org/v2/gh/kevinsawade/start-science-here/HEAD?urlpath=%2Ftree%2F?filepath=python_tutorial%2Fbasic_python%2Fbasics_00_intro.ipynb)

This notebook is based on the tutorial series by **Rajath Kumar M P**. Visit his github to find out more: https://github.com/rajathkmp

## How to navigate jupyter notebooks


You are currently looking at a jupyter notebook. Notebooks are one of several ways to write and execute python code. A jupyter notebook consists of separate code cells. A cell can be executed either by clicking the play button in the toolbar at the top or by pressing ```shift``` + ```enter```.


## The zen of python

This little snippet by Tim Peters contains 19 guiding principles on how the python programming language should be used. We use it to demonstrate how to execute code in jupyter notebook cells. Click into the cell reading "import this" and run it by hitting "shift" and "enter" at the same time. Alternatively, you can click the "run" button in the top toolbar. After that, continue executing the remaining cells. Have fun learning python!

In [None]:
import this

## Antigravity

Just execute this cell. Trust me I'm a tutorial.

In [None]:
import antigravity

## Keywords

Keywords are reserved words of the python language, which is why they cannot be used as variable names. Your viewer (in this case jupyter notebook) recognizes them automatically and highlights them accordingly. Keywords are case sensitive. See the examples below:

In [None]:
import keyword
keyword.kwlist

## Variables

Values and other kinds of data are stored in variables. A variable is created with an equal sign, which assigns the value on the right to the variable name on the left.

In [None]:
a = 3
b = 2
c = 'Hello World!'

You can access a variable's value by writing its name. Note how the cell returns the string with the annotation ```Out [4]:``` (or a different integer). However, this way only the value of the last line in a code cell is returned.

In [None]:
c

In [None]:
c
b
a

If you want to see the values of multiple variables at once you can use the print function. See the examples below:
```python
>>> print(a) # prints variable a
```
Notice how the ```Out [x]:``` is missing from the output of these cells. This is because the print() function returns ```None```.

In [None]:
print(a)
print(a + b)
print(c)

In [None]:
d = print(a)
print(d)

There are some best practices when declaring variables.
 - Variable names should say something about their content.
 - When the variable uses a built-in name (e.g. print, None) an underscore ('\_') can be appended.
 - Throwaway variables are most often called '\_' (single underscore)
 
In the next cell a string is declared as a variable called print_ (with trailing underscore)

In [None]:
print_ = 'This should be printed'
print(print_)

\begin{exercise}\label{ex:ex0}
Assign the number 1 to variable a and the number 1.2 to variable b.
Print the values of each individual variable and the value of their sum using the print function.
\end{exercise}

In [None]:
a = 1
b = 1.2
print(a)
print(b)
print(a + b)

## Comments

Comments start with a hash (\#). Python style guides also recommend a space between the hash and the comment text. However, this is not a requirement. Triple quotes (''') are often used for multi-line comments. Strictly speaking they create a string that is never assigned, so python simply ignores it. If you like, you can un-comment the marked line in the next cell and see how overwriting the print() function with a variable called print (without trailing underscore) breaks the system. To recover, you need to restart the kernel. You can do this by clicking on "Kernel" in the top toolbar and then "Restart", or with the button highlighted with a red box.

<img src="../restart_kernel.png" alt="Toc" width="1500"/>

In [None]:
'''
This is a multiline
comment.
'''

a = 3

# The next line breaks the system. Restart of the Kernel required.
# print = 'This should be printed'

print(a)

## Operators

### Arithmetic operations

The following arithmetic operations are built into python.

| $\Large Symbol$ | $\Large Task\ Performed$ |
|-----------------|-------------------------|
| $\Large +$      | Addition                |
| $\Large -$      | Subtraction             |
| $\Large *$      | Multiplication          |
| $\Large /$      | Division                |
| $\Large \%$     | Modulo                  |
| $\Large //$     | Floor Division          |
| $\Large **$     | Power of                |

In [None]:
print(1 + 3)
print(3 ** 2)
print(5 - 2)
print(1 / 2) # The result of this operation is special. We'll learn later why.

Prepare for some modulo operations!

In [None]:
print(5 % 5)
print(5 % 2)
print(9 % 3)
print(9 % 5)

\begin{exercise}\label{ex:modulo}
What does the modulo operation calculate?
\end{exercise}

It computes the remainder of a division.

### Relational operators

The following relational operators are available in python:

| $\Large Symbol$ | $\Large Task Performed$              |
|----------------|---------------------------------------|
| $\Large ==$    | True if equal                         |
| $\Large !=$    | True if not equal                     |
| $\Large <$     | less than                             |
| $\Large >$     | greater than                          |
| $\Large <=$    | less than or equal to                 |
| $\Large >=$    | greater than or equal to              |
| $\Large is$    | Identity operator                     |

In [None]:
print(5 == 5)

In [None]:
print(4 <= 5)

In [None]:
print(5 != 5)

In [None]:
a = 5
b = 5.0
print(a == b)
print(a is b)

\begin{exercise}\label{ex:equality_operator}
Can you already guess why the second relational operation is False?
\end{exercise}

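I have to admit that this question requires a certain amount of prior knowledge. The `==` operator compares values, whereas `is` checks whether both names refer to the very same object. a is an integer and b is a float, so they are two different objects even though their values are equal. Datatypes will be introduced later on. Stay tuned.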
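
The difference between comparing values and comparing identity can be sketched with two lists (lists are introduced further below). The variable names are made up for this illustration:

```python
# Two separate lists with equal content, plus a second name for the first list.
first = [1, 2, 3]
second = [1, 2, 3]
alias = first

same_value = first == second      # compares the contents
same_object = first is second     # compares the identity
alias_is_first = alias is first   # both names point to one object

print(same_value, same_object, alias_is_first)
```

As a rule of thumb, use `==` to compare values and reserve `is` for checks like `x is None`.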

### Bitwise operators

The following bitwise operators are available:

| $\Large Symbol$ | $\Large Task Performed$              |
|----------------|---------------------------------------|
| $\Large \&$    | Bitwise AND                           |
| $\Large \vert$ | Bitwise OR                            |
| $\Large \hat{}$| XOR (exclusive OR)                    |
| $\Large \sim$  | Bitwise NOT (inversion)               |
| $\Large >>$    | Right shift                           |
| $\Large <<$    | Left shift                            |

In [None]:
a = 2 # 0b10 in binary
b = 3 # 0b11 in binary

In [None]:
print(bin(a))
print(bin(b))
print(a & b)
print(bin(a & b))

The result can be explained by remembering that '&' is a **bitwise** operator. So let's compare the binaries on a bitwise basis:

- First bits: 1 & 1 equals 1
- Second bits: 0 & 1 equals 0

In [None]:
print(1 & 1)
print(0 & 1)

In [None]:
print(5 >> 1)

0000 0101 -> 5

Shifting the digits by 1 to the right and zero padding

0000 0010 -> 2

In [None]:
print(5 << 1)

0000 0101 -> 5

Shifting the digits by 1 to the left and zero padding

0000 1010 -> 10

\begin{exercise}
Using bitwise operators do some logic calculations. Try to find out what 1 AND 1, 1 AND 0, 1 OR 0, 0 OR 1, 1 XOR 0, 1 XOR 1 are.
\end{exercise}

```python
# example for False AND False
print(0 & 0)
```

In [None]:
print(1 & 1)
print(1 & 0)
print(1 | 0)
print(0 | 1)
print(1 ^ 0)
print(1 ^ 1)

### Boolean operators

Boolean operators are used on booleans, that is, truth values. The following Boolean operators are available:

| $\Large Operator$ | $\Large Task Performed$               |
|-------------------|---------------------------------------|
| $\Large and$      | Logical AND                           |
| $\Large or$       | Logical OR                            |
| $\Large not$      | Negate                                |

Because python's `bool` type is a subclass of `int` (`True` equals `1` and `False` equals `0`), we can write the exercise from above with boolean values, but using bitwise operators:

In [None]:
print(True & True)
print(True & False)
print(True | False)
print(False | True)
print(True ^ False)
print(True ^ True)

In [None]:
print(False or True)
print(False and False)
print(True or False)
print(True or True)
print(not False)
print(not True)
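
The relationship between booleans and integers, and the difference between the boolean and the bitwise operators, can be checked with a short sketch. The variable names are only illustrative:

```python
# bool is a subclass of int, so True and False take part in arithmetic.
bool_is_int = isinstance(True, int)
total = True + True

# `and` returns one of its operands, `&` combines the individual bits.
logical_result = 2 and 1   # 2 is truthy, so the second operand is returned
bitwise_result = 2 & 1     # 0b10 & 0b01 share no bits

print(bool_is_int, total, logical_result, bitwise_result)
```

For plain `True` and `False` both operator families agree, but for other integers they can give different results, as the last two lines show.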

## Floats and ints

There are two main datatypes to hold numerical values in python. The `int` (integer) datatype captures negative and positive whole numbers `(-2, -1, 0, 1, 2, 15)`. The `float` (floating point) datatype captures numbers with a fractional part `(1.0, 2.3, 5.245, 3.141)`, with limited precision. Certain floats (`1.0`) have the same value as an integer. You can omit the decimal zero and just use a dot to tell python that the number is a float (`1.`). Python is pretty easy-going on datatypes and can compare `int` and `float`. This would not be possible in some other programming languages.

In [None]:
1 == 1.

Addition of floats and ints also works like a charm. The result is a `float` as soon as one of the operands is a `float`.

In [None]:
print(1 + 2)
print(1.0 + 2.0)
print(1.5 + 2)

Division of integers (in contrast to some other programming languages) also works when the result is not an integer. Some programming languages yield 2 as the result for `5 / 2`. In python, we can reproduce that behavior by using the floor division (`//`):

In [None]:
5 // 2

So don't worry about divisions:

In [None]:
5 / 2
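
One detail worth knowing: floor division rounds down, which differs from cutting off the decimals when negative numbers are involved. A small sketch with illustrative variable names:

```python
true_division = 5 / 2        # always a float
floor_division = 5 // 2      # rounds down to the next integer
negative_floor = -5 // 2     # rounds down as well, so the result is -3
truncated = int(-5 / 2)      # int() cuts off the decimals instead

print(true_division, floor_division, negative_floor, truncated)
```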

However, some tasks require ints, and floats are not allowed. Python will let you know that it expected an `int` with a handy error message:

In [None]:
['a', 'b', 'c'][1.0]

## Built-in functions and printing

Bog-standard python comes with some already declared functions which we can use right from the start.

### The help() function

The ```help()``` function can be used to get more information about objects, functions and methods.

In [None]:
help(print)

### Conversion between numbering systems

To differentiate our base-10 numbers from binary, hexadecimal, and octal numbers, literals in the other systems carry these prefixes:

- 0b: binary
- 0x: hexadecimal
- 0o: octal

With the built-in functions bin(), hex() and oct() numbers can be converted. Note that these functions return strings.

In [None]:
a = 5
b = 0b00001111
c = 0xff

In [None]:
print(a)
print(b)
print(c)

In [None]:
print(oct(a))
print(oct(b))
print(bin(c))

\begin{exercise}
Find the binary representation of the floating point number `0.1`
\end{exercise}

In [8]:
print(bin(0.1))

TypeError: 'float' object cannot be interpreted as an integer
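
`bin()` only accepts integers, but the bits of a float can still be inspected. The following is a sketch of one possible approach, not the only one: it uses the standard-library module `struct` to pack `0.1` into the 64-bit IEEE 754 format and then prints those 64 bits. The variable names are chosen for this example.

```python
import struct

# Pack 0.1 as an IEEE 754 double (big-endian) and read the 8 bytes as an integer.
raw_bytes = struct.pack('>d', 0.1)
as_integer = int.from_bytes(raw_bytes, 'big')

# Format the integer as 64 binary digits: 1 sign, 11 exponent, 52 fraction bits.
bits = f'{as_integer:064b}'
print(bits[0], bits[1:12], bits[12:])

# float.hex() gives a shorter view of the same stored value.
hex_view = (0.1).hex()
print(hex_view)
```

The repeating bit pattern in the fraction shows that `0.1` cannot be stored exactly in binary, which is why `0.1 + 0.2 == 0.3` evaluates to `False`.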

### Using int() for conversion

The built-in function ```int()``` accepts two arguments when used for conversion: the value in a different number system and its base. Note that the number in the different number system must be given as a string.

In [None]:
print(int('010',8))
print(int('0xaa',16))
print(int('1010',2))

`int()` can also be used to cut off the decimal part of a float (it truncates towards zero and does not round) or to convert a number stored as a string to an integer. Similarly, the function `str()` can be used to convert the integer back to a string.

In [None]:
print(int(7.7))
print(int('7'))

## chr()


Just as `bin()` converts to binary representations, `float()` converts to floating point values. `chr()` converts an integer code (for small numbers the ASCII code) to the corresponding character, and `ord()` does the reverse.

In [None]:
print(chr(98))
print(ord('A'))
print(chr(27))

Never seen the symbol of the ASCII-character number 27? That's because it's a so-called control-character. These go back to the times of Teleprinters, which were used to communicate via text. These control-characters mark stuff like beginning and end of a transmission. ASCII-character 27 is the escape control-character (ESC).

<img src="../Fernschreiber_T100_Siemens.jpg" alt="Toc" width="500"/>

Image taken from wikipedia

## Simplifying arithmetic operations

The `round()` function rounds the input value to a specified number of places or to the nearest integer.

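**Note:** Python rounding is not the same as the rounding taught in school. Exact ties (x.5) are rounded to the nearest even number, so 5.5 and 6.5 both become 6. This is called round-half-to-even or banker's rounding. If you need ties to always round away from zero, the standard-library module decimal offers the rounding mode ROUND_HALF_UP.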

In [None]:
print(round(5.5))
print(round(6.5))

In [None]:
print(round(6.5))
print(round(7.5))

In [None]:
print(round(5.6231))
print(round(4.55892, 2))
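
The tie-breaking behavior of `round()` can be compared with school-style rounding in a short sketch. It assumes the standard-library module `decimal` is an acceptable tool here, and `round_half_up` is a hypothetical helper written only for this illustration:

```python
from decimal import Decimal, ROUND_HALF_UP

# Built-in round(): exact ties go to the nearest even integer.
ties_to_even = [round(0.5), round(1.5), round(2.5)]

def round_half_up(text):
    """Round a decimal number given as a string; ties are rounded away from zero."""
    return int(Decimal(text).quantize(Decimal('1'), rounding=ROUND_HALF_UP))

ties_up = [round_half_up('0.5'), round_half_up('1.5'), round_half_up('2.5')]
print(ties_to_even, ties_up)
```

The helper takes strings on purpose, so that the decimal digits are not distorted by the binary float representation before rounding.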

`complex()` is used to define a complex number and `abs()` outputs the absolute value of the same.

In [None]:
c = complex('5+2j')
print(abs(c))

`divmod(x,y)` outputs the quotient and the remainder in a tuple (you will be learning about tuples in further chapters) in the format (quotient, remainder).

In [None]:
print(divmod(9,2))
print(9 // 2)
print(9 % 2)

`isinstance()` returns True if the first argument is an instance of the class given as the second argument. Multiple classes can be checked at once by passing a tuple of classes.

In [None]:
instances = (int, float)

print(isinstance(1, int))
print(isinstance(1.0, int))
print(isinstance(1.0, instances))
print(isinstance(1, instances))

`pow(x,y,z)` can be used to compute the power $x^y$. If the optional third argument is given, the result is taken modulo z, i.e. ($x^y$ % z).

In [None]:
print(pow(3,3))
print(pow(3,3,5))

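The `range()` function represents the integers of the specified range. It can also be used to generate a series by specifying the step between two consecutive numbers. In python 3 the elements are not returned in a list but in a lazy `range` object, which is why printing it shows only the start, stop and step values. Wrap it in `list()` to see the elements (lists will be discussed in detail later).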

In [None]:
print(range(3))
print(range(2,9))
print(range(2,27,8))
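
To actually see the numbers a `range` object stands for, it can be converted to a list. A minimal sketch using the same three ranges as above:

```python
first_three = list(range(3))          # start defaults to 0, stop is exclusive
two_to_eight = list(range(2, 9))      # start and stop given
every_eighth = list(range(2, 27, 8))  # start, stop and a step of 8

print(first_three)
print(two_to_eight)
print(every_eighth)
```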

## Accepting user inputs
`input()` accepts input and stores it as a string. Hence, if the user inputs an integer, the code should convert the string to an integer and then proceed.

In [None]:
abc = input("Type something here and it will be stored in variable abc \n")
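
Converting the entered string can fail if the user types something that is not a number. One possible way to handle this is sketched below. `to_int` is a hypothetical helper, and fixed strings stand in for what `input()` would return:

```python
def to_int(text):
    """Hypothetical helper: convert user input to int, return None if that fails."""
    try:
        return int(text.strip())
    except ValueError:
        return None

# In a notebook the strings would come from input(); fixed strings are used here.
print(to_int('42'), to_int(' 7 '), to_int('forty-two'))
```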

The built-in function type returns the type of a variable.

In [None]:
print(type(abc))
print(type)

## Printing and Strings

There are three ways to define strings in python.

- Single quotes (`'string'`) are often used for single characters, key-words and fixed strings.
- Double quotes (`"This is a string"`) are often used when working with natural language.
- Triple quotes (`"""This is a special string called docstring. The doc means documentation"""`) are often used to annotate functions and classes, so that other people can refer to this docstring when they try to comprehend your code.

Strings are often used to print useful information. If your script runs into an error while looking for a file, because the file is not there you could make the script print an error message that helps you to identify the problem. 

```python
import os

filename = '/path/to/file.txt'  # placeholder path
if not os.path.isfile(filename):
    print('The file does not exist')
    exit()
```

**There are several ways to help you format your strings and print them:**

- plain string printing: ```print('This is a string')```
- printing of multiple values:
```python
a = 'The value of b is'
b = 3
print(a, b)
```
- string concatenation:
```python
filename = '/path/to/file.txt'
print('The file ' + filename + ' does not exist.')
```
- %-formatted strings (old style, common in python2): ```print('The value of b is %s' % b)```
- formatted strings (python3): ```print('The value of a is {} and of b is {}'.format(a, b))```
- f-strings (new in python3.6): ```print(f'The value of a is {a} and of b is {b}')```

Let's take a look at all these different string formats.

In [None]:
print("Hello World")

In [None]:
print("""Multi-line strings can only be

written with the triple-quoted string

because it preserves line breaks""")

# a single-quoted string can not be broken into lines
# print('This can not be
#       printed')

In [None]:
print("If you want to have line breaks\nin single-quoted strings,\nyou need to add a newline character(\\n)")

If you have long strings and don't want line breaks you can put multiple strings inside parentheses (like a tuple), but without commas:

In [4]:
s = ('This is a long string, which does not break over multiple '
     'lines. The line breaks are handled by the program displaying '
     'the string (so-called soft-wraps). Writing strings like this is '
     'especially useful if you don\'t want to write never-ending '
     'lines of code. Most styleguides also recommend to break '
     'lines of python code after 79 character. This is talked '
     'about in PEP8: https://www.python.org/dev/peps/pep-0008/')
print(type(s))
print(s)

<class 'str'>
This is a long string, which does not break over multiple lines. The line breaks are handled by the program displaying the string (so-called soft-wraps). Writing strings like this is especially useful if you don't want to write never-ending lines of code. Most styleguides also recommend to break lines of python code after 79 character. This is talked about in PEP8: https://www.python.org/dev/peps/pep-0008/


Depending on the width of the printed text above, some words might be split across lines. This can be prevented with the standard-library module `textwrap`:

In [7]:
import textwrap
print('\n'.join(textwrap.wrap(s)))

This is a long string, which does not break over multiple lines. The
line breaks are handled by the program displaying the string (so-
called soft-wraps). Writing strings like this is especially useful if
you don't want to write never-ending lines of code. Most styleguides
also recommend to break lines of python code after 79 character. This
is talked about in PEP8: https://www.python.org/dev/peps/pep-0008/


Strings can be assigned to variables, say string1 and string2, which can then be passed to the print function.

In [None]:
string1 = 'World'
print('Hello', string1)

string2 = '!'
print('Hello', string1, string2)

String concatenation is the "addition" of two strings. Observe that while concatenating there will be no space between the strings.

In [None]:
print('Hello' + string1 + string2)

This works only with strings, because python can't add a non-string object to a string.

In [None]:
b = 3
print("Variable b is " + b)

But we can make b a string, by calling the built-in function str() on it.

In [None]:
b = 3
print("Variable b is " + str(b))

This way of constructing strings is not very readable when you look at the code. That's what f-strings are for.

In [None]:
distance = 100
time = 2
velocity = distance / time

print("A vehicle travelling at " + str(velocity) + " km/h will cover a distance of " + str(distance) + " km in " + str(time) + " hours.")

In [None]:
print(f"A vehicle travelling at {velocity} km/h will cover a distance of {distance} km in {time} hours.")

## PrecisionWidth and FieldWidth

Numbers can be formatted using this special notation:

- :s -> string
- :d -> Integer
- :f -> Float
- :o -> Octal
- :x -> Hexadecimal
- :e -> exponential

In [None]:
d = 18
print(f"Actual Number = {d:d}")
print(f"Float of the number = {d:f}")
print(f"Octal equivalent of the number = {d:o}")
print(f"Hexadecimal equivalent of the number = {d:x}")
print(f"Exponential equivalent of the number = {d:e}")

Field width is the minimum width of the entire formatted number, and precision is the number of digits after the decimal point. One can alter these widths based on the requirements.

The default precision is 6.

In [None]:
d = 3.121312312312
print(f'{d:f}')

Notice that up to 6 decimal places are returned. To specify the number of decimal places, ```'{:(fieldwidth).(precisionwidth)f}'``` is used.

In [None]:
print(f'{d:.5f}')

If the field width is larger than necessary, the number is right-aligned and padded with spaces to fill the specified width.

In [None]:
print(f'{d:9.5f}')

Zero padding is done by adding a 0 at the start of fieldwidth.

In [None]:
print(f'{d:020.5f}')

For proper alignment, a space can be placed at the beginning of the format specification. Positive numbers then get a leading blank where negative numbers carry their minus sign, so both line up.

In [None]:
d_neg = -d
print(f'{d: 9f}')
print(f'{d_neg: 9f}')

The '+' sign can be returned at the beginning of a positive number by adding a + sign at the beginning of the field width.

In [None]:
print(f' {d:+9f}')
print(f' {d_neg:9f}')

As mentioned above, the data right-aligns itself when the specified field width is larger than the actual width of the number. Left alignment can be requested by putting a '<' symbol in front of the field width. (A '-' in this position is only a sign option in f-strings and does not change the alignment.)

In [None]:
print(f' {d:<9.3f}')
print(f' {d:9f}')
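
The format specification supports explicit alignment characters. The sketch below shows the options `<`, `>` and `^` together with a custom fill character. The variable names are only illustrative, and `repr()` is used so that the padding blanks are visible in the output:

```python
value = 3.14159

left = f'{value:<9.3f}'      # pad on the right
right = f'{value:>9.3f}'     # pad on the left (default for numbers)
centered = f'{value:^9.3f}'  # pad on both sides
filled = f'{value:*>9.3f}'   # use '*' instead of blanks as fill character

print(repr(left), repr(right), repr(centered), repr(filled))
```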

## Exercise: Printing a date and a time

In [None]:
DD = 3
MM = 5
YYYY = 2012.0
hh = 3
mm = 20
print("DD.MM.YYYY hh:mm")

\begin{exercise}\label{ex:datetime}
Use the print function to format the following variables to a datetime format of DD.MM.YYYY hh:mm.

Use zero-padding to change "3.5.2012.0" to "03.05.2012"
\end{exercise}

In [None]:
print(f"{DD:02}.{MM:02}.{YYYY:.0f} {hh:02}:{mm:02}")

# Data structures

Data structures are collections of data in a specific format. If you want to store multiple values, for example the temperature of a probe over time, it is better to store them in a list instead of giving every temperature its own variable.

## Lists

Lists are the most commonly used data structure. Think of a list as a sequence of data that is enclosed in square brackets, with the data separated by commas. Each of these data can be accessed by its index value.

Lists are declared by assigning '[ ]' or `list()` to a variable.

In [None]:
a = []
print(type(a))

You can create a list by directly declaring it:

In [None]:
x = ['apple', 'orange', 1, 1.1, 'hello', 'list', "Even longer strings are possible"]

### Indexing

In python, indexing starts from 0. Thus the list x has 'apple' at index 0 and 'orange' at index 1.

Zero-based indexing can lead to problems when data from other programs (e.g. atomic coordinate files) are loaded. **Always check whether you transition from 1-based to 0-based indexing!**

In [None]:
x[0]

Indexing can also be done in reverse order. That is the last element can be accessed first.

In [None]:
x[-1]

\begin{exercise}\label{ex:list_indexing}
Return the element "hello" from the list via indexing.
\end{exercise}

In [None]:
print(x[4])

### Nested lists

A list can itself be an element of a list

In [None]:
x = ['apple', 'orange']
y = ['carrot','potato']
z  = [x,y]
print(z)

Indexing in nested lists can be quite confusing if you do not understand how indexing works in python. So let us break it down and then arrive at a conclusion.

Let us access the data 'apple' in the above nested list. First, at index 0 there is a list ['apple','orange'] and at index 1 there is another list ['carrot','potato']. Hence z[0] should give us the first list which contains 'apple'.

In [None]:
z0 = z[0]
print(z0)

Now observe that z0 is not a nested list at all. Thus, to access 'apple', z0 should be indexed at 0.

In [None]:
z0[0]

In [None]:
z[0][0]

### Slicing

Indexing is limited to accessing a single element. Slicing, on the other hand, accesses a sequence of data inside the list.

Slicing is done by defining the index of the first element and the index of the last element **+1** that should end up in the sliced list. It is written as parentlist[ a : b ] where a and b are index values from the parent list. If a is not defined, the slice starts at the first element. If b is not defined, the slice runs up to and including the last element.

In [None]:
num = [0,1,2,3,4,5,6,7,8,9]

In [None]:
print(num[0:4])

See how the element at index 4 is excluded from the list. The lower index is inclusive, the higher index is exclusive. This is done so that slicing the list with ```[4:]``` returns exactly the remainder of the list.

In [None]:
print(num[4:])

You can also use a step to only return every nth element of the list

In [None]:
print(num[1:9:4])
print(num[::2])

\begin{exercise}\label{ex:list_slicing}
Try to get the list [1, 3, 5] by slicing the list num.
\end{exercise}

In [None]:
print(num[1:6:2])
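
Slices also accept negative indices and negative steps. A short sketch with the list `num` from above, repeated here so that the block runs on its own:

```python
num = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

last_three = num[-3:]      # start three elements before the end
without_ends = num[1:-1]   # drop the first and the last element
reversed_num = num[::-1]   # a step of -1 walks through the list backwards

print(last_three)
print(without_ends)
print(reversed_num)
```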

### Built-in list functions

To find the length of the list or the number of elements in a list, ```len()``` is used.

In [None]:
len(num)

If the list consists only of numerical elements, then ```min()``` and ```max()``` give the minimum and maximum value in the list.

In [None]:
print(min(num))
print(max(num))

Lists can be concatenated by adding with '+'. The resultant list will contain all the elements of the lists that were added. The resultant list will not be a nested list.

In [None]:
[1,2,3] + [5,4,7]

You might need to check whether a particular element is in a predefined list. This can be done with the ```in``` membership operator.

In [None]:
elements = ['Earth', 'Air', 'Fire', 'Water']
print('Earth' in elements)
print('Aether' in elements)

```max()``` and ```min()``` can also be applied to a list of strings. Strings are compared character by character based on their character codes (ASCII values for plain English text). The first characters of all elements are compared first. If they are the same, the second characters are compared, and so on.

In [None]:
mlist = ['bzaa','ds','nc','az','z','klm']
print(max(mlist))
print(min(mlist))

Here the first character of each element is considered. 'z' has the highest ASCII value and is returned as the maximum, and the minimum is 'az' because it starts with 'a'. But what if numbers are declared as strings?

In [None]:
nlist = ['1','94','93','1000']
print(max(nlist))
print(min(nlist))

Even if the numbers are declared as strings, the elements are compared character by character and the maximum and minimum values are returned accordingly. That is why '94' is considered larger than '1000'.

But if you want to find the ```max()``` string element based on the length of the string then another parameter 'key=len' is declared inside the ```max()``` and ```min()``` function.

In [None]:
print(max(elements, key=len))
print(min(elements, key=len))

But 'Water' also has length 5. The ```max()``` and ```min()``` functions return the first element when there are two or more elements with the same length.

Any other built in function can be used or the ```lambda``` function which will be discussed in 02_functions_and_classes.
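
As an illustration of a different key function, the numbers stored as strings can be compared by their integer value with `key=int`. The list `nlist` is repeated so that the sketch is self-contained:

```python
nlist = ['1', '94', '93', '1000']

as_strings = (max(nlist), min(nlist))                    # character-wise comparison
as_numbers = (max(nlist, key=int), min(nlist, key=int))  # compare the integer values

print(as_strings)
print(as_numbers)
```

Note that the returned elements are still strings. The key function only decides how they are compared.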

A string can be converted into a list by using the ```list()``` function.

In [None]:
list('hello')

```append()``` is used to add an element at the end of the list.

In [None]:
lst = [1,1,4,8,7]
lst.append(1)
print(lst)

```count()``` is used to count the number of a particular element that is present in the list.

In [None]:
lst.count(1)

If you want to combine two lists you can also use the ```extend()``` built-in function. See how the ```append()``` creates a nested list.

In [None]:
lst = [1,1,4,8,7]
lst1 = [5,4,2,8]
lst.append(lst1)
print(lst)

In [None]:
lst = [1,1,4,8,7]
lst1 = [5,4,2,8]
lst.extend(lst1)
print(lst)

```index()``` is used to find the index value of a particular element. Note that if there are multiple elements of the same value then the first index value of that element is returned.

In [None]:
print(lst.index(1))
print(lst.index(4))

```insert(x,y)``` is used to insert an element y at a specified index value x.

In [None]:
lst.insert(5, 'name')
print(lst)

```insert(x,y)``` inserts but does not replace an element. If you want to replace the element with another element you simply assign the value to that particular index.

In [None]:
lst[5] = 'Python'
print(lst)

The ```pop()``` function returns the last element in the list. This element is also removed from the list.

In [None]:
element = lst.pop()
print(element)
print(lst)

An index value can be specified to pop a certain element corresponding to that index value.

In [None]:
lst.pop(0)

```pop()``` is used to remove an element based on its index value, and the removed element can be assigned to a variable. One can also remove an element by specifying its value using the ```remove()``` function.

In [None]:
lst.remove('Python')
print(lst)

An alternative to the ```remove()``` function that works with the index value is ```del```.

In [None]:
del lst[1]
print(lst)

Python offers built in operation ```sort()``` to arrange the elements in ascending order.

In [None]:
lst.sort()
print(lst)

To reverse the order the argument reverse of the method sort can be set to True

In [None]:
lst.sort(reverse=True)
print(lst)

In [None]:
elements.sort(key=len)
print(elements)
elements.sort(reverse=True, key=len)
print(elements)

### Copying a list

Most python newcomers make this mistake so be careful.

In [None]:
lista = [2,1,4,3]
listb = lista
print(listb)

Now we perform some random operations on lista

In [None]:
lista.pop()
print(lista)
lista.append(9)
print(lista)

In [None]:
print(listb)

listb has also changed, although no operation has been performed on it. This is because both names, lista and listb, refer to the same list object in memory. So how do we fix this?

If you recall, in slicing we have seen that parentlist[a:b] returns a new list from the parent list with start index a and end index b. If a and b are not given, the slice covers the whole list from the first to the last element. We use the same concept here. By doing so, we assign a copy of the data of lista to listb.

In [None]:
lista = [2,1,4,3]
listb = lista[:]
print(listb)

In [None]:
lista.pop()
print(lista)
lista.append(9)
print(lista)

In [None]:
print(listb)
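
Slicing creates a so-called shallow copy: the outer list is new, but nested lists inside it are still shared. The following sketch, with made-up variable names, contrasts this with `copy.deepcopy()` from the standard-library module `copy`:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = nested.copy()          # same effect as nested[:]
deep = copy.deepcopy(nested)     # also copies the inner lists

nested[0].append(99)             # change an inner list
nested.append([5, 6])            # change the outer list

print(shallow)   # the inner change is visible, the outer change is not
print(deep)      # completely independent of nested
```

For flat lists of numbers or strings, `lista[:]` or `lista.copy()` is sufficient.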

## Tuples

Tuples are similar to lists with one big difference. The elements inside a list can be changed, but in tuples they cannot be changed. Think of a tuple as a fixed group of values that belong together and should not be altered afterwards. For better understanding, recall the ```divmod()``` function.

In [None]:
xyz = divmod(10,3)
print(xyz)
print(type(xyz))

Here the quotient has to be 3 and the remainder has to be 1. These values cannot be changed whatsoever when 10 is divided by 3. Hence divmod returns these values in a tuple.

To define a tuple, a variable is assigned to parentheses ( ) or to the result of the function `tuple()`.

In [None]:
tup = () # this is an empty tuple
tup2 = tuple()

If you want to directly declare a tuple with a single element, it can be done by using a comma at the end of the data.

In [None]:
27,

27 multiplied by 2 yields 54, but when a tuple is multiplied by 2 its data is repeated twice.

In [None]:
2 * (27,)

Values can be assigned while declaring a tuple. `tuple()` takes a list as input and converts it into a tuple, or it takes a string and converts it into a tuple of characters.

In [None]:
tup3 = tuple([1,2,3])
print(tup3)
tup4 = tuple('Hello')
print(tup4)

Tuples follow the same indexing and slicing as Lists.

In [None]:
print(tup3[1])
tup5 = tup4[:3]
print(tup5)
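
That tuples cannot be changed can be verified directly. This sketch uses illustrative names and catches the error that python raises on item assignment:

```python
point = (1, 2, 3)

try:
    point[0] = 10                 # item assignment is not supported
    changed = True
except TypeError as error:
    changed = False
    print('TypeError:', error)

# A modified version has to be built as a new tuple.
new_point = (10,) + point[1:]
print(point, new_point)
```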

### Mapping one tuple to another

In [None]:
(a,b,c)= ('alpha','beta','gamma')

In [None]:
print(a,b,c)
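
This kind of assignment is commonly called tuple unpacking. A few typical uses are sketched below with illustrative variable names:

```python
a, b = 'alpha', 'beta'
a, b = b, a                      # swap without a temporary variable

quotient, remainder = divmod(10, 3)   # unpack the tuple returned by divmod
first, *rest = (1, 2, 3, 4)      # the starred name collects the remaining values in a list

print(a, b)
print(quotient, remainder)
print(first, rest)
```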

In [None]:
d = tuple('AGPeterTutorial')
print(d)

### Built-in tuple functions

The functions `count()` and `index()` work just like they did with lists.

In [None]:
print(d.count('a'))
print(d.index('a'))

## Sets

Sets are mainly used to eliminate repeated elements in a sequence/list. They are also used to perform some standard set operations (set theory).

Sets are declared with either a sequence in braces { } or the `set()` function, which initializes an empty set. Note that empty braces `{}` create a dictionary, not a set. `set([sequence])` can also be executed to declare a set with elements.

In [None]:
set1 = set()
print(type(set1))
set0 = set([1, 2, 2, 3, 3, 4])
print(set0)

Here, the elements 2 and 3, which appear twice in the input, are seen only once. Thus in a set each element is distinct.
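
A common use of this property is removing duplicates from a list. The sketch below uses made-up example data and also checks which type empty braces produce:

```python
readings = [3, 1, 2, 3, 1, 4]

unique = set(readings)           # duplicates vanish, the order is not guaranteed
unique_sorted = sorted(unique)   # a sorted list of the distinct values

empty_braces_type = type({})     # empty braces give a dictionary
empty_set_type = type(set())     # set() gives an empty set

print(unique_sorted)
print(empty_braces_type, empty_set_type)
```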

### Built-in functions

In [None]:
set1 = {1,2,3}
set2 = set([2,3,4,5])

The `union()` function returns a set which contains all the elements of both the sets without repetition.

In [None]:
set1.union(set2)

`add()` adds a particular element to the set. Note that sets are unordered: the newly added element has no fixed position and is not necessarily shown at the end.

In [None]:
set1.add(0)
set1

The `intersection()` function outputs a set which contains all the elements that are in both sets.

In [None]:
set1.intersection(set2)

The `difference()` function outputs a set which contains the elements that are in set1 and not in set2.

In [None]:
set1.difference(set2)

The `symmetric_difference()` function returns a set which contains the elements that are in exactly one of the two sets.

In [None]:
set2.symmetric_difference(set1)
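
The four set operations can also be written with operators instead of method calls. A short sketch, with the two sets redefined so that the block does not depend on earlier cells:

```python
set1 = {1, 2, 3}
set2 = {2, 3, 4, 5}

union = set1 | set2          # same as set1.union(set2)
intersection = set1 & set2   # same as set1.intersection(set2)
difference = set1 - set2     # same as set1.difference(set2)
symmetric = set1 ^ set2      # same as set1.symmetric_difference(set2)

print(union, intersection, difference, symmetric)
```

These are the same symbols as the bitwise operators from the beginning of this notebook, but applied to sets they perform set operations.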

`issubset()`, `isdisjoint()`, `issuperset()` are functions returning boolean (truth) values. They are used to check if the set1/set2 is a subset, disjoint or superset of set2/set1, respectively.

In [None]:
set1.issubset(set2)

In [None]:
set2.isdisjoint(set1)

In [None]:
set2.issuperset(set1)

`pop()` removes an arbitrary element from the set and returns it.

In [None]:
set1.pop()
print(set1)

The `remove()` function deletes the specified element from the set. If the element is not in the set, a `KeyError` is raised.

In [None]:
set1.remove(2)
set1
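
If you are not sure whether an element is in the set, `discard()` can be used instead of `remove()`. A short sketch with a made-up set `letters`:

```python
letters = {'a', 'b', 'c'}

# remove() raises a KeyError when the element is missing.
try:
    letters.remove('z')
except KeyError:
    print("'z' is not in the set")

# discard() silently does nothing in that case.
letters.discard('z')
letters.discard('a')
print(letters)
```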

`clear()` is used to clear all the elements and make that set an empty set.

In [None]:
set1.clear()
set1

## Strings

Strings are ordered, text-based data. They are written by enclosing the text in single, double or triple quotes.

In [None]:
String0 = 'Python is awesome'
String1 = "Python is awesome"
String2 = '''Python
is
awesome'''

In [None]:
print(String0 , type(String0))
print(String1, type(String1))
print(String2, type(String2))

String indexing and slicing work the same way as for lists, which were explained earlier.

In [None]:
print(String0[4])
print(String0[4:])

### Built-in functions

The `find()` function returns the index of a given substring in the string. If the substring is not found, `find()` returns -1. Do not confuse this -1 with the index -1 used in reverse indexing.

In [None]:
print(String0.find('on is'))
print(String0.find('am'))

The index returned is the index of the first character of the substring that was searched for.

In [None]:
print(String0[4])

One can also tell the `find()` function between which index values it has to search.

In [None]:
print(String0.find('i',1))
print(String0.find('i',1,3))

`capitalize()` is used to capitalize the first character of the string. All other characters are converted to lowercase.

In [None]:
String3 = 'observe the first letter in this sentence.'
print(String3.capitalize())

`center()` is used to center align the string by specifying the field width.

In [None]:
String0.center(70)

One can also fill the remaining space with any other character.

In [None]:
String0.center(70,'-')

`zfill()` is used for zero padding by specifying the field width.

In [None]:
String0.zfill(30)

`expandtabs()` allows you to change the spacing of the tab character `'\t'`, which is set to 8 spaces by default.

In [None]:
s = 'h\te\tl\tl\to'
print(s)
print(s.expandtabs(1))
print(s.expandtabs())

The `index()` function works the same way as the `find()` function. The only difference is that `find()` returns -1 when the input is not found in the string, whereas `index()` throws a ValueError.

In [None]:
print(String0.index('Python'))
print(String0.index('is',0))
print(String0.index('is',10,20)) # raises a ValueError: 'is' does not occur between index 10 and 20
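
The difference between the two functions can be sketched as follows. The string `text` and the flag `index_failed` are made up for this illustration.

```python
text = 'Python is awesome'

# find() signals "not found" with the return value -1.
position = text.find('Java')
print(position)

# index() signals "not found" by raising a ValueError.
index_failed = False
try:
    text.index('Java')
except ValueError as error:
    index_failed = True
    print('index() raised a ValueError:', error)
```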

The `endswith()` function is used to check if the given string ends with a specific character or substring.

In [None]:
String0.endswith('y')

The start and stop index values can also be specified.

In [None]:
print(String0.endswith('l',0))
print(String0.endswith('M',0,5))

The `count()` function counts how often a character or substring occurs in the given string. The start and the stop index can also be specified or left blank.

In [None]:
print(String0.count('a',0))
print(String0.count('a',5,10))

The `join()` function is used to add a character in between the elements of the input.

In [None]:
'-'.join(String0)

The `join()` function can also be used to convert a list into a string.

In [None]:
a = list(String0)
print(a)
b = ''.join(a)
print(b)

When converting a list into a string, the `join()` function can also insert any character in between the list elements.

In [None]:
c = '/'.join(a)[18:]
print(c)

The `split()` function converts a string into a list, using the substring given as input as the separator.

In [None]:
d = c.split('/')
print(d)

In the `split()` function one can also specify the maximum number of times the string should be split. The returned list then contains at most one more element than the specified number, because every split cuts one piece into two.

In [None]:
e = c.split('/',3)
print(e)
print(len(e))
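
To make the relation between the number of splits and the number of list elements easier to see, here is a sketch with a short made-up string `path`:

```python
path = 'a/b/c/d'

# Without a limit the string is split at every separator.
all_parts = path.split('/')
print(all_parts, len(all_parts))

# With a limit of 1 only the first separator is used: 1 split, 2 elements.
one_split = path.split('/', 1)
print(one_split, len(one_split))

# With a limit of 2 the first two separators are used: 2 splits, 3 elements.
two_splits = path.split('/', 2)
print(two_splits, len(two_splits))
```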

`lower()` converts all uppercase letters to lowercase.

In [None]:
print(String0)
print(String0.lower())

`upper()` converts all lowercase letters to uppercase.

In [None]:
String0.upper()

The `replace()` function replaces the given substring with another string.

In [None]:
String0.replace('awesome','super awesome')

The `strip()` function is used to delete leading and trailing characters. If no argument is provided it removes whitespace (spaces, tabs and newlines).

In [None]:
f = '    hello      '
print(f.strip())

When a character is provided, `strip()` removes all occurrences of this character from both ends of the string. If this character is not at the edges of the string, the string is left unchanged.

In [None]:
f = '   ***----hello---*******     '
print("This string is for you to see the leading whitespaces.")
print(f.strip('*'))

If several characters are provided, all leading and trailing occurrences of any of these characters are stripped from the string.

In [None]:
f = '   ***----hello---*******     '
print(f.strip(' *-'))

The `lstrip()` and `rstrip()` functions work like the `strip()` function, but `lstrip()` only strips from the left end of the string and `rstrip()` only from the right end.

In [None]:
print(f.lstrip(' *'))
print(f.rstrip(' *'))
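
Characters in the middle of a string are never touched by these functions. The following sketch uses a made-up string `padded` to show this:

```python
padded = '--he--llo--'

both_ends = padded.strip('-')    # only the dashes at the edges are removed
left_only = padded.lstrip('-')   # only the leading dashes are removed
right_only = padded.rstrip('-')  # only the trailing dashes are removed

print(both_ends)
print(left_only)
print(right_only)
```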

The `isalpha()` method checks whether a string contains only letters.

In [None]:
'Hello'.isalpha()

In [None]:
s = '112a'
s1 = '1'
print(s.isdigit(), s1.isdigit())
print(s.isnumeric())
print(s.isdecimal())
print(s.isalnum()) # alphanumeric
print('Hello!'.isalnum())
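
The three number checks used in the cell above differ in which characters they accept: `isdecimal()` is the strictest and `isnumeric()` the most permissive. The sketch below illustrates this with a few made-up sample strings, among them a superscript two and a fraction character.

```python
samples = ['42', '²', '½', '4.2']

# Store (isdecimal, isdigit, isnumeric) for every sample string.
results = {}
for sample in samples:
    results[sample] = (sample.isdecimal(), sample.isdigit(), sample.isnumeric())
    print(repr(sample), results[sample])
```

Note that none of the three methods accepts the decimal point in `'4.2'`.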

### Strings are immutable

In contrast to lists, strings are immutable. You cannot change parts of a string by assignment, and string methods never change the string in place but return a new string instead.

```python
s = 'Hello Spam!'
s[3] = 'T'
# This will throw an error!
```

In [None]:
s = 'Hello Spam!'
s[1] = 'T'

In [None]:
s = 'Hello Spam!'
_ = s.lower()
print(_, s) # s is unchanged

If you want to change some letters in a string, you first have to create a mutable object. You can do this with the built-in function `list()` or by converting the string to a bytearray, which is mutable.

In [None]:
s = 'Hello Spam!'
s = list(s)
s[1] = 'T'
s = ''.join(s)
s

In [None]:
s = bytearray(b'Hello Spam!')
print(s)
s.extend(b' And Eggs!')
s = s.decode()
print(s)
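
For a single replacement it is often enough to build a new string from slices of the old one. A minimal sketch, using the made-up names `greeting` and `changed`:

```python
greeting = 'Hello Spam!'

# Everything before index 1, the new letter, everything after index 1.
changed = greeting[:1] + 'T' + greeting[2:]

print(changed)
print(greeting)  # the original string is unchanged
```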

## Dictionaries

Dictionaries work like a small database: each value is stored under a user-defined key, for example a string, and can be looked up with that key. To declare a dictionary you can use braces { } or the `dict()` function. The difference to declaring a set with braces is the use of colons in the expression.

In [None]:
d0 = {} # Empty dictionary NOT empty set
set0 = set()
print(type(d0))
print(type(set0))

In [None]:
d0 = {'One': 1, 'OneTwo': 12}
d0['OneTwoThree'] = 123
d0['ThreeFour'] = 34
print(d0)

The `keys()` built-in method gives you the keys of the dictionary.

In [None]:
print(d0.keys())

To access elements from a dict you use the keys to index the dict.

In [None]:
d0['One']
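
Indexing with a key that does not exist raises a `KeyError`. The `get()` method avoids this. The sketch below uses a made-up dict `lookup`:

```python
lookup = {'One': 1, 'OneTwo': 12}

# Indexing with a missing key raises a KeyError.
try:
    lookup['Two']
except KeyError:
    print("'Two' is not a key of the dict")

# get() returns None, or a default of your choice, instead.
missing = lookup.get('Two')
fallback = lookup.get('Two', 0)
present = lookup.get('One', 0)
print(missing, fallback, present)
```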

Two lists can be combined into a dict using the built-in function `zip()`. Since python3 the `zip()` function does not build the pairs right away. Instead it returns a zip object (an iterator), which saves computing time and memory by producing the pairs only when they are really needed.

To make a dictionary out of this zip object you can call the `dict()` function on it.

In [None]:
keys = ['One', 'Two', 'Three', 'Four', 'Five']
numbers = [1, 2, 3, 4, 5]
d2 = zip(keys, numbers)
print(d2)
d2 = dict(d2)
print(d2)
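
One consequence of this lazy behaviour is that a zip object can only be consumed once. A short sketch with the made-up names `pairs`, `first_pass` and `second_pass`:

```python
pairs = zip(['One', 'Two'], [1, 2])

# The first conversion produces all pairs.
first_pass = list(pairs)
print(first_pass)

# The zip object is now exhausted, so a second conversion yields nothing.
second_pass = list(pairs)
print(second_pass)
```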

### Built-in functions

The `clear()` function is used to erase the dictionary.

In [None]:
d2.clear()
print(d2)

A dict can also be built using loops. Loops are a part of control flow, which we will look at shortly.

In [None]:
for i in range(len(keys)):
    d2[keys[i]] = numbers[i]
print(d2)
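
The same loop can also be written without indexing, because `zip()` yields one key/value pair per iteration. This is only a sketch, and the names `names`, `values` and `lookup` are made up for it:

```python
names = ['One', 'Two', 'Three']
values = [1, 2, 3]

lookup = {}
# Each iteration unpacks one pair into name and value.
for name, value in zip(names, values):
    lookup[name] = value
print(lookup)
```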

The `values()` function returns a view object containing the values of the dictionary.

In [None]:
d2.values()

The `keys()` function gives the keys.

In [None]:
d2.keys()

The `items()` function returns a view object of tuples, where tuple[0] is the key and tuple[1] is the value.

In [None]:
d2.items()
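
In python3 these functions return view objects. A view follows later changes of the dict and can be turned into a list with `list()`. A minimal sketch with a made-up dict `prices`:

```python
prices = {'apple': 1, 'pear': 2}
key_view = prices.keys()

# The view reflects changes that are made after it was created.
prices['plum'] = 3
print(key_view)

# Views cannot be indexed. Convert them to a list first.
key_list = list(prices.keys())
print(key_list[0])
```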

The `pop()` function is used to get the value of a key and remove that key from the dict.

In [None]:
popped_value = d2.pop('Four')
print(popped_value)
print(d2)

# Control Flow

Control flow defines the order in which a program is executed. The most common control flow elements are the two loops (for and while) and the if-elif-else logic. Additionally python has more advanced control flow with break, continue, and pass.

## If

If checks in python **must** follow this formatting:

```python
if condition:
    do_this
```
Note the colon and the indentation in the next line. Python accepts any indentation as long as it is consistent within a block, but the convention (PEP 8) is 4 spaces per level. Tabs and spaces should not be mixed. Let's try a very simple example.

In [None]:
if True:
    print('This is true')

if False:
    print('This is false')

Note how only the block of the first if-statement is executed, because the condition of the second is False. Let's go one step further and assign a truth value to a variable.

In [None]:
a = 12
b = 10
truthvalue = a > b
print(truthvalue)

In [None]:
if truthvalue:
    print("a is greater than b")

However, this is not how it's usually done. Most often the condition is written directly in the if statement, like this:

In [None]:
x = 12
if x > 10:
    print("Hello")

## If with multiple conditions - Operators

You can use the logical operators `and`, `or` and `not` to tailor your if-statements.

In [None]:
x = 15
if x > 10 and x <= 20:
    print("variable lies between 10 and 20")

In [None]:
x = 2
if not x == 1:
    print("x is not 1")

In [None]:
x = 5
if x == 5 or x == 10:
    print("x is either 5, or 10")

## If-else

An `else:` statement can follow the indented block of an if statement, at the same indentation level as the `if`. The indented block below the else statement is executed if the condition of the if statement is false.

In [None]:
x = 12
if x > 10:
    print("hello")
    print("Multiple lines can be executed here. They all need to be indented")
    a = 2**2
    print(f"With {a} spaces. No more, no less.")
else:
    print("world")

## If-elif-else

If you want to check for multiple conditions you can use an if-elif-else construction.

In [None]:
x = 10
y = 12
if x > y:
    print("x>y")
elif x < y:
    print("x<y")
else:
    print("x=y")

There is also the possibility to nest if statements. Correct indentation becomes very important here.

In [None]:
x = 10
y = 12
if x > y:
    print("x>y")
elif x < y:
    print("x<y")
    if x==10:
        print("x=10")
    else:
        print("invalid")
else:
    print("x=y")

**Here's an exercise for you:**

\begin{exercise}\label{ex:if-else-elif}
Write an if-elif-else statement that checks whether a number is equal to, greater than, or less than zero. Check the numbers 0, -5, 1000, -0.5, 28.9 with this statement.
\end{exercise}

In [None]:
num = 0

if num > 0:
    print(str(num), "is a positive number")
elif num == 0:
    print("It is a zero")
else:
    print(str(num), "is a negative number")

## For loops

Checking all those numbers one by one is a bit tedious. Here's a way to automate this: `for`-loops. With for loops you can execute the indented block as often as you'd like. `for`-loops follow a syntax like this:

```python
for i in iterable:
    run code
    run this code also
    run code until iterable is exhausted
```

Note the colon and the indentation. The `iterable` in this syntax is an iterable object. Some objects, like lists, are iterable and can be plugged into the `for`-statement. There's also the built-in function `range()`, which is often used in this context.

In [None]:
for i in [0, 1, 2, 3]:
    print(i)

In [None]:
for i in range(4):
    print(i)
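
Besides a single stop value, `range()` also accepts a start value and a step. The sketch below converts the ranges to lists to make their contents visible. The variable names are made up for this illustration.

```python
only_stop = list(range(4))            # starts at 0, stops before 4
start_and_stop = list(range(2, 6))    # starts at 2, stops before 6
with_step = list(range(0, 10, 3))     # every third integer
counting_down = list(range(5, 0, -1)) # a negative step counts down

print(only_stop)
print(start_and_stop)
print(with_step)
print(counting_down)
```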

Note how `range(4)` takes the integer 4 as input, but 3 is the highest integer it yields. That's because `range(4)` yields 4 integers, starting at 0. Let's simplify the last exercise using a for loop.

In [None]:
for i in [0,-5,1000,-0.5,28.9]:
    if i > 0:
        print(str(i), "is a positive number")
    elif i == 0:
        print("It is a zero")
    else:
        print(str(i), "is a negative number")

If nested lists are plugged into the for-loop, the outermost list is used as the iterable.

In [None]:
list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for list1 in list_of_lists:
    print(list1)

If you want to access the lists on lower levels, you can write a nested for-loop.

In [None]:
list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for list1 in list_of_lists:
    for x in list1:
        print(x)

## For-else

In python a for-loop can be followed by an **`else`** statement. Its block is executed once, after the loop has completed.

In [None]:
for i in [0, 1, 2, 3]:
    print(f"At iteration {i}. This line is executed mutiple times.")
else:
    print(f"For loop finished. This line is executed one. Variable i is still {i}")

## While loops

While-loops are similar to for-loops. They execute the indented code. However, while-loops execute code as long as a conditional is true. The syntax looks something like this.

```python
condition = True
while condition:
    execute this code
    execute this also
    change the condition
```

Notice how the condition is changed inside the while-loop? This is a very important part of while-loops. If you don't change the condition inside the while loop, it will run indefinitely and eat up a lot of computer resources.

In [None]:
i = 1
count_to = 6

while i < count_to:
    print(i)
    i = i + 1
print(f"Loop finished. counted to {i}")

## Break

`break` exits a for- or while-loop immediately. It is often used with an if-statement.

In [None]:
for i in range(100):
    print(i)
    if i>=7:
        break

## Else in loops

There is also the possibility to use the `else` statement in a loop. The code of the else statement is executed, when the loop is terminated through exhaustion of its iterator. It is **not** executed, if a `break` is executed.

In [None]:
for n in range(2, 10):
    for x in range(2, n):
        if n % x == 0:
            print(f"{n} equals {x} * {n//x}. Breaking the loop.")
            break
    else:
        # loop fell through without finding a factor
        print(n, 'is a prime number. Iterator exhausted.')

\begin{exercise}\label{ex:division}
What does the n//x in the f string do?
\end{exercise}

`n//x` is a floor division: it divides n by x and rounds the result down to the nearest integer.

## Continue

`continue` skips the rest of the current iteration and continues with the next iteration of the loop.

In [None]:
for i in range(10):
    if i>4:
        print("The end.")
        continue
    elif i<7:
        print(i)

\begin{exercise}\label{ex:understanding_loops}
Change the code in the previous cell, so that it prints "The end." only once.
\end{exercise}

In [None]:
for i in range(10):
    if i>4:
        print("The end.")
        break
    elif i<7:
        print(i)
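
A typical use of `continue` is to skip elements that should not be processed. The sketch below collects the odd numbers below 10 in a made-up list called `odd_numbers`:

```python
odd_numbers = []
for i in range(10):
    if i % 2 == 0:
        continue  # even number: skip the rest of this iteration
    odd_numbers.append(i)
print(odd_numbers)
```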

## Pass statements

The `pass` statement does nothing. It allows you to fulfill the syntax without executing code. Have a look at the next code cell.

In [None]:
for i in [0, 1, 2]:
    pass

for i in [3, 4, 5]:

Check the error message:

```
  File "<ipython-input-268-5a99e96149af>", line 4
    for i in [3, 4, 5]:
                       ^
SyntaxError: unexpected EOF while parsing
```

This tells us that the second for-loop (the one iterating over the list [3, 4, 5]) has a syntax error. The first for-loop, however, doesn't throw an error. It also does nothing, due to the pass statement.
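
`pass` is also handy as a placeholder for a branch that you want to fill in later. In this sketch, the list `not_negative` is a made-up name that collects the numbers handled by the `else` branch.

```python
not_negative = []
for number in [-1, 0, 1]:
    if number < 0:
        pass  # nothing to do for negative numbers yet
    else:
        not_negative.append(number)
print(not_negative)
```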

# Moving on

Please visit the notebook 02_comprehensions_functions_classes.ipynb for further tutorials.
