# Introduction

In this introduction, you will learn the basics of Python. We start with some practical, basic operations. In later sessions, we will use more advanced libraries that make many tasks easier.

<div class="alert alert-info">
This Python notebook is intended to be used as an exercise. We have filled in many details for you, but in some places we ask you to fill in the blanks. Exercises that ask you to do something, or to think about something, are indicated like this. If you need to write and execute your own code, we provide empty space below to do so.
</div>

<div class="alert alert-warning">
If you need any help with anything, please don't hesitate to ask.
</div>

<div class="alert alert-success">
    A Jupyter notebook is made up of various cells, each containing a piece of text or some code that you can run. You can move from one cell to another by using the arrow keys or by clicking a cell with the mouse. In order to execute the code in a cell you have to press <code>Ctrl-Enter</code> while selecting the code cell. Alternatively, you can press the "<i class="fa fa-step-forward"></i> Run" button at the top of the screen. This also moves to the next cell at the same time. Using <code>Shift-Enter</code> instead of <code>Ctrl-Enter</code> will also execute the code and move to the next cell at the same time.
</div>

In [1]:
'Hello world'

'Hello world'

<div class="alert alert-success">
    If you have executed that code cell correctly, you should now see <code>In [1]:</code> in front of it. While the code in a cell is being executed it is marked by an asterisk <code>*</code>. Each cell of executed code will be numbered in the order in which you execute it. If you execute it again, it will be numbered <code>2</code>, et cetera.
</div>

<div class="alert alert-info">
    Calculate what is $2 \times 3$ by executing <code>2 * 3</code>.
</div>

<div class="alert alert-success">
To start typing in the cell below, select the cell using the mouse, or select it using the arrows on the keyboard and press <code>Enter</code>.
</div>

In [3]:
2*3

6

# Variables

As in any other language, you can use variables in Python and perform all types of calculations and operations on them. Assigning something to a variable is straightforward in Python. You can simply type a variable name and use `=` to assign it some value.

In [6]:
x = 2
y = 3

<div class="alert alert-success">
Note that this cell produces no output: by default, a cell only displays a result if its last line is an expression that is not assigned to a variable.
</div>

If you have executed the cell above, the variable `x` is now assigned the value `2` and `y` is assigned the value `3`. You can now use the variables `x` and `y` in calculations instead of the numbers `2` and `3`. For example:

In [7]:
x * y

6

In Python source code you sometimes want to comment on some code. In Jupyter notebooks, you can often comment on the code in other cells, but it can still be useful to include comments in the code itself. You can do that by using `#`: anything following the `#` on that line is ignored by Python.

In [8]:
x * y # Calculate the multiplication of x and y

6

You can also assign the same value to multiple variables at the same time:

In [9]:
x = y = z = 3

<div class="alert alert-info">
Calculate $x \times y \times z$.
</div>

In [10]:
x * y * z

27

Sometimes you want to explicitly output something. You can do that by using `print`.

In [11]:
print('Hello world')
print(x)

Hello world
3


There are essentially four important basic types of variables in Python:
1. Integers: `int`
2. Floating point numbers: `float`
3. String (i.e. textual data): `str`
4. Boolean (`True`, `False`): `bool`

In Python you don't need to be explicit about the type of a variable. You can re-assign values to existing variables, even from different types.

In [12]:
x = 'Hello world'
print(x)

Hello world
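
To see which type a variable currently has, you can use the built-in function `type`. The sketch below uses a made-up variable `value` and re-assigns it a few times; it is only meant as an illustration of the four basic types.

```python
value = 42
print(type(value))   # <class 'int'>

value = 4.2
print(type(value))   # <class 'float'>

value = 'forty-two'
print(type(value))   # <class 'str'>

value = True
print(type(value))   # <class 'bool'>
```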


<div class="alert alert-warning">
    You can use either single quotes <code>'</code> or double quotes <code>"</code> to create strings in Python. It does not matter which you use, as long as you use matching pairs.
</div>

You can perform typical calculations on numbers (i.e. `int` and `float`):
1. Addition `+`
2. Subtraction `-`
3. Multiplication `*`
4. Division `/`
5. Exponentiation `**`

In [13]:
2**3 # Exponentiation: 2 to the power of 3

8

Some of these operations can also be used on strings:

In [14]:
x = 'Hello'
y = 'world'
z = x + " " + y
print(z)

Hello world


Sometimes you can even mix an operation with different types:

In [15]:
x = '-'
y = 40
z = x * y
print(z)

----------------------------------------


Sometimes you want to explicitly convert one type to another type:

In [18]:
x = 1.7
y = int(x)
z = round(x)
print(x)
print(y)
print(z)

1.7
1
2


Instead of simply converting a floating point number to an integer, which drops the fractional part, you can also use `round`, which rounds to the nearest integer:

In [None]:
x = 1.7
y = round(x)
print(x)
print(y)
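
The difference between `int` and `round` is easiest to see with a few more values, including a negative one. This is only a small sketch; the numbers are chosen for illustration.

```python
print(int(1.7))         # 1: the fractional part is dropped
print(int(-1.7))        # -1: dropping the fraction moves towards zero
print(round(1.7))       # 2: rounded to the nearest integer
print(round(-1.7))      # -2
print(round(1.234, 2))  # 1.23: the optional second argument is the number of decimals
print(round(2.5))       # 2: exact halves are rounded to the nearest even integer
```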

Sometimes, you explicitly want to convert a number to a string. For example, you cannot add `42` to `'Hello'`:

<div class="alert alert-warning">
    The cell below will raise an error when executed.
</div>

In [19]:
x = 'Hello'
y = 42
x + y

TypeError: can only concatenate str (not "int") to str

You can add `'42'` to `'Hello'` though:

In [20]:
x + str(y)

'Hello42'
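
Conversion also works in the other direction: `int` and `float` can turn a string that contains a number into an actual number. A minimal sketch, with made-up variable names:

```python
text = '42'
number = int(text)      # the string '42' becomes the integer 42
print(number + 1)       # 43

price = float('3.5')    # the string '3.5' becomes the float 3.5
print(price * 2)        # 7.0
```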

<div class="alert alert-info">
Create a new string that says <code>'Cuckoo'</code> 3 times. Now create a string that says <code>'Cuckoo'</code> 20 times.
</div>

In [24]:
cuckoo3='Cuckoo'*3
cuckoo20='Cuckoo'*20
print(cuckoo3)
print(cuckoo20)

CuckooCuckooCuckoo
CuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckooCuckoo


There is one special value in Python to indicate that a certain variable does not have any value. Such a value can for example be used to indicate a missing value. This value is called `None`.

In [25]:
x = None
print(x)

None


# Collections

One thing that is very easy to do in Python is to use collections of items. There are essentially three types of collections in Python:
1. `list` and `tuple`. This is simply a list of items.
2. Dictionary (`dict`). This is a list of key-value pairs.
3. `set`. This is, well, a set of items, meaning that the items have to be unique.

## List and tuples

Both a `list` and a `tuple` essentially do the same thing: they store a list of items. That is, both contain items in some order. However, a `list` can be changed, whereas a `tuple` cannot. Once a `tuple` is created, it cannot be changed. A `list` is probably useful more often, so we will treat it in more detail.

### Tuple

You can create a tuple by enclosing multiple values between parentheses `(` and `)`.

In [26]:
x = (1, 2, 3) # Make a tuple of three values

You can access individual items as follows:

In [29]:
x[2] # Remember that indexing starts at 0, so the third item has index 2

3

You might have expected a `2` as output. Note that Python starts counting from `0`, such that the first item of a list or a tuple has index `0`, the second item has index `1`, et cetera. As said, you cannot change the value of a tuple:

<div class="alert alert-warning">
The cell below will generate an error; this is intended.
</div>

In [30]:
x[2] = 10

TypeError: 'tuple' object does not support item assignment

### List

You can create a list by enclosing multiple values between brackets `[` and `]`.

In [31]:
x = [1, 2, 3] # Make a list of three values

You can similarly access individual elements as follows:

In [32]:
x[2]

3

You can change the content of a list:

In [33]:
x[2] = 10
print(x)

[1, 2, 10]


You can also simply refer to other variables in a list, and you can mix types of variables in any way you like.

In [34]:
x = "Hello"
y = [10, x, 1.3]
print(y)

[10, 'Hello', 1.3]


Note that variables are, in a sense, simply labels. That is, if two labels refer to the same list, a change made through one label is also visible through the other.

In [35]:
x = [1, 2, 3]
y = x
y[2] = 10
print(x)

[1, 2, 10]


If you really want to *copy* a list, instead of just having the same label, you need to be explicit about it.

In [37]:
x = [1, 2, 3]
y = x.copy()
y[2] = 10
print(x)
print(y)

[1, 2, 3]
[1, 2, 10]


This is the first time we encounter a *method*. A method is a function that belongs to an object and that you can call on it. In this case, the method `copy` returns, well, a copy of the list.
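
To check whether two labels refer to the very same list or to separate copies, you can use the `is` operator. The sketch below uses the made-up names `alias` and `clone` to make the difference explicit.

```python
x = [1, 2, 3]
alias = x          # a second label for the same list
clone = x.copy()   # a new, independent list with the same items

print(alias is x)  # True: both labels refer to one and the same list
print(clone is x)  # False: the copy is a separate list
print(clone == x)  # True: the contents are (still) equal

x[0] = 100
print(alias)       # [100, 2, 3]: the change is visible through the other label
print(clone)       # [1, 2, 3]: the copy is unaffected
```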

<div class="alert alert-success">
Now it is time to introduce you to a little trick: you can get a list of all methods of a variable by pressing <code>Tab</code>. For example, type <code>x.</code>, including the <code>.</code>, and then press <code>Tab</code> (make sure the cursor is located after the <code>.</code>). If you then start typing the name of the method you are looking for and press <code>Tab</code> again, Jupyter will complete it as far as possible. This works in general: whenever you press <code>Tab</code>, Jupyter will try to <em>autocomplete</em> whatever you are typing.

One other trick: if you have selected a function and press <code>Shift-Tab</code> you will get documentation of what this function does. You can press the <code>+</code> to find out more.
</div>

<div class="alert alert-info">
What other methods of a list start with a c? (Use the hint above)
</div>

In [40]:
len(x)

3

You can also access slices of items instead of individual elements:

In [41]:
x = [1, 2, 3, 4, 5]
x[:3]

[1, 2, 3]

The notation `[:3]` means all items up to, but not including, index `3`, i.e. the items at index `0`, `1` and `2`. In other words, it simply gives the first 3 items. You can also get a specific slice that starts not with the first, but with the second item:

In [42]:
x[1:3]

[2, 3]

<div class="alert alert-info">
    Get the first 4 items of <code>x</code>.
</div>

In [43]:
x[:4]

[1, 2, 3, 4]

One neat trick is that you can also count from the end of a list. This uses negative indexing, where `-1` refers to the last item, `-2` to the item before the last, et cetera. Because the end of a slice is not included, `[:-1]` gives you all items except the last one:

In [44]:
x[:-1]

[1, 2, 3, 4]

Similarly, you can get the last three items using:

In [45]:
x[-3:]

[3, 4, 5]
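
A few more examples of negative indices, both for single items and for slices, may help to build intuition. This is only an illustrative sketch on the same list.

```python
x = [1, 2, 3, 4, 5]

print(x[-1])    # 5: the last item
print(x[-2])    # 4: the item before the last
print(x[2:])    # [3, 4, 5]: everything from index 2 onwards
print(x[:-2])   # [1, 2, 3]: everything except the last two items
print(x[-2:])   # [4, 5]: the last two items
```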

<div class="alert alert-info">
    Get the middle 3 items of <code>x</code> in two different ways, once using positive indices and once using negative indices.
</div>

In [46]:
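print(x[1:4])    # Positive indices
print(x[-4:-1])  # Negative indices: from the fourth-to-last item up to, but not including, the last
print(x[1:-1])   # Mixing both also works

[2, 3, 4]
[2, 3, 4]
[2, 3, 4]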


You can add items to a list by using `append`.

<div class="alert alert-warning">
If you (accidentally) run the cell below multiple times, multiple items will be added.
</div>

In [None]:
x.append(6)
print(x)

You can also concatenate two lists by using `+`.

In [49]:
x = [1, 2, 3]
y = ['one', 'two', 'three']
x + y

[1, 2, 3, 'one', 'two', 'three']

<div class="alert alert-info">
    Try out: what does <code>*</code> do when combining an integer number and a list?
</div>

In [50]:
x = 10
y = [1, 2, 3]
print(x*y)

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]


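It is no coincidence that `*` does the same thing on strings and on lists. In many ways, a string behaves like a list of single characters, so indexing and slicing work on strings as well. Unlike a list, however, a string cannot be changed in place.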

In [51]:
x = 'Hello world'
print(x[6:])

world


You can get the length of a list, or of a string as in this case, using `len`:

In [52]:
len(x)

11

You can also sum all elements in a list, and by combining this with `len` you can calculate the average:

In [53]:
x = [1, 2, 3]
print(sum(x))         # Total
print(sum(x)/len(x))  # Average

6
2.0


## Set

A `set` is simply a collection of items that all have to be unique. Similar to a list, the items can be of different types. A `set` is unordered, meaning that it does not maintain the order in which items are added.

You can construct a `set` by using braces `{` and `}`.

In [55]:
x = {3, 'Netherlands', 5}
print(x)

{3, 'Netherlands', 5}


A `set` is dynamic, and you can easily add items to it:

In [57]:
x.add('Germany')
print(x)

{3, 'Netherlands', 5, 'Germany'}


If you add an item to a set that is already included, it is silently ignored.

In [None]:
x.add(3)
print(x)
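
Because duplicates are ignored, a set is a convenient way to find the unique items of a list. The list of countries below is made up for illustration.

```python
countries = ['Netherlands', 'Germany', 'Netherlands', 'Germany', 'France']
unique_countries = set(countries)   # duplicates are dropped

print(len(countries))          # 5
print(len(unique_countries))   # 3
```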

You can also use typical set operations on sets, such as union or intersection.

In [58]:
y = {3, 'United State', 6}
print(x.union(y))
print(x.intersection(y))

{3, 5, 6, 'United State', 'Netherlands', 'Germany'}
{3}


# Dictionaries

Dictionaries are collections that contain pairs of keys and values. That is, instead of a numeric index (e.g. index `2`) you can use other kinds of indices (e.g. `'Netherlands'`). As with a list, an index in a dictionary can refer to only one value, so if `'Netherlands'` is already in the dictionary, it always refers to a single value. The indices are called the keys, and the contents stored under them are called the values.

You construct a dictionary by using braces `{` and `}`, where you separate the key and the value using `:`. Note that this notation is similar to that of a set, with the addition that each key is assigned a value. Indeed, the keys of a dictionary also have to be unique, and are hence similar to a set.

In [59]:
x = {'Netherlands': 30, 'United States': 50}
print(x['Netherlands'])

30


You can list the keys and the values separately. The results behave somewhat like lists or sets.

In [60]:
print(x.keys())
print(x.values())

dict_keys(['Netherlands', 'United States'])
dict_values([30, 50])
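
If you need an actual list, for example to access items by position, you can convert the keys and values using `list`. A minimal sketch, using the same dictionary as above and the made-up names `keys` and `values`:

```python
x = {'Netherlands': 30, 'United States': 50}

keys = list(x.keys())       # ['Netherlands', 'United States']
values = list(x.values())   # [30, 50]

print(keys[0])       # Netherlands
print(sum(values))   # 80
```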


You can simply add an item by using an index that was not used before:

In [61]:
x['United Kingdom'] = 40
print(x)

{'Netherlands': 30, 'United States': 50, 'United Kingdom': 40}


Values can be of any type, and keys can be of any hashable type, such as strings, numbers or tuples (a list cannot be used as a key). You can freely mix types, although whether that makes things clearer is another question.

In [62]:
x[100] = 'France'
print(x)

{'Netherlands': 30, 'United States': 50, 'United Kingdom': 40, 100: 'France'}


You can easily combine lists and dictionaries in various ways. For instance, below we create a dictionary of lists.

In [63]:
x = {'Netherlands': [3, 5, 6], 'Germany': [10, 8, 20]}

<div class="alert alert-info">
    Create a list of dictionaries, feel free to simply make up something.
</div>

In [66]:
x = [{'cwts': 1, 'Germany': 10}, {'netherlands': 5, 'germany': 8}]
print(x)

[{'cwts': 1, 'Germany': 10}, {'netherlands': 5, 'germany': 8}]


# Control structures

Like many other programming languages, Python has control structures such as `if .. else` and `for`. There are other relevant control structures, such as `while`, but we will not treat them here.

## `if .. else`

An `if .. else` statement does something conditionally. Depending on whether a condition is true or false, it executes either one block of code or the other.

In [None]:
x = 3
y = 4
if x == y:
    print('x and y are the same')
else:
    print('x and y are different')

If you need to consider yet another possibility, you can use `elif`, which is short for "else if". An `elif` condition is only checked if all preceding conditions are false.

In [None]:
if x == y:
    print('x and y are the same')
elif x < y:
    print('x is smaller than y')
elif x > y:
    print('x is larger than y')
else:
    print('This should not be possible')

<div class="alert alert-warning">
One notable thing about Python is that it is very (very) picky about indentation. In some programming languages indentation is irrelevant, but Python uses the indentation to recognize where a "block" of code starts and stops.
</div>
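
The following sketch illustrates how indentation determines which lines belong to the `if` block. The list `messages` is a made-up helper that records which lines were executed.

```python
x = 3
y = 4
messages = []

if x == y:
    messages.append('inside the if block')       # indented: part of the block
    messages.append('also inside the if block')  # indented: part of the block
messages.append('after the if block')            # not indented: always executed

print(messages)   # ['after the if block']
```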

Normally you can test for equality using `==`, but this is different when comparing to `None`. You should test using `x is None` instead:

In [None]:
x = None
if x is None:
    print('x is None')

<div class="alert alert-warning">
    In this case you can also test using <code>x == None</code>, but this is not guaranteed to always work correctly, because an object can define for itself what <code>==</code> means. The details are beyond the current introduction. You should also use <code>is</code> for some other comparisons, but this is again beyond the current introduction.
</div>
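
To give a rough idea of the difference: `==` compares contents, whereas `is` checks whether two labels refer to the very same object. A minimal sketch with made-up variables `a` and `b`:

```python
a = [1, 2, 3]
b = [1, 2, 3]

print(a == b)   # True: the two lists have the same contents
print(a is b)   # False: they are two separate list objects

x = None
print(x is None)       # True
print(x is not None)   # False
```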

Testing whether something is *not* equal is done using `!=`, or `is not` when comparing to `None`.

In [None]:
x = 3
y = 4
if x != y:
    print('x and y are different')

You can combine different comparisons using the keywords `and` and `or`, and negate something using `not`. For example:

In [None]:
x = y = 3
z = 4
if x == y and x < z:
    print('Hurray!')
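
The keywords `or` and `not` work in a similar way. The values below are chosen only for illustration.

```python
x = 3
y = 4
z = 4

# `or` is true if at least one of the two comparisons is true
if x == y or y == z:
    print('At least one pair is equal')

# `not` negates a comparison
if not x == y:
    print('x and y are different')
```

Both messages are printed here: `y == z` is true, and `x == y` is false.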

Finally, you can test whether something is part of a collection using `in`.

In [None]:
x = ['Netherlands', 'Germany', 'United States']
y = 'Netherlands'
if y in x:
    print('Included in the list')

## `for` loops

`for` loops are very convenient for looping through elements of things, for example lists. You always use it like `for .. in ..`. An example is probably most instructive:

In [67]:
y = [1, 2, 3, 4, 5]
for x in y:
    print(x*x)

1
4
9
16
25


You can also simply loop over some numbers using `range`:

In [68]:
for x in range(5):
    print(x*x)

0
1
4
9
16
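
`range` can also take a start value and a step size. Wrapping it in `list` shows which numbers it produces; the sketch below is only an illustration.

```python
print(list(range(5)))          # [0, 1, 2, 3, 4]: from 0 up to, but not including, 5
print(list(range(2, 6)))       # [2, 3, 4, 5]: from 2 up to, but not including, 6
print(list(range(0, 10, 3)))   # [0, 3, 6, 9]: in steps of 3
```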


You can also use `for` to create lists. This is called a "list comprehension":

In [69]:
z = [x*x for x in y]
print(z)

[1, 4, 9, 16, 25]
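
A list comprehension can also include a condition, so that only some items are used. The sketch below keeps only the even numbers; the name `even_squares` is made up for illustration.

```python
y = [1, 2, 3, 4, 5]

# x % 2 is the remainder of dividing x by 2, which is 0 for even numbers
even_squares = [x*x for x in y if x % 2 == 0]
print(even_squares)   # [4, 16]
```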


You can also loop over items in a dictionary.

In [70]:
y = {'Netherlands': 30, 'Germany': 50, 'United Kingdom': 40}
for country, value in y.items():
    print(country + ' has ' + str(value))

Netherlands has 30
Germany has 50
United Kingdom has 40


The above cell illustrates something that you have not encountered before. You can "unpack" values from collections in Python, simply by listing more than one variable. This works for the loop variables of a `for` loop, as above, and also in front of the `=` in an assignment.

In [71]:
x = [1, 2]
y, z = x
print(y)
print(z)

1
2


Similar to lists you can also use `for` to create dictionaries. This is called "dictionary comprehension".

In [72]:
y = {'Netherlands': 30, 'Germany': 50, 'United Kingdom': 40}
z = {c : v*v for c, v in y.items()}
print(z)

{'Netherlands': 900, 'Germany': 2500, 'United Kingdom': 1600}


Sometimes you want not only the items of a list, but also their indices. You can use `enumerate` for this. It returns two values in each iteration: the index and the item itself.

In [73]:
y = [x*x for x in range(5)]
for i, x in enumerate(y):
    print(str(i) + '*' + str(i) + '=' + str(x))

0*0=0
1*1=1
2*2=4
3*3=9
4*4=16


You can use `any` or `all` to check if some condition holds for any or all items in a collection.

In [74]:
x = [1, 2, 3, 4, 5]
print(all(v < 5 for v in x)) # Check if all values are smaller than 5
print(any(v < 5 for v in x)) # Check if any value is smaller than 5

False
True


# Functions

Sometimes it can be easier to put some functionality in a separate function, especially when the functionality is more complicated. You can simply create functions by using `def`. Let us consider the following example.

In [75]:
def square(x):
    return x*x

This function is called `square` and accepts one argument, called `x` and returns a single value, namely the square of `x`. You can now use the function `square`:

In [76]:
square(10)

100

You can also define functions with multiple arguments:

In [77]:
def repeat(x, n):
    return x*n
print(repeat('-', 10))

----------


In this case, there are two arguments, `x` and `n`. From the function itself, it is not directly clear which argument does what. You could clarify that in comments (or in actual documentation, but we will not treat that here).
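
One possible way to make the roles of the arguments clearer is sketched below: comments inside the function describe the arguments, and the caller names the arguments explicitly when calling the function.

```python
def repeat(x, n):
    # x: the string (or list) to repeat
    # n: the number of times to repeat it
    return x*n

print(repeat('ab', 3))       # ababab
print(repeat(x='ab', n=3))   # ababab: the same call, with the arguments named
print(repeat(n=3, x='ab'))   # ababab: named arguments can be given in any order
```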

You can use a function in any other place, and also combine functions.

In [78]:
[repeat('o', square(n)) for n in range(10)]

['',
 'o',
 'oooo',
 'ooooooooo',
 'oooooooooooooooo',
 'ooooooooooooooooooooooooo',
 'oooooooooooooooooooooooooooooooooooo',
 'ooooooooooooooooooooooooooooooooooooooooooooooooo',
 'oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo',
 'ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo']

# Reading a file

You can open a file for reading using `open`. You can simply read things line by line, or you can iterate through all remaining lines of a file simply using `for`.

In [79]:
# Assumption: the file is UTF-8 encoded. Without an explicit encoding, Python
# falls back to a platform-dependent default, which can raise a UnicodeDecodeError.
with open('../data/publications.txt', encoding='utf-8') as f:
    header = f.readline()
    lines = []
    for line in f:
        lines.append(line)

<div class="alert alert-warning">
    Using <code>with</code> ensures that the file is also correctly closed. If you don't use <code>with</code> you have to remember to close the file manually, by calling <code>f.close()</code>.
</div>
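
The sketch below illustrates that the file is indeed closed after the `with` block. To keep the example self-contained, it first writes a small throw-away file; the file name and its contents are made up.

```python
import os
import tempfile

# Create a small throw-away file so that the example does not depend on other data
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('first line\nsecond line\n')

with open(path, encoding='utf-8') as f:
    first = f.readline()
    print(f.closed)   # False: we are still inside the with block

print(f.closed)       # True: the file was closed automatically
print(first)          # first line
```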

In fact, there is a specialised method, `readlines`, which reads all remaining lines into a list for you:

In [None]:
with open('../data/publications.txt', encoding='utf-8') as f:  # assuming a UTF-8 encoded file
    header = f.readline()
    lines = f.readlines()

Let's see how many lines we have read:

In [None]:
len(lines)

Let us take a look at the first line, the header:

In [None]:
header

The `\t` characters you see are tabs: this is a tab-delimited file. You also see a `\n` at the end of the line. This is a so-called "newline", which marks the end of the line.
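
The difference between how such a string is stored and how it is printed can be seen with `print` and `repr`. The string below is a made-up example, not the actual header of the file.

```python
line = 'id\ttitle\tyear\n'   # made-up example line

print(line)         # tabs and the newline are shown as whitespace
print(repr(line))   # 'id\ttitle\tyear\n': special characters are shown as \t and \n
print(len(line))    # 14: each \t and \n counts as a single character
```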

You can split the header using the string method `split`.

In [None]:
header.split('\t')

<div class="alert alert-info">
    This leaves an annoying <code>\n</code> in the last item of the list. Can you remove it?
</div>

<div class="alert alert-info">
    Can you similarly try to split the first line in <code>lines</code>?
</div>

<div class="alert alert-info">
    Can you split all lines, and store it in the variable <code>splitlines</code>?
</div>