# Introduction to Python - Data Types II, Functions and Packages

In [2]:
# Authors: Matthias Huber (huber@ifo.de), Alex Schmitt (schmitt@ifo.de)

import datetime
print('Last update: ' + str(datetime.datetime.today()))

Last update: 2017-04-21 18:03:31.354790


## TO BE ADDED

- more on sets: methods, examples
- more on dictionaries: methods
- writing own modules/scripts
- doctest, unittest ???



## Sets

*Sets* are similar to lists in that they are mutable: elements can be added and removed after the set has been created. However, the elements are not stored in any particular order, hence you cannot use indices for sets. Additionally, there are no duplicates, so a set can be described as an "unordered collection of unique elements".

In [47]:
# set
B = {4,5,6}
print (B)

{4, 5, 6}


In [48]:
# print(B[0])  # will throw an error: sets do not support indexing!

As for lists and tuples, the length (number of items) can be measured with the function len():

In [49]:
print (len(B))

3
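
The "unique elements" property can be seen directly when a set is built from a list that contains duplicates. The following is a small sketch; the list `numbers` is a made-up example and not part of the cells above. It also shows membership tests with the keyword `in`, which work for sets just as for lists.

```python
# Duplicates are dropped automatically when a set is created
numbers = [4, 5, 5, 6, 6, 6]      # hypothetical example list
unique_numbers = set(numbers)
print(unique_numbers)
print(len(unique_numbers))        # 3

# Membership tests use the keyword "in"
print(5 in unique_numbers)        # True
print(7 in unique_numbers)        # False
```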


### Methods for Sets

The equivalent to the **append()** method for sets is the **add()** method. Unsurprisingly, adding an element to a set which is already in there will not change the set.

In [20]:
B.add(3)
print(B)
# BTW: adding an element to a set which is already in there will not change the set
B.add(4)
print(B)

{3, 4, 5, 6}
{3, 4, 5, 6}


Further methods of sets are the major mathematical/logical operations. Important ones are:
* **intersection:** $A \cap B$
* **union:** $A \cup B$
* **issubset:** $A \subseteq B$
* **issuperset:** $A \supseteq B$
* **difference:** $A \setminus B$

In [42]:
# Define a second set, "A"
A={1,2,3,4}
# Intersect and union A with B
print (A.intersection(B))
print (A.union(B))

{3, 4}
{1, 2, 3, 4, 5, 6}


In [43]:
# Test whether A is subset or superset of B
print (A.issubset(B))
print (A.issuperset(B))

False
False


In [46]:
# Compute the differences of the sets
print (A.difference(B))
print (B.difference(A))

{1, 2}
{5, 6}
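
Python also offers operator shorthands for these set methods. The sketch below redefines `A` and `B` with the values used above so that it runs on its own; each line is equivalent to the method call named in the comment.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

print(A & B)    # same as A.intersection(B)
print(A | B)    # same as A.union(B)
print(A - B)    # same as A.difference(B)
print(A <= B)   # same as A.issubset(B)
print(A >= B)   # same as A.issuperset(B)
```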


We can also define a new set that is built by an operation of two existing sets, e.g. $C=A \cup B$

In [45]:
C=A.union(B)
print (C)

{1, 2, 3, 4, 5, 6}


In [40]:
# The union C is a superset of both A and B; conversely, A and B are subsets of C
print (C.issuperset(B))
print (C.issubset(B))
print (B.issubset(C))

True
False
True


## Dictionaries

A very important and useful data type is the *dictionary*. Dictionaries are similar to lists, but their entries (*values*) are indexed by names (*keys*) rather than by numbers. In other words, dictionaries are *key-value mappings*: they map a key (e.g. 'name') to a value (e.g. the string 'Alex'). Note that the values in a dictionary can be of any type (integers, floats, strings, booleans, lists etc.), while the keys must be immutable objects such as strings, numbers or tuples.

In [52]:
# dictionary
info = {'name': 'Alex', 'age': 34, 'likes_football': True, 'interests': ['Python', 'Economics', 'Game of Thrones']}

print(info)
print(info['name'])
print(info['age'])
print(info['likes_football'])
print(info['interests'])

{'name': 'Alex', 'interests': ['Python', 'Economics', 'Game of Thrones'], 'likes_football': True, 'age': 34}
Alex
34
True
['Python', 'Economics', 'Game of Thrones']


You can add new key-value pairs to an existing (or an empty) dictionary. 

In [58]:
# add a new entry to an existing dictionary
info['height'] = 1.82
print(info)

# create an empty dictionary and fill it 
residents = dict()
residents['Munich'] = 1.5e+6
residents['Berlin'] = 3.5e+6
residents['London'] = 8.5e+6

print(residents)

{'name': 'Alex', 'interests': ['Python', 'Economics', 'Game of Thrones'], 'height': 1.82, 'likes_football': True, 'age': 34}
{'Munich': 1500000.0, 'Berlin': 3500000.0, 'London': 8500000.0}


The value of entries can be changed:

In [62]:
residents['Munich'] += 100000
residents['London'] = residents['London']*0.9
print(residents)

{'Munich': 1600000.0, 'Berlin': 3500000.0, 'London': 7650000.0}


As for sets, len() can also be used to determine the length (number of key-value pairs) of a dictionary:

In [63]:
print (len(residents))

3


### Methods for Dictionaries

The complete list of keys and the complete list of values of a dictionary can be accessed by using the **.keys()** and **.values()** methods, respectively:

In [18]:
print(info)

print(info.keys())
print(info.values())

{'name': 'Alex', 'interests': ['Python', 'Economics', 'Game of Thrones'], 'age': 34, 'likes_football': True}
dict_keys(['name', 'interests', 'age', 'likes_football'])
dict_values(['Alex', ['Python', 'Economics', 'Game of Thrones'], 34, True])
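
Three related tools are often used together with **.keys()** and **.values()**: the method **.items()**, the keyword `in`, and the method **.get()**. The following sketch uses a shortened copy of the `info` dictionary so that it runs on its own; the default value `'unknown'` is just an illustrative choice.

```python
info = {'name': 'Alex', 'age': 34, 'likes_football': True}

# .items() yields (key, value) pairs, which is convenient in a for-loop
for key, value in info.items():
    print(key, '->', value)

# "in" tests whether a key exists in the dictionary
print('age' in info)       # True
print('height' in info)    # False

# .get() returns a default value instead of raising a KeyError for a missing key
print(info.get('height', 'unknown'))   # unknown
```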


Another helpful method for dictionaries is **.update(other)**. It is used to extend (update) an existing dictionary with the key-value pairs of another one.

In [18]:
de_en={'blau':'blue','grün':'green'}
de_en_add={'rot':'red','gelb':'yellow'}
print(de_en)
de_en.update(de_en_add)
print(de_en)
# update() also accepts a list of (key, value) tuples
de_en.update([('grau', 'grey'),('schwarz','black')])
print (de_en)

{'blau': 'blue', 'grün': 'green'}
{'gelb': 'yellow', 'blau': 'blue', 'grün': 'green', 'rot': 'red'}
{'schwarz': 'black', 'grau': 'grey', 'gelb': 'yellow', 'blau': 'blue', 'grün': 'green', 'rot': 'red'}


In [21]:
print(de_en['schwarz'])

black
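
Note that **.update()** does not only add new entries: if a key already exists, its value is overwritten. A minimal sketch, using a fresh copy of the small translation dictionary (the upper-case `'GREEN'` is only there to make the change visible):

```python
de_en = {'blau': 'blue', 'grün': 'green'}

# 'grün' already exists and is overwritten; 'rot' is new and is added
de_en.update({'grün': 'GREEN', 'rot': 'red'})

print(de_en['grün'])   # GREEN
print(len(de_en))      # 3
```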


## Writing Functions

In order to get an intuition for the idea of a function in Python (or in any other programming language), recall what a function in math does, for example $y = f(x) = x^2$: it is a mapping that takes a number $x$, performs some operation on it -- here it multiplies it by itself -- and then returns the result as "output" $y$. A function in programming does the same, with the difference that inputs and outputs can be anything, not just numbers.

Functions come in two varieties: predefined ones and user-defined ones. Built-in functions are part of the core language and can be used right away; functions that are part of a module in the standard library or of some package can be used once the module has been imported (see below). We already encountered some built-in functions, namely **print()**, **type()**, **len()** and **range()**. The full list of built-in functions can be found here:

In [147]:
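import webbrowser   # already imported earlier in this tutorial; repeated so the cell runs on its own

url = 'https://docs.python.org/3/library/functions.html'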
webbrowser.open(url)

True

In addition, you can (and should) write your own functions. Here are three examples. The first one, called 'sum_squared', translates the math function $f(x,y) = x^2 + y^2$ into Python code: it takes two numbers (int or float) as inputs and returns the sum of their squares. The second function, 'reverse_order', takes a list and returns it in reverse order. The third example just prints a string:

In [30]:
def sum_squared(x, y):
    return x**2 + y**2

def reverse_order(ls):
    return ls[::-1]

def all_men_must_die():
    print("Valar Morghulis!")

print(sum_squared(8, 3))

names = ['Daenerys', 'Tyrion', 'Arya', 'Samwell']
names_reverse = reverse_order(names)
print(names)
print(names_reverse)

all_men_must_die()
print(all_men_must_die())

73
['Daenerys', 'Tyrion', 'Arya', 'Samwell']
['Samwell', 'Arya', 'Tyrion', 'Daenerys']
Valar Morghulis!
Valar Morghulis!
None


Some comments about the syntax for writing functions:
1. A function always starts with the keyword **def** (for define or definition). This is followed by the *function name*, which is the choice of the programmer and can be virtually anything - be careful though not to use names that are *already used for built-in functions*!
2. The function name is followed by *parentheses* containing the names for the inputs into the function. Sometimes functions may not take inputs, in which case the parentheses are left empty (as seen in the third example above).
3. As is the case with loops and if-statements, the first line of the function definition is concluded with a colon (**:**).
4. After the colon follows the code block where you define the operations that the function should perform. The rules about *indentation* that we discussed in the context of loops above apply here as well. The code block can consist of a single return statement or many lines of code.
5. The output that a function gives is determined by the **return** statement. If there is no return statement, the function returns **None**. Note that a function can have arbitrarily many return statements; execution of the function terminates when the first return is hit:

In [148]:
def f(x):
    if x < 0:
        return 'negative'
    return 'nonnegative'

print(f(-3))

negative


Functions are an extremely important tool in computing. User-defined functions help improve the clarity and readability of your code by
- separating different strands of logic
- facilitating code reuse

In other words, very often a (large) computational problem is broken up into smaller subproblems, which are coded up as functions. The main program then coordinates these functions, calling them to do their job at the appropriate time.
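
As a small illustration of this idea, the following sketch computes the variance of a list of numbers by splitting the task into three short functions. The function names and the data are made up for this example; the point is only that each function does one job and the main program at the bottom merely coordinates the calls.

```python
def mean(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def squared_deviations(values):
    """Return a list with the squared deviation of each value from the mean."""
    m = mean(values)
    result = []
    for v in values:
        result.append((v - m)**2)
    return result

def variance(values):
    """Return the population variance, using the two functions above."""
    return mean(squared_deviations(values))

# Main program: it only coordinates the function calls
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(data))
print(variance(data))
```

Note that `mean` is reused inside both other functions, which is exactly the kind of code reuse mentioned above.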

In order to increase the clarity of your code, it is good practice to include a description of what the function does. Inserting regular comments using "#" would do the trick, but a better way is using *docstrings*, as in the following example. The great advantage is that you can access the description without opening the file in which the function is defined (this is very useful when your function is stored in a different file or in an imported package):

In [2]:
def reverse_order(ls):
    """
    Takes a list and returns it in reverse order
    """
    return ls[::-1]


In [4]:
reverse_order?
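
The question-mark syntax only works in IPython and Jupyter notebooks. In a plain Python script or interpreter, the same docstring can be reached with the built-in function **help()** or through the attribute `__doc__`. A short sketch, repeating the function definition so that it runs on its own:

```python
def reverse_order(ls):
    """
    Takes a list and returns it in reverse order
    """
    return ls[::-1]

print(reverse_order.__doc__)   # the raw docstring as a string
help(reverse_order)            # formatted help text including the docstring
```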

## Importing Modules and Packages

So far, we have used data types and functions which are part of the core language and which you can use without any additional code. In addition to these core functionalities, the Python standard library also contains *modules*. Modules are basically files that contain additional functions and definitions. In order to use the functions provided by a module, you need to **import** it. We have done this already at the beginning of this tutorial when we imported the module *webbrowser* in order to open a webpage. As a second example, the following cell imports a module called *random*, which you can use, among other things, to draw a random number from a uniform distribution. Importing the whole module makes all of its functions available for use in your program. In this case, the name of the function - e.g. *uniform* - must be preceded by the name of the module, i.e. *module_name.function_name*.

In [7]:
import random   # import module

print(random.random()) # draws a random number from a uniform distribution between 0 and 1
print(random.uniform(0,1)) # draws a random number from a uniform distribution between 0 and 1
print(random.uniform(-5,5)) # draws a random number from a uniform distribution between -5 and 5
print(random.randrange(0, 11))  # draws a random integer from 0 to 10 (i.e. excluding the given endpoint)

0.6865121121796949
0.6358618210048055
-0.9321606090870684
2


Alternatively, you can import individual functions from a module. Then, calling the name of the function is sufficient. However, I would avoid this syntax for the most part since it may cause conflicts with variable or function names that are already in use in your program.

In [8]:
from random import uniform   # import a single function from the module

print(uniform(0,1)) # draws a random number from a uniform distribution between 0 and 1

0.784059537922489
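
To see what such a name conflict can look like, consider the following sketch. The user-defined function `uniform` is hypothetical and exists only for this illustration: after the `from ... import` statement, the name `uniform` silently refers to the function from *random* instead.

```python
def uniform(a, b):
    """Hypothetical user-defined function: returns the midpoint of a and b."""
    return (a + b) / 2

print(uniform(0, 1))   # 0.5

from random import uniform   # silently replaces the function defined above

print(uniform(0, 1))   # now a random number between 0 and 1
```

With `import random` and `random.uniform(0, 1)`, both functions could be used side by side.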


In addition to the functions and modules contained in the standard library, there is a large number of *packages* or *external libraries*. Those are usually written and maintained by external developers and consist of one or more modules. If you have installed the Anaconda distribution of Python, many packages are automatically included, which means you just need to import them. If you have only the core package installed or if you want to use a package that is not part of Anaconda, you will need to download and install it first. 
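
If you are unsure whether a package is available in your installation, you can check before importing it. The sketch below uses the standard library function `importlib.util.find_spec`, which returns `None` when a top-level package cannot be found; the package name 'numpy' is only an example and can be replaced by any other name.

```python
import importlib.util

# 'numpy' is only an example name; replace it with the package you need
package_name = 'numpy'

if importlib.util.find_spec(package_name) is None:
    print(package_name + ' is not installed')
else:
    print(package_name + ' is available and can be imported')
```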

## Writing and Importing Your Own Modules
Python allows you to write your own modules, which can be imported in the same way as modules from the standard library or from other developers. A module typically changes while you develop it, but Python loads a module only once per session. Hence, after each change the module has to be reloaded in the current working environment.

In [3]:
from importlib import reload
#reload(module_name)

In [2]:
# Alternatively, there is a so-called magic function that enables automatic reloading/updating of modules:
# %load_ext autoreload
# %autoreload 2

As a simple example, we put the functions that were defined above in an external file, our first module, which we call "firstmodule". The file can be written in any text editor and has to be saved with the ending ".py".
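
The file itself is not shown here. Assuming it simply collects the three functions from the section on writing functions, the content of "firstmodule.py" might look like the following sketch (the docstrings are added for illustration):

```python
# Assumed content of firstmodule.py (saved in the current working directory)

def sum_squared(x, y):
    """Return the sum of the squares of x and y."""
    return x**2 + y**2

def reverse_order(ls):
    """Take a list and return it in reverse order."""
    return ls[::-1]

def all_men_must_die():
    """Print a fixed string."""
    print("Valar Morghulis!")
```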

In [12]:
import firstmodule as fm

In [13]:
fm.sum_squared(2,3)

13

In [None]:
fm.all_men_must_die()

After changing the module, it has to be reloaded to activate the changes:

In [4]:
reload(fm)  # pass the name under which the module was imported (here: fm)
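
The whole edit-and-reload cycle can be tried out in a self-contained way. The following sketch writes a throwaway module to a temporary folder, imports it, changes the file and reloads it. The module name `demo_module` and its function `greet` are made up for this illustration. Note that **reload()** expects the module object, i.e. the name the module is bound to in your session (here the alias `dm`).

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway module file; "demo_module" is a hypothetical name
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'demo_module.py')
sys.path.insert(0, folder)       # make the folder visible to the import system
sys.dont_write_bytecode = True   # avoid stale cached bytecode in this quick demo

with open(path, 'w') as f:
    f.write('def greet():\n    return "first version"\n')

importlib.invalidate_caches()    # make sure the newly created file is found
import demo_module as dm
print(dm.greet())                # first version

# Change the file on disk, as you would do in a text editor
with open(path, 'w') as f:
    f.write('def greet():\n    return "second version, after editing"\n')

print(dm.greet())                # still the first version
importlib.reload(dm)             # re-executes the module from the changed file
print(dm.greet())                # second version, after editing
```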

## Doctests