<div style="width:image width px; font-size:75%; text-align:right;">
    <img src="img/everyone_adi-goldstein_unsplash.jpg" width="width" height="height" style="padding-bottom:0.2em;" />
    <figcaption>Photo by Adi Goldstein on Unsplash</figcaption>
</div>

# Introduction to Python 2

**Applied Programming - Summer term 2020 - FOM Hochschule für Oekonomie und Management - Cologne**

**Lecture 04 - April 02, 2020**

*Dennis Gluesenkamp*

### Table of contents
* [Recap](#recap)
* [Loops](#loops)
* [Collections](#collections)
    * [Lists](#lists)
    * [Tuples](#tuples)
    * [Sets](#sets)
    * [Dictionaries](#dictionaries)
* [Modules](#modules)
* [Packages](#packages)
* [Third-party modules/packages](#thirdpartymodpack)
    * [sqlite3](#sqlite3)
    * [pandas](#pandas)
* [Exercise for the next lecture](#exerciselecture)
* [References](#references)

### Recap<a class="anchor" id="recap"></a>

To briefly repeat what you learned in the previous lecture, please answer the following questions or write the Python code according to the instructions:
1. Write your solutions under this cell by creating more new cells. Introduce the solutions with a header of the fourth level.
2. Which function can be used to determine that 'foo' is a string and 3.1415 is a floating-point number?
3. What is the difference between an expression and a statement?
4. What is the problem with the following statement? Correct the code so that it is runnable.
```python
12monkeys = 'film'
result = 3 * 12monkeys
print(result)
```
5. What are the basic types of packages and modules?
6. Calculate the quotient of 11 (dividend) and 7 (divisor) and round the result to four decimal places.
7. Create/define a custom function with three arguments that does the following: it computes the sum of the first and second argument modulo the third argument and returns the result as a floating-point number. Set a default value of 2 for the third argument.

### Loops<a class="anchor" id="loops"></a>
In the last lecture, the ``if`` statement was introduced as a way to program conditions. Loops are another typical element for controlling the flow of execution in code. This section shows the two kinds of loops available in Python, ``while`` and ``for``. [[1]](#downey2015)

In [None]:
def countdown(n = 10):
    while n > 0:
        print(n)
        n = n-1
    print('Lift off!')

In [None]:
countdown()

With a ``while`` loop, careful programming is important: the variable in the test condition has to change inside the loop body so that the loop does not run forever [[1]](#downey2015).
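
As a minimal sketch of this advice, the hypothetical function `count_halvings` below changes its loop variable in every iteration, so the test condition eventually becomes false. The function name and the example value are assumptions made for illustration.

```python
def count_halvings(n):
    """Count how often n can be halved (integer division) before it reaches 0."""
    steps = 0
    while n > 0:
        n = n // 2      # the loop variable changes in every iteration ...
        steps += 1      # ... so the condition n > 0 eventually becomes False
    return steps


print(count_halvings(10))
```

If the line `n = n // 2` were missing, `n` would never change and the loop would not terminate.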

The ``for`` loop in Python works more like an iterator than the counting loops typically found in other programming languages: it steps through the elements of a sequence instead of repeating a block of code a fixed number of times. A classic counting loop is nevertheless possible and works as expected:

In [None]:
def countup(n = 10):
    for i in range(n):
        print(i+1)
    print('Finished!')

In [None]:
countup()

However, you can also iterate over the elements of a list (and of other iterable objects):

In [None]:
for i in ['A1', 'B2', 'C3']:
    print(i)

Or even strings:

In [None]:
for i in 'LoremIpsum':
    print(i)
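
What happens behind a ``for`` loop can be sketched with the built-in functions `iter()` and `next()`. The variable names below are chosen for illustration only. `enumerate()` is included because it supplies a counter alongside each element, which replaces a manually managed index.

```python
letters = ['A1', 'B2', 'C3']

# A for loop asks the object for an iterator and fetches one element at a time
iterator = iter(letters)
first = next(iterator)
second = next(iterator)
print(first, second)

# enumerate() yields pairs of (position, element); start=1 begins counting at 1
numbered = list(enumerate(letters, start=1))
for position, element in numbered:
    print(position, element)
```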

### Collections<a class="anchor" id="collections"></a>
Python implements four general-purpose types of data containers, which are called collections [[2]](#python2020a). These are [[3]](#w3schools2020):
* **List** is a collection which is ordered and changeable. Allows duplicate members.
* **Tuple** is a collection which is ordered and unchangeable. Allows duplicate members.
* **Set** is a collection which is unordered and unindexed. No duplicate members.
* **Dictionary** is a collection of key-value pairs which is changeable and indexed by keys; since Python 3.7 it preserves insertion order. No duplicate keys.

Whether the elements of a collection can or cannot be changed is described by the terms *mutable* and *immutable*. If you want to go deeper into this topic, the article by Megha Mohan [[4]](#mohan2017) is recommended.
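
The difference can be sketched with a list (mutable) and a string (immutable). The variable names are assumptions made for this illustration.

```python
numbers = [1, 2, 3]            # a list is mutable
identity_before = id(numbers)
numbers[0] = 99                # changes the existing object in place
same_object = id(numbers) == identity_before

word = 'python'                # a string is immutable
upper_word = word.upper()      # returns a new string, word stays unchanged

try:
    word[0] = 'P'              # item assignment is not allowed for strings
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(numbers, same_object, word, upper_word, assignment_failed)
```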

#### Lists<a class="anchor" id="lists"></a>
A list is a sequence of values, so it is ordered. The values of such a list are called items or elements and can be of any type, whereas a string, which is also a sequence, contains only characters. You access the items by their index. Indices in Python start at 0. With negative indices the elements are accessed backwards, that is, starting at the end of the list.

In [None]:
listA = ['Mother', 'Father', 'Daughter', 'Son']
type(listA)

In [None]:
listA[1]

In [None]:
listA[1:4]

In [None]:
listA[2:]

In [None]:
listA[-2]

In [None]:
listA[-4:-2]
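
The slice notation used in the cells above follows the pattern `[start:stop]`, where the element at `start` is included and the element at `stop` is excluded. A small sketch with a hypothetical list `letters`:

```python
letters = ['a', 'b', 'c', 'd', 'e']

middle = letters[1:4]         # indices 1, 2 and 3; index 4 is excluded
tail = letters[2:]            # from index 2 to the end
last_two = letters[-2:]       # negative indices count from the end
every_second = letters[::2]   # an optional third value sets the step size

print(middle, tail, last_two, every_second)
```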

In [None]:
# Append additional elements to the list
listA.append('Grandma')
listA

In [None]:
# Insert elements at a specific position
listA.insert(4, "Grandpa")
listA

In [None]:
# Change elements in the list, because lists are mutable
listA[5] = 'Baby'
listA

In [None]:
# Remove an element by value
listA.remove('Grandpa')
listA

In [None]:
# Remove an element by index (default is last element)
listA.pop()
listA

##### Exercises
1. Create a list of five fruits with a variable name of your choice.
2. Make a copy of that list, so that the upcoming change in task 3 doesn't change the copy of your list. How can you do this? Explain in this context what a reference to an object is. (You may have to do some research on the Internet to find the solution.)
3. Remove the third element by index.
4. Join/concatenate your original list and the copy. There are multiple ways for this task.

#### Tuples<a class="anchor" id="tuples"></a>
The main difference from lists is that tuples cannot be changed; in other words, they are *immutable*. Like lists, however, their elements are ordered and can be accessed by index.

In [None]:
tupleA = ('Mother', 'Father', 'Daughter', 'Son')
type(tupleA)

In [None]:
tupleA[1:4]

In [None]:
tupleA[-4:-2]

In [None]:
# Raises a TypeError: tuples are immutable, so their elements cannot be reassigned
tupleA[1] = 'Men'

##### Exercises
1. Create a tuple of five car manufacturers with a variable name of your choice.
2. Iterate through the tuple with a ``for`` loop and print the current manufacturer name in each loop iteration.

#### Sets<a class="anchor" id="sets"></a>
In contrast to tuples and lists, sets do not have an order, and therefore their elements do not have indices. However, iteration is possible, and you can check whether an element exists in the set. Sets are mutable, but Python also offers an immutable variant: ``frozenset`` [[5]](#python2020b).

In [None]:
setA = {'Mother', 'Father', 'Daughter', 'Son'}
type(setA)

In [None]:
# Elements don't have an index, so this raises a TypeError
setA[1]

In [None]:
for element in setA:
    print(element) 

In [None]:
'Father' in setA

In [None]:
setA.add('Grandma')
setA

In [None]:
# Adding a duplicate has no effect and does not raise an error
setA.add('Father')
setA

In [None]:
setA.remove('Grandma')
setA

In [None]:
# Raises an error if element doesn't exist
setA.remove('Grandma')

In [None]:
# Different method for removing items without error raising
setA.discard('Grandma')
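
The immutable counterpart ``frozenset`` mentioned above can be sketched as follows. The variable names and the dictionary `households` are assumptions made for illustration.

```python
family = frozenset(['Mother', 'Father', 'Daughter', 'Son'])

# Membership tests and iteration work as with a normal set
has_father = 'Father' in family

# Methods that would modify the set, such as add(), do not exist on a frozenset
can_add = hasattr(family, 'add')

# Because it is immutable and hashable, a frozenset can serve as a dictionary key
households = {family: 'Example household'}

print(has_father, can_add, households[family])
```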

##### Exercises
1. Create a set of five cities with a variable name of your choice.
2. Copy your set to a new variable and change two cities to other cities of your choice.
3. Research and test the set-related method ``intersection()``.
4. Which SQL-statement is equivalent to the ``intersection()`` method? Write the SQL-statement here in a Markdown cell by using appropriate syntax highlighting.

If you want to know more about the order in which sets are displayed, see [[6]](#aroutiounian2013).

#### Dictionaries<a class="anchor" id="dictionaries"></a>
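Dictionaries consist of key-value pairs. The values are accessed via their keys rather than via positional indices, and they can be changed, so dictionaries are *mutable*. Since Python 3.7, dictionaries preserve the order in which the pairs were inserted.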

In [None]:
dictA = {
    'mother': 'Mary',
    'father': 'Pete',
    'daughter': 'Daisy',
    'son': 'Chuck'
}
dictA

In [None]:
# Access value by key, not vice versa
dictA['mother']

In [None]:
dictA.get('father')

In [None]:
dictA['mother'] = 'Deanna'
dictA
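
A few further dictionary operations, as a short sketch with a hypothetical dictionary `family`: adding a pair, removing a pair, reading with a default value, and iterating over all pairs.

```python
family = {'mother': 'Mary', 'father': 'Pete'}

family['daughter'] = 'Daisy'              # assigning to a new key adds a pair
removed = family.pop('father')            # pop() removes a key and returns its value
missing = family.get('son', 'unknown')    # get() returns a default instead of raising a KeyError

# items() yields the key-value pairs, since Python 3.7 in insertion order
pairs = []
for role, name in family.items():
    pairs.append(role + ': ' + name)

print(removed, missing, pairs)
```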

### Modules<a class="anchor" id="modules"></a>
Modules are one of the levels of abstraction in Python and technically nothing more than individual Python files (``*.py``). These files contain variables and functions that belong together in a certain sense and are therefore implemented or defined separately. [[7, p.61](#reitz2016); [8](#learnpython2020)]

To create a module, it is therefore only necessary to write the desired code in a Python file. This file can then be integrated into the current project, for example a Jupyter Notebook, with the ``import`` statement followed by the module name, which is the file name without the ``.py`` extension. It is possible to give the imported module an (abbreviated) alias. This is done with the keyword ``as``.

In [None]:
import examplemodule as em

In [None]:
em.calcEuclideanDist([0, 0, 0], em.coord2)
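
The content of `examplemodule.py` is not reproduced in this notebook. Purely as an assumption for illustration, a module offering the names used above could look roughly like the following sketch. The coordinate values and the implementation are hypothetical and not the actual file.

```python
# Hypothetical content of a file such as examplemodule.py (assumed, not the original)
import math

# Module-level variables are available as <module>.<name> after the import
coord1 = [1, 2, 3]   # assumed example values
coord2 = [3, 4, 0]   # assumed example values


def calcEuclideanDist(pointA, pointB):
    """Return the Euclidean distance between two points of equal dimension."""
    if len(pointA) != len(pointB):
        raise ValueError('Both points need the same number of coordinates')
    squared = 0
    for a, b in zip(pointA, pointB):
        squared += (a - b) ** 2
    return math.sqrt(squared)


print(calcEuclideanDist([0, 0, 0], coord2))
```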

### Packages<a class="anchor" id="packages"></a>
If the complexity and the structure of a project make it necessary to combine several modules into a higher-level unit, so-called packages can be created. These are folders that contain module files. The only special feature is the existence of an ``__init__.py`` file in this folder, which contains overall definitions for the whole package. [[7, p.65](#reitz2016); [8](#learnpython2020)]

A module of a package is imported with the ``import`` statement by writing the package name first, followed by a period and the module name.

In [None]:
import examplepackage.examplemodule as pem

In [None]:
help(pem)

In [None]:
pem.calcEuclideanDist(pem.coord1, [-1, -1, -1])
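
The folder structure described above can be tried out without touching the lecture repository. The following sketch creates a throwaway package in a temporary directory and imports a module from it. The names `demopackage`, `demomodule`, `greeting` and `double` are hypothetical and not part of the lecture material.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package in a temporary folder (all names are hypothetical)
base = Path(tempfile.mkdtemp())
package_dir = base / 'demopackage'
package_dir.mkdir()

# __init__.py belongs to the package folder; here it simply stays empty
(package_dir / '__init__.py').write_text('')

# A module inside the package is an ordinary Python file
(package_dir / 'demomodule.py').write_text(
    'greeting = "Hello from demomodule"\n'
    'def double(x):\n'
    '    return 2 * x\n'
)

# Make the parent folder visible to the import system
sys.path.insert(0, str(base))
importlib.invalidate_caches()

# Package name, period, module name, and an alias via "as"
import demopackage.demomodule as dm

print(dm.greeting)
print(dm.double(21))
```

In a real project the folder and its files would of course be created by hand or by an editor; the temporary directory only keeps the sketch self-contained.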

### Third-party modules/packages<a class="anchor" id="thirdpartymodpack"></a>
Besides the very extensive built-in modules and packages that Python already comes with by default, an overwhelming number of third-party packages can be installed and used. For an overview of the additional packages covered by the Anaconda distribution, please refer to [[9]](#anaconda2020). That list also shows whether a package is already available after installing Anaconda or whether it needs to be installed later via conda. If you are not using Anaconda, you can install additional packages via ``pip`` [[10]](#pip2020).

At this point we want to take a first, superficial look at two packages: ``sqlite3`` [[11]](#sqlite32020), which strictly speaking is part of the Python standard library, and the third-party package ``pandas`` [[12]](#mckinney2011). In the next lectures we will get to know more packages/modules and their usage.

#### sqlite3<a class="anchor" id="sqlite3"></a>

In [None]:
import sqlite3

In [None]:
con_obj = sqlite3.connect('dat/baseball.sqlite')
c = con_obj.cursor()

In [None]:
c.execute("""
SELECT   name_full, city
FROM     college
WHERE    state = 'TX'
""").fetchall()           # use fetchone() for single line of output
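
If the file `dat/baseball.sqlite` is not at hand, the same steps can be tried with a temporary in-memory database. The table and the rows in this sketch are made up for illustration and only loosely modelled on the query above. It also shows the `?` placeholder, which passes a value separately from the SQL text.

```python
import sqlite3

# ':memory:' creates a temporary database in RAM instead of opening a file
con = sqlite3.connect(':memory:')
cur = con.cursor()

# Hypothetical table and rows (not the content of the baseball database)
cur.execute('CREATE TABLE college (name_full TEXT, city TEXT, state TEXT)')
cur.executemany(
    'INSERT INTO college VALUES (?, ?, ?)',
    [
        ('Example College A', 'Austin', 'TX'),
        ('Example College B', 'Atlanta', 'GA'),
        ('Example College C', 'Dallas', 'TX'),
    ],
)
con.commit()

# fetchall() returns a list of tuples, one tuple per row
rows = cur.execute(
    'SELECT name_full, city FROM college WHERE state = ? ORDER BY name_full',
    ('TX',),
).fetchall()
print(rows)

con.close()
```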

#### pandas<a class="anchor" id="pandas"></a>

In [None]:
import pandas as pd

In [None]:
df = pd.read_sql_query("""
SELECT   park_name, city
FROM     park
WHERE    state = 'GA'
""", con_obj)

In [None]:
df

In [None]:
# Close the database connection
con_obj.close()

##### Exercises
1. Write an SQL statement for the ``read_sql_query()`` function of the pandas package which reproduces the solution to question t) of the live exercises/example in lecture 2. Store the result in a variable.
2. Solve the following tasks using the pandas-documentation or a cheat sheet:

    a) Delete the column ``park_name`` from the DataFrame.
    
    b) Store a filtered copy of the table in a new variable, excluding rows with an empty ``city`` column.
    
    c) Add a new column ``address`` to the table just created, which is composed of the information on ``park``, a comma followed by a blank, and the ``city``.

### Exercise for the next lecture<a class="anchor" id="exerciselecture"></a>
The National Football League (NFL, US professional league for American football) consists of 32 teams divided into two conferences, the AFC and the NFC. Each of these Conferences is again divided into - more or less - regional divisions, named after the compass directions, e.g. NFC South. A tabular overview can be found at https://bit.ly/2UP8Cfi (English) or https://bit.ly/2xHHoPH (German).

**For the following tasks, please form teams of three people (at least three, and ideally no more) and appoint a team lead.** *Please* organize yourselves in such a way that prior knowledge is evenly distributed across the teams. In particular, each team should have at least some git/GitHub knowledge.

1. The team lead creates a repository for this mini-project on GitHub and invites the other team members as collaborators (Settings >>> Manage access >>> Invite a collaborator).
2. In the subfolder "modules" of the repository of this lecture you will find a template file named "template_l04.py". The team lead should download it and add it to your repository. It is up to you whether you do this in a subfolder or directly on the top level.
3. In the template, which will be a module for your project, you will find an initial structure for a dictionary and two methods. The team lead is responsible for filling the dictionary and the other two team members share the methods between themselves. (If there are more than three people in your team, more people can create separate dictionaries for other NFL divisions, thus increasing the database).
4. Each collaborator now creates his or her own branch in git/GitHub on which he or she will work.
5. When the development work is completed, the team lead is responsible for merging the individual results and testing. It should be possible to import the result as a module or package into a Jupyter Notebook, where, given two teams, a sentence stating which team is older can be displayed.

### References<a class="anchor" id="references"></a>

[1]<a class="anchor" id="downey2015"></a> Downey, A. (2015). *Think Python - How to think like a computer scientist* (2nd ed.). Green Tea Press. Retrieved 2020-03-26 from http://greenteapress.com/thinkpython2/thinkpython2.pdf

[2]<a class="anchor" id="python2020a"></a> The Python Standard Library (2020). Collections. Retrieved 2020-03-31 from https://docs.python.org/3/library/collections.html

[3]<a class="anchor" id="w3schools2020"></a> w3schools (2020). Python lists. Retrieved 2020-03-31 from https://www.w3schools.com/python/python_lists.asp

[4]<a class="anchor" id="mohan2017"></a> Mohan, M. (2017). Mutable vs Immutable Objects in Python. *Medium*. Retrieved 2020-03-26 from https://medium.com/@meghamohan/mutable-and-immutable-side-of-python-c2145cf72747

[5]<a class="anchor" id="python2020b"></a> The Python Standard Library (2020). Built-in types. Retrieved 2020-03-31 from https://docs.python.org/2.7/library/stdtypes.html#frozenset

[6]<a class="anchor" id="aroutiounian2013"></a> Aroutiounian, E., Machavity, Pieters, M. (2013). Why is the order in dictionaries and sets arbitrary? *Stack Overflow*. Retrieved 2020-03-31 from https://stackoverflow.com/questions/15479928/why-is-the-order-in-dictionaries-and-sets-arbitrary/15479974#15479974

[7]<a class="anchor" id="reitz2016"></a> Reitz, K., & Schlusser, T. (2016). *The Hitchhiker's Guide to Python: Best Practices for Development*. O'Reilly Media, Inc.

[8]<a class="anchor" id="learnpython2020"></a> learnpython.org (2020). Modules and Packages. Retrieved 2020-03-26 from https://www.learnpython.org/en/Modules_and_Packages

[9]<a class="anchor" id="anaconda2020"></a> Anaconda, Inc. (2020). Anaconda package lists. Retrieved 2020-03-31 from https://docs.anaconda.com/anaconda/packages/pkg-docs/

[10]<a class="anchor" id="pip2020"></a> Python Software Foundation (2020). pip 20.0.2. Retrieved 2020-03-31 from https://pypi.org/project/pip/

[11]<a class="anchor" id="sqlite32020"></a> The Python Standard Library (2020). sqlite3 — DB-API 2.0 interface for SQLite databases. Retrieved 2020-03-31 from https://docs.python.org/3/library/sqlite3.html

[12]<a class="anchor" id="mckinney2011"></a> McKinney, W. (2011). pandas: a foundational Python library for data analysis and statistics. *Python for High Performance and Scientific Computing, 14*(9). [See also: https://pandas.pydata.org/]