<div style="width:image width px; font-size:75%; text-align:right;">
    <img src="img/code_chris-ried_unsplash.jpg" width="width" height="height" style="padding-bottom:0.2em;" />
    <figcaption>Photo by Chris Ried on Unsplash</figcaption>
</div>

# Introduction to Python 2

**Applied Programming - Summer term 2022 - FOM Hochschule für Oekonomie und Management - Cologne**

**Lecture 04 - April 08, 2022**

*Dennis Gluesenkamp*

### Table of contents
* [Recap](#recap)
* [Loops](#loops)
* [Collections](#collections)
    * [Lists](#lists)
    * [Tuples](#tuples)
    * [Sets](#sets)
    * [Dictionaries](#dictionaries)
* [Modules](#modules)
* [Packages](#packages)
* [Versioning with git and GitHub](#versioning)
    * [Basic aspects of version control](#versioning_basics)
    * [Commands for git and GitHub usage](#versioning_commands)
* [References](#references)

### Recap<a class="anchor" id="recap"></a>

To briefly review what you learned in the previous lecture, please answer the following questions or write the Python code according to the instructions:
1. Write your solutions under this cell by creating more new cells. Introduce the solutions with a header of the fourth level.
2. Which function can determine that 'foo' is a string and 3.1415 is a floating-point number?
3. What is the difference between an expression and a statement?
4. What is the problem with the following statement? Correct the code so that it is runnable.
```python
12monkeys = 'film'
result = 3 * 12monkeys
print(result)
```
5. What are the basic properties of packages and modules?
6. Calculate the quotient of 11 (dividend) and 7 (divisor) and round the result to four decimal places.
7. Define a custom function with three arguments that adds the first and second arguments, takes the result modulo the third argument, and returns it as a floating-point number. Set a default value of 2 for the third argument.

#### Solutions

### Loops<a class="anchor" id="loops"></a>
The last lecture introduced the ``if`` statement as a way to program conditions. Loops are another typical element for controlling the execution sequence of code. This section shows the two loop types available in Python, ``while`` and ``for``. [[1]](#downey2015)

In [None]:
def countdown(n = 10):
    while n > 0:
        print(n)
        n = n-1
    print('Lift off!')

In [None]:
countdown()

A ``while`` loop must be programmed carefully: the variable in the test condition has to change inside the loop body so that the condition eventually becomes false and the loop does not run forever [[1]](#downey2015).
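
A minimal sketch of that point: the loop below halves a value until it drops below a threshold. The iteration cap is an extra safeguard added here for illustration, not something the lecture prescribes, and the names ``halvings_until_below`` and ``max_steps`` are hypothetical.

```python
def halvings_until_below(value, threshold=1.0, max_steps=1000):
    """Count how often value must be halved to fall below threshold."""
    steps = 0
    # The second condition caps the number of passes (assumed safeguard)
    while value >= threshold and steps < max_steps:
        value = value / 2   # the tested variable changes in every pass
        steps += 1
    return steps

print(halvings_until_below(10))
```

With ``threshold=0`` a positive value would never fall below the threshold, so only the cap ends the loop in that case.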

The ``for`` loop in Python iterates over the items of a sequence or another iterable, rather than only counting up to a fixed number as is typical in many other programming languages. Counting loops are still possible with ``range()``, so the following loop is correct and functional:

In [None]:
range(10)

In [None]:
def countup(n = 10):
    for i in range(n):
        print(i+1)
    print('Finished!')

In [None]:
countup()

However, you can also iterate directly over the elements of a list (and of other iterables):

In [None]:
for i in ['A1', 'B2', 'C3']:
    print(i)

Or even strings:

In [None]:
for i in 'LoremIpsum':
    print(i)
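
When both the position and the item are needed, the two styles shown above can be compared side by side. This is a small sketch; the variable names ``letters``, ``position`` and ``pairs`` are chosen here for illustration only.

```python
letters = ['A1', 'B2', 'C3']

# Index-based iteration, as it is common in many other languages
for position in range(len(letters)):
    print(position, letters[position])

# Direct iteration with enumerate() yields the same pairs
for position, item in enumerate(letters):
    print(position, item)

# enumerate() produces (index, item) tuples
pairs = list(enumerate(letters))
print(pairs)
```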

##### Exercises
1. Define a header for a function called ``get_types``, which takes a list as argument.
2. In the body of the function, add the functionality to iterate across the elements of that list and print out the type of each element.
3. Test your function with the list ``test_list``, defined below.

### Collections<a class="anchor" id="collections"></a>
Python implements four built-in, general-purpose data types for storing collections of data, which are called containers [[2]](#python2020a). These are [[3]](#w3schools2020):
* **List** is a collection which is *ordered* and *changeable*. Allows *duplicate members*.
* **Tuple** is a collection which is *ordered* and *unchangeable*. Allows *duplicate members*.
* **Set** is a collection which is *unordered*, *changeable* and *unindexed*. *No duplicate* members.
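* **Dictionary** is a collection of key-value pairs which is *changeable* and *indexed* by keys. *No duplicate* keys. Since Python 3.7, dictionaries preserve insertion order; in earlier versions they were *unordered*.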

A collection whose elements can be changed is called *mutable*; one whose elements cannot be changed is called *immutable*.
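
As a first impression of these properties, the following sketch builds a list, a tuple and a set from the same values. The variable names are assumptions made for this illustration.

```python
numbers = [1, 2, 2, 3]

as_list = list(numbers)
as_tuple = tuple(numbers)
as_set = set(numbers)      # the duplicate 2 is dropped

as_list[0] = 99            # works: lists are mutable

try:
    as_tuple[0] = 99       # fails: tuples are immutable
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(as_list)
print(as_tuple)
print(as_set)
print(tuple_changed)
```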

#### Lists<a class="anchor" id="lists"></a>
A list is a sequence of values, so it is **ordered**. The values of a list are called items or elements and can be of any type. Items are accessed by their index, and indices in Python start at 0. Negative indices count backwards from the end of the list. The items of a list can be changed (**mutable**), and the **same item can occur multiple times**.

In [None]:
listA = ['Mother', 'Father', 'Daughter', 'Son']
type(listA)

In [None]:
listA[1]

In [None]:
listA[1:4]

In [None]:
listA[2:]

In [None]:
listA[-2]

In [None]:
listA[-7:-2]
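
The slice ``listA[-7:-2]`` works although the list has only four items, because slice boundaries outside the list are silently clamped. A single out-of-range index, in contrast, raises an error. The sketch below uses a separate list named ``family`` (an assumed name) so that ``listA`` stays untouched.

```python
family = ['Mother', 'Father', 'Daughter', 'Son']

print(family[-7:-2])   # the start is clamped to the beginning of the list
print(family[2:100])   # the stop is clamped to the end of the list

try:
    family[-7]         # a single out-of-range index is an error
except IndexError as error:
    print('IndexError:', error)
```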

In [None]:
# Append additional elements to the list
listA.append('Grandma')
listA

In [None]:
# Insert elements at a specific position
listA.insert(4, "Grandpa")
listA

In [None]:
# Change elements in the list, because lists are mutable
listA[5] = 'Baby'
listA

In [None]:
# Remove an element by value
listA.remove('Grandpa')
listA

In [None]:
# Remove an element by index (default is last element)
listA.pop(2)
listA

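In [None]:
# Lists allow duplicates: 'Father' is now in the list twice
listA.append('Father')
listA

In [None]:
# remove() deletes only the first matching item
listA.remove('Father')
listA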

##### Exercises
1. Create a list of five fruits with a variable name of your choice.
2. Make a copy of that list, so that the upcoming change in task 3 doesn't change the copy of your list. How can you do this? Explain in this context what a reference to an object is. (You may have to do some research on the Internet to find the solution.)
3. Remove the third element by index.
4. Join/concatenate your original list and the copy. There are multiple ways for this task.

#### Tuples<a class="anchor" id="tuples"></a>
The key difference from lists is that tuples are **not changeable**, in other words **immutable**. However, their elements are also **ordered** and accessible by index, and there can be **duplicates**.

In [None]:
tupleA = ('Mother', 'Father', 'Daughter', 'Son')
type(tupleA)

In [None]:
tupleA[1:4]

In [None]:
tupleA[-4:-2]

In [None]:
# Tuples are immutable, so item assignment raises a TypeError
tupleA[1] = 'Men'
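
Because a tuple cannot be modified in place, a "changed" tuple is always a newly built one. The following sketch shows this together with tuple unpacking; the names ``family`` and ``updated`` are assumptions for illustration.

```python
family = ('Mother', 'Father', 'Daughter', 'Son')

# Unpacking assigns each item to its own variable
first, second, third, fourth = family

# Build a new tuple from slices instead of assigning to an index
updated = family[:1] + ('Grandpa',) + family[2:]

print(second)
print(updated)
print(family)   # the original tuple is unchanged
```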

##### Exercises
1. Create a tuple of five car manufacturers with a variable name of your choice.
2. Iterate through the tuple with a ``for`` loop and print the current manufacturer name in each loop iteration.

#### Sets<a class="anchor" id="sets"></a>
In contrast to tuples and lists, sets do **not have an order** and their elements do **not have indices**. However, you can iterate over a set and check whether an element exists in it. Sets are **mutable**, but Python also offers an immutable variant, ``frozenset`` [[5]](#python2020b). **Duplicate items are not allowed** in sets.

Additional information: If you want to know more about the order in which sets are displayed, see [[6]](#aroutiounian2013).

In [None]:
setA = {'Mother', 'Father', 'Daughter', 'Son'}
type(setA)

In [None]:
# Elements don't have an index, so this raises a TypeError
setA[1]

In [None]:
for element in setA:
    print(element)

In [None]:
'Men' in setA

In [None]:
setA.add('Grandma')
setA

In [None]:
# Adding a duplicate leaves the set unchanged and raises no error
setA.add('Father')
setA

In [None]:
setA.remove('Grandma')
setA

In [None]:
# Raises a KeyError if the element doesn't exist
setA.remove('Grandma')

In [None]:
# discard() removes an item without raising an error if it is missing
setA.discard('Grandma')
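
The immutable variant ``frozenset`` mentioned above behaves like a set for reading but offers no modifying methods. A minimal sketch, with variable names and the value ``'parents'`` chosen only for illustration:

```python
parents = frozenset({'Mother', 'Father'})

print('Mother' in parents)        # membership tests work as with a set

# A frozenset has no add() or remove() methods
print(hasattr(parents, 'add'))

# Being hashable, it can serve as a dictionary key; a plain set cannot
groups = {parents: 'parents'}
try:
    {set(parents): 'parents'}
    set_is_hashable = True
except TypeError:
    set_is_hashable = False
print(set_is_hashable)
```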

##### Exercises
1. Create a set of five cities with a variable name of your choice.
2. Copy your set to a new variable and change two cities to other cities of your choice.
3. Research and test the set-related method ``intersection()``.
4. Which SQL statement is equivalent to the ``intersection()`` method? Write the SQL statement here in a Markdown cell by using appropriate syntax highlighting.

#### Dictionaries<a class="anchor" id="dictionaries"></a>
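Dictionaries consist of key-value pairs. Values are accessed by their key rather than by a numeric index, and they can be changed, so dictionaries are **mutable**. As in sets, **duplicates are not allowed**, which here applies to the keys. Dictionaries were **unordered** in older Python versions; since Python 3.7 they preserve insertion order.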

In [None]:
dictA = {
    'mother': 'Mary',
    'father': 'Pete',
    'daughter': 'Daisy',
    'son': 'Chuck'
}
dictA

In [None]:
# Access value by key, not vice versa
dictA['mother']

In [None]:
# 'Mary' is a value, not a key, so this raises a KeyError
dictA['Mary']

In [None]:
dictA.get('father')

In [None]:
dictA['mother'] = 'Deanna'
dictA
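
A few further dictionary operations can be sketched as follows: adding a pair, reading a missing key safely with ``get()``, and iterating over the pairs. The dictionary ``family`` and the key ``'uncle'`` are assumptions for this illustration.

```python
family = {'mother': 'Mary', 'father': 'Pete'}

family['daughter'] = 'Daisy'             # a new key adds a pair
family['mother'] = 'Deanna'              # an existing key is overwritten

print(family.get('uncle'))               # missing key: None instead of a KeyError
print(family.get('uncle', 'unknown'))    # or a default value of your choice

for role, name in family.items():        # iterates in insertion order (Python 3.7+)
    print(role, name)

roles = list(family)                     # iterating a dict yields its keys
print(roles)
```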

### Modules<a class="anchor" id="modules"></a>
Modules are one of the levels of abstraction in Python and technically nothing more than individual Python files (``*.py``). These files contain variables and functions that belong together in a certain sense and are therefore implemented or defined separately. [[7, p.61](#reitz2016); [8](#learnpython2020)]

To create a module, you only need to write the desired code in a Python file. This file can then be integrated into the current project, for example a Jupyter Notebook, with the ``import`` statement followed by the module name, which is the file name without the ``.py`` extension. It is possible to assign an (abbreviated) name to the imported module using the keyword ``as``.

The module ``examplemodule``, imported below, contains some example coordinates in 3-dimensional space. Furthermore, it defines a function to calculate the Euclidean distance between two points in 3-dimensional space, according to

$d(x,y) = \sqrt{\left(x_1 - y_1\right)^2 + \left(x_2 - y_2\right)^2 + \left(x_3 - y_3\right)^2}$

In [None]:
import examplemodule as em

In [None]:
em.coord1

In [None]:
em.calcEuclideanDist([1, 1, 1], em.coord2)

In [None]:
em.calcEuclideanDist([1, 0, 0], [0, 0, 0])
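
The contents of ``examplemodule.py`` are not shown in this notebook, so the following is only a sketch of what such a module could contain. The coordinate values and the function name ``euclidean_distance`` are assumptions; the actual module may be implemented differently.

```python
import math

# Hypothetical example coordinates; the real values in examplemodule.py may differ
coord1 = [0, 0, 0]
coord2 = [1, 2, 2]

def euclidean_distance(x, y):
    # Sum the squared differences per dimension, then take the square root
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

print(euclidean_distance([1, 0, 0], [0, 0, 0]))
print(euclidean_distance(coord1, coord2))
```

Saved as a ``.py`` file next to the notebook, such code could be imported in the same way as shown above.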

##### Exercises
1. Add a docstring to the Euclidean distance function in ``examplemodule.py`` that explains how the function works.
2. Re-import the module and check the docstring of the function via ``help(em.calcEuclideanDist)``.

### Packages<a class="anchor" id="packages"></a>
If the complexity and structure of a project make it necessary to combine several modules into a higher-level unit, so-called packages can be created. These are folders that contain module files. The only special feature is the existence of an ``__init__.py`` file in this folder, which contains overall definitions for the whole package. [[7, p.65](#reitz2016); [8](#learnpython2020)]

A module of a package is imported with the ``import`` statement by writing the package name before the module name, separated by a period.

In [None]:
import examplepackage.examplemodule as pem

In [None]:
help(pem)

In [None]:
pem.calcEuclideanDist(pem.coord1, [-1, -1, -1])
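
To make the folder structure tangible, the following sketch creates a throwaway package in a temporary directory and imports a module from it. The names ``demopackage``, ``demomodule`` and ``answer`` are hypothetical and unrelated to ``examplepackage``; in a real project you would create the folder and files by hand.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A package is a folder containing an __init__.py and module files
    package_dir = Path(tmp) / 'demopackage'
    package_dir.mkdir()
    (package_dir / '__init__.py').write_text('"""Demo package."""\n')
    (package_dir / 'demomodule.py').write_text('answer = 42\n')

    # Make the parent folder importable for the duration of the demo
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        dm = importlib.import_module('demopackage.demomodule')
        loaded_answer = dm.answer
        module_name = dm.__name__
    finally:
        sys.path.remove(tmp)

print(module_name)
print(loaded_answer)
```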

### Versioning with git and GitHub<a class="anchor" id="versioning"></a>
Now that we have learned about structuring code using modules and packages, an interlude on versioning with git and GitHub is a good idea. We will thus test and practice collaborative work on code components (to a limited extent).

#### Basic aspects of version control<a class="anchor" id="versioning_basics"></a>
Version control systems (VCS) are key in companies or environments that plan or work on larger software development projects. This includes Data Science and Machine Learning projects. A VCS allows you to track code changes and enables collaboration across teams and their members. Further key points about VCS:
* Developers can work together, but decouple certain tasks through **branching**
* The history of changes is maintained by the VCS
* All code, branches, changes, and the history are managed and stored in a **repository**
* VCS can operate locally, centralized or distributed

#### Commands for git and GitHub usage<a class="anchor" id="versioning_commands"></a>
Before using git, you have to set up personalized information with
```bash
git config --global user.name '[name]'
git config --global user.email '[email address]'
```
After that, you can navigate to an **existing folder** or create a **new one** in a way of your choice. In the terminal/command prompt, move to this folder and **initialize** it as a folder versioned with git via
```bash
git init
```

Another possibility is to clone an **online available repository** to your own computer - for example from GitHub. To do this, use the following command:
```bash
git clone [url]
```

The **current state of the repository** can now be displayed by calling
```bash
git status
```
in the directory. This means that the status of the local copy, whether there have been changes or updates, is displayed for the folder and the files. We now assume that **files have been newly added** to the directory or existing files have been **modified**. These files are now to be versioned, i.e. added to the repository. To do this,
```bash
git add [filenames]
```
is called by specifying the explicit filenames or by using a pattern with the wildcard ``*`` (you can use ``*`` or ``.`` to add all files). This way the mentioned files are added to the **staging area**. Staging can be undone with
```bash
git reset [filenames]
```
In the *local* repository, these **changes have to be committed**. Ideally this is done in combination with a comment on what has been added/changed. Git provides this command for this purpose:
```bash
git commit -m '[comment on updates]'
```
Up to this point, everything has happened in the local environment. Assuming that we are working with an online repository (for example, GitHub) that also involves other developers, these committed, locally versioned changes are now to be **transferred to the remote repository**.
```bash
git push [alias] [branch]
```
As an alias, only ``origin`` is relevant for us at this point: it is the default name git gives to the server from which the repository was cloned. For further information about aliases, please refer to [[9]](#git2021a) and [[10]](#git2021b). A branch can be understood as an independent development line in the project. It is therefore a new combination of working directory, staging area and project history, where new commits are only stored in the history of the active branch. The following figure illustrates this process.

<div style="width:image width px; font-size:75%; text-align:right;">
    <img src="img/branches.png" width="500" style="padding-bottom:0.2em;" />
    <figcaption>Branches in VCS [11]</figcaption>
</div>

More on branching can be found in [[12]](#git2021c). The primary branch in git or GitHub is called ``main`` (formerly also ``master``) by default. This means that the command
```bash
git push origin main
```
transfers the locally committed changes to the remote repository.

##### Exercises
1. Log in to GitHub with your account and create a new repository with an arbitrary name. Include a ``.gitignore`` for Python and a README file.
2. Search the repository on GitHub for the URL to clone the directory. Use this URL to download and set up a local copy of the main branch on your machine.
3. Using the Jupyter Browser, navigate to the repository and create a Jupyter Notebook. This notebook should now exist as an ``.ipynb`` file in the folder. Add the blank notebook directly to the repository and commit this first version with git, for example with the comment "initial, blank version".
4. Push the locally staged and committed notebook to the remote repository using git commands in the terminal and observe the changes on GitHub in the browser.

### References<a class="anchor" id="references"></a>

[1]<a class="anchor" id="downey2015"></a> Downey, A. (2015). *Think Python - How to think like a computer scientist* (2nd ed.). Green Tea Press. Retrieved 2020-03-26 from http://greenteapress.com/thinkpython2/thinkpython2.pdf

[2]<a class="anchor" id="python2020a"></a> The Python Standard Library (2020). Collections. Retrieved 2020-03-31 from https://docs.python.org/3/library/collections.html

[3]<a class="anchor" id="w3schools2020"></a> w3schools (2020). Python lists. Retrieved 2020-03-31 from https://www.w3schools.com/python/python_lists.asp

[4]<a class="anchor" id="mohan2017"></a> Mohan, M. (2017). Mutable vs Immutable Objects in Python. *Medium*. Retrieved 2020-03-26 from https://medium.com/@meghamohan/mutable-and-immutable-side-of-python-c2145cf72747

[5]<a class="anchor" id="python2020b"></a> The Python Standard Library (2020). Built-in types. Retrieved 2020-03-31 from https://docs.python.org/2.7/library/stdtypes.html#frozenset

[6]<a class="anchor" id="aroutiounian2013"></a> Aroutiounian, E., Machavity, Pieters, M. (2013). Why is the order in dictionaries and sets arbitrary? *Stack Overflow*. Retrieved 2020-03-31 from https://stackoverflow.com/questions/15479928/why-is-the-order-in-dictionaries-and-sets-arbitrary/15479974#15479974

[7]<a class="anchor" id="reitz2016"></a> Reitz, K., & Schlusser, T. (2016). *The Hitchhiker's Guide to Python: Best Practices for Development*. O'Reilly Media, Inc.

[8]<a class="anchor" id="learnpython2020"></a> learnpython.org (2020). Modules and Packages. Retrieved 2020-03-26 from https://www.learnpython.org/en/Modules_and_Packages

[9]<a class="anchor" id="git2021a"></a> git (2021). 2.5 Git Grundlagen - Mit Remotes arbeiten. Retrieved 2021-03-25 from https://git-scm.com/book/de/v2/Git-Grundlagen-Mit-Remotes-arbeiten

[10]<a class="anchor" id="git2021b"></a> git (2021). 2.7 Git Grundlagen - Git Aliases. Retrieved 2021-03-25 from https://git-scm.com/book/de/v2/Git-Grundlagen-Git-Aliases

[11]<a class="anchor" id="noble2021"></a> Noble Desktop (2021). Git Branches: List, Create, Switch to, Merge, Push, & Delete. Retrieved 2021-03-25 from https://www.nobledesktop.com/learn/git/git-branches

[12]<a class="anchor" id="git2021c"></a> git (2021). 3.1 Git Branching - Branches auf einen Blick. Retrieved 2021-03-25 from https://git-scm.com/book/de/v2/Git-Branching-Branches-auf-einen-Blick