# Python Fundamentals

> Make easy things easy and hard things possible.
> 
> \- A slogan of Perl (a predecessor language to Python)

## Applied Review
### Python and Jupyter

- **Python** is a flexible, general-purpose language that is popular in many fields, but particularly in data science.

- **Jupyter** is an IDE, or *Integrated Development Environment*, that lets us view and run code in **notebooks**.

- We are using Jupyter via the [Azure Machine Learning Workbench](https://ml.azure.com/) in this course.

## Python at Its Simplest: Basic Data Types and Math
While Python can be used to write very complicated programs, one of its strengths is that easy things are still easy.
For example, Python can be a **calculator**.

In [3]:
1 + 2

3

In [4]:
12 * 4

48

Python allows you to *comment* your code -- to leave notes for yourself or others about the code.
Comments start with a `#` and are ignored by Python when it runs your code.

In [2]:
# The ** operator is exponentiation.
2 ** 3

8
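
Beyond `+`, `*`, and `**`, Python's calculator mode has a few more arithmetic operators. The following is a small sketch with numbers chosen only for illustration. It uses `print()` to display each result, because Jupyter automatically shows only the last expression in a cell.

```python
# Division with / always produces a decimal number.
print(7 / 2)         # 3.5

# Floor division with // drops the fractional part.
print(7 // 2)        # 3

# The % operator gives the remainder of a division.
print(7 % 2)         # 1

# Parentheses control the order of operations, as in ordinary math.
print((1 + 2) * 4)   # 12
print(1 + 2 * 4)     # 9
```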

Once you start doing math, you may want to keep the values you calculate for later use.

Python allows you to do this with *variables* -- words that you choose to represent values you've stored.

In [5]:
# Place the result of "5 * 2" in a variable called "x".
x = 5 * 2

This process of storing something in a variable is often called **variable assignment**, or simply "assignment" for short.
You can assign almost anything to a variable.

In [6]:
# "Assign" the value 42 to the variable "answer".
answer = 42

You can then use the stored values in new calculations.

In [7]:
answer + 5

47

In [8]:
ten = 10
eleven = 11
ten + eleven

21
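
A variable is not fixed once assigned: assigning to the same name again replaces the old value. A minimal sketch, using made-up variable names:

```python
# Start with an initial value (the name is made up for this example).
items_in_cart = 3

# Use the variable's current value to compute a new one, then store it
# back under the same name. The old value is replaced.
items_in_cart = items_in_cart + 2

# A variable's value can also be assigned to another variable.
items_at_checkout = items_in_cart

print(items_in_cart)      # 5
print(items_at_checkout)  # 5
```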

Python lets you name your variables almost anything you want -- names may contain only letters, numbers, and underscores, they cannot begin with a number, and they cannot be one of Python's reserved keywords (such as `for` or `if`).

It's a good idea to take advantage of this flexibility and name your variables with descriptions that help you remember what they contain.

For example, calling your variables `x`, `y`, and `z` is likely to lead to forgetting what you've stored where (unless you're working with coordinates, a domain where those names have meanings).

More descriptive names, like `number_of_items` or `size_of_container`, are better.

In [9]:
# Perfectly good variable name
my_3rd_favorite_number = 18

In [10]:
# Legal, but undescriptive, variable name
a = 7

In [11]:
# Illegal variable name -- it starts with a number
4_plus_1 = 4 + 1

SyntaxError: invalid decimal literal (1894465819.py, line 2)

If you try to name a variable something illegal, Python will gently remind you to follow the rules with a `SyntaxError` and an arrow indicating the location of the error.

<font style="color:#800;">
    <strong>Caution</strong>:<br><em>Sometimes Python doesn't pinpoint the error very well, and the error will not be in the same place as the arrow.</em>
</font>
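
If you are unsure whether a name is legal, you can ask Python before using it. The sketch below checks the names from the cells above with the string method `isidentifier()` and the standard-library `keyword` module. The choice of `for` as a sample reserved word is just for illustration.

```python
import keyword

# isidentifier() checks the letters/numbers/underscores rule,
# including "cannot begin with a number".
print('my_3rd_favorite_number'.isidentifier())   # True
print('4_plus_1'.isidentifier())                 # False

# Reserved words pass isidentifier() but still cannot be used as
# variable names; keyword.iskeyword() detects them.
print('for'.isidentifier())                      # True
print(keyword.iskeyword('for'))                  # True
```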

## Your Turn


<img src="images/exercise.png" style="width: 1000px;"/>

<font class="your_turn">
    Your Turn
</font>

1. 4k monitors, counterintuitively, typically have a resolution of 3840x2160. Create two variables, `width` and `height`, and store 3840 and 2160 in them (respectively).
2. How many total pixels are in a display with this resolution? Use Python to calculate the number of pixels.

*Hint: fill in the blanks with the variable names defined above:*  
`pixels = ___ * ___`

## Beyond Integers

Fortunately, Python can handle values beyond integers.
It's happy to work with decimal numbers.

In [None]:
1 / 3

In [None]:
1.5 * 1.5

In computer science lingo, decimal numbers are often called **floating point numbers**, or **floats** for short.

The name refers to how such numbers are stored by a computer internally, but you don't need to worry about that.
Just be aware that many people on the internet and in the data science industry will speak in terms of "floats" and "ints" when they refer to numbers in Python.
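
One practical consequence of how floats are stored is that results can carry tiny rounding differences. A brief sketch follows; the variable names are made up, and `round()` is a built-in function not otherwise covered in this section.

```python
# Dividing two integers with / gives a float.
one_third = 1 / 3

# Floats are stored with limited precision, so tiny rounding
# differences can appear in results.
total = 0.1 + 0.2
print(total)                 # 0.30000000000000004

# round() trims a float to a chosen number of decimal places.
print(round(total, 2))       # 0.3
print(round(one_third, 3))   # 0.333
```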

Python can also work with text data, like words and sentences.

In [None]:
my_name = 'Arno'
my_hobbies = 'reading, nature, meditation'

In Python, these bits of text are called **strings** and are enclosed in quotation marks.
Both single quotes (`'`) and double quotes (`"`) are fine.  

Many Pythonistas prefer single quotes because they can be typed in a single keystroke, and that is what I usually do.  

That said, automatic code-formatting tools tend to prefer double quotes, so my single quotes often end up as double quotes anyway. :-)  
(See the [Black code style documentation](https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#strings) if you are curious about the details.)

Conveniently, Python lets you "add" strings together to compose longer strings.

In [None]:
'Monty' + ' ' + 'Python'

In [None]:
first_name = 'Guido'
last_name = 'van Rossum'
# Remember to add a space between words!
first_name + ' ' + last_name
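
As an alternative to chaining `+`, Python also offers *f-strings*, which fill placeholders inside a string with variable values. This is a minimal sketch that reuses the names from the cell above; `full_name` is a new, made-up variable.

```python
first_name = 'Guido'
last_name = 'van Rossum'

# An f-string (note the f before the opening quote) fills each {...}
# placeholder with the value of the named variable.
full_name = f'{first_name} {last_name}'
print(full_name)        # Guido van Rossum

# len() reports how many characters a string contains, spaces included.
print(len(full_name))   # 16
```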

The last kind of value that we'll talk about is a **boolean**, or a True/False value.

Python recognizes the words `True` and `False` as **keywords** -- words that have a built-in meaning in the language.

Like numbers and strings, they are values, so you can assign them to variables.

In [None]:
is_the_moon_made_of_cheese = False
is_this_the_best_python_class = True
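
Booleans are rarely typed out by hand; more often they are the result of a comparison. A small sketch, assuming the `width` and `height` values from the earlier exercise (the other variable names are made up):

```python
width = 3840
height = 2160

# Comparison operators produce True or False.
is_wider_than_tall = width > height    # True
is_square = width == height            # False

# "and" and "not" combine or flip boolean values.
is_landscape = is_wider_than_tall and not is_square   # True

print(is_wider_than_tall, is_square, is_landscape)
```

Note the difference between `=` (assignment) and `==` (comparison).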

## Your Turn

<img src="images/exercise.png" style="width: 1000px;"/>

<font class="your_turn">
    Your Turn
</font>

1. Overwrite the `first_name` and `last_name` variables with your name, and run `first_name + ' ' + last_name` again -- make sure it produces what you expect!
2. What happens when you try to add together two different kinds of values, like an integer and a string? Does this behavior make sense to you?

## Lists and Dictionaries

So far we've worked with single values: numbers, strings, and booleans.
But Python also supports more complex data types, sometimes called *data structures*.

The two most common complex data types are **lists** and **dictionaries**.

Note: The term _complex_ here does not mean difficult. It refers to the fact that these data types can contain other data within them.

### Lists

As you might expect, a list is an ordered collection of things.
Lists are represented using brackets (`[]`).

In [None]:
# A list of integers
numbers = [1, 2, 3]
numbers

In [None]:
# A list of strings
strings = ['abc', 'def']
strings

Lists are highly flexible.
They can contain heterogeneous data (i.e. strings, booleans, and numbers can all be in the same list) and lists can even contain other lists!

In [None]:
combo = ['a', 'b', 3, 4]
combo_2 = [True, 'True', 1, 1.0]

In [None]:
# Note that the last element of the list is another list!
nested_list = [1, 2, 3, [4, 5]]
nested_list

Individual elements of a list can be accessed by specifying a location in brackets.
This is called **indexing**.

Beware: Python is **zero-indexed**, so the first element is element 0!

In [None]:
letters = ['a', 'b', 'c']
letters[0]

In [None]:
letters[2]

Specifying an invalid location will raise an error.

In [None]:
letters[4]

<font style="color:#800;">
    <strong>Caution</strong>:<br><em>Most programming languages are zero-indexed, so a list with 3 elements has valid locations [0, 1, 2]. But this means that there is no element #3 in a 3-element list! Trying to access it will cause an out-of-range error. This is a common mistake for those new to programming (and sometimes it bites the veterans too).</em>
</font>
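
To stay within the valid range, it helps to know how long a list is. The sketch below uses the built-in `len()` function along with negative locations, which count from the end of the list. It also indexes into the nested list from earlier. The variable names other than `letters` and `nested_list` are made up for illustration.

```python
letters = ['a', 'b', 'c']

# len() returns the number of elements in a list.
count = len(letters)                 # 3

# The last valid location is always one less than the length.
last_letter = letters[count - 1]     # 'c'

# Negative locations count from the end: -1 is the last element.
also_last = letters[-1]              # 'c'

# Nested lists are indexed one level at a time.
nested_list = [1, 2, 3, [4, 5]]
inner = nested_list[3]               # [4, 5]
first_of_inner = nested_list[3][0]   # 4
```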

Not only can you read individual elements using indexing; you can also *overwrite* elements.

In [None]:
greek = ['alpha', 'beta', 'delta']
greek[2] = 'gamma'
greek
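
Overwriting only works for locations that already exist. To make a list longer, one common approach is the list's `append()` method, sketched here on a corrected version of the `greek` list:

```python
greek = ['alpha', 'beta', 'gamma']

# append() adds one element to the end of the list.
greek.append('delta')

print(greek)        # ['alpha', 'beta', 'gamma', 'delta']
print(len(greek))   # 4
```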

### Dictionaries

Dictionaries are collections of **key-value pairs**.
Think of a real dictionary -- you look up a word (a *key*), to find its definition (a *value*).
Any given key can have only one value.

This concept has many names depending on language: map, associative array, dictionary, and more. 

In Python, dictionaries are represented with curly braces. Colons separate a key from its value, and (like lists) commas delimit elements.

In [None]:
ferdinand = {'first_name': 'Ferdinand',
             'last_name': 'Fuchs',
             'age': 90,
             'nationality': 'German',
             'employer': 'Bausparkasse Schwäbisch Hall AG',
             'zip_code': 74523}
ferdinand

In [None]:
guido = {'first_name': 'Guido',
         'last_name': 'van Rossum',
         'age': 65,
         'nationality': 'Dutch',
         'employer': 'Retired',
         'zip_code': 45385}
guido

Values can be looked up and set by passing a key in brackets.

In [None]:
ferdinand['zip_code']

In [None]:
guido['employer']

In [None]:
guido['employer'] = 'Microsoft'
guido
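
Setting a value also works for keys that do not exist yet, which is how a dictionary grows. The sketch below uses a shortened version of the `guido` dictionary; the `'language'` and `'hobby'` keys are made up for illustration.

```python
guido = {'first_name': 'Guido', 'last_name': 'van Rossum'}

# Assigning to a key that doesn't exist yet adds a new key-value pair.
guido['language'] = 'Python'

# The "in" operator checks whether a key is present.
has_language = 'language' in guido      # True
has_hobby = 'hobby' in guido            # False

# Looking up a missing key with brackets raises a KeyError.
# The get() method returns a fallback value instead.
hobby = guido.get('hobby', 'unknown')   # 'unknown'
```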

Dictionaries, like lists, are very flexible.
Keys are generally strings (though some other types are allowed), and values can be anything -- including lists or other dictionaries!
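
As a sketch of that flexibility, here is a dictionary whose values include a list and another dictionary. The record and its contents are hypothetical, chosen only to illustrate nesting.

```python
# Hypothetical record: the values are made up for illustration.
course = {
    'title': 'Python Fundamentals',
    'topics': ['variables', 'lists', 'dictionaries'],   # a list as a value
    'instructor': {'first_name': 'Arno'},               # a dictionary as a value
}

# Chain brackets to reach into nested values, one level at a time.
second_topic = course['topics'][1]                     # 'lists'
instructor_name = course['instructor']['first_name']   # 'Arno'
```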

## Your Turn

<img src="images/exercise.png" style="width: 1000px;"/>

<font class="your_turn">
    Your Turn
</font>

1. Create a list of the first 10 even numbers. Use indexing to find the 4th even number. *Remember that the 4th element is at location 3 because of zero-indexing!*
2. Imagine you need a way to quickly determine a company's CEO given the company name. You could use a dictionary, `company_bosses_dict`, such that `company_bosses_dict['Apple']` returns `'Tim Cook'`. Create the dictionary from the starter code below and add a few more entries. For example, Bob Iger is the CEO of Disney, and Reinhard Klein is the CEO of Bausparkasse Schwäbisch Hall.

Starter code:
```python
company_bosses_dict = {'Apple': 'Tim Cook',
                       'Microsoft': 'Satya Nadella'}
```


## DataFrames
In data science, the most important complex data structure is the **DataFrame**.
A DataFrame is a collection of tabular data -- you might think of it as a *table*, similar to a database table or an Excel sheet.

Let's take a look at one.

In [None]:
# Don't worry about this "boilerplate" code for now.
import pandas as pd
movies = pd.read_csv('../data/movies.csv')
pd.set_option('display.max_columns', 10)

In [None]:
# Asking for the "head" of a DataFrame will show you the first 5 rows.
movies.head()

DataFrames have **column names** (title, year, country, etc.) and **row indexes** (the bold numbers on the left, starting at zero).

In [None]:
# Providing a number n to the "head" method will show the first n rows
movies.head(3)

The values (elements) within the DataFrame are the Python types we covered above: integers, floats, strings, and booleans.

<font style="color:#008;">
    <strong>Question</strong>:<br><em>Which of these columns are strings?</em>
</font>

Because DataFrames can hold almost any kind of data and support powerful *data wrangling* features, they have become the basic unit of data science work.
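
To connect DataFrames back to the lists and dictionaries covered earlier, here is a minimal sketch that builds a small DataFrame by hand instead of reading a file. The data and the name `fruit` are made up for illustration and have nothing to do with the movies data.

```python
import pandas as pd

# Made-up data for illustration: each key becomes a column name,
# and each list holds that column's values, one per row.
fruit = pd.DataFrame({
    'name': ['apple', 'banana', 'cherry'],
    'count': [3, 12, 40],
    'in_season': [True, False, True],
})

# shape is (number of rows, number of columns).
print(fruit.shape)            # (3, 3)

# The column names can be converted to a plain list.
print(list(fruit.columns))    # ['name', 'count', 'in_season']

# head(2) returns the first two rows.
first_two = fruit.head(2)
```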

## Determining What Type of Data Structure Something Is

How can you determine the type of a Python object?
Pass it to the `type` function (we'll talk more about functions later).

In [None]:
x = 5
type(x)

In [None]:
type(movies)

You can even pass values directly to the `type` function.

In [None]:
type(7.2)

In [None]:
type('arno')

In [None]:
type([1,2,3])

In [None]:
type(True)
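
The same function also helps with values that look alike but are not the same type. A short sketch; the variable names are made up, and `print()` is used so that every result is displayed.

```python
# type() works for dictionaries too.
print(type({'Apple': 'Tim Cook'}))   # <class 'dict'>

# Quotes matter: '42' is a string, while 42 is an integer.
print(type(42))                      # <class 'int'>
print(type('42'))                    # <class 'str'>

# The result of type() can be compared against the type names themselves.
is_text = type('42') == str          # True
is_number = type('42') == int        # False
```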

# Questions

Are there any questions before we move on?

<img src="images/any_questions.png" style="width: 1000px;"/>