<a href="https://colab.research.google.com/github/flixie24/Hands-On-Data-Analysis-with-Pandas-2nd-edition/blob/master/Week01_PythonBasics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python basics

## Goal
+ Give some basic definitions
+ Provide the absolute basics $\rightarrow$ more information is available in the book and on the linked websites
+ Make sure you know the commands and examples shown here

## Variables
* In Python, a variable is like a tag or reference that points to an object in memory.
* Python has no command for declaring a variable (unlike other programming languages such as C++). A variable is created the moment you first assign a value to it.
* A variable can have a short name (like x and y) or a more descriptive name (age, carname, total_volume). Rules for Python variable names:
    + Must start with a letter or the underscore character
    + Cannot start with a number
    + Can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, and _)
    + Are case-sensitive (age, Age and AGE are three different variables)

In [None]:
n = 15
name = 'Matthias'
surname = 'Baumann'
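
The naming rules above can be checked directly in Python. The following is a minimal sketch using the built-in string method `isidentifier()`; the candidate names are made up for illustration only.

```python
# Hypothetical candidate names, chosen to illustrate the rules
candidates = ['age', '_total', 'total_volume', '2fast', 'car-name']

valid_names = []
for candidate in candidates:
    # isidentifier() is True when the string follows the naming rules
    print(candidate, candidate.isidentifier())
    if candidate.isidentifier():
        valid_names.append(candidate)

# Names are case-sensitive: these are three different variables
age = 30
Age = 31
AGE = 32
print(age, Age, AGE)
```

Note that `isidentifier()` only checks the spelling of a name; reserved words such as `for` or `if` pass this check but still cannot be used as variable names.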

## Data types
* A data type is a set of possible values together with a set of operations allowed on them.
* We will mainly focus on four data types during the course:

<style>
	.demo {
		border:1px solid #C0C0C0;
		border-collapse:collapse;
		padding:5px;
	}
	.demo th {
		border:1px solid #C0C0C0;
		padding:5px;
		background:#F0F0F0;
	}
	.demo td {
		border:1px solid #C0C0C0;
		padding:5px;
	}
</style>
<table class="demo">
	<caption>Table 1</caption>
	<thead>
	<tr>
		<th>type</th>
		<th>python</th>
		<th>pandas<br></th>
		<th>numpy</th>
	</tr>
	</thead>
	<tbody>
	<tr>
		<td>&nbsp;integer</td>
		<td>int <br></td>
		<td>&nbsp;int64</td>
		<td>&nbsp;int_,int8,int16,int32,int64</td>
	</tr>
	<tr>
		<td>&nbsp;float</td>
		<td>&nbsp;float</td>
		<td>&nbsp;float64</td>
		<td>&nbsp;float16,float32,float64</td>
	</tr>
	<tr>
		<td>&nbsp;string</td>
		<td>&nbsp;str</td>
		<td>&nbsp;object</td>
		<td>&nbsp;str_</td>
	</tr>
	<tr>
		<td>&nbsp;boolean</td>
		<td>&nbsp;bool</td>
		<td>&nbsp;bool</td>
		<td>&nbsp;bool_</td>
	</tr>
	</tbody>
</table>




* In Python, *lists*, *tuples*, *sets*, and *dictionaries* are data types as well (lists and dictionaries are described below)

In [None]:
x = 1
y = 2.8
name = 'Matthias'
male = True

* We can ask Python for the data type of a variable with *type()*

In [None]:
type(y)
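
As a small extension of the cell above, the following sketch prints the type of each example variable, tests a type with `isinstance()`, and converts between types. The variable names `as_int` and `as_text` are introduced here for illustration only.

```python
x = 1
y = 2.8
name = 'Matthias'
male = True

# Print each value together with the name of its type
for value in (x, y, name, male):
    print(value, type(value).__name__)

# isinstance() tests whether a value belongs to a given type
y_is_float = isinstance(y, float)

# Explicit conversion between types
as_int = int(y)    # drops the decimal part
as_text = str(x)   # the number 1 becomes the string '1'
print(y_is_float, as_int, as_text)
```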

## Operators
Operators perform operations on variables and values. *Python* divides the operators into the following groups:
1. Arithmetic operators $\rightarrow$ e.g., $+, -, *, /$
2. Assignment operators $\rightarrow$ e.g., $=$, $+=$
3. Comparison operators $\rightarrow$ e.g., *==*, *!=*
4. Logical operators $\rightarrow$ *and, or, not*
5. Identity operators $\rightarrow$ *is, is not*
6. Membership operators $\rightarrow$ *in, not in*
7. Bitwise operators $\rightarrow$ e.g., *&, |*

In [None]:
x = 1
y = 2.8
name = 'Matthias'
male = True

In [None]:
z = x + y
z
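
To make the seven operator groups concrete, here is a short sketch with one or two operators from each group. The values of `a`, `b`, `first`, and `second` are arbitrary and chosen only for illustration.

```python
a = 7
b = 2

# 1. Arithmetic
print(a + b, a - b, a * b, a / b)   # division with / returns a float

# 2. Assignment
total = a
total += b                          # same as total = total + b

# 3. Comparison
print(a == b, a != b, a > b)

# 4. Logical
print(a > 5 and b > 5, a > 5 or b > 5, not a > 5)

# 5. Identity: two lists with equal content are still two different objects
first = [1, 2]
second = [1, 2]
print(first == second, first is second)

# 6. Membership
print(2 in first, 3 not in first)

# 7. Bitwise (operates on the binary representation of integers)
print(a & b, a | b)
```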

## Lists
* In this class, we will often work with collections of data $\rightarrow$ primarily *lists* and *dictionaries*
* A list is an **ordered** collection of items that can be accessed via their index $\rightarrow$ the first item has the index 0!
* A list is dynamic: one can add multiple elements to it and update or delete elements. Also, one doesn’t need to predefine the size of the list. One can insert any number of items in the list, and it can dynamically increase its size internally.
* Lists are written with square brackets

In [None]:
list_of_ints = [11, 13, 26, 90, 5, 22, 13]
list_of_ints

In [None]:
# access item
list_of_ints[0]

In [None]:
# add item to list
list_of_ints.append(87)
list_of_ints

In [None]:
# remove item from list --> here '3' is the index of the item we want to remove
list_of_ints.pop(3)
list_of_ints

In [None]:
# Use 'remove' to remove the first instance of a value
list_of_ints.remove(13)
list_of_ints

* A list can also contain lists with data $\rightarrow$ important when an item is composed of several pieces of information.
* One can access each individual item by chaining indices: the first index selects the inner list, the second selects the item within it

In [None]:
ints_and_strings = [['item1', 4], ['item2', 4], ['item3', 5]]
ints_and_strings[0][0]

* Whenever we need a heterogeneous data structure that can dynamically change its size, keep elements ordered, and contain duplicates, a list is a good choice.
* Disadvantage of lists: individual items can only be accessed directly through their index; to find an item by its value, Python has to search through the list
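
The sketch below illustrates index-based access and what it takes to find an item by its value. The list `numbers` repeats the sample values from above; all other names are introduced for illustration.

```python
numbers = [11, 13, 26, 90, 5, 22, 13]

# Finding an item by value means searching the list
has_26 = 26 in numbers          # membership test
position = numbers.index(26)    # index of the first match (ValueError if absent)
count_13 = numbers.count(13)    # duplicates are allowed in a list

# Negative indices count from the end; a slice returns a new list
last = numbers[-1]
first_three = numbers[:3]

print(has_26, position, count_13, last, first_three)
```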

## Dictionaries
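* Dictionaries are collections of key-value pairs that are changeable and indexed by key. Since Python 3.7 they keep the order in which entries were inserted. They cannot contain duplicate keys.
* Dictionaries are important for certain data formats and objects (json, geojson, ee.FeatureCollection() )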
* A dictionary keeps the elements in key-value mapping format and internally uses hashing for it; therefore, we can get a value from the dictionary by its key very quickly.
* To create a dictionary, we can use curly brackets

In [None]:
profs = {"Dirk": ['EOL', 8],
         "Matthias": ['Biogeo', 4]}
profs

**Important points regarding dictionaries**
* Keys are always unique in the dictionary
* Once a key-value pair has been added to the dictionary, the key itself cannot be modified, although we can change the value associated with it.

* We can update a dictionary by adding additional entries

In [None]:
profs['Patrick'] = 'EOL'
profs

In [None]:
profs.update({'Tobias': ['Biogeo', 9]})
profs
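
The points above can be tried out in a few lines. This is a minimal sketch on a separate dictionary `staff` (a hypothetical copy of the example data, so that `profs` stays untouched); the key `'Anna'` and the new key `'Matthias B.'` are made up for illustration.

```python
staff = {'Dirk': ['EOL', 8], 'Matthias': ['Biogeo', 4]}

# Look up a value by its key, then index into the list stored there
lab = staff['Dirk'][0]

# .get() returns None for a missing key instead of raising a KeyError
missing = staff.get('Anna')

# Assigning to an existing key overwrites its value (keys stay unique)
staff['Dirk'] = ['EOL', 9]

# A key cannot be edited in place: remove the entry and add it under a new key
staff['Matthias B.'] = staff.pop('Matthias')

print(lab, missing, staff)
```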

* Dictionaries may not seem intuitive at the beginning, but once you take a more detailed look at them $\rightarrow$ they make a lot of sense



## Loops
*Loops* are useful for repeating the same operation several times. *Python* offers two general types of loops:

### *while*-loop
* Within a *while*-loop we can execute a set of operations as long as a condition is true.
* A *while* statement starts with the *while* keyword, followed by a condition and a colon. The block of the loop follows: a group of statements indented by one level.

In [None]:
i = 1
g = 15
while i < g:
    print('i equals', i, 'which is smaller than', g)
    i += 1
print('Now, i equals', i, 'which is not smaller than', g)

### *for*-loop
* A *for*-loop in Python is useful when iterating over a sequence of elements
* The *for* keyword is followed by a variable, then the *in* keyword, then a sequence, and finally a colon. The indented statements after the *for* statement form the block (also called the suite) of the loop.

In [None]:
# Let's get back to our example of ints
list_of_ints

In [None]:
# Multiply each element by 5 and print both values
for i in list_of_ints:
    print(i, i*5)

* Iterating over a dictionary is a bit more involved: the *items()* method returns each entry as a key-value pair

In [None]:
for key, value in profs.items():
    print(key, value)
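
Besides `items()`, a dictionary can also be iterated by key only or by value only. The sketch below shows all three variants on a hypothetical dictionary `staff` that mirrors the example data; the list `labs` is introduced only to collect results.

```python
staff = {'Dirk': ['EOL', 8], 'Matthias': ['Biogeo', 4]}

# Looping over the dictionary itself yields the keys
for key in staff:
    print(key)

# values() yields only the values
for value in staff.values():
    print(value)

# items() yields key-value pairs, which can be unpacked into two variables
labs = []
for key, value in staff.items():
    print(key, 'works in', value[0])
    labs.append(value[0])
```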

## List comprehension
List comprehensions are an elegant way to compress a list-building *for*-loop into a single line. They are comparable to the mathematical notation of sets $\rightarrow$ in mathematics, the squares of the natural numbers are defined as: $\{x^{2} \mid x \in \mathbb{N}\}$

In [None]:
[n ** 2 for n in range(1,12)]

* The statement is almost readable in plain English: “construct a list consisting of the square of n for each n from 1 up to (but not including) 12”.
* The basic syntax is: [**expr** for **var** in **iterable**], where
    * **expr** is any valid expression
    * **var** is a variable name
    * **iterable** is any iterable Python object.
* We are free to do whatever we want in the **expr**, e.g., arithmetic or a function call
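
To show what a list comprehension replaces, the sketch below builds the same list with an explicit *for*-loop and with a comprehension, and then varies the expression. The optional `if` filter at the end of a comprehension is not covered above and is added here as an extra; all variable names are chosen for illustration.

```python
# Explicit loop
squares_loop = []
for n in range(1, 12):
    squares_loop.append(n ** 2)

# Equivalent list comprehension
squares = [n ** 2 for n in range(1, 12)]

# The expression can be any computation on the loop variable
labels = ['item' + str(n) for n in range(1, 4)]

# An optional condition keeps only the elements that pass the test
even_squares = [n ** 2 for n in range(1, 12) if n % 2 == 0]

print(squares == squares_loop, labels, even_squares)
```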

## Functions
* Functions are generally helpful when you realize that you reuse the same bits of code over and over. Functions have the advantage that they can be (a) easily transferred between scripts and (b) developed into "tools".
* A function definition always starts with *def* followed by the function name. The parentheses contain the arguments that the function requires; they are followed by a colon.
* Inside the function (with an indent) we write the block of the function. The *return*-statement defines which value the function should return

In [None]:
def Square(x):
    sq = x * x
    return [x, sq]
Square(9)

* We can also loop over a list and execute the function

In [None]:
for i in list_of_ints:
    print(Square(i))
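
A function can take more than one argument, and an argument can have a default value that is used when the caller omits it. The following is a minimal sketch; the function name `power_table` and its parameters are hypothetical and not part of the course material.

```python
def power_table(values, exponent=2):
    """Return a list of [value, value ** exponent] pairs."""
    result = []
    for value in values:
        result.append([value, value ** exponent])
    return result

# exponent is omitted, so the default of 2 is used
squares_table = power_table([1, 2, 3])

# exponent is passed explicitly by name
cubes_table = power_table([1, 2, 3], exponent=3)

print(squares_table)
print(cubes_table)
```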

## if/else-clauses
* In Python, by default, statements are executed in sequential order, i.e., one after another. For example:

In [None]:
x = 18                              # Statement 1
print('Value of x is: ', x)         # Statement 2
print('x is less than 10')          # Statement 3
print('x is a single digit number') # Statement 4
print('This is last line')          # Statement 5

* Often, we don't want to run all lines sequentially, but we want to introduce conditions (i.e., do some decision making) and execute specific statements only when a condition is fulfilled.
* *if*-statements allow for this. The *if* keyword is always followed by a conditional expression, which should evaluate to a bool value, i.e., either True or False. If the condition evaluates to True, the interpreter executes the statements in the *if*-block (also called the suite). If the condition evaluates to False, the interpreter skips the lines in the *if*-block and jumps directly to the end of it.

In [None]:
x = 6
print('Value of x is: ', x)
if x < 10:
    print('x is less than 10')
    print('x is a single digit number')
print('This is last line')

* We can also test multiple conditions at once by combining them with the logical operators *and* and *or*
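
A minimal sketch of *if*-statements with combined conditions; the flag `is_single_digit` and the chosen thresholds are introduced for illustration only.

```python
x = 6
is_single_digit = False

# Both conditions must be True for the block to run
if x >= 0 and x < 10:
    print('x is a non-negative single digit number')
    is_single_digit = True

# At least one condition must be True for the block to run
if x < 0 or x > 100:
    print('x is outside the range 0 to 100')

print('This is last line')
```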

* The *if/else* clause is an extension of the *if*-statement. Instead of simply skipping the code block when the condition is False, it introduces an alternative code block (pathway) for the code to run

In [None]:
x = 4
if x < 10:
    print('x is less than 10')
else:
    print('x is greater than or equal to 10')