# Python Basics

Python is a programming language. This means that you give commands that a computer understands, and the computer executes those commands. Most (but [not all](https://scratch.mit.edu/)) programming languages are text-based: You type your commands. Computers are polyglots: They can understand multiple languages, one of which is Python.

## Commands

In their simplest form, commands in Python have the form `verb(object)`. So a basic command looks like this:

In [1]:
print('Hello World!')

Hello World!


Here, `print` is the verb, the instruction that tells the computer what to do. `'Hello World!'` is the object, the text that the computer should print. Some commands do not require an object:

In [2]:
credits()

    Thanks to CWI, CNRI, BeOpen.com, Zope Corporation and a cast of thousands
    for supporting Python development.  See www.python.org for more information.


And some can take multiple objects:

In [3]:
round(3.14159, 2)

3.14

These commands are called “functions.” Some commands also include a subject, in the form `subject.verb(object)`. They are called “methods” (of the subject, which, confusingly, is called an “object” in Python).

In [4]:
'Hello World!'.replace('e', 'o')

'Hollo World!'
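
Other string methods follow the same `subject.verb(object)` pattern. The short sketch below tries three of Python's built-in string methods (`upper`, `count`, and `split`) on the same text; the choice of methods is only an illustration, and there are many more.

```python
# The subject stands in front of the dot, the verb after it.
print('Hello World!'.upper())      # HELLO WORLD!
print('Hello World!'.count('l'))   # 3
print('Hello World!'.split(' '))   # ['Hello', 'World!']
```

Note that `upper` needs no object at all, just like `credits()` above.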

## Types of data

When using Python for humanities research, one of the most important aspects is working with data. The task that David Mimno calls [data carpentry](http://www.mimno.org/articles/carpentry/) comprises getting data, cleaning data, enriching data, and converting data from one form into another. Most of this can be done manually in principle, but it is mostly time-consuming routine work that is not intellectually challenging. So it is a good task for a computer.

Before we deal with data formats commonly used in the humanities, we should learn a bit about data types in Python. We have already seen three important data types: “string,” which basically means “text,” and “integer” and “float,” which are both numbers, but only a float can have a fractional part. The quotes (both `'` and `"` work) distinguish text from numbers, and the decimal point distinguishes floats from integers. The distinction is important, since some functions and methods work only with certain types of data, or behave differently depending on the type. Take this example:

In [5]:
3 + 3

6

In [6]:
'3' + '3'

'33'

Fortunately, we can convert data from one type to another:

In [7]:
3 + int('3')

6

In [8]:
str(3) + '3'

'33'
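
If you are unsure which type a value has, you can ask Python. The sketch below uses the built-in functions `type` and `float`, which were not used above, to inspect and convert a few values.

```python
# type() reports the data type of a value.
print(type('3'))    # <class 'str'>
print(type(3))      # <class 'int'>
print(type(3.0))    # <class 'float'>

# float() converts text to a number with a fractional part.
print(float('3.5') + 1)   # 4.5

# Mixing the types without converting does not work:
# 3 + '3' stops with a TypeError.
```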

*Note:* In this example, I use `'` to mark strings. If a sentence contains `'` as a normal character, one can use `"`:

```python
"Isn't this nice"
```

If a text contains both `'` and `"`, one can use three quotes, i.e. either `'''` or `"""`, to mark strings:

```python
'''He said: "Isn't this nice?"'''
```
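
The quotes only mark where a string starts and ends; they are not part of the text itself. As a small check, the following lines print strings written with each of the three quote styles:

```python
# The surrounding quotes do not show up in the printed text.
print('Hello World!')
print("Isn't this nice")
print('''He said: "Isn't this nice?"''')
```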

## Variables and Functions

Usually, we want our commands to work not only for one given value, but for any data that we might work with. This is why we usually work with placeholders, called variables, rather than with fixed values. That’s why we learned to work with “functions” in math: $3^2 = 9$ is rather specific, while $f(x) = x^2$ works for any value of $x$. We can use the same principle. Instead of this:

In [9]:
'Hello ' + 'World' + '!'

'Hello World!'

we can write this:

In [10]:
name = 'Frederik'

'Hello ' + name + '!'

'Hello Frederik!'

In our code, we replace the parts that can change with a variable, in this case `name`. In order to run our code with a different value, we only need to change the variable and can leave the rest of our code intact.

In [11]:
name = 'Sarah'

'Hello ' + name + '!'

'Hello Sarah!'

Now we still had to copy and paste the actual code, even though we did not change it. Copy and paste is a bad thing: If we improve our code, we have to copy and paste the improved version all over the place. So we should use a function that we write once and that we can use several times. $f(x) = x^2$ is a function, and in Python, it would look like this:

In [12]:
def f(x):
    return x**2

In Python, in contrast to many other languages, space is important information. The indentation of the second line is a signal: This part belongs to the definition of the function. So this is not the same as the following, which Python rejects with an `IndentationError`:

```python
def f(x):
return x**2
```

It is a convention to always use four spaces for indentation, but in many editors (and in a Python notebook), you can simply press the Tab key: ↹

Now we can call this function with different values of $x$:

In [13]:
f(3)

9

In [14]:
f(4)

16
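
The value in the parentheses does not have to be typed in directly. As a small sketch, the function also accepts a variable (here the made-up name `number`), a float, or even the result of another call:

```python
def f(x):
    return x**2

number = 5
print(f(number))   # 25
print(f(2.5))      # 6.25
print(f(f(2)))     # 16, because f(2) is 4 and f(4) is 16
```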

We’re not mathematicians, but we can use the same principle:

In [15]:
def hello(name):
    return 'Hello ' + name + '!'

hello('hackers')

'Hello hackers!'

### Task

Write a function that notifies you of new e-mails. Use the template below and complete it.

In [16]:
def new_mail(name, number):
    return ''

The result should be this:

In [17]:
new_mail('Frederik', 3)  # => 'Hello Frederik, you have 3 new e-mails.'

''

## Complex data types

Often, we don’t have single values, but rather lists of values: Texts are lists of words, columns in a spreadsheet are lists of values, etc. In Python, we can group multiple single values in containers:

In [18]:
names = ['Maja', 'Willi', 'Flip']

Each element in the list has an “address.” In Python, counting starts at zero, which can be confusing in the beginning. Knowing that, the address of the first name is `0`. So you can get the first element like this:

In [19]:
names[0]

'Maja'

So how do you get the last name? Since the number of elements can change, it is better not just to use `2` to get the third element (starts from 0, remember?), but to determine its position dynamically. First, you need to know how long the list is, then you can get the element.

In [20]:
len(names)

3

In [21]:
length = len(names)
last = length - 1  # Why -1? Because 0-based indexing!
names[last]

'Flip'

You can also combine all the commands in one line. While it sometimes makes sense to be shorter, be careful not to write code so complex that you no longer understand it after a day or so!

In [22]:
names[len(names) - 1]

'Flip'

In fact, there is a shorter, but less readable way to get the last element: You can count backwards from the end.

In [23]:
names[-1]

'Flip'
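
Counting backwards works for the other elements, too. The following sketch repeats the list from above and shows that both ways of addressing reach the same element:

```python
names = ['Maja', 'Willi', 'Flip']

# Negative addresses count from the end of the list.
print(names[-1])   # Flip
print(names[-2])   # Willi
print(names[-3])   # Maja

# Both ways of addressing the last element agree.
print(names[-1] == names[len(names) - 1])   # True
```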

A second important container type is a dictionary. A dictionary maps each entry (the key) to a piece of information about it (the value):

In [24]:
animals = {'Maja': 'bee',
           'Willi': 'bee',
           'Flip': 'grasshopper'}

In a dictionary, the address for an element is not its position, but the dictionary entry or key:

In [25]:
animals['Maja']

'bee'

In [26]:
def hello(name):
    return 'Hello ' + name + ', you are a ' + animals[name] + '!'

hello('Flip')

'Hello Flip, you are a grasshopper!'
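
Dictionaries can grow, and looking up a key that does not exist stops with a `KeyError`. The sketch below adds an entry and checks for keys with the `in` operator before looking them up; the names `'Thekla'` and `'Kurt'` are made up for this illustration.

```python
animals = {'Maja': 'bee',
           'Willi': 'bee',
           'Flip': 'grasshopper'}

# Add a new entry by assigning a value to a new key.
animals['Thekla'] = 'spider'

# Check whether a key exists before looking it up;
# animals['Kurt'] would stop with a KeyError.
print('Thekla' in animals)   # True
print('Kurt' in animals)     # False

# len() counts the entries, just as it counts list elements.
print(len(animals))          # 4
```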

## Control flow

### Loops

Working with lists and similar sequence types is often handy. One of the main strengths of code is that it can repeat a task: Usually, we want to do something not only once, but several times. How often depends on the number of items we deal with, e.g. when reading the title of each file in a folder. This way of repeating a task is called a loop. In a loop, we do something ***for*** every item ***in*** a list:

In [27]:
for name in names:
    print('Hello ' + name + '!')

Hello Maja!
Hello Willi!
Hello Flip!


In this case, `name` is a variable whose name you can choose freely. Every time the indented code runs again, the variable `name` gets a new value, until all values in the list have been dealt with.

In case you need to know in which loop you are, you can enumerate your list:

In [28]:
for i, name in enumerate(names, start=1):  # Otherwise, it would start with 0 again, which is less pretty.
    print(str(i) + '. ' + name)

1. Maja
2. Willi
3. Flip


Here, we have two variables: The name, but also a counter that tells us the number of the loop. It often gets called `i` (like the index in $\mbox{names}_{i}$), but you can call it anything you want. Just remember that it is an integer, so you have to convert it before you can add it to a string.
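
To see what `start=1` changes, here is the same loop without it, as a small sketch. The counter then begins at 0 and matches the addresses of the list elements:

```python
names = ['Maja', 'Willi', 'Flip']

# Without start=1, the counter begins at 0, like list addresses.
for i, name in enumerate(names):
    print(str(i) + '. ' + name)
# 0. Maja
# 1. Willi
# 2. Flip

# The counter can therefore be used as an address.
for i, name in enumerate(names):
    print(names[i] == name)   # True, three times
```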

### Conditions

A second important tool when writing code is the condition: Sometimes, you want to execute some code only ***if*** a certain condition is met.

In [29]:
if len(names) == 2:
    print('I see, there are two of you.')
if len(names) == 3:
    print('I see, there are three of you.')
if len(names) == 4:
    print('I see, there are four of you.')

I see, there are three of you.


An `if` clause can also be combined with an `else` clause that is executed if the condition is ***not*** met:

In [30]:
if len(names) == 0:
    print('Nobody there.')
else:
    print('Somebody there.')

Somebody there.


The `==` is used to check if two values are the same. Since the single `=` is already used to assign a value to a variable, it cannot be used for comparisons. Other types of comparisons are `<` (smaller than), `>` (larger than), `<=` (smaller or equal), `>=` (larger or equal), and `!=` (not equal).
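
A comparison is itself a small command with a result: either `True` or `False`. The sketch below prints the results of a few comparisons directly, without an `if`, to show what the condition evaluates to.

```python
names = ['Maja', 'Willi', 'Flip']

# A comparison produces one of the two truth values.
print(len(names) == 3)   # True
print(len(names) != 3)   # False
print(len(names) < 3)    # False
print(len(names) >= 3)   # True

# Strings can be compared as well.
print('bee' == 'bee')    # True
print('bee' == 'Bee')    # False, upper and lower case differ
```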

In [31]:
if len(names) < 1:
    print('Where is everybody?')
if len(names) == 1:
    print('You are alone!')
if len(names) >= 3:
    print('You are a group!')
if len(names) > 1:
    print('You are a couple!')

You are a group!
You are a couple!


Usually, we want to execute only one block, so we need to stop at the first condition that is met. This can be achieved with the ***else-if*** statement `elif`:

In [32]:
if len(names) < 1:
    print('Where is everybody?')
elif len(names) == 1:
    print('You are alone!')
elif len(names) >= 3:
    print('You are a group!')
elif len(names) > 1:
    print('You are a couple!')

You are a group!
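
Because only the first matching block runs, the order of the conditions matters. One possible variant, sketched below, wraps the checks in a function so that they can be tried with different numbers, and ends with an `else` as a catch-all. The function name `describe` and its parameter `count` are made up for this illustration.

```python
def describe(count):
    # Conditions are checked from top to bottom; the first match wins.
    if count < 1:
        return 'Where is everybody?'
    elif count == 1:
        return 'You are alone!'
    elif count == 2:
        return 'You are a couple!'
    else:
        # Reached only if none of the conditions above was met.
        return 'You are a group!'

print(describe(0))   # Where is everybody?
print(describe(2))   # You are a couple!
print(describe(3))   # You are a group!
```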


We can also combine loops and conditions. If you nest multiple blocks like loops or conditions, you just have to indent the inner block more:

In [33]:
for name in names:
    if animals[name] == 'bee':
        print('Welcome, ' + name + '!')
    else:
        print('Only bees, please!')

Welcome, Maja!
Welcome, Willi!
Only bees, please!


### Task

Now write a combination of a loop and conditional statements so that your code prints a nicer greeting compared to our original one:

    Hello Maja,
    hello Willi,
    hello Flip!

In [34]:
# Enter your code here.