# Python For The Digital Humanities, Part 1: Basic Python
Manuel Huth, 2026

This file is part of a Python online course consisting of a reader and several scripts. You can find the entire course on GitHub (https://github.com/talant26). The course aims to introduce scholars of the humanities with no prior knowledge to the basics of Python, demonstrating typical applications in the humanities, such as extracting and visualising information from texts.


## Helpful Tutorials and Websites

**Websites**

* https://wiki.python.org/
* https://www.w3schools.com/python/
* https://www.geeksforgeeks.org/python-if-else/
* https://automatetheboringstuff.com

**YouTube Tutorials**
* https://www.youtube.com/watch?v=_uQrJ0TkZlc
* https://www.youtube.com/watch?v=qwAFL1597eM

**Markdown Guide for Google Colab**
* https://colab.research.google.com/notebooks/markdown_guide.ipynb



## Recommendations for working with Google Colab:
* Tutorial: https://www.tutorialspoint.com/google_colab/index.htm
* Change the editor so that line numbers are displayed. (Tools -> Settings -> Editor -> show line numbers)
* Google Colab does not permanently store files. To keep your files, download them or save them to Google Drive (https://www.tutorialspoint.com/google_colab/google_colab_saving_work.htm).
* If errors occur when executing individual scripts, please click the 'Run all' button, which can be found either at the top of the menu bar or in the 'Runtime' > 'Run all' tab. This will execute all scripts in the correct order. This helps to avoid errors that may occur if the session has been inactive for too long, for example, or if a script accesses variables that have not been set previously.
* For the scripts to work, make sure you have uploaded the files 'Copyright.txt', 'Mainpart.txt' and 'DoubleNames.csv'.
* If 'Run all' does not help, you can try to restart the session (Runtime > Restart Session).

# Python Basics

## Comments
A *comment* is text that is ignored by the interpreter. It starts with a hash sign (#). In your editor, comments look like this:

In [None]:
# This is a comment

### Why you should use comments
* Use comments to explain your code
* Use comments to structure your code
* Comments allow you to (temporarily) disable parts of your code. This is a great help when debugging (i.e. searching for errors).

### Multiline comments
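You can mark multiple lines as a comment by enclosing them in three quotation marks (either single or double quotation marks). Strictly speaking, this creates a text that is not stored in any variable, so it has no effect on the program.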



In [None]:
'''This is
a
multiline
comment'''

"""This is
a
multiline
comment
as
well"""

## The print function
You can use the ***print function*** to display text. Just type the word "print" and then type the text you want to display in parentheses and quotation marks. You can use either single or double quotes.

In [None]:
# The print function using single quotation marks
print('Hello')

# The print function using double quotation marks
print("Hello")

If you want to display a number, you don't need quotation marks.

In [None]:
print(7)

This is because numbers are interpreted as a different data type, as we will see later (see the chapter Basic data types).

### Exercise
Here is an exercise you can do to try out the things we just learned.

In [None]:
# Task 1: The print function
# Use the print function to display the following words: This is my first script
# Use the print function to display the number 54

# Space for your code:

**Solution**

In [None]:
print('This is my first script')
print(54)

### Question
Now let's assume we want to display the following words:

*And he said: "Luke, I am your father."*

**How can we accomplish this?**

Try it out **in** the following code block:

In [None]:
# Space for you to try it out:


**Solution**

In [None]:
print('And he said: "Luke, I am your father."')

# also possible:

print("And he said: 'Luke, I am your father.'")

# So you can use double quotes to enclose single quotes
# or you can use single quotes to enclose double quotes
# Single quotes cannot enclose single quotes
# Double quotes cannot enclose double quotes
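
If you do need the same kind of quotation mark inside and outside, one option is to put a backslash in front of the inner quotation marks (this is called escaping). A minimal sketch, with example sentences chosen only for illustration:

```python
# The backslash tells Python that the following quotation mark
# is part of the text and does not end it.
print('It\'s a trap!')
print("And he said: \"Luke, I am your father.\"")
```

The backslash itself is not displayed; only the quotation mark that follows it appears in the output.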

## Arithmetic Operators
When it comes to coding, we need to do at least some mathematical operations. In this workshop, we will use only the four basic arithmetic operators you learned in elementary school:
* Addition: +
* Subtraction: -
* Multiplication: *
* Division: /

For further arithmetic operators (like modulus), see 'Further References'.

In [None]:
print(7+2)
print(7-4+3-1+27)
print(2*2)
print(6/2)
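
Python offers a few more arithmetic operators than the four listed above. The following short sketch shows three of them, together with a detail of the division operator that often surprises beginners. The numbers are arbitrary examples:

```python
# Division with / always returns a decimal number, even if the result is whole
print(6 / 2)   # 3.0

# Floor division: the whole-number part of a division
print(7 // 2)  # 3

# Modulus: the remainder of a division
print(7 % 2)   # 1

# Exponentiation: 2 to the power of 3
print(2 ** 3)  # 8
```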

We can use the "+" operator to concatenate words/texts (this can be an important tool if, for example, you want to automatically replace certain parts of a text):

In [None]:
print('Hi' + ' ' + 'there' + '.')

### Further References
* https://www.w3schools.com/python/gloss_python_arithmetic_operators.asp
* https://www.geeksforgeeks.org/python-arithmetic-operators

## Variables
Variables are like containers that can store different types of information (such as numbers or text) for you. In Python you declare variables by writing the name of the variable, an equal sign, and the value you want to assign to it. You can then access the variable in other functions (such as the print function):

In [None]:
# We assign the value 7 to the variable a.
# We do not need quotes, because 7 is a number
a = 7

# We assign the value 'Hi' to the variable b.
# We need quotes, because 'Hi' is a word / text
b = 'Hi'

print(a)
print(b)

You can name variables almost anything you like (only reserved words such as "if" or "while" are not allowed). You can even combine letters and numbers, but a variable name cannot begin with a number. It is also recommended to avoid special characters. So you could name a variable "RogueOne" or "Rogue1", but not "1Rogue".

You can change the content of variables any time. Note how the interpreter reads line by line and how the output changes:

In [None]:
# We assign the value 7 to the variable a and display it.
a = 7
print(a)

# Now we assign another value to the variable a and display it.
a = 9
print(a)

We can define a variable relative to other variables. And unlike in mathematics, we can even define or change a variable relative to itself. In fact, this is an important operation when it comes to while loops, as we will see (see the chapter on while loops):

In [None]:
# We assign the value 7 to the variable a and display it.
a = 7
print('Value of a:')
print(a)

# Now we define b in relation to a
# Then we display a and b
b = a
print('Values of a and b after equalization:')
print(a)
print(b)

# Now we change b in relation to itself and increment it by 2
b = b + 2
print('Values of a and b after changing b:')
print(a)
print(b)

**As you can see, even after equalizing the variables, we can still change them independently!**
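
Changing a variable relative to itself is so common that Python offers a shorthand for it. The following sketch (the variable name counter is only an example) shows the operators += and -=:

```python
counter = 7

# Shorthand for: counter = counter + 2
counter += 2
print(counter)  # 9

# Shorthand for: counter = counter - 4
counter -= 4
print(counter)  # 5
```

Both spellings do the same thing, so you can use whichever you find easier to read.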

Just like the words / texts we combined in the print function, you can also combine text variables:

In [None]:
# We assign the value 'Hi' to the variable a and 'you' to the variable b.
a = 'Hi'
b = 'you'

# Now we define the variable c as the combination of a, a white space and b
c = a + ' ' + b

# Now we display c
print(c)

### Further References
* https://www.w3schools.com/python/python_variables.asp
* https://www.geeksforgeeks.org/python-variables/



### Exercises with solutions
Here are three exercises you can do to try out the things we just learned. If you forgot one command, do not hesitate to look it up, or search for it on w3schools: https://www.w3schools.com/python/


**Exercise 1**

* Create a variable 'a' and store the number 123 in it.
* Then create another variable 'b' and store the number 234 in it.
* Multiply both variables and store the result in the variable 'c'.
* Display the content of the variable 'c' with the print function.

In [None]:
# Space for your code:

**Solution**

In [None]:
# Space for your code:
a = 123
b = 234
c = a * b
print(c)

**Exercise 2**

* Create a variable 'a' and store the word 'Hello' in it.
* Then create another variable 'b' and store the word 'world' in it.
* Create a variable 'c': its content shall be the word stored in variable a, then a white space (' ') and then the word stored in variable 'b'.
* Display the variable 'c' with the print function.

The output should be: 'Hello world'

In [None]:
# Space for your code:

**Solution**

In [None]:
# Space for your code:

a = 'Hello'
b = 'world'
c = a + ' ' + b

print(c)

**Exercise 3: Defining variables**

Part 1:
* Create a variable 'number1' with the value 4.
* Create a variable 'number2' with the value 3.
* Display both variables with the print function.

Part 2:
* Now make 'number2' equal to 'number1'.
* Now increment the value of 'number1' by 1.
* Display both variables with the print function.

In [None]:
# Space for your code:


**Solution**

In [None]:
# Space for your code:
number1 = 4
number2 = 3

print(number1)
print(number2)

number2 = number1
number1 = number1 + 1

print(number1)
print(number2)

## Basic data types
If you try to combine a variable that contains a number with a variable that contains a word or text, you will get an error message. This is a common error. The reason is that you are trying to combine **different data types**. The interpreter does not know whether to concatenate strings (= words or texts) or to perform an addition:

In [None]:
# We assign the value 'I am' to a, 9 to b and 'years old' to c
a = 'I am '
b =  9
c = ' years old'

# Now we try to display the combination of a, b and c.
# This raises a TypeError, because b is a number while a and c are strings.
print(a+b+c)

### What are data types?
A data type describes the kind of information that is stored inside a variable. Depending on the type, you can perform certain operations (for example, additions). In Python, *data types* are assigned automatically when you declare a variable.

Type  | Abbreviation | Description | Example
-------------------|------------------|---------|---------------------------
string                 | str   | A sequence of characters like a word or text | "The Mandalorian"
integer                | int   | Whole numbers (positive or negative) | 42
floating point number  | float | Decimal numbers | 3.2
boolean                | bool  | True or False | True
list                   | list  | A list of values (see chapter Lists) | [ 9, 2, 7 ]
dictionary             | dict  | A dictionary of key-value pairs (see chapter Dictionaries) | {"name": "Luke", "profession": "Jedi"}

Notes
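* For more data types see "Further References".
* Calculations with *floating point numbers* can be slightly imprecise in Python, because most decimal fractions cannot be represented exactly in the computer's binary format (for example, 0.1 + 0.2 yields 0.30000000000000004). Where exact decimal arithmetic is required, the *decimal module* of the standard library can be used. The *math module* offers additional mathematical functions, but it does not remove this imprecision.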
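
To make the note on floating point numbers more concrete, here is a small sketch. It compares ordinary float arithmetic with the decimal module of Python's standard library, which is one possible way to calculate with exact decimal values. The variable name exact_sum is chosen only for illustration:

```python
from decimal import Decimal

# Ordinary floating point arithmetic shows a tiny rounding error
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# The Decimal type works with exact decimal values.
# Note that the numbers are passed as strings.
exact_sum = Decimal('0.1') + Decimal('0.2')
print(exact_sum)                    # 0.3
print(exact_sum == Decimal('0.3'))  # True
```

For most tasks in this course the tiny rounding error does not matter, so ordinary floats are usually sufficient.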

### The type function
With the ***type function*** we can read the data types of variables.

In [None]:
# We declare a variable a
a = 'hi'

# Now we use the type function to read the type
print(type(a))

### Enforcing data types
You can convert a value to another *data type* using the abbreviations mentioned above:

In [None]:
# We declare the variable a and read its type
a = 9
print(type(a))

# We enforce the datatype 'str' and read the data type of the variable a
a = str(a)
print(type(a))
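
The conversion also works in the other direction, from text to numbers. The following sketch shows how the same operator behaves differently depending on the data type. The variable names are assumptions made for this example:

```python
text_number = '12'

# With strings, + concatenates
print(text_number + text_number)  # 1212

# After converting to int, + adds
real_number = int(text_number)
print(real_number + real_number)  # 24

# float converts a text to a decimal number
decimal_number = float('3.5')
print(decimal_number * 2)         # 7.0
```

Note that a text like 'twelve' cannot be converted with int. Python would stop with an error message (a ValueError) in that case.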

### Further References
* https://www.w3schools.com/python/python_datatypes.asp
* https://www.geeksforgeeks.org/python-data-types/

## The input function
The *input function* can be used to store user input inside a variable, thus making programs more flexible. The text in parentheses is displayed to the user.

In [None]:
age = input('How old are you?\n')
print('Your age is: ' + age)

The characters \n mark a line break. This allows the user to start writing on a new line.
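
The characters \n are not specific to the input function. They can be used in any string, as this small sketch with a made-up example text suggests:

```python
# One string that is displayed on two lines
message = 'First line\nSecond line'
print(message)
```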

### Further References
* https://www.geeksforgeeks.org/python-input-function

### Exercises with solution

**Exercise 1: User input**

Write a script that prompts the user to enter his name. The script then greets the user by his name (e.g. 'Hello, John')

In [None]:
# Space for your code



**Solution**

In [None]:
# Space for your code

name = input('Please enter your name\n')

print('Hello ' + name)

**Exercise 2: Data types**

There are two errors in the code below. Let us try to find and fix them. Do not hesitate to try out the code or change it as you see fit.

In [None]:
# The goal of the function is to display the square of the number entered by the user:
# So if the user types in 12, the output should be: The square number is 144
number = input('Please enter a number. The program will return the square of it:\n')
square = number * number
print('The square number is ' + square)

**Solution**

In [None]:
# The goal of the function is to display the square of the number entered by the user:
# So if the user types in 12, the output should be: The square number is 144
number = int(input('Please enter a number. The program will return the square of it:\n'))
square = number * number
print('The square number is ' + str(square))

In [None]:
# The goal of the function is to display the square of the number entered by the user:
# So if the user types in 12, the output should be: The square number is 144
number = input('Please enter a number. The program will return the square of it:\n')
number = int(number)
square = number * number
print('The square number is ' + str(square))

## If statements and indentation




### If statements
* If a condition is true, one or more commands are executed.
* An indentation (= one tab or 4 spaces at the beginning of a line) makes it clear which code belongs to the *if statement* and which does not (see below for an example).
* In Google Colab, one tab equals 2 spaces by default (so you can use 2 or 4 spaces, as long as you indent consistently within a block).

I recommend using the **tab key** instead of 4 spaces. This is faster and easier. It also minimizes the risk of miscounting spaces. You can undo an indentation by pressing **shift + tab**.

### Comparison operators
Comparison operators are used to compare values:

Operator  | Explanation
-------------------|------------------
a == b | a equals b   
a != b | a does not equal b  
a < b  | a is smaller than b   
a <= b | a is smaller than or equal to b   
a > b  | a is bigger than b   
a >= b | a is bigger than or equal to b
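
Every comparison results in one of the two boolean values True or False. The following sketch simply displays some of these results. The values of a and b and the variable name is_smaller are chosen only for illustration:

```python
a = 4
b = 7

print(a == b)  # False
print(a != b)  # True
print(a < b)   # True
print(a >= b)  # False

# The result of a comparison can be stored in a variable
is_smaller = a < b
print(type(is_smaller))  # <class 'bool'>
```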

### Simple If statements


In [None]:
# We define the variable 'a' that we want to compare
a = 4

# We check if the variable 'a' is equal to 4
if a == 4:

  # The following indented text is only executed, if the condition is true.
  print(a)

# This code is independent of the if statement, because there is no indentation.
# It will always be executed
print('The program continues...')

### Elif and else

*elif* is short for "else if": its condition is only checked if the preceding conditions were false.

In [None]:
a = 6 # we declare the variable 'a'
b = 5 # we declare the variable 'b'

if a < b: # We check if a is smaller than b

    print(str(a) + ' is smaller than ' + str(b))

elif a == b: # We check if a is equal to b

    print(str(a) + ' equals ' + str(b))

else: # If a is neither smaller than nor equal to b, the following code
      # is executed:

    print(str(a) + ' is bigger than ' + str(b))

In [None]:
name1 = 'John'
name2 = 'Maria'

if name1 == name2:

  print(name1)

else:

  print(name2)

You can also use if-statements within if-statements.

In [None]:
a = 5 # We declare the variable 'a'

if a < 5: # We check if a is smaller than 5

    print(str(a) + ' is smaller than 5')

else: # If a is not smaller than 5, the following code is executed:

    if a == 5: # We check if a equals 5

        print('a equals 5')

    elif a == 6: # We check if a equals 6

        print('a equals 6')

    else: # a must be bigger than 6

        print('a is bigger than 6')
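
Instead of nesting if statements, you can often combine several conditions in one line with the logical operators and, or and not. A minimal sketch follows. The variable names and values are assumptions chosen for illustration:

```python
a = 5

# and: both conditions must be true
if a > 3 and a < 10:
    message = 'a is between 3 and 10'
else:
    message = 'a is not between 3 and 10'
print(message)

# or: at least one of the conditions must be true
is_small_or_large = a < 3 or a > 100
print(is_small_or_large)      # False

# not: reverses a condition
print(not is_small_or_large)  # True
```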

### Further References
* https://www.w3schools.com/python/gloss_python_if_statement.asp
* https://www.geeksforgeeks.org/python-if-else/

### Exercises with solutions


**Exercise 1**

Write a script that prompts the user to enter a number. The script then multiplies the number by 10 and tells the user whether the result is less than, equal to, or greater than 100.

Tip: Pay attention to the data type

In [None]:
# Space for your code:


**Solution**

In [None]:
# Space for your code:

number = input('Please type in your number\n')
number = int(number) * 10

if number < 100:

  print('Your number multiplied by 10 is: ' + str(number) + '. It is smaller than 100.')

elif number == 100:

  print('Your number multiplied by 10 is: ' + str(number) + '. It is equal to 100.')

else:

  print('Your number multiplied by 10 is: ' + str(number) + '. It is bigger than 100.')

**Exercise 2**

Write a script that prompts the user to enter his age. If his age is under 18, it displays the message 'Access denied. You are too young'. If his age is 18 or above, it displays the message 'Access granted'. And if his age is above 100, it displays the message 'Haha, very funny...'.

Tip: Pay attention to the data type

In [None]:
# Space for your code:


**Solutions**

In [None]:
# Space for your code:

age = int(input('enter your age\n'))

if age < 18:

  print('Access denied. You are too young')

elif age <= 100:

  print('Access granted')

else:

  print('Haha, very funny...')

In [None]:
# Space for your code:

age = int(input('enter your age\n'))

if age < 18:

  print('Access denied. You are too young')

elif age > 100:

  print('Haha, very funny...')

else:

  print('Access granted')

In [None]:
# Space for your code:

age = int(input('enter your age\n'))

if age < 18:

  print('Access denied. You are too young')

else:

  if age <= 100:

    print('Access granted')

  else:

    print('Haha, very funny...')

## While loops
While loops are similar to if statements (see the chapter If statements and indentation). But instead of checking only once whether a condition is true, they check it again and again. That is, as long as (or: while) a condition is true, certain commands are executed.

In [None]:
i = 1 # We create a counter that we want to increment each time we run
      # through the following loop. When it reaches a certain value,
      # we want the while loop to stop.

while i < 6: # We check if the counter is smaller than 6

  print('The counter is ' + str(i) + '.') # display the counter
  i = i + 1 # increase the counter by 1, then we will run again
            # through the loop

**Do not forget to increment the counter or you will create an infinite
loop!**

Just like in if statements, you can combine while with else:

In [None]:
i = 1         # our counter

while i < 6:  # check if counter is smaller than 6

  print('The counter is ' + str(i) + '.') # display the counter
  i = i + 1   # increase the counter by 1

else:

  print('The counter is no longer less than 6')

There are some commands that give us more control over the iterations:
* **continue**: skips the current iteration
* **break**: terminates the entire loop

### Break

In [None]:
i = 1 # our counter

while i < 6: # check if counter is smaller than 6

  if i == 3: # if the counter equals 3

    break # the entire loop is immediately terminated

  print('The counter is ' + str(i) + '.') # display the counter
  i = i + 1  # increase the counter by 1

### Continue

In [None]:
i = 1 # our counter

while i < 6: # check if counter is smaller than 6

  if i == 3: # if the counter equals 3

    i = i + 1 # increase the counter by 1
    continue # the rest of the current iteration is skipped and the loop condition is checked again

  print('The counter is ' + str(i) + '.') # display the counter
  i = i + 1 # increase the counter by 1

**Note that we incremented the counter before the continue statement,
otherwise we would have created an infinite loop.**

### Further References
* https://www.w3schools.com/python/python_while_loops.asp
* https://www.geeksforgeeks.org/python-while-loop/

### Exercise
Write a **countdown script** that counts down from 20 to 1 and displays the respective number. After the loop ends, display a success message.

In [None]:
# Space for your code:



### Solution

In [None]:
# Space for your code:

i = 20

while i >= 1: # we loop as long as the counter is at least 1

    print(i)
    i = i - 1

# The loop has ended, so we display the success message
print('Happy new year')

## Lists
"List" is a data type (like *string*, *integer*, *float*, …). A Python list is an ordered collection of values (for example numbers or strings), and it can be stored inside a variable. See the following code example for further explanations.


In [None]:
# Example for a simple list
list1 = [ "Wookie", "Ewok", "Java", "Rodian" ]

# Lists can have duplicates or be empty:
list2 = [1, 3, 5, 7, 9, 7, 7, 7, 7, 7]
list3 = [ ]

# display the three lists
print(list1)
print(list2)
print(list3)

# with the len function (length function) you get the number of elements inside a list
lengthList1 = len(list1)
print(lengthList1)

# There is a method for sorting lists. It will sort the list in an ascending
# order, either alphabetically (if the list contains strings) or numerically
# (if the list contains numbers).
list1.sort()
list2.sort()
print(list1)
print(list2)
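
Note that the sort method changes the list itself. If you want to keep the original order, one option is the built-in sorted function, which returns a new list. The following sketch also shows how to sort in descending order. The list and the variable names are only examples:

```python
numbers = [5, 1, 9, 3]

# sorted returns a new sorted list; the original list stays unchanged
ascending = sorted(numbers)
print(numbers)    # [5, 1, 9, 3]
print(ascending)  # [1, 3, 5, 9]

# With reverse=True, the sort method sorts in descending order
numbers.sort(reverse=True)
print(numbers)    # [9, 5, 3, 1]
```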

### List indexes
Lists are ordered and indexed. That means every entry of a list has an index number by which it can be accessed. The first entry has the index 0, the second one the index 1, and so on (in computer science, counting usually starts at 0 instead of 1). See the following code for examples and further explanations:

In [None]:
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]
# Indexnumber       0         1       2       3

print(starwarslist[0])  # the word "Wookie" will be displayed
print(starwarslist[1])  # the word "Ewok" will be displayed
print(starwarslist[2])  # the word "Java" will be displayed
print(starwarslist[3])  # the word "Rodian" will be displayed

# You can count backwards using negative indexes

starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]
# Indexnumber       -4       -3      -2       -1

print(starwarslist[-1])  # the word "Rodian" will be displayed
print(starwarslist[-2])  # the word "Java" will be displayed
print(starwarslist[-3])  # the word "Ewok" will be displayed
print(starwarslist[-4])  # the word "Wookie" will be displayed

You can also access a range of indexes (i.e., a portion of a list).
* The first number (before the colon) indicates the starting point
* The second number (after the colon) indicates the end point. Note that the end point is not included.

See the following examples:

In [None]:
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]
  # Indexnumbers:   0         1       2       3
  # Indexnumbers:  -4        -3      -2      -1

# Entry one and two will be displayed
print(starwarslist[1:3]) # Output: [ "Ewok" , "Java" ]

# Entry zero and entry one will be displayed
print(starwarslist[:2]) # Output: [ "Wookie", "Ewok" ]

# The entire list starting with entry 1 will be displayed
print(starwarslist[1:]) # Output: ["Ewok", "Java", "Rodian" ]

# The entire list without the last two entries will be displayed
print(starwarslist[:-2]) # Output: [ "Wookie", "Ewok" ]


### String indexes
What we said about the indexes of *lists* also applies to *strings*, which behave much like a list of characters. Therefore, you can access the characters of a string the same way you would access the entries of a list.

In [None]:
examplestring = 'Mandalorian' # we declare a variable with the datatype string

print(examplestring[0])         # displays the first character of the
                                # variable 'examplestring'.
                                # The output therefore is: 'M'

print(examplestring[:-1])       # displays every character of the variable
                                # 'examplestring' except for the last
                                # character. The output therefore is:
                                # 'Mandaloria'

### Erratum
A small error occurred in the reader, in the first line of the script. There should be no brackets:
* Wrong: *examplestring = ['Mandalorian']*
* Correct: *examplestring = 'Mandalorian'*

### The *in* and *not in* operators
You can check if an item is in a list with the **in operator**:




In [None]:
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]

if "Wookie" in starwarslist:
    print("'Wookie' is in the list.")

That also applies to strings. That means you can check if a string is part of another string with the **in operator**:

In [None]:
longString = 'Mos Eisley, you will never find a more wretched hive of scum and villainy.'

if 'scum' in longString:
  print('Yes, the long string contains the word "scum".')

if 'scu' in longString:
  print('Yes, the long string contains the character sequence "scu".')

To check if an item is not in a list / a string is not in another string, type *not in* instead of *in*.
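
A minimal sketch of the *not in* operator, applied first to a list and then to a string. The search terms "Hutt" and "droid" and the variable names hutt_missing and droid_missing are assumptions made for this example:

```python
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]
longString = 'Mos Eisley, you will never find a more wretched hive of scum and villainy.'

# Check whether an item is missing from a list
if "Hutt" not in starwarslist:
    print("'Hutt' is not in the list.")

# The result of the check is a boolean and can be stored in a variable
hutt_missing = "Hutt" not in starwarslist
droid_missing = 'droid' not in longString
print(hutt_missing)   # True
print(droid_missing)  # True
```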

### Adding / removing items
You can add items to a list with the *append method*. If you want to remove an item, you can either use the *pop method* (which uses the index number of the respective item) or the *remove method* (which uses the item itself).

In [None]:
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]

# Add items with the append method.
starwarslist.append("Duros")

# now the list is [ "Wookie", "Ewok", "Java", "Rodian", "Duros" ]

print(starwarslist)

# there are two methods to remove entries:
starwarslist.pop(0) # removes the first entry "Wookie"
starwarslist.remove("Duros") # removes the entry "Duros"

# Now the list is ['Ewok', 'Java', 'Rodian']
print(starwarslist)
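
Two details of these methods may be useful in practice. The pop method hands back the item it removed, and the remove method stops with an error (a ValueError) if the item is not in the list. The following sketch, in which the variable name removed is only an example, shows one way to deal with both:

```python
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]

# pop returns the removed item, so we can store it in a variable
removed = starwarslist.pop(0)
print(removed)       # Wookie
print(starwarslist)  # ['Ewok', 'Java', 'Rodian']

# To avoid an error, we check with the in operator before removing
if "Duros" in starwarslist:
    starwarslist.remove("Duros")

print(starwarslist)  # ['Ewok', 'Java', 'Rodian']
```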

### Copying lists
Note: Copying lists does not work the same way as copying other variables. When you equate two lists, both names refer to the same location in the computer's memory. Any change you make to one list will also be made to the other. In fact, they are identical. If you want to copy a list so that you can edit the copy independently later, you need the *copy method*.

In [None]:
# Copying other variables:
a = 6
b = a       # 6
b = b + 1   # 7
print(a)    # The output is 6
print(b)    # The output is 7

# Wrong way of copying lists
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]
newlist = starwarslist # equates the lists
newlist.remove("Ewok") # removes the item "Ewok"

print(starwarslist)
print(newlist)
# in both cases the output is:
# [ "Wookie", "Java", "Rodian" ]

# Right way of copying lists with the copy method
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]
newlist = starwarslist.copy()
newlist.remove("Ewok") # removes the item "Ewok"

print(starwarslist)    # output: [ "Wookie", "Ewok", "Java", "Rodian" ]
print(newlist)         # output: [ "Wookie", "Java", "Rodian" ]
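
If you are unsure whether two variables refer to the same list, the *is* operator can tell you. A small sketch, with variable names chosen only for illustration:

```python
original = [ "Wookie", "Ewok" ]

same_list = original           # both names refer to the same list
independent = original.copy()  # a separate list with the same content

print(same_list is original)    # True
print(independent is original)  # False

# == only compares the content, which is equal in both cases
print(independent == original)  # True
```

In other words, *is* asks whether two names point to the very same list, while == asks whether two lists contain the same entries.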

### Further References
* https://www.w3schools.com/python/python_lists.asp
* https://www.geeksforgeeks.org/python-lists/

## For loops
With for loops you can iterate over sequences (e.g. lists or strings). That means you can access every element of a list or a string one after the other. See the following code examples for further explanation.

In [None]:
# We define a list
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]

# We iterate over each item and display it.
for element in starwarslist:
  print(element)

# That means:
# We first look at the entry 'Wookie' and display it
# Then we look at the entry 'Ewok' and display it
# Then we do the same with 'Java' and 'Rodian'.

# In this example we used the variable "element", but we could have named the variable any way we like it (for example "item", "race")


# We can also loop through a string. To do so, we define the string "examplestring"
examplestring = "Mandalorian"

# Now we iterate over each item of the string and display it
for character in examplestring:
  print(character)
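
For loops are also a convenient way to count. The built-in range function produces a sequence of numbers, so the counter does not have to be incremented by hand as in a while loop. A minimal sketch follows. The variable name counted is only used here to show which numbers range produces:

```python
# range(1, 6) produces the numbers 1 to 5 (the end point 6 is not included)
for i in range(1, 6):
    print('The counter is ' + str(i) + '.')

# Converting the range to a list makes the numbers visible
counted = list(range(1, 6))
print(counted)  # [1, 2, 3, 4, 5]
```

As with list ranges, the end point is not included.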

### Break and continue

You can use the same commands as in while loops to get more control over
iterations:
* **continue**: skips the current iteration
* **break**: terminates the entire loop

In [None]:
# We define a list
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]

# We iterate over the entries of the list
for entry in starwarslist:

  # We check, if the entry is equal to "Ewok"
  if entry == "Ewok":

    # We skip this entry and continue with the next one
    continue

  print(entry)
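
The example above uses continue. For comparison, here is a sketch of the same loop with break. The list displayed is a hypothetical helper that is only used to record which entries were shown:

```python
starwarslist = [ "Wookie", "Ewok", "Java", "Rodian" ]
displayed = []  # hypothetical helper list, not needed for the loop itself

for entry in starwarslist:

    # As soon as we reach "Ewok", the entire loop is terminated
    if entry == "Ewok":
        break

    print(entry)
    displayed.append(entry)

print(displayed)  # ['Wookie']
```

With continue, only "Ewok" would be left out. With break, "Ewok" and everything after it is left out.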

### Further References
- https://www.w3schools.com/python/python_for_loops.asp
- https://www.geeksforgeeks.org/loops-in-python/

### Exercises with solutions

**Exercise 1**:

Given is a list of 5 numbers (see below). Loop through the list with a for loop, multiply each element by 2, and then display it. But do not display it if it is 20 or greater. After the loop has finished, display a success message.

In [None]:
# The list of 5 numbers is already created:
listOfNumbers = [3, 4, 6, 10, 15]

# Space for your code:

**Solution**

In [None]:
# The list of 5 numbers is already created:
listOfNumbers = [3, 4, 6, 10, 15]

# Space for your code:
for element in listOfNumbers:

  if element * 2 < 20:

    print(2 * element)

print('The loop has finished successfully.')

# Alternative way
for number in listOfNumbers:

  newnumber = number * 2

  if newnumber < 20:

    print(newnumber)

print('The alternative loop has finished successfully.')

**Exercise 2:**

Given are two long strings 'longstring1' and 'longstring2' (see below). Write a script that prompts the user to type in a word. Then check if the word is in longstring1, longstring2, both or none of them and display a corresponding message. Use nested if statements.

In [None]:
# given strings
longstring1 = 'Python is great'
longstring2 = 'Google colab is great as well'

# Space for your code:

**Solution**

In [None]:
# given strings
longstring1 = 'Python is great'
longstring2 = 'Google colab is great as well'

# Space for your code:
inputword = input('Type in your word\n')

if inputword in longstring1: # in Longstring 1

  if inputword in longstring2: # in Longstring 2

    print('your word is in both strings')

  else: # Not in longstring 2

    print('your word is only in longstring1')

else: # not in Longstring1

  if inputword in longstring2: # in Longstring 2

    print('your word is only in longstring2')

  else: # not in longstring2

    print('your word is neither in longstring1 nor longstring2')

In [None]:
# given strings
longstring1 = 'Python is great'
longstring2 = 'Google colab is great as well'

# Space for your code:
inputword = input('Type in your word\n')

if inputword in longstring1 and inputword in longstring2:

  print('your word is in both strings')

elif inputword in longstring1:

  print('your word is in longstring1')

elif inputword in longstring2:

  print('your word is in longstring2')

else:

  print('your word is neither in longstring1 nor longstring2')

**Exercise 3: Eliminating white spaces**

Loop over the characters of the variable givenString (see below) and add each character to the variable newString, if it is not a white space (' '). Display the variable newString, after the loop is finished. This way we eliminate all white spaces.

In [None]:
# Given string
givenString = 'Python is great'

# White Space = ' '
# The output should be: Pythonisgreat
# Space for your code:

**Solution**

In [None]:
# Given string
givenString = 'Python is great'

# White Space = ' '
# The output should be: Pythonisgreat
# Space for your code:

newString = ''

for character in givenString:

  if character != ' ': # we check if the character does not equal a white space

    newString = newString + character

print(newString)

**Exercise (difficult, optional):**

Loop over the items of a given list (see below) using a **while loop** and print each item.

In [None]:
# given list
starwarslist = [ 'Wookie', 'Ewok', 'Java', 'Rodian' ]

# Space for your code:

**Solution**

In [None]:
# given list
starwarslist = [ 'Wookie', 'Ewok', 'Java', 'Rodian' ]

# Space for your code:

i = 0

while i < len(starwarslist):

  print(starwarslist[i])
  i = i + 1

## Dictionaries
A dictionary is a bit like a list, but instead of index numbers you have keys.
Like each index number in a list, each key in a dictionary has a value. This
means that a dictionary consists of key-value pairs. You can use dictionaries,
for example, to represent one entity in a database (e.g. a real person / thing):

Severin Göbel |   -
----|---
Profession | physician
Date of Birth | 1530
Place of Birth | Königsberg
Date of Death | 1612
Place of Death | Königsberg

*An entity, as you would find it in a typical database. On the left
side you can see the keys, on the right side the corresponding values.*

Now let us see how we can create a dictionary that corresponds to the entity
'Severin Göbel' from the table above.

In [None]:
# Dictionaries are enclosed in curly brackets. They are created according to
# the schema key : value. Note that we have to separate the key-value pairs
# by commas.

# We create the dictionary 'severinGoebelDict' for the person Severin Göbel
severinGoebelDict = {
  'profession': 'physician',
  'dateOfBirth': 1530,
  'placeOfBirth': 'Königsberg',
  'dateOfDeath': 1612,
  'placeOfDeath': 'Königsberg'
}

# Now let us display the dictionary
print(severinGoebelDict)

### Access and add items
You can access / add items by referring to the key (just like we used the index
number of a list, when we wanted to know a certain value):

In [None]:
# We create the dictionary severinGoebelDict again
severinGoebelDict = {
  'profession': 'physician',
  'dateOfBirth': 1530,
  'placeOfBirth': 'Königsberg',
  'dateOfDeath': 1612,
  'placeOfDeath': 'Königsberg'
}

# Now let us display his profession:
print(severinGoebelDict['profession'])

# Using the same principle we can add a key-value pair to the dictionary.
# Let us say we want to add an entry about his wife
severinGoebelDict['wife'] = 'Ursula'

# Now let us display the dictionary again
print(severinGoebelDict)

### Removing items
You can remove a key-value pair by pointing to the key with the pop method.

In [None]:
# We create the dictionary severinGoebelDict again
severinGoebelDict = {
  'profession': 'physician',
  'dateOfBirth': 1530,
  'placeOfBirth': 'Königsberg',
  'dateOfDeath': 1612,
  'placeOfDeath': 'Königsberg'
}
# Now let us remove the key-value pair 'profession': 'physician':
severinGoebelDict.pop('profession')

print(severinGoebelDict)
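
As a small addition to the example above: the pop method also returns the value it removes, and a second argument can serve as a fallback if the key does not exist. The following sketch uses a shortened, purely illustrative dictionary (the name personDict is not part of the course scripts):

```python
# Illustrative dictionary (shortened version of the example above)
personDict = {
  'profession': 'physician',
  'dateOfBirth': 1530
}

# pop removes the key-value pair and hands back the removed value
removedValue = personDict.pop('profession')
print(removedValue)

# The key 'wife' does not exist. Without a second argument this would
# cause an error; with it, pop simply returns the fallback (here: None)
missingValue = personDict.pop('wife', None)
print(missingValue)

# Only the remaining key-value pair is left
print(personDict)
```

Storing the removed value is handy if you want to move an entry from one dictionary to another.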

### Looping through a dictionary

By pointing to the keys you can loop through a dictionary just like you can loop
through a list:

In [None]:
# We create the dictionary severinGoebelDict again
severinGoebelDict = {
  'profession': 'physician',
  'dateOfBirth': 1530,
  'placeOfBirth': 'Königsberg',
  'dateOfDeath': 1612,
  'placeOfDeath': 'Königsberg'
}

# Now let us loop through the dictionary and display the keys
print('These are the keys:')
for x in severinGoebelDict:
  # we will display the key
  print(x)

# Now let us loop through the dictionary and display the values
print('These are the values:')
for x in severinGoebelDict:
  # we will display the values by using the keys
  print(severinGoebelDict[x])

There is also another (perhaps easier) way of looping through a dictionary. This is done with the *items method*, which returns all key-value pairs, where each pair is stored in a tuple (something similar to a list).




In [None]:
thisdict = {
  'profession': 'physician',
  'dateOfBirth': [1530, 1532],
  'placeOfBirth': 'Königsberg',
  'dateOfDeath': 1612,
  'placeOfDeath': 'Königsberg'
}

# With the print function we can see how the method works
print(thisdict.items())

# Here is how to use the method
for key, value in thisdict.items():
  #print(key, value)
  print('The key is: ' + key + ' and the value is: ' + str(value))


### Further References
- https://www.geeksforgeeks.org/python-dictionary/
- https://www.geeksforgeeks.org/loops-in-python/

### Exercises with solutions



**Exercise 1:**

The following dictionary contains information about the novel 'At the Mountains of Madness', which was written by Howard Phillips Lovecraft.

**Tasks**
* Add a new key 'topic' with the value 'horror'
* Remove the following key-value pair: 'year': 1936
* Write a script that loops through the key-value pairs and displays them.


In [None]:
bookDict = {
  'title': 'At the Mountains of Madness',
  'author': 'Howard Phillips Lovecraft',
  'year': 1936
}

# Space for your code:

**Solution**

In [None]:
bookDict = {
  'title': 'At the Mountains of Madness',
  'author': 'Howard Phillips Lovecraft',
  'year': 1936
}

# Space for your code:

# We add the key value pair topic-horror
bookDict['topic'] = 'horror'

# We remove the key value pair year 1936
bookDict.pop('year')

# We loop through the dictionary
for key, value in bookDict.items():
  print(key + ': ' + str(value))

# Alternative way of looping
for key in bookDict:
  print(key + ': ' + str(bookDict[key]))

**Exercise 2 (difficult, optional):**

Sometimes dictionaries are stored inside another dictionary or inside a list. The latter is the case, for example, when you want to display not one, but several entities of the same type within a database (for example, all persons):

Given is a list containing two dictionaries (see below).
* Display the professions for both list entries.
* The person who created the second entry made a mistake. Joachim Camerarius died in Leipzig, not in Königsberg. Correct the respective value and display the entire list.
* Now remove the first dictionary from the list.
* Now create a new dictionary for the humanist Philipp Melanchthon and add it to the list (he was born in 1497 in Bretten and died in 1560 in Wittenberg). His profession should be 'humanist'.
* Display the list again.


**Solution**

In [None]:
# Here is a list containing two dictionaries
listOfTwoDicts = [
                  {'name': 'Severin Göbel',
                   'profession': 'physician',
                   'dateOfBirth': 1530,
                   'placeOfBirth': 'Königsberg',
                   'dateOfDeath': 1612,
                   'placeOfDeath': 'Königsberg'
                   },
                  {'name': 'Joachim Camerarius',
                   'profession': 'humanist',
                   'dateOfBirth': 1500,
                   'placeOfBirth': 'Bamberg',
                   'dateOfDeath': 1574,
                   'placeOfDeath': 'Königsberg'
                   }
]

# Space for your code:

# First task:
# We loop through the list of dictionaries
for personDict in listOfTwoDicts:

  # we print the value of the key 'profession' for each dictionary
  print(personDict['profession'])

# Second task:
# We want to correct the second dictionary in the list of dictionaries.
# That means we want to access the index number 1 ( = listOfTwoDicts[1] )
# Let us name it camerariusDict

camerariusDict = listOfTwoDicts[1]

# Now let us change the placeOfDeath:
camerariusDict['placeOfDeath'] = 'Leipzig'

# An alternative way would be
# listOfTwoDicts[1]['placeOfDeath'] = 'Leipzig'

print(listOfTwoDicts)

# Third task:
listOfTwoDicts.pop(0)

# Fourth task:
melanchthonDict = {'name': 'Philipp Melanchthon',
                   'profession': 'humanist',
                   'dateOfBirth': 1497,
                   'placeOfBirth': 'Bretten',
                   'dateOfDeath': 1560,
                   'placeOfDeath': 'Wittenberg'
                  }

listOfTwoDicts.append(melanchthonDict)

# Fifth task
print(listOfTwoDicts)

## Functions

A function is a block of code that is executed only when *called*, which means
you can define one or more commands as a function that can be executed on
demand. For this we use the keyword *def*:

In [None]:
# We define the function 'printfunction' that displays the word 'Hello'
def printfunction():
  print('Hello')

# Now we call the function. That means we execute it. If we did not call the function, nothing would happen.
printfunction()

You should always give functions descriptive names that accurately describe
their purpose (e.g., "listComparison" instead of "myfunction"). This makes the
code easier to read and helps other people understand what the program is doing.
I myself add a comment to all functions I write, explaining the purpose
and operation of the function. That way, even a year later, I know why I wrote
a function the way I did.

### Arguments / parameters
When you call a function, you can pass *arguments* / *parameters* to the function.
This means that each time you call a function, you can pass a different input to
the function:

In [None]:
# We define the function 'addWhiteSpaces'. It adds a white space after each character of a word that is passed to the function as an argument.
def addWhiteSpaces(word):
  # We create a variable 'newWord' containing an empty string, where we will store our result
  newWord = ''
  # Now we iterate over each character of the word:
  for character in word:
    # We add the character to the variable newWord and add a white space
    newWord = newWord + character + ' '
  # Now the loop is finished, but we have one white space too many at the end of 'newWord'. So we will display newWord without the last character
  print(newWord[:-1])

# Now we can call the functions and pass different words to it as arguments
addWhiteSpaces('Hello')
addWhiteSpaces('You')
addWhiteSpaces('Hi')

The number of arguments passed to a function must match the number of
arguments defined in the function:

In [None]:
# We define a function 'combineWords'. It has two arguments: 'word1' and 'word2'
# The goal is to combine two words with a blank space between them.
def combineWords(word1, word2):
  print(word1 + ' ' + word2)

# Now we call the function and pass the words 'Hello' and 'You' as arguments
combineWords('Hello', 'You')
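
To illustrate the rule about matching argument counts, here is a minimal sketch of what happens when one argument is missing. It uses *try* and *except* (explained in the section on error handling) so that the error message can be displayed without stopping the script; the variable name errorMessage is chosen only for this illustration.

```python
# The same function as above: it expects exactly two arguments
def combineWords(word1, word2):
  print(word1 + ' ' + word2)

errorMessage = ''

try:
  # We pass only one argument, although the function expects two
  combineWords('Hello')
except TypeError as error:
  # Python raises a TypeError and tells us which argument is missing
  errorMessage = str(error)

print(errorMessage)
```

The message names the missing argument (here word2), which usually makes this kind of mistake easy to find.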

### Return values
Sometimes you just want a function to return a value so that you can move
on. This can be the case, for example, if you want to generate an intermediate
result to use in further code or in another function. This is done with the *return keyword*:

In [None]:
# We create the function "combineWords". It has two arguments: word1 and word2
def combineWords(word1, word2):
  # it returns a string, that is the concatenation of word1, a white space and word2
  return word1 + ' ' + word2

# We call the function combineWords and pass the words "Hello" and "you". The result is stored in the variable a
a = combineWords('Hello', 'you')
# We print the variable a
print(a)
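
The idea of an intermediate result can be sketched as follows: the value returned by one function is passed on to a second function. The function shout and the variable names are hypothetical and only serve this illustration.

```python
# The function from above: it returns the combined words
def combineWords(word1, word2):
  return word1 + ' ' + word2

# A second (hypothetical) function that works with the intermediate result
def shout(text):
  # upper() converts all letters to uppercase
  return text.upper() + '!'

# The return value of combineWords is stored in a variable ...
greeting = combineWords('Hello', 'you')

# ... and then passed to the next function
loudGreeting = shout(greeting)
print(loudGreeting)  # HELLO YOU!
```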

### Local and global variables
A variable created outside a function is a *global variable*. A variable created
inside a function is a *local variable*. *Global variables* can be accessed from anywhere, *local variables* can only be accessed within their respective functions.

This is why the following code will not work:

In [None]:
# We define the function helloFunction. It will display 'Hello'
def helloFunction():
  x = 'Hello' # We define x
  print(x)    # We display x

helloFunction() # We call the function helloFunction
print(x)        # We try to display x, but x is not a global variable.
                # An error will occur

If you want to define a global variable inside a function, you can use the *global keyword*:

In [None]:
# We define the function globalVariableInsideAFunction. It creates the global variable x with the value 'Hi'
def globalVariableInsideAFunction():
  global x # We define x as global
  x = 'Hi' # We define x

# We call the function
globalVariableInsideAFunction()

# We display x
print(x)
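
In larger scripts, global variables that are created inside functions can be hard to keep track of. A common alternative, sketched here under the assumption that the function only needs to hand over a single value, is to return the value instead (the function name createGreeting is hypothetical):

```python
# Instead of declaring x as global, the function returns the value
def createGreeting():
  greeting = 'Hi' # greeting is a local variable
  return greeting

# We store the returned value in the global variable x
x = createGreeting()

# We display x
print(x)
```

This way it stays visible at the place of the function call where the variable x comes from.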

### Further References
- https://www.w3schools.com/python/python_functions.asp
- https://www.w3schools.com/python/python_variables_global.asp
- https://www.geeksforgeeks.org/python-functions/
- https://www.geeksforgeeks.org/global-local-variables-python/

### Exercise

**Exercise:**

Write a script using a function. It prompts the user to type in a word and checks whether the word is a value in the dictionary camerariusDict (see below). A success or failure message should be displayed.

*Optional task: Show also where the word was found (i.e. the corresponding key(s))*


In [None]:
# Here is the dictionary
camerariusDict =  {'name': 'Severin Göbel',
                   'profession': 'physician',
                   'dateOfBirth': 1530,
                   'placeOfBirth': 'Königsberg',
                   'dateOfDeath': 1612,
                   'placeOfDeath': 'Königsberg'
                   }

# Space for your code:

### Solution

In [None]:
# Here is the dictionary
camerariusDict =  {'name': 'Severin Göbel',
                   'profession': 'physician',
                   'dateOfBirth': 1530,
                   'placeOfBirth': 'Königsberg',
                   'dateOfDeath': 1612,
                   'placeOfDeath': 'Königsberg'
                   }

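# Space for your code:

# The function returns a list of all keys whose value matches the word
def findKeys(word, dictionary):
  foundKeys = []
  for key in dictionary:
    # str() is needed because some values (e.g. 1530) are numbers
    if word == str(dictionary[key]):
      foundKeys.append(key)
  return foundKeys

inputWord = input('Please type in a word\n')
foundKeys = findKeys(inputWord, camerariusDict)

if len(foundKeys) > 0:
  print('Success: the word was found as the value for the following key(s): ' + str(foundKeys))
else:
  print('Failure: the word is not a value in the dictionary.')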

## Interacting with files
Working with files can be very useful. I recommend using *with open*. The *open function* takes a few arguments:
- First of all the filename in quotes (e.g. 'starwars.txt')
- Then we define the type of access we want to have. This is a single letter that tells the interpreter what we want to do: read a file ('r'), write a file ('w') or append something to the end of the file ('a'). With 'w' and 'a' the file is created if it does not already exist; note that 'w' overwrites the content of an existing file. Adding + to the letter (e.g. 'w+') opens the file for both reading and writing.
- Sometimes you need to specify the encoding (for example: encoding='utf-8'). This will be explained in more detail during the course.

Let us look at an example. First of all, let's open a file for
writing:

In [None]:
# We want writing access to the file 'starwars.txt'
# (it is created if it does not exist yet, otherwise its content is overwritten)
# It is encoded with 'utf-8'
# We store the accessed file in the variable 'outputfile'

with open('starwars.txt', 'w', encoding='utf-8') as outputfile:

  # Now we write 'A long time ago...' into the file
  outputfile.write('A long time ago...\n')
  outputfile.write('\n')
  outputfile.write('The end.')

Now we want to open an existing file and read its content. There are two ways
to do this: the *read method* and the *readlines method*. With the *read method*
we can store the entire content of the file in a single string variable, with the *readlines method* we can store all the lines of the file in a single list variable (a list, where each entry corresponds to a line of the file).

### The read method

In [None]:
# We want reading access to the existing file 'starwars.txt'
# It is encoded with 'utf-8'
# We store the accessed file in the variable 'inputfile'
with open('starwars.txt', 'r', encoding='utf-8') as inputfile:
  # Now we store the filecontent as a string in the variable content
  content = inputfile.read()
  print(content)

*Note: This script will not work if you haven't run the first script in the Interacting with Files section (you can't read a file that hasn't been created).*

### The readlines method

In [None]:
# We want reading access to the existing file 'starwars.txt'
# It is encoded with 'utf-8'

# We store the accessed file in the variable 'inputfile'
with open('starwars.txt', 'r', encoding='utf-8') as inputfile:
  # Now we store the lines of the file as a list in the variable content
  content = inputfile.readlines()
  print(content)

  for element in content:
    # rstrip('\n') removes the line break at the end of a line (if there is one)
    print(element.rstrip('\n'))

*Note: This script will not work if you haven't run the first script in the Interacting with Files section (you can't read a file that hasn't been created).*

Instead of using the *readlines method* you can just loop through the elements of the file:

In [None]:
# We want reading access to the existing file 'starwars.txt'
# It is encoded with 'utf-8'

# We store the accessed file in the variable 'inputfile'
with open('starwars.txt', 'r', encoding='utf-8') as inputfile:
  for line in inputfile:
    print('Content of the line: ' + line.rstrip('\n'))

*Note: This script will not work if you haven't run the first script in the Interacting with Files section (you can't read a file that hasn't been created).*
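
The difference between writing ('w') and appending ('a') can be sketched as follows. The file name 'modes_demo.txt' is a hypothetical name used only for this illustration, so that the file 'starwars.txt' stays untouched.

```python
# 'w' creates the file (or overwrites it, if it already exists)
with open('modes_demo.txt', 'w', encoding='utf-8') as outputfile:
  outputfile.write('first line\n')

# 'a' keeps the existing content and adds the new text at the end
with open('modes_demo.txt', 'a', encoding='utf-8') as outputfile:
  outputfile.write('second line\n')

# 'r' reads the file; it now contains both lines
with open('modes_demo.txt', 'r', encoding='utf-8') as inputfile:
  demoContent = inputfile.read()

print(demoContent)
```

If the second block used 'w' instead of 'a', the file would only contain 'second line' afterwards.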

### Further References
- https://www.w3schools.com/python/python_file_handling.asp
- https://www.geeksforgeeks.org/file-handling-python/
- https://automatetheboringstuff.com/2e/chapter9/

### (Exercises)

We will practise working with files in the practical part (see there).

## String functions

Strings are similar to lists. To access certain parts of a string or to check if a string is inside another string, see the chapter on *Lists*.


But there are also a number of methods specific to strings. Here are some
important ones:

Method     | Explanation
-----------|----------------------------------------------------
endswith() | Checks if a string ends with a specific value (e.g. a character).
find() | Finds a string inside another string and returns the index number of the first occurrence (or -1 if the string was not found).
isdigit() | Checks if all characters of a string are digits.
islower() | Checks if all letters of a string are lowercase.
isupper() | Checks if all letters of a string are uppercase.
rfind() | Like the find method, but rfind returns the index number of the last occurrence.
split() | Splits a string at a separator (like a comma or semicolon for example). The result is stored as a list.
splitlines() | Splits a string into lines. The result is stored as a list.
startswith() | Checks if a string starts with a specific value (e.g. a character).
strip() | Removes white spaces at the beginning and end of a string. Instead of white spaces other characters can be removed as well.


For further methods see Further References.
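
Here is a short sketch of how some of the methods from the table behave. The string stored in record is made-up example data; the variable names are chosen only for this illustration.

```python
# Made-up example data with superfluous white spaces at both ends
record = '  Severin Göbel;physician;1530  '

# strip() removes the white spaces at the beginning and the end
cleaned = record.strip()

# split(';') cuts the string at each semicolon and returns a list
fields = cleaned.split(';')
print(fields)  # ['Severin Göbel', 'physician', '1530']

# isdigit() checks if the last field consists of digits only
print(fields[2].isdigit())  # True

# startswith() checks the beginning of the string
print(cleaned.startswith('Severin'))  # True

# find() returns the index of the first 'i', rfind() the index of the last 'i'
firstIndex = cleaned.find('i')
lastIndex = cleaned.rfind('i')
print(firstIndex)  # 5
print(lastIndex)   # 20
```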


### Further References
- https://www.w3schools.com/python/python_strings_methods.asp
- https://automatetheboringstuff.com/2e/chapter6/

## Error handling

Sometimes you do not want your entire program to quit when a small error
occurs. We can solve this problem with the statements *try* and *except*.



In [None]:
# We create a list with three words
wordlist = [ 'Maria', 'John', 'Joseph' ]

# We want to display the 5th character of each word, but the word 'John'
# has only 4 characters and would throw an error that would
# stop the entire program. So we use 'try' and 'except'

# we iterate over the list
for word in wordlist:

  try: # we try to display the 5th character
    print(word[4]) # remember: in computer science we start counting with 0

  except IndexError: # if the index does not exist, we want to display a message
    print('This word has fewer than five characters.')
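
The same pattern is useful when data is not as clean as expected. The following sketch, which uses made-up example values, converts strings to numbers and skips the entries that cannot be converted. The int function raises a ValueError in that case, so this is the error named after *except*.

```python
# Made-up example data: not every entry is a number
yearStrings = ['1530', 'unknown', '1612']

# Here we collect all entries that could be converted
years = []

for yearString in yearStrings:

  try: # we try to convert the string to an integer
    years.append(int(yearString))

  except ValueError: # int() raises a ValueError for 'unknown'
    print('Not a number: ' + yearString)

print(years)  # [1530, 1612]
```

Naming the expected error keeps other, unexpected errors visible instead of hiding them.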

### Further References
- https://www.w3schools.com/python/python_try_except.asp
- https://www.geeksforgeeks.org/python-exception-handling/

### (Exercises)

We will practise this in the practical part (see there).

## Importing modules

A module is a set of ready-to-use functions that you can import and start using
right away. This is done via the *import statement*.

Let us import the *time* module. With it we can make the program wait for a
certain number of seconds.

In [None]:
# We import the time module
import time

# We make the program wait for 5 seconds
time.sleep(5)

# Now we display 'Five seconds have passed.'
print('Five seconds have passed.')
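
There are two other common forms of the import statement. The sketch below uses the *math* module, which is part of Python's standard library; the short name m is an arbitrary choice for this illustration.

```python
# We import only the function sqrt (square root) from the math module
from math import sqrt

# We import the whole math module under the shorter name m
import math as m

# sqrt can now be used without the module name in front of it
root = sqrt(16)
print(root)  # 4.0

# Functions of the renamed module are called via the short name
rounded = m.floor(3.7)
print(rounded)  # 3
```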

In general, most modules should already be installed in Google Colab. If a
module is not installed, you can install it with the following command, where
MODULENAME is the name of the package you want to install:

!pip install MODULENAME

### Further References
- https://www.w3schools.com/python/python_modules.asp
- https://www.geeksforgeeks.org/python-modules/

# Useful modules for the Digital Humanities

There are many useful modules for almost every case you can imagine. The
following list will show you some of the most useful ones for DH projects. Do
not hesitate to try them out. After all, the goal of this workshop is to open
up perspectives so that you can choose exactly what you need for your specific
project.

## RegEx
With **regular expressions** you can search for strings that match a specific
pattern, for example phone numbers, dates or certain types of names. They
can be very helpful and are essential if you work with texts.

Tutorials:
- https://automatetheboringstuff.com/2e/chapter7/
- https://www.w3schools.com/python/python_regex.asp
- https://www.geeksforgeeks.org/regular-expression-python-examples/
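
As a first impression, here is a minimal sketch with Python's built-in *re* module. It searches a made-up example sentence for four-digit numbers, which in this sentence are the years.

```python
# We import the module for regular expressions
import re

# A made-up example sentence
text = 'Severin Göbel was born in 1530 and died in 1612 in Königsberg.'

# \d{4} means: exactly four digits
# \b marks a word boundary, so that longer numbers are not matched
years = re.findall(r'\b\d{4}\b', text)

print(years)  # ['1530', '1612']
```

Note that findall returns the matches as strings. If you want to calculate with them, you have to convert them with int first.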

## Interacting with CSV and JSON Files
**CSV** and **JSON** files are common formats when it comes to storing or exchanging
information. For each file type there is a dedicated module (*csv* and *json*):

Tutorials:
- https://automatetheboringstuff.com/2e/chapter16/
- https://www.w3schools.com/python/python_json.asp
- https://www.geeksforgeeks.org/python-json/
- https://www.geeksforgeeks.org/working-csv-files-python/
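
The following sketch gives a first impression of both modules. To keep it independent of uploaded files, the data is made up and stored in strings; *io.StringIO* lets the csv module treat a string like an opened file.

```python
import csv
import io
import json

# JSON: a dictionary is converted into a JSON string and back again
personDict = {'name': 'Severin Göbel', 'profession': 'physician', 'dateOfBirth': 1530}
jsonText = json.dumps(personDict, ensure_ascii=False)
restoredDict = json.loads(jsonText)
print(restoredDict['profession'])  # physician

# CSV: the first line contains the column names,
# each following line contains one entry (made-up example data)
csvText = 'name,profession\nSeverin Göbel,physician\nJoachim Camerarius,humanist\n'

# DictReader turns each line into a dictionary with the column names as keys
rows = list(csv.DictReader(io.StringIO(csvText)))

for row in rows:
  print(row['name'] + ': ' + row['profession'])
```

With real files you would pass the file opened with *with open* to the functions instead of a string.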

## Machine Learning
**TensorFlow** is a good tool for machine learning.

Websites and Tutorials:
- https://www.geeksforgeeks.org/introduction-to-tensorflow/
- https://pypi.org/project/tensorflow/
- https://www.tensorflow.org/learn/

## Working with Websites and XML-Files
With the **requests module** you can access websites, check if they are active and
download their source code.

- Short tutorial: https://www.w3schools.com/python/module_requests.asp
- Longer tutorial: https://www.geeksforgeeks.org/python-requests-tutorial/

When you are working with HTML or XML files, **Beautiful Soup** is the module you want to have. It is very powerful and useful. It is also very good for **web scraping**.

Tutorials and Documentation:
- https://beautiful-soup-4.readthedocs.io/en/latest/
- https://www.geeksforgeeks.org/implementing-web-scraping-python-beautiful-soup/

## Natural language processing (NLP)
**NLTK** (Natural Language Toolkit) is a tool for analyzing and working
with language using computational linguistics. Its data is based on "50 corpora
and lexical resources".

- Website and introduction: https://www.nltk.org/

**CLTK** (Classical Language Toolkit): It is the equivalent of the NLTK, but
for the languages of pre-modern Eurasia.
- Website and introduction: http://cltk.org/

## Database management
**mysql.connector** is a module that allows you to access MySQL databases.
- Tutorial: https://realpython.com/python-mysql/

**Pywikibot** is a module that allows you to access MediaWiki databases through
a bot. It can be a handy tool, but the installation is tricky.
- Website and Tutorial: https://www.mediawiki.org/wiki/Manual:

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## Data analysis and visualization
**NumPy** is the module to use when it comes to data analysis. It is incredibly
fast and offers various functions to perform all kinds of calculations.
- Website: https://numpy.org/
- Tutorial: https://www.geeksforgeeks.org/python-numpy/

**Pandas** is built on top of NumPy, making the import and analysis of data
easier. You can combine it with **matplotlib** to visualize your data.
- Data visualization tutorial:
https://www.geeksforgeeks.org/data-visualization-with-python/
- Matplotlib tutorial:
https://www.geeksforgeeks.org/matplotlib-tutorial/
- Pandas tutorial:
https://www.geeksforgeeks.org/pandas-tutorial/

## Network Visualization
**Pathpy** is a powerful and flexible tool for network analysis and visualization.
- Website and tutorial: https://www.pathpy.net/

# Outlook


## Possible Next Steps
**Local installation**

You may want to install Python locally on your computer. There are many good
editors you can use, among them Visual Studio Code and PyCharm.

**Visual Studio Code** allows you to work with almost any programming language.
**PyCharm** is specifically designed for Python and has a free community version
as well as a paid professional version. Both are great editors.

You can also use other editors. But for security reasons, make sure they work
with **virtual environments** so that installing modules does not affect your system.

Installation:
- Windows: Go to https://www.python.org/
- Options:
  - Python should be added to PATH
  - Install pip




## Advanced Studies
Chapter 3 introduced us to many new modules that do not require professional
knowledge of Python. Do not hesitate to try them out.

However, this course did not cover some topics that are important if you
want to work with Python at a professional level. These include the following
not-so-simple topics:
- *Recursion*
- *Object Orientation*

If you want to get more than a basic understanding of Python, you should take
a closer look at these topics. For this, I recommend the websites and tutorials
mentioned in chapter 1.2.

I also recommend reading the following article on **clean code**:
https://www.geeksforgeeks.org/best-practices-to-write-clean-python-code/