
<h1>Welcome to Digital Methods in the Humanities</h1>

## This is a Jupyter Notebook

This document is a [Jupyter Notebook](https://jupyter-notebook.readthedocs.io/en/stable/notebook.html). It's a way of combining live computer code with rich content like instructions, links, images, sound, etc.

All the programming we do in Digital Methods will be carried out in Notebooks like this one, running on a remote server, and accessed through your favourite web browser.

### Cells, code and Markdown

A Notebook document is divided into **cells**. Each cell can contain either **code** or non-code content like this message. The non-code is written in **Markdown**, a mixture of plain text, special formatting codes, and HTML.

Cells containing code should have an annotation in the margin like `[ ]:`. The cell directly below this one is an example. Try typing or pasting the following code into it:

```python
print('Hello, world!')
```

Then run your code by choosing <samp>Run</samp> <i class="fa fa-arrow-right"></i> <samp>Run Selected Cell</samp> from the menu, or by clicking the <i class="fa fa-play"></i> <samp>Run</samp> button above.

You should see some output appear beneath your code:
```
     Hello, world!
```  
You should also see a little number appear in the margin, something like `[1]:`. This tells you that the cell has been run, and that it was the first cell that you ran in this document. The order in which you run code makes a difference, so it's important to keep track!

### The Python 3 kernel

The code we'll be writing is in the Python 3 language. When you run a cell containing code, it's executed by a Python 3 interpreter that lives on the remote server. There's a special instance of the interpreter running just for your Notebook, called the **kernel**.

The kernel remembers what you've done so far, even in another cell. For example, in the cell below, type:

```python
foo = 3
```

You've created a **variable** and given it a value. Now, in the cell below, check that the Python kernel remembers your variable:

```python
print(foo)
```

Did it work?  Try changing your variable:

```python
foo = foo + 1
```

Check the new value:
```python
print(foo)
```

<div class="alert alert-warning">
If you need to stop a program that's gone wrong, you can use the <samp>Kernel</samp> <i class="fa fa-arrow-right"></i> <samp>Interrupt</samp> menu item. If you want Python to forget everything you've done so far, choose <samp>Kernel</samp> <i class="fa fa-arrow-right"></i> <samp>Restart</samp> or one of its variants. Don't worry, your code won't be deleted.
</div>

### Switching between code and Markdown

New cells default to code content, but you can write Markdown, too. Look for a dropdown that says <samp>Code</samp> and select <samp>Markdown</samp> instead. The `[ ]:` annotation should disappear.

You can type plain text into a Markdown cell to make notes, explain things, or answer questions. Markdown has a bunch of [special codes](http://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html) for formatting your text, if you like, and you can use raw HTML tags too. Try changing the cell below to Markdown and adding your own text.

# Automate the Boring Stuff With Python
## Chapter 1


The following exercises are stolen from <em>Automate the Boring Stuff</em>, Chapter 1.

Wherever you see the prompt <code>&gt;&gt;&gt;</code> in <em>ABS</em>, you should enter the corresponding text into its own cell in your notebook. For example,
<pre>
 &gt;&gt;&gt; 2 + 2 
 4
</pre>
becomes . . .

<pre>
In [1]:
2 + 2

4
</pre>

Try it below:

In Python, `2 + 2` is called an <em>expression</em>, which is the most basic kind of programming instruction in the language. Expressions consist of <em>values</em> (such as `2`) and <em>operators</em> (such as `+`), and they can always <em>evaluate</em> (that is, reduce) down to a single value. That means you can use expressions anywhere in Python code that you could also use a value.

In the previous example, `2 + 2` is evaluated down to a single value, `4`. A single value with no operators is also considered an expression, though it evaluates only to itself, as shown here:

```python
2
```

<div class="alert alert-secondary">
<h5>Errors are Okay!</h5>

<p>Programs will crash if they contain code the computer can’t understand, which will cause Python to show an error message. An error message won’t break your computer, though, so don’t be afraid to make mistakes. A <em>crash</em> just means the program stopped running unexpectedly.</p>

<p>If you want to know more about an error message, you can search for the exact message text online to find out more about that specific error. You can also check out the resources at http://nostarch.com/automatestuff/ to see a list of common Python error messages and their meanings.</p>
</div>

There are plenty of other operators you can use in Python expressions, too. For example, **Table 1-1** lists all the math operators in Python.

Table 1-1. Math Operators from Highest to Lowest Precedence

|Operator | Operation | Example | Evaluates to...|
|---------|-----------|---------|----------------|
|`**`  | Exponent  | `2 ** 3`| `8`            |
|`%`   | Modulus/remainder | `22 % 8` | `6`        | 
| `//` | Integer division/floored quotient | `22 // 8` | `2` |
| `/`  | Division  | `22 / 8` | `2.75` |
| `*`  | Multiplication | `3 * 5` | `15` |
| `-`  | Subtraction | `5 - 2` | `3` |
| `+`  | Addition | `2 + 2` |`4` |


The *order of operations* (also called *precedence*) of Python math operators is similar to that of mathematics. The `**` operator is evaluated first; the `*`, `/`, `//`, and `%` operators are evaluated next, from left to right; and the `+` and `-` operators are evaluated last (also from left to right). You can use parentheses to override the usual precedence if you need to. Whitespace in between the operators and values doesn't matter for Python (except for the indentation at the beginning of the line), but a single space is convention. Enter the following expressions into the cells below:
```python
2 + 3 * 6
```

```python
(2 + 3) * 6
```

```python
48565878 * 578453
```

```python
2 ** 8
```

```python
23 / 7
```

```python
23 // 7
```

```python
23 % 7
```

```python
 2     +            2
```

```python
(5 - 1) * ((7 + 1) / (3 - 1))
```

In each case, you as the programmer must enter the expression, but Python does the hard part of evaluating it down to a single value. Python will keep evaluating parts of the expression until it becomes a single value, as shown in **Figure 1-1**.

<img src="https://automatetheboringstuff.com/images/000056.png" alt="flow diagram">
Figure 1-1. Evaluating an expression reduces it to a single value.
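
The reduction in Figure 1-1 can also be checked in code. The sketch below takes the last expression from the cells above and writes out each intermediate step as its own expression; the `step` variable names are only labels made up for this illustration. Every step should evaluate to the same value.

```python
# Each line rewrites the previous one with one more part evaluated.
step0 = (5 - 1) * ((7 + 1) / (3 - 1))
step1 = 4 * ((7 + 1) / (3 - 1))
step2 = 4 * (8 / (3 - 1))
step3 = 4 * (8 / 2)
step4 = 4 * 4.0

# All five values are identical, because evaluation never changes the result.
print(step0, step1, step2, step3, step4)
```

Note that the final value is a float, because the `/` operator always produces a floating-point number.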


These rules for putting operators and values together to form expressions are a fundamental part of Python as a programming language, just like the grammar rules that help us communicate. Here’s an example:

> This is a grammatically correct English sentence. 

> This grammatically is sentence not English correct a.

The second line is difficult to parse because it doesn’t follow the rules of English. Similarly, if you type in a bad Python instruction, Python won’t be able to understand it and will display a `SyntaxError` error message, as shown here:

```python
5 +
```

```python
42 + 5 + * 2
```

You can always test to see whether an instruction works by typing it into the notebook. Don’t worry about breaking the computer: The worst thing that could happen is that Python responds with an error message. Professional software developers get error messages while writing code all the time.

### The Integer, Floating-Point, and String Data Types

Remember that expressions are just values combined with operators, and they always evaluate down to a single value. A *data type* is a category for values, and every value belongs to exactly one data type. The most common data types in Python are listed in **Table 1-2**. The values `-2` and `30`, for example, are said to be *integer* values. The integer (or *int*) data type indicates values that are whole numbers. Numbers with a decimal point, such as `3.14`, are called *floating-point numbers* (or *floats*). Note that even though the value `42` is an integer, the value `42.0` would be a floating-point number.

Table 1-2. Common Data Types

<table class="table">
    <thead>
        <tr><th>Data type</th><th>Examples</th></tr>
    </thead>
    <tbody>
        <tr>
            <td>Integers</td>
            <td>
                <code>-2</code>,
                <code>-1</code>,
                <code>0</code>, 
                <code>1</code>, 
                <code>2</code>, 
                <code>3</code>, 
                <code>4</code>, 
                <code>5</code>
            </td>
        </tr>
        <tr>
            <td>Floating-point numbers</td>
            <td>
                <code>-1.25</code>,
                <code>-1.0</code>,
                <code>-0.5</code>,
                <code>0.0</code>, 
                <code>0.5</code>, 
                <code>1.0</code>, 
                <code>1.25</code>
            </td>
        </tr>
        <tr>
            <td>Strings</td>
            <td>
                <code>'a'</code>,
                <code>'aa'</code>,
                <code>'aaa'</code>, 
                <code>'Hello!'</code>, 
                <code>'11 cats'</code>
            </td>
        </tr>
    </tbody>
</table>
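
If you are unsure which data type a value belongs to, you can ask Python. The sketch below uses the built-in `type()` function, which this chapter does not otherwise cover, so treat it as an optional extra; the variable names are invented for the example.

```python
# type() is a Python built-in that reports the data type of a value.
intType = type(-2)
floatType = type(42.0)        # the decimal point makes this a float
strType = type('11 cats')     # quotes make this a string, digits or not

print(intType)
print(floatType)
print(strType)
```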

Python programs can also have text values called `strings`, or `strs` (pronounced “stirs”). Always surround your string in single quote (`'`) characters (as in `'Hello'` or `'Goodbye cruel world!'`) so Python knows where the string begins and ends. You can even have a string with no characters in it, `''`, called a *blank string*. Strings are explained in greater detail in Chapter 4.

If you ever see the error message `SyntaxError: EOL while scanning string literal` (worded as `SyntaxError: unterminated string literal` in Python 3.10 and later), you probably forgot the final single quote character at the end of the string, such as in this example:

```python
'Hello world!
```

### String Concatenation and Replication

The meaning of an operator may change based on the data types of the values next to it. For example, `+` is the addition operator when it operates on two integers or floating-point values. However, when `+` is used on two string values, it joins the strings as the string concatenation operator. Enter the following into the <del>interactive shell</del> <ins>cell below</ins>:

```python
'Alice' + 'Bob'
```

The expression evaluates down to a single, new string value that combines the text of the two strings. However, if you try to use the `+` operator on a string and an integer value, Python will not know how to handle this, and it will display an error message.

```python
'Alice' + 42
```

The error message means that Python thought you were trying to concatenate an integer to the string `'Alice'`. Your code will have to explicitly convert the integer to a string, because Python cannot do this automatically. (Converting data types will be explained in **Dissecting Your Program** when talking about the `str()`, `int()`, and `float()` functions.)
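
As a preview of that fix, here is a minimal sketch that converts the integer with `str()` before concatenating. The variable names are made up for the illustration.

```python
name = 'Alice'
number = 42

# str(number) evaluates to the string '42', so + can concatenate it.
combined = name + str(number)
print(combined)
```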

The `*` operator is used for multiplication when it operates on two integer or floating-point values. But when the `*` operator is used on one string value and one integer value, it becomes the *string replication* operator. Enter a string multiplied by a number into the <ins>cell below</ins> to see this in action.

```python
'Alice' * 5
```

The expression evaluates down to a single string value that repeats the original a number of times equal to the integer value. String replication is a useful trick, but it’s not used as often as string concatenation.

The `*` operator can be used with only two numeric values (for multiplication) or one string value and one integer value (for string replication). Otherwise, Python will just display an error message.

```python
'Alice' * 'Bob'
```

```python
'Alice' * 5.0
```

It makes sense that Python wouldn’t understand these expressions: You can’t multiply two words, and it’s hard to replicate an arbitrary string a fractional number of times.
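
Replication and concatenation can be combined in one expression. The string operators keep the same precedence as their math counterparts, so `*` is applied before `+`. A small sketch, with variable names invented for the example:

```python
# Replication is handy for building simple dividers.
divider = '-' * 20

# 'Alice' * 2 is evaluated first, then 'Bob' is concatenated.
echoed = 'Alice' * 2 + 'Bob'

print(divider)
print(echoed)
```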

### Storing Values in Variables

A *variable* is like a box in the computer’s memory where you can store a single value. If you want to use the result of an evaluated expression later in your program, you can save it inside a variable.

### Assignment Statements

You’ll store values in variables with an *assignment statement*. An assignment statement consists of a variable name, an equal sign (called the *assignment operator*), and the value to be stored. If you enter the assignment statement `spam = 42`, then a variable named `spam` will have the integer value `42` stored in it.

Think of a variable as a labeled box that a value is placed in, as in **Figure 1-2**.

<img src="https://automatetheboringstuff.com/images/000060.jpg">

Figure 1-2. `spam = 42` is like telling the program, “The variable `spam` now has the integer value `42` in it.”

For example, enter the following:

```python
spam = 40          # ①
spam
```

```python
eggs = 2
```

```python
spam + eggs        # ②
```

```python
spam + eggs + spam
```

```python
spam = spam + 2    # ③
spam
```

A variable is *initialized* (or created) the first time a value is stored in it ①. After that, you can use it in expressions with other variables and values ②. When a variable is assigned a new value ③, the old value is forgotten, which is why `spam` evaluated to `42` instead of `40` at the end of the example. This is called *overwriting* the variable. Enter the following code to try overwriting a string:

```python
spam = 'Hello'
spam
```

```python
spam = 'Goodbye'
spam
```

Just like the box in **Figure 1-3**, the `spam` variable in this example stores `'Hello'` until you replace it with `'Goodbye'`.

<img src="https://automatetheboringstuff.com/images/000064.jpg">

Figure 1-3. When a new value is assigned to a variable, the old one is forgotten.

### Variable Names

**Table 1-3** has examples of legal variable names. You can name a variable anything as long as it obeys the following three rules:

 1. It can be only one word.

 2. It can use only letters, numbers, and the underscore (_) character.

 3. It can’t begin with a number.

Table 1-3. Valid and Invalid Variable Names

|Valid variable names|Invalid variable names|
|--|--|
|`balance`|`current-balance` (hyphens are not allowed)|
|`currentBalance`|`current balance` (spaces are not allowed)|
|`current_balance`|`4account` (can’t begin with a number)|
|`_spam`|`42` (can’t begin with a number)|
|`SPAM`|`total_$um` (special characters like `$` are not allowed)|
|`account4`|`'hello'` (special characters like `'` are not allowed)|

Variable names are case-sensitive, meaning that `spam`, `SPAM`, `Spam`, and `sPaM` are four different variables. It is a Python convention to start your variables with a lowercase letter.
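
You can ask Python to check a name against these rules. The sketch below uses the string method `isidentifier()`, which is not covered in this chapter. One caveat: it also returns `True` for reserved words such as `class`, which you still cannot use as variable names. The `Ok` variables are made up for the example.

```python
# isidentifier() is True when the text obeys Python's naming rules.
underscoreOk = 'current_balance'.isidentifier()   # letters and underscore
hyphenOk = 'current-balance'.isidentifier()       # hyphens are not allowed
spaceOk = 'current balance'.isidentifier()        # spaces are not allowed
digitFirstOk = '4account'.isidentifier()          # can't begin with a number
dollarOk = 'total_$um'.isidentifier()             # $ is not allowed

print(underscoreOk, hyphenOk, spaceOk, digitFirstOk, dollarOk)

# Case matters: spam and SPAM are two separate variables.
spam = 'lowercase'
SPAM = 'uppercase'
print(spam == SPAM)
```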

This book uses camelcase for variable names instead of underscores; that is, variables `lookLikeThis` instead of `looking_like_this`. Some experienced programmers may point out that the official Python code style, PEP 8, says that underscores should be used. I unapologetically prefer camelcase and point to “A Foolish Consistency Is the Hobgoblin of Little Minds” in PEP 8 itself:

> Consistency with the style guide is important. But most importantly: know when to be inconsistent—sometimes the style guide just doesn’t apply. When in doubt, use your best judgment.

A good variable name describes the data it contains. Imagine that you moved to a new house and labeled all of your moving boxes as *Stuff*. You’d never find anything! The variable names `spam`, `eggs`, and `bacon` are used as generic names for the examples in this book and in much of Python’s documentation (inspired by the Monty Python “Spam” sketch), but in your programs, a descriptive name will help make your code more readable.

### Your First Program

Now it’s time to create your first program!

```python
# This program says hello and asks for my name.              # ① 
                                                             #
print('Hello world!')                                        # ② 
print('What is your name?')    # ask for their name          #
myName = input()                                             # ③
print('It is good to meet you, ' + myName)                   # ④
print('The length of your name is:')                         # ⑤
print(len(myName))                                           # 
print('What is your age?')    # ask for their age            # ⑥
myAge = input()                                              #
print('You will be ' + str(int(myAge) + 1) + ' in a year.')  #
```


### Dissecting Your Program

Let’s take a quick tour of the Python instructions it uses by looking at what each line of code does.

### Comments

The following line is called a *comment*.

```python
# This program says hello and asks for my name.              # ① 
```

Python ignores comments, and you can use them to write notes or remind yourself what the code is trying to do. Any text for the rest of the line following a hash mark (`#`) is part of a comment.

Sometimes, programmers will put a `#` in front of a line of code to temporarily remove it while testing a program. This is called *commenting out* code, and it can be useful when you’re trying to figure out why a program doesn’t work. You can remove the `#` later when you are ready to put the line back in.

Python also ignores the blank line after the comment. You can add as many blank lines to your program as you want. This can make your code easier to read, like paragraphs in a book.
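
A short sketch of both uses of `#` described above: a whole line that has been commented out, and a note placed after some code. The text and the variable name are invented for the example.

```python
# print('Testing...')       # commented out: Python skips this whole line

greeting = 'Hello world!'   # a comment can also follow code on the same line

print(greeting)
```

Only `Hello world!` is printed, because the first `print()` call sits behind a `#`.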

### The `print()` Function

The `print()` function displays the string value inside the parentheses on the screen.

```python
print('Hello world!')                                        # ② 
print('What is your name?')    # ask for their name          #
```

The line `print('Hello world!')` means “Print out the text in the string `'Hello world!'`.” When Python executes this line, you say that Python is *calling* the `print()` function and the string value is being *passed* to the function. A value that is passed to a function call is an *argument*. Notice that the quotes are not printed to the screen. They just mark where the string begins and ends; they are not part of the string value.

#### Note

*You can also use this function to put a blank line on the screen; just call `print()` with nothing in between the parentheses.*

When writing a function name, the opening and closing parentheses at the end identify it as the name of a function. This is why in this book you’ll see `print()` rather than `print`. **Chapter 2** describes functions in more detail.

### The `input()` Function

The `input()` function waits for the user to type some text on the keyboard and press <kbd>ENTER</kbd>.

```python
myName = input()                                             # ③
```

This function call evaluates to a string equal to the user’s text, and the previous line of code assigns the `myName` variable to this string value.

You can think of the `input()` function call as an expression that evaluates to whatever string the user typed in. If the user entered `Al`, then `input()` would evaluate to `'Al'`, and the whole line would behave like `myName = 'Al'`.

### Printing the User’s Name

The following call to `print()` actually contains the expression `'It is good to meet you, ' + myName` between the parentheses.

```python
print('It is good to meet you, ' + myName)                   # ④
```

Remember that expressions can always evaluate to a single value. If `'Al'` is the value stored in `myName` on the previous line, then this expression evaluates to `'It is good to meet you, Al'`. This single string value is then passed to `print()`, which prints it on the screen.
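
To see this without typing anything, the sketch below assigns `'Al'` directly, standing in for the value `input()` would return. The `greeting` variable is added only for the illustration.

```python
myName = 'Al'   # stands in for what input() would return

# The expression is evaluated first, giving one string value...
greeting = 'It is good to meet you, ' + myName

# ...and that single value is what print() receives.
print(greeting)
```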

### The `len()` Function

You can pass the `len()` function a string value (or a variable containing a string), and the function evaluates to the integer value of the number of characters in that string.

```python
print('The length of your name is:')                         # ⑤
print(len(myName))                                           # 
```

Enter the following to try this:

```python
len('hello')
```

```python
len('My very energetic monster just scarfed nachos.')
```

```python
len('')
```

Just like those examples, `len(myName)` evaluates to an integer. It is then passed to `print()` to be displayed on the screen. Notice that `print()` allows you to pass it either integer values or string values. But notice the error that shows up when you type the following <del>into the interactive shell</del>:

```python
print('I am ' + 29 + ' years old.')
```

The `print()` function isn’t causing that error, but rather it’s the expression you tried to pass to `print()`. You get the same error message if you type the expression into a cell on its own.

```python
'I am ' + 29 + ' years old.'
```

Python gives an error because you can use the `+` operator only to add two numbers together or concatenate two strings. You can’t add an integer to a string because this is ungrammatical in Python. You can fix this by using a string version of the integer instead, as explained in the next section.

### The `str()`, `int()`, and `float()` Functions

If you want to concatenate an integer such as `29` with a string to pass to `print()`, you’ll need to get the value `'29'`, which is the string form of `29`. The `str()` function can be passed an integer value and will evaluate to a string value version of it, as follows:

```python
str(29)
```

```python
print('I am ' + str(29) + ' years old.')
```

Because `str(29)` evaluates to `'29'`, the expression `'I am ' + str(29) + ' years old.'` evaluates to `'I am ' + '29' + ' years old.'`, which in turn evaluates to `'I am 29 years old.'`. This is the value that is passed to the `print()` function.

The `str()`, `int()`, and `float()` functions will evaluate to the string, integer, and floating-point forms of the value you pass, respectively. Try converting some values in the cells below with these functions, and watch what happens.

```python
str(0)
```

```python
str(-3.14)
```

```python
int('42')
```

```python
int('-99')
```

```python
int(1.25)
```

```python
int(1.99)
```

```python
float('3.14')
```

```python
float(10)
```

The previous examples call the `str()`, `int()`, and `float()` functions and pass them values of the other data types to obtain a string, integer, or floating-point form of those values.

The `str()` function is handy when you have an integer or float that you want to concatenate to a string. The `int()` function is also helpful if you have a number as a string value that you want to use in some mathematics. For example, the `input()` function always returns a string, even if the user enters a number. Enter `spam = input()` and enter `101` when it waits for your text.

```python
spam = input()
```

```python
spam
```

The value stored inside `spam` isn’t the integer `101` but the string `'101'`. If you want to do math using the value in `spam`, use the `int()` function to get the integer form of `spam` and then store this as the new value in `spam`.

```python
spam = int(spam)
spam
```

Now you should be able to treat the `spam` variable as an integer instead of a string.

```python
spam * 10 / 5
```

Note that if you pass a value to `int()` that it cannot evaluate as an integer, Python will display an error message.

```python
int('99.99')
```

```python
int('twelve')
```
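
For text like `'99.99'`, one possible workaround is to convert to a float first and then to an integer. This is a sketch rather than the book's method, and the variable names are hypothetical. There is no similar shortcut for `'twelve'`, because Python does not read number words.

```python
priceText = '99.99'             # text such as input() might return

priceFloat = float(priceText)   # float() accepts a decimal point
priceInt = int(priceFloat)      # int() then drops the fractional part

print(priceFloat)
print(priceInt)
```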

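The `int()` function is also useful if you need to round a positive floating-point number down, because it discards everything after the decimal point. If you want to round up instead, add `1` afterward. Note that this shortcut overshoots when the number is already whole (`int(7.0) + 1` is `8`), and that `int()` moves negative numbers toward zero rather than down.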

```python
int(7.7)
```

```python
int(7.7) + 1
```
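
The add-one shortcut works only for positive numbers that are not already whole. For other cases, the standard `math` module (not introduced in this chapter) provides `math.floor()` and `math.ceil()`. The sketch below compares them with `int()`; the variable names are made up for the example.

```python
import math  # standard-library module, not covered in this chapter

roundedDown = int(7.7)               # int() drops the fractional part
towardZero = int(-7.7)               # for negatives, that moves toward zero
flooredNegative = math.floor(-7.7)   # floor() always rounds down
roundedUp = math.ceil(7.7)           # ceil() always rounds up
alreadyWhole = math.ceil(7.0)        # a whole number stays where it is
overshoot = int(7.0) + 1             # the add-one shortcut goes too far here

print(roundedDown, towardZero, flooredNegative)
print(roundedUp, alreadyWhole, overshoot)
```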

In your program, you used the `int()` and `str()` functions in the last three lines to get a value of the appropriate data type for the code.

```python
print('What is your age?')    # ask for their age            # ⑥
myAge = input()                                              #
print('You will be ' + str(int(myAge) + 1) + ' in a year.')  #
```

The `myAge` variable contains the value returned from `input()`. Because the `input()` function always returns a string (even if the user typed in a number), you can use the `int(myAge)` code to return an integer value of the string in `myAge`. This integer value is then added to `1` in the expression `int(myAge) + 1`.

The result of this addition is passed to the `str()` function: `str(int(myAge) + 1)`. The string value returned is then concatenated with the strings `'You will be '` and `' in a year.'` to evaluate to one large string value. This large string is finally passed to `print()` to be displayed on the screen.

Let’s say the user enters the string `'4'` for `myAge`. The string `'4'` is converted to an integer, so you can add one to it. The result is `5`. The `str()` function converts the result back to a string, so you can concatenate it with the second string, `' in a year.'`, to create the final message. These evaluation steps would look something like **Figure 1-4**.

#### Text and Number Equivalence

Although the string value of a number is considered a completely different value from the integer or floating-point version, an integer can be equal to a floating-point number.

```python
42 == '42'
```

```python
42 == 42.0
```

```python
42.0 == 0042.000
```

Python makes this distinction because strings are text, while integers and floats are both numbers.
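
If you do need to compare text with a number, convert one side first so both values have the same kind of data type. A minimal sketch, with variable names invented for the illustration:

```python
# Convert the string to a number, or the number to a string, then compare.
sameAfterInt = 42 == int('42')
sameAfterStr = str(42) == '42'
sameAfterFloat = 42.0 == float('42')

print(sameAfterInt, sameAfterStr, sameAfterFloat)
```

All three comparisons evaluate to `True`, whereas `42 == '42'` without a conversion evaluates to `False`.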

<img src="https://automatetheboringstuff.com/images/000069.png">

Figure 1-4. The evaluation steps, if `4` was stored in `myAge`
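
The steps in Figure 1-4 can be written out as separate expressions to confirm that each one evaluates to the same string. This is a sketch assuming the user typed `4`; the `step` names are only labels for the illustration.

```python
myAge = '4'   # stands in for what input() would return

step1 = 'You will be ' + str(int(myAge) + 1) + ' in a year.'
step2 = 'You will be ' + str(int('4') + 1) + ' in a year.'
step3 = 'You will be ' + str(4 + 1) + ' in a year.'
step4 = 'You will be ' + str(5) + ' in a year.'
step5 = 'You will be ' + '5' + ' in a year.'
step6 = 'You will be 5 in a year.'

print(step1)
print(step1 == step6)
```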

### Summary

You can compute expressions with a calculator or type string concatenations with a word processor. You can even do string replication easily by copying and pasting text. But expressions, and their component values—operators, variables, and function calls—are the basic building blocks that make programs. Once you know how to handle these elements, you will be able to instruct Python to operate on large amounts of data for you.

It is good to remember the different types of operators (`+`, `-`, `*`, `/`, `//`, `%`, and `**` for math operations, and `+` and `*` for string operations) and the three data types (integers, floating-point numbers, and strings) introduced in this chapter.

A few different functions were introduced as well. The `print()` and `input()` functions handle simple text output (to the screen) and input (from the keyboard). The `len()` function takes a string and evaluates to an int of the number of characters in the string. The `str()`, `int()`, and `float()` functions will evaluate to the string, integer, or floating-point number form of the value they are passed.

In the next chapter, you will learn how to tell Python which code to run, which code to skip, and which code to repeat based on the values it has. This is known as *flow control*, and it allows you to write programs that make intelligent decisions.