# Python for Digital Humanities

## Unit #2: Intro to Programming

* Overview
* Variables
* Input/Output
* Reading Files
* Text as Strings

<font color=blue>---------------------------------------------------------------</font>

### 2.1 Overview 
#### Programming as a Writing Assignment

Programming is like writing a descriptive essay -- you are telling the computer exactly what you want it to do.  Each step must be explained in some detail, and there must be a logical flow from one step to the next.

Because the audience of your program is a machine (i.e., a computer), you must be clear and precise.  A machine is very limited in what it can understand.  Furthermore, you must communicate by using a language that the machine can understand.  For this workshop, the language that we will use is Python.

As a language, Python has a syntax and a grammar.  Unlike most natural languages, it has very strict rules for following the syntax.  Again, the strict rules are there to ensure that we are precise in what we want the computer to do.

Learning Python means that you must learn the rules of the language and how to translate your thoughts into that language.  Often, the translation of thoughts is the most difficult step!

Python is an _interpreted_ language.  The computer will _interpret_ each statement, and follow the instructions required by that statement, before moving on to the next statement.

The beauty of programming is that once you have a complete set of instructions for the computer, it will always perform those instructions exactly as written.  For that reason, we will want to write our programs to be general enough to work in different situations, reducing our future efforts.

It also is important to maintain a working program as is, so that our results can be reproduced when needed.


#### The Outline of a Simple Program

Before writing an essay, you probably would start with an outline. You should do exactly the same thing with a program.

It is important to think of programs as a series of tasks that need to be performed.  At the most basic level, the tasks would be to 
* Read in data (which could be text);
* Perform some analysis on the data;
* Report the results of the analysis.

Of course the outline could be much more detailed.  For example, the data may need to be _preprocessed_ before you can perform an analysis on it.  There also may be multiple types of analysis that you want to perform.  

Let's see how this would work.  Suppose I have a text file, called **emma_chapter_one.txt**, that has the first chapter of _Emma_ by Jane Austen, and I want to know how many times the name "Emma" appears in the chapter.  The outline for my program could be as follows: 
* Read the file **emma_chapter_one.txt**;
* Count the number of times the name "Emma" appears in the chapter;
* Display the results to the screen.

Now, we need to learn some Python to be able to perform these steps.

### 2.2  Variables

#### Variable Overview
In any programming language, a **variable** is used to store information, including
* Data
* Results from computations
* Words (in either single or double quotes)
* Images
* Others

Technically, a variable represents a *_memory location_*, but we can think of it as a label for the information that we use in our program.

Variables are created by assigning a value to them:

```
n = 10
x = 2.5
ans = "You are correct"
```
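As a small illustration, the built-in `type` function reports what kind of value each of these variables holds. This is only a sketch; `type` is not used elsewhere in this unit.

```python
n = 10
x = 2.5
ans = "You are correct"

# type() reports the kind of value stored in each variable
print(type(n))    # an integer
print(type(x))    # a floating-point (decimal) number
print(type(ans))  # a string of characters
```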

<font color=blue>---------------------------------------------------------------</font>

#### Variable Names

As the writer, you may choose the names of your variables, with a few restrictions:
*  Variable names must start with a letter or an underscore ( _ );
*  Variable names may not contain spaces;
*  Variable names cannot be words that are reserved by the Python language (e.g., if, for);
*  Variable names are case sensitive (e.g., myVariable is not the same as Myvariable or MyVariable).

**NOTE**:  It is important to watch out for typos in variable names.  Typos can make the difference between correct code and code with serious problems.

<font color=blue>---------------------------------------------------------------</font>

#### Assignment Statements

An assignment statement is a command that will `assign` a value (or values) to a variable.
An assignment statement has the form:

```variable = expression```
    
where `expression` can be a numeric value, a string, an algebraic expression, etc.

##### Examples:
   count = 12
   
   X = 3.14159
   
   speech = "Four score and seven years ago ..."
   
   area = length*width
   
   size = size + 2
   
<font color=blue>---------------------------------------------------------------</font>

#### A Closer Look at Assignment Statements
Consider the statement:
```size = size + 2```

In math, this would be an illegal expression -- there is no value that is equal to two more than itself.
In programming, this is a legal statement because it is an *_assignment_*, not an equality.

* In an assignment statement, the expression on the right-hand-side (`size + 2` ) is evaluated first, then assigned to the variable on the left-hand-side.
* If any variables exist in the expression on the right-hand-side, these variables must have a value assigned to them BEFORE they can be used in the statement.
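
The two points above can be sketched as follows. The names `length`, `width`, and `total` are illustrative, and the `try`/`except` lines (not covered in this unit) are there only so that the example keeps running after the error.

```python
# Variables on the right-hand side must already have values
length = 4
width = 3
area = length * width   # the right-hand side is evaluated first: 4 * 3
print(area)

size = 5
size = size + 2         # evaluate 5 + 2, then store the result back in size
print(size)

# Using a name that has never been assigned a value raises a NameError
try:
    total = total + 1
except NameError as error:
    print("Python reported:", error)
```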

<font color=blue>---------------------------------------------------------------</font>

### Activity:  Assigning Values to Variables

Open Spyder, then type in and run the following code snippet:
```
size = 5
print(size)
size = size + 2
print(size)
```

<font color=blue>---------------------------------------------------------------</font>

### 2.3 Input/Output

Programs are not very useful if they cannot communicate their results to the user or if they do not allow users to provide specific inputs.

#### Output

We have already seen the `print` function (e.g., ```print("Hello World")```).  We also can include multiple items in a single call to `print`.  For example:  
```
Month = "December"
Day = 2
Year = 2019
print("The date is", Day, Month, Year, ".")
```
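
Note that `print` separates its items with a space, so the example above leaves a space before the final period. One optional way around this, sketched here, is a formatted string (an f-string), which places the values directly inside the text. The variable `message` is introduced only for this illustration.

```python
Month = "December"
Day = 2
Year = 2019

# An f-string replaces each {name} with that variable's value
message = f"The date is {Day} {Month} {Year}."
print(message)
```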

<font color=blue>---------------------------------------------------------------</font>

#### Input

If we want the user to input information by typing it on the keyboard, we can use the `input` command.  The general format of the command is as follows:

```variable = input("Some prompt to tell the user what to input:  ")```

For example:
```
name = input("Please enter your name:  ")
print("Hello", name)
```
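
`input` always hands back a string, even when the user types digits, so a number has to be converted before it can be used in arithmetic. In the sketch below, the variable `typed` stands in for the value that `input` would return, so the example runs without waiting for the keyboard; the prompt text is hypothetical.

```python
# In a real program this line would be:
#     typed = input("How many chapters have you read?  ")
typed = "12"   # stand-in for what the user typed

print(type(typed))       # the value arrives as a string
chapters = int(typed)    # int() converts the digits to a whole number
print(chapters + 1)
```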

### 2.4 Reading a Text File
Another way to get information into a program is to read the contents of a file. The technique for doing this will depend on the format of the file (e.g., is it a plain text file, or a comma-delimited file, or a pdf file).

We will start with a plain text file. To read a text file, you first will need to set up a connection to the file. This connection is created with the `open` function. Traditionally, there are three steps to reading a file: open a connection, read the file contents and save them to a variable, and close the connection. For example:
```
file_connection = open('emma_chapter_one.txt')
text = file_connection.read()
file_connection.close()
```
There is a short-cut notation for doing these three steps:
```
with open('emma_chapter_one.txt') as file_connection:
    text = file_connection.read()
```

Notice that `file_connection` is a variable name. You can choose any name that you want. In different examples online, you often will see `f` used for the file connection. For example:

```
with open('emma_chapter_one.txt') as f:
    text = f.read()
```

The most important variable is `text`.  If the program is able to connect to the file and read the contents, then `text` will hold the contents of the file.
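
A common stumbling block is that Python cannot find the file, usually because it is not in the current working folder. The following sketch checks for the file before reading it. It assumes that the file is saved with UTF-8 encoding, which is a guess about this particular file, and the `if`/`else` lines are not covered in this unit.

```python
import os

filename = "emma_chapter_one.txt"

if os.path.exists(filename):
    # encoding="utf-8" is an assumption about how the file was saved
    with open(filename, encoding="utf-8") as f:
        text = f.read()
    print("Read", len(text), "characters.")
else:
    text = ""
    print("Could not find", filename, "in", os.getcwd())
```

If the second message appears, move the file into the folder that is printed, or change the working folder in Spyder.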

<font color=blue>---------------------------------------------------------------</font>



### Activity:  Reading a Text File


Make sure that you have downloaded the file "emma_chapter_one.txt".
Type in the "short-cut" code to read the contents of that file into a variable called `text`, and print `text` to see the contents.

    
<font color=blue>---------------------------------------------------------------</font>

### 2.5 Text as a String

The variable, `text`, holds a string -- a collection of characters, including symbols, spaces, and line feeds.

There are manipulations that we can do on strings.
To determine the number of characters that are in the string, we can use the `len()` function.

The line ```print(len(text))``` will print the total number of characters that were in our text file.

We also can count the number of times that a substring appears within a string.  The match is case sensitive and is not limited to whole words.  Suppose we want to count the number of times that "she" appears in the text.  We can do this with
```
print(text.count('she'))
```
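
To see how `len` and `count` behave, it can help to try them on a short string first. The sentence below is made up for illustration and is not taken from the novel.

```python
sample = "She said she sells shells."

print(len(sample))   # every character counts, including spaces and the period

# count() is case sensitive and matches inside longer words:
# it skips "She" but finds "she" both on its own and inside "shells"
print(sample.count("she"))

# Converting to lowercase first also picks up "She"
print(sample.lower().count("she"))
```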

    
<font color=blue>---------------------------------------------------------------</font>



### Activity: Writing a complete program
Write a complete program that will perform the following steps:
* Read the file **emma_chapter_one.txt**;
* Count the number of times the name "Emma" appears in the chapter;
* Display the results to the screen.
    
Make sure that the results are informative -- print a complete sentence to the screen.

**Question**:  Suppose we want to count the number of times the word "dream" was said in Martin Luther King's iconic speech.  How could we reuse what we have for that task?

<font color=blue>---------------------------------------------------------------</font>

### Summary

In this section, we learned about the general steps of writing a simple program.  We learned how to read a text file into a variable, count the number of times that a specific word appears in the text, and print the results to the screen.

<font color=blue>---------------------------------------------------------------</font>

### Homework Task

Before attempting this task, make sure that you have at least **3.5 GB** of storage available on your laptop.

Open Spyder and run the following commands:
```
import nltk
nltk.download()
```

When a dialog box appears, select "all".  

It will take about 30 minutes for the items to install, but the downloaded items will give you better tools for analyzing text.


<font color=blue>---------------------------------------------------------------</font>