# Counting Words in a Text

For this assignment, you will write a program that counts how many times each word appears in a text file and prints out the results. For example, in one of the files we will provide you — `caesar.txt`, which is part of the text of Julius Caesar — you will find that the word "caesar" appears 34 times (ignoring case).

As you have not yet learned to read files, we have provided the `count_file` function for you:

```python
def count_file(fname):
    counts = {}
    with open(fname) as f:
        for line in f:
            count_words(counts, line)
    print_results(counts)
```

This function creates an empty dictionary, then reads the file line by line, calling your `count_words` function on each line to update the dictionary, which maps words to their counts. When it finishes with the file, it calls your `print_results` function to print the final results.
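
Note that `count_file` never looks at a value returned by `count_words`; it relies on the dictionary itself being changed. The following is a minimal sketch of that behavior, using a made-up helper `add_entry` and made-up data that are not part of the assignment:

```python
def add_entry(table, key, value):
    # Hypothetical helper: assigns into the dictionary it was given.
    # No copy is made, so the caller's dictionary changes too.
    table[key] = value

table = {}
add_entry(table, "x", 1)
add_entry(table, "y", 2)
print(table)  # {'x': 1, 'y': 2}
```

The same mechanism is what lets `count_file` pass one dictionary to `count_words` once per line and end up with totals for the whole file.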

## Step 1:

Write the function `count_words(counts, line)` which takes:

- `counts`: a dictionary that maps words to how many times they appear, and 
- `line`: a string, which you should split into words. For each word you see, update `counts` to reflect that occurrence of the word.

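This function should return the updated `counts` dictionary. Because `count_file` ignores the return value, your function must also modify the `counts` dictionary it is given rather than build a new one.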

A few hints:

- Python strings come with a number of standard methods for string manipulation, which [you can read about here](https://docs.python.org/3/library/stdtypes.html#string-methods).
    - Of particular interest for this exercise is the `.split()` method (e.g., `my_string.split()`). Called with no arguments, `.split()` breaks a string into a list of substrings, splitting wherever it sees whitespace (such as a space or newline). For example, the second line of:

    ```python
    my_string = "hello world!"
    my_string.split()
    ```

    will return `["hello", "world!"]`.
- We have provided a simple `print_results`, which just prints `counts`; you can use it to test your `count_words` function before you proceed.
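
If you have not used a dictionary as a tally before, the general pattern may help. The sketch below counts the characters of a string rather than the words of a line; `tally_letters` and its data are hypothetical and only illustrate the idea, so adapting it to `count_words` is left to you:

```python
def tally_letters(tally, text):
    # Hypothetical helper, not part of the assignment: counts characters, not words.
    for ch in text:
        if ch in tally:
            tally[ch] += 1   # seen before: add one to the existing count
        else:
            tally[ch] = 1    # first occurrence: start the count at one
    return tally

tally = {}
tally_letters(tally, "banana")
print(tally)  # {'b': 1, 'a': 3, 'n': 2}
```

The `if`/`else` can also be written in one line as `tally[ch] = tally.get(ch, 0) + 1`, since `.get` returns the second argument when the key is missing.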

## Step 2:

If you look at the results of `count_words`, you will see that sometimes we end up with words like `"now,"` (with a comma), which should really just be `"now"`. Likewise, `"The"` and `"the"` are counted as two separate words.

You can improve on your `count_words` by using the following two string methods:

- `string.strip("-?.!,[]—:;'\"")` returns a string with any leading or trailing occurrences of the characters `-?.!,[]—:;"'` removed.
- `string.lower()` returns a string with all characters converted to lowercase. (There is also a `.upper()` to go the other way, if you ever need it.)
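
As a quick illustration of these two methods, here is a small sketch on made-up strings. The name `PUNCTUATION` is just a convenience introduced for this example:

```python
PUNCTUATION = "-?.!,[]—:;'\""  # the same characters listed above

raw = "--Hello,"
trimmed = raw.strip(PUNCTUATION)
print(trimmed)          # Hello
print(trimmed.lower())  # hello

# Only leading and trailing characters are removed; the apostrophe
# in the middle of the word is left alone.
print("don't!".strip(PUNCTUATION))  # don't

# Strings are immutable: both methods return a new string.
print(raw)  # --Hello,
```

Because neither method changes the original string, you need to use the returned value (for example, by assigning it to a variable).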

## Step 3:

`print_results` currently just prints the dictionary using its built-in conversion to a string. That is fine for testing, but not a great output format.

For this step, you should replace the code in `print_results` so that it prints each word and its corresponding count, one per line, e.g.:

```
and : 59
a : 27
the : 72
caesar : 34
```
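
One way to walk over a dictionary's keys and values together is its `.items()` method. The sketch below uses a made-up `prices` dictionary rather than word counts, and is only one possible approach:

```python
prices = {"apple": 3, "pear": 5}  # made-up data for illustration

# .items() yields (key, value) pairs, which the loop unpacks into two names.
for name, price in prices.items():
    # print separates its arguments with a single space by default.
    print(name, ":", price)
```

This prints `apple : 3` and `pear : 5` on separate lines, in the order the keys were added to the dictionary.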

## Step 4:

For this final step, you will make your output nicer still by sorting it by word, so that the words appear in alphabetical order. To accomplish this, put the items from the dictionary into a list, then use `thatlist.sort()`.

However, you will need to pass an extra parameter to `.sort()` to get the behavior you desire. Check out [the documentation](https://docs.python.org/3/library/stdtypes.html#list.sort) for `.sort()` to find out how to sort by the words (which should be a particular item in a tuple).

You may wish to make use of the other files in `/usr/local/mids/words` to test your program.
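
To show what the extra parameter to `.sort()` looks like in practice, here is a sketch that sorts a list of tuples by one chosen item. The `people` data and the `age_of` function are invented for this example; deciding which item to pick for your word counts is up to you:

```python
people = [("carol", 35), ("alice", 41), ("bob", 29)]  # made-up (name, age) tuples

def age_of(person):
    # Returns the item of the tuple that the sort should compare.
    return person[1]

# key= takes a function; .sort() calls it on each element and
# orders the list by the values it returns.
people.sort(key=age_of)
print(people)  # [('bob', 29), ('carol', 35), ('alice', 41)]
```

Note that `.sort()` rearranges the list in place and returns `None`, so call it on its own line rather than assigning its result.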