## Processing Shakespeare

Counting all the letters in Shakespeare's works!

The first step is to read each line of the text file into a list of strings.

In [None]:
from io import TextIOWrapper

def read_lines(filename: str) -> list[str]:
    """Read a text file into a list of strings for each line."""
    lines: list[str] = []
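    file_handle: TextIOWrapper = open(filename, "r")
    for line in file_handle:
        # Remove surrounding whitespace (including the newline) and normalize case.
        line = line.strip()
        line = line.lower()
        lines.append(line)
    file_handle.close()  # Release the file once every line has been read.
    return lines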


shakespeares_lines: list[str] = read_lines("./shakespeare.txt")
print(shakespeares_lines)

In [None]:
# Example of "Slice Notation"

print(shakespeares_lines[0:5])
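
Slice notation `items[start:stop]` builds a new list from the elements at index `start` up to, but not including, index `stop`. Here is a small sketch on a made-up list; `words` is a hypothetical sample, not data from the Shakespeare file.

```python
# Hypothetical sample data for illustration only.
words: list[str] = ["to", "be", "or", "not", "to", "be"]

first_three: list[str] = words[0:3]  # indexes 0, 1, and 2
same_thing: list[str] = words[:3]    # start defaults to 0 when omitted
last_two: list[str] = words[-2:]     # negative indexes count from the end

print(first_three)
print(same_thing)
print(last_two)
```

Slicing never changes the original list, so `words` still has all six items afterward.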

The next step of our analysis is to count all of the letters in Shakespeare's complete body of work.

In [None]:
def tally(counts: dict[str, int], key: str) -> None:
    """Mutate counts by incrementing value stored at key by 1."""
    if key in counts:
        counts[key] += 1
    else:
        counts[key] = 1

    
def count_letters(lines: list[str]) -> dict[str, int]:
    """Count the frequency of all letters in a list of strings."""
    counts: dict[str, int] = {}
    for line in lines:
        for char in line:
            if char.isalpha():
                tally(counts, char)
    return counts


shakespeares_counts: dict[str, int] = count_letters(shakespeares_lines)
print(shakespeares_counts)
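
For comparison, the standard library's `collections.Counter` does the same kind of tallying that `tally` and `count_letters` do by hand. This is a sketch of an alternative, not the approach this notebook uses. The names `count_letters_with_counter` and `sample_lines` are hypothetical and introduced only for this example.

```python
from collections import Counter


def count_letters_with_counter(lines: list[str]) -> dict[str, int]:
    """Count letters using Counter instead of a hand-written tally."""
    counts: Counter[str] = Counter()
    for line in lines:
        # Only alphabetic characters are counted, as in count_letters.
        counts.update(char for char in line if char.isalpha())
    return dict(counts)


# Hypothetical sample input, small enough to check by hand.
sample_lines: list[str] = ["to be,", "or not"]
sample_counts: dict[str, int] = count_letters_with_counter(sample_lines)
print(sample_counts)
```

Writing `tally` by hand makes the "increment or initialize" logic explicit, which is the part `Counter` hides.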

Visualize the letter frequencies with a bar chart.

In [None]:
from matplotlib import pyplot

shakespeares_counts = dict(sorted(shakespeares_counts.items()))
pyplot.title("Frequencies of letters in Shakespeare")
pyplot.xlabel("Letters")
pyplot.ylabel("Counts")
labels: list[str] = list(shakespeares_counts.keys())
values: list[int] = list(shakespeares_counts.values())
pyplot.bar(labels, values)


In [None]:
example_tuples: list[tuple[str, int]] = [
    ('spring', 110),
    ('break', 1000000)
]

dict_from_tuples: dict[str, int] = dict(example_tuples)
dict_from_tuples
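
Building a `dict` from a list of key-value tuples is what makes the `dict(sorted(shakespeares_counts.items()))` line in the chart cell work. The steps can be sketched on a tiny dictionary; `unsorted_counts` and its values are hypothetical, chosen only to show the ordering.

```python
# Hypothetical counts, deliberately out of alphabetical order.
unsorted_counts: dict[str, int] = {"c": 3, "a": 1, "b": 2}

# items() yields (key, value) tuples; sorted() orders them by key first.
pairs: list[tuple[str, int]] = sorted(unsorted_counts.items())
print(pairs)

# dict() rebuilds a dictionary that keeps the sorted insertion order.
sorted_counts: dict[str, int] = dict(pairs)
print(list(sorted_counts.keys()))
```

Because dictionaries preserve insertion order, the rebuilt dictionary lists its keys alphabetically, which is why the bars in the chart appear in letter order.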