# Regular Expressions

Like a good espresso, regular expressions require care and precision to avoid either a watery mess or bitter sludge.

We will not have time to cover everything about **regexes** here, but this introduction should help you feel comfortable poking around and using them in your own code.

In [4]:
sample_text = """
“Ye kings and warriors! may your vows be crown’d,
And Troy’s proud walls lie level with the ground.
May Jove restore you when your toils are o’er
Safe to the pleasures of your native shore.
But, oh! relieve a wretched parent’s pain,
And give Chryseïs to these arms again;
If mercy fail, yet let my presents move,
And dread avenging Phœbus, son of Jove.”
"""

In [6]:
import re

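# Delimiters: runs of whitespace, basic punctuation, and curly double quotes.
word_regex = re.compile(r"[\s\.,!;”“]+")

# Split each line into words; split() leaves "" where a line starts or ends with a delimiter.
lines = [
    [w for w in word_regex.split(line) if w != ""]
    for line in sample_text.strip().splitlines()
]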
lines

[['Ye', 'kings', 'and', 'warriors', 'may', 'your', 'vows', 'be', 'crown’d'],
 ['And', 'Troy’s', 'proud', 'walls', 'lie', 'level', 'with', 'the', 'ground'],
 ['May', 'Jove', 'restore', 'you', 'when', 'your', 'toils', 'are', 'o’er'],
 ['Safe', 'to', 'the', 'pleasures', 'of', 'your', 'native', 'shore'],
 ['But', 'oh', 'relieve', 'a', 'wretched', 'parent’s', 'pain'],
 ['And', 'give', 'Chryseïs', 'to', 'these', 'arms', 'again'],
 ['If', 'mercy', 'fail', 'yet', 'let', 'my', 'presents', 'move'],
 ['And', 'dread', 'avenging', 'Phœbus', 'son', 'of', 'Jove']]
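
The `if w != ""` filter above is needed because `split` returns an empty string wherever a line begins or ends with a delimiter. As a rough sketch of an alternative (not the only way to do it), one could match the words themselves with `findall` and a negated character class, which never produces empty strings. The names `delimiter_regex` and `token_regex` are illustrative and not part of the original notebook.

```python
import re

line = "“Ye kings and warriors! may your vows be crown’d,"

# Splitting on delimiters: the line starts with a curly quote and ends
# with a comma, so the result has an empty string at each edge.
delimiter_regex = re.compile(r"[\s.,!;”“]+")
pieces = delimiter_regex.split(line)
print(pieces)

# Matching runs of non-delimiter characters returns only the words,
# so no filtering step is required afterwards.
token_regex = re.compile(r"[^\s.,!;”“]+")
words = token_regex.findall(line)
print(words)
```

Both approaches give the same words for this text; which one reads better is largely a matter of taste. Note that a dot inside a character class does not need a backslash, so `[\s.,!;”“]` and `[\s\.,!;”“]` behave identically.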

In [7]:
n_ground = 0
for line in lines:
    for word in line:
        if word == "ground":
            n_ground += 1

n_ground

1
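
The loop above compares each word exactly, so `"And"` and `"and"` would be treated as different words. As a hedged sketch, the same kind of count can be done directly with a regex: `\b` marks a word boundary and the `re.IGNORECASE` flag ignores capitalization. The two-line `text` below is a small excerpt chosen only to keep the example self-contained.

```python
import re

text = """And Troy’s proud walls lie level with the ground.
And dread avenging Phœbus, son of Jove."""

# \b is a word boundary, so "ground" is only matched as a whole word.
n_ground = len(re.findall(r"\bground\b", text))

# re.IGNORECASE lets a single pattern match "and", "And", "AND", ...
n_and = len(re.findall(r"\band\b", text, flags=re.IGNORECASE))

print(n_ground, n_and)
```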

## Homework

Using the code above as a guide, get the counts for _every_ word in `sample_text`.
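
One possible shortcut, offered as a sketch rather than the expected answer, is `collections.Counter` from the standard library, which does the same bookkeeping as a hand-written dictionary. The small `lines` list here is an assumed stand-in with the same nested-list shape as the `lines` variable built earlier.

```python
from collections import Counter

# A stand-in for the nested list of words produced earlier.
lines = [
    ["May", "Jove", "restore", "you", "when", "your", "toils", "are", "o’er"],
    ["Safe", "to", "the", "pleasures", "of", "your", "native", "shore"],
]

# Flatten the nested lists and let Counter tally each word.
counter = Counter(word for line in lines for word in line)

print(len(counter))       # number of unique words
print(counter["your"])    # count for one word
print(counter["absent"])  # missing words count as 0 instead of raising KeyError
```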

In [None]:
counter = {}
for line in lines:
    for word in line:
        if word in counter:
            counter[word] += 1
        else:
            counter[word] = 1
print(f"There are {len(counter)} unique words, below is a list and their count:")
for word, amount in counter.items():
    if amount > 1:
        print(f"there are {amount} instances of {word}.")
    else:
        print(f"there is only {amount} instance of {word}.")



There are 56 unique words, below is a list and their count:
there is only 1 instance of Ye.
there is only 1 instance of kings.
there is only 1 instance of and.
there is only 1 instance of warriors.
there is only 1 instance of may.
there are 3 instances of your.
there is only 1 instance of vows.
there is only 1 instance of be.
there is only 1 instance of crown’d.
there are 3 instances of And.
there is only 1 instance of Troy’s.
there is only 1 instance of proud.
there is only 1 instance of walls.
there is only 1 instance of lie.
there is only 1 instance of level.
there is only 1 instance of with.
there are 2 instances of the.
there is only 1 instance of ground.
there is only 1 instance of May.
there are 2 instances of Jove.
there is only 1 instance of restore.
there is only 1 instance of you.
there is only 1 instance of when.
there is only 1 instance of toils.
there is only 1 instance of are.
there is only 1 instance of o’er.
there is only 1 instance of Safe.
there are 2 instances of to.
there is only