# Advanced Regular Expressions Lab

Complete the following set of exercises to solidify your knowledge of regular expressions.

In [2]:
import re

### 1. Use a regular expression to find and extract all vowels in the following text.

In [3]:
text = "This is going to be a sentence with a good number of vowels in it."

In [5]:
pattern='[aeiou]'
print(re.findall(pattern,text))

['i', 'i', 'o', 'i', 'o', 'e', 'a', 'e', 'e', 'e', 'i', 'a', 'o', 'o', 'u', 'e', 'o', 'o', 'e', 'i', 'i']
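The character class `[aeiou]` is case-sensitive, so it would skip a capitalized vowel at the start of a word. A minimal sketch of one way to cover both cases, assuming the standard `re.IGNORECASE` flag and a made-up sample sentence that is not part of the lab:

```python
import re

# Hypothetical sample sentence (not from the lab) with uppercase vowels.
sample = "An Owl eats Under Ivy."

# Lowercase-only class: capital vowels are skipped.
lowercase_only = re.findall(r'[aeiou]', sample)

# Same class with IGNORECASE: capital vowels are included.
any_case = re.findall(r'[aeiou]', sample, flags=re.IGNORECASE)

print(lowercase_only)  # ['e', 'a', 'e']
print(any_case)        # ['A', 'O', 'e', 'a', 'U', 'e', 'I']
```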


In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


### 2. Use a regular expression to find and extract all occurrences and forms (singular and plural) of the word "puppy" in the text below.

In [6]:
text = "The puppy saw all the rest of the puppies playing and wanted to join them. I saw this and wanted a puppy of my own!"

In [18]:
pattern=r'pupp(?:y|ies)'
print(re.findall(pattern,text))

['puppy', 'puppies', 'puppy']
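When the two word forms are written as an alternation, the kind of group matters: `re.findall` returns the group contents when the pattern has a capturing group, and the whole match otherwise. The sketch below illustrates this on the same sentence; the variable names are chosen only for this example.

```python
import re

text = ("The puppy saw all the rest of the puppies playing and wanted "
        "to join them. I saw this and wanted a puppy of my own!")

# Capturing group: findall returns only what the group matched.
captured = re.findall(r'pupp(y|ies)', text)

# Non-capturing group (?:...): findall returns the full match.
whole = re.findall(r'pupp(?:y|ies)', text)

print(captured)  # ['y', 'ies', 'y']
print(whole)     # ['puppy', 'puppies', 'puppy']
```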


### 3. Use a regular expression to find and extract all tenses (present and past) of the word "run" in the text below.

In [30]:
text = "I ran the relay race the only way I knew how to run it."

In [31]:
pattern='r[ua]n'
print(re.findall(pattern,text))

['ran', 'run']


### 4. Use a regular expression to find and extract all words that begin with the letter "r" from the previous text.

In [34]:
pattern=r'\br\w*'

In [35]:
print(re.findall(pattern,text))

['ran', 'relay', 'race', 'run']
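Anchoring on a word boundary (`\b`) is one way to restrict a match to the start of a word. The sketch below uses a made-up sentence (not from the lab) in which "r" also appears mid-word, to show what the boundary changes.

```python
import re

# Hypothetical sentence (not from the lab) with "r" inside some words.
sample = "The rabbit ran around three rocks."

# Without a boundary, matching can start at an "r" in the middle of a word.
unanchored = re.findall(r'r\w*', sample)

# With \b, matching starts only where a word begins.
anchored = re.findall(r'\br\w*', sample)

print(unanchored)  # ['rabbit', 'ran', 'round', 'ree', 'rocks']
print(anchored)    # ['rabbit', 'ran', 'rocks']
```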


### 5. Use a regular expression to replace each exclamation mark in the text below with the letter "i".

In [37]:
text = "Th!s !s a sentence w!th spec!al characters !n !t."

In [46]:
pattern='!'
print(re.sub(pattern,'i',text))

This is a sentence with special characters in it.
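If the number of replacements is also of interest, `re.subn` returns the new string together with a count. A short sketch on the same sentence; for a fixed single character like this, plain `str.replace` would be a reasonable alternative to a regular expression.

```python
import re

text = "Th!s !s a sentence w!th spec!al characters !n !t."

# subn returns a (new_string, number_of_substitutions) tuple.
fixed, count = re.subn('!', 'i', text)

print(fixed)  # This is a sentence with special characters in it.
print(count)  # 6

# For a literal single character, str.replace gives the same string.
assert text.replace('!', 'i') == fixed
```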


### 6. Use a regular expression to find and extract words longer than 4 characters in the text below.

In [49]:
text = "This sentence has words of varying lengths."

In [51]:
pattern=r'\w{5,}'
print(re.findall(pattern,text))

['sentence', 'words', 'varying', 'lengths']


### 7. Use a regular expression to find and extract all occurrences of the letter "b", some letter(s), and then the letter "t" in the sentence below.

In [52]:
text = "I bet the robot couldn't beat the other bot with a bat, but instead it bit me."

In [56]:
pattern=r'b\w+?t'
print(re.findall(pattern,text))

['bet', 'bot', 'beat', 'bot', 'bat', 'but', 'bit']
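Restricting the middle of the pattern to word characters keeps each match inside a single word. With the dot instead, the choice between a greedy and a lazy quantifier becomes visible, as this sketch on the same sentence suggests:

```python
import re

text = ("I bet the robot couldn't beat the other bot with a bat, "
        "but instead it bit me.")

# Greedy: .+ runs to the end of the line, then backtracks to the last "t".
greedy = re.findall(r'b.+t', text)

# Lazy: .+? stops at the first "t" it can reach.
lazy = re.findall(r'b.+?t', text)

print(greedy)  # one long match from "bet" through "bit"
print(lazy)    # ['bet', 'bot', 'beat', 'bot', 'bat', 'but', 'bit']
```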


### 8. Use a regular expression to find and extract all words that contain either "ea" or "eo" in them.

In [57]:
text = "During many of the peaks and troughs of history, the people living it didn't fully realize what was unfolding. But we all know we're navigating breathtaking history: Nearly every day could be — maybe will be — a book."


In [78]:
pattern=r'\w*e[ao]\w*'
print(re.findall(pattern,text))

['peaks', 'people', 'realize', 'breathtaking', 'Nearly']


### 9. Use a regular expression to find and extract all the capitalized words in the text below individually.

In [79]:
text = "Teddy Roosevelt and Abraham Lincoln walk into a bar."

In [82]:
pattern=r'[A-Z][a-z]*'
print(re.findall(pattern,text))

['Teddy', 'Roosevelt', 'Abraham', 'Lincoln']


### 10. Use a regular expression to find and extract all the sets of consecutive capitalized words in the text above.

In [89]:
pattern='[A-Z][a-z]*[ ][A-Z][a-z]*'
print(re.findall(pattern,text))

['Teddy Roosevelt', 'Abraham Lincoln']
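The pattern above matches exactly two capitalized words. One possible generalization to runs of two or more words is to repeat a non-capturing group; the sample sentence below is made up for illustration and is not part of the lab.

```python
import re

# Hypothetical sentence (not from the lab) with a three-word name.
sample = "Franklin Delano Roosevelt met Winston Churchill in Washington."

# Exactly two capitalized words, as in the exercise.
pairs = re.findall(r'[A-Z][a-z]*[ ][A-Z][a-z]*', sample)

# One capitalized word followed by one or more " Capitalized" repeats.
runs = re.findall(r'[A-Z][a-z]+(?: [A-Z][a-z]+)+', sample)

print(pairs)  # ['Franklin Delano', 'Winston Churchill']
print(runs)   # ['Franklin Delano Roosevelt', 'Winston Churchill']
```

A lone capitalized word such as "Washington" is skipped by both patterns because each requires at least two words.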


### 11. Use a regular expression to find and extract all the quotes from the text below.

*Hint: This one is a little more complex than the single quote example in the lesson because there are multiple quotes in the text.*

In [90]:
text = 'Roosevelt says to Lincoln, "I will bet you $50 I can get the bartender to give me a free drink." Lincoln says, "I am in!"'


In [0]:
pattern=r'"[^"]*"'
print(re.findall(pattern,text))

['"I will bet you $50 I can get the bartender to give me a free drink."', '"I am in!"']
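With several quotes in one string, a greedy `".*"` spans from the first quotation mark to the last. A negated character class is one common way to keep each quote separate, and a capturing group can drop the surrounding marks. A minimal sketch on the same text:

```python
import re

text = ('Roosevelt says to Lincoln, "I will bet you $50 I can get the '
        'bartender to give me a free drink." Lincoln says, "I am in!"')

# Greedy: a single match running from the first quote mark to the last.
greedy = re.findall(r'".*"', text)

# Negated class plus a capturing group: each quote, without the marks.
quotes = re.findall(r'"([^"]*)"', text)

print(len(greedy))  # 1
print(quotes)
```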

### 12. Use a regular expression to find and extract all the numbers from the text below.

In [91]:
text = "There were 30 students in the class. Of the 30 students, 14 were male and 16 were female. Only 10 students got A's on the exam."


In [104]:
pattern=r'\d+'
print(re.findall(pattern,text))

['30', '30', '14', '16', '10']
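`re.findall` returns strings, so the matches need converting before any arithmetic. The sketch below does that for the lab text, and also shows one possible pattern for decimal values on a made-up sentence that is not part of the lab.

```python
import re

text = ("There were 30 students in the class. Of the 30 students, 14 were "
        "male and 16 were female. Only 10 students got A's on the exam.")

# Convert each matched digit string to an int.
counts = [int(n) for n in re.findall(r'\d+', text)]
print(counts)  # [30, 30, 14, 16, 10]

# Hypothetical sentence (not from the lab) containing a decimal value.
sample = "The class average was 87.5 out of 100."

# Digits, optionally followed by a decimal point and more digits.
values = re.findall(r'\d+(?:\.\d+)?', sample)
print(values)  # ['87.5', '100']
```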


### 13. Use a regular expression to find and extract all the social security numbers from the text below.

In [110]:
text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

In [142]:
pattern=r'\d{3}-\d{2}-\d{4}'
print(re.findall(pattern,text))

['876-93-2289', '098-32-5295']


### 14. Use a regular expression to find and extract all the phone numbers from the same text.

In [141]:
pattern=r'\(\d{3}\)\d{3}-\d{4}'
print(re.findall(pattern,text))

['(847)789-0984', '(987)222-0901']


### 15. Use a regular expression to find and extract all the formatted numbers (both social security and phone) from the text.

In [143]:
pattern=r'\d{3}-\d{2}-\d{4}|\(\d{3}\)\d{3}-\d{4}'
print(re.findall(pattern,text))

['876-93-2289', '(847)789-0984', '098-32-5295', '(987)222-0901']
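A combined pattern returns both kinds of number in one flat list. If it matters which kind each match is, one option is to name the alternatives and read `lastgroup` from each match object. This is a sketch; the group names `ssn` and `phone` are chosen only for this example.

```python
import re

text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

# Each alternative is a named group; only one of them matches at a time.
pattern = re.compile(
    r'(?P<ssn>\d{3}-\d{2}-\d{4})|(?P<phone>\(\d{3}\)\d{3}-\d{4})'
)

# lastgroup holds the name of the group that matched.
labeled = [(m.lastgroup, m.group()) for m in pattern.finditer(text)]

for kind, value in labeled:
    print(kind, value)
```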
