# Advanced Regular Expressions Lab

Complete the following set of exercises to solidify your knowledge of regular expressions.

In [114]:
import re

### 1. Use a regular expression to find and extract all vowels in the following text.

In [115]:
text = "This is going to be a sentence with a good number of vowels in it."

In [116]:
x = re.findall("[aeiou]", text, flags=re.IGNORECASE)
print(x)
# The IGNORECASE flag is not needed for this text, but it is included in case the text contains uppercase vowels.

['i', 'i', 'o', 'i', 'o', 'e', 'a', 'e', 'e', 'e', 'i', 'a', 'o', 'o', 'u', 'e', 'o', 'o', 'e', 'i', 'i']


### 2. Use a regular expression to find and extract all occurrences and forms (singular and plural) of the word "puppy" in the text below.

In [117]:
text = "The puppy saw all the rest of the puppies playing and wanted to join them. I saw this and wanted a puppy of my own!"

In [118]:
x = re.findall("puppy|puppies", text,flags=re.I)
print(x)

['puppy', 'puppies', 'puppy']
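
As a side note, the alternation can also be factored into a shared stem. The sketch below is illustrative only (the names `grouped` and `captured` are introduced here and are not part of the lab). It also shows why the group should be non-capturing when used with `re.findall`, which returns the group contents instead of the whole match whenever the pattern contains a capturing group.

```python
import re

text = "The puppy saw all the rest of the puppies playing and wanted to join them. I saw this and wanted a puppy of my own!"

# Non-capturing group (?:...): findall returns each whole match.
grouped = re.findall(r"pupp(?:y|ies)", text, flags=re.I)
print(grouped)   # ['puppy', 'puppies', 'puppy']

# Capturing group (...): findall returns only what the group matched.
captured = re.findall(r"pupp(y|ies)", text, flags=re.I)
print(captured)  # ['y', 'ies', 'y']
```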


### 3. Use a regular expression to find and extract all tenses (present and past) of the word "run" in the text below.

In [119]:
text = "I ran the relay race the only way I knew how to run it."

In [120]:
# r[au]n matches only "ran" and "run"; the \b anchors keep it from matching inside longer words.
x = re.findall(r"\br[au]n\b", text, flags=re.I)
print(x)

['ran', 'run']
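
A wildcard pattern such as `r.n` happens to give the right answer on this sentence, but it accepts any character between "r" and "n". The following sketch compares it with a stricter pattern on a made-up sentence (the `sample` string and the names `loose` and `strict` are assumptions for illustration, not part of the lab).

```python
import re

# Hypothetical sentence chosen to expose false positives.
sample = "Ron ran while the iron runner watched him run."

# Any character between "r" and "n", anywhere in a word.
loose = re.findall(r"r.n", sample, flags=re.I)
print(loose)   # ['Ron', 'ran', 'ron', 'run', 'run']

# Only "a" or "u" between "r" and "n", and only as a whole word.
strict = re.findall(r"\br[au]n\b", sample, flags=re.I)
print(strict)  # ['ran', 'run']
```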


### 4. Use a regular expression to find and extract all words that begin with the letter "r" from the previous text.

In [121]:
text = "I ran the relay race the only way I knew how to run it."

In [124]:
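# \b anchors the match at the start of a word; \w* takes the rest of the word without trailing whitespace.
x = re.findall(r'\br\w*', text, re.IGNORECASE)
print(x)

['ran', 'relay', 'race', 'run']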


### 5. Use a regular expression to find every letter "i" in the text below and replace it with an exclamation mark.

In [126]:
text = "This is a sentence with special characters in it."

In [127]:
x = re.sub("i", "!", text)
print(x)

Th!s !s a sentence w!th spec!al characters !n !t.


### 6. Use a regular expression to find and extract words longer than 4 characters in the text below.

In [139]:
text = "This sentence has words of varying lengths."

In [178]:
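# {5,} means "5 or more"; the closing \b also matches before punctuation, so "lengths" is included.
x = re.findall(r'\b[a-z]{5,}\b', text, re.IGNORECASE)
print(x)
# For comparison, the words of 4 characters or fewer:
y = re.findall(r'\b[a-z]{1,4}\b', text, re.IGNORECASE)
print(y)

['sentence', 'words', 'varying', 'lengths']
['This', 'has', 'of']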
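
One detail worth checking in this exercise is the last word, which is followed by a period and not by whitespace. The sketch below compares a pattern that ends in `\s` with one that ends in `\b` (the names `with_space` and `with_boundary` are introduced here for illustration).

```python
import re

text = "This sentence has words of varying lengths."

# Requiring trailing whitespace skips a word followed by punctuation
# and leaves the space inside each match.
with_space = re.findall(r"\b[a-z]{5,}\s", text, flags=re.I)
print(with_space)     # ['sentence ', 'words ', 'varying ']

# A word boundary is zero-width, so it works before a space or a period.
with_boundary = re.findall(r"\b[a-z]{5,}\b", text, flags=re.I)
print(with_boundary)  # ['sentence', 'words', 'varying', 'lengths']
```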


### 7. Use a regular expression to find and extract all occurrences of the letter "b", some letter(s), and then the letter "t" in the sentence below.

In [180]:
text = "I bet the robot couldn't beat the other bot with a bat, but instead it bit me."

In [194]:
x = re.findall('b[a-z]+t', text, re.IGNORECASE)
print(x)


['bet', 'bot', 'beat', 'bot', 'bat', 'but', 'bit']


### 8. Use a regular expression to find and extract all words that contain either "ea" or "eo" in them.

In [195]:
text = "During many of the peaks and troughs of history, the people living it didn't fully realize what was unfolding. But we all know we're navigating breathtaking history: Nearly every day could be — maybe will be — a book."


In [211]:
x = re.findall(r'\b[a-z]*e[ao][a-z]*\b', text, flags=re.I)
print(x)
# The character class in e[ao] covers both "ea" and "eo", and \b handles words followed by punctuation.

['peaks', 'people', 'realize', 'breathtaking', 'Nearly']


### 9. Use a regular expression to find and extract all the capitalized words in the text below individually.

In [217]:
text = "Teddy Roosevelt and Abraham Lincoln walk into a bar."

In [222]:
x = re.findall("[A-Z][a-z]*", text)
print(x)

['Teddy', 'Roosevelt', 'Abraham', 'Lincoln']


### 10. Use a regular expression to find and extract all the sets of consecutive capitalized words in the text above.

In [223]:
x = re.findall("[A-Z][a-z]* [A-Z][a-z]*", text)
print(x)
# Note that I am not ignoring the case anymore. 

['Teddy Roosevelt', 'Abraham Lincoln']
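
The two-word pattern can be generalized to runs of any length by repeating a non-capturing group. This is a minimal sketch on a made-up sentence (the `sample` string and the names `pairs` and `runs` are assumptions, not part of the lab).

```python
import re

# Hypothetical sentence with a three-word name.
sample = "Teddy Roosevelt and Martin Luther King walk into a bar."

# Exactly two capitalized words: the third word of a longer name is dropped.
pairs = re.findall(r"[A-Z][a-z]* [A-Z][a-z]*", sample)
print(pairs)  # ['Teddy Roosevelt', 'Martin Luther']

# One capitalized word followed by one or more further capitalized words.
runs = re.findall(r"[A-Z][a-z]*(?: [A-Z][a-z]*)+", sample)
print(runs)   # ['Teddy Roosevelt', 'Martin Luther King']
```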


### 11. Use a regular expression to find and extract all the quotes from the text below.

*Hint: This one is a little more complex than the single quote example in the lesson because there are multiple quotes in the text.*

In [231]:
text = 'Roosevelt says to Lincoln, "I will bet you $50 I can get the bartender to give me a free drink." Lincoln says, "I am in!"'


In [232]:
# [^"]* matches everything up to the next double quote, so each quote is extracted separately.
x = re.findall(r'"[^"]*"', text)
print(x)

['"I will bet you $50 I can get the bartender to give me a free drink."', '"I am in!"']
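
To see why the choice of quantifier matters for this exercise, the sketch below compares a greedy pattern, a lazy pattern, and a negated character class on the same text (the names `greedy`, `lazy`, and `negated` are introduced here for illustration).

```python
import re

text = 'Roosevelt says to Lincoln, "I will bet you $50 I can get the bartender to give me a free drink." Lincoln says, "I am in!"'

# Greedy: .* runs to the LAST double quote, merging both quotes into one match.
greedy = re.findall(r'".*"', text)
print(len(greedy))  # 1

# Lazy: .*? stops at the first closing double quote it can.
lazy = re.findall(r'".*?"', text)
print(len(lazy))    # 2

# Negated class: same result here, without relying on backtracking.
negated = re.findall(r'"[^"]*"', text)
print(negated == lazy)  # True
```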


### 12. Use a regular expression to find and extract all the numbers from the text below.

In [234]:
text = "There were 30 students in the class. Of the 30 students, 14 were male and 16 were female. Only 10 students got A's on the exam."


In [235]:
x = re.findall('[0-9]+', text)
print(x)

['30', '30', '14', '16', '10']


### 13. Use a regular expression to find and extract all the social security numbers from the text below.

In [240]:
text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

In [238]:
x = re.findall('[0-9]{3}-[0-9]{2}-[0-9]{4}', text)
print(x)

['876-93-2289', '098-32-5295']


### 14. Use a regular expression to find and extract all the phone numbers from the text above.

In [243]:
# A raw string keeps \( and \) from being treated as invalid string escapes by Python.
x = re.findall(r'\([0-9]{3}\)[0-9]{3}-[0-9]{4}', text)
print(x)

['(847)789-0984', '(987)222-0901']


### 15. Use a regular expression to find and extract all the formatted numbers (both social security and phone) from the text above.

In [244]:
x = re.findall(r'\([0-9]{3}\)[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{2}-[0-9]{4}', text)
print(x)

['876-93-2289', '(847)789-0984', '098-32-5295', '(987)222-0901']
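
If it is also useful to know which kind of number each match is, one possible approach is to name the two alternatives and read `lastgroup` from each match object. This is a sketch only; the group names `phone` and `ssn` and the variable `labeled` are assumptions introduced for illustration.

```python
import re

text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

# Each alternative sits in its own named group.
pattern = re.compile(
    r"(?P<phone>\(\d{3}\)\d{3}-\d{4})|(?P<ssn>\d{3}-\d{2}-\d{4})"
)

# m.lastgroup is the name of the group that matched; m.group() is the full match.
labeled = [(m.lastgroup, m.group()) for m in pattern.finditer(text)]
for kind, number in labeled:
    print(kind, number)
```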
