# Advanced Regular Expressions Lab

Complete the following set of exercises to solidify your knowledge of regular expressions.

In [2]:
import re

### 1. Use a regular expression to find and extract all vowels in the following text.

In [3]:
text = "This is going to be a sentence with a good number of vowels in it."

In [4]:
regex_pat = "[aeiou]"
result = re.findall(regex_pat,text)
print(result)

['i', 'i', 'o', 'i', 'o', 'e', 'a', 'e', 'e', 'e', 'i', 'a', 'o', 'o', 'u', 'e', 'o', 'o', 'e', 'i', 'i']
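
The class `[aeiou]` is lowercase only, so a capital vowel would be skipped. A minimal sketch of how `re.IGNORECASE` widens the match, using a hypothetical `sample` string that is not part of the lab:

```python
import re

# Hypothetical sample text (not from the lab) containing uppercase vowels.
sample = "An Owl ate Ice cream."

# The lowercase-only class misses 'A', 'O', and 'I'.
lower_only = re.findall(r"[aeiou]", sample)

# re.IGNORECASE makes the same class match both cases.
any_case = re.findall(r"[aeiou]", sample, flags=re.IGNORECASE)

print(lower_only)  # ['a', 'e', 'e', 'e', 'a']
print(any_case)    # ['A', 'O', 'a', 'e', 'I', 'e', 'e', 'a']
```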


### 2. Use a regular expression to find and extract all occurrences and forms (singular and plural) of the word "puppy" in the text below.

In [5]:
text = "The puppy saw all the rest of the puppies playing and wanted to join them. I saw this and wanted a puppy of my own!"

In [6]:
regex_pat = r'pupp(?:y|ies)'  # "pupp" followed by either "y" or "ies"
result = re.findall(regex_pat,text)


In [7]:
print(result)

['puppy', 'puppies', 'puppy']


### 3. Use a regular expression to find and extract all tenses (present and past) of the word "run" in the text below.

In [8]:
text = "I ran the relay race the only way I knew how to run it."

In [9]:
regex_pat = 'r[au]n'
result = re.findall(regex_pat,text)


In [10]:
print(result)

['ran', 'run']


### 4. Use a regular expression to find and extract all words that begin with the letter "r" from the previous text.

In [11]:
regex_pat2 = r'\br\w*'  # \b anchors at a word boundary without capturing the space
result = re.findall(regex_pat2,text)


In [12]:
print(result)

['ran', 'relay', 'race', 'run']
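
One way to see the difference between a whitespace prefix (`\s`) and a word boundary (`\b`) when anchoring a pattern at the start of a word, sketched on a hypothetical `sample` string that is not part of the lab:

```python
import re

# Hypothetical sample (not from the lab): the very first word starts with "r".
sample = "rain ruined the first race"

# Requiring leading whitespace skips "rain" and keeps the space in each match.
with_space = re.findall(r"\sr\w+", sample)

# \b is a zero-width word boundary, so nothing extra is captured.
with_boundary = re.findall(r"\br\w+", sample)

print(with_space)     # [' ruined', ' race']
print(with_boundary)  # ['rain', 'ruined', 'race']
```

Neither pattern picks up the "r" inside "first", because that letter does not sit at the start of a word.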


### 5. Use a regular expression to find and substitute the letter "i" for the exclamation marks in the text below.

In [13]:
text = "Th!s !s a sentence w!th spec!al characters !n !t."

In [14]:
regex_pat = "!"
result = re.sub(regex_pat, "i",text)


In [15]:
print(result)

This is a sentence with special characters in it.


### 6. Use a regular expression to find and extract words longer than 4 characters in the text below.

In [16]:
text = "This sentence has words of varying lengths."

In [17]:
regex_pat = r"\w{5,}"  # "longer than 4 characters" means at least 5
result = re.findall(regex_pat,text)


In [18]:
print(result)

['sentence', 'words', 'varying', 'lengths']


### 7. Use a regular expression to find and extract all occurrences of the letter "b", some letter(s), and then the letter "t" in the sentence below.

In [19]:
text = "I bet the robot couldn't beat the other bot with a bat, but instead it bit me."

In [20]:
regex_pat = r"\w*b\w+t"  # raw string so that \w reaches the regex engine unchanged
result = re.findall(regex_pat,text)

In [21]:
print(result)

['bet', 'robot', 'beat', 'bot', 'bat', 'but', 'bit']
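
The exercise wording leaves room for more than one reading. The sketch below compares three plausible patterns on the same sentence; which one is "right" depends on whether partial words such as the "bot" inside "robot" should count.

```python
import re

text = "I bet the robot couldn't beat the other bot with a bat, but instead it bit me."

# "b", one or more word characters, then "t": may start in the middle of a word.
inner = re.findall(r"b\w+t", text)

# Optional leading word characters pull in the rest of the word ("robot").
whole = re.findall(r"\w*b\w+t", text)

# Word boundaries on both sides keep only words that start with "b" and end with "t".
bounded = re.findall(r"\bb\w+t\b", text)

print(inner)    # ['bet', 'bot', 'beat', 'bot', 'bat', 'but', 'bit']
print(whole)    # ['bet', 'robot', 'beat', 'bot', 'bat', 'but', 'bit']
print(bounded)  # ['bet', 'beat', 'bot', 'bat', 'but', 'bit']
```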


### 8. Use a regular expression to find and extract all words that contain either "ea" or "eo" in them.

In [22]:
text = "During many of the peaks and troughs of history, the people living it didn't fully realize what was unfolding. But we all know we're navigating breathtaking history: Nearly every day could be — maybe will be — a book."


In [23]:
regex_pat = r"\w*e[ao]\w*"  # inside [], "|" is a literal character, so [ao] is the correct class
result = re.findall(regex_pat,text)

In [24]:
print(result)

['peaks', 'people', 'realize', 'breathtaking', 'Nearly']


### 9. Use a regular expression to find and extract all the capitalized words in the text below individually.

In [25]:
text = "Teddy Roosevelt and Abraham Lincoln walk into a bar."

In [26]:
regex_pat = r"\b[A-Z]\w*"  # a capital letter at a word boundary, then the rest of the word
result = re.findall(regex_pat,text)

In [27]:
print(result)

['Teddy', 'Roosevelt', 'Abraham', 'Lincoln']


### 10. Use a regular expression to find and extract all the sets of consecutive capitalized words in the text above.

In [28]:
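regex_pat2 = r"\b[A-Z]\w*(?:\s[A-Z]\w*)+"  # a capitalized word followed by one or more capitalized words
result2 = re.findall(regex_pat2,text)

In [29]:
print(result2)

['Teddy Roosevelt', 'Abraham Lincoln']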


### 11. Use a regular expression to find and extract all the quotes from the text below.

*Hint: This one is a little more complex than the single quote example in the lesson because there are multiple quotes in the text.*

In [91]:
text = 'Roosevelt says to Lincoln, "I will bet you $50 I can get the bartender to give me a free drink." Lincoln says, "I am in!"'


In [92]:
regex_pat = r'"[^"]*"'  # an opening quote, any run of non-quote characters, a closing quote
result = re.findall(regex_pat,text)

In [93]:
print(result)

['"I will bet you $50 I can get the bartender to give me a free drink."', '"I am in!"']
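
Because the text holds more than one quotation, greediness matters. The sketch below compares a greedy and a lazy quantifier on the same text. The lazy form `.*?` stops at the first closing quote it can reach.

```python
import re

text = 'Roosevelt says to Lincoln, "I will bet you $50 I can get the bartender to give me a free drink." Lincoln says, "I am in!"'

# Greedy: .* runs to the LAST double quote, so both quotations merge into one match.
greedy = re.findall(r'".*"', text)

# Lazy: .*? stops at the NEXT double quote, giving one match per quotation.
lazy = re.findall(r'".*?"', text)

print(len(greedy))  # 1
print(lazy)
```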


### 12. Use a regular expression to find and extract all the numbers from the text below.

In [49]:
text = "There were 30 students in the class. Of the 30 students, 14 were male and 16 were female. Only 10 students got A's on the exam."


In [50]:
regex_pat = r'\d+' # One or more digits, so numbers of any length are matched
result = re.findall(regex_pat, text)

In [51]:
print(result)

['30', '30', '14', '16', '10']
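
A fixed width such as `\d{2}` only works when every number has exactly two digits. This sketch uses a hypothetical `sample` string (not from the lab) to show what happens with mixed lengths, and how the matches could be converted to integers:

```python
import re

# Hypothetical sample (not from the lab) with 1-, 2-, and 3-digit numbers.
sample = "Room 7 holds 120 students and 45 desks."

# Exactly two digits: drops "7" and truncates "120" to "12".
fixed_width = re.findall(r"\d{2}", sample)

# One or more digits: each number is captured whole.
any_width = re.findall(r"\d+", sample)

# findall returns strings; convert them when numeric values are needed.
as_ints = [int(n) for n in any_width]

print(fixed_width)  # ['12', '45']
print(any_width)    # ['7', '120', '45']
print(as_ints)      # [7, 120, 45]
```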


### 13. Use a regular expression to find and extract all the social security numbers from the text below.

In [72]:
text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

In [73]:
regex_pat = r'\d{3}-\d{2}-\d{4}' #Three digits, two digits and four digits, separated by hyphens
result = re.findall(regex_pat, text)


In [74]:
print(result)

['876-93-2289', '098-32-5295']


### 14. Use a regular expression to find and extract all the phone numbers from the text above.

In [75]:


regex_pat = r'\(\d{3}\)\d{3}-\d{4}' #Escaped parentheses around the area code, then three digits, a hyphen and four digits
result = re.findall(regex_pat, text)

In [76]:
print(result)

['(847)789-0984', '(987)222-0901']


### 15. Use a regular expression to find and extract all the formatted numbers (both social security and phone) from the text above.

In [79]:
regex_pat = r'\d{3}-\d{2}-\d{4}|\(\d{3}\)\d{3}-\d{4}' # Patterns 13 and 14 joined with the "|" (or) operator; no spaces around it, since spaces would be matched literally
result = re.findall(regex_pat, text)

In [80]:
print(result)

['876-93-2289', '(847)789-0984', '098-32-5295', '(987)222-0901']
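
Whitespace inside a pattern is matched literally unless `re.VERBOSE` is set. The sketch below first puts spaces around `|`, then uses `re.VERBOSE` to lay out the same alternation readably without changing what it matches.

```python
import re

text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

# Spaces around "|" become part of each alternative, so they appear in the matches.
spaced = re.findall(r"\d{3}-\d{2}-\d{4} | \(\d{3}\)\d{3}-\d{4}", text)

# With re.VERBOSE, whitespace and "#" comments in the pattern are ignored.
verbose_pat = re.compile(r"""
    \d{3}-\d{2}-\d{4}        # social security number
    |                        # or
    \(\d{3}\)\d{3}-\d{4}     # phone number with a parenthesized area code
""", re.VERBOSE)
clean = verbose_pat.findall(text)

print(spaced)  # ['876-93-2289 ', ' (847)789-0984', '098-32-5295 ', ' (987)222-0901']
print(clean)   # ['876-93-2289', '(847)789-0984', '098-32-5295', '(987)222-0901']
```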
