# Advanced Regular Expressions Lab

Complete the following set of exercises to solidify your knowledge of regular expressions.

## Challenge 1 - Regular Expressions

Sometimes, we would like to perform more complex manipulations of our string. This is where regular expressions come in handy. In the cell below, return all characters that are upper case from the string specified below.

In [177]:
import re

poem = """The apparition of these faces in the crowd;
Petals on a wet, black bough."""

# Your code here:

In [181]:
re.findall(r"[A-Z]",poem)

['T', 'P']

In the cell below, filter the list provided and return all elements of the list containing a number. To filter the list, use the re.search function and keep an element only if the function does not return None. You can read more about the re.search function here: https://docs.python.org/3/library/re.html

In [185]:
data = ['123abc', 'abc123', 'JohnSmith1', 'ABBY4', 'JANE']

# Your code here:

In [187]:
[i for i in data if re.search(r"\d", i)]

['123abc', 'abc123', 'JohnSmith1', 'ABBY4']

## Challenge 2 - Regular Expressions II


In the cell below, filter the list provided to keep only strings containing at least one digit and at least one lower case letter. As in the previous question, use the re.search function and check that the result is not None.

To read more about regular expressions, check out [this link](https://developers.google.com/edu/python/regular-expressions).

In [175]:
data = ['123abc', 'abc123', 'JohnSmith1', 'ABBY4', 'JANE']
# Your code here:

In [188]:
[i for i in data if re.search(r"[a-z]+\d+|\d+[a-z]+", i)]

['123abc', 'abc123', 'JohnSmith1']
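The alternation above only matches when the digits and the lowercase letters sit directly next to each other, which happens to hold for this list. Below is a minimal sketch of an order-independent check, assuming two lookaheads are acceptable for the exercise. The extra strings `'a-1'` and `'1_B_c'` are hypothetical and not part of the lab data.

```python
import re

# Lab data plus two hypothetical strings where the digit and the
# lowercase letter are not adjacent.
samples = ['123abc', 'abc123', 'JohnSmith1', 'ABBY4', 'JANE', 'a-1', '1_B_c']

# Each lookahead scans forward from the same position, so the order and
# spacing of the digit and the lowercase letter do not matter.
pattern = re.compile(r"(?=.*\d)(?=.*[a-z])")

kept = [s for s in samples if pattern.search(s) is not None]
print(kept)
```

Running two separate `re.search` calls (one for `\d`, one for `[a-z]`) and combining them with `and` would behave the same way and may be easier to read.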

## Challenge 3 - Advanced Regular Expressions
Complete the following set of exercises to solidify your knowledge of regular expressions.

In [176]:
import re

### 1. Use a regular expression to find and extract all vowels in the following text.

In [2]:
text = "This is going to be a sentence with a good number of vowels in it."

In [24]:
re.findall("[aeiou]",text)

['i',
 'i',
 'o',
 'i',
 'o',
 'e',
 'a',
 'e',
 'e',
 'e',
 'i',
 'a',
 'o',
 'o',
 'u',
 'e',
 'o',
 'o',
 'e',
 'i',
 'i']

### 2. Use a regular expression to find and extract all occurrences and forms (singular and plural) of the word "puppy" in the text below.

In [190]:
text = "The puppy saw all the rest of the puppies playing and wanted to join them. I saw this and wanted a puppy of my own!"

In [51]:
re.findall(r"(?:[Pp][uppyies]+)+",text)

['puppy', 'puppies', 'puppy']

In [191]:
re.findall(r"pupp.*?\b",text)

['puppy', 'puppies', 'puppy']
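Both patterns above return the expected list for this text, but the character class `[uppyies]+` accepts any mix of those letters. As a hedged alternative, the sketch below spells out the two endings explicitly; the second sample sentence is hypothetical and only there to show the difference.

```python
import re

text = ("The puppy saw all the rest of the puppies playing and wanted to "
        "join them. I saw this and wanted a puppy of my own!")

# "pupp" followed by exactly one of the two endings; \b keeps the match
# from starting or stopping in the middle of a longer word.
strict = re.compile(r"\bpupp(?:y|ies)\b", re.IGNORECASE)
print(strict.findall(text))

# The looser character-class pattern also matches unrelated words.
loose = re.compile(r"(?:[Pp][uppyies]+)+")
print(loose.findall("Pies for the puppy"))   # hypothetical sentence
print(strict.findall("Pies for the puppy"))
```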

In [52]:
#laugh="Hahahahahaha and hyperactive"
#re.findall(r"(?:[Hh][aeioy]+)+",laugh)

### 3. Use a regular expression to find and extract all tenses (present and past) of the word "run" in the text below.

In [193]:
text = "I ran the relay race the only way I knew how to run it."

In [58]:
# First attempt: without a trailing \b, this also matches the "ra" in "race".
re.findall(r"(?:[Rr][aun]+)+",text)

['ran', 'ra', 'run']

In [61]:
re.findall(r"(?:[Rr][aun]+)+\b",text)

['ran', 'run']

In [194]:
re.findall(r"r[au]n",text)

['ran', 'run']

### 4. Use a regular expression to find and extract all words that begin with the letter "r" from the previous text.

In [71]:
re.findall(r"([Rr][a-z]+)",text)

['ran', 'relay', 'race', 'run']

In [196]:
re.findall(r"r[a-z]+",text)

['ran', 'relay', 'race', 'run']
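Neither pattern above anchors the "r" to the start of a word; they work here only because no word in this sentence contains an "r" in the middle. A small sketch of the difference, using a hypothetical sentence that is not part of the lab:

```python
import re

sample = "The runner wore green"  # hypothetical sentence, not from the lab

# Without an anchor, the pattern starts matching at any "r".
loose_matches = re.findall(r"r[a-z]+", sample)
print(loose_matches)

# \b requires the "r" to sit at the start of a word.
anchored_matches = re.findall(r"\b[Rr][a-z]*", sample)
print(anchored_matches)

# The anchored pattern gives the same result on the lab sentence.
lab_text = "I ran the relay race the only way I knew how to run it."
lab_matches = re.findall(r"\b[Rr][a-z]*", lab_text)
print(lab_matches)
```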

### 5. Use a regular expression to find and substitute the letter "i" for the exclamation marks in the text below.

In [88]:
text = "Th!s !s a sentence w!th spec!al characters !n !t."

In [102]:
re.sub("!","i", text)

'This is a sentence with special characters in it.'

In [99]:
text.replace("!","i")

'This is a sentence with special characters in it.'

### 6. Use a regular expression to find and extract words longer than 4 characters in the text below.

In [108]:
text = "This sentence has words of varying lengths."

In [112]:
re.findall(r"(\b[a-zA-Z]{5,}\b)",text)

['sentence', 'words', 'varying', 'lengths']

In [117]:
re.findall(r"(\w{5,})",text)

['sentence', 'words', 'varying', 'lengths']

### 7. Use a regular expression to find and extract all occurrences of the letter "b", some letter(s), and then the letter "t" in the sentence below.

In [197]:
text = "I bet the robot couldn't beat the other bot with a bat, but instead it bit me."

In [214]:
re.findall(r"b[a-z]+?t",text)

['bet', 'bot', 'beat', 'bot', 'bat', 'but', 'bit']

### 8. Use a regular expression to find and extract all words that contain either "ea" or "eo" in them.

In [218]:
text = "During many of the peaks and troughs of history, the people living it didn't fully realize what was unfolding. But we all know we're navigating breathtaking history: Nearly every day could be — maybe will be — a book."


In [232]:
re.findall(r"[a-zA-Z]*e[ao][a-zA-Z]*",text)

['peaks', 'people', 'realize', 'breathtaking', 'Nearly']

### 9. Use a regular expression to find and extract all the capitalized words in the text below individually.

In [236]:
text = "Teddy Roosevelt and Abraham Lincoln walk into a bar."

In [245]:
re.findall(r"[A-Z][A-Za-z]+",text)

['Teddy', 'Roosevelt', 'Abraham', 'Lincoln']

### 10. Use a regular expression to find and extract all the sets of consecutive capitalized words in the text above.

In [244]:
re.findall(r"[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)+",text)

['Teddy Roosevelt', 'Abraham Lincoln']

### 11. Use a regular expression to find and extract all the quotes from the text below.

*Hint: This one is a little more complex than the single quote example in the lesson because there are multiple quotes in the text.*

In [155]:
text = 'Roosevelt says to Lincoln, "I will bet you $50 I can get the bartender to give me a free drink." Lincoln says, "I am in!"'


In [159]:
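# The lazy quantifier .*? stops at the nearest closing quote, so each quote is captured separately.
re.findall(r'"(.*?)"',text)

['I will bet you $50 I can get the bartender to give me a free drink.', 'I am in!']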

### 12. Use a regular expression to find and extract all the numbers from the text below.

In [248]:
text = "There were 30 students in the class. Of the 30 students, 14 were male and 16 were female. Only 10 students got A's on the exam."


In [250]:
re.findall(r"\d+",text)

['30', '30', '14', '16', '10']

### 13. Use a regular expression to find and extract all the social security numbers from the text below.

In [169]:
text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

In [167]:
# Three digits, two digits, four digits, separated by hyphens.
re.findall(r"\b\d{3}-\d{2}-\d{4}\b",text)

['876-93-2289', '098-32-5295']

### 14. Use a regular expression to find and extract all the phone numbers from the text below.

In [171]:
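# Area code in escaped parentheses, then three digits, a hyphen, and four digits.
re.findall(r"\(\d{3}\)\d{3}-\d{4}",text)

['(847)789-0984', '(987)222-0901']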

### 15. Use a regular expression to find and extract all the formatted numbers (both social security and phone) from the text below.
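No text or answer cell is given for this exercise. The sketch below is one possible approach, assuming "the text below" refers to the same `text` used in exercises 13 and 14. The names `ssn`, `phone`, and `formatted` are illustrative only.

```python
import re

# Assumption: same text as in exercises 13 and 14.
text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

ssn = r"\d{3}-\d{2}-\d{4}"        # e.g. 876-93-2289
phone = r"\(\d{3}\)\d{3}-\d{4}"   # e.g. (847)789-0984

# Alternation tries both formats at each position; findall returns the
# matches in the order they appear in the text.
formatted = re.findall(f"{ssn}|{phone}", text)
print(formatted)
```

Keeping the two formats in separate variables makes it easy to reuse each one on its own for exercises 13 and 14.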