# Advanced Regular Expressions Lab

Complete the following set of exercises to solidify your knowledge of regular expressions.

In [1]:
import re

### 1. Use a regular expression to find and extract all vowels in the following text.

In [2]:
text = "This is going to be a sentence with a good number of vowels in it."

In [3]:
vowel = re.compile(r'[aeiou]')
vowel.findall(text)  # findall returns every match as a list

['i', 'i', 'o', 'i', 'o', 'e', 'a', 'e', 'e', 'e', 'i', 'a', 'o', 'o', 'u', 'e', 'o', 'o', 'e', 'i', 'i']
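
As a hedged aside on extraction versus removal: `findall` returns the matches, while `sub` with an empty replacement deletes them from the text. The sketch below runs both on the same sentence. The `re.IGNORECASE` flag and the names `found` and `stripped` are illustrative assumptions, not part of the lab.

```python
import re

text = "This is going to be a sentence with a good number of vowels in it."

# IGNORECASE is an assumption: it would also catch uppercase vowels.
vowel = re.compile(r'[aeiou]', re.IGNORECASE)

found = vowel.findall(text)     # list of the matched vowels
stripped = vowel.sub('', text)  # the text with every vowel removed

print(len(found))
print(stripped)
```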

### 2. Use a regular expression to find and extract all occurrences and forms (singular and plural) of the word "puppy" in the text below.

In [4]:
text = "The puppy saw all the rest of the puppies playing and wanted to join them. I saw this and wanted a puppy of my own!"

In [5]:
puppy = re.compile(r'pupp\S+')  # \S+ matches the rest of the word: every non-whitespace character to the right of "pupp"
puppy.findall(text)

['puppy', 'puppies', 'puppy']
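
A pattern ending in `\S+` also swallows trailing punctuation and matches unrelated words that share the prefix. A minimal sketch of a stricter alternative, assuming only the two forms "puppy" and "puppies" are wanted; the `sample` sentence and the names `loose` and `strict` are made up for illustration.

```python
import re

# Hypothetical sample with punctuation and a look-alike word.
sample = "My puppy, a puppet, and two puppies."

loose = re.compile(r'pupp\S+')              # prefix plus any non-whitespace run
strict = re.compile(r'\bpupp(?:y|ies)\b')   # exactly "puppy" or "puppies"

loose_hits = loose.findall(sample)
strict_hits = strict.findall(sample)

print(loose_hits)
print(strict_hits)
```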

### 3. Use a regular expression to find and extract all tenses (present and past) of the word "run" in the text below.

In [6]:
text = "I ran the relay race the only way I knew how to run it."

In [7]:
run = re.compile(r'r[ua]n')  # [ua] accepts either vowel, covering "run" and "ran"
run.findall(text)

['ran', 'run']

### 4. Use a regular expression to find and extract all words that begin with the letter "r" from the previous text.

In [8]:
r_words = re.compile(r'\br\w*')  # \b anchors the "r" to the start of a word
r_words.findall(text)

['ran', 'relay', 'race', 'run']
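
The following sketch illustrates why a word boundary matters when the goal is words that begin with a given letter. The `sample` sentence is a made-up illustration, chosen because it contains an "r" in the middle and at the end of words.

```python
import re

# Hypothetical sample, not part of the lab text.
sample = "The river ran near our barn."

# Without \b the pattern starts matching at any "r", even inside a word.
unanchored = re.findall(r'r\w*', sample)

# With \b only an "r" at the start of a word can begin a match.
anchored = re.findall(r'\br\w*', sample)

print(unanchored)
print(anchored)
```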

### 5. Use a regular expression to find and substitute the letter "i" for the exclamation marks in the text below.

In [9]:
text = "Th!s !s a sentence w!th spec!al characters !n !t."

In [10]:
exclamation = re.compile('!')
exclamation.sub('i', text)  # a plain replacement string is enough here

'This is a sentence with special characters in it.'
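
When it helps to know how many substitutions were made, `re.subn` returns the new string together with the count. A small sketch on the same text; the names `fixed` and `n_replaced` are illustrative.

```python
import re

text = "Th!s !s a sentence w!th spec!al characters !n !t."

# subn returns a (new_string, number_of_substitutions) tuple.
fixed, n_replaced = re.subn('!', 'i', text)

print(fixed)
print(n_replaced)
```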

### 6. Use a regular expression to find and extract words longer than 4 characters in the text below.

In [11]:
text = "This sentence has words of varying lengths."

In [12]:
longwords = re.compile(r'\b\w{5,}\b')  # longer than 4 characters means at least 5
longwords.findall(text)

['sentence', 'words', 'varying', 'lengths']
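
"Longer than 4" is an off-by-one trap: `{4,}` means "4 or more", so it also accepts four-letter words. The sketch below compares both quantifiers and cross-checks the result with a plain length filter; the variable names are illustrative.

```python
import re

text = "This sentence has words of varying lengths."

at_least_4 = re.findall(r'\b\w{4,}\b', text)     # includes 4-letter words
longer_than_4 = re.findall(r'\b\w{5,}\b', text)  # strictly more than 4

# Cross-check without a quantifier: split into words, then filter by length.
by_len = [w for w in re.findall(r'\w+', text) if len(w) > 4]

print(at_least_4)
print(longer_than_4)
```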

### 7. Use a regular expression to find and extract all occurrences of the letter "b", some letter(s), and then the letter "t" in the sentence below.

In [13]:
text = "I bet the robot couldn't beat the other bot with a bat, but instead it bit me."

In [14]:
letters_b_t = re.compile(r'b\w+t')  # a "b", one or more word characters, then a "t"
letters_b_t.findall(text)

['bet', 'bot', 'beat', 'bot', 'bat', 'but', 'bit']

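A character class such as `[bt]` matches one character that is either "b" or "t", whereas a sequence such as `b\w+t` matches a "b" followed by letters and a "t". The sketch below shows the difference and also how a greedy `+` compares with a lazy `+?`. The `sample` string is a made-up illustration.

```python
import re

# A class matches single characters, not a b...t sequence.
single_chars = re.findall(r'[bt]', 'bet')

# Hypothetical sample with doubled "t" to expose greedy vs lazy matching.
sample = "bitter battle"

greedy = re.findall(r'b\w+t', sample)   # runs to the last "t" in each word
lazy = re.findall(r'b\w+?t', sample)    # stops at the first possible "t"

print(single_chars)
print(greedy)
print(lazy)
```
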
### 8. Use a regular expression to find and extract all words that contain either "ea" or "eo" in them.

In [15]:
text = "During many of the peaks and troughs of history, the people living it didn't fully realize what was unfolding. But we all know we're navigating breathtaking history: Nearly every day could be — maybe will be — a book."


In [16]:
ea_eo_word = re.compile(r'\b\w*e[ao]\w*\b')  # \w* on both sides also covers words that start or end with "ea"/"eo"
ea_eo_word.findall(text)

['peaks', 'people', 'realize', 'breathtaking', 'Nearly']

### 9. Use a regular expression to find and extract all the capitalized words in the text below individually.

In [17]:
text = "Teddy Roosevelt and Abraham Lincoln walk into a bar."

In [18]:
cap_word = re.compile(r'[A-Z]\S+')  # raw string, so \S reaches the regex engine unchanged
cap_word.findall(text)

['Teddy', 'Roosevelt', 'Abraham', 'Lincoln']

### 10. Use a regular expression to find and extract all the sets of consecutive capitalized words in the text above.

In [19]:
cap_2word = re.compile(r'[A-Z]\S+\s[A-Z]\S+')  # \S+ = the non-whitespace text that follows; \s = one whitespace character
cap_2word.findall(text)

['Teddy Roosevelt', 'Abraham Lincoln']
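
A two-word pattern only finds pairs. A minimal sketch of one way to capture runs of two or more capitalized words, using a repeated non-capturing group; the `sample` sentence and the variable names are made up for illustration.

```python
import re

# Hypothetical sample containing a three-word name.
sample = "Mary Ann Smith met John Doe in New York."

# Exactly two capitalized words separated by one whitespace character.
pairs = re.findall(r'[A-Z]\w+\s[A-Z]\w+', sample)

# One capitalized word followed by one or more further capitalized words.
runs = re.findall(r'[A-Z]\w+(?:\s[A-Z]\w+)+', sample)

print(pairs)
print(runs)
```

The group is written as `(?:...)` because `findall` would otherwise return only the captured group instead of the whole match.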

### 11. Use a regular expression to find and extract all the quotes from the text below.

*Hint: This one is a little more complex than the single quote example in the lesson because there are multiple quotes in the text.*

In [20]:
text = 'Roosevelt says to Lincoln, "I will bet you $50 I can get the bartender to give me a free drink." Lincoln says, "I am in!"'


In [21]:
quotes = re.compile(r'"(.*?)"')  # .*? lazily matches up to the next closing quote; the parentheses capture only the text between the quotes
quotes.findall(text)

['I will bet you $50 I can get the bartender to give me a free drink.',
 'I am in!']
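
The lazy quantifier is what keeps the two quotes apart. The sketch below compares a greedy `.*`, a lazy `.*?`, and a negated character class `[^"]*` on the same text; the variable names are illustrative.

```python
import re

text = 'Roosevelt says to Lincoln, "I will bet you $50 I can get the bartender to give me a free drink." Lincoln says, "I am in!"'

greedy = re.findall(r'"(.*)"', text)      # first quote mark to the very last one
lazy = re.findall(r'"(.*?)"', text)       # each quote separately
negated = re.findall(r'"([^"]*)"', text)  # same result without backtracking on quotes

print(len(greedy))
print(lazy)
```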

### 12. Use a regular expression to find and extract all the numbers from the text below.

In [22]:
text = "There were 30 students in the class. Of the 30 students, 14 were male and 16 were female. Only 10 students got A's on the exam."


In [23]:
numbers = re.compile(r'\d+')  # one or more consecutive digits
numbers.findall(text)

['30', '30', '14', '16', '10']
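
Note that `findall` returns strings, not numbers. A short sketch of converting the matches before doing arithmetic; the names `as_ints` and `total` are illustrative.

```python
import re

text = "There were 30 students in the class. Of the 30 students, 14 were male and 16 were female. Only 10 students got A's on the exam."

# Convert each matched digit string to an int before using it numerically.
as_ints = [int(n) for n in re.findall(r'\d+', text)]
total = sum(as_ints)

print(as_ints)
print(total)
```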

### 13. Use a regular expression to find and extract all the social security numbers from the text below.

In [24]:
text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

In [25]:
# Equivalent one-liner: re.findall(r'\d{3}-\d{2}-\d{4}', text)
ss_numbers = re.compile(r'\d{3}-\d{2}-\d{4}')
ss_numbers.findall(text)

['876-93-2289', '098-32-5295']

### 14. Use a regular expression to find and extract all the phone numbers from the text above.

In [26]:
phone_n = re.compile(r'[(]\d{3}[)]\d{3}-\d{4}')  # wrapping a special character such as a parenthesis in [] matches it literally
phone_n.findall(text)

['(847)789-0984', '(987)222-0901']
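
Literal parentheses can be written either inside a character class or with a backslash escape. The sketch below checks that both spellings find the same numbers; the `sample` string reuses the phone numbers from the text, and the variable names are illustrative.

```python
import re

sample = "Call (847)789-0984 or (987)222-0901."

bracketed = re.findall(r'[(]\d{3}[)]\d{3}-\d{4}', sample)  # class form
escaped = re.findall(r'\(\d{3}\)\d{3}-\d{4}', sample)      # backslash form

print(bracketed)
print(bracketed == escaped)
```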

### 15. Use a regular expression to find and extract all the formatted numbers (both social security and phone) from the text above.

In [34]:
# Alternation: match either the social security format or the phone format
re.findall(r'\d{3}-\d{2}-\d{4}|\(\d{3}\)\d{3}-\d{4}', text)

['876-93-2289', '(847)789-0984', '098-32-5295', '(987)222-0901']
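
If it is useful to know which format each match belongs to, named groups with `finditer` are one option. A minimal sketch, assuming the group names `ssn` and `phone` (chosen here for illustration); `lastgroup` reports the name of the group that matched.

```python
import re

text = """
Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.
Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.
"""

# Each alternative gets its own named group so the match can be labeled.
formatted = re.compile(
    r'(?P<ssn>\d{3}-\d{2}-\d{4})|(?P<phone>\(\d{3}\)\d{3}-\d{4})'
)

labeled = [(m.lastgroup, m.group()) for m in formatted.finditer(text)]

for kind, value in labeled:
    print(kind, value)
```

With capturing groups like these, `findall` would return a tuple per match with one empty slot, which is why `finditer` is used in this sketch.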