<a href="https://colab.research.google.com/github/carloslme/automating-boring-stuff/blob/main/Chapter_7_Pattern_Matching_with_Regular_Expressions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Finding Patterns of Text Without Regular Expressions
Create a function that checks whether a string is a phone number in the following format: 415-555-1234

In [6]:
def isPhoneNumber(text):
  # A valid number is exactly 12 characters long: ddd-ddd-dddd
  if len(text) != 12:
    return False

  # The first three characters (the area code) must be digits
  for i in range(0, 3):
    if not text[i].isdecimal():
      return False
  # The fourth character must be '-'
  if text[3] != '-':
    return False

  # The next three characters must be digits
  for i in range(4, 7):
    if not text[i].isdecimal():
      return False
  # The eighth character must be '-'
  if text[7] != '-':
    return False

  # The last four characters must be digits
  for i in range(8, 12):
    if not text[i].isdecimal():
      return False
  return True

In [9]:
print(isPhoneNumber('415-555-4242'))
print(isPhoneNumber('Hello world!'))

True
False


In [11]:
# Find the pattern of text in a larger string
message = 'Call me at 415-555-1011 tomorrow. 411-555-9999 is my office.'
for i in range(len(message)):
  chunk = message[i:i+12]
  if isPhoneNumber(chunk):
    print('Phone number found: ' + chunk)
print('Done')

Phone number found: 415-555-1011
Phone number found: 411-555-9999
Done
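The scan above prints each match as it finds one. As a minimal sketch, the same sliding-window idea can return the matches as a list instead. The names `is_phone_number` and `find_phone_numbers` are hypothetical helpers, and `is_phone_number` is only a condensed stand-in for `isPhoneNumber`, assumed to accept the same `ddd-ddd-dddd` strings.

```python
def is_phone_number(text):
    # Condensed stand-in for isPhoneNumber (assumption: same ddd-ddd-dddd rule)
    return (
        len(text) == 12
        and text[3] == '-'
        and text[7] == '-'
        and text[0:3].isdecimal()
        and text[4:7].isdecimal()
        and text[8:12].isdecimal()
    )


def find_phone_numbers(message):
    # Hypothetical helper: slide a 12-character window across the message
    found = []
    # Stop at the last position where a full 12-character chunk still fits
    for i in range(len(message) - 11):
        chunk = message[i:i + 12]
        if is_phone_number(chunk):
            found.append(chunk)
    return found


message = 'Call me at 415-555-1011 tomorrow. 411-555-9999 is my office.'
print(find_phone_numbers(message))
```

Stopping the loop at `len(message) - 11` skips the trailing chunks that are shorter than 12 characters and could never match.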


## Finding Patterns of Text With Regular Expressions

### *Matching Regex Objects*

In [13]:
import re
# \d matches any single digit, 0-9
# The r prefix makes this a raw string, so backslashes are passed to the regex unchanged
phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

In [15]:
mo = phoneNumRegex.search('My number is 415-555-4242')

In [18]:
print('Phone number found: ' + mo.group())

Phone number found: 415-555-4242
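`search()` returns `None` when the pattern does not occur in the string, so calling `group()` on the result directly would raise an `AttributeError` in that case. A minimal sketch of guarding against this follows; `describe_match` is a hypothetical helper name used only for illustration.

```python
import re

phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')


def describe_match(text):
    # Hypothetical helper: report the first phone number in text, if any
    mo = phone_num_regex.search(text)
    if mo is None:
        # No match: there is no match object to call group() on
        return 'No phone number found'
    return 'Phone number found: ' + mo.group()


print(describe_match('My number is 415-555-4242'))
print(describe_match('Hello world!'))
```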


### *Grouping with Parentheses*

In [23]:
import re

# Parentheses create groups in the regex: (\d\d\d) captures the area code
# and (\d\d\d-\d\d\d\d) captures the rest of the number.
# The match object's group() method returns the text matched by a single group.

phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex.search('My number is 415-555-4242.')
mo.group(1)

'415'

In [24]:
mo.group(2)

'555-4242'

In [25]:
mo.group(0)

'415-555-4242'

In [26]:
mo.group()

'415-555-4242'

In [30]:
# To retrieve all the groups at once, use the groups() method.
mo.groups()

('415', '555-4242')

In [29]:
areaCode, mainNumber = mo.groups()
print(areaCode)
print(mainNumber)

415
555-4242


In [42]:
# Parentheses have a special meaning in regular expressions.
# To match literal ( and ) characters, escape them with a backslash: \( and \)

phoneNumRegex = re.compile(r'(\(\d\d\d\)) (\d\d\d-\d\d\d\d)')

In [43]:
mo = phoneNumRegex.search('My phone number is (415) 555-4242.')

In [44]:
mo.group(1)

'(415)'

In [45]:
mo.group(2)

'555-4242'
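In the pattern above, the escaped parentheses sit inside the first group, so `group(1)` includes them. As a sketch of one possible variation, moving `\(` and `\)` outside the group still requires the literal parentheses in the text but captures only the digits. The name `area_code_regex` is an assumption made for this example.

```python
import re

# Escaped parentheses are outside the capturing group, so they are
# matched but not included in group(1)
area_code_regex = re.compile(r'\((\d\d\d)\) (\d\d\d-\d\d\d\d)')
mo = area_code_regex.search('My phone number is (415) 555-4242.')

area_code, main_number = mo.groups()
print(area_code)
print(main_number)
print(mo.group())  # the full match still contains the parentheses
```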