# 7 Pattern Matching with Regular Expressions

## 7.1 Finding Patterns of Text without Regular Expressions

In [1]:
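def isPhoneNumber(text):
    # A valid number such as 415-555-4242 is exactly 12 characters long.
    if len(text) != 12:
        return False
    # The first three characters (the area code) must be digits.
    for i in range(0, 3):
        if not text[i].isdecimal():
            return False
    # The area code must be followed by a hyphen.
    if text[3] != '-':
        return False
    # The next three characters must be digits.
    for i in range(4, 7):
        if not text[i].isdecimal():
            return False
    # A second hyphen separates the last four digits.
    if text[7] != '-':
        return False
    # The final four characters must be digits.
    for i in range(8, 12):
        if not text[i].isdecimal():
            return False
    return True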

In [3]:
print('Is 415-555-4242 a phone number?')
print(isPhoneNumber("415-555-4242"))
print('Is Moshi moshi a phone number?')
print(isPhoneNumber("Moshi moshi"))

Is 415-555-4242 a phone number?
True
Is Moshi moshi a phone number?
False


In [4]:
message = 'Call me at 415-555-1011 tomorrow. 415-555-9999 is my office.'
for i in range(len(message)):
    chunk = message[i:i+12]
    if isPhoneNumber(chunk):
        print('Phone number found: ' + chunk)
print('Done')

Phone number found: 415-555-1011
Phone number found: 415-555-9999
Done


## 7.2 Finding Patterns of Text with Regular Expressions

### 7.2.1 Creating Regex Objects

In [5]:
# Import the re module to work with regular expressions.
import re

Passing a string value representing your regular expression to re.compile() returns a Regex pattern object (or simply, a Regex object).

To create a Regex object that matches the phone number pattern, enter the following into the interactive shell. (Remember that \d means “a digit character” and \d\d\d-\d\d\d-\d\d\d\d is the regular expression for a phone number pattern.)

In [6]:
phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

### 7.2.2 Matching Regex Objects

A Regex object’s search() method searches the string it is passed for any matches to the regex. The search() method returns None if the regex pattern is not found in the string. If the pattern is found, search() returns a Match object, which has a group() method that returns the actual matched text from the searched string. (I’ll explain groups shortly.)

In [7]:
mo = phoneNumRegex.search('My number is 415-555-4242.')
print('Phone number found: ' + mo.group())

Phone number found: 415-555-4242


Here, we pass our desired pattern to re.compile() and store the resulting Regex object in phoneNumRegex. Then we call search() on phoneNumRegex and pass it the string we want to search. The result of the search is stored in the variable mo. In this example, we know that the pattern will be found in the string, so we know that a Match object will be returned. Because mo contains a Match object and not the null value None, we can call group() on mo to return the match. Writing mo.group() inside the print() call displays the whole match, 415-555-4242.
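When it is not certain that the pattern will be found, search() may return None, and calling group() on None raises an AttributeError. The following is a minimal sketch of guarding against that case; the helper name describeMatch is hypothetical and used only for illustration.

```python
import re

phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

def describeMatch(text):
    # describeMatch is a hypothetical helper, not part of the re module.
    mo = phoneNumRegex.search(text)
    if mo is None:
        # search() found nothing, so there is no Match object to call group() on.
        return 'No phone number found'
    return 'Phone number found: ' + mo.group()

print(describeMatch('My number is 415-555-4242.'))
print(describeMatch('Moshi moshi'))
```

The first call finds a match and the second falls through to the None branch, so neither call raises an error.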

### 7.2.3 Review of Regular Expression Matching

1. Import the regex module with import re.
2. Create a Regex object with the re.compile() function. (Remember to use a raw string.)
3. Pass the string you want to search into the Regex object’s search() method. This returns a Match object, or None if the pattern is not found.
4. Call the Match object’s group() method to return a string of the actual matched text.
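
The four steps above can be sketched in one short script. The variable name matched and the sample sentence are chosen for illustration; the assert line is only there to show why step 2 recommends a raw string.

```python
# Step 1: import the regex module.
import re

# Step 2: create a Regex object. A raw string keeps each backslash literal,
# so r'\d' is the same two characters as the escaped string '\\d'.
assert r'\d\d\d' == '\\d\\d\\d'
phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

# Step 3: search the string. The result is a Match object or None.
mo = phoneNumRegex.search('Call me at 415-555-1011 tomorrow.')

# Step 4: call group() to get the matched text, but only if a match was found.
matched = mo.group() if mo is not None else None
print(matched)
```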

## 7.3 More Pattern Matching with Regular Expressions

### 7.3.1 Grouping with Parentheses

In [13]:
phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex.search('My number is 415-555-4242.')
# Retrieve the first group (the area code) with mo.group(1).
print(mo.group(1))
# Retrieve the second group (the main number) with mo.group(2).
print(mo.group(2))
# Retrieve the whole match with mo.group(0) or mo.group().
print(mo.group())

415
555-4242
415-555-4242

In [14]:
# To retrieve all the groups at once, use the groups() method (note the plural form of the name).
mo.groups()

('415', '555-4242')

In [15]:
areaCode, mainNumber = mo.groups()
print(areaCode)

415


Since mo.groups() returns a tuple of multiple values, you can use the multiple-assignment trick to assign each value to a separate variable, as in the previous areaCode, mainNumber = mo.groups() line.
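
As a rough sketch of how this unpacking might be wrapped for reuse, the helper below returns the two groups as a pair, or None when the text contains no phone number. The name splitPhoneNumber is hypothetical and does not come from the re module.

```python
import re

phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')

def splitPhoneNumber(text):
    # splitPhoneNumber is a hypothetical helper used only for illustration.
    mo = phoneNumRegex.search(text)
    if mo is None:
        # No match, so there are no groups to unpack.
        return None
    # groups() returns one string per parenthesized group, in order.
    areaCode, mainNumber = mo.groups()
    return areaCode, mainNumber

print(splitPhoneNumber('My number is 415-555-4242.'))
print(splitPhoneNumber('No number here.'))
```

Checking for None before unpacking matters here, because None has no groups() method.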
