Regular expressions let us match complex patterns in text. Python provides them through the standard re module.

In [1]:
import re

In [9]:
# The simplest pattern matching call looks like this:
result = re.match('Love', 'Love one another')

In [5]:
result

<_sre.SRE_Match object; span=(0, 4), match='Love'>

Here, 'Love' is the pattern and 'Love one another' is the source string we want to test. match() checks whether the source begins with the given pattern; it returns a match object if it does, and None otherwise.
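
Because match() returns None when the source does not begin with the pattern, it is usually worth checking the result before calling methods on it. A minimal sketch, reusing the same source string (the pattern 'one' is chosen here only for illustration):

```python
import re

source = 'Love one another'

# 'one' occurs in the source, but not at the start, so match() returns None
result = re.match('one', source)
print(result)

# Guard against None before using the match object
if result is None:
    message = 'no match at the start'
else:
    message = result.group()
print(message)
```

Calling result.group() without this check would raise an AttributeError whenever the match fails.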

When the same pattern is used many times, we can compile it beforehand with compile(). The compiled pattern object can then be reused, which can make repeated matching faster.

In [21]:
pattern = re.compile('Love')

In [31]:
result = pattern.match('Love one another')

In [33]:
# match() already anchors at the start, so adding a start anchor '^' gives the same result
result = re.match('^Love', 'Love one another')

In [34]:
# We can inspect the result of our pattern matching as follows.
# The string attribute holds the source we tested against.
result.string

'Love one another'

In [35]:
# group() returns the text matched by the pattern.
result.group()

'Love'
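
Besides string and group(), a match object also reports where the match was found through span(), start() and end(). The sketch below reuses one compiled pattern across several sources; the extra source strings are made up for illustration.

```python
import re

pattern = re.compile('Love')
# Hypothetical sample strings; only the first comes from the examples above
sources = ['Love one another', 'Love is patient', 'All you need is Love']

spans = []
for s in sources:
    m = pattern.match(s)
    if m:
        # span() is the (start, end) pair; end is exclusive
        print(s, '->', m.group(), m.span(), m.start(), m.end())
        spans.append(m.span())
    else:
        print(s, '-> no match at the start')
        spans.append(None)
```

The last source contains 'Love' but does not begin with it, so match() gives no result there.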

The common methods we use in pattern matching are:
1. search() - returns the first match of the pattern we are looking for, if any.
2. findall() - returns the list of all non-overlapping matches.
3. split() - splits the source string at each match of the pattern and returns a list of the pieces.
4. sub() - takes a replacement argument and replaces every part of the source string that matches the pattern with the replacement.

match() succeeds only if the pattern is at the beginning of the source, whereas search() finds the pattern anywhere in it:

In [36]:
source = 'She sells sea shells on the sea shore'
m = re.search('sh', source)

In [39]:
m.group()

'sh'
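
To make the difference concrete, here is a small sketch that runs match() and search() on the same source. Matching is case-sensitive by default, so the leading 'Sh' in 'She' is skipped; the re.IGNORECASE flag is shown as one optional way to change that.

```python
import re

source = 'She sells sea shells on the sea shore'

# match() fails: the source starts with 'Sh', not 'sh'
at_start = re.match('sh', source)
print(at_start)

# search() scans forward and finds the 'sh' in 'shells'
first = re.search('sh', source)
print(first.span())

# With re.IGNORECASE, the 'Sh' in 'She' counts as a match too
relaxed = re.search('sh', source, re.IGNORECASE)
print(relaxed.group(), relaxed.span())
```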

In [42]:
# Find all matches using findall()
m = re.findall('sh', source)

In [44]:
# findall() returns a list
print(m)

['sh', 'sh']


In [47]:
# We can get the count of the number of matches as follows:
print('Found', len(m), 'matches')

Found 2 matches
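
"Non-overlapping" means that once a stretch of text has been matched, the scan resumes after it. The sketch below shows this on a made-up string, and also uses re.finditer(), which yields match objects instead of strings, to recover where each match starts.

```python
import re

# 'aa' could start at index 0 or 1 in 'aaa', but only the first is reported,
# because the scan continues after the end of the previous match
overlap_demo = re.findall('aa', 'aaa')
print(overlap_demo)

source = 'She sells sea shells on the sea shore'

# finditer() gives one match object per match, so positions are available
starts = [m.start() for m in re.finditer('sh', source)]
print(starts)
```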


In [48]:
# Split at matches with split()
m = re.split('sh', source)

In [49]:
m

['She sells sea ', 'ells on the sea ', 'ore']
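
split() also accepts an optional maxsplit argument that limits the number of splits, and the pattern can be any regular expression rather than a fixed string. A short sketch, where the whitespace pattern r'\s+' is an illustrative choice:

```python
import re

source = 'She sells sea shells on the sea shore'

# Stop after the first split; the rest of the string is kept whole
once = re.split('sh', source, maxsplit=1)
print(once)

# Split on runs of whitespace to get the individual words
words = re.split(r'\s+', source)
print(words)
```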

In [56]:
# Replace matches using sub()
m = re.sub('sh', 'b', source) # replacing all 'sh' with 'b'

In [57]:
m

'She sells sea bells on the sea bore'
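
sub() replaces every match by default. As a brief sketch of two related options: the count argument limits how many matches are replaced, and re.subn() returns the new string together with the number of replacements made.

```python
import re

source = 'She sells sea shells on the sea shore'

# Replace only the first 'sh'
first_only = re.sub('sh', 'b', source, count=1)
print(first_only)

# subn() returns a (new_string, number_of_replacements) tuple
replaced, n = re.subn('sh', 'b', source)
print(replaced, n)
```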