The `sub()` method searches for a pattern and replaces every match with a new string.

It is the equivalent of find and replace (`CTRL + H`) in most office suites.

In [1]:
import re

In [2]:
namesRegex = re.compile(r'Agent \w+')

In [3]:
namesRegex.findall('Agent Alice gave the secret documents to Agent Bob')

['Agent Alice', 'Agent Bob']

In [4]:
namesRegex.sub('REDACTED', 'Agent Alice gave the secret documents to Agent Bob')

'REDACTED gave the secret documents to REDACTED'

In the above example, each match (the word `Agent` plus the agent's name) is replaced with `REDACTED`.

What if we wanted to keep the first letter of the agent's name?

In [5]:
'''
Capture the first letter of the agent's name in a group; the rest of the name is matched as zero or more further word characters.
'''
namesRegex = re.compile(r'Agent (\w)\w*')

In [6]:
namesRegex.findall('Agent Alice gave the secret documents to Agent Bob')

['A', 'B']

In [7]:
namesRegex.sub(r'Agent \1****', 'Agent Alice gave the secret documents to Agent Bob')

'Agent A**** gave the secret documents to Agent B****'
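The backreference `\1` in the replacement string stands for whatever group 1 captured. As a minimal sketch, the same substitution can be wrapped in a helper; the names `names_regex` and `censor` are made up for illustration, and `\g<1>` is used as the unambiguous spelling of `\1`, which helps when a digit follows the backreference.

```python
import re

# Group 1 captures the first letter; the rest of the name is matched but not captured.
names_regex = re.compile(r'Agent (\w)\w*')

def censor(text):
    """Keep the first letter of each agent's name and mask the rest (hypothetical helper)."""
    # \g<1> refers to group 1, exactly like \1
    return names_regex.sub(r'Agent \g<1>****', text)

print(censor('Agent Alice gave the secret documents to Agent Bob'))
# Agent A**** gave the secret documents to Agent B****
```

Text with no match is returned unchanged.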

The `re.VERBOSE` flag allows the use of newlines, other whitespace, and `#` comments to separate logical sections of the pattern.

In [None]:
re.compile(r'''
(
(\d\d\d-)|      # area code without parentheses, with dash
(\(\d\d\d\)\ )  # -or- area code with parentheses and an escaped space, no dash
)
\d\d\d          # first 3 digits
-               # dash
\d\d\d\d        # last 4 digits
\sx\d{2,4}      # extension, e.g. x1234
''', re.VERBOSE)
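A minimal sketch of how a verbose pattern like this might be used. The name `phone_regex` and the sample phone numbers are made up for illustration. Note that in verbose mode a literal space must be escaped (`\ `), because unescaped whitespace in the pattern is ignored.

```python
import re

phone_regex = re.compile(r'''
(
    \d\d\d-         # area code followed by a dash
    |               # -or-
    \(\d\d\d\)\     # area code in parentheses followed by an escaped space
)
\d\d\d              # first 3 digits
-                   # dash
\d\d\d\d            # last 4 digits
\sx\d{2,4}          # extension, e.g. x1234
''', re.VERBOSE)

# Hypothetical sample inputs: two valid formats and one that should not match
samples = ['415-555-1234 x1234', '(415) 555-1234 x12', '415 555 1234']
matches = [bool(phone_regex.fullmatch(s)) for s in samples]
print(matches)
# [True, True, False]
```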

We can combine multiple flags for the regular expression with the pipe (bitwise OR) operator, `|`:

In [None]:
re.compile(r'''
(
(\d\d\d-)|      # area code without parentheses, with dash
(\(\d\d\d\)\ )  # -or- area code with parentheses and an escaped space, no dash
)
\d\d\d          # first 3 digits
-               # dash
\d\d\d\d        # last 4 digits
\sx\d{2,4}      # extension, e.g. x1234
''', re.IGNORECASE | re.DOTALL | re.VERBOSE)
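To see what combining flags changes, here is a small sketch that compiles the same pattern twice, once with `re.VERBOSE` alone and once with `re.IGNORECASE | re.VERBOSE`. The names `ext_pattern`, `strict`, and `relaxed`, and the sample string, are hypothetical.

```python
import re

# Pattern for an extension such as x1234, written in verbose style
ext_pattern = r'''
x        # literal marker before the extension
\d{2,4}  # 2 to 4 digits
'''

strict = re.compile(ext_pattern, re.VERBOSE)
relaxed = re.compile(ext_pattern, re.IGNORECASE | re.VERBOSE)

text = '555-1234 X99'  # upper-case X
print(strict.search(text))            # None: 'x' does not match 'X'
print(relaxed.search(text).group())   # X99
```

Each flag stays active independently, so the relaxed pattern still ignores the whitespace and comments while also matching regardless of case.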