The caret symbol (^) at the start of a pattern indicates that a match must occur at the beginning of the searched text.

The dollar symbol ($) at the end of a pattern indicates that a match must occur at the end of the searched text.

In [2]:
import re

In [3]:
'''
With the pattern below, we are looking for the word 'Hello' at the beginning of the string only:
'''
beginsWithHelloRegex = re.compile(r'^Hello')
beginsWithHelloRegex.search('Hello there!')

<re.Match object; span=(0, 5), match='Hello'>

In [4]:
'''
The search below returns None because 'Hello' is not at the start of the string.
'''
beginsWithHelloRegex.search('He said "Hello there!"') is None

True

In [5]:
'''
And with the dollar sign:
'''
endsWithWorldRegex = re.compile(r'world!$')
endsWithWorldRegex.search('Hello world!')

<re.Match object; span=(6, 12), match='world!'>

In [6]:
endsWithWorldRegex.search('Hello world! How are you?') is None

True

In [7]:
'''
Using both the caret and the dollar sign, we can use the pattern below to check whether a string consists only of digits:
'''
allDigitsRegex = re.compile(r'^\d+$')
allDigitsRegex.search('879564361987456194798987')

<re.Match object; span=(0, 24), match='879564361987456194798987'>

In [8]:
'''
The whole string must consist of digits: the match has to start at the beginning, end at the end, and contain nothing but digits in between.
The string below does not match because there is a non-digit character in the middle.
'''
allDigitsRegex.search('879564361987x456194798987') is None

True
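
As a side note on the anchored pattern above, here is a minimal sketch comparing it with re.fullmatch, which is another way to require that the entire string matches. The name digitsRegex is only illustrative. One difference worth knowing: '$' also matches just before a single trailing newline, while re.fullmatch does not tolerate it.

```python
import re

# Illustrative name; same pattern as allDigitsRegex above.
digitsRegex = re.compile(r'^\d+$')

# '$' matches at the very end of the string and also just before
# a single trailing newline, so this search still succeeds.
trailingNewlineMatch = digitsRegex.search('12345\n')
print(trailingNewlineMatch is not None)

# re.fullmatch needs no anchors and requires the whole string,
# including any trailing newline, to match the pattern.
fullOk = re.fullmatch(r'\d+', '12345')
fullWithNewline = re.fullmatch(r'\d+', '12345\n')
print(fullOk is not None, fullWithNewline is None)
```

If input may arrive with a trailing newline (for example, lines read from a file), stripping it first or using re.fullmatch makes the intent explicit.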

In [9]:
'''
The period character (.) is a wildcard that matches any single character except a newline.
'''
atRegex = re.compile(r'.at')
atRegex.findall('The cat in the hat sat on the flat mat')

['cat', 'hat', 'sat', 'lat', 'mat']

In [10]:
'''
The output above shows every match ending in 'at'.
However, for the word 'flat' the match is only 'lat'.
This is because the pattern matches exactly one character preceding 'at'.

We can adjust this as follows:
'''
atRegex = re.compile(r'.{1,2}at')
atRegex.findall('The cat in the hat sat on the flat mat')

'''
The result is [' cat', ' hat', ' sat', 'flat', ' mat'].
'flat' is now matched in full, but the three-letter words pick up the whitespace preceding them, because .{1,2} matches two characters whenever it can.
'''
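
One possible way to avoid capturing the whitespace is to match word characters instead of arbitrary characters. This is a sketch using \w+ rather than the dot; the names sentence and wordAtRegex are illustrative and not part of the original notebook.

```python
import re

sentence = 'The cat in the hat sat on the flat mat'

# The dot also matches spaces, so short words pick up the space before them.
withSpaces = re.findall(r'.{1,2}at', sentence)
print(withSpaces)   # [' cat', ' hat', ' sat', 'flat', ' mat']

# \w matches only letters, digits and underscore, so spaces are excluded.
wordAtRegex = re.compile(r'\w+at')
wordsOnly = wordAtRegex.findall(sentence)
print(wordsOnly)    # ['cat', 'hat', 'sat', 'flat', 'mat']
```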

In [11]:
'''
The dot-star (.*) matches any character except a newline, zero or more times.

We will try to extract the first name and last name from the string:
    'First Name: Al Last Name: Sweigart'
'''
'First Name: Al Last Name: Sweigart'.find(':')

'First Name: Al Last Name: Sweigart'.find(':') + 2

'First Name: Al Last Name: Sweigart'[12:]

'Al Last Name: Sweigart'

In [12]:
nameRegex = re.compile(r'First Name: (.*) Last Name: (.*)')
nameRegex.findall('First Name: Al Last Name: Sweigart')

'''
The pattern above matches the literal text "First Name: ", captures whatever follows in the first group, then matches " Last Name: " and captures the rest of the string in the second group.
'''
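
To make the two groups concrete, the sketch below shows what findall returns for this pattern and how the same groups can be read from a single match object with search and groups(). The variable names text, pairs, firstName and lastName are illustrative.

```python
import re

nameRegex = re.compile(r'First Name: (.*) Last Name: (.*)')
text = 'First Name: Al Last Name: Sweigart'

# With two groups in the pattern, findall returns a list of tuples,
# one tuple per match, holding the text captured by each group.
pairs = nameRegex.findall(text)
print(pairs)  # [('Al', 'Sweigart')]

# search returns a match object; groups() gives the captured text
# for the first match only.
match = nameRegex.search(text)
firstName, lastName = match.groups()
print(firstName, lastName)  # Al Sweigart
```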

In [14]:
'''
(.*) is greedy: it matches as many characters as possible.
(.*?) is non-greedy: it matches as few characters as possible.
'''
serve = '<To serve humans> for dinner.>'
nongreedy = re.compile(r'<(.*?)>')
nongreedy.findall(serve)

['To serve humans']

In [15]:
greedy = re.compile(r'<(.*)>')
greedy.findall(serve)

['To serve humans> for dinner.']
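
The difference is easiest to see when a string contains more than one bracketed section. The following is a small sketch with a made-up input string (sample is not from the original notebook):

```python
import re

sample = '<one> and <two>'

# Non-greedy: each match stops at the first closing bracket it can.
lazyResult = re.findall(r'<(.*?)>', sample)
print(lazyResult)    # ['one', 'two']

# Greedy: the single match runs on to the last closing bracket.
greedyResult = re.findall(r'<(.*)>', sample)
print(greedyResult)  # ['one> and <two']
```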

In [20]:
prime = '''Serve the public trust.
Protect the innocent.
Uphold the law.
'''
dotStar = re.compile(r'.*')
dotStar.search(prime)

<re.Match object; span=(0, 23), match='Serve the public trust.'>

In [21]:
dotStar = re.compile(r'.*', re.DOTALL)
dotStar.search(prime)

<re.Match object; span=(0, 62), match='Serve the public trust.\nProtect the innocent.\nU>
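
The two cells above differ only in the re.DOTALL flag. Without it, the dot stops at the first newline; with it, the dot matches newlines too, so .* consumes the whole string. A short sketch that checks this directly (firstLine and everything are illustrative names):

```python
import re

prime = '''Serve the public trust.
Protect the innocent.
Uphold the law.
'''

# Default behaviour: '.' does not match '\n', so the match ends
# at the end of the first line.
firstLine = re.compile(r'.*').search(prime).group()
print(firstLine == 'Serve the public trust.')

# With re.DOTALL, '.' matches every character, newlines included,
# so the match covers the entire string.
everything = re.compile(r'.*', re.DOTALL).search(prime).group()
print(everything == prime)
```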

In [25]:
'''
The pattern below matches lowercase vowels only.
'''
vowelRegex = re.compile(r'[aeiou]')
vowelRegex.findall('Al, why does your programming book talk about RoboCop so much?')

['o',
 'e',
 'o',
 'u',
 'o',
 'a',
 'i',
 'o',
 'o',
 'a',
 'a',
 'o',
 'u',
 'o',
 'o',
 'o',
 'o',
 'u']

In [26]:
'''
We can use re.IGNORECASE or re.I to match the pattern regardless of case.
'''
vowelRegex = re.compile(r'[aeiou]', re.IGNORECASE)
vowelRegex.findall('Al, why does your programming book talk about RoboCop so much?')

['A',
 'o',
 'e',
 'o',
 'u',
 'o',
 'a',
 'i',
 'o',
 'o',
 'a',
 'a',
 'o',
 'u',
 'o',
 'o',
 'o',
 'o',
 'u']
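
As a quick check of the shorthand mentioned above, the sketch below confirms that re.I and re.IGNORECASE are the same flag and shows its effect on a shorter, made-up input string (phrase is illustrative):

```python
import re

# re.I is simply an alias for re.IGNORECASE.
sameFlag = re.I is re.IGNORECASE
print(sameFlag)

phrase = 'Al and RoboCop'

# Without the flag, the uppercase 'A' is skipped.
lowerOnly = re.findall(r'[aeiou]', phrase)
print(lowerOnly)   # ['a', 'o', 'o', 'o']

# With re.I, both cases are matched.
anyCase = re.findall(r'[aeiou]', phrase, re.I)
print(anyCase)     # ['A', 'a', 'o', 'o', 'o']
```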