# Regular Expressions Exercises


This notebook works through the Python regular expression examples from [regexone.com](https://regexone.com/references/python).

Idea: return all sentences in the works of Charles Dickens that include the word "butter", then move on to more complex queries.
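
A minimal sketch of that idea, assuming the text is already loaded into a string. The `sample_text` below is made up for illustration (it is not a Dickens quotation), and the sentence splitter is deliberately naive: it breaks on `.`, `!` or `?` followed by whitespace, so abbreviations such as "Mr." would be split incorrectly.

```python
import re

# Made-up sample text standing in for a novel (not a real quotation).
sample_text = (
    "She spread butter on the toast. The fog was thick that morning. "
    "Butter was dear in those days! He said nothing."
)

# Naive sentence split: break on whitespace that follows ., ! or ?
sentences = re.split(r"(?<=[.!?])\s+", sample_text)

# \b keeps "butter" from matching inside longer words such as "buttercup".
butter = re.compile(r"\bbutter\b", re.IGNORECASE)
butter_sentences = [s for s in sentences if butter.search(s)]

for sentence in butter_sentences:
    print(sentence)
```

The word boundaries and `re.IGNORECASE` are design choices for this sketch; a real corpus would probably need a more careful sentence tokenizer.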



In [9]:
import re


In [10]:
# Let's use a regular expression to match a date string. Ignore
# the output since we are just testing if the regex matches.
regex = r"([a-zA-Z]+) (\d+)"
if re.search(regex, "June 24"):
    # Indeed, the expression "([a-zA-Z]+) (\d+)" matches the date string
    
    # If we want, we can use the match object's start() and end() methods
    # to retrieve where the pattern matches in the input string, and the
    # group() method to get the full match and the captured groups.
    match = re.search(regex, "June 24")

    # See the output of this cell: the match spans the whole string (index 0 to 7)
    print("Match at index %s, %s" % (match.start(), match.end()))
    
    # The groups contain the matched values.  In particular:
    #    match.group(0) always returns the fully matched string
    #    match.group(1), match.group(2), ... will return the capture
    #            groups, numbered left to right as they appear in the pattern
    #    match.group() is equivalent to match.group(0)
    
    # So this will print "June 24"
    print("Full match: %s" % (match.group(0)))
    # So this will print "June"
    print("Month: %s" % (match.group(1)))
    # So this will print "24"
    print("Day: %s" % (match.group(2)))
else:
    # If re.search() does not match, then None is returned
    print("The regex pattern does not match. :(")

Match at index 0, 7
Full match: June 24
Month: June
Day: 24
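
The same date pattern can also be written with named groups, so that captures are read by name rather than by position. This is a minimal sketch, not part of the original exercises; the group names `month` and `day` and the variable `date_pattern` are chosen here for illustration.

```python
import re

# Same pattern as above, with (?P<name>...) naming each capture group.
date_pattern = re.compile(r"(?P<month>[a-zA-Z]+) (?P<day>\d+)")

match = date_pattern.search("June 24")
if match:
    # Groups can be read by name as well as by number.
    print("Month: %s" % match.group("month"))
    print("Day: %s" % match.group("day"))
    # groupdict() returns all named groups as a dictionary.
    print(match.groupdict())
    # span() returns (start, end) in one call.
    print(match.span())
```

Compiling the pattern once with `re.compile()` is mainly useful when the same expression is applied to many strings.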


In [16]:
# Let's use a regular expression to match a few date strings.
regex = r"[a-zA-Z]+ \d+"
matches = re.findall(regex, "June 24, August 9, Dec 12")

for match in matches:
    print(f"Full match: {match}")

# To capture only the month of each date, add a capture group: when the
# pattern has one group, findall() returns that group instead of the full match
regex = r"([a-zA-Z]+) \d+"
matches = re.findall(regex, "June 24, August 9, Dec 12")

for match in matches:
    print(f"Match month:{match}")

# If we need the exact position of each match, finditer() yields match objects
regex = r"([a-zA-Z]+) \d+"
matches = re.finditer(regex, "June 24, August 9, Dec 12")

for match in matches:
    print(f"Match at index: {match.start()}, {match.end()}")

Full match: June 24
Full match: August 9
Full match: Dec 12
Match month:June
Match month:August
Match month:Dec
Match at index: 0, 7
Match at index: 9, 17
Match at index: 19, 25
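
One behavior worth seeing next to the cell above: when the pattern has more than one capture group, `re.findall()` returns a list of tuples, one tuple per match. A short sketch reusing the same input string; the variable names `pairs` and `spans` are illustrative.

```python
import re

text = "June 24, August 9, Dec 12"
regex = r"([a-zA-Z]+) (\d+)"

# Two capture groups: findall() returns (month, day) tuples.
pairs = re.findall(regex, text)
for month, day in pairs:
    print(f"Month: {month}, day: {day}")

# finditer() gives match objects, so groups and positions are both available.
spans = [(m.group(1), m.span()) for m in re.finditer(regex, text)]
print(spans)
```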


In [17]:
# Let's try to reverse the order of the day and month in a date
# string. Notice how the replacement string also contains metacharacters
# (the back references to the captured groups), so we use a raw
# string for that as well.
regex = r"([a-zA-Z]+) (\d+)"

# This will reorder the string and print:
#   24 of June, 9 of August, 12 of Dec
print(re.sub(regex, r"\2 of \1", "June 24, August 9, Dec 12"))

24 of June, 9 of August, 12 of Dec
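
As a variation on the substitution above, `re.sub()` also accepts named back references (`\g<name>`) and a function as the replacement. The sketch below assumes the same input string; the group names and the helper `pad_day` are hypothetical, introduced only for this example.

```python
import re

text = "June 24, August 9, Dec 12"
regex = r"(?P<month>[a-zA-Z]+) (?P<day>\d+)"

# Named back references: \g<day> and \g<month> instead of \2 and \1.
reordered = re.sub(regex, r"\g<day> of \g<month>", text)
print(reordered)

def pad_day(match):
    """Hypothetical helper: rebuild the date with a zero-padded day."""
    return f"{match.group('month')} {int(match.group('day')):02d}"

# A function replacement is called once per match with the match object.
padded = re.sub(regex, pad_day, text)
print(padded)
```

A function replacement is the usual choice when the new text has to be computed from the match rather than just rearranged.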


In [None]:
this = 50
that = "this"
the_other = lambda x: x + 2
the_other(5)
test_strings = [
    'this is a test',
    'of the emergency',
    'broadcast system. This is ', 
    'only a test.' 
]


