# Regular Expressions

"Regular expressions (called REs, or regexes, or regex patterns) are essentially a tiny, highly specialized programming language embedded inside Python and made available through the [re](https://docs.python.org/3/library/re.html#module-re) module. Using this little language, you specify the rules for the set of possible strings that you want to match; this set might contain English sentences, or e-mail addresses..." - https://docs.python.org/3/howto/regex.html

In [None]:
# here we're importing python's module to handle regular expressions/regex (re)
import re

# regex methods

## Using re.search
Finds the first occurrence and returns a Match object (or `None` if nothing matches)

In [None]:
# Which occurrence of "N" does this find?
result = re.search("N", "hhhhhhNhhhhhh")
print(result)

In [None]:
# Which occurrence of "N" does this find?
result2 = re.search("N", "hhhhhhNhhhhhhN")
print(result2)
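
The printed Match object already shows the span and the matched text, but those values can also be read individually. A minimal sketch, assuming the same string as above (the variable name `m` is only for illustration):

```python
import re

# Search a string that contains two "N" characters.
m = re.search("N", "hhhhhhNhhhhhhN")

if m:
    print(m.group())  # the text that matched
    print(m.start())  # index where the match begins
    print(m.end())    # index just past the end of the match
    print(m.span())   # (start, end) as a tuple
```

Only the first "N" is reported; the second one is never reached by `re.search`.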

In [None]:
# This is how we can find multiple matches/occurrences
needle = "Buffalo"
haystack = "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo"

while True:
    match = re.search(needle, haystack)
    if not match:
        break
    else:
        print(match)
        haystack = haystack[match.end():]
        

## Using re.findall
Finds all occurrences and returns them as a list of strings

In [None]:
# You may find this easier to implement than the previous example 
results = re.findall("Buffalo", "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo")
print(results)
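
`re.findall` returns only the matched text. If the positions matter too, or if the lowercase "buffalo" should also count, the sketch below shows two options from the standard `re` module: the `re.IGNORECASE` flag and `re.finditer`. The variable names `all_words` and `starts` are just for illustration.

```python
import re

haystack = "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo"

# re.IGNORECASE lets the pattern match both "Buffalo" and "buffalo".
all_words = re.findall("buffalo", haystack, flags=re.IGNORECASE)
print(len(all_words))

# re.finditer yields Match objects, so the start index of each match is available.
starts = [m.start() for m in re.finditer("Buffalo", haystack)]
print(starts)
```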

## Using re.compile
You'll often find examples that use re.compile. It builds a pattern object once, and that object has its own search, match, and findall methods, so it's another way to do the same thing with different syntax.

Benefits:
- performance
- reusable
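
As a rough sketch of the reuse benefit, the pattern below is compiled once and then applied to several strings. The `samples` list is made up for illustration. Note that the module-level functions also cache recently used patterns, so for a handful of patterns the speed difference may be small.

```python
import re

# Compile once, then reuse the same pattern object on several strings.
pattern = re.compile("N")

samples = ["hhhNhhh", "hhhhhhh", "NhhhN"]  # hypothetical inputs
for text in samples:
    print(text, pattern.findall(text))
```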

### match vs. search
Two methods you'll commonly see: match & search

#### match 
finds an occurrence only if it's at the beginning of the string

In [None]:

pattern = re.compile("N")
result = pattern.match("hhhhhhNhhhhNhh")
# note how the substring "N" is not at the beginning of our string

if result:
    print(result)
else:
    print(f"NO MATCH, result evaluates to {result}")

In [None]:
pattern = re.compile("N")
result = pattern.match("NhhhhhhNhhhhNhh")
# note how the substring "N" is at the beginning of our string

if result:
    print(result)
else:
    print(f"NO MATCH, result evaluates to {result}")

#### search
finds the first occurrence anywhere in the string

In [None]:
pattern = re.compile("N")
match = pattern.search("hhhhhhNhhhhNhh")

if match:
    print(match)
else:
    print("NO MATCH")

In [None]:
# let's find multiple occurrences
needle = "Buffalo"
haystack = "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo"
pattern = re.compile(needle)
match = pattern.search(haystack)

while True:
    if not match:
        break
    else:
        print(match)
        match = pattern.search(haystack, match.end())

# Exercises

Test your regex knowledge

In [None]:
lyrics = "Everything is awesome, everything is cool when you're part of a team. \
Everything is awesome, when you're living out a dream."

In [None]:
# 1. find all occurrences of the substring "awesome"
# CODE GOES HERE
x = re.findall("awesome", lyrics)
print(x)

In [None]:
# 2. find the first occurrence of the substring "Everything"
# CODE GOES HERE
x = re.search("Everything", lyrics)
print(x)

In [None]:
# 3. find the first occurrence of the substring "team"
# CODE GOES HERE
t = "team"
x = re.search(f"{t}", lyrics)
print(x)

In [None]:
# use this string for number 4 and 5
# you may want to use this website: https://regex101.com/
info = "Peter Parker, Friendly Neighborhood Spider-Man, peterparker@gmail.com, spiderman@aol.com, 212-456-7890, (212)324-2354"


In [None]:
# 4. find both occurrences of an email address in the string
# CODE GOES HERE

# Extra Credit: just find the domain names in the email addresses
# CODE GOES HERE

In [None]:
# 5. find the first occurrence of a phone number
# CODE GOES HERE

# Extra Credit: find all phone numbers in the info string
# CODE GOES HERE

# FAQ
- Why do we see a lowercase **r** in front of the regex patterns in Python? -->
[It's a feature](https://docs.python.org/3/library/re.html#raw-string-notation)
- When should we use re.compile versus re.search, re.match, re.findall, etc.? -->
[It depends on your use case](https://docs.python.org/3/library/re.html#re.compile)
- When do we use re.match? -->
[It also depends on your use case](https://stackoverflow.com/a/29009475)
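
To make the first question above concrete, here is a small sketch of why the **r** prefix matters. In an ordinary string literal Python turns `\b` into a backspace character before the regex engine ever sees it, while in a raw string the backslash is passed through and `\b` means a word boundary. The sample text and the names `plain` and `raw` are just for illustration.

```python
import re

text = "part of a team"

# Without r: "\b" is a backspace character, so the pattern cannot match.
plain = re.findall("\bteam\b", text)

# With r: the regex engine receives \b and treats it as a word boundary.
raw = re.findall(r"\bteam\b", text)

print(plain)
print(raw)
```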