# Regular Expressions

* See: https://docs.python.org/3/howto/regex.html
* See: https://docs.python.org/3/library/re.html
* See: https://www.guru99.com/python-regular-expressions-complete-tutorial.html

## Metacharacters
```. ^ $ * + ? { } [ ] \ | ( )```
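Because these characters carry special meaning, matching one of them literally requires escaping it. The following is a minimal sketch of two ways to do that; the sample strings and the names `price` and `needle` are made up for illustration.

```python
import re

# A backslash before a metacharacter makes it match literally.
price = re.search(r"\$\d+\.\d+", "Total: $19.99")
print(price.group())  # $19.99

# re.escape() adds those backslashes to arbitrary text for you.
needle = re.escape("3.14")
print(needle)  # 3\.14
print(re.search(needle, "pi is about 3.14") is not None)  # True
print(re.search(needle, "3x14") is not None)              # False
```

Writing patterns as raw strings (`r"..."`) keeps Python from interpreting the backslashes before the `re` module sees them.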

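## Special Sequences

For ```str``` patterns in Python 3 these sequences are Unicode-aware by default, so the bracket equivalents below hold exactly only with the ```re.ASCII``` flag or for ```bytes``` patterns.

* ```\d``` Matches any decimal digit, equivalent to ```[0-9]```
* ```\D``` Matches any non-digit character, equivalent to ```[^0-9]```
* ```\s``` Matches any whitespace character, equivalent to ```[ \t\n\r\f\v]```
* ```\S``` Matches any non-whitespace character, equivalent to ```[^ \t\n\r\f\v]```
* ```\w``` Matches any word character (letter, digit, or underscore), equivalent to ```[a-zA-Z0-9_]```
* ```\W``` Matches any non-word character, equivalent to ```[^a-zA-Z0-9_]```
* Other sequences, such as ```\b``` (word boundary) and ```\A``` (start of string), are listed in the ```re``` library reference linked above.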
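A short sketch of the sequences above in use. The sample string is invented for illustration, and the last two lines show the assumption worth remembering: without `re.ASCII`, `\d` in a `str` pattern also matches non-ASCII decimal digits.

```python
import re

sample = "Room 42, floor 7\tBlock_B"

digits = re.findall(r"\d+", sample)
words = re.findall(r"\w+", sample)
parts = re.split(r"\s+", sample)
print(digits)  # ['42', '7']
print(words)   # ['Room', '42', 'floor', '7', 'Block_B']
print(parts)   # ['Room', '42,', 'floor', '7', 'Block_B']

# "\u0663" is ARABIC-INDIC DIGIT THREE, a Unicode decimal digit.
mixed = "5\u0663"
unicode_digits = re.findall(r"\d", mixed)
ascii_digits = re.findall(r"\d", mixed, flags=re.ASCII)
print(len(unicode_digits))  # 2
print(ascii_digits)         # ['5']
```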

## Sets

* ```[arn]``` Matches any one of the listed characters (a, r, or n)
* ```[a-n]``` Matches any lowercase letter from a to n
* ```[^arn]``` Matches any character except a, r, and n
* ```[0123]``` Matches any one of the listed digits (0, 1, 2, or 3)
* ```[0-9]``` Matches any digit from 0 to 9
* ```[0-5][0-9]``` Matches any two-digit number from 00 to 59
* ```[a-zA-Z]``` Matches any letter from a to z, lowercase or uppercase
* ```[+]``` Inside a set, ```+```, ```*```, ```.```, ```|```, ```(```, ```)```, ```$```, ```{```, and ```}``` have no special meaning, so ```[+]``` matches a literal + character
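The set patterns above can be tried out with `re.findall()`. This is a small sketch; the input strings are chosen only for illustration.

```python
import re

listed = re.findall(r"[arn]", "The rain")
negated = re.findall(r"[^arn]", "rain")
minutes = re.findall(r"[0-5][0-9]", "08:45:99")
plus_signs = re.findall(r"[+]", "1+1=2")

print(listed)      # ['r', 'a', 'n']
print(negated)     # ['i']
print(minutes)     # ['08', '45']  ("99" is rejected: 9 is not in [0-5])
print(plus_signs)  # ['+']
```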

## Performing Matches

* ```match()``` Determines whether the RE matches at the beginning of the string.
* ```search()``` Scans through a string and returns the first location where the RE matches.
* ```findall()``` Finds all non-overlapping substrings where the RE matches and returns them as a list.
* ```finditer()``` Finds all non-overlapping substrings where the RE matches and returns them as an iterator of match objects.
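To make the difference between the four methods concrete, here is a minimal sketch that runs the same compiled pattern through each of them. The pattern and the sample text are illustrative choices, not taken from the references above.

```python
import re

pattern = re.compile(r"\d+")
text = "ab 12 cd 345"

at_start = pattern.match(text)            # None: the text does not start with a digit
first = pattern.search(text)              # first match anywhere in the text
all_strings = pattern.findall(text)       # list of matched strings
all_spans = [(m.span(), m.group()) for m in pattern.finditer(text)]

print(at_start)       # None
print(first.group())  # 12
print(all_strings)    # ['12', '345']
print(all_spans)      # [((3, 5), '12'), ((9, 12), '345')]
```

`finditer()` is usually the better choice when positions are needed, since each item is a full match object rather than a plain string.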

In [1]:
import re

txt = "The rain in Spain"
x = re.search(r"^The.*Spain$", txt)
print(x)
if x:
    print("Found a match!")
else:
    print("No match")

<re.Match object; span=(0, 17), match='The rain in Spain'>
Found a match!


In [2]:
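# Named text rather than str, so the built-in str type is not shadowed.
text = 'purple alice-b@google.com monkey dishwasher'
# \w does not match '-' or '.', so only part of the address is captured.
match = re.search(r'\w+@\w+', text)
if match:
    print(match.group())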

b@google
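The ```b@google``` result is truncated because ```\w``` excludes the hyphen and the dot. One possible fix, sketched here, is to replace ```\w``` with a set that also allows those two characters. This is an illustration of sets, not a complete email validator.

```python
import re

text = "purple alice-b@google.com monkey dishwasher"

# [\w.-] accepts word characters plus '.' and '-'; a '-' placed last
# in a set is treated as a literal hyphen, not as a range.
match = re.search(r"[\w.-]+@[\w.-]+", text)
if match:
    print(match.group())  # alice-b@google.com
```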


In [3]:
p = re.compile('[a-z]+')
print(p)
m = p.match('tempo')
print(m)
if m:
    print('Match found: ', m.group())
else:
    print('No match')

re.compile('[a-z]+')
<re.Match object; span=(0, 5), match='tempo'>
Match found:  tempo


In [4]:
import re
m = re.search('(?<=abc)def', 'abcdef')
m.group(0)

'def'

In [5]:
m = re.search(r'(?<=-)\w+', 'spam-egg')
m.group(0)

'egg'