# `re` and Regular Expressions
`re` is the module in Python's standard library for working with regular expressions.

In [1]:
import re

In [2]:
text = 'I am the very model of a modern major general'

In [3]:
pattern = 'model'

In [5]:
match = re.search(pattern,text)

In [8]:
match.span()

(14, 19)

In [9]:
multi_match = re.findall("mod",text)

In [10]:
multi_match

['mod', 'mod']

In [13]:
for matches in re.finditer("mod",text):
    print(matches.span())

(14, 17)
(25, 28)
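One detail the cells above do not show is what happens when nothing matches. `re.search` returns `None` in that case, so calling `.span()` on the result raises an `AttributeError`. A minimal sketch of a guarded lookup, where the names `hit`, `miss`, `hit_span`, and `miss_span` are illustrative only:

```python
import re

text = 'I am the very model of a modern major general'

# re.search returns a Match object on success and None on failure,
# so test the result before calling methods such as span().
hit = re.search('model', text)
miss = re.search('admiral', text)  # 'admiral' does not occur in text

hit_span = hit.span() if hit else None
miss_span = miss.span() if miss else None

print(hit_span)   # (14, 19)
print(miss_span)  # None
```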


## Regex patterns

|Character|Description|Example pattern|Example match|
|:--|:--|:--|:--|
|`\d`|Digits|`test_\d\d\d`|'test_123'|
|`\w`|Word characters (letters, digits, underscore)|`\w\w\w`|'xy4'|
|`\s`|Whitespace|`\s\w\s\w\s\w`|' a b 4'|
|`\D`|Non-digits|`\D\D\s\D`|'QR S'|
|`\W`|Non-word characters|`\W\W\W`|';-)'|
|`\S`|Non-whitespace|`\S\S\S`|'Foo'|

In [27]:
sentence = 'the haunted phone number is 451-666-6665'

In [28]:
phone_pattern = re.search(r'\d\d\d-\d\d\d-\d\d\d\d',sentence)

In [29]:
phone_pattern

<re.Match object; span=(28, 40), match='451-666-6665'>

In [30]:
phone_pattern.group()

'451-666-6665'
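The character classes in the table can be checked directly with `re.findall`. The short sketch below uses a made-up string, `sample`, and also shows that `\w` accepts an underscore, which is easy to overlook:

```python
import re

sample = 'test_123 ;-)'  # illustrative string, not from the cells above

digits = re.findall(r'\d', sample)    # every digit character
non_word = re.findall(r'\W', sample)  # everything that is not a letter, digit, or underscore

# \w matches the underscore as well as letters and digits
underscore_is_word = re.fullmatch(r'\w', '_') is not None

print(digits)              # ['1', '2', '3']
print(non_word)            # [' ', ';', '-', ')']
print(underscore_is_word)  # True
```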

## Regex Pattern Quantifiers

|Quantifier|Description|Example pattern|Example match|
|:--|:--|:--|:--|
|`+`|One or more times|`test_\d+`|'test_123'|
|`{x}`|Exactly x times|`\w{3}`|'xy4'|
|`{x,y}`|Between x and y times|`\d\s{4,7}\d`|'0    8'|
|`{x,}`|x or more times|`\D{3,}`|'QRST'|
|`*`|Zero or more times|`ABC*`|'ABCCC'|
|`?`|Zero or one time|`t?e?st`|'est'|

In [31]:
shorter_ph_pattern = re.search(r'\d{3}-\d{3}-\d{4}',sentence)

In [32]:
shorter_ph_pattern

<re.Match object; span=(28, 40), match='451-666-6665'>
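A quick way to get a feel for the quantifiers is to test whole strings with `re.fullmatch`, which succeeds only if the pattern covers the entire string. The list `checks` below is an illustrative set of cases, not an exhaustive one. Note that a quantifier applies only to the item immediately before it, so in `ABC*` the `*` repeats `C` alone.

```python
import re

# Each entry: (pattern, candidate string, should the whole string match?)
checks = [
    (r'test_\d+', 'test_123', True),
    (r'test_\d+', 'test_', False),      # + needs at least one digit
    (r'\w{3}', 'xy4', True),
    (r'\d\s{4,7}\d', '0    8', True),   # four spaces between the digits
    (r'\d\s{4,7}\d', '0  8', False),    # only two spaces
    (r'ABC*', 'AB', True),              # zero occurrences of C
    (r'ABC*', 'ABCCC', True),           # three occurrences of C
    (r'ABC*', 'ABB', False),            # * repeats only the C, not the B
    (r't?e?st', 'est', True),           # the optional t is absent
]

results = [re.fullmatch(pattern, candidate) is not None
           for pattern, candidate, _ in checks]

for (pattern, candidate, expected), got in zip(checks, results):
    print(f'{pattern!r:18} {candidate!r:12} expected={expected} got={got}')
```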

In [36]:
grouped_ph_pattern = re.compile(r'(\d{3})-(\d{3})-(\d{4})')

In [37]:
new_result = grouped_ph_pattern.search(sentence)

In [41]:
new_result.group(1)

'451'
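Beyond `group(1)`, a match object exposes the captured groups in a few other ways. The sketch below reuses the phone pattern; the names `phone_re`, `area`, and `exchange` are chosen here for illustration.

```python
import re

sentence = 'the haunted phone number is 451-666-6665'
phone_re = re.compile(r'(\d{3})-(\d{3})-(\d{4})')

m = phone_re.search(sentence)

whole = m.group(0)              # group 0 is the entire match, same as m.group()
parts = m.groups()              # tuple of all captured groups
area, exchange = m.group(1, 2)  # several groups can be requested at once

print(whole)           # 451-666-6665
print(parts)           # ('451', '666', '6665')
print(area, exchange)  # 451 666
```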

### OR
The pipe `|` matches either the pattern on its left or the pattern on its right.

In [42]:
new_sentence = 'I like cats and dogs and snakes and rabbits. I do not like spiders.'

In [44]:
re.search(r'dogs|giraffes',new_sentence)

<re.Match object; span=(16, 20), match='dogs'>
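Two behaviours of `|` are worth seeing in action. First, `re.search` returns the match that starts earliest in the text, regardless of the order of the alternatives. Second, parentheses limit how far the alternation reaches. A small sketch, with `first` and `liked` as illustrative names:

```python
import re

new_sentence = 'I like cats and dogs and snakes and rabbits. I do not like spiders.'

# 'snakes' is listed first, but 'cats' occurs earlier in the text, so it wins.
first = re.search(r'snakes|cats', new_sentence).group()

# Parentheses restrict the alternation to the animal names.
# With one capturing group, findall returns the group's text for each match.
liked = re.findall(r'and (dogs|snakes|rabbits)', new_sentence)

print(first)  # cats
print(liked)  # ['dogs', 'snakes', 'rabbits']
```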

### Wildcard
A period `.` matches any single character except a newline.

In [45]:
at_sentence = 'Batman and Catwoman were having tea with Gnatman.'

In [47]:
re.findall(r'.at',at_sentence)

['Bat', 'Cat', 'nat']
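Because `.` is a wildcard, matching a literal period requires escaping it as `\.`. By default the wildcard also skips newlines unless the `re.DOTALL` flag is passed. The following sketch illustrates both points; the two-line string `'a\nb'` is a made-up test input.

```python
import re

at_sentence = 'Batman and Catwoman were having tea with Gnatman.'

# An escaped dot matches only a literal period.
literal_dot = re.findall(r'man\.', at_sentence)

# By default the wildcard does not match a newline ...
no_newline = re.search(r'a.b', 'a\nb')
# ... but it does when re.DOTALL is set.
with_dotall = re.search(r'a.b', 'a\nb', re.DOTALL)

print(literal_dot)              # ['man.']
print(no_newline)               # None
print(with_dotall is not None)  # True
```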

### Starts with and Ends with
The caret symbol `^` anchors a pattern to the start of the string. The dollar symbol `$` anchors a pattern to the end of the string.

In [49]:
another_sentence = '1 is the loneliest number that you\'ll ever do'
re.search(r'^1', another_sentence)

<re.Match object; span=(0, 1), match='1'>

In [50]:
another_sentence2 = '2 can be as bad as 1 it\'s the loneliest number since the number 1'
re.search(r'1$',another_sentence2)

<re.Match object; span=(64, 65), match='1'>
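By default `^` and `$` refer to the start and end of the whole string. With the `re.MULTILINE` flag they also match at the start and end of each line. A brief sketch, using a hypothetical two-line string `lyrics`:

```python
import re

lyrics = '1 is the loneliest number\n2 can be as bad as 1'

# Without the flag, ^ matches only at the very start of the string.
default_starts = re.findall(r'^\d', lyrics)
# With re.MULTILINE, ^ also matches right after each newline.
multiline_starts = re.findall(r'^\d', lyrics, re.MULTILINE)

# 'number' ends the first line, not the string, so $ needs the flag to match here.
default_end = re.search(r'number$', lyrics)
multiline_end = re.search(r'number$', lyrics, re.MULTILINE)

print(default_starts)             # ['1']
print(multiline_starts)           # ['1', '2']
print(default_end)                # None
print(multiline_end is not None)  # True
```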

### Exclude
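Square brackets `[]` define a set of characters. Placing a caret `^` immediately after the opening bracket negates the set, so the pattern matches any character that is not listed.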

In [51]:
new_phrase = 'There are 39 steps'

In [53]:
re.findall(r'[^\d]+',new_phrase)

['There are ', ' steps']

In [54]:
transformative_phrase = 'bah-weep grana-weep ninnibong! bah-weep grana-weep ninnibong?'

In [57]:
re.findall(r'[^! ?]+', transformative_phrase) # exclude exclamation marks, question marks, and spaces

['bah-weep', 'grana-weep', 'ninnibong', 'bah-weep', 'grana-weep', 'ninnibong']
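For comparison, here is a short sketch that puts a plain character set next to its negated form. It also shows that the caret only negates when it comes first inside the brackets; elsewhere it is an ordinary character. The strings `'steps'` and `'3^2'` are made-up inputs.

```python
import re

new_phrase = 'There are 39 steps'

only_digits = re.findall(r'[0-9]+', new_phrase)   # characters in the set
no_digits = re.findall(r'[^0-9]+', new_phrase)    # characters NOT in the set

vowels = re.findall(r'[aeiou]', 'steps')          # a set can list individual characters

# A caret that is not the first character in the brackets is a literal '^'.
literal_caret = re.findall(r'[\d^]', '3^2')

print(only_digits)    # ['39']
print(no_digits)      # ['There are ', ' steps']
print(vowels)         # ['e']
print(literal_caret)  # ['3', '^', '2']
```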