# Regular Expressions in Python
Regular expressions (regex) are a way to search a string using patterns. To use them in Python, import the *re* module, which provides several functions for pattern matching, including:
- **re.findall(pattern, str)**: returns a list containing all matches
- **re.finditer(pattern, str)**: returns an iterator of match objects
- **re.search(pattern, str), re.match(pattern, str)**: each returns a match object, or None if nothing matches. re.search() finds the first occurrence anywhere in the string, while re.match() only matches at the beginning of the string.
- **re.split(pattern, str)**: returns a list of the pieces of the string, split at each match
- **re.sub(pattern, replacement, str, count=0)**: returns a new string in which the matches are replaced with the replacement string; count=0 replaces every match.
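The functions listed above can be compared side by side in a minimal sketch. The sample string and variable names are illustrative assumptions, not part of the original notebook.

```python
import re

text = "cat bat rat"
pattern = r"[bcr]at"

# findall returns the matched substrings as a list
all_matches = re.findall(pattern, text)

# finditer yields match objects, which also carry positions
spans = [m.span() for m in re.finditer(pattern, text)]

# search scans the whole string and stops at the first match
first = re.search(r"[br]at", text)

# match only succeeds at the beginning of the string, so this is None
at_start = re.match(r"[br]at", text)

# split cuts the string wherever the pattern matches
parts = re.split(r"\s+", text)

# sub replaces matches; count=2 limits it to the first two
replaced = re.sub(pattern, "dog", text, count=2)

print(all_matches)    # ['cat', 'bat', 'rat']
print(spans)          # [(0, 3), (4, 7), (8, 11)]
print(first.group())  # bat
print(at_start)       # None
print(parts)          # ['cat', 'bat', 'rat']
print(replaced)       # dog dog rat
```

Note that re.search() and re.match() return None when nothing matches, so the result should be checked before calling group() on it.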

In [1]:
import re

## Patterns
A pattern represents a set of strings, which can be infinite. The backslash character can alter the meaning of letters and numbers in regex. Common regular expression notation includes:
|Symbol | Meaning                  | Example         |
|:-----:|:-------------------------|:---------------:|
|`^`| Matches the start of the string| re.findall(r'^P', string)|
|`[]`| Matches any single character listed inside the brackets| re.match(r'[Pp]erfume', string), P or p will match|
|`[^...]`| Matches any single character not listed after the caret| re.match(r'[^ast]', string)|
|`.`| Matches any character except newlines | re.findall('he..o', string)|
|`$`| Matches the end of the string| re.findall('planet$', string)|
|`*`| Matches zero or more occurrences of the preceding element| re.findall(r'he*a', string)|
|`+`| Matches one or more occurrences of the preceding element| re.match(r'h+art', string)|
|`?`| Matches zero or one occurrence of the preceding element| re.findall(r'P?erfume', string)|
|`{m,n}`| Matches between m and n occurrences of the preceding element; `{m}` matches exactly m| re.findall(r'he.{2}o', string), any 2 characters after he|
|`\`| Escapes a special character or introduces a special sequence| re.findall(r'\d', string), finds any digit [0-9]|
|`\|`| Either or | re.match(r'he\|sa', string)|
|`()`| Used to group sub-patterns| re.findall(r'(^P\|[1-2])', string)|
|`\d`| Same as [0-9] for ASCII text, matches any digit| re.findall(r'\d', string)|
|`\D`| Matches anything besides a digit| re.findall(r'\D', string)|
|`\s`| Matches a whitespace character| re.match(r'\s', string)|
|`\S`| Matches a non-whitespace character| re.match(r'\S', string)|
|`\w`| Same as [a-zA-Z0-9_] for ASCII text, matches one word character (letter, digit, or underscore)| re.match(r'\w', string)|
|`\W`| Matches any non-word character| re.match(r'\W', string)|
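A few rows of the table can be tried out in a short sketch. The sample sentence is a made-up assumption chosen so that each pattern has something to find; the results in the comments apply to this string only.

```python
import re

sample = "Perfume and perfume: hello, heeea, ha, 42 items"

# Character set: either an upper- or lower-case p
either_case = re.findall(r"[Pp]erfume", sample)      # ['Perfume', 'perfume']

# Negated set: anything that is not a lower-case letter or whitespace
not_lower = re.findall(r"[^a-z\s]", sample)          # ['P', ':', ',', ',', ',', '4', '2']

# Quantifiers applied to the preceding "e"
zero_or_more = re.findall(r"he*a", sample)           # ['heeea', 'ha']
one_or_more = re.findall(r"he+a", sample)            # ['heeea']
zero_or_one = re.findall(r"he?a", sample)            # ['ha']

# {2} repeats the preceding "." exactly twice
two_any = re.findall(r"he.{2}o", sample)             # ['hello']

# Alternation: either of the two words
either_word = re.findall(r"hello|ha", sample)        # ['hello', 'ha']

# Groups: findall returns a tuple of the groups for each match
number_and_word = re.findall(r"(\d+) (\w+)", sample) # [('42', 'items')]

for result in (either_case, not_lower, zero_or_more, one_or_more,
               zero_or_one, two_any, either_word, number_and_word):
    print(result)
```

Writing the patterns as raw strings (the r prefix) keeps Python from interpreting backslashes such as the one in `\d` before the *re* module sees them.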

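The patterns *\A and \Z* are similar to *^ and $* respectively, except that they match only at the very start and the very end of the string: they never match at line breaks inside the string, even with re.MULTILINE, and *\Z* does not match just before a trailing newline. Additionally, *\b* matches at the start or end of a word, while *\B* does the opposite and matches at any position that is not a word boundary. All of these are zero-width assertions: they match a position in the string rather than consuming a character.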
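The difference between the anchors is easiest to see on a multi-line string. The following is a small sketch; the two sample strings are assumptions picked for illustration.

```python
import re

lines = "first line\nsecond line"

# Without MULTILINE, ^ only matches at the start of the whole string
caret_default = re.findall(r"^\w+", lines)                     # ['first']

# With MULTILINE, ^ and $ also match at the start and end of each line
caret_multi = re.findall(r"^\w+", lines, flags=re.MULTILINE)   # ['first', 'second']
dollar_multi = re.findall(r"\w+$", lines, flags=re.MULTILINE)  # ['line', 'line']

# \A and \Z ignore MULTILINE and stay tied to the whole string
a_multi = re.findall(r"\A\w+", lines, flags=re.MULTILINE)      # ['first']
z_multi = re.findall(r"\w+\Z", lines, flags=re.MULTILINE)      # ['line']

words = "cat concat cats"

# \b on both sides: "cat" as a whole word only
whole_word = [m.span() for m in re.finditer(r"\bcat\b", words)]  # [(0, 3)]

# \B before: "cat" at the end of a longer word ("concat")
word_end = [m.span() for m in re.finditer(r"\Bcat\b", words)]    # [(7, 10)]

# \B after: "cat" followed by more word characters ("cats")
word_start = [m.span() for m in re.finditer(r"cat\B", words)]    # [(11, 14)]

print(caret_default, caret_multi, dollar_multi, a_multi, z_multi)
print(whole_word, word_end, word_start)
```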

In [3]:
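s = "Doing things, going home, staying awake, sleeping later"
# Words ending in "ing"; "things" is skipped because its "ing" is followed by "s", not a word boundary
re.findall(r'\w+ing\b', s)

find_int = '38 + 32 = 70'
print('Find all the integers in the string: ')
# An optional sign followed by one or more digits
re.findall(r'[+-]?\d+', find_int)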

Find all the integers in the string: 


['38', '32', '70']