# Regex
https://docs.python.org/3/howto/regex.html#more-metacharacters

## 1. Simple pattern

In [3]:
import re

In [4]:
p = re.compile(r"abc")
m = p.search("asdfabcasdf")
print(f"{m.start()}, {m.end()}, {m.group()}")

4, 7, abc


## 2. Match APIs
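- `match()`: match only at the beginning of the string
- `search()`: scan the string and return the first match found anywhere
- `findall()`: return all non-overlapping matches as a list
- `finditer()`: return all non-overlapping matches as an iterator of match objects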
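
As a minimal sketch of how the four calls differ, the snippet below runs the same compiled pattern through each of them. The sample text and variable names are made up for illustration.

```python
import re

p = re.compile(r"\d+")
text = "abc 12 de 345"

# match() only succeeds if the pattern matches at position 0.
m_match = p.match(text)
print(m_match)                      # None, the text starts with letters

# search() scans forward and returns the first match it finds.
m_search = p.search(text)
print(m_search.group(), m_search.span())   # 12 (4, 6)

# findall() returns the matched strings as a list.
all_matches = p.findall(text)
print(all_matches)                  # ['12', '345']

# finditer() yields match objects, so positions are available too.
spans = [(m.start(), m.end()) for m in p.finditer(text)]
print(spans)                        # [(4, 6), (10, 13)]
```

Since `match()` and `search()` return `None` when nothing matches, it is safer to check the result before calling `group()` on it.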

## 3. Character Class
- `[abc]`, `[a-z]`, `[0-9]`, etc.
- `^`: complements the set, e.g. `[^abc]`. Note that it must be the first character inside the brackets; anywhere else it is a literal `^`
- `\d`: `[0-9]`
- `\s`: `[ \t\n\r\f\v]`
- `\w`: `[a-zA-Z0-9_]`
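
The shorthand classes and the complement rule can be checked quickly with `findall()`. This is only a sketch on made-up ASCII input; the variable names are illustrative.

```python
import re

text = "id_42: ok\t7"

digits = re.findall(r"\d", text)    # single digits
words = re.findall(r"\w+", text)    # runs of letters, digits and underscore
spaces = re.findall(r"\s", text)    # whitespace characters
print(digits)   # ['4', '2', '7']
print(words)    # ['id_42', 'ok', '7']
print(spaces)   # [' ', '\t']

# ^ as the first character complements the set.
non_vowels = re.findall(r"[^aeiou]", "banana")
print(non_vowels)   # ['b', 'n', 'n']

# ^ anywhere else in the set is a literal caret.
caret = re.findall(r"[a^]", "a^b")
print(caret)        # ['a', '^']
```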

In [5]:
p = re.compile(r"[abc][123]")
m = p.search("aaabbb111222ccc")
print(f"{m.start()}, {m.end()}, {m.group()}")

5, 7, b1


## 4. Repetition
- `{m,n}`: at least m, at most n repetitions (no space after the comma)
- `*`: same as `{0,}`
- `+`: same as `{1,}`
- `?`: same as `{0,1}`
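
To see the quantifiers side by side, the sketch below applies each one to the letter `b` in short made-up strings. Quantifiers are greedy by default, which is why `{2,3}` takes three `b`s when four are available.

```python
import re

star = re.findall(r"ab*", "a ab abbb")        # zero or more b
plus = re.findall(r"ab+", "a ab abbb")        # one or more b
opt = re.findall(r"ab?", "a ab abbb")         # zero or one b
rng = re.findall(r"ab{2,3}", "ab abb abbbb")  # two to three b

print(star)  # ['a', 'ab', 'abbb']
print(plus)  # ['ab', 'abbb']
print(opt)   # ['a', 'ab', 'ab']
print(rng)   # ['abb', 'abbb']
```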

In [6]:
p = re.compile(r"[0-9]{3}")
for itr in p.finditer("asdf123def456"):
    print(itr.group())

123
456


## 5. Group
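- `()`: capture a group; `group(0)` is the whole match, `group(1)` the first group, and so on
- `|`: or (alternation)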
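
A small sketch combining both features: alternation inside a group picks one of two words, and a second group captures the number that follows. The pattern and sample sentence are invented for illustration.

```python
import re

p = re.compile(r"(cat|dog)s?-(\d+)")
text = "I have dogs-12 and cat-3"

m = p.search(text)
print(m.group(0))   # dogs-12   (whole match)
print(m.group(1))   # dog       (first group, chosen by the alternation)
print(m.group(2))   # 12        (second group)
print(m.groups())   # ('dog', '12')

# With more than one group, findall() returns a tuple per match.
pairs = p.findall(text)
print(pairs)        # [('dog', '12'), ('cat', '3')]
```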

In [None]:
p = re.compile(r"\+?(\d)-(\d{3})-(\d{3})-(\d{4})")
for itr in p.finditer("my numbers are 1-123-456-7890 and +2-345-667-1234"):
    print(f"{itr.group(0)