
# Python Regular Expressions (re) — Exhaustive Combination Notebook

This notebook aims for **near-complete practical coverage** of Python regex.
It demonstrates the **most useful combinations** of:

- Character classes
- Quantifiers
- Character sets
- Groups
- Anchors
- Alternation
- Greedy vs Lazy
- Lookarounds
- Flags
- re module functions

⚠️ Note:
The space of possible patterns is unbounded, so no notebook can list every combination. This one covers the **conceptually distinct patterns**
that come up in **interviews, production code, and teaching**.

---


In [1]:
import re

## 1. Character Classes — Single, Pair, Sequence


Covers:
- \d \D \w \W \s \S
- Adjacent usage
- Sequential usage


In [2]:

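# Only the last expression in a cell is displayed; each result is noted inline.
re.findall(r"\d", "a1b2")         # ['1', '2']
re.findall(r"\D", "a1b2")         # ['a', 'b']
re.findall(r"\w", "a_1$")         # ['a', '_', '1']
re.findall(r"\W", "a_1$")         # ['$']
re.findall(r"\s", "a b\tc")       # [' ', '\t']
re.findall(r"\S", "a b")          # ['a', 'b']

re.findall(r"\d\D", "1a2b")       # ['1a', '2b']
re.findall(r"\w\s", "a b")        # ['a ']
re.findall(r"\d+\w+", "12ab 9x")  # ['12ab', '9x']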


['12ab', '9x']
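
The six shorthand classes come in complementary pairs, which can be checked directly. The following is a minimal sketch; the sample string and the `table` name are illustrative choices, not part of the original cells.

```python
import re

# Illustrative sample: a letter, underscore, digit, space, symbol, and tab.
sample = "a_1 $\t"
classes = [r"\d", r"\D", r"\w", r"\W", r"\s", r"\S"]

# Map each class to the characters of the sample it matches.
table = {cls: re.findall(cls, sample) for cls in classes}
for cls, hits in table.items():
    print(cls, hits)

# Each upper-case class is the complement of its lower-case twin,
# so together a pair accounts for every character exactly once.
pairs = [(r"\d", r"\D"), (r"\w", r"\W"), (r"\s", r"\S")]
partitioned = all(len(table[lo]) + len(table[up]) == len(sample) for lo, up in pairs)
```

Note that the underscore counts as a word character, so it appears under \w rather than \W.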

## 2. Character Sets [ ] — Mixed & Negated


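A character set matches exactly one character out of the listed alternatives (OR logic at a single position). A leading ^ inside the brackets negates the set.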


In [18]:

# re.findall(r"[abc]", "a1b2c3")       # ['a', 'b', 'c']
# re.findall(r"[a-zA-Z0-9]", "aB1$")   # ['a', 'B', '1']
re.findall(r"[\d\s]", "1 a\t")         # ['1', ' ', '\t']
# re.findall(r"[^\d]", "a1b2")         # negated set: ['a', 'b']


['1', ' ', '\t']
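
The caret means different things inside and outside brackets, and most metacharacters become literal inside a set. Here is a small sketch of both points; the sample text is made up for illustration.

```python
import re

text = "a-1.b+2"  # illustrative sample

# Inside a set, '.' and '+' are literal; a hyphen placed last is literal too.
literal_symbols = re.findall(r"[.+-]", text)

# A caret as the first character of a set negates it: "any non-digit".
non_digits = re.findall(r"[^\d]", text)

# Outside brackets, the caret is the start-of-string anchor instead.
leading_digit = re.findall(r"^\d", text)

print(literal_symbols, non_digits, leading_digit)
```

The last call returns an empty list because the sample starts with a letter, which is why `^\d` is not a substitute for `[^\d]`.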

## 3. Quantifiers — + * ? {n} {n,m}


Quantifiers define how many times the preceding element may repeat.
⚠️ * and ? can match the empty string, so findall may return empty matches.


In [4]:

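re.findall(r"\d+", "a12b345")     # ['12', '345']
re.findall(r"\d*", "a1b")         # ['', '1', '', ''] on Python 3.7+ (empty matches included)
re.findall(r"\d?", "a1b")         # ['', '1', '', ''] on Python 3.7+
re.findall(r"\d{2}", "12345")     # ['12', '34'] (the trailing '5' is left over)
re.findall(r"\d{2,4}", "123456")  # ['1234', '56'] (greedy: takes up to 4 first)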


['1234', '56']
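
The shorthand quantifiers can be read as special cases of the brace form. The sketch below checks that equivalence on an illustrative string and shows one way to drop the empty matches that `*` produces.

```python
import re

text = "a1b22c333"  # illustrative sample

# + is {1,}, * is {0,}, and ? is {0,1}.
pairs = [(r"\d+", r"\d{1,}"), (r"\d*", r"\d{0,}"), (r"\d?", r"\d{0,1}")]
equivalent = all(
    re.findall(short, text) == re.findall(braces, text)
    for short, braces in pairs
)

# \d* also matches the empty string between non-digits;
# filtering those out leaves the same hits as \d+.
non_empty = [hit for hit in re.findall(r"\d*", text) if hit]

print(equivalent, non_empty)
```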

## 4. Groups — Capturing & Non-capturing


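Groups control precedence and extraction. When a pattern contains capturing groups, findall returns the group contents instead of the whole match.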


In [5]:

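re.findall(r"(ab)+", "abab ab")            # ['ab', 'ab'] (last repetition of the group)
re.findall(r"(\d+)-(\d+)", "12-34 56-78")  # [('12', '34'), ('56', '78')]
re.findall(r"(?:ab)+", "abab")             # ['abab'] (no capturing group, so the whole match)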


['abab']
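
A repeated capturing group keeps only its last repetition, so `findall` can hide the full match. A short sketch of how to recover it with `finditer`, plus named groups as a more readable way to extract parts; the group names `lo` and `hi` are hypothetical.

```python
import re

text = "abab ab"

# findall reports the capturing group, which holds only the last repetition.
last_repetition = re.findall(r"(ab)+", text)

# group(0) on a match object is always the whole match.
whole_matches = [m.group(0) for m in re.finditer(r"(ab)+", text)]

# Named groups (names chosen for illustration) give extraction by name.
m = re.search(r"(?P<lo>\d+)-(?P<hi>\d+)", "12-34")
bounds = m.groupdict()

print(last_repetition, whole_matches, bounds)
```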

## 5. Alternation | (OR logic)


Matches one of several alternatives, tried left to right at each position.


In [6]:

re.findall(r"cat|dog", "cat dog cow")  # ['cat', 'dog']
re.findall(r"\d+|\w+", "ab 12")        # ['ab', '12']


['ab', '12']
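
Two details of alternation are easy to miss: the first alternative that matches wins, and `|` has the lowest precedence of any operator. A minimal sketch with illustrative inputs:

```python
import re

# The engine takes the first alternative that matches, not the longest one.
first_wins = re.findall(r"cat|category", "category")
longest_first = re.findall(r"category|cat", "category")

# Without a group, anchors bind to one alternative each: (^cat) or (dog$).
ungrouped = re.findall(r"^cat|dog$", "catalog")

# A non-capturing group applies both anchors to every alternative.
grouped = re.findall(r"^(?:cat|dog)$", "catalog")

print(first_wins, longest_first, ungrouped, grouped)
```

When one alternative is a prefix of another, listing the longer one first is a common way to get the intended match.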

## 6. Anchors — ^ $ \b


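Anchors assert a position without consuming characters: ^ and $ match at the start and end of the string (or of each line with re.MULTILINE), and \b matches at a word boundary.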


In [7]:

re.findall(r"^\d+", "123abc\n456", re.MULTILINE)     # ['123', '456']
re.findall(r"\d+$", "abc123\nxyz456", re.MULTILINE)  # ['123', '456']
re.findall(r"\bcat\b", "cat scatter")                # ['cat'] ('scatter' has no boundary around 'cat')


['cat']
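
The effect of re.MULTILINE is easiest to see by running the same pattern with and without it. The sketch below reuses the strings from the cell above; the last input is an illustrative extension.

```python
import re

# Without the flag, ^ matches only at the very start of the string.
start_default = re.findall(r"^\d+", "123abc\n456")
start_multiline = re.findall(r"^\d+", "123abc\n456", re.MULTILINE)

# Without the flag, $ matches only at the end of the string.
end_default = re.findall(r"\d+$", "abc123\nxyz456")
end_multiline = re.findall(r"\d+$", "abc123\nxyz456", re.MULTILINE)

# \b treats punctuation as a boundary, so "cat." still counts as a whole word.
whole_words = re.findall(r"\bcat\b", "cat scatter cat.")

print(start_default, start_multiline, end_default, end_multiline, whole_words)
```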

## 7. Greedy vs Lazy Quantifiers


Greedy quantifiers match as much as possible and then backtrack; lazy quantifiers (a trailing ?) match as little as possible and then expand.


In [8]:

re.findall(r"<.*>", "<a><b>")   # ['<a><b>'] (greedy: runs to the last '>')
re.findall(r"<.*?>", "<a><b>")  # ['<a>', '<b>'] (lazy: stops at the first '>')


['<a>', '<b>']
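
Laziness applies to every quantifier, not just `*`, and a negated character set is a common alternative that gets the same result for tag-like input. A small sketch under those assumptions:

```python
import re

tags = "<a><b>"

greedy = re.findall(r"<.*>", tags)
lazy = re.findall(r"<.*?>", tags)

# A negated set cannot run past '>', so no lazy modifier is needed.
negated_set = re.findall(r"<[^>]*>", tags)

# The trailing ? also works on + and on brace quantifiers.
lazy_plus = re.findall(r"\d+?", "123")
lazy_range = re.findall(r"\d{2,4}?", "123456")

print(greedy, lazy, negated_set, lazy_plus, lazy_range)
```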

## 8. Lookarounds — Full Set


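Lookarounds assert what comes after or before the current position without consuming characters: (?=...) and (?!...) look ahead, (?<=...) and (?<!...) look behind.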


In [9]:

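re.findall(r"\d+(?=px)", "10px 20px")   # ['10', '20']
re.findall(r"\d+(?!px)", "10px 20em")   # ['1', '20'] (\d+ backtracks to '1', which is not followed by 'px')
re.findall(r"(?<=\$)\d+", "$100 $200")  # ['100', '200']
re.findall(r"(?<!\$)\d+", "$100 200")   # ['00', '200'] ('00' is preceded by '1', not '$')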


['00', '200']
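
The negative lookarounds above return partial numbers because the engine is free to start or stop in the middle of a run of digits. One possible fix, sketched here, is to also forbid a digit in the lookaround so that only whole numbers qualify.

```python
import re

# Naive forms: the match may begin or end inside a number.
naive_ahead = re.findall(r"\d+(?!px)", "10px 20em")
naive_behind = re.findall(r"(?<!\$)\d+", "$100 200")

# Rejecting a following digit stops \d+ from backtracking to a shorter number.
strict_ahead = re.findall(r"\d+(?!\d|px)", "10px 20em")

# Rejecting a preceding digit stops the match from starting mid-number.
strict_behind = re.findall(r"(?<![\$\d])\d+", "$100 200")

print(naive_ahead, strict_ahead, naive_behind, strict_behind)
```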

## 9. re Module Functions — Combined Usage


Demonstrates the same kinds of patterns across the most commonly used re functions.


In [10]:

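re.findall(r"\d+", "a1b22")           # ['1', '22']
re.search(r"\d+", "abc123").group()   # '123' (first match anywhere in the string)
re.match(r"\d+", "123abc")            # Match object for '123' (must match at the start)
re.fullmatch(r"\d+", "123")           # Match object (the whole string must match)
re.sub(r"(\d+)", r"<\1>", "a12b")     # 'a<12>b'
re.subn(r"\d", "#", "a1b2")           # ('a#b#', 2)
re.split(r"[,\s]+", "a, b  c")        # ['a', 'b', 'c']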


['a', 'b', 'c']
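
The three matching functions differ only in where the match is allowed to sit, and each returns None on failure. The sketch below shows that on one illustrative string, along with two optional arguments: a callable replacement for `re.sub` and `maxsplit` for `re.split`.

```python
import re

text = "abc123"

at_start = re.match(r"\d+", text)        # None: the string starts with letters
anywhere = re.search(r"\d+", text)       # finds '123'
whole = re.fullmatch(r"\d+", text)       # None: letters are not digits

# Guard against None before calling .group().
found = anywhere.group() if anywhere else None

# A function replacement receives each match object and returns the new text.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "a12b3")

# maxsplit limits the number of splits; the remainder is left untouched.
limited = re.split(r"[,\s]+", "a, b  c", maxsplit=1)

print(at_start, found, whole, doubled, limited)
```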

## 10. finditer() & compile() — Performance


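finditer() yields match objects one at a time instead of building a list, and compile() builds a pattern object that can be reused across calls.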


In [11]:

for m in re.finditer(r"\d+", "a1b22"):
    print(m.group(), m.span())

pat = re.compile(r"\w+")
pat.findall("hello_123")


1 (1, 2)
22 (3, 5)


['hello_123']
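
A compiled pattern exposes the same operations as methods, which keeps the regex in one named place. The following is a sketch; `NUMBER` and `spans` are hypothetical names introduced for illustration.

```python
import re

# Compile once, reuse everywhere (names here are illustrative).
NUMBER = re.compile(r"\d+")

def spans(text):
    """Return (matched_text, start, end) for every number in text."""
    return [(m.group(), m.start(), m.end()) for m in NUMBER.finditer(text)]

found = spans("a1b22")

# Pattern methods accept a start position; here the search begins at index 2.
from_index_two = NUMBER.findall("a1b22", 2)

print(found, from_index_two)
```

The `re` module also caches recently used patterns passed to its module-level functions, so the benefit of `compile()` is often readability and reuse as much as speed.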

## 11. Flags — IGNORECASE, MULTILINE, DOTALL


Flags modify regex behavior.


In [12]:

re.findall(r"cat", "Cat CAT", re.IGNORECASE)   # ['Cat', 'CAT']
re.findall(r"^\d+", "123\n456", re.MULTILINE)  # ['123', '456']
re.findall(r"a.*c", "a\nb\nc", re.DOTALL)      # ['a\nb\nc'] ('.' also matches newlines)


['a\nb\nc']
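
Flags can be combined with `|` or written inline at the start of the pattern. A brief sketch with an illustrative three-line string:

```python
import re

text = "Cat\ncat\nCAT"  # illustrative sample

# Combine flags with bitwise OR.
combined = re.findall(r"^cat$", text, re.IGNORECASE | re.MULTILINE)

# The same flags written inline: i = IGNORECASE, m = MULTILINE.
inline = re.findall(r"(?im)^cat$", text)

# Without DOTALL, '.' stops at a newline, so this pattern finds nothing.
without_dotall = re.findall(r"a.*c", "a\nb\nc")

print(combined, inline, without_dotall)
```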

## 12. Real-world Composite Patterns


Patterns combining multiple concepts.


In [13]:

re.fullmatch(r"[\w.-]+@[\w.-]+\.\w+", "test@example.com")  # Match object (loose email shape check)
re.findall(r"\b\w+\.\w+\b", "file.txt img.png")            # ['file.txt', 'img.png']
re.findall(r"(\d+)(kg|cm|m)", "10kg 20cm")                 # [('10', 'kg'), ('20', 'cm')]


[('10', 'kg'), ('20', 'cm')]
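
In practice, composite patterns like these are usually wrapped in small helpers. The sketch below is one way to do that; the function and constant names are hypothetical, the email pattern is only a loose shape check, and the trailing `\b` on the unit pattern is an added assumption to reject units outside the list.

```python
import re

# Loose shape check only; this does not validate real-world addresses.
EMAIL_LIKE = re.compile(r"[\w.-]+@[\w.-]+\.\w+")

# The trailing \b (an addition to the original pattern) keeps "7mm"
# from being read as 7 followed by the unit "m".
MEASURE = re.compile(r"(\d+)(kg|cm|m)\b")

def is_email_like(text):
    """True when the whole string has the shape name@host.suffix."""
    return EMAIL_LIKE.fullmatch(text) is not None

def parse_measurements(text):
    """Return (value, unit) pairs with the value converted to int."""
    return [(int(value), unit) for value, unit in MEASURE.findall(text)]

print(is_email_like("test@example.com"), is_email_like("test@example"))
print(parse_measurements("10kg 20cm 3m 7mm"))
```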