# Regular Expressions in Python

This notebook demonstrates how to use regular expressions (regex) in Python, starting with basic matching and moving on to groups, substitution, and compiled patterns.

We'll use the built-in `re` module.

In [1]:
import re

## Step 1: Basic Matching
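Use `re.search()` to find a simple pattern in a string. It scans the whole string and returns a match object for the first match, or `None` if the pattern does not occur.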

In [2]:

text = "Hello world!"
match = re.search("world", text)
if match:
    print("Found:", match.group())
else:
    print("Not found")


Found: world
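
`re.search()` is not the only entry point. As a minimal sketch (the variable names are illustrative only), the cell below contrasts it with `re.match()`, which only tries the start of the string, and `re.fullmatch()`, which requires the pattern to cover the whole string.

```python
import re

text = "Hello world!"

# re.search scans the whole string for the first occurrence.
found_anywhere = re.search("world", text)

# re.match only tries position 0, so "world" is not found there.
found_at_start = re.match("world", text)

# re.fullmatch succeeds only if the pattern spans the entire string.
found_whole = re.fullmatch("Hello world!", text)

print(found_anywhere is not None)  # True
print(found_at_start is not None)  # False
print(found_whole is not None)     # True
```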


## Step 2: Metacharacters
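Characters like `. ^ $ * + ? { } [ ] \ | ( )` have special meanings in regex. To match one of them literally, escape it with a backslash, and write the pattern as a raw string (`r"..."`) so Python does not interpret the backslash first.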

In [3]:

text = "The price is $100."
match = re.search(r"\$\d+", text)
if match:
    print("Found:", match.group())


Found: $100
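
To see why the escape matters, the sketch below runs the same search without it, then uses `re.escape()` to build the escaped pattern automatically. The `needle` variable is a hypothetical stand-in for text that arrives from elsewhere, such as user input.

```python
import re

text = "The price is $100."

# Unescaped, "$" is an end-of-string anchor and "." matches any character,
# so this pattern cannot match here.
unescaped = re.search("$100.", text)
print(unescaped)  # None

# re.escape adds the backslashes needed to treat a string literally.
needle = "$100."
escaped = re.escape(needle)
match = re.search(escaped, text)
print(escaped)        # \$100\.
print(match.group())  # $100.
```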


## Step 3: Character Classes and Ranges
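`\d` matches a digit, `\w` a word character (letter, digit, or underscore), and `\s` whitespace. Square brackets define custom classes and ranges such as `[a-z]`.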

In [4]:

text = "User123 logged in at 10:45AM"
print(re.findall(r"\d+", text))  # every run of digits
print(re.findall(r"\w+", text))  # every run of word characters, so '45AM' stays together


['123', '10', '45']
['User123', 'logged', 'in', 'at', '10', '45AM']
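
The cell above covers `\d` and `\w` only. A short sketch of the remaining pieces, `\s`, custom ranges, and a negated class, on the same text (the variable names are illustrative):

```python
import re

text = "User123 logged in at 10:45AM"

# [a-z] is a custom range: lowercase ASCII letters only.
lowercase_runs = re.findall(r"[a-z]+", text)
print(lowercase_runs)  # ['ser', 'logged', 'in', 'at']

# [A-Z] picks up the uppercase letters instead.
uppercase_runs = re.findall(r"[A-Z]+", text)
print(uppercase_runs)  # ['U', 'AM']

# \s matches whitespace, which makes it handy for splitting.
tokens = re.split(r"\s+", text)
print(tokens)  # ['User123', 'logged', 'in', 'at', '10:45AM']

# A leading ^ inside brackets negates the class:
# anything that is neither a word character nor whitespace.
punctuation = re.findall(r"[^\w\s]", text)
print(punctuation)  # [':']
```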


## Step 4: Anchors
`^` matches at the start of a string and `$` at the end (or just before a trailing newline). Neither consumes any characters; they only assert a position.

In [5]:

print(re.search(r"^User", "User123"))      # Match at start
print(re.search(r"123$", "User123"))      # Match at end


<re.Match object; span=(0, 4), match='User'>
<re.Match object; span=(4, 7), match='123'>
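
Anchors are easiest to understand from the cases where they fail. The sketch below shows two non-matches and then the effect of the `re.MULTILINE` flag; the two-line `log` string is made up for illustration.

```python
import re

# The text exists in the string, but not at the anchored position.
wrong_start = re.search(r"^123", "User123")
wrong_end = re.search(r"User$", "User123")
print(wrong_start, wrong_end)  # None None

log = "User123 logged in\nUser456 logged out"

# By default ^ matches only at the very start of the string.
first_only = re.findall(r"^User\d+", log)
print(first_only)  # ['User123']

# With re.MULTILINE, ^ also matches right after each newline.
every_line = re.findall(r"^User\d+", log, flags=re.MULTILINE)
print(every_line)  # ['User123', 'User456']
```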


## Step 5: Quantifiers
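Quantifiers control how many times the preceding element may repeat: `*` means zero or more, `+` one or more, `?` zero or one, `{n}` exactly n, and `{n,m}` between n and m.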

In [6]:

text = "hellooo"
print(re.search(r"o+", text))     # one or more o's
print(re.search(r"lo{2,3}", text))  # l followed by 2 or 3 o's


<re.Match object; span=(4, 7), match='ooo'>
<re.Match object; span=(3, 7), match='looo'>
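
The cell above exercises `+` and `{n,m}`. A sketch of the other quantifiers follows, plus the lazy form `+?`, which matches as few characters as possible. The `"color colour"` string is an added example, not part of the original data.

```python
import re

text = "hellooo"

# * allows zero repetitions, so the first 'l' matches on its own.
zero_or_more = re.search(r"lo*", text)
print(zero_or_more.group(), zero_or_more.span())  # l (2, 3)

# {n} demands exactly n repetitions.
exactly_two = re.search(r"o{2}", text)
print(exactly_two.group())  # oo

# ? makes the preceding element optional.
spellings = re.findall(r"colou?r", "color colour")
print(spellings)  # ['color', 'colour']

# Quantifiers are greedy by default; a trailing ? makes them lazy.
greedy = re.search(r"o+", text).group()
lazy = re.search(r"o+?", text).group()
print(greedy, lazy)  # ooo o
```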


## Step 6: Groups and Capturing
Use parentheses `()` to capture parts of the match.

In [7]:

text = "John's email is john@example.com"
match = re.search(r"(\w+)@(\w+\.com)", text)
if match:
    print("Username:", match.group(1))
    print("Domain:", match.group(2))


Username: john
Domain: example.com
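
Numbered groups get hard to track as patterns grow. One option is to name them with `(?P<name>...)`. The sketch below rewrites the same pattern that way; the group names `user` and `domain` are arbitrary choices.

```python
import re

text = "John's email is john@example.com"

# (?P<name>...) captures like () but also gives the group a name.
named = re.search(r"(?P<user>\w+)@(?P<domain>\w+\.com)", text)

print(named.group("user"))    # john
print(named.group("domain"))  # example.com

# groups() returns all captures as a tuple, groupdict() as a dict.
print(named.groups())     # ('john', 'example.com')
print(named.groupdict())  # {'user': 'john', 'domain': 'example.com'}
```

Named groups remain accessible by number as well, so `named.group(1)` still returns the username.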


## Step 7: Substitution
Use `re.sub()` to replace every match of a pattern with a replacement string.

In [None]:

text = "My number is 123-456-7890"
masked = re.sub(r"\d{3}-\d{3}", "***-***", text)  # mask the first two digit groups
print(masked)
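
The replacement does not have to be a fixed string. As a sketch, the cell below uses a backreference (`\1`) to keep part of the match, a function to compute the replacement, and the `count` argument to limit how many matches are replaced. `mask_digits` is a hypothetical helper written for this example.

```python
import re

text = "My number is 123-456-7890"

# \1 in the replacement re-inserts whatever group 1 captured.
keep_last_four = re.sub(r"\d{3}-\d{3}-(\d{4})", r"***-***-\1", text)
print(keep_last_four)  # My number is ***-***-7890

# A function receives each match object and returns the replacement text.
def mask_digits(match):
    return "*" * len(match.group())

all_masked = re.sub(r"\d+", mask_digits, text)
print(all_masked)  # My number is ***-***-****

# count limits the number of replacements, counted from the left.
first_three = re.sub(r"\d", "#", text, count=3)
print(first_three)  # My number is ###-456-7890
```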


## Step 8: Using `re.compile()`
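`re.compile()` builds a pattern object once so it can be reused. The object has the same methods as the module (`search`, `findall`, `sub`, and so on) without passing the pattern each time. In the example, `\b` marks a word boundary, so the pattern matches words of exactly four characters.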

In [None]:

pattern = re.compile(r"\b\w{4}\b")
text = "This text has some four letter words like word and test."
print(pattern.findall(text))
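
Reuse is where a compiled pattern pays off. The sketch below applies one pattern to several strings and passes a flag at compile time; the `lines` list and the names `four_letters` and `starts_with_t` are illustrative only.

```python
import re

# Compile once, then call methods on the pattern object.
four_letters = re.compile(r"\b\w{4}\b")

lines = [
    "This text has some four letter words",
    "like word and test.",
]
per_line = [four_letters.findall(line) for line in lines]
print(per_line)
# [['This', 'text', 'some', 'four'], ['like', 'word', 'test']]

# Flags are given once, at compile time.
starts_with_t = re.compile(r"\bt\w{3}\b", re.IGNORECASE)
t_words = starts_with_t.findall(" ".join(lines))
print(t_words)  # ['This', 'text', 'test']
```

Python also caches recently used patterns passed to module-level functions such as `re.search()`, so the main benefit of compiling is usually readability and having a single place to define the pattern and its flags.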
