# Python for Beginners — Lesson 10: Regular Expressions in Python
**What you will learn in this video:**
- What a regular expression is and why it's useful.
- How to import the `re` module and use functions like `findall`, `search`, `split` and `sub`.
- How to read basic metacharacters (`.`, `^`, `$`, `*`, `+`, `?`, etc.).
- How to use special sequences such as `\d`, `\w`, and `\s`.
- How to apply flags like `IGNORECASE` for case-insensitive matching.

**Tip:** Think of regex as a powerful search engine that understands patterns instead of exact text. It can find words, numbers, or formats inside strings based on rules you define.


# Introduction to Regular Expressions in Python

A **regular expression** (or *regex*) is a sequence of characters that defines a search pattern. You can use regexes to check whether a string contains a pattern, extract matching parts, or replace parts of a string. Python provides built‑in support for regular expressions through the `re` module.


In [1]:
import re  # Import the regular expression module


## Basic functions

The `re` module offers several functions for working with regular expressions:

* `re.findall(pattern, string)` – returns a list containing all matches.
* `re.search(pattern, string)` – returns a match object for the first match (or `None` if there is no match).
* `re.split(pattern, string)` – splits the string by matches of the pattern.
* `re.sub(pattern, replacement, string)` – replaces every match with the replacement (pass `count` to limit the number of replacements).


In [2]:
text = "The rain in Spain falls mainly in the plain 123."

# findall: find all occurrences of 'in'
print(re.findall(r"in", text))

# search: check if string starts with 'The' and ends with 'Spain' (it ends with '123.', so there is no match)
match = re.search(r"^The.*Spain$", text)
print("Match found?", bool(match))

# split: split by whitespace
print(re.split(r"\s+", text))

# sub: replace digits with '#'
print(re.sub(r"\d", "#", text))


['in', 'in', 'in', 'in', 'in', 'in']
Match found? False
['The', 'rain', 'in', 'Spain', 'falls', 'mainly', 'in', 'the', 'plain', '123.']
The rain in Spain falls mainly in the plain ###.
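
The cell above only tests whether `re.search` found something. As a small sketch of what the returned match object offers, the example below reuses the same `text` and reads the matched text and its position; the pattern `Portugal` is just an illustrative value that does not occur in the string.

```python
import re

text = "The rain in Spain falls mainly in the plain 123."

# re.search returns a match object for the first match, or None if nothing matches
match = re.search(r"\d+", text)
if match:
    print(match.group())  # the text that matched
    print(match.span())   # (start, end) positions of the match in the string

# No match: the result is None, so check it before calling .group()
missing = re.search(r"Portugal", text)
print(missing)

# count limits how many matches re.sub replaces
limited = re.sub(r"in", "IN", text, count=2)
print(limited)
```

Checking the result with `if match:` before calling `.group()` avoids an `AttributeError` when the pattern is not found.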


## Metacharacters

Regex patterns include special symbols called *metacharacters* that describe complex search criteria:

* `.` – matches any character except newline.
* `^` – matches the start of the string.
* `$` – matches the end of the string.
* `*` – matches zero or more occurrences of the preceding pattern.
* `+` – matches one or more occurrences.
* `?` – matches zero or one occurrence.
* `{n}` – matches exactly `n` repetitions.
* `|` – acts like “or” between patterns.
* `()` – groups patterns together.


In [3]:
text2 = "hello helo heeeello"

# . matches any character
print(re.findall(r"he..o", text2))  # matches 'hello'

# * matches zero or more
print(re.findall(r"he.*o", text2))  # greedy: one match from the first 'he' to the last 'o'

# + matches one or more
print(re.findall(r"he.+o", text2))  # requires at least one character between

# ? matches zero or one
print(re.findall(r"he.?o", text2))  # matches 'helo' only; 'hello' has two characters between 'he' and 'o'


['hello']
['hello helo heeeello']
['hello helo heeeello']
['helo']
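
The cell above does not exercise `{n}`, `|`, `()`, `^`, or `$`. The following sketch tries them on the same `text2`; the patterns are illustrative choices, not part of the original lesson.

```python
import re

text2 = "hello helo heeeello"

# {n}: exactly n repetitions of the preceding item ('l' must appear exactly twice)
exactly_two = re.findall(r"hel{2}o", text2)
print(exactly_two)

# |: match either alternative
either = re.findall(r"hello|helo", text2)
print(either)

# (): group part of a pattern; with one group, findall returns only the group's text
groups = re.findall(r"h(e+)l+o", text2)
print(groups)

# ^ and $: anchor the pattern to the start and the end of the string
starts_with_hello = bool(re.search(r"^hello", text2))
ends_with_helo = bool(re.search(r"helo$", text2))
print(starts_with_hello, ends_with_helo)
```

Note that once a pattern contains a group, `re.findall` returns the captured group rather than the whole match.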


## Special sequences

Special sequences start with a backslash (`\`) and represent common character classes:

* `\d` – matches any digit (0–9).
* `\D` – matches any non‑digit.
* `\w` – matches any word character (letters, digits, underscore).
* `\W` – matches any non‑word character.
* `\s` – matches any whitespace character (space, tab, newline).
* `\S` – matches any non‑whitespace character.


In [6]:
text3 = "User_01 scored 98 points."

# find all digits
print(re.findall(r"\d+", text3))

# find all words
print(re.findall(r"\w+", text3))

# find whitespace characters
print(re.findall(r"\s", text3))


['01', '98']
['User_01', 'scored', '98', 'points']
[' ', ' ', ' ']
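
The uppercase sequences `\D`, `\W`, and `\S` match the opposite of their lowercase counterparts. A brief sketch on the same sentence (the variable names are illustrative):

```python
import re

text3 = "User_01 scored 98 points."

# \D+: runs of non-digit characters
non_digits = re.findall(r"\D+", text3)
print(non_digits)

# \W: single non-word characters (here, the spaces and the final period)
non_word = re.findall(r"\W", text3)
print(non_word)

# \S+: runs of non-whitespace characters, i.e. whitespace-separated tokens
tokens = re.findall(r"\S+", text3)
print(tokens)
```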


## Flags

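Regex *flags* modify how patterns are interpreted. For example, `re.IGNORECASE` makes matching case‑insensitive. You can pass a flag through the `flags` parameter, which is the third argument of `re.search()` and `re.findall()`, or set it inline at the start of the pattern with `(?i)`. In `re.split()` and `re.sub()` other parameters come before `flags`, so pass it by keyword there.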


In [7]:
# Case-insensitive search
text4 = "Python is fun. PYTHON regex is powerful."
print(re.findall(r"python", text4, re.IGNORECASE))


['Python', 'PYTHON']
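
As a short sketch of the other ways to request case-insensitive matching, the example below uses the inline form `(?i)`, the short alias `re.I`, and the `flags` keyword with `re.sub`. The replacement word `"Regex"` is only an illustrative value.

```python
import re

text4 = "Python is fun. PYTHON regex is powerful."

# Inline flag: (?i) at the start of the pattern turns on case-insensitive matching
inline = re.findall(r"(?i)python", text4)
print(inline)

# re.I is a short alias for re.IGNORECASE
alias = re.findall(r"python", text4, flags=re.I)
print(alias)

# re.sub takes count before flags, so passing flags by keyword avoids mix-ups
replaced = re.sub(r"python", "Regex", text4, flags=re.IGNORECASE)
print(replaced)
```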
