# Day 28 — Regular Expressions (Regex)

1. What is Regex?
- A powerful tool for pattern matching and text searching.
- Used for validation, extraction, replacements.
- In Python, provided by the built-in 're' module.

2. Common Functions:
- re.match() → checks only at the beginning of string
- re.search() → checks anywhere in string
- re.findall() → returns a list of all non-overlapping matches
- re.finditer() → returns an iterator of match objects
- re.sub() → replaces text

3. Special Characters:
.  → any character except newline
^ → start of string
$ → end of string
[] → character set
() → group
+  → one or more
*  → zero or more
?  → zero or one
{n} → exact count
{n,m} → range repetition
| → OR operator
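
The examples later in this notebook do not exercise every symbol in the list above, so here is a small sketch of `{n}`, `{n,m}`, `?`, `|` and `.` in action. The sample strings are made up for illustration.

```python
import re

# {n}: exactly n repetitions of the preceding token
exact = re.findall(r"\d{3}", "12 345 6789")
print(exact)       # ['345', '678']

# {n,m}: between n and m repetitions (greedy, so it takes as many as it can)
ranged = re.findall(r"\d{2,3}", "1 22 333 4444")
print(ranged)      # ['22', '333', '444']

# ?: the preceding token is optional
optional = re.findall(r"colou?r", "color colour")
print(optional)    # ['color', 'colour']

# |: match either alternative
either = re.findall(r"cat|dog", "cat, dog, bird")
print(either)      # ['cat', 'dog']

# .: any character except a newline
dot = re.findall(r"c.t", "cat cot c\nt")
print(dot)         # ['cat', 'cot']
```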

4. Common Patterns:
\d → digit
\D → non-digit
\w → word character (a-z, A-Z, 0-9, _)
\W → non-word character
\s → whitespace (space, tab, newline)
\S → non-whitespace
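
To see how these shorthand classes behave on the same input, the sketch below runs several of them over one made-up string that contains a space, a tab, punctuation and an underscore.

```python
import re

sample = "Room 42,\tfloor_3!"   # made-up sample containing a space and a tab

words = re.findall(r"\w+", sample)
print(words)        # ['Room', '42', 'floor_3']

non_word = re.findall(r"\W", sample)
print(non_word)     # [' ', ',', '\t', '!']

whitespace = re.findall(r"\s", sample)
print(whitespace)   # [' ', '\t']

chunks = re.findall(r"\S+", sample)
print(chunks)       # ['Room', '42,', 'floor_3!']

# \D matches every non-digit, so removing those leaves only the digits
digits_only = re.sub(r"\D", "", sample)
print(digits_only)  # 423
```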

5. Use Cases:
- Email validation
- Phone number extraction
- Password validation
- Searching patterns
- Replacing words automatically


## EXAMPLES

In [1]:
import re

# Example 1: Simple match
text = "Hello Python"
print(re.match("Hello", text))

<re.Match object; span=(0, 5), match='Hello'>


In [2]:
# Example 2: search()
print(re.search("Python", text))

<re.Match object; span=(6, 12), match='Python'>


In [3]:
# Example 3: findall()
print(re.findall("o", text))

['o', 'o']


In [4]:
# Example 4: finditer()
for m in re.finditer("o", text):
    print(m.group(), m.start())

o 4
o 10


In [5]:
# Example 5: Character set
print(re.findall("[aeiou]", "beautiful"))

['e', 'a', 'u', 'i', 'u']


In [6]:
# Example 6: Using + quantifier
print(re.findall("a+", "caaaatt"))

['aaaa']


In [7]:
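# Example 7: Using ^ and $
# ^ anchors at the start; $ anchors at the end, so use search() to reach it
print(re.match("^Hello", "Hello world"))
print(re.search("world$", "Hello world"))

<re.Match object; span=(0, 5), match='Hello'>
<re.Match object; span=(6, 11), match='world'>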


In [8]:
# Example 8: Grouping ()
match = re.search(r"(\d{2})-(\d{2})-(\d{4})", "Today is 12-05-2024")
print(match.groups())

('12', '05', '2024')
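
Besides `groups()`, a match object lets you pull out one group at a time. The sketch below reuses the date string from Example 8 and adds named groups via `(?P<name>...)`; the names `day`, `month` and `year` are chosen here for illustration.

```python
import re

pattern = r"(?P<day>\d{2})-(?P<month>\d{2})-(?P<year>\d{4})"
match = re.search(pattern, "Today is 12-05-2024")

whole = match.group(0)        # group 0 is always the entire match
print(whole)                  # 12-05-2024

first = match.group(1)        # groups are numbered from 1, left to right
print(first)                  # 12

year = match.group("year")    # named groups can be read by name
print(year)                   # 2024

parts = match.groupdict()     # all named groups as a dictionary
print(parts)                  # {'day': '12', 'month': '05', 'year': '2024'}
```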


In [9]:
# Example 9: Replace using re.sub()
print(re.sub("Python", "Regex", "I love Python"))

I love Regex
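
`re.sub()` can do more than swap one fixed word for another. As a sketch, the replacement string may refer back to captured groups with `\1`, `\2`, and so on, and the optional `count` argument limits how many replacements are made. The input strings are made up for illustration.

```python
import re

# Reorder a DD-MM-YYYY date into YYYY/MM/DD using group backreferences
reordered = re.sub(r"(\d{2})-(\d{2})-(\d{4})", r"\3/\2/\1", "Today is 12-05-2024")
print(reordered)   # Today is 2024/05/12

# count=2 replaces only the first two occurrences
limited = re.sub("a", "X", "banana", count=2)
print(limited)     # bXnXna
```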


In [10]:
# Example 10: Extracting numbers
print(re.findall(r"\d+", "Marks: 45, Age: 22, Year: 2025"))

['45', '22', '2025']


## PRACTICE QUESTIONS

In [11]:
# Q1: Find all vowels
print(re.findall("[aeiouAEIOU]", "HELLO world"))

['E', 'O', 'o']


In [12]:
# Q2: Extract all digits
print(re.findall(r"\d", "a1b2c3"))

['1', '2', '3']


In [13]:
# Q3: Validate if string starts with 'Hi'
print(bool(re.match("Hi", "Hi Python")))

True


In [14]:
# Q4: Extract words
print(re.findall(r"\w+", "Python is awesome!"))

['Python', 'is', 'awesome']


In [15]:
# Q5: Match only 3-letter words
print(re.findall(r"\b\w{3}\b", "cat dog ball sun sky"))

['cat', 'dog', 'sun', 'sky']


In [16]:
# Q6: Replace digits with X
print(re.sub(r"\d", "X", "AB12CD34"))

ABXXCDXX


In [17]:
# Q7: Extract email pattern
# (?:...) is non-capturing, so findall() returns the whole address
print(re.findall(r"[a-zA-Z0-9._]+@[a-z]+\.(?:com|in)", "Contact: test123@gmail.com"))

['test123@gmail.com']
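
A note on groups and `findall()`: when a pattern contains a capturing group, `findall()` returns the text of that group rather than the whole match. The sketch below contrasts a capturing group `(com|in)` with a non-capturing group `(?:com|in)` on the same input, and shows `finditer()` as another way to get the full match.

```python
import re

text = "Contact: test123@gmail.com"

# Capturing group: findall() returns only what the group matched
captured = re.findall(r"[a-zA-Z0-9._]+@[a-z]+\.(com|in)", text)
print(captured)   # ['com']

# Non-capturing group: findall() returns the whole match
full = re.findall(r"[a-zA-Z0-9._]+@[a-z]+\.(?:com|in)", text)
print(full)       # ['test123@gmail.com']

# finditer() gives match objects, so group(0) is always the full match
via_iter = [m.group(0) for m in re.finditer(r"[a-zA-Z0-9._]+@[a-z]+\.(com|in)", text)]
print(via_iter)   # ['test123@gmail.com']
```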


In [18]:
# Q8: Find repeated characters
print(re.findall(r"(.)\1", "aabbcdddee"))

['a', 'b', 'd', 'e']


In [19]:
# Q9: Extract uppercase words
# \b on both sides keeps only words written entirely in capitals
print(re.findall(r"\b[A-Z]+\b", "India WIN World CUP"))

['WIN', 'CUP']


In [20]:
# Q10: Validate 6-digit pincode
print(bool(re.match(r"\d{6}$", "560001")))

True
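
One subtlety with validating via `match()` plus `$`: in Python, `$` also matches just before a trailing newline. If the whole string must be exactly six digits, `re.fullmatch()` is a stricter option. A small sketch, with a made-up input that ends in a newline:

```python
import re

with_newline = "560001\n"

# $ also matches just before a trailing newline, so this still passes
loose = bool(re.match(r"\d{6}$", with_newline))
print(loose)    # True

# fullmatch() requires the entire string to match the pattern
strict = bool(re.fullmatch(r"\d{6}", with_newline))
print(strict)   # False

clean = bool(re.fullmatch(r"\d{6}", "560001"))
print(clean)    # True
```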


## CHALLENGE QUESTIONS

In [21]:
# Challenge 1: Validate email
def is_email(s):
    return bool(re.match(r"^[\w.]+@[a-zA-Z]+\.(com|in|org)$", s))
print(is_email("test@gmail.com"))


True


In [22]:
# Challenge 2: Validate strong password
# Must contain: A-Z, a-z, 0-9, a special char (one of @ # $), min 8 chars
def strong_pass(p):
    return bool(re.match(r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[@#$]).{8,}$", p))
print(strong_pass("Abcd@1234"))

True
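
Each `(?=...)` in the pattern above is a lookahead: it checks that something appears later in the string without consuming any characters, so the four conditions are tested independently before `.{8,}` enforces the length. The sketch below repeats the same function and tries a few made-up passwords that each break one rule.

```python
import re

def strong_pass(p):
    # Same pattern as the cell above: four lookaheads, then a length check
    return bool(re.match(r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[@#$]).{8,}$", p))

results = {
    "Abcd@1234": strong_pass("Abcd@1234"),   # meets every rule
    "abcd@1234": strong_pass("abcd@1234"),   # no uppercase letter
    "ABCD@1234": strong_pass("ABCD@1234"),   # no lowercase letter
    "Abcd1234": strong_pass("Abcd1234"),     # none of @ # $
    "Ab@1": strong_pass("Ab@1"),             # shorter than 8 characters
}

for password, ok in results.items():
    print(password, ok)
```

Only the first password is accepted; the others each fail one of the checks.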


In [23]:
# Challenge 3: Extract dates DD/MM/YYYY
print(re.findall(r"\b\d{2}/\d{2}/\d{4}\b", "Dates: 12/03/2023, 05/05/2025"))

['12/03/2023', '05/05/2025']


In [24]:
# Challenge 4: Extract hashtags
print(re.findall(r"#\w+", "Learning #python #coding is fun!"))

['#python', '#coding']


In [25]:
# Challenge 5: Validate Indian mobile number
print(bool(re.match(r"^[6-9]\d{9}$", "9876543210")))

True


In [26]:
# Challenge 6: Extract all capitalised words
print(re.findall(r"\b[A-Z][a-z]*\b", "I Met John In Paris"))

['I', 'Met', 'John', 'In', 'Paris']


In [27]:
# Challenge 7: Extract time HH:MM
print(re.findall(r"\b\d{2}:\d{2}\b", "Time is 09:30 now"))

['09:30']


In [28]:
# Challenge 8: Replace multiple spaces with single space
print(re.sub(r"\s+", " ", "Python    is   awesome"))

Python is awesome


In [29]:
# Challenge 9: Find words ending with "ing"
print(re.findall(r"\w+ing\b", "I am learning coding and running daily"))

['learning', 'coding', 'running']


In [30]:
# Challenge 10: Extract domain names
# Capture the full domain; (?:...) keeps the suffix from being returned on its own
print(re.findall(r"@([a-zA-Z]+\.(?:com|in|org))", "Mail: abc@gmail.com test@xyz.in"))

['gmail.com', 'xyz.in']


## INTERVIEW QUESTIONS

#### Q1: What is regex?
#### A: Pattern matching tool for text searching and validation.

#### Q2: Difference between match() and search()?
#### A: match → only at start, search → anywhere.
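
A quick sketch of the difference, using a word that is not at the start of the string:

```python
import re

text = "Hello Python"

# match() only tries at position 0, and "Python" is not there
at_start = re.match("Python", text)
print(at_start)          # None

# search() scans the whole string and finds the first occurrence
anywhere = re.search("Python", text)
print(anywhere.span())   # (6, 12)
```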

#### Q3: What does findall() do?
#### A: Returns a list of all non-overlapping matches (or the group contents, if the pattern has capturing groups).

#### Q4: What is the use of groups?
#### A: To capture specific parts of a match.

#### Q5: What do +, *, ?, {} mean?
#### A: They are quantifiers: + is one or more, * is zero or more, ? is zero or one, {n} and {n,m} set an exact count or a range.

#### Q6: How to replace using regex?
#### A: Using re.sub().

#### Q7: What is a raw string (r"")?
#### A: A string literal in which backslashes are not treated as escape characters, so sequences like \d and \b reach the regex engine unchanged.
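
To make this concrete: in a normal Python string, `"\b"` is the single backspace character, while `r"\b"` is a backslash followed by `b`, which the regex engine reads as a word boundary. A small sketch with a made-up sentence:

```python
import re

lengths = (len("\b"), len(r"\b"))
print(lengths)   # (1, 2)

# Without r"", the pattern contains backspace characters and finds nothing
plain = re.findall("\bcat\b", "a cat sat")
print(plain)     # []

# With r"", \b is passed through as a word boundary
raw = re.findall(r"\bcat\b", "a cat sat")
print(raw)       # ['cat']
```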

#### Q8: What do \d and \w represent?
#### A: \d matches a digit; \w matches a word character (letter, digit, or underscore).

#### Q9: How to extract emails with regex?
#### A: Using re.findall() with a pattern like [\w.]+@[a-z]+\.(?:com|in); the non-capturing group (?:...) keeps findall() returning the whole address.

#### Q10: What are practical uses of regex?
#### A: Validation, scraping, cleaning data, searching patterns.
