# LAB | Regular Expressions (Regex) in Python

## Overview
This exercise notebook will help you practice using regular expressions in Python. Regular expressions are powerful tools for matching patterns in strings, which can be useful for validation, searching, and data manipulation.

## Instructions
- Complete each exercise by writing the appropriate regex pattern and Python code in the provided space.
- Test your code to ensure it works as expected.
<!-- - Use the hints provided if you get stuck. -->

### Exercise 1: Match Email Addresses
Write a regex pattern to match valid email addresses. An email address should contain an '@' symbol and a domain.

In [1]:
import re

# Example input
email = "example@example.com"

# Practical regex pattern:
# - Local part: letters, digits, and allowed symbols . _ % + -
# - '@' separator
# - Domain: letters, digits, dots, hyphens
# - TLD: 2–63 letters
pattern = r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,63}$"

# Test the regex
if re.fullmatch(pattern, email):
    print("Valid email")
else:
    print("Invalid email")


Valid email
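
A single passing input says little about where the pattern's limits are. The sketch below reuses the pattern from the cell above and runs it over a few made-up addresses (the sample strings and the `results` name are illustrative assumptions). Note that the last sample, with two consecutive dots in the domain, is accepted by this practical pattern.

```python
import re

# Same pattern as in the cell above; the sample addresses are made up for illustration.
EMAIL_PATTERN = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,63}$")

samples = [
    "example@example.com",        # ordinary address
    "first.last+tag@mail.co.uk",  # plus-tag and multi-label domain
    "no-at-sign.example.com",     # missing '@'
    "user@localhost",             # domain without a dot and TLD
    "user@example..com",          # consecutive dots: accepted by this pattern
]

results = {s: bool(EMAIL_PATTERN.fullmatch(s)) for s in samples}
for address, ok in results.items():
    print(f"{address!r}: {'valid' if ok else 'invalid'}")
```

If stricter domain rules are needed, the domain part of the pattern would have to be tightened; this sketch only exposes the behaviour.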


### Exercise 2: Validate Phone Numbers
Create a regex pattern to validate phone numbers in the format (123) 456-7890 or 123-456-7890.

In [2]:
import re

# Example input
phone_number = "(123) 456-7890"

# Pattern:
# ^               -> start of string
# (?:             -> non-capturing group with two alternatives for the area code:
#   \(\d{3}\)\s   ->   3 digits in parentheses followed by one whitespace character, or
#   \d{3}-        ->   3 digits followed by a hyphen
# )
# \d{3}           -> 3 digits
# -               -> hyphen
# \d{4}           -> 4 digits
# $               -> end of string
pattern = r"^(?:\(\d{3}\)\s|\d{3}-)\d{3}-\d{4}$"

# Test the regex
if re.fullmatch(pattern, phone_number):
    print("Valid phone number")
else:
    print("Invalid phone number")


Valid phone number
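
To confirm that the pattern accepts both requested formats and nothing in between, it can be run against a short list of candidates. This is a minimal sketch; the sample strings and the `accepted` name are assumptions for illustration.

```python
import re

# Same pattern as in the cell above.
PHONE_PATTERN = re.compile(r"^(?:\(\d{3}\)\s|\d{3}-)\d{3}-\d{4}$")

samples = [
    "(123) 456-7890",  # parenthesized area code followed by a space
    "123-456-7890",    # fully hyphenated
    "(123)456-7890",   # missing space after ')'
    "(123-456-7890",   # unbalanced parenthesis
    "123 456 7890",    # spaces instead of hyphens
]

# Keep only the candidates the pattern accepts.
accepted = [s for s in samples if PHONE_PATTERN.fullmatch(s)]
print(accepted)
```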


### Exercise 3: Extract Dates
Write a regex pattern to extract dates in the format YYYY-MM-DD from a string.

In [3]:
import re

# Example input
text = "The event is scheduled for 2024-12-25."

# Pattern:
# \b         -> word boundary to avoid partial matches
# \d{4}      -> exactly 4 digits (year)
# -          -> hyphen separator
# \d{2}      -> exactly 2 digits (month)
# -          -> hyphen separator
# \d{2}      -> exactly 2 digits (day)
# \b         -> word boundary
pattern = r"\b\d{4}-\d{2}-\d{2}\b"

# Find all matches
dates = re.findall(pattern, text)
print(dates)


['2024-12-25']
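
The pattern checks only the shape of a date, so a string such as 2024-13-40 is extracted as well. One possible follow-up, sketched below, is to pass each match through `datetime.strptime` and keep only real calendar dates. The helper name `extract_real_dates` and the sample sentence are hypothetical.

```python
import re
from datetime import datetime

# Same pattern as in the cell above.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

def extract_real_dates(text):
    """Return the YYYY-MM-DD matches in text that are real calendar dates."""
    real_dates = []
    for candidate in DATE_PATTERN.findall(text):
        try:
            datetime.strptime(candidate, "%Y-%m-%d")
        except ValueError:
            continue  # right shape, but not a real date (for example month 13)
        real_dates.append(candidate)
    return real_dates

sample = "Kickoff 2024-12-25, typo 2024-13-40, leap day 2024-02-29."
print(DATE_PATTERN.findall(sample))  # all three candidates
print(extract_real_dates(sample))    # only the two real dates
```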


### Exercise 4: Match URLs
Create a regex pattern to match URLs that start with http:// or https://.

In [4]:
import re

# Example input
url = "https://www.example.com"

# Pattern:
# ^https?://       -> must start with http:// or https://
# [\w.-]+          -> domain name (letters, digits, underscore, dot, hyphen)
# (?:\.[a-zA-Z]{2,63})+  -> at least one dot followed by TLD (2–63 letters)
# (?:[/?#]\S*)?    -> optional path, query, or fragment
# $                -> end of string
pattern = r"^https?://[\w.-]+(?:\.[a-zA-Z]{2,63})+(?:[/?#]\S*)?$"

# Test the regex
if re.fullmatch(pattern, url):
    print("Valid URL")
else:
    print("Invalid URL")


Valid URL
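
The pattern is deliberately simple, so some legitimate URLs fall outside it. The sketch below, using made-up sample URLs and an illustrative `checks` dictionary, shows that a host without a dot or a URL with a port number is rejected.

```python
import re

# Same pattern as in the cell above (anchors omitted because fullmatch is used).
URL_PATTERN = re.compile(r"https?://[\w.-]+(?:\.[a-zA-Z]{2,63})+(?:[/?#]\S*)?")

samples = [
    "https://www.example.com",
    "http://example.org/path?q=1#top",  # path, query and fragment
    "ftp://example.com",                # wrong scheme
    "https://localhost",                # no dot followed by a TLD
    "https://example.com:8080/",        # port numbers are not covered
]

checks = {url: bool(URL_PATTERN.fullmatch(url)) for url in samples}
for url, ok in checks.items():
    print(f"{url}: {'valid' if ok else 'invalid'}")
```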


### Exercise 5: Find Words Starting with a Specific Letter
Write a regex pattern to find all words starting with the letter 'a' in a given string.

In [5]:
import re

# Example input
text = "A quick brown fox jumps over a lazy dog."

# Pattern:
# \b        -> word boundary (start of a word)
# [aA]      -> matches lowercase 'a' or uppercase 'A'
# \w*       -> zero or more word characters (letters, digits, underscore)
# \b        -> word boundary (end of the word)
pattern = r"\b[aA]\w*\b"

# Find all matches
words = re.findall(pattern, text)
print(words)


['A', 'a']
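
An alternative to listing both cases in a character class is the `re.IGNORECASE` flag. The sketch below compares the two approaches on a slightly longer, made-up sentence; the variable names are illustrative.

```python
import re

text = "A quick brown fox jumps over a lazy dog. Apples and avocados are also allowed."

# Explicit character class, as in the cell above.
explicit = re.findall(r"\b[aA]\w*\b", text)

# Same idea with a flag instead of listing both cases.
flagged = re.findall(r"\ba\w*\b", text, flags=re.IGNORECASE)

print(explicit)
print(flagged)
```

Both calls return the same list here; the flag version scales better when the starting letter is a parameter.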


### Exercise 6: Match Hexadecimal Colors
Create a regex pattern to match hexadecimal color codes (e.g., #FFFFFF).

In [6]:
import re

# Example input
color_code = "#FFFFFF"

# Pattern:
# ^#              -> start with a hash symbol
# (?:[0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})  -> exactly 3 or 6 hexadecimal digits
# $               -> end of string
pattern = r"^#(?:[0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$"

# Test the regex
if re.fullmatch(pattern, color_code):
    print("Valid hex color code")
else:
    print("Invalid hex color code")


Valid hex color code
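
Because the pattern accepts both the 3-digit shorthand and the 6-digit form, a validated code is often normalized afterwards. The following is a minimal sketch, assuming the usual CSS convention that `#abc` stands for `#aabbcc`; the helper name `normalize_hex` is hypothetical.

```python
import re

# Same pattern as in the cell above (anchors omitted because fullmatch is used).
HEX_COLOR = re.compile(r"#(?:[0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})")

def normalize_hex(color):
    """Return color as lowercase #rrggbb, or None if it is not a 3- or 6-digit hex code."""
    if not HEX_COLOR.fullmatch(color):
        return None
    digits = color[1:].lower()
    if len(digits) == 3:
        # Shorthand: double every digit, so 'abc' becomes 'aabbcc'.
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits

for candidate in ["#FFFFFF", "#0aF", "#FFFF", "FFFFFF", "#GGGGGG"]:
    print(candidate, "->", normalize_hex(candidate))
```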


### Exercise 7: Validate Passwords 
Write a regex pattern to validate passwords that must be at least 8 characters long and contain at least one uppercase letter, one lowercase letter, one digit, and one special character.

In [7]:
import re

# Example input
password = "Password123!"

# Pattern explanation:
# ^                 -> start of string
# (?=.*[A-Z])       -> at least one uppercase letter
# (?=.*[a-z])       -> at least one lowercase letter
# (?=.*\d)          -> at least one digit
# (?=.*[^A-Za-z0-9]) -> at least one special character
# .{8,}             -> at least 8 characters total
# $                 -> end of string
pattern = r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[^A-Za-z0-9]).{8,}$"

# Test the regex
if re.fullmatch(pattern, password):
    print("Valid password")
else:
    print("Invalid password")


Valid password
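
The lookahead pattern answers only yes or no. When it is useful to tell the user which requirement failed, the same rules can be checked one at a time. This is a sketch under assumptions: the rule labels, the `RULES` dictionary and the `failed_rules` helper are all hypothetical, and the length check uses `len` rather than a regex.

```python
import re

# One small pattern per character-class rule; the labels are illustrative wording.
RULES = {
    "an uppercase letter": r"[A-Z]",
    "a lowercase letter": r"[a-z]",
    "a digit": r"\d",
    "a special character": r"[^A-Za-z0-9]",
}

def failed_rules(password):
    """Return the labels of all requirements the password does not meet."""
    failures = [label for label, rule in RULES.items() if not re.search(rule, password)]
    if len(password) < 8:
        failures.insert(0, "at least 8 characters")
    return failures

for candidate in ["Password123!", "password", "Ab1!", "Password 123"]:
    print(repr(candidate), "->", failed_rules(candidate) or "OK")
```

The last sample passes because a space counts as a special character under `[^A-Za-z0-9]`, which is also true of the combined pattern in the cell above.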


### Exercise 8: Remove Extra Spaces 
Create a regex pattern that removes extra spaces from a string while keeping single spaces between words.

In [8]:
import re

# Example input
text = "This   is   an   example."

# Pattern:
# \s+   -> one or more whitespace characters (spaces, tabs, etc.)
# Replace them with a single space
pattern = r"\s+"

# Replace extra spaces with a single space and strip leading/trailing spaces
cleaned_text = re.sub(pattern, " ", text).strip()

print(cleaned_text)


This is an example.
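
Since `\s` also matches tabs and newlines, the substitution above flattens multi-line text into one line. If only runs of ordinary spaces should be collapsed, a narrower pattern such as ` {2,}` is one option. The sketch below contrasts the two on a made-up two-line string.

```python
import re

text = "Line one   has   gaps.\nLine two\thas a tab."

# \s+ replaces every run of whitespace, including the newline and the tab.
all_whitespace = re.sub(r"\s+", " ", text).strip()

# ' {2,}' touches only runs of two or more space characters.
spaces_only = re.sub(r" {2,}", " ", text).strip()

print(repr(all_whitespace))
print(repr(spaces_only))
```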


### Exercise 9: Match IP Addresses 
Write a regex pattern to match valid IPv4 addresses.

In [9]:
import re

# Example input
ip_address = "192.168.1.1"

# Pattern:
# - Each octet: 0–255
# - Use alternation for ranges: 25[0-5] (250–255), 2[0-4]\d (200–249), 1\d\d (100–199), [1-9]?\d (0–99)
# - Separate by literal dots
pattern = r"^(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}" \
          r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)$"

# Test the regex
if re.fullmatch(pattern, ip_address):
    print("Valid IP address")
else:
    print("Invalid IP address")


Valid IP address
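
One way to gain confidence in a hand-written IPv4 pattern is to compare it with the standard library's `ipaddress` module on a few inputs. The sketch below does this for made-up samples; the helper name `stdlib_accepts` is hypothetical.

```python
import ipaddress
import re

# Same pattern as in the cell above (anchors omitted because fullmatch is used).
IPV4_PATTERN = re.compile(
    r"(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}"
    r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
)

def stdlib_accepts(candidate):
    """Return True if ipaddress.IPv4Address can parse the candidate."""
    try:
        ipaddress.IPv4Address(candidate)
    except ValueError:
        return False
    return True

samples = ["192.168.1.1", "0.0.0.0", "255.255.255.255",
           "256.1.1.1", "1.2.3", "1.2.3.4.5", "a.b.c.d"]

for s in samples:
    by_regex = bool(IPV4_PATTERN.fullmatch(s))
    print(f"{s}: regex={by_regex}, ipaddress={stdlib_accepts(s)}")
```

In production code the `ipaddress` module is usually the simpler choice; the regex remains a good exercise in alternation and repetition.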


### Exercise 10: Extract Hashtags 
Create a regex pattern to extract hashtags from a string.

In [10]:
import re

# Example input
text = "Here are some hashtags: #Python #Regex #Coding."

# Pattern:
# \B#       -> '#' at a position that is NOT a word boundary, i.e. not directly after a letter, digit, or underscore (so the '#' in 'C#sharp' is skipped)
# [A-Za-z0-9_]+ -> one or more letters, digits, or underscores (valid hashtag characters)
pattern = r"\B#[A-Za-z0-9_]+"

# Find all matches
hashtags = re.findall(pattern, text)
print(hashtags)


['#Python', '#Regex', '#Coding']
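
The effect of the leading `\B` is easiest to see by running the pattern with and without it. In the sketch below (sample sentence and variable names are made up), a `#` glued to the end of a word is skipped only when the guard is present.

```python
import re

text = "I write C#code and F#, but tag posts with #Python, #regex_101 and (#tips)."

# With \B: a '#' directly after a word character (as in 'C#code') is not a hashtag start.
with_guard = re.findall(r"\B#[A-Za-z0-9_]+", text)

# Without \B: any '#' followed by hashtag characters matches.
without_guard = re.findall(r"#[A-Za-z0-9_]+", text)

print(with_guard)
print(without_guard)
```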


## Bonus Exercises



### Bonus Exercise 1: Match All Digits 
Write a regex pattern to match all digits in a given string.

In [11]:
import re

# Example input
text = "There are 2 apples and 3 oranges."

# Pattern:
# \d -> matches any single decimal digit (0–9; for str patterns Python also matches other Unicode digits unless re.ASCII is set)
pattern = r"\d"

# Find all matches
digits = re.findall(pattern, text)
print(digits)


['2', '3']
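
With `\d` alone, a multi-digit number is split into separate digits. Adding the `+` quantifier keeps each number together, which is usually what is wanted before converting to `int`. A small sketch with a made-up sentence:

```python
import re

text = "Order 66 shipped 120 items in 3 boxes."

single_digits = re.findall(r"\d", text)   # one match per digit
whole_numbers = re.findall(r"\d+", text)  # one match per run of digits
as_ints = [int(n) for n in whole_numbers]

print(single_digits)
print(whole_numbers)
print(as_ints)
```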


### Bonus Exercise 2: Validate Credit Card Numbers  
Create a regex pattern to validate credit card numbers (16 digits).

In [12]:
import re

# Example input
credit_card_number = "1234-5678-9876-5432"

# Pattern:
# ^                           -> start of string
# (?:\d{4}[-\s]?){3}          -> first 12 digits in 3 groups of 4, each optionally followed by a hyphen or space
# \d{4}                       -> final group of 4 digits
# $                           -> end of string
pattern = r"^(?:\d{4}[-\s]?){3}\d{4}$"

# Test the regex
if re.fullmatch(pattern, credit_card_number):
    print("Valid credit card number")
else:
    print("Invalid credit card number")


Valid credit card number
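
The pattern above lets each separator be chosen independently, so a number mixing hyphens, spaces and no separator still passes. If a consistent separator is wanted, one possible variant captures the first separator and reuses it with a backreference (`\1`). The names `LOOSE` and `STRICT` and the sample numbers are assumptions for illustration.

```python
import re

# Pattern from the cell above: every separator is optional and independent.
LOOSE = re.compile(r"(?:\d{4}[-\s]?){3}\d{4}")

# Variant: capture the first separator (possibly empty) and require the same one twice more.
STRICT = re.compile(r"\d{4}([-\s]?)\d{4}\1\d{4}\1\d{4}")

samples = [
    "1234-5678-9876-5432",  # hyphens throughout
    "1234 5678 9876 5432",  # spaces throughout
    "1234567898765432",     # no separators
    "1234-5678 98765432",   # mixed separators
]

for s in samples:
    print(f"{s!r}: loose={bool(LOOSE.fullmatch(s))}, strict={bool(STRICT.fullmatch(s))}")
```

Neither pattern checks whether the digits form a real card number; both validate the format only.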


### Bonus Exercise 3: Match Non-Alphanumeric Characters  
Write a regex pattern to match non-alphanumeric characters in a string.

In [13]:
import re

# Example input
text = "Hello! How are you? @Python3."

# Pattern:
# [^A-Za-z0-9] -> matches any character that is NOT a letter or digit
pattern = r"[^A-Za-z0-9]"

# Find all matches
non_alphanumeric_chars = re.findall(pattern, text)
print(non_alphanumeric_chars)


['!', ' ', ' ', ' ', '?', ' ', '@', '.']
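
What counts as "non-alphanumeric" depends on the character class chosen. The sketch below compares three candidates on a made-up string: the explicit ASCII class from the cell above, the shorthand `\W`, and `[^\w\s]`, which also leaves out whitespace. The variable names are illustrative.

```python
import re

# '\u00e9' is the letter e with an acute accent, written as an escape to keep the source ASCII.
text = "snake_case, caf\u00e9 & co!"

# ASCII letters and digits only: underscore and accented letters count as matches.
not_ascii_alnum = re.findall(r"[^A-Za-z0-9]", text)

# \W excludes every "word" character, which includes '_' and Unicode letters.
non_word = re.findall(r"\W", text)

# Excluding whitespace as well leaves only punctuation and symbols.
punctuation_only = re.findall(r"[^\w\s]", text)

print(not_ascii_alnum)
print(non_word)
print(punctuation_only)
```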


### Bonus Exercise 4: Validate Date Format  
Create a regex pattern to validate dates in the format DD/MM/YYYY.

In [14]:
import re

# Example input
date_string = "25/12/2024"

# Pattern:
# ^                         -> start of string
# (0[1-9]|[12]\d|3[01])     -> day: 01–09, 10–29, 30–31
# /                         -> separator
# (0[1-9]|1[0-2])           -> month: 01–09, 10–12
# /                         -> separator
# \d{4}                     -> year: exactly 4 digits
# $                         -> end of string
pattern = r"^(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/\d{4}$"

# Test the regex
if re.fullmatch(pattern, date_string):
    print("Valid date format")
else:
    print("Invalid date format")


Valid date format
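
The pattern validates the format but still accepts impossible dates such as 31/02/2024. A possible extension, sketched below, names the three groups and lets `datetime.date` perform the calendar check. The helper name `parse_date` and the group names are hypothetical.

```python
import re
from datetime import date

# Same pattern as in the cell above, with named groups and a captured year.
DATE_PATTERN = re.compile(
    r"(?P<day>0[1-9]|[12]\d|3[01])/(?P<month>0[1-9]|1[0-2])/(?P<year>\d{4})"
)

def parse_date(date_string):
    """Return a date for a valid DD/MM/YYYY string, or None otherwise."""
    match = DATE_PATTERN.fullmatch(date_string)
    if match is None:
        return None  # wrong format
    try:
        return date(int(match["year"]), int(match["month"]), int(match["day"]))
    except ValueError:
        return None  # right format, but not a real calendar date

for candidate in ["25/12/2024", "29/02/2024", "31/02/2024", "1/1/2024"]:
    print(candidate, "->", parse_date(candidate))
```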


### Bonus Exercise 5: Extract Email Domains  
Write a regex pattern to extract domains from email addresses.

In [15]:
import re

# Example input
email_list = ["user@example.com", "admin@domain.org"]

# Pattern:
# @        -> literal '@'
# ([A-Za-z0-9.-]+\.[A-Za-z]{2,63}) -> capture domain name + TLD
pattern = r"@([A-Za-z0-9.-]+\.[A-Za-z]{2,63})"

for email in email_list:
    match = re.search(pattern, email)
    if match:
        print(match.group(1))  # Extracted domain part


example.com
domain.org
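
The same pattern also works on free text: when a pattern has exactly one capturing group, `re.findall` returns the captured part of each match. The sketch below uses a made-up sentence and additionally lowercases and deduplicates the result; the variable names are illustrative.

```python
import re

# Same pattern as in the cell above.
DOMAIN_PATTERN = re.compile(r"@([A-Za-z0-9.-]+\.[A-Za-z]{2,63})")

text = "Contact user@example.com, admin@Domain.org or sales@example.com."

# With one capturing group, findall returns the group contents, not the whole match.
domains = DOMAIN_PATTERN.findall(text)

# Domain names are case-insensitive, so normalize before removing duplicates.
unique_domains = sorted({d.lower() for d in domains})

print(domains)
print(unique_domains)
```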


### Exercise Completion  
Once you have completed all exercises:
- Review your solutions.
- Ensure your regular expressions and Python code are well-documented with comments explaining your logic.
- Save your notebook for submission or further review.

Happy coding! Enjoy practicing Regular Expressions in Python!