## For testing purposes only

In [2]:
print("Hello world")

Hello world


## Adding my name

In [3]:
print('prajwal Chaudhary')

prajwal Chaudhary


# Regular Expressions (RegEx) in Python: Beginner to Advanced

This notebook is a concise, modular guide to Regular Expressions (RegEx) in Python. It moves from basic pattern matching to advanced techniques such as lookahead and lookbehind, with practical examples and exercises along the way.

---

**Outline:**
1. Import the re Module
2. Basic Pattern Matching
3. Metacharacters and Special Sequences
4. Character Classes and Quantifiers
5. Grouping and Capturing
6. Search, Match, and Findall Functions
7. Substitution and Splitting
8. Flags and Their Usage
9. Advanced Patterns: Lookahead and Lookbehind
10. Practical Examples and Exercises


## 1. Import the re Module

Python's built-in `re` module provides support for regular expressions. Import it before running any of the other cells in this notebook.

In [None]:
import re
# The re module is now available for all regex operations.

## 2. Basic Pattern Matching

Let's start with simple pattern matching using `re.match()` and `re.search()`.

In [None]:
# re.match() checks for a match only at the beginning of the string
result = re.match(r'cat', 'catapult')
print('re.match:', result.group() if result else 'No match')

# re.search() checks for a match anywhere in the string
result = re.search(r'cat', 'concatenate')
print('re.search:', result.group() if result else 'No match')
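
Both functions return a match object on success and `None` on failure, which is why the cell above guards each call with `if result else`. The following is a minimal sketch of that difference on a single input; the variable names `at_start` and `anywhere` are illustrative only.

```python
import re

# re.match() only tries the pattern at position 0, so it fails here.
at_start = re.match(r'cat', 'concatenate')
print(at_start)  # None

# re.search() scans forward and finds the same pattern later in the string.
anywhere = re.search(r'cat', 'concatenate')
print(anywhere.group())  # cat
print(anywhere.span())   # (3, 6), the start and end indexes of the match
```

Calling `.group()` directly on a failed match raises `AttributeError`, so checking for `None` first is usually the safer habit.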

## 3. Metacharacters and Special Sequences

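Metacharacters such as `.`, `^`, and `$` have a special meaning inside a pattern instead of matching themselves. Special sequences such as `\d`, `\w`, and `\s` match whole categories of characters (digits, word characters, and whitespace).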

In [None]:
# Metacharacters: . ^ $
print(re.search(r'.', 'abc').group())  # Matches any character except a newline
print(re.search(r'^a', 'abc').group()) # Matches 'a' at start
print(re.search(r'c$', 'abc').group()) # Matches 'c' at end

# Special sequences: \d, \w, \s
print(re.search(r'\d', 'abc123').group()) # Matches a digit
print(re.search(r'\w', '***a').group())  # Matches a word character
print(re.search(r'\s', 'a b').group())   # Matches a whitespace
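
Because metacharacters are special, matching one literally requires escaping it with a backslash. The sketch below shows the difference for `.`, and uses `re.escape()` to escape an arbitrary string; the sample strings and variable names are made up for illustration.

```python
import re

# An unescaped dot matches any character, so '3x14' is accepted.
loose = re.search(r'3.14', 'value 3x14')
print(loose.group())  # 3x14

# Escaping the dot makes it match only a literal '.'.
strict = re.search(r'3\.14', 'value 3x14')
print(strict)  # None

# re.escape() escapes the metacharacters in a string for literal matching.
literal = re.escape('a.b*c')
found = re.search(literal, 'xx a.b*c yy')
print(found.group())  # a.b*c
```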

## 4. Character Classes and Quantifiers

Character classes match any one character from a specified set. Quantifiers specify how many times the preceding element may repeat.

In [None]:
# Character classes
print(re.search(r'[aeiou]', 'python').group())      # Matches the first lowercase vowel
print(re.search(r'[^aeiou]', 'python').group())     # Matches the first character that is not a lowercase vowel

# Quantifiers
print(re.search(r'\d+', 'abc12345').group())       # Matches one or more digits
print(re.search(r'\d{2,4}', 'abc12345').group())   # Matches 2 to 4 digits (greedy, so it takes 4)
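
Quantifiers are greedy by default: they consume as much text as they can. Appending `?` to a quantifier makes it lazy. The sketch below contrasts the two on an illustrative HTML-like string and also shows the optional quantifier `?`; the inputs are assumptions chosen for the example.

```python
import re

html = '<b>bold</b> and <i>italic</i>'

# Greedy: .+ runs to the last '>' in the string.
greedy = re.findall(r'<.+>', html)
print(greedy)  # ['<b>bold</b> and <i>italic</i>']

# Lazy: .+? stops at the first '>' it can.
lazy = re.findall(r'<.+?>', html)
print(lazy)  # ['<b>', '</b>', '<i>', '</i>']

# '?' on its own makes the preceding element optional.
spellings = re.findall(r'colou?r', 'color colour')
print(spellings)  # ['color', 'colour']
```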

## 5. Grouping and Capturing

Parentheses `()` are used to group patterns and capture matched substrings for later use.

In [None]:
# Grouping and capturing
match = re.search(r'(\d+)-(\w+)', '12345-abc')
if match:
    print('Full match:', match.group(0))
    print('First group:', match.group(1))
    print('Second group:', match.group(2))
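
Groups can also be named with `(?P<name>...)`, and `(?:...)` groups without capturing. A short sketch of both follows; the group names `year` and `month` and the sample strings are illustrative.

```python
import re

# Named groups: refer to captures by name instead of by position.
date = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})', 'Released 2024-05')
print(date.group('year'))  # 2024
print(date.groupdict())    # {'year': '2024', 'month': '05'}

# Non-capturing group: (?:...) groups the alternation without creating a capture.
title = re.search(r'(?:Mr|Ms)\. (\w+)', 'Hello Ms. Lovelace')
print(title.groups())  # ('Lovelace',)
```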

## 6. Search, Match, and Findall Functions

Let's compare `re.search()`, `re.match()`, and `re.findall()`.

In [None]:
text = 'cat dog catfish dogcat'
print('re.match:', re.match(r'cat', text))           # Only matches at start
print('re.search:', re.search(r'dog', text))         # Finds first occurrence
print('re.findall:', re.findall(r'cat', text))       # Finds all occurrences
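
`re.findall()` returns plain strings, or tuples when the pattern contains more than one group. When positions are needed as well, `re.finditer()` yields match objects. A minimal sketch, reusing the same sample text; the `key=value` string is an invented example.

```python
import re

text = 'cat dog catfish dogcat'

# finditer() yields one match object per occurrence, so positions are available.
positions = [(m.group(), m.start()) for m in re.finditer(r'cat', text)]
print(positions)  # [('cat', 0), ('cat', 8), ('cat', 19)]

# With two or more groups, findall() returns a list of tuples.
pairs = re.findall(r'(\w+)=(\d+)', 'a=1 b=22')
print(pairs)  # [('a', '1'), ('b', '22')]
```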

## 7. Substitution and Splitting

Use `re.sub()` to replace patterns and `re.split()` to split strings based on patterns.

In [None]:
# Substitution
text = 'apple banana apple orange'
new_text = re.sub(r'apple', 'fruit', text)
print('Substitution:', new_text)

# Splitting
sentence = 'one,two;three four'
parts = re.split(r'[;, ]', sentence)
print('Splitting:', parts)
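
`re.sub()` also accepts backreferences, a replacement function, and a `count` limit, and `re.split()` produces empty strings when delimiters are adjacent. The sketch below illustrates each of these with made-up inputs.

```python
import re

# Backreferences (\1, \2) reuse captured groups in the replacement.
swapped = re.sub(r'(\w+), (\w+)', r'\2 \1', 'Lovelace, Ada')
print(swapped)  # Ada Lovelace

# A function receives each match object and returns the replacement text.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), '3 apples, 10 pears')
print(doubled)  # 6 apples, 20 pears

# count limits how many replacements are made.
first_only = re.sub(r'apple', 'fruit', 'apple banana apple', count=1)
print(first_only)  # fruit banana apple

# Adjacent delimiters yield empty strings unless the class is quantified with +.
with_empty = re.split(r'[;, ]', 'one, two')
without_empty = re.split(r'[;, ]+', 'one, two')
print(with_empty)     # ['one', '', 'two']
print(without_empty)  # ['one', 'two']
```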

## 8. Flags and Their Usage

Flags modify regex behavior. Common flags include `re.IGNORECASE`, `re.MULTILINE`, and `re.DOTALL`.

In [None]:
# IGNORECASE: Case-insensitive matching
print(re.search(r'apple', 'APPLE', re.IGNORECASE).group())

# MULTILINE: ^ and $ match at the start/end of each line
text = 'first\nsecond'
print(re.findall(r'^\w+', text, re.MULTILINE))

# DOTALL: . matches newline as well
text = 'abc\ndef'
print(re.search(r'a.*f', text, re.DOTALL).group())
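
Flags can be combined with `|`, set inline with syntax such as `(?i)`, or passed to `re.compile()`. `re.VERBOSE` additionally allows whitespace and comments inside a pattern. The following sketch shows these options together; the sample text and the name `phone_pattern` are illustrative.

```python
import re

text = 'Apple\nbanana\nAPPLE pie'

# Combine flags with the bitwise OR operator.
starts = re.findall(r'^apple', text, re.IGNORECASE | re.MULTILINE)
print(starts)  # ['Apple', 'APPLE']

# The inline flag (?i) is equivalent to passing re.IGNORECASE.
inline = re.search(r'(?i)apple', 'APPLE')
print(inline.group())  # APPLE

# re.VERBOSE ignores whitespace in the pattern and allows # comments.
phone_pattern = re.compile(r'''
    \d{3}   # area code
    -
    \d{3}   # prefix
    -
    \d{4}   # line number
''', re.VERBOSE)
print(bool(phone_pattern.fullmatch('123-456-7890')))  # True
```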

## 9. Advanced Patterns: Lookahead and Lookbehind

Lookahead and lookbehind assertions allow you to match patterns based on what comes before or after them, without including those parts in the match.

In [None]:
# Lookahead: Match 'foo' only if followed by 'bar'
print(re.search(r'foo(?=bar)', 'foobar').group())

# Lookbehind: Match 'bar' only if preceded by 'foo'
print(re.search(r'(?<=foo)bar', 'foobar').group())
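
Each assertion also has a negative form: `(?!...)` succeeds when the text ahead does not match, and `(?<!...)` when the text behind does not match. In the `re` module, lookbehind patterns must have a fixed width. A minimal sketch with invented sample strings:

```python
import re

# Negative lookahead: words starting with 'foo' that do not continue with 'bar'.
not_bar = re.findall(r'foo(?!bar)\w*', 'foobar foobaz food')
print(not_bar)  # ['foobaz', 'food']

line = 'cost $40, qty 12'

# Positive lookbehind: digits directly after a dollar sign.
prices = re.findall(r'(?<=\$)\d+', line)
print(prices)  # ['40']

# Negative lookbehind: whole numbers not directly after a dollar sign.
quantities = re.findall(r'(?<!\$)\b\d+', line)
print(quantities)  # ['12']
```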

## 10. Practical Examples and Exercises

Let's apply what you've learned with some real-world examples and short exercises.

In [None]:
# Example 1: Extract email-like strings from text (a simplified pattern, not a full validator)
text = 'Contact: alice@example.com, bob@work.org'
emails = re.findall(r'[\w.-]+@[\w.-]+', text)
print('Emails:', emails)

# Example 2: Validate a phone number (US format NNN-NNN-NNNN); fullmatch requires the whole string to match
phone = '123-456-7890'
if re.fullmatch(r'\d{3}-\d{3}-\d{4}', phone):
    print('Valid phone number')
else:
    print('Invalid phone number')

# Exercise (with solution): Find all words starting with 'a' in a sentence
sentence = 'An apple a day keeps anxiety away.'
words = re.findall(r'\ba\w*', sentence, re.IGNORECASE)
print('Words starting with a:', words)
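
The phone example uses `re.fullmatch()` because `re.match()` would accept trailing characters after a valid prefix. The sketch below wraps the same pattern in a small helper to make that difference visible; the helper name `is_valid_phone` and the test inputs are hypothetical, not part of the original notebook.

```python
import re

# Precompile the pattern once so it can be reused across calls.
PHONE = re.compile(r'\d{3}-\d{3}-\d{4}')

def is_valid_phone(candidate):
    """Return True only if the entire string has the form NNN-NNN-NNNN."""
    return PHONE.fullmatch(candidate) is not None

print(is_valid_phone('123-456-7890'))   # True
print(is_valid_phone('123-456-78901'))  # False, one digit too many
print(is_valid_phone('1234567890'))     # False, hyphens are missing

# match() only anchors at the start, so the extra digit goes unnoticed.
prefix_only = PHONE.match('123-456-78901')
print(prefix_only.group())  # 123-456-7890
```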