# Python Regular Expressions (Regex) Course
Welcome! This notebook will teach you how to use regular expressions (regex) in Python using the `re` library. Each section includes code snippets and explanations to help you master regex from scratch.

## What is Regex?
Regular expressions are patterns used to match, search, and manipulate text. They are powerful tools for text processing, data validation, and extraction.

## What will you learn?
- How to import and use the `re` library
- Basic regex patterns and syntax
- Common regex functions: `search`, `match`, `findall`, `finditer`, `sub`, `split`, `compile`
- Special characters and quantifiers
- Grouping and capturing
- Practical examples and tips

## 1. Importing the re Library
The `re` module in Python provides support for working with regular expressions. You need to import it before using regex functions.

- `import re` imports the regex library.

In [None]:
import re
print('re library imported!')

re library imported!


## 2. Basic Regex Patterns and Syntax
A regex pattern is a string that defines a search pattern. Here are some basic elements:
- `.` : Any character except newline
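- `^` : Start of string (also the start of each line with `re.MULTILINE`)
- `$` : End of string, or just before a trailing newline
- `*` : 0 or more repetitions of the preceding element
- `+` : 1 or more repetitions of the preceding element
- `?` : 0 or 1 repetition of the preceding element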
- `[]` : Set of characters
- `|` : Either/or
- `()` : Grouping
- `\d` : Digit
- `\w` : Word character (alphanumeric or underscore)
- `\s` : Whitespace
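
The elements above can be tried out directly with `re.findall`. This is a minimal sketch; the sample strings and variable names are invented for illustration.

```python
import re

# '.' matches any character except a newline, so 'c\nt' is skipped.
dot_matches = re.findall(r'c.t', 'cat cot c\nt')

# '\d+' matches one or more digits.
digits = re.findall(r'\d+', 'Room 101, floor 3')

# '?' makes the preceding 'u' optional.
optional = re.findall(r'colou?r', 'color colour')

# '[]' matches any single character from the set.
vowels = re.findall(r'[aeiou]', 'regex')

# '|' matches either alternative.
either = re.findall(r'cat|dog', 'cat, dog, bird')

print(dot_matches)  # ['cat', 'cot']
print(digits)       # ['101', '3']
print(optional)     # ['color', 'colour']
print(vowels)       # ['e', 'e']
print(either)       # ['cat', 'dog']
```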

## 3. re.search()
`re.search(pattern, string)` scans through a string and stops at the first location where the pattern matches. It returns a match object if a match is found, otherwise `None`.

- Use when you want to find a pattern anywhere in the string.

In [2]:
text = 'My phone number is 123-456-7890.'
match = re.search(r'\d{3}-\d{3}-\d{4}', text)
if match:
    print('Phone number found:', match.group())
else:
    print('No phone number found.')

Phone number found: 123-456-7890
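
Two details of `re.search` are worth seeing in code: it returns only the first match, and it returns `None` when nothing matches. The sketch below uses a made-up sample text with two phone numbers.

```python
import re

# The sample text is made up for illustration; it holds two phone numbers.
text = 'Call 123-456-7890 or 987-654-3210.'
phone = r'\d{3}-\d{3}-\d{4}'

match = re.search(phone, text)
first = match.group()   # only the first match is returned
span = match.span()     # (start, end) indexes of that match

# With no match, re.search returns None, so test the result before using it.
missing = re.search(phone, 'no digits here')

print(first, span)  # 123-456-7890 (5, 17)
print(missing)      # None
```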


## 4. re.match()
`re.match(pattern, string)` checks for a match only at the beginning of the string. Returns a match object if found, else None.

- Use when you want to check if the string starts with a pattern.

In [3]:
text = 'Hello world!'
match = re.match(r'Hello', text)
if match:
    print('String starts with Hello')
else:
    print('String does not start with Hello')

String starts with Hello
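
The difference between `re.match` and `re.search` shows up when the pattern does not sit at position 0. The sketch below also uses `re.fullmatch`, a standard `re` function not covered in this notebook, which requires the whole string to match. The sample strings are invented for illustration.

```python
import re

text = 'Say Hello world!'

# re.match only tries position 0, so it fails here.
at_start = re.match(r'Hello', text)

# re.search scans forward and finds the first occurrence.
anywhere = re.search(r'Hello', text)

# re.fullmatch succeeds only if the entire string matches the pattern.
whole = re.fullmatch(r'\d{4}', '2024')
partial = re.fullmatch(r'\d{4}', '2024-01')

print(at_start)          # None
print(anywhere.start())  # 4
print(whole is not None, partial is not None)  # True False
```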


## 5. re.findall()
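`re.findall(pattern, string)` returns all non-overlapping matches of the pattern in the string as a list of strings. If the pattern contains capturing groups, the list holds the captured groups instead of the whole match.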

- Use when you want to find all occurrences of a pattern.

In [None]:
# This code searches for email addresses in the given text using a regular expression.

text = 'My emails are test1@example.com and test2@domain.com.'

# re.findall() returns all non-overlapping matches of the pattern in the string as a list.
# The pattern r'[\w.-]+@[\w.-]+' matches:
# - [\w.-]+ : one or more word characters (letters, digits, underscore), dots, or hyphens (the username part)
# - @      : the at symbol
# - [\w.-]+ : one or more word characters, dots, or hyphens (the domain part)
emails = re.findall(r'[\w.-]+@[\w.-]+', text)

print('Found emails:', emails)
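# Output: Found emails: ['test1@example.com', 'test2@domain.com.']
# Note: the final period of the sentence is captured too, because `.` is allowed in the domain part.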

Found emails: ['test1@example.com', 'test2@domain.com.']
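
The trailing period in the second result appears because `.` belongs to the domain character class. One possible fix, shown here as a sketch rather than a complete email validator, is to require the match to end on a word character. The second pattern is an illustrative variant that adds capturing groups, which makes `findall` return tuples.

```python
import re

text = 'My emails are test1@example.com and test2@domain.com.'

# Ending the pattern with \w stops the match before the sentence's final period.
emails = re.findall(r'[\w.-]+@[\w.-]*\w', text)

# With two capturing groups, findall returns (username, domain) tuples.
parts = re.findall(r'([\w.-]+)@([\w.-]*\w)', text)

print(emails)  # ['test1@example.com', 'test2@domain.com']
print(parts)   # [('test1', 'example.com'), ('test2', 'domain.com')]
```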


## 6. re.finditer()
`re.finditer(pattern, string)` returns an iterator yielding match objects for all non-overlapping matches of the pattern in the string.

- Use when you want to iterate over matches and get their positions.

In [5]:
text = 'The numbers are 12, 34, and 56.'
for match in re.finditer(r'\d+', text):
    print('Found number:', match.group(), 'at position', match.start())

Found number: 12 at position 16
Found number: 34 at position 20
Found number: 56 at position 28
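
Because `finditer` yields match objects, each match also exposes its end position through `end()` (or both positions through `span()`). A small sketch that collects value, start, and end for each number in the same text:

```python
import re

text = 'The numbers are 12, 34, and 56.'

# Build (value, start, end) tuples; end is the index just past the match.
positions = [(int(m.group()), m.start(), m.end())
             for m in re.finditer(r'\d+', text)]

print(positions)  # [(12, 16, 18), (34, 20, 22), (56, 28, 30)]
```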


## 7. re.sub()
`re.sub(pattern, repl, string)` replaces every non-overlapping occurrence of the pattern in the string with `repl` and returns the new string.
- Use for search and replace operations.

In [6]:
text = 'I have 2 apples and 3 bananas.'
result = re.sub(r'\d+', 'many', text)
print('After substitution:', result)

After substitution: I have many apples and many bananas.
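
`re.sub` accepts more than a fixed replacement string. The sketch below shows three options on the same text: limiting the number of replacements with `count`, computing the replacement with a function, and reusing captured groups through backreferences such as `\1`. The variable names are chosen for illustration.

```python
import re

text = 'I have 2 apples and 3 bananas.'

# count=1 replaces only the first occurrence.
first_only = re.sub(r'\d+', 'many', text, count=1)

# A function receives each match object and returns the replacement text.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), text)

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r'(\d+) (\w+)', r'\2 x\1', text)

print(first_only)  # I have many apples and 3 bananas.
print(doubled)     # I have 4 apples and 6 bananas.
print(swapped)     # I have apples x2 and bananas x3.
```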


## 8. re.split()
`re.split(pattern, string)` splits the string by the occurrences of the pattern. Returns a list of substrings.
- Use to split text by regex patterns (e.g., multiple delimiters).

In [7]:
text = 'apple, orange; banana|grape'
fruits = re.split(r'[;,| ]+', text)
print('Fruits:', fruits)

Fruits: ['apple', 'orange', 'banana', 'grape']
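
Two further behaviours of `re.split` are easy to miss: `maxsplit` caps the number of splits, and a capturing group in the pattern keeps the delimiters in the result. A minimal sketch on the same text:

```python
import re

text = 'apple, orange; banana|grape'

# maxsplit=1 splits once and leaves the rest of the string untouched.
first_split = re.split(r'[;,| ]+', text, maxsplit=1)

# A capturing group around the delimiter keeps it in the output list.
with_delims = re.split(r'([;,|])\s*', text)

print(first_split)  # ['apple', 'orange; banana|grape']
print(with_delims)  # ['apple', ',', 'orange', ';', 'banana', '|', 'grape']
```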


## 9. re.compile()
`re.compile(pattern)` compiles a regex pattern into a regex object for repeated use. Because the `re` module already caches recently used patterns, the speed gain is usually modest; the main benefit is giving the pattern a name and reusing it cleanly.
- Use the compiled object’s methods: `search`, `match`, `findall`, etc.

In [8]:
pattern = re.compile(r'\d+')
text = 'Order numbers: 123, 456, 789.'
numbers = pattern.findall(text)
print('Numbers found:', numbers)

Numbers found: ['123', '456', '789']
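
A compiled pattern is most useful when the same regex is applied to many inputs. The sketch below reuses one pattern object across several lines; the sample lines and the name `order_id` are made up for illustration.

```python
import re

# Compile once, then reuse the same object for every line.
order_id = re.compile(r'\d+')

lines = [
    'Order numbers: 123, 456',
    'No orders today',
    'Order 789 shipped',
]

found = [order_id.findall(line) for line in lines]

print(found)             # [['123', '456'], [], ['789']]
print(order_id.pattern)  # the original pattern string: \d+
```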


## 10. Groups and Capturing
Parentheses `()` group part of a pattern and capture the text that part matches. You can then extract specific parts of a match from the match object.
- Use `group(1)`, `group(2)`, etc., to access captured groups; `group(0)` (or `group()`) is the whole match.

In [9]:
text = 'Name: John, Age: 30'
match = re.search(r'Name: (\w+), Age: (\d+)', text)
if match:
    print('Name:', match.group(1))
    print('Age:', match.group(2))

Name: John
Age: 30
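
Besides `group(n)`, a match object offers `groups()`, which returns all captured groups as a tuple and works well with unpacking. A short sketch on the same text:

```python
import re

text = 'Name: John, Age: 30'
match = re.search(r'Name: (\w+), Age: (\d+)', text)

if match:
    whole = match.group(0)      # the entire matched text
    name, age = match.groups()  # all captured groups as a tuple
    print(whole)                # Name: John, Age: 30
    print(name, int(age) + 1)   # John 31
```

Captured values are always strings, so the age is converted with `int` before doing arithmetic.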


## 11. Regex Flags
Flags modify the behavior of regex functions. Common flags:
- `re.IGNORECASE` or `re.I`: Case-insensitive matching
- `re.MULTILINE` or `re.M`: `^` and `$` match the start/end of each line
- `re.DOTALL` or `re.S`: `.` matches any character, including newline

In [10]:
text = 'Python\nPYTHON\npython'
matches = re.findall(r'python', text, re.IGNORECASE)
print('Case-insensitive matches:', matches)

Case-insensitive matches: ['Python', 'PYTHON', 'python']
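
The other two flags can be sketched on the same text. Flags are combined with `|`. The patterns below are chosen for illustration only.

```python
import re

text = 'Python\nPYTHON\npython'

# Without re.MULTILINE, ^ matches only at the very start of the string.
single = re.findall(r'^p\w+', text, re.IGNORECASE)

# With re.MULTILINE, ^ also matches at the start of each line.
per_line = re.findall(r'^p\w+', text, re.IGNORECASE | re.MULTILINE)

# By default '.' does not match a newline; re.DOTALL lets it.
dot_default = re.search(r'Python.PYTHON', text)
dot_all = re.search(r'Python.PYTHON', text, re.DOTALL)

print(single)       # ['Python']
print(per_line)     # ['Python', 'PYTHON', 'python']
print(dot_default)  # None
print(repr(dot_all.group()))  # 'Python\nPYTHON'
```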


# Summary and Next Steps
You have learned the basics of Python's `re` library, including how to search, match, find, split, substitute, and use groups and flags in regular expressions.

## Next Steps
- Practice writing your own regex patterns for different text processing tasks.
- Explore advanced topics like lookahead/lookbehind, non-capturing groups, and named groups.
- Use online tools like [regex101.com](https://regex101.com/) to test and debug your patterns.
- Read the [Python regex documentation](https://docs.python.org/3/library/re.html) for more details and examples.

Happy regex-ing!