Understanding the re library

The re library in Python provides support for working with regular expressions (regex). A regular expression is a compact description of a text pattern; with it you can search, extract, and modify text that fits the pattern.

re.match()
Checks for a match only at the beginning of the string and returns a Match object, or None if the string does not start with the pattern.

In [1]:
import re

pattern = r"hello"
text = "hello world"
match = re.match(pattern, text)

if match:
    print("Match found!")
else:
    print("No match.")


Match found!


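re.search()
Scans the whole string and returns a Match object for the first place the pattern matches, or None if it matches nowhere.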

In [2]:
pattern = r"world"  # the r prefix makes this a raw string, so backslashes are kept literally
text = "hello world"
match = re.search(pattern, text)

if match:
    print("Match found:", match.group())
else:
    print("No match.")


Match found: world
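
The two cells above use different functions on the same text. A minimal sketch of how they differ when the pattern is not at the start of the string (the variable names at_start and anywhere are illustrative):

```python
import re

text = "hello world"

# re.match() only tries position 0, where the text is "hello".
at_start = re.match(r"world", text)

# re.search() scans forward and finds "world" later in the string.
anywhere = re.search(r"world", text)

print(at_start)          # None
print(anywhere.group())  # world
```

If a pattern may appear anywhere in the text, re.search() is usually the safer choice.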


re.findall()
Returns all non-overlapping matches of a pattern as a list of strings.

In [3]:
pattern = r"\d+"   # Matches one or more digits
text = "My number is 123 and your number is 456."
matches = re.findall(pattern, text)

print(matches)


['123', '456']
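
When the pattern contains capture groups, re.findall() returns the group contents instead of the whole match. A small sketch with made-up sample text:

```python
import re

text = "alice=30, bob=25"

# With one group, findall returns that group's text for each match.
names = re.findall(r"(\w+)=\d+", text)

# With several groups, each match becomes a tuple of the groups.
pairs = re.findall(r"(\w+)=(\d+)", text)

print(names)  # ['alice', 'bob']
print(pairs)  # [('alice', '30'), ('bob', '25')]
```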


re.sub()
Replaces occurrences of a pattern with a specified replacement string.

In [4]:
pattern = r"apples"
replacement = "oranges"
text = "I like apples."
result = re.sub(pattern, replacement, text)

print(result)


I like oranges.
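
The replacement string can also refer back to capture groups, and the count argument limits how many matches are replaced. A short sketch, using an invented date string as input:

```python
import re

text = "2024-01-15 and 2023-12-31"

# \1, \2, \3 refer to the captured year, month, and day.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)

# count=1 limits the replacement to the first match only.
first_only = re.sub(r"\d{4}", "YYYY", text, count=1)

print(swapped)     # 15/01/2024 and 31/12/2023
print(first_only)  # YYYY-01-15 and 2023-12-31
```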


re.split()
Splits a string into a list using the pattern as the delimiter.

In [5]:
pattern = r"\s+"  # Split on whitespace
text = "This is a test"
result = re.split(pattern, text)

print(result)


['This', 'is', 'a', 'test']
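
A pattern can cover several delimiters at once, and the maxsplit argument caps the number of splits. A brief sketch with illustrative input:

```python
import re

text = "apples, oranges;pears  plums"

# Split on any run of commas, semicolons, or whitespace.
parts = re.split(r"[,;\s]+", text)

# maxsplit=1 stops after the first split and keeps the rest intact.
head_tail = re.split(r"[,;\s]+", text, maxsplit=1)

print(parts)      # ['apples', 'oranges', 'pears', 'plums']
print(head_tail)  # ['apples', 'oranges;pears  plums']
```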


🔎 Match Object Methods

When a match is found (using match() or search()), you get a Match object with useful methods. The examples below assume match = re.match(r"hello", "hello world"):

Method	Description	Example

group()	Returns the matched text	match.group() → 'hello'

start()	Returns the starting index of the match	match.start() → 0

end()	Returns the ending index of the match	match.end() → 5

span()	Returns a tuple (start, end) of the match	match.span() → (0, 5)
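
The methods in the table can be tried on a match that does not start at index 0. A minimal sketch, using a sample string chosen for illustration:

```python
import re

text = "Order 42 shipped"
match = re.search(r"\d+", text)

if match:
    print(match.group())  # 42
    print(match.start())  # 6
    print(match.end())    # 8
    print(match.span())   # (6, 8)

    # start() and end() are slice indices into the original text.
    print(text[match.start():match.end()])  # 42
```

Note that end() is exclusive, which is why slicing with start() and end() gives back exactly the matched text.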


🎯 Common Regex Patterns

Pattern	Description	Example

.	Any character except a newline	"h.t" matches hat, hit

\d	Digit (0-9)	"Order \d" matches Order 5

\D	Non-digit character	"Item \D" matches Item A

\s	Whitespace (space, tab, newline)	"a\sb" matches a b

\S	Non-whitespace character	"a\Sb" matches a5b

\w	Word character (letter, digit, or underscore)	"a\w+" matches apple

\W	Non-word character (anything \w does not match)	"a\Wb" matches a!b

^	Start of string	"^hello" matches at the start of hello world

$	End of string	"world$" matches at the end of hello world

*	0 or more occurrences	"he*o" matches ho, heo, heeeo

+	1 or more occurrences	"he+o" matches heo, heeeo

?	0 or 1 occurrence	"he?o" matches heo, ho

{m,n}	Between m and n occurrences	"he{2,4}o" matches heeo, heeeeo

|	OR operator	"cat|dog" matches cat or dog

()	Capture group	"(ab)+" matches ababab

\b	Word boundary	"\bcat\b" matches cat but not scattered
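
A few of the patterns in the table can be checked directly. The sketch below uses made-up test strings, plus re.fullmatch(), which is not covered above and requires the whole string to match the pattern:

```python
import re

# \b keeps "cat" from matching inside "scattered".
words = re.findall(r"\bcat\b", "cat scattered cat")
print(words)  # ['cat', 'cat']

# {2,4} requires between two and four "e" characters.
two_e = bool(re.fullmatch(r"he{2,4}o", "heeo"))
one_e = bool(re.fullmatch(r"he{2,4}o", "heo"))
print(two_e, one_e)  # True False

# | matches either alternative.
either = re.findall(r"cat|dog", "cat, dog, bird")
print(either)  # ['cat', 'dog']

# ^ and $ anchor the pattern to the start and end of the string.
starts = bool(re.search(r"^hello", "hello world"))
ends = bool(re.search(r"world$", "hello world"))
print(starts, ends)  # True True
```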
