# Regular Expression

\d: Matches any digit (0-9). Example: \d will match any single digit in a string.

\D: Matches any non-digit. Example: \D will match any character that is not a digit.

\s: Matches any whitespace character (spaces, tabs, newlines). Example: \s will match any single whitespace character, such as a space or a tab.

\S: Matches any non-whitespace character. Example: \S will match any character that is not a space, tab, or newline.

\w: Matches any word character (alphanumeric + underscore). Example: \w+ will match one or more word characters.

\W: Matches any non-word character. Example: \W will match any character that is not a word character.

\A: Anchors the match at the start of the string. Example: \A\d will match a digit at the beginning of the string.

\b: Represents a word boundary. Example: \bword\b will match the word "word" as a whole word.

\B: Represents a non-word boundary. Example: \Bword\B will match "word" only when it sits inside a longer word, with word characters on both sides (as in "swordsman").

\Z: Anchors the match at the end of the string. Example: pattern\Z will match "pattern" only at the end of the string.

# Syntax

```
import re
pattern = <Pattern>
text = <Text>
matches = re.<method_name>(pattern,text)
print(matches)
```
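
The cells that follow write every pattern as a raw string (the `r'...'` prefix). A minimal sketch of why, assuming the same sample sentence used in the cells below: without the prefix, Python turns `\b` into a backspace character before the regex engine ever sees it.

```python
import re

text = 'The price is 25 Rupees'

# In a normal string literal, '\b' is one backspace character.
plain = '\bprice\b'
# In a raw string the backslash survives, so the regex engine sees \b.
raw = r'\bprice\b'

plain_matches = re.findall(plain, text)
raw_matches = re.findall(raw, text)

print(len('\b'), len(r'\b'))  # 1 2
print(plain_matches)          # []
print(raw_matches)            # ['price']
```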



In [1]:
# \d - Matches any digit (0-9)

import re

pattern = r'\d'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

['2', '5']


In [2]:
# \D - Matches any non-digit

pattern = r'\D'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

['T', 'h', 'e', ' ', 'p', 'r', 'i', 'c', 'e', ' ', 'i', 's', ' ', ' ', 'R', 'u', 'p', 'e', 'e', 's']


In [3]:
# \s - Matches any whitespace character

pattern = r'\s'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

[' ', ' ', ' ', ' ']


In [4]:
# \S - Matches any non-whitespace character

pattern = r'\S'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

['T', 'h', 'e', 'p', 'r', 'i', 'c', 'e', 'i', 's', '2', '5', 'R', 'u', 'p', 'e', 'e', 's']


In [6]:
# \w - Matches any word character (alphanumeric + underscore)
import re
pattern = r'\w'
text = 'The price is 25 Rupees_$'

matches = re.findall(pattern,text)
print(matches)

['T', 'h', 'e', 'p', 'r', 'i', 'c', 'e', 'i', 's', '2', '5', 'R', 'u', 'p', 'e', 'e', 's', '_']


In [7]:
# \W - Matches any non-word character

pattern = r'\W'
text = 'The price is 25 Rupees, purchased by john_doe123 #:'

matches = re.findall(pattern,text)
print(matches)

[' ', ' ', ' ', ' ', ',', ' ', ' ', ' ', ' ', '#', ':']


In [10]:
# \A: Anchors the match at the start of the string

pattern = r'\AHello'
text = 'Hello world'

matches = re.findall(pattern,text)
print(matches)

['Hello']


In [23]:
# \b: Represents a word boundary.

pattern = r'\bprice\b'
text = 'The price is 25 Rupees prices'

matches = re.findall(pattern,text)
print(matches)

['price']
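
For comparison, a small sketch on the same sentence with and without the `\b` anchors. The variable names are only illustrative. Without the anchors the pattern also matches the first five letters of "prices"; with them, only the standalone word matches.

```python
import re

text = 'The price is 25 Rupees prices'

# No boundaries: 'price' is also found inside 'prices'.
without_boundary = re.findall(r'price', text)
# With boundaries: the match must start and end at a word edge.
with_boundary = re.findall(r'\bprice\b', text)

print(without_boundary)  # ['price', 'price']
print(with_boundary)     # ['price']
```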


In [29]:
# \B: Represents a non-word boundary.
import re
pattern = r'\Brice'
text = 'The cprice is 25 Rupees prices'

matches = re.findall(pattern,text)
print(matches)

['rice', 'rice']


In [31]:
# \Z: Anchors the match at the end of the string

pattern = r'end\Z'
text = 'This is the ends'

matches = re.findall(pattern,text)
print(matches)

[]
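
A short sketch of a case where `\Z` and `$` differ, using a made-up string that ends in a newline. In Python, `$` also matches just before a trailing newline, while `\Z` matches only at the very end of the string.

```python
import re

text = 'This is the end\n'

# \Z requires 'end' to be the last characters of the string.
strict = re.findall(r'end\Z', text)
# $ tolerates a single trailing newline.
lenient = re.findall(r'end$', text)
# Once the newline is removed, \Z matches as well.
stripped = re.findall(r'end\Z', text.rstrip('\n'))

print(strict)    # []
print(lenient)   # ['end']
print(stripped)  # ['end']
```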


In [34]:
# . (Dot): Matches any single character except a newline

pattern = r'h.t'
text = 'hat, pot, h5t, haat'

matches = re.findall(pattern,text)
print(matches)

['hat', 'h5t']
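
By default the dot does not match a newline. The sketch below uses a made-up two-line string and the `re.DOTALL` flag to show the difference.

```python
import re

text = 'h\nt'

# Default behaviour: '.' refuses to match the newline between h and t.
default_matches = re.findall(r'h.t', text)
# With re.DOTALL, '.' matches any character, newline included.
dotall_matches = re.findall(r'h.t', text, flags=re.DOTALL)

print(default_matches)  # []
print(dotall_matches)   # ['h\nt']
```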


In [36]:
# ^ (Caret): Matches at the start of the string

pattern = r'^The'
text = 'the price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

[]
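
`^` and `\A` behave the same on a single line, but they differ once the `re.MULTILINE` flag is set. A minimal sketch, assuming a made-up two-line string:

```python
import re

text = 'The first line\nThe second line'

# Without flags, ^ matches only at the start of the whole string.
caret_default = re.findall(r'^The', text)
# With re.MULTILINE, ^ also matches right after each newline.
caret_multiline = re.findall(r'^The', text, re.MULTILINE)
# \A ignores re.MULTILINE and still matches only at the very start.
start_anchor = re.findall(r'\AThe', text, re.MULTILINE)

print(caret_default)    # ['The']
print(caret_multiline)  # ['The', 'The']
print(start_anchor)     # ['The']
```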


In [37]:
# $ (Dollar): Matches at the end of the string

pattern = r'Rupees$'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

['Rupees']


In [41]:
# * (Asterisk): Matches zero or more of the preceding element

pattern = r'ab*c'
text = 'ac, abc, abbc, abdc'

matches = re.findall(pattern,text)
print(matches)

['ac', 'abc', 'abbc']


In [42]:
# + (Plus): Matches one or more of the preceding element

pattern = r'ab+c'
text = 'ac, abc, abbc, abd'

matches = re.findall(pattern,text)
print(matches)

['abc', 'abbc']


In [44]:
# ? (Question mark): Matches zero or one of the preceding element

pattern = r'colors?'
text = 'hai, colors, color colorssss'

matches = re.findall(pattern,text)
print(matches)

['colors', 'color', 'colors']


In [51]:
# {} (Curly braces): Matches an exact number of repetitions, here two digits on each side of the colon

pattern = r'\d{2}:\d{2}'
text = 'The time is 10:10'

matches = re.findall(pattern,text)
print(matches)

['10:10']
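
Curly braces also accept a range. The sketch below uses a made-up string of digit runs to show `{m,n}` and the open-ended `{m,}`, with `\b` added where whole numbers are wanted.

```python
import re

text = '1 22 333 4444 55555'

# Between two and four digits; the five-digit run yields only its first four.
two_to_four = re.findall(r'\d{2,4}', text)
# Exactly three digits as a whole number.
exactly_three = re.findall(r'\b\d{3}\b', text)
# Three or more digits as a whole number.
three_or_more = re.findall(r'\b\d{3,}\b', text)

print(two_to_four)    # ['22', '333', '4444', '5555']
print(exactly_three)  # ['333']
print(three_or_more)  # ['333', '4444', '55555']
```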


In [53]:
# [] (Brackets): Matches any one character from the set

pattern = r'[ch]at'
text = 'The cat wears hat'

matches = re.findall(pattern,text)
print(matches)

['cat', 'hat']
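
Brackets can also hold a range of characters, or a negated set when the first character is `^`. A small sketch on a made-up word list:

```python
import re

text = 'cat bat hat rat'

# A range: any of a, b, or c followed by 'at'.
in_range = re.findall(r'[a-c]at', text)
# A negated set: any character except c, followed by 'at'.
negated = re.findall(r'[^c]at', text)

print(in_range)  # ['cat', 'bat']
print(negated)   # ['bat', 'hat', 'rat']
```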


In [58]:
# | (Pipe): Matches either the pattern on its left or the one on its right

pattern = r'cat|hat'
text = 'The cat wears hat'

matches = re.findall(pattern,text)
print(matches)

['cat', 'hat']
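
The pipe is often limited to part of a pattern by wrapping the alternatives in a group. The sketch below, on a made-up string, also shows that `re.findall` returns only the group's contents when the group is a capturing one.

```python
import re

text = 'gray grey groy'

# Non-capturing group (?:...): findall returns the whole match.
whole_words = re.findall(r'gr(?:a|e)y', text)
# Capturing group (...): findall returns only what the group matched.
group_only = re.findall(r'gr(a|e)y', text)

print(whole_words)  # ['gray', 'grey']
print(group_only)   # ['a', 'e']
```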


### Regex Functions

re.compile(pattern, flags=0): Compiles a regular expression pattern into a regex object for efficient reuse.

re.search(pattern, string, flags=0): Scans the string for the first location where the pattern matches and returns a match object, or None if there is no match.

re.match(pattern, string, flags=0): Checks whether the pattern matches at the beginning of the string and returns a match object, or None if it does not.

re.fullmatch(pattern, string, flags=0): Checks whether the entire string matches the pattern and returns a match object, or None if it does not.

re.split(pattern, string, maxsplit=0, flags=0): Splits the string at occurrences of the pattern and returns a list of substrings.

re.findall(pattern, string, flags=0): Finds all occurrences of the pattern in the string and returns a list of matches.

re.finditer(pattern, string, flags=0): Finds all occurrences of the pattern in the string and returns an iterator of match objects.

re.sub(pattern, repl, string, count=0, flags=0): Substitutes occurrences of the pattern with the replacement string in the input string.

re.subn(pattern, repl, string, count=0, flags=0): Similar to re.sub, but also returns the number of substitutions made.

re.escape(string): Escapes special characters in the string, making it safe to use as a literal in a regex pattern.


In [None]:
# re.compile

pattern = re.compile(r'\d+')
text = 'There are 123 apples and 456 oranges'

matches = re.findall(pattern,text)
print(matches)
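
A compiled pattern is most useful when it is applied more than once. The sketch below reuses one compiled object across several made-up strings, and passes the `flags` argument once at compile time. The names `number`, `rupees`, and `lines` are only illustrative.

```python
import re

# Compile once, then call methods on the pattern object.
number = re.compile(r'\d+')
lines = ['There are 123 apples', 'and 456 oranges', 'but no pears']

per_line = [number.findall(line) for line in lines]
print(per_line)  # [['123'], ['456'], []]

# Flags given to re.compile apply to every later call.
rupees = re.compile(r'rupees', flags=re.IGNORECASE)
any_case = rupees.findall('25 Rupees, 50 rupees, 75 RUPEES')
print(any_case)  # ['Rupees', 'rupees', 'RUPEES']
```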

In [None]:
# re.search

pattern = re.compile(r'\d+')
text = 'There are 123 apples and 456 oranges'

matches = pattern.search(text)
print(matches.group())

In [None]:
# re.match

pattern = re.compile(r'\d+')
text = '123 apples and 456 oranges'

matches = pattern.match(text)
print(matches.group())

In [None]:
# re.fullmatch

pattern = re.compile(r'\d+')
text = '123456'

matches = pattern.fullmatch(text)
print(matches.group())
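
The three lookups differ only in where the match is allowed to sit, and each returns None on failure, so calling `.group()` directly raises an AttributeError when nothing matches. A minimal sketch comparing them on three made-up inputs; `group_or_none` is a hypothetical helper, not part of the `re` module.

```python
import re

pattern = re.compile(r'\d+')


def group_or_none(match):
    """Return the matched text, or None when there was no match."""
    return match.group() if match is not None else None


results = {}
for text in ['123456', '123 apples', 'apples 123']:
    results[text] = (
        group_or_none(pattern.search(text)),     # anywhere in the string
        group_or_none(pattern.match(text)),      # only at the start
        group_or_none(pattern.fullmatch(text)),  # the whole string
    )
    print(text, results[text])
```

Only '123456' satisfies all three. '123 apples' fails `fullmatch`, and 'apples 123' is found by `search` alone.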

In [None]:
# re.split

pattern = re.compile(r'\s+')
text = 'The price of apple is 25 Rupees'

matches = pattern.split(text)
print(matches)
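
Two variations on splitting, sketched on the same sentence: `maxsplit` limits the number of splits, and a capturing group in the pattern keeps the separators in the result.

```python
import re

text = 'The price of apple is 25 Rupees'

# Stop after two splits; the rest of the string stays in one piece.
limited = re.split(r'\s+', text, maxsplit=2)
# A capturing group keeps the matched separator in the output list.
with_separator = re.split(r'(\d+)', text)

print(limited)         # ['The', 'price', 'of apple is 25 Rupees']
print(with_separator)  # ['The price of apple is ', '25', ' Rupees']
```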

In [None]:
# re.findall

pattern = re.compile(r'\d+')
text = 'There are 123 apples and 456 oranges'

matches = re.findall(pattern,text)
print(matches)

In [None]:
# re.finditer

pattern = re.compile(r'\d+')
text = 'There are 123 apples and 456 oranges'

matches = pattern.finditer(text)
for match in matches:
    print(match.group())
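
Match objects carry positions as well as text, which is the main reason to prefer `finditer` over `findall`. A short sketch on the same sentence, collecting each match with its span:

```python
import re

text = 'There are 123 apples and 456 oranges'

found = []
for match in re.finditer(r'\d+', text):
    # span() gives (start, end) indexes into the original string.
    found.append((match.group(), match.span()))
    print(match.group(), match.start(), match.end())

print(found)  # [('123', (10, 13)), ('456', (25, 28))]
```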

In [None]:
# re.sub

pattern = re.compile(r'\d+')
text = 'The price of apple is 25 Rupees and Orange is 50 Rupees'

matches = pattern.sub("A", text)
print(matches)
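
The replacement can also be a function that receives each match object, and `count` limits how many replacements are made. A minimal sketch on the same sentence; `double` is a hypothetical helper written for this illustration.

```python
import re

text = 'The price of apple is 25 Rupees and Orange is 50 Rupees'


def double(match):
    """Replace a number with twice its value."""
    return str(int(match.group()) * 2)


doubled = re.sub(r'\d+', double, text)
first_only = re.sub(r'\d+', 'A', text, count=1)

print(doubled)     # The price of apple is 50 Rupees and Orange is 100 Rupees
print(first_only)  # The price of apple is A Rupees and Orange is 50 Rupees
```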

In [None]:
# re.subn

pattern = re.compile(r'\d+')
text = 'The price of apple is 25 Rupees and Orange is 50 Rupees'

replaced_text, num_substitutions = pattern.subn("A",text)
print(replaced_text)
print(num_substitutions)

In [None]:
# re.escape

pattern = re.compile(re.escape('$'))  # re.escape('$') produces the literal pattern '\$'
text = 'The price of apple is $50'

matches = re.findall(pattern, text)
print(matches)
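
Escaping matters most when the text to search for comes from outside the program and may contain metacharacters. The sketch below uses a made-up sentence and search term to compare an unescaped and an escaped search.

```python
import re

text = 'Is 1+1 equal to 2? 11 is not 1+1.'
literal = '1+1'  # search term containing the metacharacter +

# Unescaped, + is a quantifier: one or more 1s followed by another 1.
unescaped = re.findall(literal, text)
# Escaped, every character of the search term is matched literally.
escaped = re.findall(re.escape(literal), text)

print(unescaped)  # ['11']
print(escaped)    # ['1+1', '1+1']
```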