# Regular Expression

\d: Matches any digit (0-9). Example: \d will match any single digit in a string.

\D: Matches any non-digit. Example: \D will match any character that is not a digit.

\s: Matches any whitespace character (spaces, tabs, newlines). Example: \s will match a single space, tab, or newline.

\S: Matches any non-whitespace character. Example: \S will match any character that is not whitespace.

\w: Matches any word character (alphanumeric + underscore). Example: \w+ will match one or more word characters.

\W: Matches any non-word character. Example: \W will match any character that is not a word character.

\A: Anchors the match at the start of the string. Example: \A\d will match a digit at the beginning of the string.

\b: Represents a word boundary. Example: \bword\b will match the word "word" as a whole word.

\B: Represents a non-word boundary. Example: \Bword\B will match the word "word" only if it's within another word.

\Z: Anchors the match at the end of the string. Example: pattern\Z will match "pattern" only at the end of the string.
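
As a rough illustration of how \A and \Z relate to ^ and $, the sketch below uses a made-up two-line string with the re.MULTILINE flag. With that flag, ^ and $ match at every line, while \A and \Z still match only at the start and end of the whole string.

```python
import re

text = "first line\nsecond line"  # made-up sample text

# With re.MULTILINE, ^ matches at the start of every line...
caret_hits = re.findall(r'^\w+', text, flags=re.MULTILINE)
# ...while \A still matches only at the very start of the string.
string_start_hits = re.findall(r'\A\w+', text, flags=re.MULTILINE)

# Likewise, $ matches at the end of every line, \Z only at the end of the string.
dollar_hits = re.findall(r'\w+$', text, flags=re.MULTILINE)
string_end_hits = re.findall(r'\w+\Z', text, flags=re.MULTILINE)

print(caret_hits)         # ['first', 'second']
print(string_start_hits)  # ['first']
print(dollar_hits)        # ['line', 'line']
print(string_end_hits)    # ['line']
```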

# Syntax

```
import re
pattern = <Pattern>
text = <Text>
matches = re.<method_name>(pattern,text)
print(matches)
```
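
The patterns in this notebook are written as raw strings (r'...'). A small sketch of why that matters, using a made-up sample text: in a normal Python string, '\b' is the backspace character, so the word-boundary pattern silently stops working.

```python
import re

text = 'a cat sat'  # made-up sample text

# Raw string: the regex engine receives \b and treats it as a word boundary.
raw_hits = re.findall(r'\bcat\b', text)

# Normal string: Python turns '\b' into a backspace character before
# the regex engine sees it, so nothing in the text matches.
plain_hits = re.findall('\bcat\b', text)

print(raw_hits)    # ['cat']
print(plain_hits)  # []
```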



In [None]:
# \d - Matches any digit (0-9)

import re

pattern = r'\d'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

['2', '5']


In [None]:
# \D - Matches any non digit

pattern = r'\D'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

['T', 'h', 'e', ' ', 'p', 'r', 'i', 'c', 'e', ' ', 'i', 's', ' ', ' ', 'R', 'u', 'p', 'e', 'e', 's']


In [None]:
# \s - Matches any white space character

pattern = r'\s'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

[' ', ' ', ' ', ' ']


In [None]:
# \S - Matches any non-whitespace character

pattern = r'\S'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

['T', 'h', 'e', 'p', 'r', 'i', 'c', 'e', 'i', 's', '2', '5', 'R', 'u', 'p', 'e', 'e', 's']


In [None]:
# \w - Matches any word character(alphanumeric + underscore)
import re
pattern = r'\w'
text = 'The price is 25 Rupees_$'

matches = re.findall(pattern,text)
print(matches)

['T', 'h', 'e', 'p', 'r', 'i', 'c', 'e', 'i', 's', '2', '5', 'R', 'u', 'p', 'e', 'e', 's', '_']


In [None]:
# \W - Matches any non word character

pattern = r'\W'
text = 'The price is 25 Rupees, purchased by john_doe123 #:'

matches = re.findall(pattern,text)
print(matches)

[' ', ' ', ' ', ' ', ',', ' ', ' ', ' ', ' ', '#', ':']


In [None]:
# \A: Anchors the match at the start of the string

pattern = r'\AHello'
text = 'Hello world'

matches = re.findall(pattern,text)
print(matches)

['Hello']


In [None]:
# \b: Represents a word boundary.

pattern = r'\bprice\b'
text = 'The price is 25 Rupees prices'

matches = re.findall(pattern,text)
print(matches)

['price']


In [None]:
# \B: Represents a non word boundary.
import re
pattern = r'\Brice'
text = 'The cprice is 25 Rupees prices'

matches = re.findall(pattern,text)
print(matches)

['rice', 'rice']
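
To compare \b and \B side by side, here is a minimal sketch on one made-up string. The variable names are only for illustration.

```python
import re

text = 'price prices cprice'  # made-up sample text

# \b on both sides: only the standalone word matches.
whole_word = re.findall(r'\bprice\b', text)

# \b only at the front: any word that starts with "price".
starts_with = re.findall(r'\bprice\w*', text)

# \B before "price": it must be preceded by another word character.
inside_word = re.findall(r'\w*\Bprice', text)

print(whole_word)   # ['price']
print(starts_with)  # ['price', 'prices']
print(inside_word)  # ['cprice']
```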


In [None]:
# \Z: Anchors the match at the end of the string

pattern = r'end\Z'
text = 'This is the ends'

matches = re.findall(pattern,text)
print(matches)

[]


In [None]:
# Dot (.) - Matches any single character except a newline

pattern = r'h.t'
text = 'hat, pot, h5t, haat'

matches = re.findall(pattern,text)
print(matches)

['hat', 'h5t']


In [None]:
# Caret (^) - Anchors the match at the start of the string

pattern = r'^The'
text = 'the price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

[]


In [None]:
# Dollar ($) - Anchors the match at the end of the string (or just before a trailing newline)

pattern = r'Rupees$'
text = 'The price is 25 Rupees'

matches = re.findall(pattern,text)
print(matches)

['Rupees']


In [None]:
# Asterisk (*) - Matches zero or more of the preceding element

pattern = r'ab*c'
text = 'ac, abc, abbc, abdc'

matches = re.findall(pattern,text)
print(matches)

['ac', 'abc', 'abbc']


In [None]:
# Plus (+) - Matches one or more of the preceding element

pattern = r'ab+c'
text = 'ac, abc, abbc, abd'

matches = re.findall(pattern,text)
print(matches)

['abc', 'abbc']


In [None]:
# Question mark (?) - Matches zero or one of the preceding element

pattern = r'colors?'
text = 'hai, colors, color colorssss'

matches = re.findall(pattern,text)
print(matches)

['colors', 'color', 'colors']


In [None]:
# Curly braces ({n}) - Matches exactly n repetitions of the preceding element

pattern = r'\d{2}:\d{2}'
text = 'The time is 10:10'

matches = re.findall(pattern,text)
print(matches)

['10:10']
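
Curly braces also accept a range. The sketch below, on a made-up list of numbers, shows the {n}, {m,n} and {m,} forms; \b is added so that only whole numbers are counted.

```python
import re

text = '7 42 123 2024 31337'  # made-up sample text

exactly_two = re.findall(r'\b\d{2}\b', text)     # exactly 2 digits
two_to_four = re.findall(r'\b\d{2,4}\b', text)   # between 2 and 4 digits
three_or_more = re.findall(r'\b\d{3,}\b', text)  # 3 digits or more

print(exactly_two)    # ['42']
print(two_to_four)    # ['42', '123', '2024']
print(three_or_more)  # ['123', '2024', '31337']
```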


In [None]:
# Brackets ([]) - Matches any one character from the set

pattern = r'[ch]at'
text = 'The cat wears hat'

matches = re.findall(pattern,text)
print(matches)

['cat', 'hat']
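
Brackets can also hold ranges and negated sets. A short sketch on a made-up string:

```python
import re

text = 'cat hat bat rat c4t'  # made-up sample text

# A range: any one character from a to c, followed by "at".
range_hits = re.findall(r'[a-c]at', text)

# A negated set: any one character except c or h, followed by "at".
negated_hits = re.findall(r'[^ch]at', text)

# A digit range between two literal characters.
digit_hits = re.findall(r'c[0-9]t', text)

print(range_hits)    # ['cat', 'bat']
print(negated_hits)  # ['bat', 'rat']
print(digit_hits)    # ['c4t']
```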


In [None]:
# Pipe (|) - Matches either the pattern on its left or the one on its right

pattern = r'cat|hat'
text = 'The cat wears hat'

matches = re.findall(pattern,text)
print(matches)

['cat', 'hat']


### Regex Functions

re.compile(pattern, flags=0): Compiles a regular expression pattern into a regex object for efficient reuse.

re.search(pattern, string, flags=0): Searches for the first occurrence of the pattern anywhere in the string and returns a match object, or None if there is no match.

re.match(pattern, string, flags=0): Checks if the pattern matches at the beginning of the string and returns a match object, or None if it does not.

re.fullmatch(pattern, string, flags=0): Checks if the entire string matches the pattern and returns a match object, or None if it does not.

re.split(pattern, string, maxsplit=0, flags=0): Splits the string at occurrences of the pattern and returns a list of substrings.

re.findall(pattern, string, flags=0): Finds all occurrences of the pattern in the string and returns a list of matches.

re.finditer(pattern, string, flags=0): Finds all occurrences of the pattern in the string and returns an iterator of match objects.

re.sub(pattern, repl, string, count=0, flags=0): Substitutes occurrences of the pattern with the replacement string in the input string.

re.subn(pattern, repl, string, count=0, flags=0): Similar to re.sub, but also returns the number of substitutions made.

re.escape(string): Escapes special characters in the string, making it safe to use as a literal in a regex pattern.
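
The difference between search, match and fullmatch is easiest to see on one input. In this sketch (made-up text), only search finds the digits, because they are neither at the start of the string nor the whole string. The other two return None, so check the result before calling .group().

```python
import re

pattern = re.compile(r'\d+')
text = 'apples: 123'  # made-up sample text

search_result = pattern.search(text)        # looks anywhere in the string
match_result = pattern.match(text)          # looks only at the start
fullmatch_result = pattern.fullmatch(text)  # the whole string must match

print(search_result.group() if search_result else None)  # 123
print(match_result)      # None
print(fullmatch_result)  # None
```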


In [6]:
# re.compile
import re

pattern = re.compile(r'\d+')
text = 'There are 123 apples and 456 oranges'

matches = re.findall(pattern,text)
print(matches)

['123', '456']


In [8]:
# re.search

pattern = re.compile(r'\d+')
text = 'There are 123 apples and 456 oranges'

matches = pattern.search(text)
print(matches)
print(matches.group())

<re.Match object; span=(10, 13), match='123'>
123


In [12]:
# re.match

pattern = re.compile(r'\d+')
text = '123 apples and 456 oranges'

matches = pattern.match(text)
print(matches)
print(matches.group())

<re.Match object; span=(0, 3), match='123'>
123


In [16]:
# re.fullmatch

pattern = re.compile(r'\d+')
text = '123456'

matches = pattern.fullmatch(text)
print(matches)
print(matches.group())

<re.Match object; span=(0, 6), match='123456'>
123456


In [19]:
# re.split

pattern = re.compile(r'\s+')
text = 'The price of apple is 25 Rupees'

matches = pattern.split(text)
print(matches)

['The', 'price', 'of', 'apple', 'is', '25', 'Rupees']
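
Two more things split can do, shown as a rough sketch: maxsplit limits the number of splits, and a capturing group in the pattern keeps the separators in the result. The second string is made up for the example.

```python
import re

text = 'The price of apple is 25 Rupees'

# Stop after two splits; the rest of the string stays in one piece.
first_two = re.split(r'\s+', text, maxsplit=2)

csv_like = 'a, b;c'  # made-up sample text
# Split on a comma or semicolon, plus any spaces after it.
parts = re.split(r'[,;]\s*', csv_like)
# With a capturing group, the separators are returned as well.
parts_with_separators = re.split(r'([,;])\s*', csv_like)

print(first_two)              # ['The', 'price', 'of apple is 25 Rupees']
print(parts)                  # ['a', 'b', 'c']
print(parts_with_separators)  # ['a', ',', 'b', ';', 'c']
```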


In [None]:
# re.findall

pattern = re.compile(r'\d+')
text = 'There are 123 apples and 456 oranges'

matches = re.findall(pattern,text)
print(matches)

['123', '456']

In [23]:
# re.finditer

pattern = re.compile(r'\d+')
text = 'There are 123 apples and 456 oranges'

matches = pattern.finditer(text)
print(matches)
for mx in matches:
    print(mx)
    print(mx.group())

<callable_iterator object at 0x7ed9841561d0>
<re.Match object; span=(10, 13), match='123'>
123
<re.Match object; span=(25, 28), match='456'>
456
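
Each match object from finditer also knows where it was found. A small sketch that collects the matched text together with its start and end positions:

```python
import re

pattern = re.compile(r'\d+')
text = 'There are 123 apples and 456 oranges'

# Build a list of (matched text, start index, end index) tuples.
positions = [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]

print(positions)  # [('123', 10, 13), ('456', 25, 28)]
```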


In [25]:
# re.sub

pattern = re.compile(r'\d+')
text = 'The price of apple is 25 Rupees and Orange is 50 Rupees'

matches = pattern.sub("price", text)
print(matches)

The price of apple is price Rupees and Orange is price Rupees
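
The replacement does not have to be a fixed string. As a sketch, the examples below reuse a captured group with \1, pass a function that computes each replacement, and use count to limit how many replacements are made.

```python
import re

text = 'The price of apple is 25 Rupees and Orange is 50 Rupees'

# \1 in the replacement refers to the first captured group.
reformatted = re.sub(r'(\d+) Rupees', r'Rs. \1', text)

# A function receives each match object and returns the replacement text.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), text)

# count=1 replaces only the first occurrence.
first_only = re.sub(r'\d+', 'N', text, count=1)

print(reformatted)  # The price of apple is Rs. 25 and Orange is Rs. 50
print(doubled)      # The price of apple is 50 Rupees and Orange is 100 Rupees
print(first_only)   # The price of apple is N Rupees and Orange is 50 Rupees
```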


In [26]:
# re.subn

pattern = re.compile(r'\d+')
text = 'The price of apple is 25 Rupees and Orange is 50 Rupees'

a,b = pattern.subn("A",text)
print(a)
print(b)

The price of apple is A Rupees and Orange is A Rupees
2


In [29]:
# re.escape

# Escape the literal so that '$' means a dollar sign, not the end-of-string anchor
pattern = re.compile(re.escape('$50'))
text = 'The price of apple is $50'

matches = re.findall(pattern, text)
print(matches)

['$50']


In [43]:
#Name
#date of birth [dd-mm-yyyy]
#mobile xxx-xxx-xxxx
#instaid
#email text@gmail.com

#Name
x = True
while x:
  pattern = re.compile(r'^[A-Za-z ]+')
  text = input("Enter Name: ")
  matches = pattern.fullmatch(text)
  if matches is not None:
    name = matches.group()
    x = False
  else:
    print("Enter Name in correct format")
print(name)
x = True
while x:
  pattern = re.compile(r'\d{2}-\d{2}-\d{4}')
  dob1 = input("Enter Date of Birth: ")
  matches = pattern.fullmatch(dob1)
  if matches is not None:
    dob = matches.group()
    x = False
  else:
    print("Enter DOB in correct format")
print(dob)
x = True
while x:
  pattern = re.compile(r'\d{3}-\d{3}-\d{4}')
  phone1 = input("Enter Mobile Number: ")
  matches = pattern.fullmatch(phone1)
  if matches is not None:
    phone = matches.group()
    x = False
  else:
    print("Enter Mobile in correct format")
print(phone)
insta = input("Enter Insta Id: ")
print(insta)
x = True
while x:
  pattern = re.compile(r'[a-zA-Z0-9]+@gmail\.com')
  email1 = input("Enter Email: ")
  matches = pattern.fullmatch(email1)
  if matches is not None:
    email = matches.group()
    x = False
  else:
    print("Enter Email in correct format")
print(email)

Enter Name: sai 123#
Enter Name in correct format
Enter Name: sai vardhan
sai vardhan
Enter Date of Birth: 24-08-20244
Enter DOB in correct format
Enter Date of Birth: 24-08-2024
24-08-2024
Enter Mobile Number: 7893570611
Enter Mobile in correct format
Enter Mobile Number: 789-357-0611
789-357-0611
Enter Insta Id: iamsai2408
iamsai2408
Enter Email: sai vardhan@gmail.com
Enter Email in correct format
Enter Email: saivardhan@gmail.com
saivardhan@gmail.com
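
The four loops above repeat the same steps, so one possible refactor is to keep the patterns in a dictionary and validate through a single function. This is only a sketch: the names FIELD_PATTERNS, is_valid and ask are hypothetical, and the patterns are copied from the cell above (with the dot in the email pattern escaped).

```python
import re

# Hypothetical field names; the patterns mirror the ones in the cell above.
FIELD_PATTERNS = {
    'name': re.compile(r'[A-Za-z ]+'),
    'dob': re.compile(r'\d{2}-\d{2}-\d{4}'),
    'mobile': re.compile(r'\d{3}-\d{3}-\d{4}'),
    'email': re.compile(r'[a-zA-Z0-9]+@gmail\.com'),
}

def is_valid(field, value):
    """Return True if the whole value matches the pattern for the field."""
    return FIELD_PATTERNS[field].fullmatch(value) is not None

def ask(field, prompt):
    """Keep prompting until the input passes validation (hypothetical helper)."""
    while True:
        value = input(prompt)
        if is_valid(field, value):
            return value
        print(f"Enter {field} in correct format")

# Checks on the same inputs used in the run above.
print(is_valid('name', 'sai 123#'))         # False
print(is_valid('name', 'sai vardhan'))      # True
print(is_valid('dob', '24-08-20244'))       # False
print(is_valid('mobile', '789-357-0611'))   # True
```

These patterns only check the shape of the input. For example, '99-99-9999' passes the date check even though it is not a real date.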
