<img src = "python-logo.png" width = "300" height = "300">
<h1>Regular Expressions</h1>

<ul>
    <li>A regular expression (RegEx) is a sequence of characters that defines a search pattern.</li>
    <li>A RegEx can be used to check whether a string contains a given pattern, for example to validate input.</li>
    <li>Python has a built-in module called re for working with regular expressions.</li>
</ul>
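
<p>As a minimal sketch of the "does this string contain the pattern" use case in the list above, the snippet below wraps re.search in a small helper. The name contains_pattern is hypothetical and chosen only for illustration; it is not part of the re module.</p>

```python
import re

def contains_pattern(pattern, text):
    """Return True if `pattern` matches anywhere in `text` (hypothetical helper)."""
    # re.search returns a Match object on success and None otherwise.
    return re.search(pattern, text) is not None

print(contains_pattern("Python", "Welcome to Python"))  # True
print(contains_pattern("Java", "Welcome to Python"))    # False
```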

<h4>RegEx Functions:</h4>
<ol>
    <li>re.findall(pattern, string) --> returns a list containing all matches</li>
    <li>re.search(pattern, string) --> returns a Match object for the first match anywhere in the string, or None if there is no match</li>
    <li>re.split(pattern, string, maxsplit=0) --> returns a list where the string has been split at each match; maxsplit is optional</li>
    <li>re.sub(pattern, replacement, string, count=0) --> returns a new string with one or more matches replaced; count is optional</li>
</ol>

<h4>1. findall()</h4>

In [296]:
import re

my_string = "Welcome to Python"
result = re.findall("to", my_string)
print(result)

['to']


In [298]:
my_string = "Welcome to Python"
result = re.findall("elcome ", my_string)
print(result)

['elcome ']


In [300]:
my_string = "Welcome to Python"
result = re.findall("P", my_string)
print(result)

['P']


In [302]:
my_string = "Welcome to Python"
pattern = "Welcome"
result = re.findall(pattern, my_string)
print(result)

['Welcome']


In [304]:
# Matching is case-sensitive; findall returns an empty list when nothing matches
my_string = "Welcome to Python"
pattern = "weLcO"
result = re.findall(pattern, my_string)
print(result)

[]
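
<p>The empty result above comes from case-sensitive matching. One possible way to ignore case, sketched here as an illustration rather than as part of the original notebook, is the re.IGNORECASE flag:</p>

```python
import re

my_string = "Welcome to Python"

# Default matching is case-sensitive, so "weLcO" finds nothing.
case_sensitive = re.findall("weLcO", my_string)

# With re.IGNORECASE, letters match regardless of case.
# findall returns the text exactly as it appears in the string.
case_insensitive = re.findall("weLcO", my_string, flags=re.IGNORECASE)

print(case_sensitive)    # []
print(case_insensitive)  # ['Welco']
```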


<h4>2. search()</h4>

In [307]:
my_string = "Welcome to Python"
pattern = "Wel"
result = re.search(pattern, my_string)
print(result)
print(result.start())

<re.Match object; span=(0, 3), match='Wel'>
0


In [309]:
print(result.span())

(0, 3)


In [311]:
print(result.string)

Welcome to Python


In [313]:
print(result.group())

Wel


In [315]:
# If no match exists, search() returns None
my_string = "Welcome to Python"
pattern = "WEL"
result = re.search(pattern, my_string)
print(result)

None
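
<p>Because search() can return None, calling .start() or .group() on the result without a check raises AttributeError when nothing matches. A small sketch of guarding against that follows; the helper name first_match_span is an assumption made for this example.</p>

```python
import re

def first_match_span(pattern, text):
    """Return (start, end) of the first match, or None when nothing matches."""
    match = re.search(pattern, text)
    if match is None:
        # No match: avoid calling Match methods on None.
        return None
    return match.span()

print(first_match_span("Wel", "Welcome to Python"))  # (0, 3)
print(first_match_span("WEL", "Welcome to Python"))  # None
```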


<h4>3. split()</h4>

In [318]:
my_string = "Welcome to Python"
pattern = "Welc"
result = re.split(pattern, my_string)
print(result)

['', 'ome to Python']


In [320]:
my_string = "Welcome to Python"
pattern = ""
result = re.split(pattern, my_string)
print(result)

['', 'W', 'e', 'l', 'c', 'o', 'm', 'e', ' ', 't', 'o', ' ', 'P', 'y', 't', 'h', 'o', 'n', '']


In [322]:
my_string = "Welcome to Python"
pattern = " "
result = re.split(pattern, my_string)
print(result)

['Welcome', 'to', 'Python']


In [324]:
my_string = "Welcome to Python"
pattern = ""
# Pass maxsplit by keyword; passing it positionally is deprecated since Python 3.13
result = re.split(pattern, my_string, maxsplit=3)
print(result)

['', 'W', 'e', 'lcome to Python']
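
<p>The split() examples above use fixed strings as the pattern. The sketch below shows, under the assumption of a made-up sample string named messy, how a pattern can split on runs of whitespace or on several delimiters at once.</p>

```python
import re

# Hypothetical sample text with uneven spacing and mixed delimiters.
messy = "Welcome   to,Python;  happy"

# Split on one or more whitespace characters.
by_whitespace = re.split(r"\s+", messy)

# Split on a comma or semicolon (plus any spaces after it), or on a whitespace run.
by_delimiters = re.split(r"[,;]\s*|\s+", messy)

# maxsplit=1 performs only the first split and keeps the rest intact.
first_only = re.split(r"\s+", messy, maxsplit=1)

print(by_whitespace)  # ['Welcome', 'to,Python;', 'happy']
print(by_delimiters)  # ['Welcome', 'to', 'Python', 'happy']
print(first_only)     # ['Welcome', 'to,Python;  happy']
```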


<h4>4. sub()</h4>

In [327]:
my_string = "Welcome to Python, happy for learning it"
pattern = " "
result = re.sub(pattern, "_", my_string)
print(result)

Welcome_to_Python,_happy_for_learning_it


In [329]:
pattern = "p"
result = re.sub(pattern, "P", my_string)
print(result)

Welcome to Python, haPPy for learning it


In [331]:
pattern = "o"
# Pass count by keyword; passing it positionally is deprecated since Python 3.13
result = re.sub(pattern, "O", my_string, count=3)
print(result)

WelcOme tO PythOn, happy for learning it
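
<p>When the number of replacements matters, re.subn can be used instead of re.sub. The following is a short sketch on the same sample sentence, not a prescribed approach:</p>

```python
import re

my_string = "Welcome to Python, happy for learning it"

# count limits how many matches are replaced, scanning left to right.
first_two = re.sub("o", "O", my_string, count=2)

# re.subn returns a (new_string, number_of_replacements) tuple.
replaced, n = re.subn("o", "O", my_string)

print(first_two)  # WelcOme tO Python, happy for learning it
print(replaced)   # WelcOme tO PythOn, happy fOr learning it
print(n)          # 4
```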


<h4>Metacharacters</h4>

In [334]:
my_string = "Welcome to Python, happy for learning it"
pattern = "[o]"
result = re.findall(pattern, my_string)
print(result)

['o', 'o', 'o', 'o']


In [336]:
pattern = "[a-m]"
result = re.findall(pattern, my_string)
print(result)

['e', 'l', 'c', 'm', 'e', 'h', 'h', 'a', 'f', 'l', 'e', 'a', 'i', 'g', 'i']


In [338]:
pattern = "P..."
result = re.findall(pattern, my_string)
print(result)

['Pyth']


In [340]:
pattern = "...P"
result = re.findall(pattern, my_string)
print(result)

['to P']


In [342]:
pattern = "^Wel"
result = re.findall(pattern, my_string)
print(result)

['Wel']


In [344]:
pattern = "it$"
result = re.findall(pattern, my_string)
print(result)

['it']


In [346]:
pattern = "o*"
result = re.findall(pattern, my_string)
print(result)

['', '', '', '', 'o', '', '', '', '', 'o', '', '', '', '', '', 'o', '', '', '', '', '', '', '', '', '', '', 'o', '', '', '', '', '', '', '', '', '', '', '', '', '', '']


In [348]:
pattern = "o+"
result = re.findall(pattern, my_string)
print(result)

['o', 'o', 'o', 'o']


In [350]:
my_string = "Welcoome too Python, happy foor learning it"
pattern = "o{2}"
result = re.findall(pattern, my_string)
print(result)

['oo', 'oo', 'oo']


In [352]:
pattern = "Python|python"
result = re.findall(pattern, my_string)
print(result)

['Python']
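
<p>The quantifier examples above can be compared side by side. The sketch below uses a made-up string, text, to show why o* also produces empty strings (it is allowed to match zero characters) while o+ and the {m,n} forms do not.</p>

```python
import re

# Hypothetical sample: runs of one, two, and three 'o' characters.
text = "o oo ooo"

# '*' means zero or more, so it also matches the empty string
# at positions where there is no 'o'.
zero_or_more = re.findall("o*", text)

# '+' means one or more, so every match contains at least one 'o'.
one_or_more = re.findall("o+", text)

# '{2}' means exactly two; '{2,3}' means two or three (as many as possible).
exactly_two = re.findall("o{2}", text)
two_or_three = re.findall("o{2,3}", text)

print(one_or_more)   # ['o', 'oo', 'ooo']
print(exactly_two)   # ['oo', 'oo']
print(two_or_three)  # ['oo', 'ooo']
```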


<h4>Special Sequences</h4>

In [355]:
my_string = "The rain in Spain"
pattern = r"\AThe"
result = re.findall(pattern, my_string)
print(result)

['The']


In [357]:
pattern = r"\Ahe"
result = re.findall(pattern, my_string)
print(result)

[]


In [359]:
pattern = r"ain\b"
result = re.findall(pattern, my_string)
print(result)

['ain', 'ain']


In [361]:
pattern = r"\br"
result = re.findall(pattern, my_string)
print(result)

['r']


In [363]:
pattern = r"ain\B"
result = re.findall(pattern, my_string)
print(result)

[]


In [365]:
pattern = r"\Bain"
result = re.findall(pattern, my_string)
print(result)

['ain', 'ain']


In [367]:
my_string = "ljalkfjd 197837"
pattern = r"\d"
result = re.findall(pattern, my_string)
print(result)

['1', '9', '7', '8', '3', '7']


In [369]:
my_string = "ljalkfjd 197837"
pattern = r"\D"
result = re.findall(pattern, my_string)
print(result)

['l', 'j', 'a', 'l', 'k', 'f', 'j', 'd', ' ']


In [371]:
my_string = "lja lkf jd1 978 375"
pattern = r"\s"
result = re.findall(pattern, my_string)
print(result)

[' ', ' ', ' ', ' ']


In [373]:
my_string = "lja lkf jd1 978 375"
pattern = r"\S"
result = re.findall(pattern, my_string)
print(result)

['l', 'j', 'a', 'l', 'k', 'f', 'j', 'd', '1', '9', '7', '8', '3', '7', '5']


In [375]:
my_string = "lja lkf jd1 978 375"
pattern = r"\w"
result = re.findall(pattern, my_string)
print(result)

['l', 'j', 'a', 'l', 'k', 'f', 'j', 'd', '1', '9', '7', '8', '3', '7', '5']


In [377]:
my_string = "lja lkf jd1 978 375"
pattern = r"\W"
result = re.findall(pattern, my_string)
print(result)

[' ', ' ', ' ', ' ']


In [379]:
my_string = "lja lkf jd1 978 375"
pattern = r"5\Z"
result = re.findall(pattern, my_string)
print(result)

['5']
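
<p>The sequences above are usually combined with quantifiers. As an illustrative sketch on a made-up sentence, the snippet below uses \b with \d+ and \w+ to pull out whole numbers and whole words, and \A and \Z to test the start and end of the string.</p>

```python
import re

# Hypothetical sample sentence.
text = "Order 66 shipped 3 items on day 12"

# \b\d+\b matches complete runs of digits bounded by word boundaries.
numbers = re.findall(r"\b\d+\b", text)

# \b\w+\b matches complete words (letters, digits, and underscore).
words = re.findall(r"\b\w+\b", text)

# \A anchors at the very start of the string, \Z at the very end.
starts_with_order = bool(re.search(r"\AOrder", text))
ends_with_digits = bool(re.search(r"\d+\Z", text))

print(numbers)            # ['66', '3', '12']
print(words)              # ['Order', '66', 'shipped', '3', 'items', 'on', 'day', '12']
print(starts_with_order)  # True
print(ends_with_digits)   # True
```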


<h4>Sets</h4>

In [382]:
my_string = "asdfgf;lkjhjASDFGD:LKJHG12340987"
pattern = "[asd]"
result = re.findall(pattern, my_string)
print(result)

['a', 's', 'd']


In [384]:
pattern = "[a-n]"
result = re.findall(pattern, my_string)
print(result)

['a', 'd', 'f', 'g', 'f', 'l', 'k', 'j', 'h', 'j']


In [386]:
pattern = "[A-N]"
result = re.findall(pattern, my_string)
print(result)

['A', 'D', 'F', 'G', 'D', 'L', 'K', 'J', 'H', 'G']


In [388]:
pattern = "[^asd]"
result = re.findall(pattern, my_string)
print(result)

['f', 'g', 'f', ';', 'l', 'k', 'j', 'h', 'j', 'A', 'S', 'D', 'F', 'G', 'D', ':', 'L', 'K', 'J', 'H', 'G', '1', '2', '3', '4', '0', '9', '8', '7']


In [392]:
pattern = "[0123]"
result = re.findall(pattern, my_string)
print(result)

['1', '2', '3', '0']


In [394]:
pattern = "[0-9]"
result = re.findall(pattern, my_string)
print(result)

['1', '2', '3', '4', '0', '9', '8', '7']


In [396]:
pattern = "[^0123]"
result = re.findall(pattern, my_string)
print(result)

['a', 's', 'd', 'f', 'g', 'f', ';', 'l', 'k', 'j', 'h', 'j', 'A', 'S', 'D', 'F', 'G', 'D', ':', 'L', 'K', 'J', 'H', 'G', '4', '9', '8', '7']


In [398]:
pattern = "[0-5][0-9]"
result = re.findall(pattern, my_string)
print(result)

['12', '34', '09']


In [400]:
pattern = "[0-9][0-9][0-9]"
result = re.findall(pattern, my_string)
print(result)

['123', '409']


In [402]:
pattern = "[a-zA-Z]"
result = re.findall(pattern, my_string)
print(result)

['a', 's', 'd', 'f', 'g', 'f', 'l', 'k', 'j', 'h', 'j', 'A', 'S', 'D', 'F', 'G', 'D', 'L', 'K', 'J', 'H', 'G']


In [404]:
pattern = "[+.;:,]"
result = re.findall(pattern, my_string)
print(result)

[';', ':']
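
<p>Each set above matches a single character. Adding a quantifier after the set matches whole runs instead; a brief sketch on the same sample string:</p>

```python
import re

my_string = "asdfgf;lkjhjASDFGD:LKJHG12340987"

# A set followed by '+' matches one or more consecutive characters from the set.
lowercase_runs = re.findall("[a-z]+", my_string)
uppercase_runs = re.findall("[A-Z]+", my_string)
digit_runs = re.findall("[0-9]+", my_string)

print(lowercase_runs)  # ['asdfgf', 'lkjhj']
print(uppercase_runs)  # ['ASDFGD', 'LKJHG']
print(digit_runs)      # ['12340987']
```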


<h4>Programs Using Regular Expressions:</h4>

In [407]:
def valid_number(phone_number):
    pattern = r"^[789]\d{9}$"
    # bool() turns the Match object (or None) into True/False
    return bool(re.match(pattern, phone_number))

list_of_numbers = ["9876345678", "7689345678", "1236345678", "8896345678"]

for i in list_of_numbers:
    if valid_number(i):
        print(f"{i} is a valid phone number.")
    else:
        print(f"{i} is not a valid phone number.")

9876345678 is a valid phone number.
7689345678 is a valid phone number.
1236345678 is not a valid phone number.
8896345678 is a valid phone number.
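
<p>One alternative worth sketching is re.fullmatch on a compiled pattern. With $, a string that ends in a newline can still match, because $ also matches just before a trailing newline; fullmatch requires the entire string to match. The names PHONE_PATTERN and is_valid_number are assumptions made for this example.</p>

```python
import re

# Compile once when the same pattern is reused for many inputs.
# No ^ or $ needed: fullmatch must consume the whole string.
PHONE_PATTERN = re.compile(r"[789]\d{9}")

def is_valid_number(phone_number):
    """Return True for a 10-digit number starting with 7, 8, or 9."""
    return PHONE_PATTERN.fullmatch(phone_number) is not None

print(is_valid_number("9876345678"))    # True
print(is_valid_number("1236345678"))    # False
print(is_valid_number("9876345678\n"))  # False
```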


In [409]:
def valid_gmail_id(gmail_id):
    # Match the literal domain; a set like [@gmail]{6} would accept any six of those characters in any order
    pattern = r"^[a-zA-Z0-9.]+@gmail\.com$"
    return bool(re.match(pattern, gmail_id))

list_of_gmails = ["gireesh@gmail.com", "gireesh@email.com", "gireesh@gmails.com"]

for i in list_of_gmails:
    if valid_gmail_id(i):
        print(f"{i} is a valid gmail ID")
    else:
        print(f"{i} is not a valid gmail ID")

gireesh@gmail.com is a valid gmail ID
gireesh@email.com is not a valid gmail ID
gireesh@gmails.com is not a valid gmail ID


In [411]:
def valid_mail_id(mail_id):
    # A literal @, a lowercase domain name, then the literal text ".com"
    pattern = r"^[a-zA-Z0-9.]+@[a-z]+\.com$"
    return bool(re.match(pattern, mail_id))

list_of_mails = ["gireesh@gmail.com", "gireesh@yahoo.com", "gireesh@outlook.com", "gireesh@emails.coms"]

for i in list_of_mails:
    if valid_mail_id(i):
        print(f"{i} is a valid mail ID")
    else:
        print(f"{i} is not a valid mail ID")

gireesh@gmail.com is a valid mail ID
gireesh@yahoo.com is a valid mail ID
gireesh@outlook.com is a valid mail ID
gireesh@emails.coms is not a valid mail ID
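
<p>When matching a fixed suffix such as .com, the difference between a set and literal text matters: [.com]{4} accepts any four characters drawn from ., c, o, and m, while \.com accepts only that exact text. The sketch below compares the two on made-up addresses; the variable names loose and strict are assumptions for this example, and neither pattern is a complete email validator.</p>

```python
import re

# A set such as [.com] matches ONE character that is '.', 'c', 'o', or 'm'.
loose = r"^[a-zA-Z0-9.]+@[a-z]+[.com]{4}$"
# Escaping the dot and writing the letters in order matches the literal ".com".
strict = r"^[a-zA-Z0-9.]+@[a-z]+\.com$"

# Hypothetical test addresses.
addresses = ["user@example.com", "user@example.mmm", "user@examplecomo"]

# Map each address to (accepted by loose, accepted by strict).
results = {
    address: (bool(re.match(loose, address)), bool(re.match(strict, address)))
    for address in addresses
}

for address, (loose_ok, strict_ok) in results.items():
    print(f"{address}: loose={loose_ok}, strict={strict_ok}")
# user@example.com: loose=True, strict=True
# user@example.mmm: loose=True, strict=False
# user@examplecomo: loose=True, strict=False
```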
