# Regular Expressions in Python

### Q1: How to search for a pattern in a string using Regex?
==> The re.search() function scans a string and stops at the first place where the pattern matches. It returns a match object if the pattern is found; otherwise it returns None. Here's an example:

In [44]:
import re 

pattern = r"world"
# The r prefix makes this a raw string literal: backslashes are kept as literal characters instead of starting Python escape sequences.

text = "Hello, World, world, World, world"

match = re.search(pattern, text)

print(match)

<re.Match object; span=(14, 19), match='world'>


In [3]:
import re

pattern = r"world"
text = "Hello, World, world, World, world"

match = re.search(pattern, text)

if match:
    print("Pattern found:", match.group()) 
else:
    print("Pattern not found.")

# match.group() is used to extract the actual substring that matches the specified pattern from the input text.

Pattern found: world
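
The search above is case-sensitive, which is why the first hit is the lowercase "world" at index 14 and not the "World" at index 7. As a minimal sketch, assuming you want both spellings treated as the same word, the re.IGNORECASE flag can be passed to re.search(), and the match object can report where the match sits:

```python
import re

text = "Hello, World, world, World, world"

# Assumption: "World" and "world" should count as the same word.
match = re.search(r"world", text, flags=re.IGNORECASE)

if match:
    print(match.group())  # the matched text: World
    print(match.span())   # (start, end) indexes: (7, 12)
    print(match.start())  # index where the match begins: 7
```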


### Q2: How to find all occurrences of a pattern in a string?
==> The re.findall() function finds all non-overlapping occurrences of a pattern in a string and returns them as a list of strings. The match is case-sensitive by default, so "World" is not included in the result below.

In [45]:
pattern = r"world"
text = "Hello, World, world, World, world"

matches = re.findall(pattern, text)
print("All occurrences:", matches)

All occurrences: ['world', 'world']
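
re.findall() returns only the matched text. If the positions are needed as well, re.finditer() yields one match object per occurrence. A small sketch using the same text (the variable name positions is only illustrative):

```python
import re

text = "Hello, World, world, World, world"

# Collect (start index, matched text) for every occurrence
positions = [(m.start(), m.group()) for m in re.finditer(r"world", text)]

print(positions)  # [(14, 'world'), (28, 'world')]
```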


### Q3: How to match a pattern at the beginning and end of a string?

In [29]:
import re

paragraph = """Hello, world!
This is a sample paragraph.
It contains multiple lines of text.
The world is a vast and interesting place!"""

# Check if the paragraph starts with "Hello"
pattern_start = r"^Hello"
match_start = re.search(pattern_start, paragraph)

if match_start:
    print("Pattern found at the beginning:", match_start.group())
else:
    print("Pattern not found at the beginning.")

# Check if the paragraph ends with "place" followed by any single character
pattern_end = r"place.$"
# To require a literal dot instead, escape it: pattern_end = r"place\.$"

match_end = re.search(pattern_end, paragraph)

if match_end:
    print("Pattern found at the end:", match_end.group())
else:
    print("Pattern not found at the end.")

# The dot (.) is a special character that matches any character except a newline, which is why "place!" is matched here.
# Escaping it as \. makes it match only a literal dot, so r"place\.$" would not match this paragraph.

Pattern found at the beginning: Hello
Pattern found at the end: place!
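
By default, ^ and $ anchor to the start and end of the whole string. The sketch below, which reuses the same paragraph, shows two related points: the escaped dot no longer matches "place!", and the re.MULTILINE flag makes ^ match at the start of every line. The variable names are illustrative only.

```python
import re

paragraph = """Hello, world!
This is a sample paragraph.
It contains multiple lines of text.
The world is a vast and interesting place!"""

# Unescaped dot: matches any character, including "!"
loose = re.search(r"place.$", paragraph)

# Escaped dot: requires a literal ".", so there is no match here
strict = re.search(r"place\.$", paragraph)

# With re.MULTILINE, ^ matches at the start of each line
line_starts = re.findall(r"^\w+", paragraph, flags=re.MULTILINE)

print(loose.group())  # place!
print(strict)         # None
print(line_starts)    # ['Hello', 'This', 'It', 'The']
```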


### Q4: Validate Email Addresses using RegEx

In [35]:
import re

def is_valid_email(email):
    pattern = r'^[\w\.-]+@[\w\.-]+\.\w+$'
    match = re.search(pattern, email)
    if match:
        return f"{email} is a valid Email Address."
    else:
        return f"{email} is not a valid Email Address."

e1 = is_valid_email("user@gmail.com")
e2 = is_valid_email("a@a.com")
e3 = is_valid_email("123@.in")

print(e1)
print(e2)
print(e3)

user@gmail.com is a valid Email Address.
a@a.com is a valid Email Address.
123@.in is not a valid Email Address.
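
One detail worth knowing: $ also matches just before a trailing newline, so re.search() with the pattern above accepts "user@gmail.com\n". A possible variant, sketched here under the assumption that a plain True/False result is more convenient, uses re.fullmatch() with a compiled pattern. The names EMAIL_RE and email_matches are hypothetical, and the pattern is still a simple approximation, not a full address validator.

```python
import re

# Same character classes as above; fullmatch() makes ^ and $ unnecessary
EMAIL_RE = re.compile(r"[\w.-]+@[\w.-]+\.\w+")

def email_matches(email):
    """Return True if the whole string fits the simple email pattern."""
    return EMAIL_RE.fullmatch(email) is not None

print(email_matches("user@gmail.com"))    # True
print(email_matches("123@.in"))           # False
print(email_matches("user@gmail.com\n"))  # False: trailing newline rejected
```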


### Q5: Validate a phone number of 10 digits that starts with "70" and ends with "86"

In [42]:
import re

def is_valid_phno(number):
    pattern = r'^70\d{6}86$'
    match = re.search(pattern, number)
    if match:
        return f"{number} is a valid phone number."
    else:
        return f"{number} is not a valid phone number."

p1 = is_valid_phno("7017536786")
p2 = is_valid_phno("12345678")
p3 = is_valid_phno("70203633386")

print(p1)
print(p2)
print(p3)

7017536786 is a valid phone number.
12345678 is not a valid phone number.
70203633386 is not a valid phone number.
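
The same pattern can also pull matching numbers out of a longer text if the ^ and $ anchors are swapped for \b word boundaries. This is only a sketch; the sample message and the name PHONE_RE are made up for illustration.

```python
import re

# \b keeps the match from starting or ending inside a longer run of digits
PHONE_RE = re.compile(r"\b70\d{6}86\b")

message = "Call 7017536786 or 70203633386 today."  # hypothetical input

found = PHONE_RE.findall(message)

print(found)  # ['7017536786']
```

The 11-digit number is skipped because no 10-digit slice of it is surrounded by word boundaries and fits the pattern.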


### Q6: Extracting Domain Name

Write a Python function extract_domain(email) that extracts and returns the domain name from an email address. For example, given the email "user@example.com", the function should return "example".

In [66]:
import re

def extract_domain(email):
    # Capture the run of word characters between "@" and the next dot
    pattern = r"@(\w+)\."
    match = re.search(pattern, email)
    if match:
        return match.group(1)
    else:
        print("Not valid email")

d1 = extract_domain("arshad@google.com")

print(d1)
    

google
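
A named group can make the intent of the capture easier to read. The sketch below is one possible variant, not the only way to do it: the group name domain and the function name extract_domain_named are hypothetical, hyphens are allowed in the name as an added assumption, and the function returns None instead of printing when nothing matches.

```python
import re

def extract_domain_named(email):
    """Return the label right after "@", or None if the pattern is absent."""
    match = re.search(r"@(?P<domain>[\w-]+)\.", email)
    return match.group("domain") if match else None

print(extract_domain_named("user@example.com"))       # example
print(extract_domain_named("arshad@google.com"))      # google
print(extract_domain_named("user@mail.example.com"))  # mail
print(extract_domain_named("no-at-sign"))             # None
```

Note that for an address with a subdomain, this returns only the first label after the "@".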


### Q7: Finding Duplicate Words

Write a Python function find_duplicates(text) that takes a string as input and returns a list of unique duplicate words in the text. For example, given the text "the the quick brown fox fox", the function should return ["the", "fox"].

In [79]:
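def find_duplicates(text):
    # Collect every word in the text
    pattern = r"\b\w+\b"
    matches = re.findall(pattern, text)

    duplicates = []
    for word in matches:
        times = matches.count(word)
        if times > 1:
            duplicates.append(word)

    # set() removes repeated entries but does not preserve order,
    # so the order of the returned list may vary between runs
    return list(set(duplicates))

text = "the the quick brown fox fox"
t1 = find_duplicates(text)

print(t1)

['the', 'fox']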


Instead of calling list(set(duplicates)) at the end, you can avoid adding repeats in the first place and return duplicates directly.

This version adds "and word not in duplicates" to the condition so that each duplicate word is appended only once, in order of first appearance. It also lowercases the text before matching, so "The" and "the" count as the same word.

In [82]:
def find_duplicates(text):
    pattern = r'\b\w+\b'
    matches = re.findall(pattern, text.lower())

    duplicates = []
    for word in matches:
        times = matches.count(word)
        if times > 1 and word not in duplicates:
            duplicates.append(word)

    return duplicates

text = "the the quick brown fox fox"
t1 = find_duplicates(text)

print(t1)

['the', 'fox']
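
Calling matches.count(word) inside the loop rescans the whole list for every word. For longer texts, one alternative is to count all words in a single pass with collections.Counter from the standard library. This is a sketch of that idea; the function name find_duplicates_counter is hypothetical.

```python
import re
from collections import Counter

def find_duplicates_counter(text):
    """Return words that occur more than once, in order of first appearance."""
    words = re.findall(r"\b\w+\b", text.lower())
    counts = Counter(words)  # one pass over the word list
    return [word for word, n in counts.items() if n > 1]

print(find_duplicates_counter("the the quick brown fox fox"))  # ['the', 'fox']
```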


### Q8: Password Strength Checker

Write a function is_strong_password(password) that uses regex to check if a given password is strong. A strong password should have at least 8 characters, contain both uppercase and lowercase letters, and include at least one digit.

In [87]:
def is_strong_password(password):
    pattern = r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$"
    match = re.match(pattern, password)
    if match:
        return "It is a strong Password."
    else:
        return "Weak Password"

password1 = "Abcdef12"
password2 = "abc123"
password3 = "ABCDEFGH"

print(is_strong_password(password1))
print(is_strong_password(password2))  
print(is_strong_password(password3))  

It is a strong Password.
Weak Password
Weak Password
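
Each (?=...) lookahead in the pattern above checks one requirement without consuming characters, and .{8,} then enforces the length. To report which requirement failed, the same rules can be tested one at a time. The sketch below assumes that kind of feedback is wanted; the names RULES and missing_requirements are hypothetical.

```python
import re

# One small pattern per requirement from the single-regex version
RULES = {
    "at least 8 characters": r".{8,}",
    "a lowercase letter": r"[a-z]",
    "an uppercase letter": r"[A-Z]",
    "a digit": r"\d",
}

def missing_requirements(password):
    """Return the descriptions of every rule the password fails."""
    return [name for name, pattern in RULES.items()
            if not re.search(pattern, password)]

print(missing_requirements("Abcdef12"))  # []
print(missing_requirements("abc123"))    # ['at least 8 characters', 'an uppercase letter']
print(missing_requirements("ABCDEFGH"))  # ['a lowercase letter', 'a digit']
```

An empty list means the password passes every rule, which corresponds to the "strong" result above.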
