<!-- Regular expressions (regex) are patterns for matching, searching, and rewriting text. Python supports them through re, a module in the standard library.

The re module offers several functions for working with regular expressions, including:

re.match(pattern, string): Attempts to match the pattern at the beginning of the string only. Returns a match object on success and None otherwise.

re.search(pattern, string): Scans the string for the first location where the pattern matches. Returns a match object if one is found and None otherwise.

re.findall(pattern, string): Returns all non-overlapping matches of the pattern as a list of strings. If the pattern contains capturing groups, the list holds the group contents (tuples when there is more than one group) instead of the whole match.

re.finditer(pattern, string): Returns an iterator yielding match objects for all non-overlapping matches of the pattern in the string.

re.sub(pattern, repl, string): Returns a new string in which every non-overlapping match of the pattern is replaced by repl, which can be a replacement string or a function that takes a match object.

These are just a few of the functions provided by the re module. The module also provides various flags and options to modify the behavior of regular expression matching.

Here's a simple example that demonstrates the usage of regular expressions with the re module: -->

In [5]:
import re

pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'  # Simplified email pattern (not full address validation)
string = 'Contact us at info@example.com or support@example.org'

matches = re.findall(pattern, string)
print(matches)  # Output: ['info@example.com', 'support@example.org']

['info@example.com', 'support@example.org']
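
re.findall is only one entry point in the re module. The following is a minimal sketch of how re.match, re.search, re.finditer, re.sub, and a flag behave on the same string. The shortened pattern and the variable names are assumptions made for illustration, not part of the example above.

```python
import re

text = 'Contact us at info@example.com or support@example.org'
# Simplified email pattern, assumed here only to keep the sketch short.
email = r'[\w.+-]+@[\w.-]+\.[A-Za-z]{2,}'

# re.match only tries position 0, so it finds nothing in this string.
at_start = re.match(email, text)
print(at_start)  # None

# re.search scans forward and stops at the first hit.
first = re.search(email, text)
print(first.group(), first.span())  # info@example.com (14, 30)

# re.finditer yields one match object per hit, so positions are available.
positions = [(m.start(), m.group()) for m in re.finditer(email, text)]
print(positions)

# re.sub returns a new string with every hit replaced.
masked = re.sub(email, '<email>', text)
print(masked)  # Contact us at <email> or <email>

# Flags change matching behaviour; re.IGNORECASE ignores letter case.
print(bool(re.search(r'contact', text, flags=re.IGNORECASE)))  # True
```

Match objects are the better choice when the position of each hit matters; findall is enough when only the matched text is needed.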


In [6]:
# Matching Dates:

import re

pattern = r'\d{2}-\d{2}-\d{4}'  # Date pattern (DD-MM-YYYY); checks the shape only, not that the date exists
string = 'Today is 27-06-2023, and tomorrow is 28-06-2023.'

matches = re.findall(pattern, string)
print(matches)  # Output: ['27-06-2023', '28-06-2023']

['27-06-2023', '28-06-2023']
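
The date pattern checks only the shape of the text, so a string such as 31-02-2023 would also match. One possible way to keep only real calendar dates is to hand each candidate to datetime.strptime, as sketched below. The helper name parse_dates and the sample sentence are hypothetical.

```python
import re
from datetime import date, datetime

# \b keeps the pattern from matching inside a longer run of digits.
DATE_RE = re.compile(r'\b\d{2}-\d{2}-\d{4}\b')

def parse_dates(text):
    """Return a date object for every valid DD-MM-YYYY date in text."""
    found = []
    for m in DATE_RE.finditer(text):
        try:
            found.append(datetime.strptime(m.group(), '%d-%m-%Y').date())
        except ValueError:
            # The shape matched, but no such calendar date exists.
            continue
    return found

dates = parse_dates('Today is 27-06-2023, not 31-02-2023.')
print(dates)  # [datetime.date(2023, 6, 27)]
```

The regex locates candidates and the datetime module decides whether they are valid, which tends to be simpler than encoding calendar rules in the pattern itself.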


In [10]:
# Extracting Hashtags:

import re

pattern = r'#\w+'  # Hashtag pattern
string = 'I love #coding and #programming! #PythonIsAwesome'

matches = re.findall(pattern, string)
print(matches)  # Output: ['#coding', '#programming', '#PythonIsAwesome']

['#coding', '#programming', '#PythonIsAwesome']
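
If only the tag text is wanted, without the leading #, a capturing group changes what re.findall returns. A small sketch using the same sentence; the variable names are illustrative.

```python
import re

text = 'I love #coding and #programming! #PythonIsAwesome'

# With one capturing group, findall returns the group, not the whole match.
tags = re.findall(r'#(\w+)', text)
print(tags)  # ['coding', 'programming', 'PythonIsAwesome']

# Normalising case makes later comparisons or counting easier.
lowered = [tag.lower() for tag in tags]
print(lowered)  # ['coding', 'programming', 'pythonisawesome']
```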


In [11]:
# Splitting Text:

import re

pattern = r'\W+'  # Non-word characters pattern
string = 'Hello, world! How are you?'

words = re.split(pattern, string)
print(words)  # Output: ['Hello', 'world', 'How', 'are', 'you', ''] (trailing '' because the string ends with a delimiter)

['Hello', 'world', 'How', 'are', 'you', '']
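
Because the sample string ends with punctuation, re.split leaves an empty string after the final delimiter. The sketch below shows one way to avoid it, along with two other behaviours of re.split that are easy to miss. The variable names are illustrative.

```python
import re

text = 'Hello, world! How are you?'

# Matching the words directly avoids the empty string that re.split
# leaves behind when the text ends with a delimiter.
words = re.findall(r'\w+', text)
print(words)  # ['Hello', 'world', 'How', 'are', 'you']

# maxsplit limits the number of splits; the remainder is kept intact.
head = re.split(r'\W+', text, maxsplit=2)
print(head)  # ['Hello', 'world', 'How are you?']

# A capturing group in the pattern keeps the delimiters in the result.
with_delims = re.split(r'(\W+)', 'a, b')
print(with_delims)  # ['a', ', ', 'b']
```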


In [12]:
# Validating Phone Numbers:

import re

pattern = r'^\d{3}-\d{3}-\d{4}$'  # Phone number pattern (###-###-####)

phone_numbers = ['123-456-7890', '555-5555', '123-456-78901']

for number in phone_numbers:
    if re.match(pattern, number):
        print(f"{number} is a valid phone number.")
    else:
        print(f"{number} is not a valid phone number.")

123-456-7890 is a valid phone number.
555-5555 is not a valid phone number.
123-456-78901 is not a valid phone number.
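
One subtlety with validating through re.match and a ^...$ pattern: $ also matches just before a newline at the end of the string, so input with a trailing newline would be accepted. A sketch of a stricter check using re.fullmatch follows; the helper name is_valid_phone is hypothetical.

```python
import re

PHONE_RE = re.compile(r'\d{3}-\d{3}-\d{4}')

def is_valid_phone(number):
    """True only if the entire string has the form ###-###-####."""
    return PHONE_RE.fullmatch(number) is not None

# $ matches before a trailing newline, so re.match accepts this input.
loose = bool(re.match(r'^\d{3}-\d{3}-\d{4}$', '123-456-7890\n'))
print(loose)  # True

# fullmatch requires the pattern to cover every character.
print(is_valid_phone('123-456-7890\n'))  # False
print(is_valid_phone('123-456-7890'))    # True
```

With fullmatch the ^ and $ anchors are no longer needed, which also makes the pattern reusable for searching inside longer text.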


In [13]:
# Extracting URLs from Text:

import re

# Matches the scheme and host only; a path or query string is not captured.
pattern = r'https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+'

text = 'Visit my website at https://www.example.com or check out https://blog.example.com for more.'

urls = re.findall(pattern, text)
print(urls)  # Output: ['https://www.example.com', 'https://blog.example.com']

['https://www.example.com', 'https://blog.example.com']
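
The URL pattern stops at the first / after the host, so paths and query strings are dropped, and a full stop directly after a URL would be included in the match. A rough sketch of a more permissive approach: capture up to the next whitespace, trim trailing punctuation, and let urllib.parse split the result. The helper extract_urls, the character class, and the sample text are assumptions, not a complete URL grammar.

```python
import re
from urllib.parse import urlsplit

# Assumption: a URL runs until whitespace, a double quote, or an angle bracket.
URL_RE = re.compile(r'https?://[^\s<>"]+')

def extract_urls(text):
    """Return URLs found in text, minus trailing sentence punctuation."""
    return [m.group().rstrip('.,;:!?)') for m in URL_RE.finditer(text)]

sample = 'Docs: https://www.example.com/docs/intro?lang=en. Blog: https://blog.example.com.'
urls = extract_urls(sample)
print(urls)

# urlsplit separates host, path, and query without further regex work.
for url in urls:
    parts = urlsplit(url)
    print(parts.netloc, parts.path, parts.query)
```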


In [14]:
# Removing Punctuation:

import re

pattern = r'[^\w\s]'  # Any character that is neither a word character nor whitespace

text = 'Hello, world! How are you?'

clean_text = re.sub(pattern, '', text)
print(clean_text)  # Output: Hello world How are you

Hello world How are you
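
Two details of this approach are worth knowing: \w includes the underscore, so underscores survive, and removing punctuation can leave runs of spaces behind. A short sketch with a made-up input string shows both, plus one way to tidy the result.

```python
import re

text = 'Hello,   world! snake_case -- done?'

# \w includes the underscore, so [^\w\s] leaves snake_case untouched.
no_punct = re.sub(r'[^\w\s]', '', text)

# Collapse the runs of whitespace left behind, then trim the ends.
tidy = re.sub(r'\s+', ' ', no_punct).strip()
print(tidy)  # Hello world snake_case done

# Adding "_" as an alternative removes underscores as well.
no_underscore = re.sub(r'[^\w\s]|_', '', text)
tidy_no_underscore = re.sub(r'\s+', ' ', no_underscore).strip()
print(tidy_no_underscore)  # Hello world snakecase done
```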
