# Regex Examples

| Symbol    | Description                                                  | Example Usage                                       |
|-----------|--------------------------------------------------------------|-----------------------------------------------------|
| `.`       | Matches any character except newline                         | `re.findall(r't.p', 'top tap tip')`                |
| `^`       | Matches the start of the string                              | `re.findall(r'^start', 'start end')`               |
| `$`       | Matches the end of the string                                | `re.findall(r'end$', 'start end')`                 |
| `[]`      | Matches any single character within the brackets             | `re.findall(r'[aeiou]', 'apple banana')`           |
| `[^]`     | Matches any single character not in the brackets             | `re.findall(r'[^0-9]', 'a1b2c3')`                  |
| `*`       | Matches 0 or more occurrences of the preceding element       | `re.findall(r'go*', 'gogogo go')`                  |
| `+`       | Matches 1 or more occurrences of the preceding element       | `re.findall(r'go+', 'gogogo go')`                  |
| `?`       | Matches 0 or 1 occurrence of the preceding element           | `re.findall(r'go?', 'gogogo go')`                  |
| `{n}`     | Matches exactly n occurrences of the preceding element       | `re.findall(r'go{2}', 'gogogo go')`                |
| `{n,}`    | Matches n or more occurrences of the preceding element       | `re.findall(r'go{2,}', 'gogogo go')`               |
| `{n,m}`   | Matches between n and m occurrences of the preceding element | `re.findall(r'go{1,2}', 'gogogo go')`              |
| `\d`      | Matches any decimal digit (0-9)                              | `re.findall(r'\d', '123-456-7890')`                |
| `\w`      | Matches any word character (letter, digit, or underscore)    | `re.findall(r'\w', 'abc123_$')`                    |
| `\s`      | Matches any whitespace character (spaces, tabs, newlines)    | `re.findall(r'\s', 'Hello\tWorld\n')`              |
| `()`      | Groups a subpattern and captures what it matches             | `re.findall(r'(ab)+', 'ababab')`                   |
| `(?i)`    | Inline flag that enables case-insensitive matching           | `re.findall(r'(?i)hello', 'Hello World')`          |
| `\b`      | Matches empty string at the beginning or end of a word       | `re.findall(r'\bcat\b', 'cat cats caterpillar')`   |
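
The table lists calls without their results, so here is a minimal sketch of what the quantifier and grouping rows return. The string `sample` is an assumption chosen for illustration, not the table's own input. On the table's `'gogogo go'` the `{2}` and `{2,}` patterns return an empty list, because that string never contains two consecutive `o` characters.

```python
import re

# Hypothetical sample string, chosen so every quantifier has something to match.
sample = 'g go goo gooo'

star = re.findall(r'go*', sample)             # 'g' then zero or more 'o'
plus = re.findall(r'go+', sample)             # 'g' then one or more 'o'
optional = re.findall(r'go?', sample)         # 'g' then at most one 'o'
exactly_two = re.findall(r'go{2}', sample)    # 'g' then exactly two 'o'
two_or_more = re.findall(r'go{2,}', sample)   # 'g' then two or more 'o'
one_to_two = re.findall(r'go{1,2}', sample)   # 'g' then one or two 'o'

print(star)         # ['g', 'go', 'goo', 'gooo']
print(plus)         # ['go', 'goo', 'gooo']
print(optional)     # ['g', 'go', 'go', 'go']
print(exactly_two)  # ['goo', 'goo']
print(two_or_more)  # ['goo', 'gooo']
print(one_to_two)   # ['go', 'goo', 'goo']

# With a capturing group, findall returns the group's text, not the whole match.
captured = re.findall(r'(ab)+', 'ababab')
# A non-capturing group (?:...) returns the whole match instead.
whole = re.findall(r'(?:ab)+', 'ababab')

print(captured)     # ['ab']
print(whole)        # ['ababab']
```

The last two calls show why `(ab)+` can look surprising in `re.findall`: the repeated group keeps only its final iteration.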


In [4]:
import re

# Regex pattern for matching email addresses
# Inside a character class '|' is a literal pipe, not alternation, so it is left out
pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

# Example text containing email addresses
text = "Contact us at email@example.com or support@domain.co.uk for assistance."

# Find all email addresses in the text
emails = re.findall(pattern, text)
print(emails)


['email@example.com', 'support@domain.co.uk']


In [5]:
# Regex pattern for matching phone numbers
pattern = r'\d{3}-\d{3}-\d{4}'

# Example text containing phone numbers
text = "Call 123-456-7890 for inquiries or 987-654-3210 for support."

# Find all phone numbers in the text
phone_numbers = re.findall(pattern, text)
print(phone_numbers)


['123-456-7890', '987-654-3210']
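
When the individual parts of each number are needed rather than the whole match, named groups are one option. A minimal sketch, assuming the same phone format as above; the group names `area`, `prefix`, and `line` are illustrative labels, not part of the original example.

```python
import re

# Named groups label each part of the number (the names are hypothetical).
phone_re = re.compile(r'(?P<area>\d{3})-(?P<prefix>\d{3})-(?P<line>\d{4})')

text = "Call 123-456-7890 for inquiries or 987-654-3210 for support."

# finditer yields match objects, so each part can be read by name.
parts = [m.groupdict() for m in phone_re.finditer(text)]
print(parts)
# [{'area': '123', 'prefix': '456', 'line': '7890'},
#  {'area': '987', 'prefix': '654', 'line': '3210'}]
```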


In [6]:

# Regex pattern for tokenizing words
pattern = r'\b\w+\b'

# Example text containing words
text = "The quick brown fox jumps over the lazy dog."

# Tokenize the text into words
words = re.findall(pattern, text)
print(words)


['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']


In [7]:

# Regex pattern for validating URLs
# (?:[a-zA-Z0-9-]+\.)+ allows one or more dot-separated labels before the TLD,
# so subdomains and multi-part suffixes such as co.uk are accepted
pattern = r'^https?://(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}(?:/[^\s]*)?$'

# Example URLs to validate
urls = ["https://www.example.com", "http://sub.domain.co.uk/page", "invalid.url"]

# Validate URLs
for url in urls:
    if re.match(pattern, url):
        print(f"{url} is a valid URL.")
    else:
        print(f"{url} is not a valid URL.")


https://www.example.com is a valid URL.
http://sub.domain.co.uk/page is a valid URL.
invalid.url is not a valid URL.
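
One detail worth knowing when validating with `re.match` and `$`: by default `$` also matches just before a trailing newline. The sketch below uses a deliberately simplified pattern and a hypothetical `candidate` string (both assumptions for illustration) to contrast this with `re.fullmatch`, which requires the pattern to consume the whole string.

```python
import re

# Hypothetical input with a trailing newline, e.g. a line read from a file.
candidate = "https://www.example.com\n"

# '$' also matches just before a trailing newline, so re.match accepts this input.
loose = re.match(r'^https?://[^\s]+$', candidate)

# re.fullmatch must consume the entire string, newline included, so it rejects it.
strict = re.fullmatch(r'https?://[^\s]+', candidate)

print(loose is not None)   # True
print(strict is not None)  # False
```

Stripping the input first or switching to `re.fullmatch` are both reasonable choices; which one fits depends on where the strings come from.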


In [2]:
# Regex pattern for extracting hashtags
pattern = r'#\w+'

# Example tweet containing hashtags
tweet = "Excited to announce our new product #innovation #technology"

# Extract hashtags from the tweet
hashtags = re.findall(pattern, tweet)
print("Hashtags:", hashtags)


Hashtags: ['#innovation', '#technology']


In [3]:
# Regex pattern for validating IP addresses
# One octet: 250-255, 200-249, or 0-199 (with optional leading digit)
octet = r'(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
# Four octets separated by dots
pattern = rf'^{octet}(?:\.{octet}){{3}}$'

# Example IP addresses to validate
ips = ["192.168.0.1", "256.0.0.1", "invalid.ip.address"]

# Validate IP addresses
for ip in ips:
    if re.match(pattern, ip):
        print(f"{ip} is a valid IP address.")
    else:
        print(f"{ip} is not a valid IP address.")


192.168.0.1 is a valid IP address.
256.0.0.1 is not a valid IP address.
invalid.ip.address is not a valid IP address.
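
For comparison, the standard library's `ipaddress` module can perform the same check without a regex. This is a sketch of an alternative technique, not a replacement for the pattern above; the helper name `is_valid_ipv4` is hypothetical.

```python
import ipaddress

def is_valid_ipv4(candidate):
    """Return True if candidate parses as an IPv4 address (hypothetical helper)."""
    try:
        ipaddress.IPv4Address(candidate)
        return True
    except ValueError:
        # AddressValueError is a subclass of ValueError
        return False

ips = ["192.168.0.1", "256.0.0.1", "invalid.ip.address"]
results = {ip: is_valid_ipv4(ip) for ip in ips}
print(results)
# {'192.168.0.1': True, '256.0.0.1': False, 'invalid.ip.address': False}
```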
