Regular expressions (regex or regexp) are sequences of characters that define a search pattern, which is then matched against text within strings. They are widely used in programming languages, text editors, and data processing tools for text processing and manipulation. Here are some key concepts and components of regular expressions:
1. Metacharacters: Metacharacters are characters that have a special meaning inside a regular expression instead of matching themselves. Some examples of metacharacters are:
* . (dot): Matches any single character except a newline.
* ^ (caret): Matches the beginning of the string, or the beginning of each line when multiline mode (re.MULTILINE in Python) is enabled.
* $ (dollar): Matches the end of the string, or the end of each line in multiline mode.
* * (asterisk): Matches zero or more occurrences of the preceding element (a character, character class, or group).
* + (plus): Matches one or more occurrences of the preceding element.
* ? (question mark): Matches zero or one occurrence of the preceding element.
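The difference between the repetition metacharacters is easiest to see side by side. The following is a minimal sketch; the word list is a hypothetical sample chosen for illustration, and `re.fullmatch` is used so that each pattern must cover the whole word.

```python
import re

# Hypothetical sample words, chosen only for illustration.
words = ["St", "Sot", "Soot", "Sat"]

star = [w for w in words if re.fullmatch(r"So*t", w)]      # zero or more 'o'
plus = [w for w in words if re.fullmatch(r"So+t", w)]      # one or more 'o'
optional = [w for w in words if re.fullmatch(r"So?t", w)]  # zero or one 'o'
dot = [w for w in words if re.fullmatch(r"S.t", w)]        # exactly one character of any kind

print(star)      # ['St', 'Sot', 'Soot']
print(plus)      # ['Sot', 'Soot']
print(optional)  # ['St', 'Sot']
print(dot)       # ['Sot', 'Sat']
```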
2. Character Classes: A character class matches any one character from the set listed inside the square brackets. Some examples of character classes are:
* [a-z]: Matches any ASCII lowercase letter.
* [A-Z]: Matches any ASCII uppercase letter.
* [0-9]: Matches any ASCII digit.
* [a-zA-Z0-9]: Matches any ASCII letter or digit.
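As a quick illustration of these classes, the sketch below runs each one over a single sample string with `re.findall`. The variable names are illustrative only.

```python
import re

text = "Ref: #ABCDEF12345"

lower = re.findall(r"[a-z]", text)               # each lowercase letter, one at a time
upper = re.findall(r"[A-Z]", text)               # each uppercase letter, one at a time
digits = re.findall(r"[0-9]", text)              # each digit, one at a time
alnum_runs = re.findall(r"[a-zA-Z0-9]+", text)   # runs of letters and digits

print(lower)       # ['e', 'f']
print(upper)       # ['R', 'A', 'B', 'C', 'D', 'E', 'F']
print(digits)      # ['1', '2', '3', '4', '5']
print(alnum_runs)  # ['Ref', 'ABCDEF12345']
```

A class on its own consumes exactly one character; it is the `+` after the last class that turns it into a run.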
3. Quantifiers: Quantifiers specify how many times the preceding character, character class, or group must occur. Some examples of quantifiers are:
* {n}: Matches exactly n occurrences of the preceding element.
* {n,}: Matches n or more occurrences of the preceding element.
* {n,m}: Matches between n and m occurrences (inclusive) of the preceding element.
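The counted quantifiers can be compared on one string. This is a small sketch with a hypothetical sample text; `\b` (word boundary) is added where needed so that a short match cannot be carved out of a longer number.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "ZIP Code: 560034, Marks: 78, 82, 91, 65, Qty: 123"

exactly_three = re.findall(r"\b\d{3}\b", text)    # standalone 3-digit numbers
at_least_three = re.findall(r"\d{3,}", text)      # runs of 3 or more digits
two_to_three = re.findall(r"\b\d{2,3}\b", text)   # standalone 2- or 3-digit numbers

print(exactly_three)   # ['123']
print(at_least_three)  # ['560034', '123']
print(two_to_three)    # ['78', '82', '91', '65', '123']
```

Without the `\b` anchors, `\d{3}` would also match `560` and `034` inside the six-digit ZIP code.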
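4. Anchors: Anchors match a position in the text rather than a character. Some examples of anchors are:
* ^ (caret): Matches the beginning of the string, or the beginning of each line in multiline mode.
* $ (dollar): Matches the end of the string, or the end of each line in multiline mode.
* \b: Matches a word boundary, the position between a word character and a non-word character.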
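In Python, whether `^` and `$` apply to the whole string or to each line depends on the `re.MULTILINE` flag. The sketch below shows the difference on a hypothetical three-line string.

```python
import re

# Hypothetical three-line sample, chosen only for illustration.
text = "Email: a@b.org\nContact: 12345\nEmail: c@d.org"

default_start = re.findall(r"^\w+", text)                       # start of the string only
per_line_start = re.findall(r"^\w+", text, flags=re.MULTILINE)  # start of every line
default_end = re.findall(r"\w+$", text)                         # end of the string only
per_line_end = re.findall(r"\w+$", text, flags=re.MULTILINE)    # end of every line

print(default_start)   # ['Email']
print(per_line_start)  # ['Email', 'Contact', 'Email']
print(default_end)     # ['org']
print(per_line_end)    # ['org', '12345', 'org']
```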
5. Grouping: Parentheses group part of a pattern so that it can be quantified as a unit, and they capture the text that part matched. Some examples of grouping are:
* (pattern): Groups the pattern inside the parentheses and captures the text it matches.
* (pattern1|pattern2): Matches either pattern1 or pattern2.
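A short sketch of how groups show up in Python's results: with more than one group, `re.findall` returns one tuple per match, and a match object exposes each group by number. The sample text is hypothetical.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "5GB 1TB 11GB 7MB"
pattern = r"(\d+)(GB|TB)"   # group 1: the number, group 2: the unit

pairs = re.findall(pattern, text)
first = re.search(pattern, text)

print(pairs)           # [('5', 'GB'), ('1', 'TB'), ('11', 'GB')]
print(first.group(0))  # 5GB  (the whole match)
print(first.group(1))  # 5    (first group)
print(first.group(2))  # GB   (second group)
```

`7MB` is skipped because the alternation only allows `GB` or `TB`.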
6. Backreferences: A backreference matches the same text that an earlier capturing group matched. Some examples of backreferences are:
* \1: Matches the exact text captured by the first group.
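A common use of `\1` is spotting a word that is repeated by accident. The sketch below assumes a hypothetical sentence with doubled words and also shows `re.sub` collapsing each pair to a single word.

```python
import re

# Hypothetical sentence with accidental repeats, chosen only for illustration.
text = "This is is a test of of the the regex engine"

# (\w+) captures a word; \1 requires the very same word again after one space.
doubled = re.findall(r"\b(\w+) \1\b", text)
cleaned = re.sub(r"\b(\w+) \1\b", r"\1", text)

print(doubled)  # ['is', 'of', 'the']
print(cleaned)  # This is a test of the regex engine
```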
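7. Escape Sequences: A backslash either turns a metacharacter into a literal (for example, \. matches a literal dot) or introduces a shorthand character class. In Python 3 the shorthand classes also match non-ASCII digits and letters unless the re.ASCII flag is set, so the bracketed equivalents below hold for ASCII text. Some examples of escape sequences are:
* \d: Matches any digit (equivalent to [0-9]).
* \D: Matches any non-digit (equivalent to [^0-9]).
* \w: Matches any word character: a letter, digit, or underscore (equivalent to [a-zA-Z0-9_]).
* \W: Matches any non-word character (equivalent to [^a-zA-Z0-9_]).
* \s: Matches any whitespace character (spaces, tabs, line breaks).
* \S: Matches any non-whitespace character.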
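The shorthand classes can be tried on a single sample string. This is a minimal sketch; note in particular that `\w` keeps the underscore in `user_2025` together with the letters and digits.

```python
import re

text = "Username: user_2025!"

digit_runs = re.findall(r"\d+", text)   # runs of digits
word_runs = re.findall(r"\w+", text)    # runs of letters, digits, and underscores
non_word = re.findall(r"\W", text)      # every single non-word character
fields = re.split(r"\s+", text)         # split on runs of whitespace

print(digit_runs)  # ['2025']
print(word_runs)   # ['Username', 'user_2025']
print(non_word)    # [':', ' ', '!']
print(fields)      # ['Username:', 'user_2025!']
```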

In [1]:
strings23='''  
"Email: john.doe@example.com",
    "Contact: +91-9876543210",
    "TransactionID_12345_DONE",
    "Username: user_2025!",
    "Invoice #INV-00458",
    "Order placed on 2025-11-04",
    "Visit our site: https://www.learnregex.com",
    "Temp reading: 37.6°C",
    "Address: 42B Baker Street, London",
    "Note—check at 10:30am tomorrow!",
    "HELLO_world_Regex123",
    "E-mail: support@company.co.in",
    "Total Amount = $1499.75",
    "CustomerID: CUST_0098",
    "Alert! Invalid password entered 3 times",
    "Task Completed? Yes/No",
    "FlightNo: AI-202",
    "Marks: 78, 82, 91, 65",
    "Ref: #ABCDEF12345",
    "ZIP Code: 560034",
    "File saved as report_v2.1.pdf",
    "Email-Me@Now.org",
    "St",
    "Sot",
    "Soot",
    "st",
    "PASSWORD: My$ecret@123",
    "Meeting scheduled on Monday, 9th Dec",
    "Tag: <regex> is powerful </regex>",
    "User said: 'I love learning REGEX!'",
    "item-1, item_2, ITEM#3",
    "Room Temp: -5°C",
    "IP Address: 192.168.0.1",
    "Mask",
    "Lask",
    "aask",
    "5GB",
    "1TB",
    "11GB"," 123 ",
    "[pdf]",
    "[video]","Log","Fog","Dog","Mog",
    "tasmd",
    "UUID: 550e8400-e29b-41d4-a716-446655440000"'''

In [None]:
import re
# Alternative patterns to try: uncomment exactly one at a time.
# pattern=r'So*t'                              # S, then zero or more o, then t
# pattern=r'[sS]o*[tT]'                        # same as above, with either case for the first and last letter
# pattern=r'[A-Za-z0-9]ask'                    # any one letter or digit, then literal ask
# pattern=r'\b\d{3}\b'                         # \b is a word boundary, \d{3} is exactly 3 digits
# pattern=r'[A-Z]\w+'                          # [A-Z] for a capital letter, \w+ for one or more word characters
# pattern=r'\s\d{3}\b'                         # \s for whitespace, \d{3} for 3 digits, \b for word boundary
# pattern=r"....\."                            # .... for any four characters, \. for literal dot
# pattern=r'\[\w+\]'                           # \[ for literal [, \w+ for word characters, \] for literal ]
# pattern=r'[^LM]og'                           # [^LM] for any one character other than L or M, og for literal og
# pattern=r"^Ema\w+"                           # ^ for start of string, Ema for literal Ema, \w+ for word characters
pattern=r"(\d+)(GB|TB)"                        # (\d+) for one or more digits, (GB|TB) for either GB or TB

matches=re.findall(pattern,strings23)          # re.findall returns a list of (number, unit) tuples for this pattern
# matches=re.finditer(pattern,strings23)       # re.finditer yields one match object per occurrence
# for match in matches:
    # print(match)                             # to print each match object from the iterator
# matches=re.search(pattern,strings23)         # re.search returns the first occurrence, or None if there is none
# print(matches)                               # to print the match object
# print(matches.group(2))                      # to print a specific group from the match object
new_list=[]                                    # initialize empty list
for tup in matches:                            # each tuple is (number, unit)
    if tup[1]=="TB":                           # check if the unit is TB
        new_num=int(tup[0])*1024               # convert TB to GB (1 TB = 1024 GB)
        new_list.append(new_num)               # append the converted value to the new list
    else:                                      # if the unit is GB
        new_list.append(int(tup[0]))           # convert to int so every entry has the same type
print(new_list)                                # print the final list with all values in GB

[5, 1024, 11]
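The commented-out `re.search` and `re.finditer` calls in the cell above return match objects rather than plain tuples. The sketch below shows one way they might be used; `sample` is a hypothetical short excerpt in the style of `strings23`, not the full string.

```python
import re

# Hypothetical short excerpt in the style of strings23.
sample = '"5GB", "1TB", "11GB"'
pattern = r"(\d+)(GB|TB)"

# re.search returns only the first match, or None when nothing matches.
first = re.search(pattern, sample)
if first is not None:
    print(first.group(1), first.group(2))   # 5 GB

# re.finditer yields a match object per occurrence, which also carries positions.
spans = [(m.group(0), m.start()) for m in re.finditer(pattern, sample)]
print(spans)   # [('5GB', 1), ('1TB', 8), ('11GB', 15)]

# A pattern with no occurrence gives None, so check before calling .group().
missing = re.search(r"\d+PB", sample)
print(missing)   # None
```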


In [3]:
import re

strings23 = '''  
"Email: john.doe@example.com",
"Contact: +91-9876543210",
"TransactionID_12345_DONE",
"Username: user_2025!",
"Invoice #INV-00458",
"Order placed on 2025-11-04",
"Visit our site: https://www.learnregex.com",
"Temp reading: 37.6°C",
"Address: 42B Baker Street, London",
"Note—check at 10:30am tomorrow!",
"HELLO_world_Regex123",
"E-mail: support@company.co.in",
"Total Amount = $1499.75",
"CustomerID: CUST_0098",
"Alert! Invalid password entered 3 times",
"Task Completed? Yes/No",
"FlightNo: AI-202",
"Marks: 78, 82, 91, 65",
"Ref: #ABCDEF12345",
"ZIP Code: 560034",
"File saved as report_v2.1.pdf",
"Email-Me@Now.org",
"St",
"Sot",
"Soot",
"st",
"PASSWORD: My$ecret@123",
"Meeting scheduled on Monday, 9th Dec",
"Tag: <regex> is powerful </regex>",
"User said: 'I love learning REGEX!'",
"item-1, item_2, ITEM#3",
"Room Temp: -5°C",
"IP Address: 192.168.0.1",
"Mask",
"Lask",
"aask",
"5GB",
"1TB",
"11GB"," 123 ",
"[pdf]",
"[video]","Log","Fog","Dog","Mog",
"tasmd",
"UUID: 550e8400-e29b-41d4-a716-446655440000"'''

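# Match lines that start with a double quote followed by Em and one or more word characters
pattern = r'^"Em\w+'

# re.MULTILINE makes ^ match at the start of every line, not only at the start of the string
matches = re.findall(pattern, strings23, flags=re.MULTILINE)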
print(matches)


['"Email', '"Email']
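If the leading quote and any indentation should be optional rather than required, the pattern can be loosened. The following is a sketch under that assumption; `text` is a hypothetical excerpt that mixes indented, quoted, and unquoted lines, and the capturing group keeps only the word itself.

```python
import re

# Hypothetical excerpt mixing indented, quoted, and unquoted lines.
text = '''"Email: john.doe@example.com",
    "E-mail: support@company.co.in",
    "Email-Me@Now.org",
Email without a quote'''

# [ \t]* allows leading spaces or tabs, "? allows an optional double quote,
# and the group (Em\w+) captures the word without the quote.
pattern = r'^[ \t]*"?(Em\w+)'

matches = re.findall(pattern, text, flags=re.MULTILINE)
print(matches)   # ['Email', 'Email', 'Email']
```

The `E-mail` line is not matched because the hyphen after `E` prevents `Em` from matching.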
