# **Solution for Regex Practice Questions**

In [None]:
import re

#### **1. Write a regex pattern to find all words in the text that start with a capital letter.**

    "Last summer, Alice and Bob traveled to Paris. During their trip, they visited the Louvre, saw the Eiffel Tower, and dined at Chez Marianne."

In [2]:
import re

text = "Last summer, Alice and Bob traveled to Paris. During their trip, they visited the Louvre, saw the Eiffel Tower, and dined at Chez Marianne."
matches = re.findall(r"\b[A-Z][a-zA-Z]*\b", text)
print(matches)

['Last', 'Alice', 'Bob', 'Paris', 'During', 'Louvre', 'Eiffel', 'Tower', 'Chez', 'Marianne']


#### **2. Develop a regex pattern to find all words in the text that end with 'ing'.**

    "Walking through the park, I was thinking about the upcoming meeting while singing softly to myself."

In [3]:
text = "Walking through the park, I was thinking about the upcoming meeting while singing softly to myself."
matches = re.findall(r"\b\w+ing\b", text)
print(matches)

['Walking', 'thinking', 'upcoming', 'meeting', 'singing']


#### **3. Write a regex pattern to extract all hashtags (e.g., #Cafe123) and mentions (e.g., @manager_john).**

    "Loved the service at #Cafe123! Thanks to @manager_john for the warm hospitality. Looking forward to our next visit!"



In [4]:
text = "Loved the service at #Cafe123! Thanks to @manager_john for the warm hospitality. Looking forward to our next visit!"
matches = re.findall(r"[@#][\w]+", text)
print(matches)

['#Cafe123', '@manager_john']


#### **4. Develop a regex pattern to find all the phone numbers, considering formats with parentheses, hyphens, and spaces.**

    "For reservations, call us at (123) 456-7890 or reach our branch at 098-765-4321. We're here to help!" 

In [5]:
text = "For reservations, call us at (123) 456-7890 or reach our branch at 098-765-4321. We're here to help!"
matches = re.findall(r"\(?\d{3}\)?[-\s]?\d{3}[-\s]?\d{4}", text)
print(matches)



['(123) 456-7890', '098-765-4321']


**Note:** The question mark (`?`) is a quantifier that makes the preceding token (a single character, a character class, or a group) optional: that token may appear zero or one time in the target string. In the pattern above, `\(?` and `\)?` make the parentheses optional, and `[-\s]?` allows an optional hyphen or whitespace separator.
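
To illustrate the note, here is a minimal sketch that applies `?` first to a single character and then to a whole group. The sample strings and the variable names (`optional_char`, `optional_group`) are invented for this illustration and are not part of the exercise.

```python
import re

# `u?` makes the single character "u" optional (zero or one occurrence).
optional_char = re.findall(r"colou?r", "color colour colouur")

# `(?:...)?` makes a whole non-capturing group optional.
optional_group = re.findall(r"\d{3}(?:-\d{4})?", "555 and 555-1234")

print(optional_char)   # "colouur" has two u's, so it does not match
print(optional_group)
```

A non-capturing group `(?:...)` is used in the second pattern so that `re.findall` returns the full match rather than only the group's contents.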

#### **5. Create a regex pattern to find all decimal and integer numbers in the text.**

    "The recipe calls for 1.5 cups of flour, 0.75 cups of sugar, and 2 eggs. Make sure to bake at 350 degrees for 20 minutes."

In [9]:
text = "The recipe calls for 1.5 cups of flour, 0.75 cups of sugar, and 2 eggs. Make sure to bake at 350 degrees for 20 minutes."
matches = re.findall(r"\d+\.\d+|\d+", text)
print(matches)

['1.5', '0.75', '2', '350', '20']


#### **6. Develop a regex pattern to extract all web URLs, including those starting with 'http', 'https', or 'www'.**

    "Helpful resources can be found at http://www.example.com, https://secure.site, and on our old page at www.example.net/archive."

In [11]:
text = "Helpful resources can be found at http://www.example.com, https://secure.site, and on our old page at www.example.net/archive."
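# Require the match to end on a word character or slash so that
# trailing commas and periods from the sentence are not captured.
matches = re.findall(r"(?:https?://|www\.)\S*[\w/]", text)
print(matches)

['http://www.example.com', 'https://secure.site', 'www.example.net/archive']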


#### **7. Create a regex pattern to find all HTML tags in the text.**
    
    "<html><head><title>Sample Page</title></head><body><p>Hello World!</p></body></html>"

In [17]:
text = "<html><head><title>Sample Page</title></head><body><p>Hello World!</p></body></html>"
matches = re.findall(r"<[^>]+>", text)
print(matches)

['<html>', '<head>', '<title>', '</title>', '</head>', '<body>', '<p>', '</p>', '</body>', '</html>']


#### **8. Write a regex pattern to extract all URLs from hyperlink (\<a>) tags in the text.**
    
    <div class="links">
        <a href='https://www.example.com'>Visit Example</a>
        <a href='https://www.sample.com'>Visit Sample</a>
        Check out our blog for more information at <a href='https://blog.example.com'>Example's Blog</a>.
    </div>

In [27]:
text = "<div class='links'><a href='https://www.example.com'>Visit Example</a><a href='https://www.sample.com'>Visit Sample</a>Check out our blog for more information at <a href='https://blog.example.com'>Example's Blog</a>.</div>"
# Capture everything between the single quotes of the href attribute.
pattern = r"<a href='([^']+)'>"
matches = re.findall(pattern, text)
print(matches)

['https://www.example.com', 'https://www.sample.com', 'https://blog.example.com']
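
The pattern above assumes single-quoted `href` values with no other attributes before them. As a hedged sketch of one way to relax that, the variant below accepts either quote style by capturing the opening quote and requiring the same quote to close the value. The sample string `html` and the name `href_pattern` are hypothetical and not part of the exercise.

```python
import re

# Hypothetical sample mixing single- and double-quoted href values.
html = """<a href='https://www.example.com'>One</a> <a class="x" href="https://blog.example.com">Two</a>"""

# (["']) captures the opening quote; \1 requires the same quote to close the value.
# [^>]*? skips any attributes that appear before href inside the same tag.
href_pattern = r"""<a\b[^>]*?\bhref=(["'])(.*?)\1"""

# Group 2 holds the URL; group 1 is only the quote character.
urls = [m.group(2) for m in re.finditer(href_pattern, html)]
print(urls)
```

`re.finditer` is used here instead of `re.findall` because the pattern has two groups, and only the second one is wanted.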


#### **9. Develop a regex pattern to find all id attribute values in the HTML tags.**

    <section id='main-content'>
        <div id='header'>Welcome to Our Website</div>
        <p id='intro'>Explore our products and services.</p>
        <span id='footer'>2023 © Example Corporation</span>
    </section>

In [31]:
text = "<section id='main-content'><div id='header'>Welcome to Our Website</div><p id='intro'>Explore our products and services.</p><span id='footer'>2023 © Example Corporation</span></section>"
matches = re.findall(r"id='([\w-]+)'", text)
print(matches)

['main-content', 'header', 'intro', 'footer']


#### **10. Write a regex pattern to find all comment blocks (\<!-- comment -->) in the HTML.**

    <!DOCTYPE html>
    <html>
    <head>
        <!-- Page Title -->
        <title>My Sample Page</title>
    </head>
    <body>
        <!-- Header Start -->
        <header>
            <h1>Welcome to My Page</h1>
        </header>
        <!-- Main Content Start -->
        <main>
            This is a sample text for my web page.
        </main>
        <!-- Footer Start -->
        <footer>
            Copyright 2023.
        </footer>
        <!-- End of Page -->
    </body>
    </html>

In [35]:
text = """
<!DOCTYPE html>
<html>
<head>
    <!-- Page Title -->
    <title>My Sample Page</title>
</head>
<body>
    <!-- Header Start -->
    <header>
        <h1>Welcome to My Page</h1>
    </header>
    <!-- Main Content Start -->
    <main>
        This is a sample text for my web page.
    </main>
    <!-- Footer Start -->
    <footer>
        Copyright 2023.
    </footer>
    <!-- End of Page -->
</body>
</html>
"""
# re.DOTALL lets "." match newlines, so multi-line comments are found too.
matches = re.findall(r"<!--.*?-->", text, re.DOTALL)
print(matches)

['<!-- Page Title -->', '<!-- Header Start -->', '<!-- Main Content Start -->', '<!-- Footer Start -->', '<!-- End of Page -->']
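
The `*?` in the comment pattern is a lazy quantifier, and it matters here. The sketch below compares greedy and lazy matching on a one-line sample; the string `html` and the names `greedy` and `lazy` are made up for this comparison.

```python
import re

# Hypothetical one-line sample containing two separate comments.
html = "<!-- first --><p>text</p><!-- second -->"

# Greedy: ".*" runs to the LAST "-->" in the string.
greedy = re.findall(r"<!--.*-->", html, re.DOTALL)

# Lazy: ".*?" stops at the FIRST "-->" after each "<!--".
lazy = re.findall(r"<!--.*?-->", html, re.DOTALL)

print(greedy)  # a single match spanning both comments
print(lazy)    # one match per comment
```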
