# **Web Scraping Examples with Python**

This notebook collects examples that show how to extract information from websites with Python, a process called **web scraping**. Each example fetches a page with `requests` and parses the returned HTML with `BeautifulSoup`.

In [10]:
# Install the requests library to make HTTP requests (get web pages)
!pip install requests

# Install the beautifulsoup4 library to parse HTML content
!pip install beautifulsoup4

# Import BeautifulSoup for parsing HTML
from bs4 import BeautifulSoup

# Import requests to fetch web pages
import requests



**Extract Headlines from TCS**

**Goal:** Extract the top 5 headlines from https://www.tcs.com/insights


In [3]:
import requests
from bs4 import BeautifulSoup

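def get_tcs_headlines():
    url = "https://www.tcs.com/insights"
    # A timeout keeps the call from hanging if the server does not respond
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Stop early on HTTP errors such as 404 or 500
    soup = BeautifulSoup(response.text, 'html.parser')

    # Assumes the headlines are marked up as <h3> elements
    headlines = soup.find_all("h3")
    for i, h in enumerate(headlines[:5], 1):
        print(f"{i}. {h.get_text(strip=True)}")

get_tcs_headlines()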

1. TCS is here to make a difference through technology.
2. We deliver excellence and create value for customers and communities.
3. Extraordinary expertise leads to remarkable results.
4. Want to be a global change-maker?  Join our team.
5. Find the latest news about TCS in our Newsroom


**Get All Hyperlinks from a Website**

**Goal:** Extract and print all `<a>` tags (text + href) from https://www.ibm.com/in-en

In [4]:
def extract_links():
    url = "https://www.ibm.com/in-en"
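    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Stop early on HTTP errors
    # Name the parser explicitly; 'html' lets bs4 pick whichever parser is installed
    soup = BeautifulSoup(response.text, 'html.parser')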

    links = soup.find_all('a')
    for link in links:
        text = link.get_text(strip=True)
        href = link.get('href')
        if href:
            print(f"Text: {text} | Link: {href}")

extract_links()


Text: Get the event details | Link: https://www.ibm.com/in-en/events/gartner-data-analytics-summit?lnk=hpls1in
Text: Explore AI agents | Link: https://www.ibm.com/products/watsonx-orchestrate
Text: Start building with IBM Granite models | Link: https://www.ibm.com/granite?lnk=dev
Text: Explore AI courses, APIs, data sets and more | Link: https://developer.ibm.com/technologies/artificial-intelligence?lnk=dev
Text: Accelerate software development with watsonx Code Assistant | Link: https://www.ibm.com/products/watsonx-code-assistant?lnk=dev
Text: Check out the watsonx.ai Developer Toolkit | Link: https://www.ibm.com/watsonx/developer/?lnk=dev
Text: Read the CEO's guide to generative AI | Link: https://www.ibm.com/thought-leadership/institute-business-value/report/ceo-generative-ai?lnk=bus
Text: Explore an AI curriculum designed for business leaders | Link: https://www.ibm.com/think/videos/ai-academy
Text: Deploy an AI agent for HR with watsonx Orchestrate | Link: https://www.ibm.com/prod
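
The links shown above are absolute URLs, but `href` values on a page can also be relative (for example `/products`) or fragment-only (`#main`). Below is a minimal sketch of resolving them against the page URL with the standard library's `urllib.parse.urljoin`. The helper name `absolute_url` and the sample `href` values are made up for illustration; they are not taken from the IBM page.

```python
from urllib.parse import urljoin

# Same page URL as in extract_links()
BASE_URL = "https://www.ibm.com/in-en"

def absolute_url(href, base=BASE_URL):
    """Resolve an href against the page it was found on."""
    return urljoin(base, href)

# Hypothetical href values covering the common cases
samples = [
    "/products/watsonx-orchestrate",  # root-relative path
    "https://developer.ibm.com/",     # already absolute, returned unchanged
    "#main",                          # fragment on the same page
]

for href in samples:
    print(f"{href} -> {absolute_url(href)}")
```

Inside `extract_links()`, the same call could wrap `link.get('href')` before printing, so every line of output carries a URL that can be requested directly.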

**Scrape Book Titles from books.toscrape.com**

**Goal:** Get the titles of the first 10 books.

In [5]:
def scrape_book_titles():
    url = "http://books.toscrape.com"
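    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Stop early on HTTP errors
    # Name the parser explicitly; 'html' lets bs4 pick whichever parser is installed
    soup = BeautifulSoup(response.text, 'html.parser')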

    books = soup.find_all('h3')[:10]
    for i, book in enumerate(books, 1):
        title = book.a['title']
        print(f"{i}. {title}")

scrape_book_titles()


1. A Light in the Attic
2. Tipping the Velvet
3. Soumission
4. Sharp Objects
5. Sapiens: A Brief History of Humankind
6. The Requiem Red
7. The Dirty Little Secrets of Getting Your Dream Job
8. The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull
9. The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics
10. The Black Maria
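
The function reads `book.a['title']` rather than the link text. One likely reason is that a listing page may shorten the visible text while keeping the full title in the `title` attribute. The sketch below shows the difference on a single entry. The markup in `SAMPLE_HTML` is an assumption modeled on that pattern, not a copy of the live page.

```python
from bs4 import BeautifulSoup

# Assumed markup for one book entry; the real page may differ.
SAMPLE_HTML = (
    '<h3><a href="catalogue/a-light-in-the-attic_1000/index.html" '
    'title="A Light in the Attic">A Light in the ...</a></h3>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
link = soup.find("h3").a

# Text shown between the <a> tags
visible_text = link.get_text(strip=True)
# Full title from the attribute, falling back to the visible text if it is absent
full_title = link.get("title", visible_text)

print(f"Visible text: {visible_text}")
print(f"Title attribute: {full_title}")
```

Using `link.get("title", ...)` instead of `link['title']` avoids a `KeyError` when an entry has no `title` attribute.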


**Extract Quotes and Authors**

**Goal:** Get 5 quotes and their authors from http://quotes.toscrape.com

In [9]:
def scrape_quotes():
    url = "http://quotes.toscrape.com"
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Stop early on HTTP errors
    soup = BeautifulSoup(response.text, 'html.parser')

    quotes = soup.find_all('div', class_='quote')[:5]
    for i, quote in enumerate(quotes, 1):
        text = quote.find('span', class_='text').get_text(strip=True)
        author = quote.find('small', class_='author').get_text(strip=True)
        print(f"{i}. {text} — {author}")

scrape_quotes()


1. “The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.” — Albert Einstein
2. “It is our choices, Harry, that show what we truly are, far more than our abilities.” — J.K. Rowling
3. “There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.” — Albert Einstein
4. “The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.” — Jane Austen
5. “Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.” — Marilyn Monroe
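
`scrape_quotes()` fetches and parses in one step, so it can only be tried against the live site. One possible variation, sketched here, moves the parsing into a function that takes an HTML string, which lets it run on a small inline sample. The name `parse_quotes` and the contents of `SAMPLE_HTML` are hypothetical; the sample only mirrors the tag and class names used in the cell above.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; it reuses the class names from scrape_quotes().
SAMPLE_HTML = """
<div class="quote">
  <span class="text">First sample quote.</span>
  <small class="author">Author One</small>
</div>
<div class="quote">
  <span class="text">Second sample quote.</span>
  <small class="author">Author Two</small>
</div>
"""

def parse_quotes(html, limit=5):
    """Return up to `limit` (text, author) pairs found in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for quote in soup.find_all("div", class_="quote")[:limit]:
        text = quote.find("span", class_="text")
        author = quote.find("small", class_="author")
        # Skip blocks missing either element instead of raising AttributeError
        if text is None or author is None:
            continue
        results.append((text.get_text(strip=True), author.get_text(strip=True)))
    return results

for i, (text, author) in enumerate(parse_quotes(SAMPLE_HTML), 1):
    print(f"{i}. {text} - {author}")
```

With this split, the live version would pass `response.text` to `parse_quotes`, and the `None` checks keep one malformed block from stopping the whole run.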
