## Load the web page content

In [8]:
import requests

# URL of the website
url = 'https://www.bbc.com/news'

# Fetch the webpage content (the timeout stops the request from hanging indefinitely)
response = requests.get(url, timeout=10)

# Check if the request was successful
if response.status_code == 200:
    print("Page loaded successfully.")
    print(response.content[:1000])  # Print the first 1000 bytes to verify the page loaded
else:
    print(f"Failed to retrieve content. Status code: {response.status_code}")


Page loaded successfully.
b'<!DOCTYPE html><html lang="en-GB"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width"/><title>Home - BBC News</title><meta property="og:title" content="Home - BBC News"/><meta name="twitter:title" content="Home - BBC News"/><meta name="description" content="Visit BBC News for up-to-the-minute news, breaking news, video, audio and feature stories. BBC News provides trusted World and UK news as well as local and regional perspectives. Also entertainment, business, science, technology and health news."/><meta property="og:description" content="Visit BBC News for up-to-the-minute news, breaking news, video, audio and feature stories. BBC News provides trusted World and UK news as well as local and regional perspectives. Also entertainment, business, science, technology and health news."/><meta name="twitter:description" content="Visit BBC News for up-to-the-minute news, breaking news, video, audio and feature stories. BBC News provide
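
`response.content` is a `bytes` object, which is why the output above starts with `b'`, and `[:1000]` therefore slices bytes rather than characters. A minimal sketch of a decoded preview follows, assuming the page is UTF-8 (as the `charSet` meta tag in the output suggests). `preview` and `sample` are hypothetical names, and the sample bytes stand in for `response.content` so the block runs without a network call.

```python
def preview(raw: bytes, limit: int = 1000, encoding: str = "utf-8") -> str:
    """Decode raw bytes and return at most `limit` characters."""
    # errors="replace" keeps the preview printable even if some bytes
    # are not valid in the assumed encoding.
    return raw.decode(encoding, errors="replace")[:limit]


# Hypothetical stand-in for response.content (not the real page).
sample = '<!DOCTYPE html><html lang="en-GB"><title>Café news</title></html>'.encode("utf-8")

print(sample[:45])                # slicing bytes counts bytes
print(preview(sample, limit=45))  # slicing decoded text counts characters
```

In the cell above, the equivalent call would be `preview(response.content)`. `response.text` is another option, with the encoding chosen by `requests`.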

## Find tags and their class attributes

In [9]:
import requests
from bs4 import BeautifulSoup

# URL of the website
url = 'https://www.bbc.com/news'

# Fetch the webpage content (the timeout stops the request from hanging indefinitely)
response = requests.get(url, timeout=10)

# Check if the request was successful
if response.status_code == 200:
    # Parse the page content
    soup = BeautifulSoup(response.content, 'html.parser')
    
    # Find and print all h1, h2, h3, and h4 tags with their class attributes
    for tag in ['h1', 'h2', 'h3', 'h4']:
        print(f"\n--- {tag.upper()} Tags ---")
        for element in soup.find_all(tag):
            print(f"Tag: {tag}, Text: {element.get_text()}, Class: {element.get('class')}")
else:
    print(f"Failed to retrieve content. Status code: {response.status_code}")



--- H1 Tags ---
Tag: h1, Text: NewsNews, Class: None

--- H2 Tags ---
Tag: h2, Text: Sri Lanka election goes to historic second count, Class: ['sc-4fedabc7-3', 'dsoipF']
Tag: h2, Text: Hezbollah rocket attacks damage homes in Israel, while IDF launches more Lebanon strikes, Class: ['sc-4fedabc7-3', 'zTZri']
Tag: h2, Text: Israel orders 45-day closure of Al Jazeera West Bank office, Class: ['sc-4fedabc7-3', 'zTZri']
Tag: h2, Text: MrBeast is called the internet's nicest man - now he faces 54-page lawsuit, Class: ['sc-4fedabc7-3', 'zTZri']
Tag: h2, Text: CPS twice did not prosecute Fayed over sex abuse claims, Class: ['sc-4fedabc7-3', 'zTZri']
Tag: h2, Text: Hezbollah rocket attacks damage homes in Israel, while IDF launches more Lebanon strikes, Class: ['sc-4fedabc7-3', 'zTZri']
Tag: h2, Text: Four dead and dozens hurt in Alabama mass shooting, Class: ['sc-4fedabc7-3', 'zTZri']
Tag: h2, Text: Macron unveils new right-wing French government, Class: ['sc-4fedabc7-3', 'zTZri']
Tag: h2, Te
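
The printed rows suggest that every `h2` headline carries the class `sc-4fedabc7-3`, while the second class varies. A small sketch of how that could be checked programmatically is shown here; `shared_classes` is a hypothetical helper, and `class_lists` is copied by hand from the first few `h2` rows of the output above rather than taken from a live page.

```python
from collections import Counter

# Class lists copied from the first h2 rows printed above.
class_lists = [
    ['sc-4fedabc7-3', 'dsoipF'],
    ['sc-4fedabc7-3', 'zTZri'],
    ['sc-4fedabc7-3', 'zTZri'],
    ['sc-4fedabc7-3', 'zTZri'],
]


def shared_classes(rows):
    """Return the class tokens that appear in every row."""
    # element.get('class') returns None for tags without a class,
    # so treat None as an empty list.
    counts = Counter(token for row in rows for token in set(row or []))
    return [token for token, n in counts.most_common() if n == len(rows)]


print(shared_classes(class_lists))  # ['sc-4fedabc7-3']
```

With the parsed page, the input could be built as `[el.get('class') for el in soup.find_all('h2')]`.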

## Extract headlines from the website

In [13]:
import requests
from bs4 import BeautifulSoup

# URL of the website
url = 'https://www.bbc.com/news'

# Fetch the webpage content (the timeout stops the request from hanging indefinitely)
response = requests.get(url, timeout=10)

# Check if the request was successful
if response.status_code == 200:
    # Parse the page content
    soup = BeautifulSoup(response.content, 'html.parser')
    
    # Find all 'h2' tags with class 'sc-4fedabc7-3' (this class name appears to be
    # auto-generated, so it may change when the site is updated)
    headlines = soup.find_all('h2', class_='sc-4fedabc7-3')
    
    # Print each headline
    for i, headline in enumerate(headlines, 1):
        print(f"{i}. {headline.get_text()}")
else:
    print(f"Failed to retrieve content. Status code: {response.status_code}")


1. Sri Lanka election goes to historic second count
2. Hezbollah rocket attacks damage homes in Israel, while IDF launches more Lebanon strikes
3. Israel orders 45-day closure of Al Jazeera West Bank office
4. MrBeast is called the internet's nicest man - now he faces 54-page lawsuit
5. CPS twice did not prosecute Fayed over sex abuse claims
6. Hezbollah rocket attacks damage homes in Israel, while IDF launches more Lebanon strikes
7. Four dead and dozens hurt in Alabama mass shooting
8. Macron unveils new right-wing French government
9. At least 51 dead in Iran coal mine explosion
10. Israel orders 45-day closure of Al Jazeera West Bank office
11. One dead and several missing after 'unprecedented' rains in Japan
12. Trump rejects second TV debate as 'too late'
13. How Punjabi megastar Diljit Dosanjh is inspiring the next gen
14. Dissident in prisoner swap vows to return to Russia
15. Joshua future in doubt after mauling by dominant Dubois
16. Amazon says workers must be in the office.
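
The list above contains repeats (items 2 and 6 are identical, as are items 3 and 10), presumably because the same story appears in more than one section of the page. A minimal sketch of order-preserving de-duplication follows; `unique_in_order` and `headline_texts` are hypothetical names, and the sample list is the first six headlines copied from the output above.

```python
def unique_in_order(items):
    """Drop repeated items while keeping the first occurrence of each."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


# First six headlines copied from the output above; the last repeats the second.
headline_texts = [
    "Sri Lanka election goes to historic second count",
    "Hezbollah rocket attacks damage homes in Israel, while IDF launches more Lebanon strikes",
    "Israel orders 45-day closure of Al Jazeera West Bank office",
    "MrBeast is called the internet's nicest man - now he faces 54-page lawsuit",
    "CPS twice did not prosecute Fayed over sex abuse claims",
    "Hezbollah rocket attacks damage homes in Israel, while IDF launches more Lebanon strikes",
]

for i, text in enumerate(unique_in_order(headline_texts), 1):
    print(f"{i}. {text}")
```

In the cell above, the input would be `[headline.get_text() for headline in headlines]`.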