### Downloading Images from a Website
#### Overview
The script downloads the images referenced on the Biography.com home page and saves them into a local folder (images). Relative image URLs are resolved to absolute URLs before downloading.

In [41]:
import requests
from bs4 import BeautifulSoup
import os
from urllib.parse import urlparse, urljoin

#### Explanation:

- requests: Used to send HTTP requests to fetch the web page content.

- BeautifulSoup: A web scraping library that parses HTML and helps extract data.

- os: Provides functions for file system operations (e.g., creating folders, handling file paths).

- urlparse and urljoin (from urllib.parse): Used to handle and construct complete URLs (especially for images with relative paths).
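
To make the role of the two urllib.parse helpers more concrete, the sketch below shows how they behave on the kinds of src values a page can contain. The base URL is the one used in this notebook; the asset path, the image file name, and the cdn.example.com host are made-up values for illustration only.

```python
from urllib.parse import urlparse, urljoin

base = "https://www.biography.com/"

# A root-relative path is resolved against the base URL.
relative = urljoin(base, "/_assets/icons/search.svg")  # hypothetical path

# A URL that is already absolute is returned unchanged.
absolute = urljoin(base, "https://hips.hearstapps.com/hmg-prod/images/photo.jpg")  # hypothetical file

# A protocol-relative URL inherits only the scheme from the base.
protocol_relative = urljoin(base, "//cdn.example.com/img/logo.png")  # hypothetical host

# urlparse splits a URL into components such as scheme, netloc, path and query.
parts = urlparse("https://cdn.example.com/img/logo.png?resize=270")

print(relative)
print(absolute)
print(protocol_relative)
print(parts.scheme, parts.netloc, parts.path, parts.query)
```

Because urljoin handles all three cases, resolving every src against the page URL before downloading is usually enough to obtain absolute URLs.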

### Define the Target Website and Fetch HTML Content

In [28]:
# Target website URL
url = "https://www.biography.com/"

# Step 1: Fetch the website content
response = requests.get(url, timeout=10)
response.raise_for_status()  # Stop early on an HTTP error status (4xx/5xx)
soup = BeautifulSoup(response.text, "html.parser")



#### Explanation:

- The target website (https://www.biography.com/) is stored in the url variable.

- requests.get(url): Sends an HTTP GET request to the website and returns a response object; the page's HTML is available as response.text.

- BeautifulSoup(response.text, "html.parser"): Parses the HTML response into a structured format for easy extraction of data.

### Extracting All Image URLs

In [30]:
images = soup.find_all('img')
for img in images:
    print(img['src'])

/_assets/design-tokens/fre/static/icons/search.f1c199c.svg
/_assets/design-tokens/fre/static/icons/close.38e3324.svg
/_assets/design-tokens/biography/static/images/logos/logo.5ec9b18.svg?primary=%2523ffffff
https://hips.hearstapps.com/hmg-prod/images/dolly-carl-7cr--dollyparton.jpg?crop=1.00xw:0.751xh;0,0.0951xh&resize=1200:*
/_assets/design-tokens/fre/static/icons/play.db7c035.svg?primary=%2523ffffff
https://hips.hearstapps.com/vidthumb/manual_upload/57b381f3e694aa572b888fe7/thumb_1471382006.png?crop=1xw:1xh;center,top&resize=1200:*
https://hips.hearstapps.com/hmg-prod/images/e3ee4c9f-2dd5-4975-83f7-8afdb7bc6cd8.jpeg?crop=1xw:0.723xh;0xw,0.117xh&resize=270:*
https://hips.hearstapps.com/hmg-prod/images/erik-menendez-and-his-brother-lyle-listen-during-a-pre-news-photo-1740604632.pjpeg?crop=0.663xw:1.00xh;0.0769xw,0&resize=270:*
https://hips.hearstapps.com/hmg-prod/images/screen-shot-2025-02-26-at-8-47-58-am-67bf1ba89d70c.png?crop=0.954xw:1.00xh;0.0238xw,0&resize=270:*
https://hips.hears

### Defining the Function to Download Images

In [42]:
def download_image(image_url, folder='images'):
    """Downloads an image from the given URL and saves it into a specified folder."""

    # Create the folder if it doesn't exist
    os.makedirs(folder, exist_ok=True)

    # Process the image URL (callers should pass absolute URLs, e.g. via urljoin)
    parsed_url = urlparse(image_url)
    if parsed_url.scheme == "":  # Protocol-relative URL such as //host/pic.png: prepend "https:"
        image_url = "https:" + image_url
    full_url = image_url.split("?")[0]  # Remove query parameters

    # Generate filename based on the image URL
    filename = os.path.join(folder, os.path.basename(full_url))

    try:
        # Fetch the image data; raise on timeouts and HTTP error statuses
        img_response = requests.get(full_url, timeout=10)
        img_response.raise_for_status()

        # Save the image file
        with open(filename, 'wb') as img_file:
            img_file.write(img_response.content)

        print(f"Downloaded: {filename}")
    except (requests.RequestException, OSError) as e:
        print(f"Failed to download {image_url}: {e}")


#### Explanation:

- Function Purpose: Downloads an image and saves it in a specified folder.

- folder='images': Default directory to save images.

- Folder Handling: If the images folder does not exist, os.makedirs(folder) creates it.

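- Processing Image URLs:

    - urlparse(image_url): Splits the URL into its components so that the scheme (http or https) can be checked.

    - If the scheme is missing, "https:" is prepended. This only produces a valid URL for protocol-relative URLs such as //example.com/pic.png; path-relative URLs must be resolved with urljoin before the function is called.

    - .split("?")[0] removes the query string, so the file name is taken from the URL path alone.

- Downloading the Image:

    - requests.get(...) fetches the image, and the .content attribute of the response holds the image data as raw bytes.

    - open(filename, 'wb'): Opens the file in binary write mode.

    - Writes the image data to the file.

- Error Handling: If the download or the file write raises an exception, it is caught and an error message is printed, so one bad URL does not stop the remaining downloads.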
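
The file-naming step can also be done by parsing the URL instead of splitting the string. The sketch below is one possible approach and is not part of the notebook's function: filename_from_url and its default argument are hypothetical names introduced here. It takes the last path segment, decodes percent-escapes, and falls back to a placeholder when the path has no file name.

```python
import os
from urllib.parse import urlparse, unquote

def filename_from_url(image_url, default="image"):
    """Return a local file name derived from the path of image_url (hypothetical helper)."""
    path = urlparse(image_url).path          # the path excludes the query string and fragment
    name = os.path.basename(unquote(path))   # last path segment, percent-decoded
    return name or default                   # fall back when the path ends in "/"

example = (
    "https://hips.hearstapps.com/hmg-prod/images/dolly-carl-7cr--dollyparton.jpg"
    "?crop=1.00xw:0.751xh;0,0.0951xh&resize=1200:*"
)
print(filename_from_url(example))
print(filename_from_url("https://example.com/"))
```

A helper like this could replace the os.path.basename(full_url) call if the query string needs to be kept in the request URL, for example to preserve a server-side crop or resize.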

### Iterating Over Image Tags and Downloading Images

In [43]:
for img in images:
    img_url = img.get("src")
    if not img_url:  # Skip <img> tags that have no src attribute
        continue
    download_image(urljoin(url, img_url))

Downloaded: images\search.f1c199c.svg
Downloaded: images\close.38e3324.svg
Downloaded: images\logo.5ec9b18.svg
Downloaded: images\dolly-carl-7cr--dollyparton.jpg
Downloaded: images\play.db7c035.svg
Downloaded: images\thumb_1471382006.png
Downloaded: images\e3ee4c9f-2dd5-4975-83f7-8afdb7bc6cd8.jpeg
Downloaded: images\erik-menendez-and-his-brother-lyle-listen-during-a-pre-news-photo-1740604632.pjpeg
Downloaded: images\screen-shot-2025-02-26-at-8-47-58-am-67bf1ba89d70c.png
Downloaded: images\sean-baker-winner-of-the-best-picture-best-directing-best-news-photo-1741014792.pjpeg
Downloaded: images\gettyimages-1382611326-crop-670ed8cb0603c.jpg
Downloaded: images\zoe-saldana-at-golden-globes-first-time-nominee-luncheon-at-news-photo-1735939071.pjpeg
Downloaded: images\conan-obrien-attends-the-paleylive-globetrotting-podcasting-news-photo-1731687503.jpg
Downloaded: images\lalisa-manobal-arrives-at-the-los-angeles-premiere-of-hbo-news-photo-1740595405.pjpeg
Downloaded: images\ap266047a754dd3e.jp

#### Explanation:

- Iterates through all `<img>` tags in the images list.

- Extracts the src attribute, which contains the image URL.

- Converts each URL to an absolute URL with urljoin(url, img_url); URLs that are already absolute are left unchanged.

- Calls download_image with the resulting absolute URL to download each image.
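
A page can reference the same image more than once (an icon reused in several places, for instance), and some tags may carry no src value at all. As a minimal sketch, assuming the src values have already been collected into a list, the hypothetical helper unique_absolute_urls below resolves each value against the page URL and keeps only the first occurrence of each result. The sample paths and the cdn.example.com host are invented for illustration.

```python
from urllib.parse import urljoin

def unique_absolute_urls(base_url, srcs):
    """Resolve each src against base_url, dropping empty values and duplicates (hypothetical helper)."""
    seen = set()
    result = []
    for src in srcs:
        if not src:                 # skip None or empty strings
            continue
        absolute = urljoin(base_url, src)
        if absolute not in seen:    # keep the first occurrence only
            seen.add(absolute)
            result.append(absolute)
    return result

srcs = ["/icons/play.svg", None, "https://cdn.example.com/a.jpg", "/icons/play.svg"]
urls = unique_absolute_urls("https://www.biography.com/", srcs)
print(urls)
```

In the notebook, the input list could be built with [img.get("src") for img in images], and each returned URL passed to download_image.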