# Amazon Product Scraper
This notebook scrapes product data (name, price, rating, and review count) from Amazon Egypt search results using Python. It uses `requests` to fetch the result pages and `BeautifulSoup` to parse their HTML.

## Prerequisites
Before running the code, make sure you have the following libraries installed:
- `requests`
- `beautifulsoup4`
- `pandas`

You can install these packages using pip:
```bash
pip install requests beautifulsoup4 pandas
```

In [4]:
import pandas as pd
import requests
from bs4 import BeautifulSoup

import functions as fn  # project-local helper module (not shown) providing the get_product_* functions

## Setup
Define the search URL and the request headers for the scraping session.

In [6]:
url = 'https://www.amazon.eg/s'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Accept-Language': 'en-US, en;q=0.5',
}

## Input Instructions

Set the following variables before running the scraping loop:

- **search**: The search query, for example a product name.
- **max_results**: The maximum number of products to retrieve.

In [8]:
search = 'apple watch'
max_results = 50
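
To see which URL these inputs produce, the query string can be previewed without sending a request. The sketch below uses the standard library's `urlencode`, which should encode the parameters in much the same way `requests` does for its `params` argument. `build_search_url` is a hypothetical helper for illustration and is not part of the notebook.

```python
from urllib.parse import urlencode

def build_search_url(base_url, search, page):
    """Return the search URL for one results page (hypothetical helper)."""
    # 'k' carries the search query and 'page' the results page, as in the notebook.
    return f"{base_url}?{urlencode({'k': search, 'page': page})}"

preview_url = build_search_url('https://www.amazon.eg/s', 'apple watch', 1)
print(preview_url)  # https://www.amazon.eg/s?k=apple+watch&page=1
```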

## Scraping Logic
The loop below requests one results page at a time and extracts the name, rating, review count, and price from each product container. It stops when `max_results` products have been collected, when a page contains no product containers, or when a request fails.

In [10]:
page = 1
products = []
while len(products) < max_results:

    try:
        # A timeout keeps the loop from hanging indefinitely on an unresponsive server
        webpage = requests.get(url, headers=headers, params={'k': search, 'page': page}, timeout=10)
        webpage.raise_for_status()  # Raise an error for bad responses (4xx and 5xx)
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        break

    soup = BeautifulSoup(webpage.content, "html.parser")
    # Matches the exact class string of a search-result card; brittle if the page markup changes
    containers = soup.find_all('div', {'class': 'sg-col-4-of-24 sg-col-4-of-12 s-result-item s-asin sg-col-4-of-16 sg-col s-widget-spacing-small sg-col-4-of-20'})

    if not containers:
        print("No more products found")
        break

    for container in containers:
        product_name = fn.get_product_name(container)
        product_rating = fn.get_product_rating(container)
        product_nreviews = fn.get_product_nreviews(container)
        product_price = fn.get_product_price(container)

        products.append({
            'Name': product_name,
            'Price (EGP)': product_price,
            'Rating': product_rating,
            'Reviews': product_nreviews,
        })

        if len(products) >= max_results:
            break

    page += 1
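
The `functions` module is imported but not shown, so the exact extraction logic is unknown. As a rough sketch of the text-cleaning step such helpers might perform, the hypothetical `parse_number` below turns scraped text into a float and returns `None` when no number is present. The input formats (`'21,450.00'`, `'4.7 out of 5 stars'`) are assumptions, not taken from the notebook.

```python
import re

def parse_number(text):
    """Convert scraped text such as '21,450.00' to a float.

    Hypothetical helper: returns None when the text is empty or has no number.
    """
    if not text:
        return None
    # Drop thousands separators, then take the first integer or decimal number.
    match = re.search(r'\d+(?:\.\d+)?', text.replace(',', ''))
    return float(match.group()) if match else None

print(parse_number('21,450.00'))           # 21450.0
print(parse_number('4.7 out of 5 stars'))  # 4.7
print(parse_number(''))                    # None
```

Returning `None` for a missing field lets pandas store the value as missing rather than failing on a product that has no price or rating.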


## Saving the Data
Once the data is scraped, it can be saved to a CSV file for further analysis or use.

In [12]:
import os

os.makedirs('data', exist_ok=True)  # make sure the output directory exists
df = pd.DataFrame(products)
df.to_csv('data/products.csv', index=False)
df.head()

| | Name | Price (EGP) | Rating | Reviews |
|---|---|---|---|---|
| 0 | ساعة يد ذكية آبل ووتش الفئة 9 بتقنية تحديد الم... | 21450.0 | 4.7 | 493 |
| 1 | ساعة آبل الذكية سيريز 9 بسوار رياضي وهيكل ألوم... | 19950.0 | 4.7 | 96 |
| 2 | ساعة ابل ووتش اس اي مع نظام تحديد الموقع مقاس ... | 15950.0 | 4.6 | 32 |
| 3 | ابل ساعة اس اي جي بي اس 44 ملم بهيكل الومنيوم ... | 15950.0 | 4.6 | 96 |
| 4 | ابل ساعة اس اي (الجيل الثاني) جي بي اس 44 ملم ... | | 4.5 | 43 |
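
The last row of the preview has no price, so it can be worth checking how many records lack each field before analysing the data. The sketch below is one minimal way to do that on a list of dictionaries shaped like `products`. It assumes missing values are stored as `None`, which is not confirmed by the notebook, and the sample records are made up for illustration.

```python
def count_missing(records):
    """Count, per field, how many records have no value (None)."""
    missing = {}
    for record in records:
        for field, value in record.items():
            # True counts as 1 and False as 0 when added to an int.
            missing[field] = missing.get(field, 0) + (value is None)
    return missing

# Made-up sample records in the same shape as the `products` list
sample = [
    {'Name': 'Watch A', 'Price (EGP)': 100.0, 'Rating': 4.5, 'Reviews': 10},
    {'Name': 'Watch B', 'Price (EGP)': None, 'Rating': 4.0, 'Reviews': 3},
]
missing_counts = count_missing(sample)
print(missing_counts)  # {'Name': 0, 'Price (EGP)': 1, 'Rating': 0, 'Reviews': 0}
```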
