# Assignment: Scraping eBay Data Using Selenium

This assignment will guide you through the steps required to scrape product data from [eBay](https://www.ebay.com/) using Selenium. Your goal is to collect data about products based on a specific search query and store the data in a CSV file for analysis.

## Instructions

Below is a step-by-step outline of the scraping process. Follow these steps and implement the required code to complete the assignment. Comment your code wherever necessary to explain your thought process.

### **Step 1: Set Up Selenium**
1. Import the necessary modules from Selenium (e.g., `webdriver`, `By`, `Keys`, etc.).
2. Set up the Chrome WebDriver to control the browser. Ensure you have downloaded the ChromeDriver executable and placed it in the correct directory.
3. Navigate to the eBay homepage using the WebDriver.

In [None]:
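from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Path to the ChromeDriver executable; adjust for your machine
path = 'C:\\avast\\chromedriver.exe'
# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(executable_path=path))
driver.get("https://www.ebay.com")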

### **Step 2: Perform a Search**
1. Identify the search bar element on the eBay homepage using an appropriate locator (e.g., `id`, `name`, `XPath`).
2. Send a specific search query (e.g., "laptops") to the search bar and simulate pressing the Enter key.
3. Wait for the search results page to load.

In [None]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver_path = 'path/to/chromedriver'
# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(executable_path=driver_path))

driver.get("https://www.ebay.com")
driver.maximize_window()

try:
    search_bar = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.NAME, "_nkw"))
    )
    search_query = "laptops"
    search_bar.send_keys(search_query + Keys.RETURN)
    WebDriverWait(driver, 10).until(
        EC.title_contains(search_query)
    )
    print(driver.title)

finally:
    driver.quit()
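
The search box located above is named `_nkw`, and the same name appears as a query parameter in eBay's results URL. The following is a minimal sketch, assuming the results page lives at `/sch/i.html` (confirm this in your browser's address bar after a manual search). `build_search_url` and `EBAY_SEARCH_BASE` are hypothetical names introduced for illustration, not part of Selenium.

```python
from urllib.parse import urlencode

# Assumed results-page path; verify it against a manual search in your browser.
EBAY_SEARCH_BASE = "https://www.ebay.com/sch/i.html"


def build_search_url(query: str) -> str:
    """Return a results URL that carries the query in the `_nkw` parameter."""
    return f"{EBAY_SEARCH_BASE}?{urlencode({'_nkw': query})}"


if __name__ == "__main__":
    print(build_search_url("gaming laptops"))
```

Passing the result to `driver.get(...)` would skip the typing step, which can be handy for debugging. The assignment itself still expects you to locate the search bar and submit the query through it.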

### **Step 3: Extract Product Data**
1. Use `find_elements` to locate product titles, prices, and other relevant data on the search results page. For example:
   - Product title: Locate elements displaying the product names.
   - Price: Locate elements showing product prices.
   - (Optional) Link: Extract the URL for each product.
2. Loop through the extracted elements and store the data in a structured format (e.g., a Python list of dictionaries).

In [None]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service  # needed for Service(...) below
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver_path = 'path/to/chromedriver'
driver = webdriver.Chrome(service=Service(executable_path=driver_path))

driver.get("https://www.ebay.com")
driver.maximize_window()

try:
    search_bar = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.NAME, "_nkw"))
    )
    
    search_query = "laptops"
    search_bar.send_keys(search_query + Keys.RETURN)

    WebDriverWait(driver, 10).until(
        EC.title_contains(search_query)
    )
    products = []
    product_elements = driver.find_elements(By.CSS_SELECTOR, ".s-item")  

    for item in product_elements:
        title_element = item.find_element(By.CSS_SELECTOR, ".s-item__title")
        price_element = item.find_element(By.CSS_SELECTOR, ".s-item__price")
        link_element = item.find_element(By.CSS_SELECTOR, ".s-item__link")

        product_data = {
            "title": title_element.text,
            "price": price_element.text,
            "url": link_element.get_attribute("href")
        }
        products.append(product_data)

    for product in products:
        print(product)

finally:
    driver.quit()
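
The `price` field above is stored as display text, which is awkward to analyze later. The following is a minimal sketch of a hypothetical `parse_price` helper that pulls the first number out of such a string. It assumes prices look like `$499.99` or `$1,299.00 to $1,499.00`, which you should confirm against the text your scraper actually returns.

```python
import re
from typing import Optional

# Matches numbers such as 499.99 or 1,299.00 (thousands separators optional).
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text: str) -> Optional[float]:
    """Return the first numeric amount in `text`, or None if there is none."""
    match = _NUMBER.search(text)
    if match is None:
        return None
    # Remove thousands separators before converting to a float.
    return float(match.group().replace(",", ""))
```

For a price range, this keeps only the lower bound. Storing the raw text alongside the parsed number preserves the original information in the CSV.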

### **Step 4: Handle Pagination**
1. Check for the presence of a "Next" button to navigate to the next page of results.
2. Implement a loop to scrape multiple pages of search results. Break the loop when no more pages are available or after a set number of pages (e.g., 5 pages).

In [None]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver_path = 'path/to/chromedriver' 
service = Service(executable_path=driver_path)
driver = webdriver.Chrome(service=service)
driver.get("https://www.ebay.com")
driver.maximize_window()

try:
    search_bar = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.NAME, "_nkw"))
    )
    search_query = "laptops"
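    search_bar.send_keys(search_query + Keys.RETURN)
    WebDriverWait(driver, 10).until(EC.title_contains(search_query))

    max_pages = 5  # stop after this many result pages
    for page in range(1, max_pages + 1):
        items = driver.find_elements(By.CSS_SELECTOR, ".s-item")
        print(f"Page {page}: {len(items)} items found")
        # find_elements returns an empty list when there is no "Next" button
        next_buttons = driver.find_elements(By.CSS_SELECTOR, ".pagination__next")
        if page == max_pages or not next_buttons:
            break
        next_button = next_buttons[0]
        if "disabled" in (next_button.get_attribute("class") or ""):
            break  # last page of results
        next_button.click()
        # Wait for the old page to be replaced before reading the next one
        WebDriverWait(driver, 10).until(EC.staleness_of(next_button))
finally:
    driver.quit()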

### **Step 5: Save Data to CSV**
1. Use the `pandas` library to convert the scraped data into a DataFrame.
2. Save the DataFrame to a CSV file with appropriate column headers.

In [None]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd  
driver_path = 'path/to/chromedriver'
service = Service(executable_path=driver_path)
driver = webdriver.Chrome(service=service)
driver.get("https://www.ebay.com")
driver.maximize_window()

try:
    search_bar = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.NAME, "_nkw"))
    )

    search_query = "laptops"
    search_bar.send_keys(search_query + Keys.RETURN)
    WebDriverWait(driver, 10).until(
        EC.title_contains(search_query)
    )
    products = []
    max_pages = 5  
    current_page = 1

    while current_page <= max_pages:
        product_elements = driver.find_elements(By.CSS_SELECTOR, ".s-item")  

        for item in product_elements:
            try:
                title_element = item.find_element(By.CSS_SELECTOR, ".s-item__title")
                price_element = item.find_element(By.CSS_SELECTOR, ".s-item__price")
                link_element = item.find_element(By.CSS_SELECTOR, ".s-item__link")

                product_data = {
                    "title": title_element.text,
                    "price": price_element.text,
                    "url": link_element.get_attribute("href")
                }
                products.append(product_data)
            except Exception as e:
                print(f"Error extracting data from item: {e}")

        try:
            next_button = driver.find_element(By.CSS_SELECTOR, ".pagination__next")
            if "disabled" in next_button.get_attribute("class"):
                break  
            else:
                next_button.click()
                # The title already contains the query, so wait for the old page to go stale
                WebDriverWait(driver, 10).until(
                    EC.staleness_of(next_button)
                )
                current_page += 1  
        except Exception as e:
            print(f"No next button found or error: {e}")
            break  
    df = pd.DataFrame(products)
    df.to_csv('ebay_products.csv', index=False, header=True)
    print("Data has been saved to ebay_products.csv")

finally:
    driver.quit()
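
Before building the DataFrame, it can help to tidy the `products` list. The sketch below assumes two things that are worth checking against your own output: that some `.s-item` elements are placeholders with an empty title, and that the same listing can appear on more than one page. `clean_products` is a hypothetical helper name.

```python
from typing import Dict, List


def clean_products(products: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop rows with a blank title and keep only the first row per URL."""
    seen_urls = set()
    cleaned = []
    for product in products:
        title = (product.get("title") or "").strip()
        url = product.get("url")
        if not title:
            continue  # placeholder or empty listing
        if url and url in seen_urls:
            continue  # already collected on an earlier page
        if url:
            seen_urls.add(url)
        cleaned.append(product)
    return cleaned
```

Calling `pd.DataFrame(clean_products(products))` would then feed only the kept rows into the CSV.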

### **Step 6: Close the Browser**
1. Once the scraping is complete, ensure the WebDriver is closed to release system resources.

In [None]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd  
driver_path = 'path/to/chromedriver' 
service = Service(executable_path=driver_path)
driver = webdriver.Chrome(service=service)
driver.get("https://www.ebay.com")
driver.maximize_window()

try:
    search_bar = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.NAME, "_nkw"))
    )
    search_query = "laptops"
    search_bar.send_keys(search_query + Keys.RETURN)
    WebDriverWait(driver, 10).until(
        EC.title_contains(search_query)
    )
    products = []
    max_pages = 5 
    current_page = 1

    while current_page <= max_pages:
        product_elements = driver.find_elements(By.CSS_SELECTOR, ".s-item")

        for item in product_elements:
            try:
                title_element = item.find_element(By.CSS_SELECTOR, ".s-item__title")
                price_element = item.find_element(By.CSS_SELECTOR, ".s-item__price")
                link_element = item.find_element(By.CSS_SELECTOR, ".s-item__link")

                product_data = {
                    "title": title_element.text,
                    "price": price_element.text,
                    "url": link_element.get_attribute("href")
                }
                products.append(product_data)
            except Exception as e:
                print(f"Error extracting data from item: {e}")
        try:
            next_button = driver.find_element(By.CSS_SELECTOR, ".pagination__next")
            if "disabled" in next_button.get_attribute("class"):
                break 
            else:
                next_button.click()
                # The title already contains the query, so wait for the old page to go stale
                WebDriverWait(driver, 10).until(
                    EC.staleness_of(next_button)
                )
                current_page += 1
        except Exception as e:
            print(f"No next button found or error: {e}")
            break  

    df = pd.DataFrame(products)
    df.to_csv('ebay_products.csv', index=False, header=True)
    print("Data has been saved to ebay_products.csv")

finally:
    driver.quit()
 

### **Deliverables**
- Submit the Python script you implemented, following the steps above, to your GitHub repository.
- Ensure that your script:
  - Extracts data for at least 50 products.
  - Includes product titles, prices, and links (if applicable).
  - Saves the data to a CSV file named `ebay_products.csv`.
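
One way to self-check the requirements above before submitting is to read the CSV back and count what is in it. This is a minimal sketch using only the standard library. `check_deliverables` and `REQUIRED_COLUMNS` are hypothetical names, and the column names assume your script uses the `title`, `price`, and `url` keys from the steps above.

```python
import csv
from typing import List

# Column names assumed to match the dictionary keys used when scraping.
REQUIRED_COLUMNS = ("title", "price", "url")


def check_deliverables(path: str = "ebay_products.csv", min_rows: int = 50) -> List[str]:
    """Return a list of problems found in the CSV; an empty list means it passes."""
    problems = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in REQUIRED_COLUMNS if name not in header]
        rows = list(reader)
    if missing:
        problems.append(f"missing columns: {', '.join(missing)}")
    if len(rows) < min_rows:
        problems.append(f"only {len(rows)} rows, expected at least {min_rows}")
    return problems
```

Running `print(check_deliverables())` in the folder that contains your CSV prints an empty list when both checks pass.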

### **Bonus Challenge**
1. Add functionality to scrape product ratings and the number of reviews (if available).
2. Include error handling to skip elements that might be missing data or inaccessible.
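
For both bonus items, one possible approach is a small helper that returns `None` instead of raising when an element is absent. The sketch below relies on `find_elements`, which returns an empty list when nothing matches. `safe_text` is a hypothetical helper, and the string `"css selector"` is the value behind Selenium's `By.CSS_SELECTOR`, written out so the snippet has no imports beyond the standard library.

```python
from typing import Optional

CSS = "css selector"  # the string value of selenium's By.CSS_SELECTOR


def safe_text(parent, selector: str) -> Optional[str]:
    """Return the stripped text of the first match under `parent`, or None.

    `find_elements` returns an empty list when nothing matches, so a missing
    element does not raise an exception here.
    """
    matches = parent.find_elements(CSS, selector)
    if not matches:
        return None
    text = matches[0].text.strip()
    return text or None
```

Inside the product loop this could be called as `safe_text(item, ".s-item__reviews-count")`. That class name is an assumption, so inspect a listing in your browser's developer tools to find the selectors eBay currently uses for ratings and review counts.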

**Good luck!** 🚀