# Web Scraping Using Selenium

*Selenium*:
 Unlike BeautifulSoup which  is used for web scraping static websites like blog or articles or news, etc. Selenium can be used for websites which load their content dynamically using Javascript.
      It can automate browsers like Chrome or FireFox, wait for content ot load, click buttons, scroll and extract fully rendered web pages like a real user.
      

*WebDriver*
Used by Selenium to interact with a web browser and acts as a bridge between the python script and the web browser.

Each browser has its own WebDriver:
- Chrome: ChromeDriver
- FireFox: GeckoDriver
- Edge: EdgeDriver


In [7]:
import requests #Sends HTTP requests to get webpage content (used for static sites)
#import selenium #Automates browsers (needed for dynamic sites with JavaScript)

!pip install selenium




In [17]:
!pip install webdriver-manager

Collecting webdriver-manager
  Downloading webdriver_manager-4.0.2-py2.py3-none-any.whl (27 kB)
Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv, webdriver-manager
Successfully installed python-dotenv-1.0.1 webdriver-manager-4.0.2


In [18]:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
import time


In [19]:
element_list=[]

#Set up Chrome Options (optional)
options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")


In [21]:
#Use a proper Service Object
service=Service(ChromeDriverManager().install())

In [23]:
for page in range(1,3):
    #Initialize driver properly
    driver=webdriver.Chrome(service=service,options=options)
    
    #Load URL
    url=f"https://webscraper.io/test-sites/e-commerce/static/computers/laptops?page=%7Bpage%7D"
    driver.get(url)
    time.sleep(2)  #Optional wait to ensure page loads
    
    #Extract page details
    titles=driver.find_elements(By.CLASS_NAME,"title")
    prices=driver.find_elements(By.CLASS_NAME,"price")
    descriptions=driver.find_elements(By.CLASS_NAME,"description")
    ratings=driver.find_elements(By.CLASS_NAME,"ratings")
    
    #Store resluts in a list
    for i in range(len(titles)):
        element_list.append([
            titles[i].text,
            prices[i].text,
            descriptions[i].text,
            ratings[i].text
        ])
        
    driver.quit()
    
    
#Display extracted data
for row in element_list:
    print(row)

['Packard 255 G2', '$416.99', '15.6", AMD E2-3800 1.3GHz, 4GB, 500GB, Windows 8.1', '2 reviews']
['Aspire E1-510', '$306.99', '15.6", Pentium N3520 2.16GHz, 4GB, 500GB, Linux', '2 reviews']
['ThinkPad T540p', '$1178.99', '15.6", Core i5-4200M, 4GB, 500GB, Win7 Pro 64bit', '2 reviews']
['ProBook', '$739.99', '14", Core i5 2.6GHz, 4GB, 500GB, Win7 Pro 64bit', '8 reviews']
['ThinkPad X240', '$1311.99', '12.5", Core i5-4300U, 8GB, 240GB SSD, Win7 Pro 64bit', '12 reviews']
['Aspire E1-572G', '$581.99', '15.6", Core i5-4200U, 8GB, 1TB, Radeon R7 M265, Windows 8.1', '2 reviews']
['Packard 255 G2', '$416.99', '15.6", AMD E2-3800 1.3GHz, 4GB, 500GB, Windows 8.1', '2 reviews']
['Aspire E1-510', '$306.99', '15.6", Pentium N3520 2.16GHz, 4GB, 500GB, Linux', '2 reviews']
['ThinkPad T540p', '$1178.99', '15.6", Core i5-4200M, 4GB, 500GB, Win7 Pro 64bit', '2 reviews']
['ProBook', '$739.99', '14", Core i5 2.6GHz, 4GB, 500GB, Win7 Pro 64bit', '8 reviews']
['ThinkPad X240', '$1311.99', '12.5", Core i5-43

ChromeOptions() + --headless: Runs the browser in the background without opening a visible window — ideal for automation and speed.

ChromeDriverManager().install(): Automatically downloads the correct version of ChromeDriver based on your Chrome browser.

Service(...): Wraps the ChromeDriver path for proper configuration with Selenium 4+.

webdriver.Chrome(service=..., options=...): Launches a Chrome browser instance with the given setup.

driver.get(url): Navigates to the specified page URL.

find_elements(By.CLASS_NAME, "class"): Extracts all elements matching the given class name like titles, prices, etc.

.text: Retrieves the visible text content from an HTML element.

element_list.append([...]): Stores each product's extracted data in a structured list.