# What is web scraping?
Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. Web scraping software may directly access the World Wide Web using the Hypertext Transfer Protocol or a web browser. While web scraping can be done manually by a software user, the term typically refers to automated processes implemented using a bot or web crawler. It is a form of copying in which specific data is gathered and copied from the web, typically into a central local database or spreadsheet, for later retrieval or analysis.

![](https://crawlbase.com/blog/best-data-memes/web-scraping-memes.jpg)

Scraping a web page involves fetching it and then extracting data from it. Fetching is the downloading of a page (which a browser does when a user views a page). Web crawling is therefore a main component of web scraping, since it fetches pages for later processing. Once a page has been fetched, extraction can take place. The content of a page may be parsed, searched and reformatted, and its data copied into a spreadsheet or loaded into a database. Web scrapers typically take something out of a page in order to use it for another purpose somewhere else. An example would be finding and copying names and telephone numbers, companies and their URLs, or e-mail addresses to a list (contact scraping).
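
To make the fetch-then-extract idea concrete, here is a minimal sketch of contact scraping using only the Python standard library. The HTML is an inline sample (an assumption, so the example runs offline); on a real site the string would come from a fetch step such as `urllib.request.urlopen`. The `LinkCollector` class and the `example.com` addresses are made up for illustration.

```python
from html.parser import HTMLParser

# Stand-in for a fetched page (assumption: a real scraper would download this)
SAMPLE_HTML = """
<html><body>
  <h1>Contacts</h1>
  <p>Sales: <a href="mailto:sales@example.com">sales@example.com</a></p>
  <p>Support: <a href="mailto:support@example.com">support@example.com</a></p>
  <p>Website: <a href="https://example.com/about">About us</a></p>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


# Extraction step: parse the page and pull out the links
parser = LinkCollector()
parser.feed(SAMPLE_HTML)

# Split the links into e-mail addresses and ordinary URLs
emails = sorted({link[len('mailto:'):] for link in parser.links
                 if link.startswith('mailto:')})
urls = [link for link in parser.links if link.startswith('http')]

print(emails)
print(urls)
```

The same two steps (get the page, then pick out the parts you need) are what the Selenium code in the rest of this notebook does, only through a real browser.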

![](https://crawlbase.com/blog/best-data-memes/data-scraping-meme.jpg)

## Let's scrape the Amazon website and search for smartphones under AED 1000

Note that the Chrome driver is downloaded and installed automatically, so you only need to download the code and run it to perform the scrape.

## Import Modules

In [139]:
from selenium import webdriver 
from time import sleep 
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from amazoncaptcha import AmazonCaptcha
from selenium.webdriver.common.by import By

## Automatically download and match the correct driver

In [140]:
service = Service(ChromeDriverManager().install())
browser = webdriver.Chrome(service=service)

## Load the page 

In [141]:

browser.get('https://www.amazon.ae')

In [142]:
browser.maximize_window()

## Bypassing the Captcha

In [143]:
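# Amazon does not always show a CAPTCHA page; if it is absent, this lookup raises NoSuchElementException
link = browser.find_element(By.XPATH,"//div[@class = 'a-row a-text-center']//img").get_attribute('src')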


In [144]:
captcha = AmazonCaptcha.fromlink(link)

In [145]:
captcha_value = captcha.solve()

In [146]:
# send_keys() returns None, so there is nothing useful to assign here
browser.find_element(By.ID, "captchacharacters").send_keys(captcha_value)

In [147]:
button = browser.find_element(By.CLASS_NAME, "a-button-text")

In [148]:
button.click()

## Searching for the required info


In [149]:
input_search = browser.find_element(By.ID,'twotabsearchtextbox')
search_button = browser.find_element(By.XPATH, "(//input[@type='submit'])[1]")

## Send the input to the webpage

In [150]:
input_search.send_keys("Smartphones under 1000")
sleep(1)
search_button.click()

## Scrape Products from Amazon 

In [158]:
products = []

for i in range(10):
    print('Scraping page', i + 1)
    
    product_elements = browser.find_elements(By.XPATH, "//a[contains(@class, 'a-link-normal s-line-clamp-4 s-link-style a-text-normal')]")
    
    for p in product_elements:
        products.append(p.text)  # or p.get_attribute("innerText")

    next_button = browser.find_element(By.XPATH, "//span[@class='a-list-item']")
    next_button.click()
    sleep(2)


Scraping page 1
Scraping page 2
Scraping page 3
Scraping page 4
Scraping page 5
Scraping page 6
Scraping page 7
Scraping page 8
Scraping page 9
Scraping page 10
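
The loop above assumes that all ten result pages exist and that a next button is found every time. As a hedged sketch of a more defensive version, the function below takes two hypothetical callables, `get_titles` and `go_next`, so it can be run without a browser. The `FakeBrowser` class is a stand-in invented for this example; with Selenium, `go_next` would click the next button and return `False` when the lookup raises `NoSuchElementException`.

```python
from typing import Callable, List


def scrape_pages(get_titles: Callable[[], List[str]],
                 go_next: Callable[[], bool],
                 max_pages: int = 10) -> List[str]:
    """Collect titles page by page, stopping early when there is no next page."""
    titles: List[str] = []
    for page in range(1, max_pages + 1):
        print('Scraping page', page)
        titles.extend(get_titles())
        # Stop at the page limit, or when go_next() reports there is no next page
        if page == max_pages or not go_next():
            break
    return titles


class FakeBrowser:
    """Hypothetical stand-in for a Selenium browser with a few result pages."""

    def __init__(self, pages: List[List[str]]):
        self.pages = pages
        self.index = 0

    def titles(self) -> List[str]:
        return list(self.pages[self.index])

    def next_page(self) -> bool:
        if self.index + 1 >= len(self.pages):
            return False  # no next button on the last page
        self.index += 1
        return True


fake = FakeBrowser([['Phone A', 'Phone B'], ['Phone C'], ['Phone D']])
collected = scrape_pages(fake.titles, fake.next_page, max_pages=10)
print(collected)
```

Keeping the page-turning logic separate from the Selenium calls also makes it possible to test the loop without opening Amazon at all.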


In [159]:
len(products)

560

In [160]:
products[:5]

['Nothing Phone (3a) 128 GB - mobile phone with 32 MP front camera, 30x ultra zoom, 50W fast charging and 6.77" FHD+ flexible AMOLED display - Black',
 'HMD Fusion 5G Android 14 Smartphone, 24GB RAM (12GB+12GB), 256GB Storage, Ultra-Slim Transparent Design, High-Resolution Display for Gaming & Media, 108MP Rear + 50MP Selfie Camera, Noir Indigo',
 'Samsung Galaxy A06 LTE, Android Smartphone, Dual SIM Mobile Phone, 4GB RAM, 64GB Storage, Gold (UAE Version)',
 'HONOR X9c Smart 5G 8+256GB, 120Hz 6.8 Inh Display, Drop Resitant, Scratch Resistant, Water Resistant Smartphone, Ocean Cyan - UAE Version',
 'Xiaomi Redmi Note 14 Mobile (Midnight Black 8GB RAM, 256GB Storage) | 108MP AI camera system |6.67" 120Hz AMOLED display | 5500 mAh (typ) battery']

## Conclusion

We scraped 10 pages of Amazon search results and stored the product names in a Python list. This is basic web scraping using Selenium. The data still needs to be cleaned and converted to a DataFrame for further analysis.
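
As a starting point for that cleaning step, here is a minimal sketch using only the standard library. It assumes the scraped titles may contain empty strings, stray whitespace and duplicates; the `raw_products` list and the `clean_titles` helper are illustrative and not part of the notebook above. The resulting list of dicts is written to CSV here, and the same `rows` list could be passed to `pandas.DataFrame(rows)` if pandas is installed.

```python
import csv
import io

# Illustrative input: two titles from the output above, plus an empty entry
# and a duplicate with extra whitespace
raw_products = [
    'Samsung Galaxy A06 LTE, Android Smartphone, Dual SIM Mobile Phone, 4GB RAM, 64GB Storage, Gold (UAE Version)',
    '',
    '  Samsung Galaxy A06 LTE, Android Smartphone, Dual SIM Mobile Phone, 4GB RAM, 64GB Storage, Gold (UAE Version)  ',
    'Nothing Phone (3a) 128 GB - mobile phone with 32 MP front camera, 30x ultra zoom, 50W fast charging and 6.77" FHD+ flexible AMOLED display - Black',
]


def clean_titles(titles):
    """Normalize whitespace, drop empty titles, and remove duplicates in order."""
    seen = set()
    cleaned = []
    for title in titles:
        normalized = ' '.join(title.split())
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned


# One dict per product, a shape that both csv.DictWriter and pandas accept
rows = [{'title': title} for title in clean_titles(raw_products)]

# Write to an in-memory buffer; swap in open('products.csv', 'w', newline='') to save a file
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['title'])
writer.writeheader()
writer.writerows(rows)

print(len(rows))
```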
