## Background:
webscraper.io provides a test e-commerce site that hosts information about various products. As part of this challenge, participants develop a web scraping solution that extracts laptop information from that site. The goal is to practice web scraping and data extraction using Python and relevant libraries.

### Website Navigation:
1. Develop a web scraping program that automates the task and extracts the required information
2. Data Extraction: For each laptop, extract the following details:
  - Laptop Title
  - Price
  - Description
  - Number of Reviews (if available)
  - URL
3. Data Storage: Save the extracted data in a structured format, such as a CSV file
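
The requirements above amount to one record with five fields per laptop, written out as rows of a CSV file. The sketch below shows one way this could look using only the standard library. The names `LaptopRecord` and `write_records` are hypothetical and do not appear in the notebook, and the sample values are illustrative, not scraped data.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class LaptopRecord:
    """Hypothetical container for one scraped laptop (not part of the notebook)."""
    title: str
    price: float
    description: str
    reviews: Optional[int]  # None when no review count is available
    url: str


def write_records(records, handle):
    """Write a header line plus one CSV row per record to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(LaptopRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))  # None is written as an empty field


# Illustrative values only; these are not real scraped data.
sample = [
    LaptopRecord("Example Laptop", 499.0, "15.6 inch, 8GB RAM", None,
                 "https://example.com/laptop/1"),
]
buffer = io.StringIO()
write_records(sample, buffer)
csv_text = buffer.getvalue()
```

Keeping the review count optional reflects the "if available" wording in the task: a missing count becomes an empty CSV field rather than an error.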


### **Beautiful Soup is a library for parsing HTML and XML documents and extracting data from them**

In [None]:
! pip install beautifulsoup4



### **With the help of the requests library, you send an HTTP request to a particular website**

In [None]:
! pip install requests



### **Selenium is a popular open-source automation framework that allows you to automate web browsers for testing and web scraping purposes.**

In [None]:
! pip install selenium



## **Importing Necessary Libraries**

In [None]:
import pandas as pd
from bs4 import BeautifulSoup
import requests
from selenium import webdriver
import time

## **Code Documentation**

In [None]:
driver = webdriver.Chrome()  # Initialize the Chrome WebDriver
url = 'https://webscraper.io/'
driver.get(url)

# Accept the cookies.
# XPath is a query language for selecting elements in XML or HTML documents.
# Here it locates the "Accept Cookies" button on the page, which is then
# clicked to dismiss the cookie banner.
driver.find_element('xpath','//*[@id="cookieBanner"]/div[2]/a').click()

time.sleep(2)  # Pause for 2 seconds to give the page time to respond.

# Navigate to the test site
driver.find_element('xpath','//*[@id="layout-footer"]/div/div[1]/div[3]/ul/li[6]/a').click()
time.sleep(2)  # Delays are added between actions so that each page loads properly.

# Navigate to the e-commerce site
driver.find_element('xpath','/html/body/div[1]/div[3]/div[1]/div[1]/h2/a').click()
time.sleep(2)

# Navigate to Computers
driver.find_element('xpath','//*[@id="side-menu"]/li[3]/a').click()
time.sleep(2)

# Navigate to Laptops
driver.find_element('xpath','//*[@id="side-menu"]/li[3]/ul/li[2]/a').click()
time.sleep(2)

# Create a BeautifulSoup object by parsing the HTML source of the current page with "html.parser".
soup = BeautifulSoup(driver.page_source,'html.parser')

# Create an empty DataFrame to store the extracted information.
df = pd.DataFrame()

# Extract the laptop name
a_tag = soup.find_all('a',class_ = 'title')
df['title'] = [i.text for i in a_tag]

# Extract the price (strip the "$" sign and convert to float)
h_tag = soup.find_all('h4',class_ ='price')
df['price'] = [float(i.text.strip("$")) for i in h_tag]

# Extract the description
p_tag = soup.find_all('p', class_ = 'description')
df['description'] = [i.text for i in p_tag]

# Extract the review count (the first token of the review text)
rev = soup.find_all('p',class_ = 'pull-right')
df['Reviwe'] = [int(i.text.split()[0]) for i in rev]

# Extract the URL (the href is relative, so prepend the site root)
df['link'] = ['https://webscraper.io'+i.get('href') for i in a_tag]
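
The list comprehensions above assume that every price is a plain number prefixed with "$" and that every product carries a review count, while the task only asks for the review count "if available". A minimal sketch of more defensive helpers follows, using only the standard library. The names `parse_price`, `parse_review_count` and `absolute_url` are hypothetical, and the input strings in the usage lines are illustrative rather than taken from the page.

```python
import re
from typing import Optional
from urllib.parse import urljoin

BASE_URL = "https://webscraper.io"  # the same root the notebook prepends to each href


def parse_price(text: str) -> float:
    """Convert a price string such as '$1,399.00' to a float."""
    return float(text.strip().lstrip("$").replace(",", ""))


def parse_review_count(text: Optional[str]) -> Optional[int]:
    """Return the first integer in text such as '14 reviews', or None if absent."""
    if not text:
        return None
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


def absolute_url(href: str) -> str:
    """Resolve a relative href against BASE_URL."""
    return urljoin(BASE_URL, href)


# Illustrative inputs (assumed formats, not copied from the site).
price = parse_price("$295.99")
missing_reviews = parse_review_count(None)
link = absolute_url("/test-sites/e-commerce/example")
```

Returning `None` for a missing review count keeps the row instead of raising an exception, and `urljoin` avoids doubled or missing slashes when the href format varies.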

In [None]:
df

| Unnamed: 0 | title | price | description | Reviwe | link |
|---|---|---|---|---|---|
| 0 | Asus VivoBook X4... | 295.99 | Asus VivoBook X441NA-GA190 Chocolate Black, 14... | 14 | https://webscraper.io/test-sites/e-commerce/al... |
| 1 | Prestigio SmartB... | 299.00 | Prestigio SmartBook 133S Dark Grey, 13.3" FHD ... | 8 | https://webscraper.io/test-sites/e-commerce/al... |
| 2 | Prestigio SmartB... | 299.00 | Prestigio SmartBook 133S Gold, 13.3" FHD IPS, ... | 12 | https://webscraper.io/test-sites/e-commerce/al... |
| 3 | Aspire E1-510 | 306.99 | 15.6", Pentium N3520 2.16GHz, 4GB, 500GB, Linux | 2 | https://webscraper.io/test-sites/e-commerce/al... |
| 4 | Lenovo V110-15IA... | 321.94 | Lenovo V110-15IAP, 15.6" HD, Celeron N3350 1.1... | 5 | https://webscraper.io/test-sites/e-commerce/al... |
| ... | ... | ... | ... | ... | ... |
| 112 | Lenovo Legion Y7... | 1399.00 | Lenovo Legion Y720, 15.6" FHD IPS, Core i7-770... | 8 | https://webscraper.io/test-sites/e-commerce/al... |
| 113 | Asus ROG Strix G... | 1399.00 | Asus ROG Strix GL702VM-GC146T, 17.3" FHD, Core... | 10 | https://webscraper.io/test-sites/e-commerce/al... |
| 114 | Asus ROG Strix G... | 1769.00 | Asus ROG Strix GL702ZC-GC154T, 17.3" FHD, Ryze... | 7 | https://webscraper.io/test-sites/e-commerce/al... |
| 115 | Asus ROG Strix G... | 1769.00 | Asus ROG Strix GL702ZC-GC209T, 17.3" FHD IPS, ... | 8 | https://webscraper.io/test-sites/e-commerce/al... |


In [None]:
df.shape

(117, 5)

### **Storing the data in CSV format**

In [None]:
df.to_csv('Laptop info using webscraper_io.csv')  # Save the DataFrame as a CSV file