# Web Scraping - Selenium.

Selenium is a browser automation tool. Through a web driver it controls your
browser of choice, so you can programmatically interact not just with the source
code of the page but with the rendered page, the way the browser sees it.

In [1]:
from selenium import webdriver
import pandas as pd
from selenium.webdriver.chrome.options import Options
import re

Here we set up Selenium with the Chrome web driver and disable JavaScript,
because a modal pop-up in an iframe complicates the interaction. It's possible to
find that iframe and the pop-up and click it away, but to save effort I just run
the browser without JavaScript.
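
For the case where JavaScript is left enabled, the iframe route mentioned above could look roughly like the sketch below. The function name `dismiss_popup` and both selectors in the usage note are hypothetical, since the notebook does not show the pop-up's real markup. The locator strategy is passed as the plain string `'css selector'`, which is the value behind Selenium's `By.CSS_SELECTOR`, so the sketch needs no imports.

```python
CSS = 'css selector'  # the string value of selenium's By.CSS_SELECTOR


def dismiss_popup(driver, frame_selector, button_selector):
    """Click a button inside an iframe if both exist. Return True if clicked.

    Both selectors are placeholders supplied by the caller; the real ones
    have to be read from the page's markup.
    """
    frames = driver.find_elements(CSS, frame_selector)
    if not frames:
        return False  # no such iframe on the page, nothing to dismiss
    # Element lookups only see the pop-up after switching into its iframe.
    driver.switch_to.frame(frames[0])
    try:
        buttons = driver.find_elements(CSS, button_selector)
        if not buttons:
            return False
        buttons[0].click()
        return True
    finally:
        # Always return to the main document so later lookups work again.
        driver.switch_to.default_content()
```

A call such as `dismiss_popup(driver, 'iframe.popup', 'button.close')` would go right after `driver.get(url)`; both selector strings here are made up for illustration.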

In [2]:
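chrome_options = Options()
# Content setting 2 blocks JavaScript, so the modal pop-up never loads.
chrome_options.add_experimental_option('prefs', {'profile.managed_default_content_settings.javascript': 2})
# Selenium locates chromedriver itself (for example on the PATH).
driver = webdriver.Chrome(options=chrome_options)
url = 'https://bazar.bg/obiavi/gradski-velosipedi/varna?condition=2'
css_selector = '.awrapper .listItemContainer .listItemLink'
driver.get(url)
# Wait up to 1 second for elements that are not in the DOM yet.
driver.implicitly_wait(1)
count_pages = 5  # maximum number of result pages to visit
data = []  # rows of [title, price, image]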

Here we iterate over the pages: on each page we collect the listings and then
click the "Next" button. Interacting with the rendered page is what Selenium is
strong at.

In [3]:
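from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

for page in range(count_pages):
    items = driver.find_elements(By.CSS_SELECTOR, css_selector)
    for item in items:
        title = item.find_element(By.CSS_SELECTOR, 'span.title').text
        price = item.find_element(By.CSS_SELECTOR, 'span.price').text
        # Keep only the digits; a listing without a numeric price becomes 0.
        price = re.sub(r'[^0-9]', '', price)
        price = int(price) if price else 0
        image = item.find_element(By.CSS_SELECTOR, 'img.cover').get_attribute('src')
        # Skip listings that lack a title or a numeric price.
        if title and price:
            data.append([title, price, image])
    try:
        # Go to the next page; stop when there is no "Next" link to click.
        link = driver.find_element(By.CSS_SELECTOR, '.paging a.next')
        link.click()
    except WebDriverException:
        break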
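
The price clean-up inside the loop can also be pulled out into a small function, which makes it easy to check without opening a browser. The sketch below uses the same digit-stripping rule as the loop; the name `parse_price` and the sample strings are made up for illustration and are not taken from the site.

```python
import re


def parse_price(text):
    """Return the digits found in `text` as an int, or 0 if there are none."""
    digits = re.sub(r'[^0-9]', '', text)
    return int(digits) if digits else 0


# Hypothetical price strings, for illustration only.
print(parse_price('120 лв'))         # 120
print(parse_price('По договаряне'))  # 0
```

This rule assumes whole-number prices: a price with a decimal part would have all of its digits run together, so it would need separate handling.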

The driver's quit command ends the driver process and closes the browser window.

In [4]:
driver.quit()
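
Because the quit call sits in its own cell, an error in the scraping loop would leave the browser window open. One way to guard against that is a try/finally wrapper, sketched here. The names `run_with_driver`, `make_driver` and `task` are hypothetical and not part of Selenium.

```python
def run_with_driver(make_driver, task):
    """Create a driver, run task(driver), and always quit the driver."""
    driver = make_driver()
    try:
        return task(driver)
    finally:
        # Runs on success and on error alike, so no browser is left behind.
        driver.quit()
```

With the setup from this notebook it could be called as `run_with_driver(lambda: webdriver.Chrome(options=chrome_options), scrape)`, where `scrape` is an assumed function that wraps the page loop and returns the collected rows.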

## Export to Excel with Pandas.

Here we create the dataframe from the list of rows, sort it by price, and save it
to an Excel document.

In [5]:
df = pd.DataFrame(data, columns=['title', 'price', 'image'])
df.sort_values(by='price', inplace=True)
df.to_excel('bikes-selenium.xlsx')
df.head()

| | title | price | image |
|---|---|---|---|
| 12 | Степенка за велосипед | 10 | https://cdn1.focus.bg/bazar/d7/fp/d7ed5717b397... |
| 14 | Калъф стойка за телефон монтаж за колело | 25 | https://cdn5.focus.bg/bazar//86/fp/861f50d5ecc... |
| 11 | Гуми 28 цола nimbus700×32c kenda цената е за 2... | 25 | https://cdn5.focus.bg/bazar//da/fp/da81f5744ca... |
| 55 | Детски велосипед 16 цола - разпродажба | 69 | https://cdn1.focus.bg/bazar/4b/fp/4b6361302ca6... |
| 49 | рамка 52см и вилка (феникс 28") | 70 | https://cdn5.focus.bg/bazar/8d/fp/8d88b108c0cc... |
