# Web scraping - Requests + BeautifulSoup

Here, the Requests library fetches the contents of each page, and BeautifulSoup
extracts the needed items from it.

In [1]:
import pandas as pd
from bs4 import BeautifulSoup
import requests
import re

## Get the data from the page by CSS.

Here, we get the contents of a URL using the ```requests``` library, then select
each listing's parent element with a CSS selector and parse the title, price, and
image URL from it.

In [2]:
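# Get the data for a page.
def getPageData(url):
    # Fetch the page; the 10-second timeout is an arbitrary choice.
    response = requests.get(url, timeout=10)
    # Fail early on HTTP errors instead of parsing an error page.
    response.raise_for_status()
    soup = BeautifulSoup(response.content, 'html.parser')
    css_selector = '.awrapper .listItemContainer .listItemLink'
    data = []
    for item in soup.select(css_selector):
        title_tag = item.select_one('span.title')
        price_tag = item.select_one('span.price')
        image_tag = item.select_one('img.cover')
        # select_one() returns None when nothing matches; skip such listings.
        if title_tag is None or price_tag is None or image_tag is None:
            continue
        title = title_tag.text.strip()
        # Keep digits only; a listing without a numeric price gets 0.
        digits = re.sub(r'[^0-9]', '', price_tag.text)
        price = int(digits) if digits else 0
        image = image_tag['src']
        # Drop listings with an empty title or a zero price.
        if title and price:
            # The src is protocol-relative (starts with //), so prepend the scheme.
            data.append([title, price, 'https:' + image])
    return data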
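
The price cleanup and image URL handling inside ```getPageData()``` can also be tried in isolation, without fetching a page. The sketch below is only an illustration: the helper names `parse_price` and `absolute_image_url` are hypothetical and not part of the notebook, and `urljoin` from the standard library is swapped in for the manual `'https:' +` prefix.

```python
import re
from urllib.parse import urljoin


# Hypothetical helper: not part of the original notebook.
def parse_price(text):
    """Return the digits found in `text` as an int, or 0 if there are none."""
    digits = re.sub(r'[^0-9]', '', text)
    return int(digits) if digits else 0


# Hypothetical helper: resolves relative and protocol-relative image paths.
def absolute_image_url(src, page_url='https://bazar.bg/'):
    """Resolve `src` against the URL of the page it was found on."""
    return urljoin(page_url, src)


print(parse_price('1 250 лв'))  # prints 1250
print(parse_price('no price'))  # prints 0
print(absolute_image_url('//cdn1.focus.bg/bazar/cover.jpg'))
# prints https://cdn1.focus.bg/bazar/cover.jpg
```

Note that stripping every non-digit character also removes decimal separators, so a price written as `12.50` would be read as `1250`. That matches the behaviour of the regular expression used in the cell above.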

## Main body - iterate pages.

Here we iterate over the 5 result pages and pass each page URL to ```getPageData()``` for parsing.

In [3]:
base_url = 'https://bazar.bg/obiavi/gradski-velosipedi/varna?condition=2'
count_pages = 5
data = []
for cur_page in range(1, count_pages + 1):
    print(f'Get page {cur_page} of {count_pages}')
    # The first page has no page parameter; later pages append &page=N.
    url = base_url if cur_page == 1 else f'{base_url}&page={cur_page}'
    data += getPageData(url)

Get page 1 of 5
Get page 2 of 5
Get page 3 of 5
Get page 4 of 5
Get page 5 of 5
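
The page URLs above are built by appending `&page=N` to a base URL that already has a query string. As a possible alternative, the sketch below builds the same URLs with the standard library's `urllib.parse`, which also works when the base URL has no query string yet. The helper name `build_page_url` is hypothetical, and the assumption that page 1 is the bare base URL is taken from the loop above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


# Hypothetical helper: not part of the original notebook.
def build_page_url(base_url, page):
    """Return `base_url` with page=N added to its query string."""
    if page == 1:
        return base_url  # the first page is the base URL itself
    parts = urlsplit(base_url)
    # Keep the existing query parameters and append the page number.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(('page', str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))


base_url = 'https://bazar.bg/obiavi/gradski-velosipedi/varna?condition=2'
urls = [build_page_url(base_url, page) for page in range(1, 4)]
for url in urls:
    print(url)
```

For pages 2 and 3 this yields the base URL followed by `&page=2` and `&page=3`, the same strings the loop produces by concatenation.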


## Export to Excel with Pandas.

Here we create a DataFrame from the list of rows, sort it by price, and save it
to an Excel document.

In [4]:
df = pd.DataFrame(data, columns=['title', 'price', 'image'])
df.sort_values(by='price', inplace=True)
df.to_excel('bikes-bs4.xlsx')
df.head()

|  | title | price | image |
|---|---|---|---|
| 43 | Степенка за велосипед | 10 | https://cdn1.focus.bg/bazar/d7/fp/d7ed5717b397... |
| 119 | Гуми 28 цола nimbus700×32c kenda цената е за 2... | 25 | https://cdn5.focus.bg/bazar//da/fp/da81f5744ca... |
| 120 | Калъф стойка за телефон монтаж за колело | 25 | https://cdn5.focus.bg/bazar//86/fp/861f50d5ecc... |
| 157 | Две капли 26” оборудвани с гуми + бонус | 55 | https://cdn1.focus.bg/bazar/5f/fp/5f43794ab809... |
| 36 | Детски велосипед 16 цола - разпродажба | 69 | https://cdn1.focus.bg/bazar/4b/fp/4b6361302ca6... |
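
The first column in the output is the DataFrame index, which keeps the original row positions after sorting and is also written to the Excel file. If that column is not wanted, one option is to reset the index after sorting and pass `index=False` when exporting. The sketch below uses made-up sample rows in the same `[title, price, image]` shape; the `export` function is hypothetical, and writing `.xlsx` files assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd

# Made-up sample rows in the same [title, price, image] shape as `data`.
rows = [
    ['Bike A', 120, 'https://example.com/a.jpg'],
    ['Bike B', 45, 'https://example.com/b.jpg'],
    ['Bike C', 80, 'https://example.com/c.jpg'],
]

df = pd.DataFrame(rows, columns=['title', 'price', 'image'])

# Sort by price and renumber the rows from 0 instead of keeping old positions.
sorted_df = df.sort_values(by='price').reset_index(drop=True)


# Hypothetical helper: not part of the original notebook.
def export(frame, path='bikes-bs4.xlsx'):
    """Write the frame to Excel without the index column."""
    frame.to_excel(path, index=False)


print(sorted_df['price'].tolist())  # prints [45, 80, 120]
```

Calling `export(sorted_df)` would then write a file whose first column is `title` rather than the index.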
