# Lab Exercise 1. Scraping Static Websites


This is the warm-up task for the first laboratory exercise. It consists of scraping static websites with BeautifulSoup.

It should be completed at home and presented in the laboratory.

**Total points: 2**

### Task Description

Scrape the information about the products on the following page:
https://clevershop.mk/product-category/mobilni-laptopi-i-tableti/

For each product scrape:


*   Product title (selector `'.wd-entities-title'`)
*   Product regular price (selector `'.woocommerce-Price-amount'`)
*   Product discount price (if available), same selector as regular price
*   URL to the product page
*   Add to cart button URL

***Hint: The products are spread over multiple pages; send a separate request for each page.***
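
One way to act on this hint is to build the URL of every listing page before sending any requests. The sketch below assumes the `page/<n>/` path pattern that the code cells in this notebook use; `build_page_urls` is a hypothetical helper name, and the page count of 14 is copied from the loops in this notebook rather than discovered from the site.

```python
def build_page_urls(base_url, max_pages):
    """Return listing URLs for pages 1..max_pages (assumes a 'page/<n>/' pattern)."""
    # Normalise the base so that exactly one slash separates the parts.
    base = base_url.rstrip("/") + "/"
    return [f"{base}page/{n}/" for n in range(1, max_pages + 1)]


BASE_URL = "https://clevershop.mk/product-category/mobilni-laptopi-i-tableti/"
page_urls = build_page_urls(BASE_URL, max_pages=14)

print(page_urls[0])
print(len(page_urls))
```

Each URL in `page_urls` can then be passed to `requests.get` in turn.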


Save the results as a DataFrame object

You can add as many code cells as you need.

________________________________________________________________

### Requirements

Import libraries and modules that you are going to use

In [2]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

### Send an HTTP request to the target website

In [3]:
url = 'https://clevershop.mk/product-category/mobilni-laptopi-i-tableti/'
response = requests.get(url)

Check the response status code.

In [4]:
print(response.status_code)

200


### Parse the response content with BeautifulSoup

In [5]:
soup = BeautifulSoup(response.content, 'html.parser')

In [37]:
soup.prettify()



### Extract data from the BeautifulSoup object using any selectors, attribute identifiers, etc.

* Product title (selector `'.wd-entities-title'`)
* Product regular price (selector `'.woocommerce-Price-amount'`)
* Product discount price (if available), same selector as regular price
* URL to the product page
* Add to cart button URL
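
Because the regular and the discount price share one selector, they have to be told apart by their surrounding elements. The sketch below runs on a hand-written HTML snippet that imitates typical WooCommerce markup, where a sale price sits inside `<ins>` and the old price inside `<del>`; that structure is an assumption here, not something copied from clevershop.mk. `extract_prices` and `SAMPLE_HTML` are hypothetical names.

```python
from bs4 import BeautifulSoup

# Hand-written sample imitating WooCommerce price markup (an assumption).
SAMPLE_HTML = """
<div class="product-wrapper">
  <h3 class="wd-entities-title"><a href="https://example.com/product/a/">Laptop A</a></h3>
  <span class="price">
    <del><span class="woocommerce-Price-amount"><bdi>18.999 den</bdi></span></del>
    <ins><span class="woocommerce-Price-amount"><bdi>15.999 den</bdi></span></ins>
  </span>
</div>
<div class="product-wrapper">
  <h3 class="wd-entities-title"><a href="https://example.com/product/b/">Laptop B</a></h3>
  <span class="price">
    <span class="woocommerce-Price-amount"><bdi>17.590 den</bdi></span>
  </span>
</div>
"""


def extract_prices(product):
    """Return (regular_price, discount_price); discount_price is None if absent."""
    # The first amount in document order is treated as the regular price.
    amounts = product.select(".woocommerce-Price-amount")
    regular = amounts[0].get_text(strip=True) if amounts else None
    # A sale price, when present, is the amount nested inside <ins>.
    sale = product.select_one("ins .woocommerce-Price-amount")
    discount = sale.get_text(strip=True) if sale else None
    return regular, discount


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
prices = [extract_prices(p) for p in soup.select(".product-wrapper")]
print(prices)
```

Returning `None` for a missing discount keeps the column easy to filter later in pandas.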

In [38]:
products = soup.select('.product-wrapper')
product_data = []

for product in products:
    title = product.select_one('.wd-entities-title').get_text().strip()
    price = product.select_one('.woocommerce-Price-amount bdi').get_text().strip() if product.select_one('.woocommerce-Price-amount bdi') else 'N/A'
    product_url = product.select_one('a')['href'] if product.select_one('a') else 'N/A'
    add_to_cart_url = product.select_one('.add_to_cart_button')['href'] if product.select_one('.add_to_cart_button') else 'N/A'
    
    product_data.append({
        'Title': title,
        'Price': price,
        'Product URL': product_url,
        'Add to Cart URL': add_to_cart_url
    })

print(product_data)

[{'Title': 'MON 27 LG 27GN850-B QHD IPS 1MS 144HZ HDMI', 'Price': '29.990\xa0ден', 'Product URL': 'https://clevershop.mk/product/mon-27-lg-27gn850-b-qhd-ips-1ms-144hz-hdmi/', 'Add to Cart URL': '?add-to-cart=12543'}, {'Title': 'MON 27 Philips 272B7QUPBEB/00 with USB-C Dock', 'Price': '20.590\xa0ден', 'Product URL': 'https://clevershop.mk/product/mon-27-philips-272b7qupbeb-00-with-usb-c-dock/', 'Add to Cart URL': '?add-to-cart=12619'}, {'Title': 'MON 27 Philips 272V8A/00', 'Price': '10.590\xa0ден', 'Product URL': 'https://clevershop.mk/product/mon-27-philips-272v8a-00/', 'Add to Cart URL': '?add-to-cart=12638'}, {'Title': 'Monitor 27 Philips 272E1GAJ/00 VA 1ms 144Hz', 'Price': '12.890\xa0ден', 'Product URL': 'https://clevershop.mk/product/monitor-27-philips-272e1gaj-00-va-1ms-144hz/', 'Add to Cart URL': '?add-to-cart=12618'}, {'Title': 'Philips 24″ 243V7QDSB', 'Price': '8.390\xa0ден', 'Product URL': 'https://clevershop.mk/product/philips-24%e2%80%b3-243v7qdsb/', 'Add to Cart URL': '?add
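
The prices in the output above are strings such as `'29.990\xa0ден'`, which contain a non-breaking space and the currency label. If numeric values are needed, a cleaning step along these lines could be applied. This is a minimal sketch that assumes the dot is a thousands separator and that prices have no fractional part; `parse_price` is a hypothetical helper.

```python
def parse_price(raw):
    """Convert a scraped price string to an int, or None if it has no digits.

    Assumes '.' is a thousands separator and prices are whole numbers.
    """
    # Keep only digit characters; this drops the dot, the non-breaking
    # space and the currency label in one pass.
    digits = "".join(ch for ch in raw if ch.isdigit())
    return int(digits) if digits else None


print(parse_price("29.990\xa0ден"))
print(parse_price("N/A"))
```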

Repeat the extraction process for each page of products

In [29]:
all_products_info = []      # raw product elements, reused by later cells
all_titles = []
regular_prices_list = []
discount_prices_list = []
product_page_urls = []
add_to_cart_urls = []
base_url = "https://clevershop.mk/product-category/mobilni-laptopi-i-tableti/"

# The category is assumed to span pages 1-14; the upper bound is hard-coded.
for page_num in range(1, 15):
    page_url = base_url + "page/" + str(page_num) + "/"
    response = requests.get(page_url)
    page_html = response.text
    soup = BeautifulSoup(page_html, "html.parser")
    products_on_page = soup.select('.product-wrapper')

    for product in products_on_page:
        product_page_urls.append(product.select_one('a').get('href'))
        add_to_cart_urls.append(product.select_one('div.wd-add-btn a.button').get('href'))

        product_title = product.select_one('div.product-wrapper h3.wd-entities-title').text.strip()
        regular_price = product.select_one('span.woocommerce-Price-amount bdi').text.strip()

        # The discount price, when present, is wrapped in an <ins> element.
        discount_price = product.select_one('ins span.woocommerce-Price-amount bdi')
        discount_prices_list.append(discount_price.text.strip() if discount_price else 'none')

        regular_prices_list.append(regular_price)
        all_titles.append(product_title)
        all_products_info.append(product)

In [30]:
soup.prettify()



In [31]:
def get_product_details(product):
    product_details = {}
    product_title = product.select_one('div.product-wrapper h3.wd-entities-title').text.strip()
    product_details['ProductTitle'] = product_title

    regular_price_text = product.select_one('span.woocommerce-Price-amount bdi').text.replace("ден", "")
    product_details['ProductRegularPrice'] = regular_price_text

    discount_price_element = product.select_one('ins span.woocommerce-Price-amount bdi')
    if discount_price_element is not None:
        discount_price_text = discount_price_element.text.replace("[", "").replace("]", "").replace(",", "")
        product_details['ProductDiscountPrice'] = discount_price_text
    else:
        # 'Нема попуст' means 'No discount'.
        product_details['ProductDiscountPrice'] = 'Нема попуст'

    product_page_url = product.select_one('a').get('href')
    add_to_cart_url = product.select_one('div.wd-add-btn a.button').get('href')
    product_details['URLToTheProductPage'] = product_page_url
    product_details['URLToTheButtonAdd'] = add_to_cart_url

    return product_details


In [39]:
def get_product_details(product):
    """Extract the details of a single product."""
    # The discount price, when present, is wrapped in an <ins> element.
    discount = product.select_one('ins span.woocommerce-Price-amount bdi')
    return {
        'title': product.select_one('.wd-entities-title').get_text().strip(),
        'regular_price': product.select_one('.woocommerce-Price-amount bdi').get_text().strip().replace("ден", ""),
        'discount_price': discount.text.strip().replace("ден", "") if discount else None,
        'product_url': product.select_one('a').get('href'),
        'add_to_cart_url': product.select_one('div.wd-add-btn a.button').get('href')
    }

def scrape_clevershop(base_url, max_pages=14):
    """Extract the product data from all pages."""
    all_products = []
    
    for page_num in range(1, max_pages + 1):
        try:
            # Fetch the page
            page_url = f"{base_url}page/{page_num}/"
            response = requests.get(page_url)
            response.raise_for_status()  # Raise on an HTTP error status
            
            # Parse the page
            soup = BeautifulSoup(response.text, "html.parser")
            products = soup.select('.product-wrapper')
            
            # Extract the details of each product
            for product in products:
                try:
                    product_details = get_product_details(product)
                    all_products.append(product_details)
                except AttributeError as e:
                    # Raised when an expected element is missing from a product
                    print(f"Error while extracting a product: {e}")
                    continue
                    
        except requests.RequestException as e:
            print(f"Error while loading page {page_num}: {e}")
            continue
            
    return all_products

# Usage:
base_url = "https://clevershop.mk/product-category/mobilni-laptopi-i-tableti/"
products = scrape_clevershop(base_url)

In [41]:
products

[{'title': 'Acer A315-23-A7KD',
  'regular_price': '17.590\xa0',
  'discount_price': None,
  'product_url': 'https://clevershop.mk/product/acer-a315-23-a7kd/',
  'add_to_cart_url': '?add-to-cart=21494'},
 {'title': 'Acer A315-23-R5P2',
  'regular_price': '27.490\xa0',
  'discount_price': None,
  'product_url': 'https://clevershop.mk/product/acer-a315-23-r5p2/',
  'add_to_cart_url': '?add-to-cart=21510'},
 {'title': 'ACER Aspire 1 A115-22',
  'regular_price': '18.999\xa0',
  'discount_price': '15.999\xa0',
  'product_url': 'https://clevershop.mk/product/acer-aspire-1-nx-a7pex-001/',
  'add_to_cart_url': '?add-to-cart=20826'},
 {'title': 'Acer Aspire 3 A315-23-R26A',
  'regular_price': '29.990\xa0',
  'discount_price': None,
  'product_url': 'https://clevershop.mk/product/acer-aspire-3-a315-23-r26a/',
  'add_to_cart_url': '?add-to-cart=21516'},
 {'title': 'Acer Aspire 3 A315-58-33WK',
  'regular_price': '24.490\xa0',
  'discount_price': None,
  'product_url': 'https://clevershop.mk/produ
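
The `add_to_cart_url` values in the output above are relative query strings such as `?add-to-cart=21494`. If absolute URLs are wanted, they can be resolved against the URL of the page they were scraped from, as a browser would do. The sketch below uses the standard library's `urljoin`; `absolute_cart_url` is a hypothetical helper, and whether the shop accepts the resulting URL has not been verified here.

```python
from urllib.parse import urljoin


def absolute_cart_url(page_url, add_to_cart_href):
    """Resolve a relative add-to-cart href against the page it appeared on."""
    return urljoin(page_url, add_to_cart_href)


page_url = "https://clevershop.mk/product-category/mobilni-laptopi-i-tableti/page/2/"
full_url = absolute_cart_url(page_url, "?add-to-cart=21494")
print(full_url)
```

A query-only reference keeps the path of the base URL and replaces only the query string, so the result points at the same listing page with the `add-to-cart` parameter attached.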

In [32]:
all_product_details = []
# all_products_info holds the raw product elements collected in the paging loop above.
for product in all_products_info:
    product_details = get_product_details(product)
    all_product_details.append(product_details)

### Create a pandas DataFrame with the scraped products

In [33]:
df = pd.DataFrame(all_product_details)

Inspect the DataFrame, then save it as `.csv`.

In [34]:
df

| | ProductTitle | ProductRegularPrice | ProductDiscountPrice | URLToTheProductPage | URLToTheButtonAdd |
|---|---|---|---|---|---|
| 0 | Acer A315-23-A7KD | 17.590 | Нема попуст | https://clevershop.mk/product/acer-a315-23-a7kd/ | ?add-to-cart=21494 |
| 1 | Acer A315-23-R5P2 | 27.490 | Нема попуст | https://clevershop.mk/product/acer-a315-23-r5p2/ | ?add-to-cart=21510 |
| 2 | ACER Aspire 1 A115-22 | 18.999 | 15.999 ден | https://clevershop.mk/product/acer-aspire-1-nx... | ?add-to-cart=20826 |
| 3 | Acer Aspire 3 A315-23-R26A | 29.990 | Нема попуст | https://clevershop.mk/product/acer-aspire-3-a3... | ?add-to-cart=21516 |
| 4 | Acer Aspire 3 A315-58-33WK | 24.490 | Нема попуст | https://clevershop.mk/product/21498/ | ?add-to-cart=21498 |
| ... | ... | ... | ... | ... | ... |
| 315 | Monitor 27 Philips 272E1GAJ/00 VA 1ms 144Hz | 12.890 | Нема попуст | https://clevershop.mk/product/monitor-27-phili... | ?add-to-cart=12618 |
| 316 | Philips 24″ 243V7QDSB | 8.390 | Нема попуст | https://clevershop.mk/product/philips-24%e2%80... | ?add-to-cart=12396 |
| 317 | Philips 27″ 278E1A/00 4K UHD IPS | 18.990 | Нема попуст | https://clevershop.mk/product/hp-27%e2%80%b3-2... | ?add-to-cart=12218 |
| 318 | Philips 279C9-00 MON LED 27″ 3840 x 2160 5Ms 6... | 26.990 | Нема попуст | https://clevershop.mk/product/philips-279c9-00... | ?add-to-cart=12578 |


In [35]:
df.to_csv("products_lab1.csv", index=False)