## Data Scraping - An ecommerce website (KILIMALL)

## OBJECTIVE


This notebook **scrapes product data** from the **Phones & Accessories** category on [Kilimall Kenya](https://www.kilimall.co.ke). For each listing on the first 50 pages, the script collects the product name, price, review count, star rating, and product URL. The goal is to build a dataset for further analysis, such as price comparison or product trend insights.

Data source: https://www.kilimall.co.ke/

In [6]:
# Import the required libraries
import requests
import random
import time
import csv
import pandas as pd
from bs4 import BeautifulSoup

In [10]:
# Initialise an empty list to store the scraped rows
product_data = []
# Base URL of the website, used to build absolute product links
base_url = 'https://www.kilimall.co.ke'
# Loop through the first 50 listing pages
for i in range(1, 51):
    url = f'https://www.kilimall.co.ke/category/phones-accessories?id=872&form=category&source=category|allCategory|Phones+&+Accessories&page={i}'
    # Send the HTTP request; the timeout keeps a stalled page from hanging the loop
    try:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        print(f"[ERROR] Page {i} fetch failed: {e}")
        continue
    # Parse the page HTML
    soup = BeautifulSoup(response.content, 'html.parser')
    # Find all product cards on the current page
    products = soup.find_all('div', class_='product-item')
    print(f'Page {i}: found {len(products)} products')
    # Extract the fields from each product card
    for product in products:
        try:
            name = product.find('p', class_='product-title').text.strip()
            price = product.find('div', class_='product-price').text
            price = int(price.replace('KSh', '').replace(',', '').strip())
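            # The review count is wrapped in parentheses; default to '0' when the element is absent
            review_element = product.find('span', class_='reviews')
            review_count = review_element.text.strip().strip('()') if review_element else '0'
            # Rating = number of star elements marked as checked
            stars_rating = product.find_all('div', class_='van-rate__item')
            rating = sum(1 for s in stars_rating if s.get('aria-checked') == 'true')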

            # The product link is joined to the base URL to form an absolute URL
            link = product.find('a', href=True)['href']
            product_url = base_url + link
            # Append the extracted fields as one row
            product_data.append([name, price, review_count, rating, product_url])

        except Exception as e:
            print(f"Error while scraping: {e}")

    # Sleep once per page, after processing all products, to avoid overloading the server
    waiting_time = random.uniform(2, 5)
    print(f"Waiting {waiting_time:.1f}s before next page…")
    time.sleep(waiting_time)

print(f"Done. Total items: {len(product_data)}")

Page 1: found 36 products
Waiting 3.5s before next page…
Page 2: found 36 products
Waiting 3.5s before next page…
Page 3: found 36 products
Waiting 4.4s before next page…
Page 4: found 36 products
Waiting 3.2s before next page…
Page 5: found 36 products
Waiting 3.3s before next page…
Page 6: found 36 products
Waiting 3.2s before next page…
Page 7: found 36 products
Waiting 2.8s before next page…
Page 8: found 36 products
Waiting 3.7s before next page…
Page 9: found 36 products
Waiting 2.4s before next page…
Page 10: found 36 products
Waiting 2.7s before next page…
Page 11: found 36 products
Waiting 4.4s before next page…
Page 12: found 36 products
Waiting 4.2s before next page…
Page 13: found 36 products
Waiting 3.5s before next page…
Page 14: found 36 products
Waiting 4.4s before next page…
Page 15: found 36 products
Waiting 4.1s before next page…
Page 16: found 36 products
Waiting 4.4s before next page…
Page 17: found 36 products
Waiting 3.8s before next page…
Page 18: found 36 produ
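
The field cleaning inside the loop above (stripping `KSh` and commas from the price, and the parentheses from the review count) can also be isolated into small helpers that are easy to test without hitting the site. The following is a minimal sketch, not part of the original notebook: the function names `parse_price` and `parse_review_count` are hypothetical, and the sample strings are assumptions based on what the scraper strips from the page text.

```python
import re


def parse_price(text):
    """Return the first number in a price label as an int, or None if there is none.

    Assumes labels look like 'KSh 4,699' (currency prefix, comma thousands separator).
    """
    match = re.search(r"\d[\d,]*", text or "")
    if match is None:
        return None
    return int(match.group().replace(",", ""))


def parse_review_count(text):
    """Return the review count as an int; missing or empty text counts as 0.

    Assumes the count is rendered as digits, optionally wrapped in parentheses.
    """
    match = re.search(r"\d+", text or "")
    return int(match.group()) if match else 0


# Sample inputs are illustrative, not captured from the live site.
print(parse_price("KSh 4,699"))
print(parse_review_count("(807)"))
print(parse_review_count(None))
```

Returning `None` for an unparseable price, rather than raising, would let the loop keep the row and leave the decision about missing prices to the analysis step.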

In [11]:
with open('kilimall_phone_accessories.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    # Write the header row
    writer.writerow(['Product Name', 'Price (KSh)', 'Review Count', 'Rating', 'Product URL'])
    # Write the scraped rows
    writer.writerows(product_data)
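
Before writing, it may be worth dropping repeated rows. Whether Kilimall repeats listings across pages is an assumption here, not something the notebook verifies. The sketch below keeps the first row seen for each product URL and writes the result with the same header; the helper names and the sample rows are hypothetical, and an in-memory buffer stands in for the file so the example has no side effects.

```python
import csv
import io

HEADER = ['Product Name', 'Price (KSh)', 'Review Count', 'Rating', 'Product URL']


def dedupe_by_url(rows):
    """Keep the first row for each product URL (the last column), preserving order."""
    seen = set()
    unique = []
    for row in rows:
        url = row[-1]
        if url not in seen:
            seen.add(url)
            unique.append(row)
    return unique


def rows_to_csv_text(rows):
    """Serialise the header and rows to CSV text using csv.writer."""
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer)
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows in the same column order as product_data.
sample_rows = [
    ['Phone A', 4699, '807', 3, 'https://www.kilimall.co.ke/listing/1-phone-a'],
    ['Phone B', 4790, '28', 4, 'https://www.kilimall.co.ke/listing/2-phone-b'],
    ['Phone A', 4699, '807', 3, 'https://www.kilimall.co.ke/listing/1-phone-a'],
]

unique_rows = dedupe_by_url(sample_rows)
csv_text = rows_to_csv_text(unique_rows)
print(csv_text)
```

In the notebook itself, `dedupe_by_url(product_data)` could be passed to `writer.writerows` in place of `product_data`.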

In [7]:
# Load the saved CSV into a DataFrame
phone_accessories_df = pd.read_csv('kilimall_phone_accessories.csv')

In [8]:
phone_accessories_df

| | Product Name | Price (KSh) | Review Count | Rating | Product URL |
|---|---|---|---|---|---|
| 0 | Refurbished OPPO R9 oppo r9 -x9009 - 5.5'', 64... | 4699 | 807 | 3 | https://www.kilimall.co.ke/listing/2288414-ref... |
| 1 | Refurbished Oppo A83 Gold Red Blue Black 4G+64... | 4790 | 28 | 4 | https://www.kilimall.co.ke/listing/1001081296-... |
| 2 | {Anniversary}XIAOMI Redmi 14C 128GB Storage Up... | 10999 | 593 | 4 | https://www.kilimall.co.ke/listing/1000332494-... |
| 3 | Refurbished OPPO Reno 2z 128GB+8GB 6.5 inch 48... | 8590 | 541 | 4 | https://www.kilimall.co.ke/listing/2543007-ref... |
| 4 | Air Pro3 MAX TWS Macaron Color inPods13 Pro 3 ... | 389 | 5591 | 4 | https://www.kilimall.co.ke/listing/2467681-air... |
| ... | ... | ... | ... | ... | ... |
| 1795 | NEW ARRIVALS!!!ITEL A70 4G AWESOME 256GB Stora... | 14999 | 0 | 5 | https://www.kilimall.co.ke/listing/1001087857-... |
| 1796 | Wallet Flip Cover for Tecno Camon 20 / Camon 2... | 1299 | 2 | 5 | https://www.kilimall.co.ke/listing/2783224-wal... |
| 1797 | Refurbished Sony Xperia 5 II Single SIM 128GB ... | 18999 | 1 | 5 | https://www.kilimall.co.ke/listing/2737759-ref... |
| 1798 | Fast Charging powerbank with inbuilt charging ... | 1800 | 1 | 5 | https://www.kilimall.co.ke/listing/1001303320-... |
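
Row 1795 in the preview above has a review count of 0 but a rating of 5. A rating with no reviews behind it may be a page default rather than a customer score; that interpretation is an assumption, not something the scrape confirms. A minimal sketch of one way to handle it is to treat such ratings as missing before analysis. The helper name `rating_or_none` is hypothetical.

```python
def rating_or_none(review_count, rating):
    """Return the star rating only when at least one review backs it, else None."""
    return rating if int(review_count) > 0 else None


# (review_count, rating) pairs copied from rows 0, 4 and 1795 of the preview.
preview = [(807, 3), (5591, 4), (0, 5)]

cleaned = [rating_or_none(count, stars) for count, stars in preview]
print(cleaned)  # [3, 4, None]
```

The helper calls `int()` on the count because the scraper stores review counts as strings in `product_data`; after `pd.read_csv` the column would typically already be numeric.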
