# Cellarcentral Website Scraper Project (BeautifulSoup)
### About the Cellarcentral Website

At Cellar Central, our mission is to provide logistics solutions that deliver the best shopping experience one customer at a time. We are the leading retail distributor of premium wines and spirits in Nigeria. Our order channels include phone, email, social media and website. We deliver directly to customers’ doorstep and our reach covers major Nigerian cities including Lagos, Port Harcourt, Onitsha, Aba, Abuja and Benin.

**Store Location**

45 Rasaq Balogun Street, Surulere, Lagos

Phone: 0807 806 1111,  0906 415 6296

Email: info@cellarcentral.ng

**Opening Times**

Mondays to Fridays
8 AM - 5 PM

Saturdays
9 AM - 1 PM

**Payment Methods**
We support payments with debit/credit cards (powered by Paystack). We also accept bank transfers to our specified bank account. Orders are shipped on confirmation of payment.

Visit the site for more information: https://www.cellarcentral.ng/

### Five product categories to scrape and their info
##### 1. Cellarcentral Wine Products
Wines are alcoholic beverages made from fermented grape juice. Cellar Central stocks a wide range of wines (red, white, rosé, sparkling, etc.) from different parts of the world.

##### 2. Cellarcentral Spirit Products
Spirits are alcoholic beverages that are distilled after fermentation, rather than only fermented as wines are. The spirits category includes rum, whisky, cognac, aperitifs and so on.

##### 3. Cellarcentral Champagne and Sparkling Products
Champagne is a sparkling wine made in the Champagne region of France and is known for its lively bubbles. Other sparkling wines are also bubbly, but are described here as less lively than champagne.

##### 4. Cellarcentral Hampers Products
Hampers: Buy premium-quality hampers and gift packs for Christmas, New Year, Valentine's Day, anniversaries, Mother's Day, Father's Day, etc. and have them delivered to your loved ones.

##### 5. Cellarcentral Gift Ideas Products
Gift ideas: Get your ideal Father's Day, Christmas and birthday gifts for your friends, colleagues and family members this season from our special offers.

**NOTE: The alcoholic beverages in all categories have an ABV of 5.5% or greater.**

### Problem statement 
1. Scrape the products in each category along with their details: product name, price, image, capacity and link.
2. Combine Cellarcentral Wine Products, Cellarcentral Spirit Products, Cellarcentral Champagne and Sparkling Products and Cellarcentral Gift Ideas Products into one DataFrame.

##### Importing necessary libraries

In [1]:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np 

# 1. Scraping all Products

### Scraping Cellarcentral Wine Product List

In [2]:
CellarCentra_wine_list = []

# step 1: pagination, i.e. navigating through multiple listing pages
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'}
for i in range(1, 5):
    # request the page; in our case the status code is 200
    permission = requests.get(f'https://www.cellarcentral.ng/wines?page={i}', headers=headers)

    # parse the HTML with BeautifulSoup to get it in a more readable and accessible form
    soup = BeautifulSoup(permission.content, 'html.parser')

    # step 2: collect the HTML tags that hold the product listings
    all_CellarCentra_wine_products = soup.find_all('div', class_='product-thumb')

    # loop through the tags above to scrape info about each product
    for product in all_CellarCentra_wine_products:
        product_name = product.find('div', class_='caption').a.text.strip().split('*')[0]
        product_price = product.find('div', class_='caption').p.text.strip()
        product_link = product.find('div', class_='image').a['href']

        # get the image link for each product (the last 'image' div wins if there are several)
        image_tags = product.find_all('div', class_='image')
        for image in image_tags:
            image = image.find('img', class_='img-fluid')
        product_image = image['src']

        # the volume follows a '*' in the caption; if there is none, store a null value
        try:
            product_vol = product.find('div', class_='caption').a.text.strip().split('*')[1]
        except IndexError:
            product_vol = np.nan

        # save the product's fields in a dictionary
        CellarCentra_wine_products = {
            'wine_name': product_name,
            'wine_price': product_price,
            'wine_vol': product_vol,
            'wine_image': product_image,
            'wine_product_link': product_link
        }

        # append the dictionary to the list created at the top of this cell
        CellarCentra_wine_list.append(CellarCentra_wine_products)
        #print('Wine Product Info Saving :' , CellarCentra_wine_products['wine_name'])

# step 3: output

df_wine = pd.DataFrame(CellarCentra_wine_list)
df_wine.to_csv('CellarCentra wine data.csv', index = False)  
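The per-product extraction in the cell above can also be tried offline against a small HTML snippet. The sketch below is only an illustration: `SAMPLE_CARD` is hypothetical markup written to match the selectors used in the notebook (the real page may differ), and `parse_product_card` is a helper name introduced here, not part of the original code. Unlike the notebook cell, it strips whitespace around the name and volume and uses `None` when there is no volume.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mirrors the selectors used in the notebook.
SAMPLE_CARD = """
<div class="product-thumb">
  <div class="image">
    <a href="https://example.com/wines?product_id=1">
      <img class="img-fluid" src="https://example.com/img/1.jpg">
    </a>
  </div>
  <div class="caption">
    <a href="https://example.com/wines?product_id=1">Sample Red * 75cl</a>
    <p>₦1,000.00</p>
  </div>
</div>
"""


def parse_product_card(card):
    """Return the fields of one 'product-thumb' element as a plain dict."""
    caption = card.find('div', class_='caption')
    image_div = card.find('div', class_='image')
    # Split "name * volume" once; sep is empty when the caption has no '*'.
    name, sep, vol = caption.a.text.strip().partition('*')
    return {
        'name': name.strip(),
        'price': caption.p.text.strip(),
        'vol': vol.strip() if sep else None,
        'image': image_div.find('img', class_='img-fluid')['src'],
        'link': image_div.a['href'],
    }


soup = BeautifulSoup(SAMPLE_CARD, 'html.parser')
card = soup.find('div', class_='product-thumb')
print(parse_product_card(card))
```

Keeping the parsing in one function makes it possible to check the selectors without sending any requests to the site.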

### Scraping Cellarcentral Spirit Product List

In [3]:
CellarCentra_spirit_list = []

# step 1
for j in range(1, 8):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'}
    permission = requests.get(f'https://www.cellarcentral.ng/spirits?page={j}', headers=headers)
    soup = BeautifulSoup(permission.content, 'html.parser')

    # step 2
    all_CellarCentra_spirit_products = soup.find_all('div', class_='product-thumb')

    for product in all_CellarCentra_spirit_products:
        product_name = product.find('div', class_ = 'caption').a.text.strip().split('*')[0]
        product_price = product.find('div', class_ = 'caption').p.text.strip()
        product_link = product.find('div', class_ = 'image').a['href']
        
        # get the image link for each product
        image_tags = product.find_all('div', class_='image')
        for image in image_tags:
            image = image.find('img', class_='img-fluid')
        product_image = image['src']

        # the volume follows a '*' in the caption; if there is none, store a null value
        try:
            product_vol = product.find('div', class_='caption').a.text.strip().split('*')[1]
        except IndexError:
            product_vol = np.nan

        CellarCentra_spirit_products = {
                        'spirit_name' : product_name,
                        'spirit_price' : product_price,
                        'spirit_vol' : product_vol,
                        'spirit_image' : product_image,
                        'spirit_product_link' : product_link
                          }

        CellarCentra_spirit_list.append(CellarCentra_spirit_products)
        #print('spirit Product Info Saving :' , CellarCentra_spirit_products['spirit_name'])

    
# step 3: output
df_spirit = pd.DataFrame(CellarCentra_spirit_list)
df_spirit.to_csv('CellarCentra spirit data.csv', index = False)
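The scraping cells assume every request succeeds and never inspect the response status. A minimal sketch of a more defensive fetch is shown below. The helper name `fetch_html` and its `retries` and `delay` parameters are assumptions made for illustration; the `get` argument is expected to behave like `requests.get`, returning an object with `status_code` and `content` attributes.

```python
import time


def fetch_html(url, get, retries=3, delay=1.0, **kwargs):
    """Call get(url, **kwargs) and return the body; retry on a non-200 status."""
    last_status = None
    for attempt in range(1, retries + 1):
        response = get(url, **kwargs)
        if response.status_code == 200:
            return response.content
        last_status = response.status_code
        if attempt < retries:
            # Pause before trying again so the site is not hit in a tight loop.
            time.sleep(delay)
    raise RuntimeError(f'{url} returned status {last_status} after {retries} attempts')


# Possible use inside a scraping loop (not executed here):
# html = fetch_html(url, requests.get, headers=headers, timeout=10)
# soup = BeautifulSoup(html, 'html.parser')
```

Passing the request function in as an argument keeps the helper easy to exercise with a stand-in object, without touching the network.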

### Scraping Cellarcentral Champagne Product List

In [4]:
CellarCentra_Champagne_list = []

# step 1
for x in range(1, 4):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'}
    permission = requests.get(f'https://www.cellarcentral.ng/champagne-and-sparkling?page={x}', headers=headers)
    soup = BeautifulSoup(permission.content, 'html.parser')

    # step 2
    all_CellarCentra_Champagne_products = soup.find_all('div', class_='product-thumb')

    for product in all_CellarCentra_Champagne_products:
        product_name = product.find('div', class_ = 'caption').a.text.strip().split('*')[0]
        product_price = product.find('div', class_ = 'caption').p.text.strip()
        product_link = product.find('div', class_ = 'image').a['href']
        
        # get the image link for each product
        image_tags = product.find_all('div', class_='image')
        for image in image_tags:
            image = image.find('img', class_='img-fluid')
        product_image = image['src']

        # the volume follows a '*' in the caption; if there is none, store a null value
        try:
            product_vol = product.find('div', class_='caption').a.text.strip().split('*')[1]
        except IndexError:
            product_vol = np.nan

        CellarCentra_Champagne_products = {
                        'Champagne_name' : product_name,
                        'Champagne_price' : product_price,
                        'Champagne_vol' : product_vol,
                        'Champagne_image' : product_image,
                        'Champagne_product_link' : product_link
                          }

        CellarCentra_Champagne_list.append(CellarCentra_Champagne_products)
        #print('Champagne Product Info Saving :' , CellarCentra_Champagne_products['Champagne_name'])

    
# step 3: output
df_champagne = pd.DataFrame(CellarCentra_Champagne_list)
df_champagne.to_csv('CellarCentra champagne data.csv', index = False)

### Scraping Cellarcentral Gift Ideas Product List

In [5]:
CellarCentra_gift_ideas_list = []

# step 1
for y in range(1, 4):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'}
    permission = requests.get(f'https://www.cellarcentral.ng/special-offers?page={y}', headers=headers)
    soup = BeautifulSoup(permission.content, 'html.parser')

    # step 2
    all_CellarCentra_gift_ideas_products = soup.find_all('div', class_='product-thumb')

    for product in all_CellarCentra_gift_ideas_products:
        product_name = product.find('div', class_ = 'caption').a.text.strip().split('*')[0]
        product_price = product.find('div', class_ = 'caption').p.text.strip()
        product_link = product.find('div', class_ = 'image').a['href']
        
        # get the image link for each product
        image_tags = product.find_all('div', class_='image')
        for image in image_tags:
            image = image.find('img', class_='img-fluid')
        product_image = image['src']

        # the volume follows a '*' in the caption; if there is none, store a null value
        try:
            product_vol = product.find('div', class_='caption').a.text.strip().split('*')[1]
        except IndexError:
            product_vol = np.nan

        CellarCentra_gift_ideas_products = {
                        'gift_ideas_name' : product_name,
                        'gift_ideas_price' : product_price,
                        'gift_ideas_vol' : product_vol,
                        'gift_ideas_image' : product_image,
                        'gift_ideas_product_link' : product_link
                          }

        CellarCentra_gift_ideas_list.append(CellarCentra_gift_ideas_products)
        #print('gift_ideas Product Info Saving :' , CellarCentra_gift_ideas_products['gift_ideas_name'])

    
# step 3: output
df_gift_ideas = pd.DataFrame(CellarCentra_gift_ideas_list)
df_gift_ideas.to_csv('CellarCentra gift idea data.csv', index = False)
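The four paginated cells differ only in the URL slug and the number of pages they visit. One way to make that explicit is a small lookup table plus a URL builder, sketched below. `BASE_URL`, `CATEGORY_PAGES` and `page_urls` are names introduced here for illustration, and the page counts are simply the upper bounds of the `range()` calls in the notebook, so they may not match the live site.

```python
BASE_URL = 'https://www.cellarcentral.ng'

# slug -> last listing page, read off the range() bounds used in the notebook
CATEGORY_PAGES = {
    'wines': 4,
    'spirits': 7,
    'champagne-and-sparkling': 3,
    'special-offers': 3,
}


def page_urls(slug, last_page):
    """Return the listing URLs for pages 1..last_page of one category."""
    return [f'{BASE_URL}/{slug}?page={n}' for n in range(1, last_page + 1)]


for slug, last_page in CATEGORY_PAGES.items():
    urls = page_urls(slug, last_page)
    print(slug, len(urls), urls[0])
```

With the URLs generated in one place, a single scraping loop could replace the four near-identical cells.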

### Scraping Cellarcentral Hampers Product List

In [6]:
CellarCentra_hampers_list = []

# step 1
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'}
permission = requests.get('https://www.cellarcentral.ng/hampers', headers =  headers)
soup = BeautifulSoup(permission.content, 'html.parser')

# step 2
all_CellarCentra_hampers_products = soup.find_all('div', class_ = 'product-thumb')

for product in all_CellarCentra_hampers_products:
    product_name = product.find('div', class_ = 'caption').a.text.strip()
    product_price = product.find('div', class_ = 'caption').p.text.strip()
    product_link = product.find('div', class_ = 'image').a['href']
    
    # get the image link for each product
    image_tags = product.find_all('div', class_='image')
    for image in image_tags:
        image = image.find('img', class_='img-fluid')
    product_image = image['src']
        
    CellarCentra_hampers_products = {
                        'hampers_name' : product_name,
                        'hampers_price' : product_price,
                        'hampers_image' : product_image,
                        'hampers_product_link' : product_link
                          }

    CellarCentra_hampers_list.append(CellarCentra_hampers_products)
    #print('hampers Product Info Saving :' , CellarCentra_hampers_products['hampers_name'])

    
# step 3: output
df_hampers = pd.DataFrame(CellarCentra_hampers_list)
df_hampers.to_csv('CellarCentra hampers data.csv', index = False)

# 2. Combine all Cellarcentral Products
### Reading the scraped data into the workspace

In [7]:
df_wine = pd.read_csv('CellarCentra wine data.csv')
df_spirit = pd.read_csv('CellarCentra spirit data.csv')
df_champagne = pd.read_csv('CellarCentra champagne data.csv')
df_gift_idea = pd.read_csv('CellarCentra gift idea data.csv')
df_hampers = pd.read_csv('CellarCentra hampers data.csv')

### Accessing the data

In [8]:
df_wine.head(3)

|   | wine_name | wine_price | wine_vol | wine_image | wine_product_link |
|---|---|---|---|---|---|
| 0 | B&G Cuvee Reserve Speciale Rose | ₦3,900.00 | 75cl | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines?product_id=704 |
| 1 | B&G Reserve Chardonnay | ₦5,200.00 | 75CL | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines/b-and-g-res... |
| 2 | Boekenhoutskloof Semillon White | ₦13,000.00 | 75cl | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines?product_id=563 |


In [9]:
df_spirit.head(3)

|   | spirit_name | spirit_price | spirit_vol | spirit_image | spirit_product_link |
|---|---|---|---|---|---|
| 0 | Absolut Vanilia | ₦6,000.00 | 1L | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/spirits?product_i... |
| 1 | Amaro Di Angostura | ₦7,700.00 | 70CL | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/spirits?product_i... |
| 2 | American Honey | ₦9,600.00 | 75CL | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/spirits/american-... |


In [10]:
df_champagne.head(3)

|   | Champagne_name | Champagne_price | Champagne_vol | Champagne_image | Champagne_product_link |
|---|---|---|---|---|---|
| 0 | Armand de Brignac Brut Gold | ₦255,200.00 | 75cl (Ace of Spade) | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/champagne-and-spa... |
| 1 | Armand de Brignac Rosé | ₦359,600.00 | 75cl (Ace of Spade) | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/champagne-and-spa... |
| 2 | Billecart Salmon Brut Rose | ₦58,500.00 | 75CL | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/champagne-and-spa... |
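The volume column mixes spellings such as `75cl`, `75CL`, `1L` and `75cl (Ace of Spade)`. If a numeric capacity is wanted later, one possible approach is to extract the number and unit with a regular expression and convert everything to centilitres. The helper below is a sketch; the name `capacity_to_cl` and the choice of centilitres as the target unit are assumptions, not part of the original notebook.

```python
import re

# Conversion factors from each recognised unit to centilitres.
_UNIT_TO_CL = {'ml': 0.1, 'cl': 1.0, 'l': 100.0}


def capacity_to_cl(raw):
    """Return the capacity in centilitres, or None when no volume is found."""
    if not isinstance(raw, str):
        return None  # e.g. NaN for products scraped without a volume
    match = re.search(r'(\d+(?:\.\d+)?)\s*(ml|cl|l)\b', raw.lower())
    if match is None:
        return None
    value, unit = match.groups()
    return float(value) * _UNIT_TO_CL[unit]


for text in ['75cl', '75CL', '1L', '75cl (Ace of Spade)']:
    print(text, '->', capacity_to_cl(text))
```

Trailing notes such as "(Ace of Spade)" are ignored because only the first number-plus-unit match is used.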


In [11]:
df_gift_idea.head(3)

|   | gift_ideas_name | gift_ideas_price | gift_ideas_vol | gift_ideas_image | gift_ideas_product_link |
|---|---|---|---|---|---|
| 0 | Angostura 1824 12yrs Old | ₦29,600.00 | 70CL | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/special-offers?pr... |
| 1 | Armand de Brignac Rosé | ₦359,600.00 | 75cl (Ace of Spade) | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/special-offers?pr... |
| 2 | BARON OTARD® VSOP Cognac | ₦30,700.00 | 70cl | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/special-offers/ba... |
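The prices are scraped as text such as `₦3,900.00`, so they cannot be sorted or summed as numbers yet. A minimal sketch of a converter follows; `parse_naira` is a hypothetical helper name, and it assumes every price uses the `₦` sign, comma thousands separators and a decimal point, as in the rows shown above.

```python
from decimal import Decimal


def parse_naira(price_text):
    """Convert a string such as '₦3,900.00' to a Decimal amount."""
    cleaned = price_text.replace('₦', '').replace(',', '').strip()
    return Decimal(cleaned)


print(parse_naira('₦3,900.00'))
print(parse_naira('₦359,600.00'))
```

`Decimal` is used here to avoid floating-point rounding on currency values; a plain `float` would also work if exact cents are not important.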


### Add a new column category to each dataframe

In [12]:
# assigning a scalar broadcasts the same label to every row
df_wine['category'] = 'wine'
df_spirit['category'] = 'spirit'
df_champagne['category'] = 'champagne'
df_gift_idea['category'] = 'gift idea'

df_gift_idea.head(2)

|   | gift_ideas_name | gift_ideas_price | gift_ideas_vol | gift_ideas_image | gift_ideas_product_link | category |
|---|---|---|---|---|---|---|
| 0 | Angostura 1824 12yrs Old | ₦29,600.00 | 70CL | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/special-offers?pr... | gift idea |
| 1 | Armand de Brignac Rosé | ₦359,600.00 | 75cl (Ace of Spade) | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/special-offers?pr... | gift idea |


In [13]:
df_wine.head(2)

|   | wine_name | wine_price | wine_vol | wine_image | wine_product_link | category |
|---|---|---|---|---|---|---|
| 0 | B&G Cuvee Reserve Speciale Rose | ₦3,900.00 | 75cl | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines?product_id=704 | wine |
| 1 | B&G Reserve Chardonnay | ₦5,200.00 | 75CL | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines/b-and-g-res... | wine |


In [14]:
df_spirit.head(2)

|   | spirit_name | spirit_price | spirit_vol | spirit_image | spirit_product_link | category |
|---|---|---|---|---|---|---|
| 0 | Absolut Vanilia | ₦6,000.00 | 1L | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/spirits?product_i... | spirit |
| 1 | Amaro Di Angostura | ₦7,700.00 | 70CL | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/spirits?product_i... | spirit |


In [15]:
df_champagne.head(2)

|   | Champagne_name | Champagne_price | Champagne_vol | Champagne_image | Champagne_product_link | category |
|---|---|---|---|---|---|---|
| 0 | Armand de Brignac Brut Gold | ₦255,200.00 | 75cl (Ace of Spade) | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/champagne-and-spa... | champagne |
| 1 | Armand de Brignac Rosé | ₦359,600.00 | 75cl (Ace of Spade) | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/champagne-and-spa... | champagne |


### Rename the column labels in the four dataframes



In [17]:
df_wine.rename(columns={'wine_name': 'product_name','wine_price':'product_price', 'wine_vol' : 'product_capacity', 'wine_image' : 'product_image',
                          'wine_product_link' : 'product_link'},inplace=True)

df_spirit.rename(columns={'spirit_name': 'product_name','spirit_price':'product_price', 'spirit_vol' : 'product_capacity','spirit_image' : 'product_image',
                          'spirit_product_link' : 'product_link'},inplace=True)

df_champagne.rename(columns={'Champagne_name': 'product_name','Champagne_price':'product_price', 'Champagne_vol' : 'product_capacity','Champagne_image' : 'product_image',
                          'Champagne_product_link' : 'product_link'},inplace=True)

df_gift_idea.rename(columns={'gift_ideas_name': 'product_name','gift_ideas_price':'product_price', 'gift_ideas_vol' : 'product_capacity','gift_ideas_image' : 'product_image',
                          'gift_ideas_product_link' : 'product_link'},inplace=True)
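The four `rename` calls follow one pattern: a category prefix plus a fixed set of suffixes. As a sketch, the mapping can be generated from the prefix instead of being typed out each time. `FIELDS` and `rename_map` are hypothetical names introduced for this illustration.

```python
# Column suffix used by the scraper -> shared column name in the master frame
FIELDS = {
    'name': 'product_name',
    'price': 'product_price',
    'vol': 'product_capacity',
    'image': 'product_image',
    'product_link': 'product_link',
}


def rename_map(prefix):
    """Build the columns mapping for one category prefix, e.g. 'wine'."""
    return {f'{prefix}_{suffix}': new for suffix, new in FIELDS.items()}


print(rename_map('wine'))

# Possible use with the frames in this notebook (not executed here):
# df_wine.rename(columns=rename_map('wine'), inplace=True)
# df_champagne.rename(columns=rename_map('Champagne'), inplace=True)
```

Note that the champagne columns use a capitalised prefix (`Champagne_`), so the prefix passed in has to match the scraped column names exactly.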

## Combining all products into a master DataFrame

In [18]:
master_Data = pd.concat([df_wine, df_spirit,df_champagne, df_gift_idea], axis=0)

In [19]:
master_Data

|   | product_name | product_price | product_capacity | product_image | product_link | category |
|---|---|---|---|---|---|---|
| 0 | B&G Cuvee Reserve Speciale Rose | ₦3,900.00 | 75cl | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines?product_id=704 | wine |
| 1 | B&G Reserve Chardonnay | ₦5,200.00 | 75CL | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines/b-and-g-res... | wine |
| 2 | Boekenhoutskloof Semillon White | ₦13,000.00 | 75cl | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines?product_id=563 | wine |
| 3 | Cheval Des Andes | ₦56,100.00 | 75cl | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines?product_id=495 | wine |
| 4 | Clarendelle Bordeaux Rose | ₦8,200.00 | 75cl | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/wines/clarendelle... | wine |
| ... | ... | ... | ... | ... | ... | ... |
| 33 | Valentine's Gift Set for Her 1 | ₦100,000.00 |  | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/special-offers?pr... | gift idea |
| 34 | Valentine's Gift Set for Her 2 | ₦60,000.00 |  | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/special-offers?pr... | gift idea |
| 35 | Valentine's Gift Set for Him 1 | ₦100,000.00 |  | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/special-offers?pr... | gift idea |
| 36 | Valentine's Gift Set for Him 2 | ₦60,000.00 |  | https://www.cellarcentral.ng/image/cache/catal... | https://www.cellarcentral.ng/special-offers?pr... | gift idea |
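The last rows of the output carry the labels 33 to 36, which suggests that each source frame kept its own row index when the frames were stacked, so the same label can appear once per category. If a single running index is preferred, `pd.concat` accepts `ignore_index=True`. The sketch below uses two tiny stand-in frames with made-up values to show the difference.

```python
import pandas as pd

# Stand-in frames; the product names are invented for illustration only.
wine = pd.DataFrame({'product_name': ['Wine A', 'Wine B'], 'category': 'wine'})
gift = pd.DataFrame({'product_name': ['Gift A', 'Gift B'], 'category': 'gift idea'})

# Default behaviour: each frame keeps its own labels, so 0 and 1 repeat.
stacked = pd.concat([wine, gift], axis=0)
print(stacked.index.tolist())     # [0, 1, 0, 1]

# ignore_index=True renumbers the rows from 0 to n-1.
renumbered = pd.concat([wine, gift], axis=0, ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Because the notebook saves the master frame with `index = False`, the repeated labels do not reach the CSV file; the difference matters mainly when selecting rows by label in memory.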


In [20]:
master_Data.to_csv('CellarCentral website products data.csv', index = False)