Project: Web Scraping From the Jumia Website.
========
The purpose of this project is to scrape the [Jumia](https://www.jumia.co.ke/) website.
It aims to find the most popular items Jumia is selling during the COVID-19 period. This will let us estimate which products are most in demand and observe customer trends. The required information is as follows:
* Product name
* Brand name
* Number of ratings (popularity index)
* Product size
* Promotional discount

### Exploring the page with Chrome DevTools.
Right-clicking the text 'Top Selling Items' on the page and choosing Inspect opens the tag that contains that text in the Elements panel of Chrome DevTools.
Always make sure the Elements panel is selected.

The outermost element that contains all of the information
about the "Top Selling Items" is a div tag with the class name `col16 -mvs`.
Hovering over the elements in the panel shows that each top-selling item
is contained in a div tag with the class `itm col`.
Every item div holds a link (an `a` tag), and that link carries all the information we want.
This is enough to download the page and parse it.
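
The structure described above can be tried offline before touching the live site. The following is a minimal sketch on a small, hypothetical piece of markup that uses the class names as they appear in the page source (`col16 -mvs`, `itm col`, `prd _box`); the variable names and the sample itself are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup modelled on the structure described above.
SAMPLE_HTML = """
<div class="col16 -mvs">
  <div class="itm col">
    <a class="prd _box" data-name="Gold Beer - 330ml (24 Pcs)." data-brand="Ruhr Gold">
      <div class="name">Gold Beer - 330ml (24 Pcs).</div>
    </a>
  </div>
  <div class="itm col">
    <a class="prd _box" data-name="Premium White Sugar - 2kg" data-brand="Kabras">
      <div class="name">Premium White Sugar - 2kg</div>
    </a>
  </div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Outer container: matched on the exact value of its class attribute.
container = soup.find("div", class_="col16 -mvs")

# One div per item; class_="itm" matches any tag whose class list includes "itm".
items = container.find_all("div", class_="itm")

# Each item wraps a single link that carries the product data as attributes.
links = [item.find("a") for item in items]
brands = [link.get("data-brand") for link in links]
print(len(items), brands)
```

Working on a saved snippet like this keeps the exploration repeatable, since the live page changes from one request to the next.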
Procedure
----
* Download the web page.
* Create a BeautifulSoup object to parse the page.
* Find the items with class `itm col` and assign their links to `products`.
* Extract and print the first top item.

In [2]:
import requests
from bs4 import BeautifulSoup as bs

page = requests.get("https://www.jumia.co.ke/")
soup = bs(page.content, "html.parser")
top_items = soup.find(class_="col16 -mvs")
#print(top_items)
products = top_items.find_all('a')  # select the link (a) tags; selecting by the class prd _box works similarly
product1 = products[0]  # the first product in the list
#product1
# product name, price and % discount for the first product
name = product1.find(class_='name').get_text()
# the price of product 1
price = product1.find(class_='prc').get_text()
# product discount
disc = product1.find(class_='tag _dsct').get_text()

#print(products)
print(name)
print(price)
print(disc)


# For a list of products
#products = top_items.select('div',class_="crs _fl-rec row _no-g -fw-nw _6cl-4cm")
prices = [pt.get_text() for pt in top_items.select('div.prc')]
name = [pt.get_text() for pt in top_items.select('div.name')]
ratings = [pt.get('data-dimension27') for pt in top_items.select('a')]
disc = [pt.get() for pt in top_items.select('div.tag _dsct')]  # to be worked on: this selector matches nothing, so the list stays empty
brand = [b.get('data-brand') for b in top_items.select('a')]
#print(brand)
#print(ratings)
#print(prices)
print(disc)

Gold Beer - 330ml (24 Pcs).
KSh 1,250
64%
[]
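
The empty list printed last comes from the selector `div.tag _dsct`: the space turns it into a descendant selector, so it looks for a `_dsct` element inside `div.tag` and finds none. Below is a minimal sketch of one way to collect the discounts, assuming markup like the trimmed, hypothetical sample shown. It records `None` for products without a discount tag so the list stays aligned with the products.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the first product has a discount tag, the second has none.
SAMPLE_HTML = """
<div class="col16 -mvs">
  <a class="prd _box"><div class="name">Gold Beer - 330ml (24 Pcs).</div>
    <div class="prc">KSh 1,250</div><div class="tag _dsct">64%</div></a>
  <a class="prd _box"><div class="name">Maize Meal  - 2kg</div>
    <div class="prc">KSh 124</div></a>
</div>
"""

top_items = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Descendant selector: a <_dsct> element inside div.tag, which never exists here.
broken = top_items.select("div.tag _dsct")

# Compound selector: a div that carries both classes, looked up per product link.
discounts = []
for link in top_items.select("a"):
    tag = link.select_one("div.tag._dsct")
    discounts.append(tag.get_text() if tag else None)

print(broken, discounts)
```

Looking up the tag inside each link, rather than selecting all discount tags at once, keeps one entry per product, which matters when the lists are later combined into a table.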


### Adding to a pandas DataFrame and converting to a CSV file.

In [22]:
import pandas as pd
dir(pd)
JumiaData = pd.DataFrame(
    {
        'Product Name': name,
        'Brand Name': brand,
        'Popularity Index': ratings,
        'Product Price': prices,
        # 'Promotional Discount': still to be added
    }
)
JumiaData.to_csv('products.csv', index=False, encoding='utf-8')
data = pd.read_csv('products.csv')



In [17]:
print(dir(pd))
JumiaData['Product Price']

['Categorical', 'CategoricalDtype', 'CategoricalIndex', 'DataFrame', 'DateOffset', 'DatetimeIndex', 'DatetimeTZDtype', 'ExcelFile', 'ExcelWriter', 'Float64Index', 'Grouper', 'HDFStore', 'Index', 'IndexSlice', 'Int16Dtype', 'Int32Dtype', 'Int64Dtype', 'Int64Index', 'Int8Dtype', 'Interval', 'IntervalDtype', 'IntervalIndex', 'MultiIndex', 'NaT', 'NamedAgg', 'Period', 'PeriodDtype', 'PeriodIndex', 'RangeIndex', 'Series', 'SparseArray', 'SparseDataFrame', 'SparseDtype', 'SparseSeries', 'Timedelta', 'TimedeltaIndex', 'Timestamp', 'UInt16Dtype', 'UInt32Dtype', 'UInt64Dtype', 'UInt64Index', 'UInt8Dtype', '__builtins__', '__cached__', '__doc__', '__docformat__', '__file__', '__getattr__', '__git_version__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', '_config', '_hashtable', '_lib', '_libs', '_np_version_under1p14', '_np_version_under1p15', '_np_version_under1p16', '_np_version_under1p17', '_tslib', '_typing', '_version', 'api', 'array', 'arrays', 'bdate_rang

0     KSh 1,250
1       KSh 180
2       KSh 300
3       KSh 200
4       KSh 135
5       KSh 472
6       KSh 113
7        KSh 23
8       KSh 371
9       KSh 119
10      KSh 470
11      KSh 196
12      KSh 121
13      KSh 106
14      KSh 196
15    KSh 5,120
16      KSh 198
17       KSh 14
18      KSh 106
19      KSh 665
20       KSh 46
21      KSh 124
22      KSh 123
23      KSh 187
24      KSh 260
Name: Product Price, dtype: object

### First and last rows of the DataFrame.

In [256]:
print(JumiaData.head())
print(JumiaData.tail())
JumiaData

                                        Product Name Brand Name  \
0                        Gold Beer - 330ml (24 Pcs).  Ruhr Gold   
1                          Premium White Sugar - 2kg     Kabras   
2  1x Anti-dust Mouth Face Mask Cycling Surgical ...  A General   
3  2pcs Reusable Washable Protective 3-Layer Face...    Generic   
4          Dairy Top Milk 500ml-A  Pack of 12 Pieces      Dairy   

  Popularity Index Product Price  
0              4.3     KSh 1,250  
1              4.5       KSh 200  
2                        KSh 300  
3              3.8       KSh 472  
4              4.6       KSh 470  
                           Product Name Brand Name Popularity Index  \
22            3.5kg Hand Wash Detergent      Ariel              4.6   
23  Chapati Fortified Wheat Flour - 2Kg        Exe              4.7   
24                 Fresh Milk Pouch Esl      Daima              4.6   
25              1kg Hand Wash Detergent      Ariel              4.4   
26    Dish washing Liquid Lemon 

Unnamed: 0,Product Name,Brand Name,Popularity Index,Product Price
0,Gold Beer - 330ml (24 Pcs).,Ruhr Gold,4.3,"KSh 1,250"
1,Premium White Sugar - 2kg,Kabras,4.5,KSh 200
2,1x Anti-dust Mouth Face Mask Cycling Surgical ...,A General,,KSh 300
3,2pcs Reusable Washable Protective 3-Layer Face...,Generic,3.8,KSh 472
4,Dairy Top Milk 500ml-A Pack of 12 Pieces,Dairy,4.6,KSh 470
5,Antibacterial Hand Sanitizer - 50ml,Lifebuoy,4.5,KSh 135
6,All-Purpose Fortified Wheat Flour 2Kg,Ajab,4.8,KSh 113
7,Water - 18.5 Litres - Disposable Bottle,Aquamist,4.5,KSh 371
8,All-Purpose Fortified Wheat Flour - 2Kg,Exe,4.6,KSh 119
9,Penne Rigate - 400g,Santa Maria,5.0,KSh 116
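
The `Unnamed: 0` column in the listing above is what pandas typically produces when a frame is written with its index and read back without `index_col`. The sketch below reproduces that behaviour in memory on a small, hypothetical frame; the variable names are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical two-row frame standing in for the scraped data.
frame = pd.DataFrame(
    {
        "Product Name": ["Gold Beer - 330ml (24 Pcs).", "Premium White Sugar - 2kg"],
        "Product Price": ["KSh 1,250", "KSh 200"],
    }
)

# Written with the index: the first CSV column gets an empty header.
with_index = io.StringIO()
frame.to_csv(with_index)
with_index.seek(0)
reread_with_index = pd.read_csv(with_index)

# Written without the index: only the real columns come back.
without_index = io.StringIO()
frame.to_csv(without_index, index=False)
without_index.seek(0)
reread_without_index = pd.read_csv(without_index)

print(list(reread_with_index.columns))
print(list(reread_without_index.columns))
```

Passing `index=False` when writing, or `index_col=0` when reading, are the two usual ways to avoid the extra column.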


In [244]:
beer = """<div class="crs _fl-rec row _no-g -fw-nw _6cl-4cm">
<div class="itm col">
<a class="prd _box" href="/ruhr-gold-gold-beer-330ml-24-pcs.-28643270.html"
data-id="RU166DB04B2BENAFAMZ" data-name="Gold Beer - 330ml (24 Pcs)." 
data-price="10.81"data-brand="Ruhr Gold"data-category="Grocery/Drinks/Beer, Wine &amp; Spirits/Beers"data-dimension23="62206" 
data-dimension26="15" data-dimension27="4.1" data-dimension28="1"
data-dimension37="0" data-dimension43="GAM0" data-dimension44="0" data-position="1" 
data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true">
<img data-src="https://ke.jumia.is/unsafe/fit-in/300x300/filters:fill(white)/product/07/234682/1.jpg?4424"
src="https://ke.jumia.is/unsafe/fit-in/300x300/filters:fill(white)/product/07/234682/1.jpg?4424" 
class="img" width="185" height="185" alt="Gold Beer - 330ml (24 Pcs)."><div class="name">Gold Beer - 330ml (24 Pcs).</div>
<div class="prc" data-oprc="KSh 3,500">KSh 1,250</div><div class="tag _dsct">64%</div></a>
</div>
<div class="itm col"><a class="prd _box" href="/sugar-2kg-kabras-mpg164322.html" data-id="KA729OT078YENNAFAMZ" 
data-name="Premium White Sugar - 2kg" data-price="1.73" data-brand="Kabras"
data-category="Grocery/Food Cupboard/Sugar &amp; Flour/Sugar" data-dimension23="29156" data-dimension26="1016" 
data-dimension27="4.5" data-dimension28="1" data-dimension37="0" data-dimension43="COL_72|JESS|JMALL" data-dimension44="0" 
data-position="2" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true">
<img data-src="https://ke.jumia.is/unsafe/fit-in/300x300/filters:fill(white)/product/12/175151/1.jpg?5785" 
src="https://ke.jumia.is/unsafe/fit-in/300x300/filters:fill(white)/product/12/175151/1.jpg?5785" class="img"
width="185" height="185" alt="Premium White Sugar - 2kg"><div class="name">Premium White Sugar - 2kg</div>
<div class="prc" data-oprc="KSh 230">KSh 200</div><div class="tag _dsct">13%</div></a></div><div class="itm col">
<a class="prd _box" href="/a-general-1x-anti-dust-mouth-face-mask-cycling-surgical-respirator-adult-reusable-28563366.html"
data-id="AG975ST13HTMUNAFAMZ" data-name="1x Anti-dust Mouth Face Mask Cycling Surgical Respirator Adult Reusable"
data-price="2.59" data-brand="A General" data-category="Health &amp; Beauty/Health Care/First Aid/Masks" 
data-dimension23="19547" data-dimension26="" data-dimension27="" data-dimension28="0" data-dimension37="1" data-dimension43="" data-dimension44="0" data-position="3" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><img data-src="https://ke.jumia.is/unsafe/fit-in/300x300/filters:fill(white)/product/66/336582/1.jpg?8810" src="https://ke.jumia.is/unsafe/fit-in/300x300/filters:fill(white)/product/66/336582/1.jpg?8810" class="img" width="185" height="185" alt="1x Anti-dust Mouth Face Mask Cycling Surgical Respirator Adult Reusable"><div class="name">1x Anti-dust Mouth Face Mask Cycling Surgical Respirator Adult Reusable</div><div class="prc" data-oprc="KSh 432">KSh 300</div><div class="tag _dsct">31%</div></a></div><div class="itm col"><a class="prd _box" href="/reusable-washable-protective-3-layer-face-mask-2pcs-generic-mpg227127.html" data-id="GE840ST1AD4HMNAFAMZ" data-name="2pcs Reusable Washable Protective 3-Layer Face Mask" data-price="4.08" data-brand="Generic" data-category="Health &amp; Beauty/Health Care/First Aid/Masks" data-dimension23="40338" data-dimension26="7" data-dimension27="4.3" data-dimension28="0" data-dimension37="0" data-dimension43="" data-dimension44="0" data-position="4" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">2pcs Reusable Washable Protective 3-Layer Face Mask</div><div class="prc" data-oprc="KSh 700">KSh 472</div><div class="tag _dsct">33%</div></a></div><div class="itm col"><a class="prd _box" href="/dairy-top-milk-500ml-pack-of-12-pieces-dairy-mpg225978.html" data-id="DA278DB17SLIWNAFAMZ" data-name="Dairy Top Milk 500ml-A  Pack of 12 Pieces" data-price="4.05" data-brand="Dairy" data-category="Grocery/Drinks/Milk/Long Life" data-dimension23="52050" data-dimension26="155" data-dimension27="4.6" data-dimension28="1" data-dimension37="0" data-dimension43="COL_31|GAM0" data-dimension44="0" data-position="5" 
data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Dairy Top Milk 500ml-A  Pack of 12 Pieces</div><div class="prc" data-oprc="KSh 600">KSh 470</div><div class="tag _dsct">22%</div></a></div><div class="itm col"><a class="prd _box" href="/antibacterial-hand-sanitizer-55ml-lifebuoy-mpg59091.html" data-id="LI644DR0N73T8NAFAMZ" data-name="Antibacterial Hand Sanitizer - 50ml" data-price="1.17" data-brand="Lifebuoy" data-category="Health &amp; Beauty/Health Care/First Aid/Antibiotics &amp; Antiseptics/Hand Sanitizers" data-dimension23="3196" data-dimension26="99" data-dimension27="4.5" data-dimension28="1" data-dimension37="0" data-dimension43="JESS|JMALL" data-dimension44="0" data-position="6" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Antibacterial Hand Sanitizer - 50ml</div><div class="prc" data-oprc="KSh 150">KSh 135</div><div class="tag _dsct">10%</div></a></div><div class="itm col"><a class="prd _box" href="/ajab-all-purpose-fortified-wheat-flour-2kg-28221490.html" data-id="AJ428FF05LQKANAFAMZ" data-name="All-Purpose Fortified Wheat Flour 2Kg" data-price="0.98" data-brand="Ajab" data-category="Grocery/Food Cupboard/Sugar &amp; Flour/Wheat Flour" data-dimension23="18951" data-dimension26="18" data-dimension27="4.7" data-dimension28="0" data-dimension37="0" data-dimension43="CARR|COL_72|JESS" data-dimension44="0" data-position="7" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">All-Purpose Fortified Wheat Flour 2Kg</div><div class="prc" data-oprc="KSh 119">KSh 113</div><div class="tag _dsct">5%</div></a></div><div class="itm col"><a class="prd _box" href="/water-18.5-litres-disposable-bottle..-aquamist-mpg207674.html" data-id="AQ014OT0AFLTKNAFAMZ" data-name="Water - 18.5 Litres - Disposable Bottle" data-price="3.21" data-brand="Aquamist" 
data-category="Grocery/Drinks/Water/Drinking Water" data-dimension23="2439" data-dimension26="363" data-dimension27="4.5" data-dimension28="1" data-dimension37="0" data-dimension43="COL_31|JESS" data-dimension44="0" data-position="8" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Water - 18.5 Litres - Disposable Bottle</div><div class="prc" data-oprc="KSh 500">KSh 371</div><div class="tag _dsct">26%</div></a></div><div class="itm col"><a class="prd _box" href="/all-purpose-fortified-wheat-flour-2kg-exe-mpg135181.html" data-id="EX715OT0PMUMHNAFAMZ" data-name="All-Purpose Fortified Wheat Flour - 2Kg" data-price="1.03" data-brand="Exe" data-category="Grocery/Food Cupboard/Sugar &amp; Flour/Wheat Flour" data-dimension23="18951" data-dimension26="292" data-dimension27="4.6" data-dimension28="0" data-dimension37="0" data-dimension43="CARR|COL_31|COL_72|JESS" data-dimension44="0" data-position="9" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">All-Purpose Fortified Wheat Flour - 2Kg</div><div class="prc">KSh 119</div></a></div><div class="itm col"><a class="prd _box" href="/santa-maria-penne-rigate-400g-13520352.html" data-id="SA189OT0F2BK3NAFAMZ" data-name="Penne Rigate - 400g" data-price="1.00" data-brand="Santa Maria" data-category="Grocery/Food Cupboard/Pasta &amp; Noodles/Pasta" data-dimension23="18951" data-dimension26="1" data-dimension27="5" data-dimension28="0" data-dimension37="0" data-dimension43="COL_39" data-dimension44="0" data-position="10" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Penne Rigate - 400g</div><div class="prc" data-oprc="KSh 117">KSh 116</div><div class="tag _dsct">1%</div></a></div><div class="itm col"><a class="prd _box" href="/uht-whole-milk-500-ml-kcc-mpg133993.html" data-id="DA410OT1DIBZDNAFAMZ" data-name="UHT Fino Whole Milk - 500ml" 
data-price="0.42" data-brand="Daima" data-category="Grocery/Drinks/Milk/Long Life" data-dimension23="18951" data-dimension26="97" data-dimension27="4.4" data-dimension28="0" data-dimension37="0" data-dimension43="CARR|JESS" data-dimension44="0" data-position="11" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">UHT Fino Whole Milk - 500ml</div><div class="prc">KSh 49</div></a></div><div class="itm col"><a class="prd _box" href="/margarine-1kg-blue-band-mpg16676.html" data-id="UN156OT108CFFNAFAMZ" data-name="Margarine - 1kg" data-price="2.25" data-brand="Blue Band" data-category="Grocery/Food Cupboard/Margarine, Jams, Honey &amp; Spreads/Margarine" data-dimension23="28696" data-dimension26="306" data-dimension27="4.6" data-dimension28="1" data-dimension37="0" data-dimension43="" data-dimension44="0" data-position="12" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Margarine - 1kg</div><div class="prc" data-oprc="KSh 330">KSh 260</div><div class="tag _dsct">21%</div></a></div><div class="itm col"><a class="prd _box" href="/nice-lovely-hand-sanitizing-gel-65-ml-28581787.html" data-id="NI579ST1AV7NANAFAMZ" data-name="Hand Sanitizing Gel - 65 Ml" data-price="1.06" data-brand="Nice &amp; Lovely" data-category="Health &amp; Beauty/Health Care/First Aid/Antibiotics &amp; Antiseptics/Hand Sanitizers" data-dimension23="5763" data-dimension26="21" data-dimension27="4.5" data-dimension28="1" data-dimension37="0" data-dimension43="JESS|JMALL" data-dimension44="0" data-position="13" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Hand Sanitizing Gel - 65 Ml</div><div class="prc" data-oprc="KSh 129">KSh 123</div><div class="tag _dsct">5%</div></a></div><div class="itm col"><a class="prd _box" href="/jogoo-maize-meal-2kg-jogoo-mpg133055.html" data-id="JO728OT0PVBEZNAFAMZ" data-name="Maize 
Meal  - 2kg" data-price="1.07" data-brand="Jogoo" data-category="Grocery/Food Cupboard/Sugar &amp; Flour/Maize - Corn Flour" data-dimension23="18951" data-dimension26="564" data-dimension27="4.4" data-dimension28="0" data-dimension37="0" data-dimension43="CARR|COL_31|COL_72" data-dimension44="0" data-position="14" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Maize Meal  - 2kg</div><div class="prc">KSh 124</div></a></div><div class="itm col"><a class="prd _box" href="/kensalt-iodated-table-salt-1kg-12565062.html" data-id="KE885OT0FIHC9NAFAMZ" data-name="Iodated Table Salt - 1kg" data-price="0.24" data-brand="Kensalt" data-category="Grocery/Food Cupboard/Cooking Ingredients/Salt" data-dimension23="18951" data-dimension26="110" data-dimension27="4.6" data-dimension28="0" data-dimension37="0" data-dimension43="CARR|COL_31|COL_72|JESS" data-dimension44="0" data-position="15" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Iodated Table Salt - 1kg</div><div class="prc">KSh 28</div></a></div><div class="itm col"><a class="prd _box" href="/omo-hand-washing-powder-extra-fresh-3.5kg-25398360.html" data-id="OM483DR03SY20NAFAMZ" data-name="Hand Washing Powder Extra Fresh - 3.5kg" data-price="5.75" data-brand="Omo" data-category="Grocery/Household Supplies/Laundry/Powder Detergent (Hand)" data-dimension23="3196" data-dimension26="62" data-dimension27="4.8" data-dimension28="1" data-dimension37="0" data-dimension43="COL_31|COL_71|GAM0|JMALL" data-dimension44="0" data-position="16" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Hand Washing Powder Extra Fresh - 3.5kg</div><div class="prc" data-oprc="KSh 950">KSh 665</div><div class="tag _dsct">30%</div></a></div><div class="itm col"><a class="prd _box" href="/mountain-dew-soft-drink-600ml-12702149.html" data-id="MO228OT1K1C0HNAFAMZ" 
data-name="Soft Drink - 600ml" data-price="0.41" data-brand="Mountain Dew" data-category="Grocery/Drinks/Carbonated Drinks/Other Soft Drinks" data-dimension23="18951" data-dimension26="8" data-dimension27="4.4" data-dimension28="0" data-dimension37="0" data-dimension43="CARR" data-dimension44="0" data-position="17" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Soft Drink - 600ml</div><div class="prc">KSh 47</div></a></div><div class="itm col"><a class="prd _box" href="/maccoffee-classic-cofee-sachet-1.6g-13359057.html" data-id="MA909OT18PJXFNAFAMZ" data-name="Classic Cofee Sachet 1.6g" data-price="0.03" data-brand="Maccoffee" data-category="Grocery/Drinks/Coffee, Tea &amp; Cocoa/Coffee" data-dimension23="18951" data-dimension26="9" data-dimension27="4.6" data-dimension28="0" data-dimension37="0" data-dimension43="CARR|COL_31|COL_70" data-dimension44="0" data-position="18" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Classic Cofee Sachet 1.6g</div><div class="prc">KSh 4</div></a></div><div class="itm col"><a class="prd _box" href="/alpro-soya-wholebean-milk-1-litre-12699019.html" data-id="AL234OT1I8KX1NAFAMZ" data-name="Soya Wholebean Milk - 1 Litre" data-price="3.86" data-brand="Alpro" data-category="Grocery/Drinks/Milk/Milk Substitutes" data-dimension23="18951" data-dimension26="" data-dimension27="" data-dimension28="0" data-dimension37="0" data-dimension43="CARR" data-dimension44="0" data-position="19" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Soya Wholebean Milk - 1 Litre</div><div class="prc">KSh 446</div></a></div><div class="itm col"><a class="prd _box" href="/salted-snacks-20g-wow-mpg149162.html" data-id="WO343OT0OESCXNAFAMZ" data-name="Salted Snacks - 20g" data-price="0.08" data-brand="Wow" data-category="Grocery/Food Cupboard/Snacks, Crisps &amp; Nuts" 
data-dimension23="18951" data-dimension26="" data-dimension27="" data-dimension28="0" data-dimension37="0" data-dimension43="CARR" data-dimension44="0" data-position="20" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Salted Snacks - 20g</div><div class="prc">KSh 9</div></a></div><div class="itm col"><a class="prd _box" href="/indomie-noodles-chicken-flavour-20-pack-8489067.html" data-id="IN727DR19B1YONAFAMZ" data-name="Noodles - Chicken Flavour - 20 Pack" data-price="4.32" data-brand="Indomie" data-category="Grocery/Food Cupboard/Pasta &amp; Noodles/Pasta" data-dimension23="7221" data-dimension26="96" data-dimension27="4.5" data-dimension28="1" data-dimension37="0" data-dimension43="COL_31|GAM0" data-dimension44="0" data-position="21" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Noodles - Chicken Flavour - 20 Pack</div><div class="prc" data-oprc="KSh 600">KSh 500</div><div class="tag _dsct">17%</div></a></div><div class="itm col"><a class="prd _box" href="/red-grape-juice-1-litre-pick-n-peel-mpg147874.html" data-id="PI756OT17EXZLNAFAMZ" data-name="Red Grape Juice – 1 Litre Pick N' Peel" data-price="1.62" data-brand="Pick N' Peel" data-category="Grocery/Drinks/Juices &amp; Other Non Carbonated Drinks/Others" data-dimension23="18951" data-dimension26="3" data-dimension27="4.3" data-dimension28="0" data-dimension37="0" data-dimension43="CARR|COL_25" data-dimension44="0" data-position="22" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Red Grape Juice – 1 Litre Pick N' Peel</div><div class="prc" data-oprc="KSh 188">KSh 187</div><div class="tag _dsct">1%</div></a></div><div class="itm col"><a class="prd _box" href="/3.5kg-hand-wash-detergent-ariel-mpg65283.html" data-id="AR681DR0M9UOUNAFAMZ" data-name="3.5kg Hand Wash Detergent" data-price="6.91" data-brand="Ariel" 
data-category="Grocery/Household Supplies/Laundry/Powder Detergent (Hand)" data-dimension23="4028" data-dimension26="56" data-dimension27="4.6" data-dimension28="1" data-dimension37="0" data-dimension43="COL_71|JMALL" data-dimension44="0" data-position="23" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">3.5kg Hand Wash Detergent</div><div class="prc" data-oprc="KSh 1,000">KSh 799</div><div class="tag _dsct">20%</div></a></div><div class="itm col"><a class="prd _box" href="/chapati-fortified-wheat-flour-2kg-exe-mpg225484.html" data-id="EX715OT07RUH5NAFAMZ" data-name="Chapati Fortified Wheat Flour - 2Kg" data-price="1.05" data-brand="Exe" data-category="Grocery/Food Cupboard/Sugar &amp; Flour/Wheat Flour" data-dimension23="18951" data-dimension26="82" data-dimension27="4.7" data-dimension28="0" data-dimension37="0" data-dimension43="CARR|COL_31|COL_72|JESS" data-dimension44="0" data-position="24" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Chapati Fortified Wheat Flour - 2Kg</div><div class="prc">KSh 121</div></a></div><div class="itm col"><a class="prd _box" href="/daima-fresh-milk-pouch-esl-24048739.html" data-id="DA410DR1JU48ANAFAMZ" data-name="Fresh Milk Pouch Esl" data-price="0.36" data-brand="Daima" data-category="Grocery/Drinks/Milk/Long Life" data-dimension23="18951" data-dimension26="11" data-dimension27="4.6" data-dimension28="0" data-dimension37="0" data-dimension43="CARR|COL_31" data-dimension44="0" data-position="25" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Fresh Milk Pouch Esl</div><div class="prc" data-oprc="KSh 43">KSh 42</div><div class="tag _dsct">2%</div></a></div><div class="itm col"><a class="prd _box" href="/1kg-hand-wash-detergent-ariel-mpg52406.html" data-id="AR681DR0EOATTNAFAMZ" data-name="1kg Hand Wash Detergent" data-price="2.50" 
data-brand="Ariel" data-category="Grocery/Household Supplies/Laundry/Powder Detergent (Hand)" data-dimension23="4028" data-dimension26="19" data-dimension27="4.4" data-dimension28="1" data-dimension37="0" data-dimension43="COL_71|JMALL" data-dimension44="0" data-position="26" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">1kg Hand Wash Detergent</div><div class="prc">KSh 290</div></a></div><div class="itm col"><a class="prd _box" href="/lemon-dish-washing-liquid-750ml-sunlight-mpg173632.html" data-id="SU503DR0XCZNCNAFAMZ" data-name="Dish washing Liquid Lemon - 750ml" data-price="1.56" data-brand="Sunlight" data-category="Grocery/Household Supplies/Kitchen Cleaner/Dishwasher Detergent" data-dimension23="3196" data-dimension26="179" data-dimension27="4.6" data-dimension28="1" data-dimension37="0" data-dimension43="JESS|JMALL" data-dimension44="0" data-position="27" data-track-onclick="eecProduct" data-track-onview="eecProduct" data-track-onclick-bound="true"><div class="name">Dish washing Liquid Lemon - 750ml</div><div class="prc" data-oprc="KSh 240">KSh 180</div>
<div class="tag _dsct">25%</div></a></div></div>"""
from bs4 import BeautifulSoup as bs

beer = bs(beer, 'lxml')  # parse the saved HTML snippet
# status_code belongs to a requests response, so it is only available when online
# testing: find the price, discount, rating, name and brand
price = beer.find(class_='prc').get_text()
disc = beer.find(class_='tag _dsct').get_text()
rating = beer.find('a')
rate = rating['data-dimension27']
print(rate)
print(disc)
name = beer.find(class_='name').get_text()
brand = beer.find('a')
brand = brand['data-brand']
print(brand)

prices = [pt.get_text() for pt in beer.select('div.prc')]
name = [pt.get_text() for pt in beer.select('div.name')]
ratings = [pt.get('data-dimension27') for pt in beer.select('a')]
print(ratings)
disc = [pt.get() for pt in beer.select('div.tag _dsct')]  # to be worked on: this selector matches nothing, so the list stays empty
print(disc)
brand = [b.get('data-brand') for b in beer.select('a')]
print(brand)
print(prices)

4.1
64%
Ruhr Gold
['4.1', '4.5', '', '4.3', '4.6', '4.5', '4.7', '4.5', '4.6', '5', '4.4', '4.6', '4.5', '4.4', '4.6', '4.8', '4.4', '4.6', '', '', '4.5', '4.3', '4.6', '4.7', '4.6', '4.4', '4.6']
[]
['Ruhr Gold', 'Kabras', 'A General', 'Generic', 'Dairy', 'Lifebuoy', 'Ajab', 'Aquamist', 'Exe', 'Santa Maria', 'Daima', 'Blue Band', 'Nice & Lovely', 'Jogoo', 'Kensalt', 'Omo', 'Mountain Dew', 'Maccoffee', 'Alpro', 'Wow', 'Indomie', "Pick N' Peel", 'Ariel', 'Exe', 'Daima', 'Ariel', 'Sunlight']
['KSh 1,250', 'KSh 200', 'KSh 300', 'KSh 472', 'KSh 470', 'KSh 135', 'KSh 113', 'KSh 371', 'KSh 119', 'KSh 116', 'KSh 49', 'KSh 260', 'KSh 123', 'KSh 124', 'KSh 28', 'KSh 665', 'KSh 47', 'KSh 4', 'KSh 446', 'KSh 9', 'KSh 500', 'KSh 187', 'KSh 799', 'KSh 121', 'KSh 42', 'KSh 290', 'KSh 180']


In [143]:
names = []
prices = []
discs = []
for a in products:
    name = a.find('div', {'class': 'name'})
    price = a.find('div', {'class': 'prc'})
    disc = a.find('div', {'class': 'tag _dsct'})
    # Not every product has every field, so store None when a tag is missing.
    names.append(name.get_text() if name else None)
    prices.append(price.get_text() if price else None)
    discs.append(disc.get_text() if disc else None)

print(names)
print(prices)
print(discs)

### Using tags to find attributes.

In [81]:
img=product1.find("img")
img = img['alt']
print(img)

# data-brand is an attribute of the link itself, not a tag name, so read it from the tag
info = product1.get('data-brand')
print(info)

Gold Beer - 330ml (24 Pcs).
Ruhr Gold


### Extracting all the information from the page.
----

Combining what we know about CSS selectors and list comprehensions to extract everything at once.
#### Procedure.
- Select every item with the class `name` inside a link with the class `prd _box` in our div tag.
- Use a list comprehension to call the `get_text()` method on each BeautifulSoup object.
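
The two steps above can be sketched as follows. The markup is a small, hypothetical sample, and the selectors assume the class names seen in the page source; on the live page `top_items` would come from the downloaded HTML instead.

```python
from bs4 import BeautifulSoup

# Hypothetical sample with two product links.
SAMPLE_HTML = """
<div class="col16 -mvs">
  <a class="prd _box"><div class="name">Gold Beer - 330ml (24 Pcs).</div>
    <div class="prc">KSh 1,250</div></a>
  <a class="prd _box"><div class="name">Premium White Sugar - 2kg</div>
    <div class="prc">KSh 200</div></a>
</div>
"""

top_items = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "a.prd._box div.name": a div with class name inside a link carrying both classes.
name_tags = top_items.select("a.prd._box div.name")

# The list comprehension calls get_text() on every matched tag.
names = [tag.get_text() for tag in name_tags]
prices = [tag.get_text() for tag in top_items.select("a.prd._box div.prc")]

print(names)
print(prices)
```

Writing the two classes as `.prd._box` with no space requires both on the same tag, while the space before `div.name` means "anywhere inside".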

In [None]:
#jm > main > div.row.-pvm > div.col16.-mvs > div > div > div

In [65]:
product_tag = top_items.select('a')#all the products in the topitem tag
prods = [pt.get_text() for pt in product_tag] #list containing everything
print(len(prods))#number of items in the list.

27


In [95]:
item1 = products[0].get('class')  # class is an attribute, so read it with get()
print(item1)

['prd', '_box']


In [85]:
# products is a list of tags, so read the attribute from each tag in turn
brand = [p.get('data-brand') for p in products]

In [91]:
dir(products)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattr__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort',
 'source']

### Product reviews.

Add two reviews to every product, one negative (1) and one positive (5), and recompute the average rating.

Example, where a1, a2, a3 are the numbers of ratings and b1, b2, b3 the average ratings:

```python
a1, a2, a3 = 1000, 10, 0
b1, b2, b3 = 4.5, 5, 0
actual1 = round(((a1*b1)+5+1)/(a1+2), 4)
actual2 = round(((a2*b2)+5+1)/(a2+2), 4)
actual3 = round(((a3*b3)+5+1)/(a3+2), 4)
print(actual1, actual2, actual3)
```

In [4]:
a1, a2, a3 = 1000, 10, 0
b1, b2, b3 = 4.5, 5, 0
actual1 = round(((a1*b1)+5+1)/(a1+2),4)
actual2 = round(((a2*b2)+5+1)/(a2+2),4)
actual3 = round(((a3*b3)+5+1)/(a3+2),4)
print(actual1,actual2,actual3)

4.497 4.6667 3.0
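
The same calculation can be wrapped in a small helper so it can be applied to any product. This is a minimal sketch; the function name `adjusted_rating` and its parameters are assumptions made for illustration.

```python
def adjusted_rating(n_ratings, mean_rating, ndigits=4):
    """Average rating after adding one 1-star and one 5-star review."""
    total = n_ratings * mean_rating + 1 + 5
    return round(total / (n_ratings + 2), ndigits)


# (number of ratings, average rating) pairs from the example above.
examples = [(1000, 4.5), (10, 5), (0, 0)]
adjusted = [adjusted_rating(n, r) for n, r in examples]
print(adjusted)
```

A product with many ratings barely moves, while a product with few or no ratings is pulled towards 3.0, the midpoint of the two added reviews.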


In [15]:
from bs4 import BeautifulSoup as bs
import requests
import re
# Scrape the links from a div to get the products.
url = "https://www.jumia.co.ke/"

response = requests.get(url)
html = response.text
soup = bs(html,'html.parser')


topselling = soup.find_all("div", {"class": "crs-w _main -pvs -phxs"})
# re.findall needs a string, so join the matched tags into one HTML string first
links = re.findall(r'href="/(\S+\.html)"', "".join(str(div) for div in topselling))

In [16]:
url  = "https://en.wikipedia.org/wiki/Politics_of_Kenya"

In [23]:
response = requests.get(url)
html = response.text
soup = bs(html,"html.parser")


In [25]:
word_list = []
for para in soup.find_all('p'):
    x = para.text
    words = x.split()  # split is a method and must be called
    for i in words:
        word_list.append(i)

print(word_list)

In [28]:
draft1 = list()
# remove citation markers such as "[12]"
for word in word_list:
    cleaned = re.sub(r"\[[0-9]+\]", "", word)
    if cleaned:
        draft1.append(cleaned)

draft2 = list()
# remove acronyms and numbers
for word in draft1:
    if re.search("[A-Z][A-Z]", word):
        pass
    elif re.search("[A-Z][a-z][A-Z]", word):
        pass
    elif re.search("[0-9]", word):
        pass
    else:
        draft2.append(word)

# remove punctuation
draft3 = list()

for word in draft2:
    if re.search(r"[A-Za-z][.][A-Za-z]", word):
        # punctuation between letters: split into separate words
        a = re.sub(r"[,.]", " ", word)
        for part in a.split():
            draft3.append(part)
    elif re.search(r'^[,./()"]+\S+', word):
        # leading punctuation: keep only what follows it
        b = re.search(r'^[,./()"]+(\S+)', word)
        draft3.append(b[1])
    else:
        draft3.append(word)


draft4 = list()

for word in draft3:
    # trailing punctuation: keep only what precedes it
    c = re.search(r'^(\S+?)[,./()"]+$', word)
    if c:
        draft4.append(c[1])
    else:
        draft4.append(word)

# remove unnecessary one-character words
final = list()
for word in draft4:
    if len(word) == 1:
        if word.lower() == "a" or word == "I":
            final.append(word.lower())
        else:
            pass
    else:
        final.append(word.lower())

In [None]:
#most commonly used words.
word_count = dict()

for word in final:
    if word not in word_count:
        word_count[word]=1
    else:
        word_count[word] += 1
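
The counting loop can also be sketched with `collections.Counter` from the standard library, which does the same bookkeeping and can return the most frequent words directly. The `final` list below is a short, hypothetical stand-in for the cleaned word list.

```python
from collections import Counter

# Hypothetical stand-in for the cleaned word list.
final = ["the", "president", "of", "kenya", "is", "the", "head", "of", "state", "the"]

# Counter builds the word -> count mapping in one step.
word_count = Counter(final)

# most_common(n) returns the n words with the highest counts, largest first.
top_three = word_count.most_common(3)
print(top_three)
```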