## Requirements:
1. Your function should take a URL and return a pandas DataFrame
2. The final DataFrame should contain the following information:
    * Product ID
    * Seller ID
    * Product title
    * Price
    * URL of the product image
    * URL of that product page

Bonus information:
* Is it TikiNow (delivery within 2 hours) <img src="https://salt.tikicdn.com/ts/upload/9f/32/dd/8a8d39d4453399569dfb3e80fe01de75.png">?
* Is it free delivery?
* Number of reviews?
* How many stars or percentage of stars?
* Does it have the "badge under price" (Rẻ hơn hoàn tiền) <img src="https://salt.tikicdn.com/ts/upload/51/ac/cc/528e80fe3f464f910174e2fdf8887b6f.png">?
* Discount percentage?
* Does it have the "shocking price" badge? <img src="https://salt.tikicdn.com/ts/upload/75/34/d2/4a9a0958a782da8930cdad8f08afff37.png">
* Can it be paid for in installments? <img src="https://salt.tikicdn.com/ts/upload/ba/4e/6e/26e9f2487e9f49b7dcf4043960e687dd.png">
* Does it come with free gifts? <img src="https://salt.tikicdn.com/ts/upload/47/35/8c/446f61d046eba9a305d3f39dc0834c4a.png">
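
Requirement 1 asks for a function that goes from a URL to a DataFrame. One way to shape it, as a minimal sketch rather than a finished solution, is to split fetching from parsing so the parsing step can be tried on any HTML string. The names `parse_listing`, `scrape_listing`, and `HEADERS` are hypothetical; the `product-item` selector is the one used in the cells below, and only two of the required columns are filled in here.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Assumption: a browser-like User-Agent is enough for the request to succeed.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def parse_listing(html):
    """Parse listing-page HTML into one row per product (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # Assumption: every product is an <a class="product-item"> element.
    for product in soup.find_all("a", {"class": "product-item"}):
        img = product.find("img")
        rows.append({
            "product_url": product.get("href", "NA"),
            "image_url": img.get("src", "NA") if img is not None else "NA",
        })
    # Passing columns keeps the layout stable even when no product is found.
    return pd.DataFrame(rows, columns=["product_url", "image_url"])


def scrape_listing(url):
    """Fetch a listing page and return its products as a DataFrame."""
    response = requests.get(url, headers=HEADERS, timeout=30)
    response.raise_for_status()
    return parse_listing(response.text)
```

The remaining columns (ID, title, price, and the bonus flags) would be added to the row dictionary in the same way.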
    

In [1]:
import requests
import pandas as pd
from bs4 import BeautifulSoup

In [2]:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'}

r = requests.get('https://tiki.vn/laptop-may-vi-tinh/c1846?src=c.1846.hamburger_menu_fly_out_banner', headers=headers)
# r.text is an HTML document, so we parse it with html.parser
soup = BeautifulSoup(r.text, 'html.parser')

# Pretty-print the first 2000 characters of the parsed HTML
print(soup.prettify()[:2000])

<!DOCTYPE html>
<html class="no-js" lang="vi">
 <head>
  <meta charset="utf-8" class="next-head"/>
  <link class="next-head" href="https://frontend.tikicdn.com/_desktop-next/static/css/_sprite.css?v=20201118650DJpNqKCF5S3z3mHCoAm6B" rel="stylesheet" type="text/css"/>
  <title class="next-head">
   Mua online Laptop - Máy Vi Tính - Linh kiện giá cực tốt | Tiki
  </title>
  <meta class="next-head" content="Mua online Laptop - Máy Vi Tính - Linh kiện giá cực tốt | Tiki" property="og:title"/>
  <meta class="next-head" content="Nhiều mặt hàng Laptop - Máy Vi Tính - Linh kiện hàng chất lượng, mẫu mã đa dạng, freeship và giao siêu nhanh. Shop uy tín" property="og:description"/>
  <meta class="next-head" content="Nhiều mặt hàng Laptop - Máy Vi Tính - Linh kiện hàng chất lượng, mẫu mã đa dạng, freeship và giao siêu nhanh. Shop uy tín" name="description"/>
  <link class="next-head" href="https://tiki.vn/laptop-may-vi-tinh/c1846.html" rel="canonical"/>
  <meta class="next-head" content="1846" nam

In [3]:
import re

In [4]:
# All occurrences of the products on that page
print("\nAll occurrences of the product div sections:")
products = soup.find_all('a', {'class':'product-item'})

print("Type:", type(products))
print("Number of products:", len(products))


All occurrences of the product div sections:
Type: <class 'bs4.element.ResultSet'>
Number of products: 48


In [5]:
product_id_ls = []
product_title_ls = []
price_ls = []
image_url_ls = []
product_url_ls = []
within_2_hour_ls = []
free_delivery_ls = []
num_reviews_ls = []
percentage_ratings_ls = []
badge_under_price_ls = []
discount_percent_ls = []
shocking_price_ls = []
paid_installment_ls = []
free_gift_ls = []
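
The parallel lists above eventually become the columns of the final DataFrame. A minimal sketch of that last step follows, using sample values copied from this notebook's printed output; the `sample_*` names and the `price_vnd` column are hypothetical, and the price cleanup assumes every price looks like `195.000 ₫`.

```python
import pandas as pd

# Sample values copied from the first two products printed in this notebook.
sample_ids = ["p3054369", "p405243"]
sample_titles = [
    "Phần Mềm Diệt Virus BKAV Profressional 1 PC 12 Tháng - Hàng Chính Hãng",
    "USB Kingston DT100G3 32GB USB 3.0 - Hàng Chính Hãng",
]
sample_prices = ["195.000 ₫", "87.000 ₫"]

columns = {
    "product_id": sample_ids,
    "product_title": sample_titles,
    "price": sample_prices,
}

# pd.DataFrame raises ValueError when the lists differ in length,
# so check alignment first to get a clearer error message.
lengths = {name: len(values) for name, values in columns.items()}
if len(set(lengths.values())) != 1:
    raise ValueError(f"Column lengths differ: {lengths}")

df = pd.DataFrame(columns)

# Assumption: "." is a thousands separator, so dropping every non-digit
# character leaves the price in dong as an integer.
df["price_vnd"] = df["price"].str.replace(r"\D", "", regex=True).astype(int)
```

The length check matters because a list that is never appended to stays empty while the others grow, and the mismatch would otherwise only surface when the DataFrame is built.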

In [10]:
# Product IDs appear in the href as "-p<digits>."; the raw string avoids invalid-escape warnings
regex = re.compile(r'-p\d+\.')


for product in products:
    
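    try:
        product_link = product['href']
        product_url = product_link
        # Strip the leading "-" and trailing "." from the match, e.g. "-p3054369." -> "p3054369"
        product_id = regex.findall(product_link)[0][1:-1]
        print(product_url)
        product_url_ls.append(product_url)
    except (KeyError, IndexError):
        # Missing href, or an href that contains no product ID
        print('Could not read the product link; moving on to the next product')
        continue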
        
    # grab image url
    try:
        image_url = product.img['src']
        print(image_url)
    except (TypeError, KeyError):
        # No <img> tag (product.img is None) or no src attribute
        image_url = "NA"
        
    image_url_ls.append(image_url)
    print(product_id)
    product_id_ls.append(product_id)
    
    # find name
    try:
        product_name = product.find('div', {'class':'name'}).span.text
    except AttributeError:
        # The name div or its span is missing
        product_name = "NA"
    print(product_name)
    product_title_ls.append(product_name)
    
    # find price
    try:
        product_price = product.find('div', {'class': 'price-discount__price'}).text
    except AttributeError:
        # The price div is missing
        product_price = "NA"
        
    print(product_price)
    price_ls.append(product_price)

/phan-mem-diet-virus-bkav-profressional-1-pc-12-thang-hang-chinh-hang-p3054369.html
https://salt.tikicdn.com/cache/280x280/ts/product/7d/fa/73/dbb927adfeb9a57fb57ce86c593f9152.jpg
p3054369
Phần Mềm Diệt Virus BKAV Profressional 1 PC 12 Tháng - Hàng Chính Hãng
195.000 ₫
/usb-kingston-dt100g3-32gb-usb-3-0-hang-chinh-hang-p405243.html
https://salt.tikicdn.com/cache/280x280/ts/product/09/04/dd/74d5a81177e5cff5fee4b920dfaaf0ca.jpg
p405243
USB Kingston DT100G3 32GB USB 3.0 - Hàng Chính Hãng
87.000 ₫
/bo-kich-song-wifi-repeater-300mbps-totolink-ex200-hang-chinh-hang-p547563.html
https://salt.tikicdn.com/cache/280x280/ts/product/9d/fa/1e/30d0c22525743d5a2e850e76dd52fe72.jpg
p547563
Bộ Kích Sóng Wifi Repeater 300Mbps Totolink EX200 - Hàng Chính Hãng
195.000 ₫
/chuot-co-day-logitech-b100-hang-chinh-hang-p356188.html
https://salt.tikicdn.com/cache/280x280/ts/product/11/95/4d/a9c21fbe61ce96d66c06582a49791381.jpg
p356188
Chuột Có Dây Logitech B100 - Hàng Chính Hãng
59.000 ₫
/usb-kingston-dt100g3-16