## Requirements
1. Your function should take a URL and return a pandas DataFrame.
2. The final DataFrame should contain the following information:
    * Product ID
    * Seller ID
    * Product title
    * Price
    * URL of the product image
    * URL of that product page

Bonus information:
* Is it TikiNow (delivery within 2 hours) <img src="https://salt.tikicdn.com/ts/upload/9f/32/dd/8a8d39d4453399569dfb3e80fe01de75.png">?
* Does it have free delivery?
* Number of reviews
* Star rating (number of stars or percentage)
* Does it have the "badge under price" (Rẻ hơn hoàn tiền, "refund if cheaper elsewhere") <img src="https://salt.tikicdn.com/ts/upload/51/ac/cc/528e80fe3f464f910174e2fdf8887b6f.png">?
* Discount percentage
* Does it have the "shocking price" badge <img src="https://salt.tikicdn.com/ts/upload/75/34/d2/4a9a0958a782da8930cdad8f08afff37.png">?
* Can it be paid for in installments <img src="https://salt.tikicdn.com/ts/upload/ba/4e/6e/26e9f2487e9f49b7dcf4043960e687dd.png">?
* Does it come with free gifts <img src="https://salt.tikicdn.com/ts/upload/47/35/8c/446f61d046eba9a305d3f39dc0834c4a.png">?
    

In [1]:
import requests
import pandas as pd
from bs4 import BeautifulSoup

In [3]:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'}

r = requests.get('https://tiki.vn/laptop-may-vi-tinh/c1846?src=c.1846.hamburger_menu_fly_out_banner', headers=headers)
# r.text is an HTML document, so we parse it with html.parser
soup = BeautifulSoup(r.text, 'html.parser')

# Make the soup object look nicer
print(soup.prettify()[:2000])

<!DOCTYPE html>
<html class="no-js" lang="vi">
 <head>
  <meta charset="utf-8" class="next-head"/>
  <link class="next-head" href="https://frontend.tikicdn.com/_desktop-next/static/css/_sprite.css?v=20201112541spBxgF4xs8idwuOEtdK4i" rel="stylesheet" type="text/css"/>
  <title class="next-head">
   Mua online Laptop - Máy Vi Tính - Linh kiện giá cực tốt | Tiki
  </title>
  <meta class="next-head" content="Mua online Laptop - Máy Vi Tính - Linh kiện giá cực tốt | Tiki" property="og:title"/>
  <meta class="next-head" content="Nhiều mặt hàng Laptop - Máy Vi Tính - Linh kiện hàng chất lượng, mẫu mã đa dạng, freeship và giao siêu nhanh. Shop uy tín" property="og:description"/>
  <meta class="next-head" content="Nhiều mặt hàng Laptop - Máy Vi Tính - Linh kiện hàng chất lượng, mẫu mã đa dạng, freeship và giao siêu nhanh. Shop uy tín" name="description"/>
  <link class="next-head" href="https://tiki.vn/laptop-may-vi-tinh-linh-kien/c1846.html" rel="canonical"/>
  <meta class="next-head" content=

In [4]:
import re

In [5]:
# All occurrences of the products on that page
print("\nAll occurrences of the product div sections:")
products = soup.find_all('a', {'class':'product-item'})

print("Type:", type(products))
print("Number of products:", len(products))


All occurrences of the product div sections:
Type: <class 'bs4.element.ResultSet'>
Number of products: 48
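
As a minimal sketch of what `find_all` and `find` return, the same lookups can be tried on a tiny inline document. The HTML snippet, the `example.com` URLs, and the `sample_*` names below are made up for illustration and are not taken from the Tiki page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that only mimics the 'product-item' anchors used above
sample_html = """
<div>
  <a class="product-item" href="/laptop-a-p111.html"><img src="https://example.com/a.png"/></a>
  <a class="product-item" href="/laptop-b-p222.html"><img src="https://example.com/b.png"/></a>
  <a class="other" href="/about">About</a>
</div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# find_all returns every matching tag as a ResultSet (a list subclass)
sample_products = sample_soup.find_all("a", {"class": "product-item"})
sample_hrefs = [a["href"] for a in sample_products]
sample_images = [a.img["src"] for a in sample_products]

# find returns None when nothing matches; it does not raise
missing_badge = sample_soup.find("div", {"class": "badge-service"})

print(len(sample_products), sample_hrefs, missing_badge)
```

Because `find` returns `None` for a missing element, chaining an attribute access onto its result raises `AttributeError`, and subscripting it raises `TypeError`.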


In [6]:
product_id_ls = []
product_title_ls = []
price_ls = []
image_url_ls = []
product_url_ls = []
tikinow_ls = []
free_delivery_ls = []
num_reviews_ls = []
percentage_ratings_ls = []
badge_under_price_ls = []
discount_percent_ls = []
shocking_price_ls = []
paid_installment_ls = []
free_gift_ls = []
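
One possible alternative to keeping many parallel lists in sync is to collect one dictionary per product and let pandas build the columns. This is only a sketch under assumptions: the column names in `COLUMNS` and the sample rows are hypothetical and not taken from the scraped page.

```python
import pandas as pd

# Hypothetical column names covering a few of the required fields
COLUMNS = ["product_id", "product_title", "price", "image_url", "product_url"]

rows = []

# A complete row (made-up values)
rows.append({
    "product_id": "p111",
    "product_title": "Laptop A",
    "price": "10.000.000",
    "image_url": "https://example.com/a.png",
    "product_url": "http://tiki.vn/laptop-a-p111.html",
})

# A row with missing fields: pandas fills the gaps with NaN
rows.append({"product_id": "p222", "product_title": "Laptop B"})

df = pd.DataFrame(rows, columns=COLUMNS)
print(df)
```

With this layout a product that lacks a field cannot shift the other columns out of alignment, which can happen when one of several parallel lists misses an `append`.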


In [7]:
tiki_now_img_url = 'https://salt.tikicdn.com/ts/upload/9f/32/dd/8a8d39d4453399569dfb3e80fe01de75.png'


In [8]:
# Raw string so that \d and \. reach the regex engine without an invalid-escape warning
regex = re.compile(r'-p\d*\.')


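for product in products:
    # print(product.prettify())
    # The href is required: without it there is no URL or ID, so skip the product
    try:
        product_link = product['href']
        product_url = 'http://tiki.vn' + product_link
        # [1:-1] drops the leading '-' and trailing '.' of the match, e.g. '-p123.' -> 'p123'
        product_id = regex.findall(product_link)[0][1:-1]

        product_url_ls.append(product_url)
    except (KeyError, IndexError):
        print('Product link error; moving on to the next product')
        continue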
        
    # grab image url
    try:
        image_url = product.img['src']
    except (TypeError, KeyError):
        # no <img> tag (None is not subscriptable) or no src attribute
        image_url = "NA"

    image_url_ls.append(image_url)

    product_id_ls.append(product_id)

    # find name
    try:
        product_name = product.find('div', {'class':'name'}).span.text
    except AttributeError:
        # find() or .span returned None
        product_name = "NA"

    product_title_ls.append(product_name)

    # find price
    try:
        product_price = product.find('div', {'class': 'price-discount__price'}).text
    except AttributeError:
        product_price = "NA"

    price_ls.append(product_price)
    
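    # Shocking price - FreeShip
    # The 'item top' div holds a single label: 'Freeship' means free delivery,
    # and any other label is treated here as the shocking-price badge.
    shock_price = None
    freeship = None

    try:
        addon = product.find('div', {'class': 'item top'})
        if addon.text == 'Freeship':
            print('Got free ship!')
            freeship = 1
            shock_price = 0
        else:
            print('Got shocking price!')
            freeship = 0
            shock_price = 1
    except AttributeError:
        # find() returned None: the product has no 'item top' div
        shock_price = "NA"
        freeship = "NA"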
    
    shocking_price_ls.append(shock_price)
    free_delivery_ls.append(freeship)
    
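    # Extract review information
    num_review = None
    rating_pct = None
    try:
        review_rating = product.find('div', {'class': 'rating-review'})
        # [6:] drops the first six characters of the style attribute (presumably 'width:')
        rating_pct = review_rating.find('div', {'class': 'rating__average'})['style'][6:]
        #print(rating_pct)
        # [1:-1] drops the first and last characters (presumably the surrounding parentheses)
        num_review = product.find('div', {'class': 'review'}).text[1:-1]
        #print(num_review)
    except (AttributeError, TypeError, KeyError):
        # a div is missing (None) or the style attribute is absent
        num_review = "NA"
        rating_pct = "NA"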
    
    num_reviews_ls.append(num_review)
    percentage_ratings_ls.append(rating_pct)
    
    
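    # check TikiNow
    tikinow = 0
    try:
        badge_service = product.find('div', {'class': 'badge-service'})
        # find() returns None rather than raising, so this message prints for
        # every product, whether or not it carries the TikiNow badge
        print('Got tikinow')
        if badge_service.img['src'] == tiki_now_img_url:
            tikinow = 1
    except (AttributeError, TypeError, KeyError):
        # no badge-service div, no <img> inside it, or no src attribute
        tikinow = "NA"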
    
    tikinow_ls.append(tikinow)

Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got shocking price!
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got shocking price!
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got free ship!
Got tikinow
Got tikinow
Got tikinow
Got tikinow
Got shocking price!
Got tikinow
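
The loop above stores the product ID, rating, and review count as raw strings cut out by fixed slices. As a rough sketch, the same fields could be extracted with small helpers that return `None` when the expected pattern is absent. The helper names are hypothetical, and the input formats used here (`...-p12345.html`, `width:80%`, `(12)`) are assumptions inferred from the slicing in the loop, not values confirmed from the page.

```python
import re

# Assumed formats, inferred from the slices [1:-1], [6:] and [1:-1] in the loop
PRODUCT_ID_RE = re.compile(r'-(p\d+)\.')
RATING_RE = re.compile(r'width:\s*(\d+(?:\.\d+)?)%')
REVIEW_RE = re.compile(r'\d+')


def extract_product_id(href):
    """Return the 'p<digits>' part of a product link, or None if absent."""
    match = PRODUCT_ID_RE.search(href)
    return match.group(1) if match else None


def parse_rating_pct(style):
    """Return the percentage in a 'width:NN%' style string as a float, or None."""
    match = RATING_RE.search(style)
    return float(match.group(1)) if match else None


def parse_review_count(text):
    """Return the first integer found in the review label, or None."""
    match = REVIEW_RE.search(text)
    return int(match.group(0)) if match else None


print(extract_product_id('/laptop-abc-p12345.html?src=x'))
print(parse_rating_pct('width:80%'))
print(parse_review_count('(12)'))
```

Returning numbers or `None` instead of the string `"NA"` would keep the resulting DataFrame columns numeric, which may make later filtering and sorting simpler.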
