# **Web-Scraped Tabelog Restaurant Data**

This scraper uses **BeautifulSoup** to collect Japanese restaurant data from the [Tabelog](https://tabelog.com/en/) website for [@qlub](https://qlub.io/)'s business use.

@Copyright 2022 [@JenniferZheng](https://github.com/JenniferZheng0430)


## **1. Import packages and tools**

In [9]:
from bs4 import BeautifulSoup
import requests
import csv
import pandas as pd

## **2. URLs**

Only restaurants in Tokyo are scraped at the moment.
To scrape another area, set `targeted_region` (which is used to build `RESTAURANT_URL`) to the target region instead of `tokyo`.
For the available regions, see the [Tabelog](https://tabelog.com/en/) website.

In [None]:
DOMAIN = 'https://tabelog.com/'

In [2]:
targeted_region = 'tokyo'

In [None]:
RESTAURANT_URL = 'https://tabelog.com/en/' + targeted_region + '/rstLst/'
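
As a minimal sketch of the region swap described above, the listing URL could be built by a small helper. The name `build_listing_url` and the constant `BASE_URL` are illustrative assumptions, not part of the original notebook, and valid region names still need to be checked on the Tabelog site.

```python
BASE_URL = 'https://tabelog.com/en/'  # English-language site root used in this notebook


def build_listing_url(region):
    """Return the restaurant listing URL for a region (hypothetical helper)."""
    # Normalise the input so ' Tokyo ' and 'tokyo' give the same URL.
    slug = region.strip().lower()
    return BASE_URL + slug + '/rstLst/'


print(build_listing_url('tokyo'))
```

With `'tokyo'` this gives the same string as `RESTAURANT_URL` above.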

## **3. Get restaurant links**

Set how many pages of restaurant links to fetch.
1 page = 20 restaurants

In [None]:
# How many pages of restaurants do you want to get?
# 1 page == 20 restaurants
page = 50
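
Since one page holds 20 restaurants, the page count for a target number of restaurants is a ceiling division. The sketch below shows that arithmetic; `pages_needed` and `RESTAURANTS_PER_PAGE` are hypothetical names introduced only for illustration.

```python
import math

RESTAURANTS_PER_PAGE = 20  # from the note above: 1 page = 20 restaurants


def pages_needed(n_restaurants):
    """Return how many listing pages cover at least n_restaurants (hypothetical helper)."""
    return math.ceil(n_restaurants / RESTAURANTS_PER_PAGE)


print(pages_needed(1000))  # 50
print(pages_needed(21))    # 2
```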

In [3]:
def get_url(page):
    """Collect restaurant detail-page links from the first `page` listing pages."""
    url_list = []

    # Assumption: Tabelog listing pages are numbered from 1, so count 1..page.
    for i in range(1, page + 1):
        url = RESTAURANT_URL + str(i) + '/'

        r = requests.get(url)
        soup = BeautifulSoup(r.content, 'html.parser')

        # <a href="https://tabelog.com/en/tokyo/A1317/A131712/13171774/" class="list-rst__name-main js-detail-anchor" target="_blank">Yakitonasachan</a>
        restaurants = soup.find_all("a", class_="list-rst__name-main js-detail-anchor", href=True)

        for res in restaurants:
            url_list.append(res['href'])
    return url_list

In [4]:
links = get_url(page)

In [None]:
links
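
If the same restaurant link is collected more than once, it will be scraped more than once as well. A minimal sketch of removing repeats while keeping the original order, assuming `links` is a plain list of URL strings as returned by `get_url`; `dedupe_links` is a hypothetical helper and the second sample URL is a placeholder.

```python
def dedupe_links(links):
    """Drop repeated URLs while keeping first-seen order (hypothetical helper)."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(links))


sample_links = [
    'https://tabelog.com/en/tokyo/A1317/A131712/13171774/',  # from the comment in get_url
    'https://example.com/placeholder/',                      # placeholder, not a real listing
    'https://tabelog.com/en/tokyo/A1317/A131712/13171774/',  # duplicate of the first entry
]
print(len(dedupe_links(sample_links)))  # 2
```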

## **4. Get restaurant info from each restaurant web page**

The detail table on each restaurant page is scraped. It includes fields such as:
1. Restaurant Name
2. Category
3. TEL/reservation
4. Addresses
5. Transportation
6. Operating Hours
7. Shop holidays
8. Budget
9. Method of payment
10. Table money/charge

In [6]:
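def scrape(url):
    """Return the detail-info table of one restaurant page as a dict."""
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')

    # <table class="c-table rd-detail-info">
    table = soup.find("table", class_="c-table rd-detail-info")
    if table is None or table.tbody is None:
        # No detail table on this page; return an empty record instead of crashing.
        return {}
    rows = table.tbody.find_all('tr')

    res_info = {}
    for row in rows:
        # The header cell is the field name; the data cell is the value with newlines removed.
        res_info[row.find('th').text.strip()] = row.find('td').text.strip().replace('\n', '')

    return res_info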

In [7]:
restaurants_info = []
for url in links:
    cur_restaurant = scrape(url)
    restaurants_info.append(cur_restaurant)
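
The loop above stops at the first request that raises an exception, which loses everything collected so far in a long run. A hedged sketch of a more forgiving loop follows. `scrape_all`, its `delay` parameter, and the `fake_scrape` stand-in are assumptions for illustration; in the notebook, the `scrape` function defined earlier would be passed in instead.

```python
import time


def scrape_all(urls, scrape_fn, delay=1.0):
    """Scrape each URL with scrape_fn, recording failures instead of stopping (hypothetical helper)."""
    results, failed = [], []
    for url in urls:
        try:
            results.append(scrape_fn(url))
        except Exception as exc:  # broad on purpose: network and parsing errors both land here
            failed.append((url, repr(exc)))
        time.sleep(delay)  # pause between requests to avoid hammering the site
    return results, failed


def fake_scrape(url):
    """Stand-in for scrape(); fails on URLs containing 'bad'."""
    if 'bad' in url:
        raise ValueError('no detail table')
    return {'Restaurant name': url}


rows, errors = scrape_all(['ok-1', 'bad-2', 'ok-3'], fake_scrape, delay=0)
print(len(rows), len(errors))  # 2 1
```

Keeping the failed URLs in a separate list makes it possible to retry only those pages later.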

## **5. Convert the list of dictionaries to a DataFrame and save it as a CSV file**

In [10]:
# restaurants_info is a list of dicts, one per restaurant, so the plain constructor is enough
df = pd.DataFrame(restaurants_info)
df

Unnamed: 0,Restaurant name,Categories,TEL/reservation,Addresses,Transportation,Operating Hours,Shop holidays,Budget,Method of payment,Table money/charge,Awards
0,Yakitonasachan(やきとん　あさちゃん 戸越銀座店),"Izakaya (Tavern), Grilled pork, Yakitori (Gril...",050-5570-4876 (+81-50-5570-4876) (For ...,1-chome-8-1 Hiratsuka Shinagawa City Tokyo-to ...,東急池上線　戸越銀座駅　徒歩10秒47 meters from Togoshi Ginza.,月～木曜日　15時～23時(ラストオーダー 22:30)金曜日 15時～23時30分(ラ...,12/31 1/1 休業,"Dinner:￥2,000～￥2,999Lunch:￥2,000～￥2,999",Credit Cards AcceptedElectronic money Not Acce...,お通し(席料込み)　220円,
1,sakabafutamata(酒場フタマタ 西新橋店),"Izakaya (Tavern), Dumplings, Yakitori (Grilled...",050-5456-6727 (+81-50-5456-6727) (For ...,1-chome-3-1 Nishishinbashi Minato City Tokyo-t...,JR新橋駅徒歩5分銀座線虎ノ門駅徒歩5分都営三田線内幸町駅A4ｂ出口直結155 meters...,[月～金]11:30～14:3017:00～23:00,土日祝 12/29~1/3 休業,"Dinner:￥2,000～￥2,999Lunch:～￥999",Credit Cards Accepted (VISA、M...,お通し：429円※コース利用の場合はなし,
2,Maggiore(マッジョーレ),"Dining bar, Italian, Pasta",03-5610-5151 (+81-3-5610-5151) 予約可※ご予約...,Taihei Sumida City Tokyo ...,ＪＲ総武線錦糸町駅北口から徒歩７分東京メトロ半蔵門線錦糸町駅４番出口から徒歩３分401 me...,【ランチ）　11：30～14：00（L.O)平日　ランチ専用メニュー土、日、祝　アラカルトメ...,月曜日,"Dinner:￥4,000～￥4,999Lunch:￥1,000～￥1,999",Credit Cards Accepted (VISA、M...,,
3,Tanyashe(譚鴨血老火鍋 池袋東口店),"Chinese hot pot / fire pot, Izakaya (Tavern), ...",050-5872-8185 (+81-50-5872-8185) (For ...,1-chome-26-2 Minamiikebukuro Toshima City Toky...,池袋駅東口徒歩2分JR池袋駅東口を出て、目の前の大通りの左側の道をまっすぐ、マクドナルド・g...,全日11:00～23:00(Lo:22:30)Open on sundays,年中無休,"Dinner:￥3,000～￥3,999Lunch:￥1,000～￥1,999",Credit Cards Accepted (VISA、M...,チャージなし,
4,Yakinikuinami(焼肉一七三 向山),"Yakiniku (BBQ Beef), Horumon (BBQ Offel), Izak...",050-5890-0396 (+81-50-5890-0396) (For ...,1-chome-13-3 Ebisuminami Shibuya City Tokyo ...,JR山手線ほか　恵比寿駅　西口　徒歩3分151 meters from Ebisu.,[火～日]17:00～24:00(L.O)Open on sundays,月曜日,"Dinner:￥8,000～￥9,999",Credit Cards Accepted (VISA、M...,,
...,...,...,...,...,...,...,...,...,...,...,...
995,TAMURO(酒肴場 屯),"Seafood, Izakaya (Tavern), Nihonshu (Japanese ...",050-5594-4178 (+81-50-5594-4178) (For ...,2-chome-8-17 Shinbashi Minato City Tokyo-to ...,JR新橋駅 日比谷口 徒歩2分地下鉄銀座線 新橋駅 8番出口 徒歩2分都営三田線 内幸町駅 ...,昼【月〜金】11:30〜14:30※スープがなくなり次第終了となります。夜【月〜金】17:0...,日曜、祝日,"Dinner:￥6,000～￥7,999",Credit Cards Accepted (VISA、M...,お通し(2種)代800円（税込）、サービス料5%,
996,kanzenkoshitsukaisenshungyohananomai(完全個室・海鮮旬魚...,"Izakaya (Tavern), Seafood, Yakitori (Grilled c...",050-5592-7469 (+81-50-5592-7469) (For ...,1-chome-2-7 Kajicho Chiyoda City Tokyo-to ...,ＪＲ神田南口／西口：徒歩１分　銀座線神田駅：徒歩３分東京駅 方面より【ガード 高架下】　23...,【ディナー】月～土　17:00～23:30,日曜日、祝日,"Dinner:￥2,000～￥2,999Lunch:～￥999",Credit Cards Accepted (VISA、M...,当店は、【先付 （お通し）】として、300円 を頂いております。心ばかりのものですが、お料理...,
997,shintoukyouyakinikuyuushinutage(新東京焼肉 遊心 宴 日本橋...,"Yakiniku (BBQ Beef), Horumon (BBQ Offel)",050-5457-0559 (+81-50-5457-0559) (For ...,1-chome-32-2 Nihonbashikakigaracho Chuo City T...,東京メトロ半蔵門線　水天宮前駅 6番出口 徒歩3分東京メトロ日比谷線　人形町駅 A2番出口 ...,ランチ全日11:30～14:00（LO13:30）ディナー月〜土17:00～23:00（フー...,不定休,"Dinner:￥5,000～￥5,999Lunch:￥1,000～￥1,999",Credit Cards Accepted (JCB、AM...,,
998,Wine no Ruisuke(Wine no Ruisuke 渋谷ストリーム店),"Bar, Fowl, Dining bar",050-5596-1074 (+81-50-5596-1074) (For ...,3-chome-21-3 Shibuya Shibuya City Tokyo-to ...,以下、沿線「渋谷駅」C2出口直結・東急東横線、田園都市線・東京メトロ半蔵門線、副都心線※JR...,★営業時間★【ランチ】平日 12：00～15：00土日祝　12：00～14：00【ディナー...,無休(ストリームに準ずる),"Dinner:￥3,000～￥3,999Lunch:￥1,000～￥1,999",Credit Cards Accepted (VISA、M...,お通し代として￥300いただいております。当店の表示価格は全て税込み価格です。別途10％のサ...,
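
In the output above, the `Budget` column runs the dinner and lunch ranges together because `scrape` removes the newlines between them. A minimal sketch of splitting such a cell back into its parts, assuming only the `Dinner:` and `Lunch:` labels seen in the sample rows occur; `split_budget` and `BUDGET_PART` are hypothetical names.

```python
import re

# Capture each label and everything up to the next label or the end of the string.
BUDGET_PART = re.compile(r'(Dinner|Lunch):(.*?)(?=Dinner:|Lunch:|$)')


def split_budget(budget):
    """Split a Budget cell into a dict keyed by 'Dinner' and 'Lunch' (hypothetical helper)."""
    return {meal: price.strip() for meal, price in BUDGET_PART.findall(budget)}


print(split_budget('Dinner:￥2,000～￥2,999Lunch:～￥999'))
print(split_budget('Dinner:￥8,000～￥9,999'))
```

The helper expects a string, so missing values in the column would need to be filtered out or filled before applying it.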


In [11]:
# Name the file after the scraped region so other regions do not overwrite it
df.to_csv('restaurants_' + targeted_region + '.csv')
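
The `csv` module is imported at the top of the notebook but not used. As an alternative sketch, the same list of dictionaries could be written without pandas using `csv.DictWriter`. `write_rows` is a hypothetical helper, and the sample rows are placeholders rather than scraped data.

```python
import csv
import io


def write_rows(rows, handle):
    """Write a list of dicts as CSV, using the union of keys as the header (hypothetical helper)."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills in fields that a given restaurant does not have.
    writer = csv.DictWriter(handle, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(rows)


sample = [
    {'Restaurant name': 'A', 'Budget': 'Dinner:￥2,000～￥2,999'},
    {'Restaurant name': 'B', 'Awards': 'x'},  # placeholder rows, not scraped data
]
buffer = io.StringIO()
write_rows(sample, buffer)
print(buffer.getvalue())
```

When writing to a real file instead of a `StringIO` buffer, open it with `newline=''` and `encoding='utf-8'` so the Japanese text and line endings are written cleanly.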