# 53 Stations of the Tōkaidō - Wikipedia Extracts


Sources:

- [東海道五十三次 - Wikipedia](https://ja.wikipedia.org/wiki/%E6%9D%B1%E6%B5%B7%E9%81%93%E4%BA%94%E5%8D%81%E4%B8%89%E6%AC%A1)
- [東海道五十三次 (浮世絵) - Wikipedia](https://ja.wikipedia.org/wiki/%E6%9D%B1%E6%B5%B7%E9%81%93%E4%BA%94%E5%8D%81%E4%B8%89%E6%AC%A1_(%E6%B5%AE%E4%B8%96%E7%B5%B5))

In [1]:
import json
from time import sleep

import pandas as pd
import requests
from tqdm.notebook import tqdm

In [2]:
df = pd.read_csv("stations.csv")
df["no"] = df["no"].astype("string")

df.head().T

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| no | 0 | 1 | 2 | 3 | 4 |
| name | 日本橋 | 品川宿 | 川崎宿 | 神奈川宿 | 保土ヶ谷宿 |
| name_kana | にほんばし | しながわ | かわさき | かながわ | ほどがや |
| name_roman | Nihonbashi | Shinagawa | Kawasaki | Kanagawa | Hodogaya |
| province | 武蔵国 | 武蔵国 | 武蔵国 | 武蔵国 | 武蔵国 |
| img_caption | 朝之景 / 行列振出 | 日之出 | 六郷渡舟 | 台之景 | 新町橋 |
| latitude | 35.683611 | 35.621944 | 35.535556 | 35.472778 | 35.444028 |
| longitude | 139.774444 | 139.739167 | 139.707778 | 139.632278 | 139.595556 |
| wikipedia_ja | https://ja.wikipedia.org/wiki/%E6%97%A5%E6%9C%... | https://ja.wikipedia.org/wiki/%E5%93%81%E5%B7%... | https://ja.wikipedia.org/wiki/%E5%B7%9D%E5%B4%... | https://ja.wikipedia.org/wiki/%E7%A5%9E%E5%A5%... | https://ja.wikipedia.org/wiki/%E4%BF%9D%E5%9C%... |
| wikipedia_en | https://en.wikipedia.org/wiki/Nihonbashi | https://en.wikipedia.org/wiki/Shinagawa-juku | https://en.wikipedia.org/wiki/Kawasaki-juku | https://en.wikipedia.org/wiki/Kanagawa-juku | https://en.wikipedia.org/wiki/Hodogaya-juku |


In [3]:
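def get_wikipedia_ja_extract(title):
    """Return the plain-text intro extract of a Japanese Wikipedia article.

    `title` is expected to be percent-encoded already (it is cut out of the
    article URL), so it is placed in the query string as-is.
    """
    url = (
        "https://ja.wikipedia.org/w/api.php"
        "?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1"
        f"&titles={title}"
    )

    # A timeout keeps one stalled request from hanging the whole loop.
    res = requests.get(url, timeout=10)
    res.raise_for_status()

    # One title was requested, so exactly one page is expected back.
    pages = res.json()["query"]["pages"]
    assert len(pages) == 1
    extract = next(iter(pages.values()))["extract"]

    return extract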

In [4]:
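def get_wikipedia_en_extract(title):
    """Return the plain-text intro extract of an English Wikipedia article.

    `title` is expected to be percent-encoded already (it is cut out of the
    article URL), so it is placed in the query string as-is.
    """
    url = (
        "https://en.wikipedia.org/w/api.php"
        "?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1"
        f"&titles={title}"
    )

    # A timeout keeps one stalled request from hanging the whole loop.
    res = requests.get(url, timeout=10)
    res.raise_for_status()

    # One title was requested, so exactly one page is expected back.
    pages = res.json()["query"]["pages"]
    assert len(pages) == 1
    extract = next(iter(pages.values()))["extract"]

    return extract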
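
The two functions above differ only in the language subdomain, so they could be folded into one. The following is a minimal sketch of that idea, not the notebook's own code: `build_extract_request` and `parse_extract` are hypothetical helper names, and the sketch assumes the title arrives percent-encoded (as it does when cut from an article URL), so it is decoded before being handed over as a query parameter. Returning `None` for a page without an `extract` key is also an assumption of this sketch; the original functions raise `KeyError` in that case.

```python
from urllib.parse import unquote


def build_extract_request(lang, title):
    """Return (url, params) for an intro-extract query (hypothetical helper).

    `lang` is a Wikipedia language code such as "ja" or "en"; `title` may be
    percent-encoded, as it is when cut out of an article URL.
    """
    url = f"https://{lang}.wikipedia.org/w/api.php"
    params = {
        "format": "json",
        "action": "query",
        "prop": "extracts",
        "exintro": 1,
        "explaintext": 1,
        "redirects": 1,
        # Decode first so that the HTTP client does not encode the title twice.
        "titles": unquote(title),
    }
    return url, params


def parse_extract(payload):
    """Pull the single page's extract out of a decoded API response."""
    pages = payload["query"]["pages"]
    if len(pages) != 1:
        raise ValueError(f"expected exactly one page, got {len(pages)}")
    page = next(iter(pages.values()))
    # Assumption of this sketch: a page without an extract yields None.
    return page.get("extract")
```

With `requests` imported as in the first cell, a call could look like `url, params = build_extract_request("ja", title)` followed by `parse_extract(requests.get(url, params=params, timeout=10).json())`.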

In [5]:
extracts_dict = {}

for row in tqdm(df.itertuples(index=False), total=df.shape[0]):
    title_ja = row.wikipedia_ja.replace("https://ja.wikipedia.org/wiki/", "")
    extract_ja = get_wikipedia_ja_extract(title_ja)
    
    title_en = row.wikipedia_en.replace("https://en.wikipedia.org/wiki/", "")
    extract_en = get_wikipedia_en_extract(title_en)
    
    extracts_dict[row.no] = {
        "ja": extract_ja,
        "en": extract_en
    }

    sleep(1)  # pause between stations to keep the request rate low
    
len(extracts_dict)

55
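
The count confirms that every row produced an entry, but not that each entry holds usable text. A minimal sketch of a follow-up check is shown below; `find_incomplete` is a hypothetical helper, and the sample dictionary is made-up data with the same shape as `extracts_dict`.

```python
def find_incomplete(extracts, langs=("ja", "en")):
    """Return the keys whose extract is missing or blank in any language."""
    incomplete = []
    for no, by_lang in extracts.items():
        # Treat None, a missing key, and whitespace-only text as incomplete.
        if any(not (by_lang.get(lang) or "").strip() for lang in langs):
            incomplete.append(no)
    return incomplete


# Made-up sample data, shaped like extracts_dict.
sample = {
    "0": {"ja": "日本橋", "en": "Nihonbashi"},
    "1": {"ja": "", "en": "Shinagawa-juku"},
    "2": {"ja": "川崎宿", "en": None},
}
incomplete = find_incomplete(sample)
```

Calling `find_incomplete(extracts_dict)` after the loop would list any stations worth re-fetching or inspecting by hand.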

In [6]:
extracts_dict["0"]

{'ja': '日本橋（にほんばし）は、もともとは1603年（慶長8年）に江戸で最初に町割りが行われた場所にあった川に架けられた木造の橋で、その後何代にもわたり掛け替えられ、現在のものは1911年に完成したもので、東京都中央区の日本橋川に架かり、石造りの2連アーチ橋となっている。',
 'en': 'Nihonbashi (日本橋) is a business district of Chūō, Tokyo, Japan which grew up around the bridge of the same name which has linked two sides of the Nihonbashi River at this site since the 17th century.  The first wooden bridge was completed in 1603. The current bridge,  designed by Tsumaki Yorinaka and constructed of stone on a steel frame, dates from 1911.  The district covers a large area to the north and east of the bridge, reaching Akihabara to the north and the Sumida River to the east. Ōtemachi is to the west and Yaesu and Kyobashi to the south.\nNihonbashi, together with Kyobashi and Kanda, is the core of Shitamachi, the original downtown center of Edo-Tokyo, before the rise of newer secondary centers such as Shinjuku and Shibuya.'}

In [7]:
with open("./extracts.json", "w", encoding="utf-8") as fp:
    json.dump(extracts_dict, fp, ensure_ascii=False, indent=2)
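
Because `ensure_ascii=False` writes the Japanese text as raw characters, the file encoding matters when the data is read back. The sketch below round-trips a small made-up dictionary through a temporary file with an explicit UTF-8 encoding; `save_and_reload` is a hypothetical helper, not part of the notebook.

```python
import json
import tempfile
from pathlib import Path


def save_and_reload(extracts, path):
    """Write `extracts` as UTF-8 JSON to `path`, then read it back."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as fp:
        json.dump(extracts, fp, ensure_ascii=False, indent=2)
    with path.open(encoding="utf-8") as fp:
        return json.load(fp)


# Made-up sample data, shaped like extracts_dict.
sample = {"0": {"ja": "日本橋", "en": "Nihonbashi"}}
with tempfile.TemporaryDirectory() as tmp:
    reloaded = save_and_reload(sample, Path(tmp) / "extracts.json")
```

If the reloaded dictionary equals the original, the non-ASCII text survived the write and the read.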

In [8]:
!head ./extracts.json

{
  "0": {
    "ja": "日本橋（にほんばし）は、もともとは1603年（慶長8年）に江戸で最初に町割りが行われた場所にあった川に架けられた木造の橋で、その後何代にもわたり掛け替えられ、現在のものは1911年に完成したもので、東京都中央区の日本橋川に架かり、石造りの2連アーチ橋となっている。",
    "en": "Nihonbashi (日本橋) is a business district of Chūō, Tokyo, Japan which grew up around the bridge of the same name which has linked two sides of the Nihonbashi River at this site since the 17th century.  The first wooden bridge was completed in 1603. The current bridge,  designed by Tsumaki Yorinaka and constructed of stone on a steel frame, dates from 1911.  The district covers a large area to the north and east of the bridge, reaching Akihabara to the north and the Sumida River to the east. Ōtemachi is to the west and Yaesu and Kyobashi to the south.\nNihonbashi, together with Kyobashi and Kanda, is the core of Shitamachi, the original downtown center of Edo-Tokyo, before the rise of newer secondary centers such as Shinjuku and Shibuya."
  },
  "1": {
    "ja": "品川宿（しながわしゅく、しながわじゅく）は、東海道五十三次の宿場の一つ。東海道の第一宿であり、中山道の板橋宿、