<h1 style="background-color: #cf6868; color: white; padding: 24px; line-height: 32px;">Exercise 2: <br>
Population change </h1>
<p>Hideki Kozima (xkozima@tohoku.ac.jp)</p>

<hr style="border-color: #cf6868; border-width: 12px;" />
## Preparation

In [None]:
# read the file as a list of lines
with open("data/popuPrefChangeCP932.csv", "r", encoding="cp932") as file:
    lineList = file.readlines()

The file (in CSV format) looks like:
<img src="img/resasExample1.png" width="90%" style="border: solid 1px #cccccc">
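
To illustrate why `open()` is given `encoding="cp932"`, here is a minimal sketch that encodes one prefecture name as cp932 bytes and decodes it back. The variable names (`name`, `raw`, `decoded_ok`) are made up for this example and are not used elsewhere in the exercise.

```python
# A prefecture name as it would be stored in a cp932-encoded file
name = "北海道"
raw = name.encode("cp932")    # the bytes as they sit on disk
print(raw)
print(raw.decode("cp932"))    # the right codec recovers the text

# The same bytes are not valid UTF-8, so reading the file
# without encoding="cp932" may fail on a UTF-8 system
try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print("decodes as UTF-8:", decoded_ok)
```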

<hr style="border-color: #cf6868; border-width: 12px;" />
## Data source

"data/popuPrefChangeCP932.csv" was downloaded from RESAS (Regional Economy and Society Analyzing System, "https://resas.go.jp"), which is run by the Japanese government. To download the file, visit "https://resas.go.jp/population-composition/#". Note that most of the files on the site are CSV files encoded in "cp932".

Other sources of official statistical information of Japan:
* Statistics Bureau: http://www.e-stat.go.jp/SG1/estat/eStatTopPortalE.do
* Data.go.jp: http://www.data.go.jp/?lang=english
* METI (Ministry of Economy, Trade and Industry, Japan): http://datameti.go.jp/?lang=en
* JMA (Japan Meteorological Agency): http://www.jma.go.jp/jma/menu/menureport.html

<hr style="border-color: #cf6868; border-width: 12px;" />
## What's in the dataset

The data contains the population of each prefecture in 1960, 1965, ..., 2015 (actual) and in 2020, 2025, ..., 2040 (estimated): 47 prefectures x 17 years = 799 lines. The file also comes with a "header" line, so it has 800 lines in total.
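
The line count can be checked with a few lines of arithmetic. This is only a sketch; the names `actual_years`, `estimated_years`, `n_years` and `n_prefectures` are introduced here for illustration.

```python
actual_years = list(range(1960, 2016, 5))      # 1960, 1965, ..., 2015
estimated_years = list(range(2020, 2041, 5))   # 2020, 2025, ..., 2040
n_years = len(actual_years) + len(estimated_years)
n_prefectures = 47

print("years:", n_years)                        # 12 actual + 5 estimated
print("data lines:", n_prefectures * n_years)
```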

In [None]:
# let's take a peek at the data
print("number of lines:", len(lineList))
print("header:", lineList[0], end="")
print("line 1:", lineList[1], end="")
print("line 2:", lineList[2], end="")
print("line 3:", lineList[3], end="")
print("line 4:", lineList[4], end="")

The header and the first four data lines contain the following information:
<table style="font-size: 75%">
<tr><th>集計年</th><th>都道府県コード</th><th>都道府県名</th><th>総人口（人）</th><th>年少人口（人）</th><th>生産年齢人口（人）</th><th>老年人口（人）</th><th>年少人口割合</th><th>生産年齢人口割合</th><th>老年人口割合</th></tr>
<tr><td>1960</td><td>1</td><td>北海道</td><td>5039206</td><td>1681479</td><td>3145664</td><td>212063</td><td>0.33</td><td>0.62</td><td>0.04</td></tr>
<tr><td>1960</td><td>2</td><td>青森県</td><td>1426606</td><td>513397</td><td>848838</td><td>64371</td><td>0.36</td><td>0.6</td><td>0.05</td></tr>
<tr><td>1960</td><td>3</td><td>岩手県</td><td>1448517</td><td>501782</td><td>870492</td><td>76243</td><td>0.35</td><td>0.6</td><td>0.05</td></tr>
<tr><td>1960</td><td>4</td><td>宮城県</td><td>1743195</td><td>584497</td><td>1063732</td><td>94966</td><td>0.34</td><td>0.61</td><td>0.05</td></tr>
</table>

The header columns translate as: <br>
(0) "year of census", (1) "prefecture code", (2) "prefecture name", <br>
(3) "total population", <br>
(4) "junior (0~14y)", (5) "productive (15~64y)", (6) "senior (65y~)", <br>
(7) "junior (proportion)", (8) "productive (proportion)", (9) "senior (proportion)".
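
As a sketch of how the columns relate, the snippet below labels the first data row (Hokkaido, 1960, copied from the table above) and recomputes the three proportions from the counts. The English column names are assumptions made for this example; they are not part of the file.

```python
# English labels for column indices 0..9 (hypothetical names)
columns = ["year", "pref_code", "pref_name", "total",
           "junior", "productive", "senior",
           "junior_prop", "productive_prop", "senior_prop"]
row = [1960, 1, "北海道", 5039206, 1681479, 3145664, 212063, 0.33, 0.62, 0.04]
record = dict(zip(columns, row))

# For this row, the three age groups add up to the total population
group_sum = record["junior"] + record["productive"] + record["senior"]
print(group_sum == record["total"])

# Columns 7..9 are columns 4..6 divided by column 3, rounded to 2 digits
for group in ["junior", "productive", "senior"]:
    computed = round(record[group] / record["total"], 2)
    print(group, computed, record[group + "_prop"])
```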

In [None]:
prefDict = {
 '北海道' : 'Hokkaido',  '青森県' : 'Aomori',    '岩手県' : 'Iwate',
 '宮城県' : 'Miyagi',    '秋田県' : 'Akita',     '山形県' : 'Yamagata',
 '福島県' : 'Fukushima', '茨城県' : 'Ibaraki',   '栃木県' : 'Tochigi',
 '群馬県' : 'Gunma',     '埼玉県' : 'Saitama',   '千葉県' : 'Chiba',
 '東京都' : 'Tokyo',     '神奈川県':'Kanagawa',  '新潟県' : 'Niigata',
 '富山県' : 'Toyama',    '石川県' : 'Ishikawa',  '福井県' : 'Fukui',
 '山梨県' : 'Yamanashi', '長野県' : 'Nagano',    '岐阜県' : 'Gifu',
 '静岡県' : 'Shizuoka',  '愛知県' : 'Aichi',     '三重県' : 'Mie',
 '滋賀県' : 'Shiga',     '京都府' : 'Kyoto',     '大阪府' : 'Osaka',
 '兵庫県' : 'Hyogo',     '奈良県' : 'Nara',      '和歌山県':'Wakayama',
 '鳥取県' : 'Tottori',   '島根県' : 'Shimane',   '岡山県' : 'Okayama',
 '広島県' : 'Hiroshima', '山口県' : 'Yamaguchi', '徳島県' : 'Tokushima',
 '香川県' : 'Kagawa',    '愛媛県' : 'Ehime',     '高知県' : 'Kochi',
 '福岡県' : 'Fukuoka',   '佐賀県' : 'Saga',      '長崎県' : 'Nagasaki',
 '熊本県' : 'Kumamoto',  '大分県' : 'Oita',      '宮崎県' : 'Miyazaki',
 '鹿児島県':'Kagoshima', '沖縄県' : 'Okinawa' }
print(prefDict)

In [None]:
# (0) remove header, now 799 lines
lineListBody = lineList[1:]
# (1) remove trailing "\n"
# (2) Romanize the prefecture names
# (3) make strings into numbers
dataList = []
for line in lineListBody:
    line = line.rstrip("\n")
    stringList = line.split(",")
    recordList = [int(stringList[0]), int(stringList[1]), prefDict[stringList[2]], 
                  int(stringList[3]), 
                  int(stringList[4]), int(stringList[5]), int(stringList[6]), 
                  float(stringList[7]), float(stringList[8]), float(stringList[9]) ]
    dataList.append(recordList)
# peek at the first 4 and the last 4 records
for i in range(0, 4):
    print(dataList[i])
print("...")
for i in range(795, 799):
    print(dataList[i])
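
An alternative way to do steps (0) to (3) is the standard `csv` module, which handles the splitting and the line endings. The sketch below runs on a three-line sample (the header and two data lines copied from the table above) instead of the real file; `sample`, `names` and `records` are names invented for this example.

```python
import csv
import io

# Header plus two data lines copied from the table above
sample = (
    "集計年,都道府県コード,都道府県名,総人口（人）,年少人口（人）,生産年齢人口（人）,"
    "老年人口（人）,年少人口割合,生産年齢人口割合,老年人口割合\n"
    "1960,1,北海道,5039206,1681479,3145664,212063,0.33,0.62,0.04\n"
    "1960,2,青森県,1426606,513397,848838,64371,0.36,0.6,0.05\n"
)
names = {"北海道": "Hokkaido", "青森県": "Aomori"}   # a subset of prefDict

reader = csv.reader(io.StringIO(sample))
header = next(reader)                  # (0) skip the header row
records = []
for fields in reader:                  # (1) no trailing "\n" left in fields
    record = ([int(fields[0]), int(fields[1]), names[fields[2]]]   # (2) Romanize
              + [int(s) for s in fields[3:7]]       # (3) population counts
              + [float(s) for s in fields[7:10]])   # (3) proportions
    records.append(record)

for record in records:
    print(record)
```

With the real file, `io.StringIO(sample)` would be replaced by the opened file object and `names` by `prefDict`.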

<hr style="border-color: #cf6868; border-width: 12px;" />
## Total population change

In [None]:
# generate yearList
yearList = []
for data in dataList:
    year = data[0]
    if year not in yearList:
        yearList.append(year)
print(yearList)
# generate prefList
prefList = []
for data in dataList:
    pref = data[2]
    if pref not in prefList:
        prefList.append(pref)
print(prefList)

In [None]:
# total population of Japan from 1960 to 2040
totalPopuList = []
for year in yearList:
    popuList = [data[3] for data in filter(lambda data: data[0] == year, dataList)]
    total = sum(popuList)
    totalPopuList.append((year, total))
print(totalPopuList)
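
The cell above scans `dataList` once per year. A single pass that accumulates the totals in a dictionary gives the same kind of result; the sketch below shows the idea on toy records. The values in `toy` are made up for illustration and do not come from the dataset.

```python
from collections import defaultdict

# Toy records laid out like dataList: [year, code, name, total population]
# (made-up values, not from the dataset)
toy = [
    [1960, 1, "A", 100], [1960, 2, "B", 50],
    [1965, 1, "A", 110], [1965, 2, "B", 45],
]

totals = defaultdict(int)
for rec in toy:
    totals[rec[0]] += rec[3]          # add this prefecture to its year's total

toy_total_list = sorted(totals.items())   # list of (year, total) tuples
print(toy_total_list)
```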

In [None]:
import matplotlib.pyplot as plt
x = [tpl[0] for tpl in totalPopuList]
y = [tpl[1] for tpl in totalPopuList]
plt.title("Total population of Japan")
plt.ylim(ymin=0, ymax=1.4e8)
plt.plot(x, y)
plt.show()

<hr style="border-color: #cf6868; border-width: 12px;" />
## Population change in each prefecture

In [None]:
# population of each prefecture from 1960 to 2040
prefPopuList = []
for pref in prefList:
    prefData = list(filter(lambda data: data[2] == pref, dataList))
    popuList = []
    for year in yearList:
        popuList += [data[3] for data in prefData if data[0] == year]
    prefPopuList.append((pref, popuList))
# print(prefPopuList) --- too long to print all
print(prefPopuList[0])
print(prefPopuList[1])
print(prefPopuList[2])
print(prefPopuList[3])
print(prefPopuList[4])

In [None]:
# draw it
plt.title("Population of each prefecture")
plt.ylim(ymin=0, ymax=1.4e7)
for prefPopu in prefPopuList:
    pref = prefPopu[0]
    popuList = prefPopu[1]
    plt.plot(yearList, popuList, label=pref)
# plt.legend() --- too big to show
plt.show()

In [None]:
# population of major prefectures from 1960 to 2040
prefPopuList2 = []
for pref in ["Tokyo", "Osaka", "Hokkaido", "Fukuoka", "Miyagi"]:
    prefData = list(filter(lambda data: data[2] == pref, dataList))
    popuList = []
    for year in yearList:
        popuList += [data[3] for data in prefData if data[0] == year]
    prefPopuList2.append((pref, popuList))
# draw it
plt.title("Population of major prefectures")
plt.ylim(ymin=0, ymax=1.4e7)
for prefPopu in prefPopuList2:
    pref = prefPopu[0]
    popuList = prefPopu[1]
    plt.plot(yearList, popuList, label=pref)
plt.legend()
plt.show()

<hr style="border-color: #cf6868; border-width: 12px;" />
## Age balance in population in each prefecture

In [None]:
# proportion of elderly population (65y~) in selected prefectures from 1960 to 2040
prefPopuList2 = []
for pref in ["Akita", "Miyagi", "Tokyo", "Okinawa"]:
    prefData = list(filter(lambda data: data[2] == pref, dataList))
    popuList = []
    for year in yearList:
        popuList += [data[9] for data in prefData if data[0] == year]
    prefPopuList2.append((pref, popuList))
# print(prefPopuList2) --- too long to print all
for prefPopu in prefPopuList2:
    pref = prefPopu[0]
    popuList = prefPopu[1]
    plt.plot(yearList, popuList, label=pref)
plt.title("Elderly population (relative)")
plt.ylim(ymin=0, ymax=0.5)
plt.legend()
plt.show()

In [None]:
# proportion of junior population (~14y) in selected prefectures from 1960 to 2040
prefPopuList2 = []
for pref in ["Akita", "Miyagi", "Tokyo", "Okinawa"]:
    prefData = list(filter(lambda data: data[2] == pref, dataList))
    popuList = []
    for year in yearList:
        popuList += [data[7] for data in prefData if data[0] == year]
    prefPopuList2.append((pref, popuList))
# print(prefPopuList2) --- too long to print all
for prefPopu in prefPopuList2:
    pref = prefPopu[0]
    popuList = prefPopu[1]
    plt.plot(yearList, popuList, label=pref)
plt.title("Junior population (relative)")
plt.ylim(ymin=0, ymax=0.5)
plt.legend()
plt.show()
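
The last few cells repeat the same loop with a different list of prefectures and a different column index. One possible way to factor that out is a small helper, sketched below on toy records. The function name `series_by_pref` and the values in `toy` are hypothetical and not part of the original exercise.

```python
def series_by_pref(records, prefs, years, column):
    """Return [(pref, [value for each year])] for the given column index."""
    # index the records by (prefecture name, year) for direct lookup
    lookup = {(rec[2], rec[0]): rec[column] for rec in records}
    return [(pref, [lookup[(pref, year)] for year in years]) for pref in prefs]

# Toy records laid out like dataList: [year, code, name, total population]
# (made-up values, not from the dataset)
toy = [
    [1960, 1, "A", 100], [1960, 2, "B", 50],
    [1965, 1, "A", 110], [1965, 2, "B", 45],
]

toy_series = series_by_pref(toy, ["B", "A"], [1960, 1965], 3)
print(toy_series)
```

With the real data, a call such as `series_by_pref(dataList, ["Akita", "Miyagi"], yearList, 9)` would return the list that the elderly-population cell builds by hand.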

<h3 style="background-color: #cf6868; color: white; padding: 24px; text-align: center;">(cc) Koziken, MMXVII</h3>