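# Predicting rainfall one hour ahead from weather data

Using the observations from the past up to the present, predict rainfall one hour later (either whether it rains or how much).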
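The prediction target described above can be sketched with `shift(-1)`. This is only an illustration under stated assumptions: the rows are sorted by time and spaced exactly one hour apart, the helper `add_next_hour_target` and the column names `rain_mm_next_hour` and `rain_next_hour` are hypothetical, and the sample values are made up rather than taken from the CSV.

```python
import pandas as pd


def add_next_hour_target(df: pd.DataFrame, rain_col: str = "降水量(mm)") -> pd.DataFrame:
    """Return a copy of df with next-hour precipitation targets appended.

    Assumes the rows are sorted by time and spaced exactly one hour apart.
    """
    out = df.copy()
    # shift(-1) moves each value up one row, so row t holds the value observed at t + 1 h.
    out["rain_mm_next_hour"] = out[rain_col].shift(-1)
    # The last row has no observation one hour later, so drop it.
    out = out.dropna(subset=["rain_mm_next_hour"])
    # Binary target: did any rain fall during the following hour?
    return out.assign(rain_next_hour=out["rain_mm_next_hour"] > 0)


# Made-up values for illustration only (not taken from the real CSV).
toy = pd.DataFrame(
    {"降水量(mm)": [0.0, 0.0, 1.5, 0.0, 2.0]},
    index=["2017/1/1 1:00", "2017/1/1 2:00", "2017/1/1 3:00",
           "2017/1/1 4:00", "2017/1/1 5:00"],
)
labeled = add_next_hour_target(toy)
print(labeled)
```

Either column can serve as the target: `rain_next_hour` for the yes/no question, `rain_mm_next_hour` for the amount.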

***

## 1. Get an overview of the data

### Load the data from a CSV file

In [1]:
# Load the libraries
import pandas as pd
pd.set_option('display.precision', 2)  # full option name, matching the reset_option call later

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Load the data and display it
df = pd.read_csv(
    "17_preprocessed.csv",
    sep=",",
    skiprows=1,
    header=1,
    encoding="Shift_JISx0213").set_index("年月日時")
df.head()            # show the first five rows

年月日時,気温(℃),降水量(mm),降雪(cm),積雪(cm),日照時間(時間),風速(m/s),風向,露点温度(℃),蒸気圧(hPa),相対湿度(％),海面気圧(hPa),現地気圧(hPa),日射量(MJ/㎡),天気,視程(km),雲量(10分比)
2017/1/1 1:00,-0.5,0.0,0,0,0.0,0.7,3.93,-1.5,5.5,93.0,1025.7,973.5,0.0,2,15.0,4
2017/1/1 2:00,-0.4,0.0,0,0,0.0,1.2,1.18,-1.4,5.5,93.0,1025.8,973.6,0.0,2,15.0,4
2017/1/1 3:00,-0.8,0.0,0,0,0.0,1.5,4.71,-1.5,5.5,95.0,1025.8,973.5,0.0,2,15.0,4
2017/1/1 4:00,-2.0,0.0,0,0,0.0,1.1,1.18,-2.8,5.0,94.0,1026.3,973.8,0.0,2,15.0,4
2017/1/1 5:00,-1.7,0.0,0,0,0.0,1.2,5.11,-2.2,5.2,96.0,1026.1,973.6,0.0,2,15.0,4


***

### Show summary statistics

In [2]:
df.describe()

,気温(℃),降水量(mm),降雪(cm),積雪(cm),日照時間(時間),風速(m/s),風向,露点温度(℃),蒸気圧(hPa),相対湿度(％),海面気圧(hPa),現地気圧(hPa),日射量(MJ/㎡),天気,視程(km),雲量(10分比)
count,8758.0,8759.0,8759.0,8759.0,8758.0,8759.0,8645.0,8758.0,8758.0,8758.0,8758.0,8759.0,8759.0,8759.0,8759.0,8759.0
mean,11.92,0.14,0.02,1.98,0.23,2.38,2.66,7.22,12.09,75.3,1014.16,964.79,0.61,4.37,18.37,7.36
std,9.95,1.01,0.21,6.62,0.38,1.72,1.84,9.32,7.49,16.77,7.65,6.49,0.93,3.42,8.15,3.58
min,-9.4,0.0,0.0,0.0,0.0,0.0,0.0,-13.3,2.2,12.0,982.4,935.1,0.0,1.0,0.4,0.0
25%,2.5,0.0,0.0,0.0,0.0,1.1,1.18,-1.4,5.5,65.0,1008.6,960.6,0.0,2.0,15.0,5.0
50%,12.1,0.0,0.0,0.0,0.0,1.8,1.96,6.9,9.9,78.0,1013.6,964.6,0.02,4.0,20.0,10.0
75%,20.6,0.0,0.0,0.0,0.4,3.4,4.32,15.2,17.3,89.0,1019.9,969.3,1.0,4.0,25.0,10.0
max,34.8,52.0,6.0,49.0,1.0,11.1,5.89,24.8,31.2,100.0,1033.0,979.7,3.78,15.0,50.0,10.0


***

### Show the correlation coefficients

In [26]:
df_corr = df.corr()
df_corr

,気温(℃),降水量(mm),降雪(cm),積雪(cm),日照時間(時間),風速(m/s),風向,露点温度(℃),蒸気圧(hPa),相対湿度(％),海面気圧(hPa),現地気圧(hPa),日射量(MJ/㎡),天気,視程(km),雲量(10分比)
気温(℃),1.0,0.0509,-0.14,-0.405,0.202,0.194,0.0932,0.92,0.885,-0.29,-0.539,-0.33,0.34,-0.15,0.03,0.157
降水量(mm),0.05,1.0,0.11,0.00847,-0.0805,-0.00259,-0.0112,0.11,0.123,0.16,-0.11,-0.11,-0.08,0.169,-0.09,0.0831
降雪(cm),-0.14,0.112,1.0,0.25,-0.0588,-0.0432,-0.0348,-0.1,-0.0932,0.12,-0.0179,-0.06,-0.06,0.205,-0.15,0.0706
積雪(cm),-0.4,0.00847,0.25,1.0,-0.044,-0.0958,-0.0285,-0.36,-0.3,0.15,0.214,0.13,-0.06,0.209,-0.09,0.00707
日照時間(時間),0.2,-0.0805,-0.06,-0.044,1.0,0.215,-0.198,-0.04,-0.0426,-0.61,0.00939,0.07,0.84,-0.261,0.05,-0.318
風速(m/s),0.19,-0.00259,-0.04,-0.0958,0.215,1.0,-0.0804,0.04,0.01,-0.41,-0.202,-0.17,0.29,0.0576,0.14,0.0492
風向,0.09,-0.0112,-0.03,-0.0285,-0.198,-0.0804,1.0,0.13,0.126,0.09,-0.0006,0.03,-0.19,-0.0796,0.03,0.00621
露点温度(℃),0.92,0.111,-0.1,-0.363,-0.0442,0.042,0.131,1.0,0.976,0.11,-0.537,-0.35,0.07,-0.0133,-0.05,0.27
蒸気圧(hPa),0.88,0.123,-0.09,-0.3,-0.0426,0.01,0.126,0.98,1.0,0.13,-0.535,-0.35,0.07,0.00198,-0.05,0.266
相対湿度(％),-0.29,0.161,0.12,0.153,-0.612,-0.411,0.0855,0.11,0.134,1.0,0.0402,-0.03,-0.66,0.369,-0.25,0.26


### Show only the large correlation coefficients

In [27]:
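# Keep |r| > 0.3 and mask the rest with "---" (pandas >= 2.1 deprecates applymap in favor of DataFrame.map)
df_corr.applymap(lambda x: x if abs(x) > 0.3 else "---")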

,気温(℃),降水量(mm),降雪(cm),積雪(cm),日照時間(時間),風速(m/s),風向,露点温度(℃),蒸気圧(hPa),相対湿度(％),海面気圧(hPa),現地気圧(hPa),日射量(MJ/㎡),天気,視程(km),雲量(10分比)
気温(℃),1,---,---,-0.4,---,---,---,0.92,0.88,---,-0.54,-0.33,0.34,---,---,---
降水量(mm),---,1,---,---,---,---,---,---,---,---,---,---,---,---,---,---
降雪(cm),---,---,1,---,---,---,---,---,---,---,---,---,---,---,---,---
積雪(cm),-0.4,---,---,1,---,---,---,-0.36,---,---,---,---,---,---,---,---
日照時間(時間),---,---,---,---,1,---,---,---,---,-0.61,---,---,0.84,---,---,-0.32
風速(m/s),---,---,---,---,---,1,---,---,---,-0.41,---,---,---,---,---,---
風向,---,---,---,---,---,---,1,---,---,---,---,---,---,---,---,---
露点温度(℃),0.92,---,---,-0.36,---,---,---,1,0.98,---,-0.54,-0.35,---,---,---,---
蒸気圧(hPa),0.88,---,---,---,---,---,---,0.98,1,---,-0.54,-0.35,---,---,---,---
相対湿度(％),---,---,---,---,-0.61,-0.41,---,---,---,1,---,---,-0.66,0.37,---,---


### Correlation between precipitation and the other variables

In [32]:
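pd.reset_option("display.precision")   # back to the default precision to show more digits
pd.DataFrame(df_corr["降水量(mm)"])     # the precipitation column of the correlation matrix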

,降水量(mm)
気温(℃),0.050889
降水量(mm),1.0
降雪(cm),0.112437
積雪(cm),0.008469
日照時間(時間),-0.080501
風速(m/s),-0.002594
風向,-0.011188
露点温度(℃),0.111477
蒸気圧(hPa),0.123179
相対湿度(％),0.16069


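→ <font color="red">Unfortunately, no variable correlates strongly with precipitation (降水量): the largest coefficient is only about 0.17 in absolute value.</font>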
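One way to make this kind of check easier to read is to rank the variables by the absolute value of their correlation with precipitation. The sketch below is an illustration only: `rank_by_abs_corr` is a hypothetical helper, and the small table uses made-up values, not rows from the CSV.

```python
import pandas as pd


def rank_by_abs_corr(df: pd.DataFrame, target: str) -> pd.Series:
    """Pearson correlations with `target`, ordered from strongest to weakest."""
    # Take the target's column of the correlation matrix and drop its self-correlation.
    corr = df.corr()[target].drop(target)
    # Order by magnitude but return the signed coefficients.
    order = corr.abs().sort_values(ascending=False).index
    return corr.loc[order]


# Made-up values for illustration only.
toy = pd.DataFrame({
    "降水量(mm)": [0.0, 0.0, 1.0, 3.0, 0.0, 2.0],
    "相対湿度(％)": [60, 65, 90, 95, 70, 92],
    "風速(m/s)": [2.0, 1.0, 1.5, 1.2, 2.5, 2.2],
})
ranked = rank_by_abs_corr(toy, "降水量(mm)")
print(ranked)
```

On the real data, `rank_by_abs_corr(df, "降水量(mm)")` would list all variables in one sorted column, so the strongest candidates appear at the top.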

***

## Direction

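* The correlation coefficients alone did not reveal any variable suited to predicting precipitation.
* Use a decision tree to find the variables most strongly related to precipitation.
* Estimate precipitation from those strongly related variables.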
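The decision-tree step in the plan above could look roughly like the following. This is a minimal sketch, not the notebook's actual method: it assumes scikit-learn's `DecisionTreeRegressor` (the text only says "decision tree"), the helper `tree_feature_importance` is hypothetical, and the data are synthetic, built so that precipitation depends only on relative humidity.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor


def tree_feature_importance(df: pd.DataFrame, target: str, max_depth: int = 3) -> pd.Series:
    """Fit a shallow regression tree and return feature importances, largest first."""
    # Drop rows with missing values so the example does not depend on NaN support.
    data = df.dropna()
    X = data.drop(columns=[target])
    y = data[target]
    tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0).fit(X, y)
    return pd.Series(tree.feature_importances_, index=X.columns).sort_values(ascending=False)


# Synthetic data for illustration only: rain appears when humidity exceeds 85 %.
rng = np.random.default_rng(0)
n = 500
humidity = rng.uniform(40, 100, n)
toy = pd.DataFrame({
    "相対湿度(％)": humidity,
    "現地気圧(hPa)": rng.normal(965, 6, n),
    "風速(m/s)": rng.uniform(0, 10, n),
    "降水量(mm)": np.where(humidity > 85, (humidity - 85) * 0.5, 0.0),
})
importance = tree_feature_importance(toy, "降水量(mm)")
print(importance)
```

Unlike a correlation coefficient, a tree can pick up threshold effects such as "rain only above a certain humidity", which is one reason it may surface variables that the correlation table missed. To match the one-hour-ahead goal, the target column would be the precipitation shifted by one hour rather than the same-hour value used in this toy example.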