paper-note

NIPS 2015: Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

1. Overview

The paper aims to predict future precipitation from historical data (precipitation nowcasting).

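The authors treat this as a spatiotemporal sequence forecasting problem, in which both the input and the output are spatiotemporal sequences.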

By adding convolutional structure to both the input-to-state and the state-to-state transitions of FC-LSTM, the authors propose the convolutional LSTM (convLSTM).

The authors compare convLSTM against FC-LSTM and the state-of-the-art ROVER algorithm; convLSTM outperforms both.

2. Preliminaries

2.1 Problem formulation

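Divide the spatial region into an $M\times N$ grid, where each grid cell carries $P$ measurements. A single observation can then be written as a tensor $\mathcal{X}\in\mathrm{R}^{P\times M \times N}$. Given the previous $J$ observations, predicting the most likely next $K$ observations is formalized as: $$ \large \tilde{\mathcal{X}}_{t+1},\cdots,\tilde{\mathcal{X}}_{t+K} = \underset{\mathcal{X}_{t+1},\cdots,\mathcal{X}_{t+K}}{\arg\max}\quad p(\mathcal{X}_{t+1}, \cdots, \mathcal{X}_{t+K} \mid \hat{\mathcal{X}}_{t-J+1}, \hat{\mathcal{X}}_{t-J+2},\cdots,\hat{\mathcal{X}}_{t}) $$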

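In this problem, the observation at each timestamp is a 2D radar echo map. The paper divides each map into tiled, non-overlapping patches (the grid cells) and views the pixels inside a patch as its measurements, which turns the task into a spatiotemporal sequence forecasting problem.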
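The patch view can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions: the helper name `patches_to_channels`, the toy 4×4 frame and the 2×2 patch size are hypothetical and are not taken from the paper.

```python
import numpy as np

def patches_to_channels(frame, patch):
    """Reshape an (H, W) frame into a (patch*patch, H//patch, W//patch) tensor.

    Each non-overlapping patch x patch block becomes one grid cell, and the
    pixels inside it become that cell's P = patch*patch measurements.
    """
    H, W = frame.shape
    if H % patch or W % patch:
        raise ValueError("frame size must be divisible by the patch size")
    M, N = H // patch, W // patch
    # (M, patch, N, patch) -> (patch, patch, M, N) -> (P, M, N)
    blocks = frame.reshape(M, patch, N, patch).transpose(1, 3, 0, 2)
    return blocks.reshape(patch * patch, M, N)

# Hypothetical toy "radar map": a 4x4 frame with values 0..15.
frame = np.arange(16, dtype=np.float32).reshape(4, 4)
x = patches_to_channels(frame, patch=2)
print(x.shape)     # (4, 2, 2), i.e. P=4, M=2, N=2
print(x[:, 0, 0])  # the four pixels of the top-left patch
```

With this layout, a frame sequence of length $J$ becomes $J$ tensors of shape $P\times M\times N$, matching the formulation above.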

Problem complexity and tractability:

Although the number of free variables in a length-K sequence can be up to $O(M^K N^K P^K)$, in practice we may exploit the structure of the space of possible predictions to reduce the dimensionality and hence make the problem tractable.

2.2 LSTM and FC-LSTM

We skip the introduction to the plain LSTM and briefly describe FC-LSTM below, where $\circ$ denotes the Hadamard (element-wise) product: $$ \begin{aligned} i_t =& \sigma(W_{xi}x_t+W_{hi}h_{t-1}+\textcolor{red}{W_{ci}\circ c_{t-1}}+b_i)\\ f_t =& \sigma(W_{xf}x_t+W_{hf}h_{t-1}+\textcolor{red}{W_{cf}\circ c_{t-1}}+b_f)\\ c_t =& f_t \circ c_{t-1} + i_t \circ \tanh (W_{xc}x_t+W_{hc}h_{t-1}+b_c)\\ o_t =& \sigma(W_{xo}x_t+W_{ho}h_{t-1}+\textcolor{red}{W_{co}\circ c_{t}}+b_o)\\ h_t =& o_t \circ \tanh(c_t) \end{aligned} $$ In other words, three peephole connections (the terms in red) are added inside the LSTM cell, as shown in the figure below:

LSTM with peepholes

Image source: link

Note that in FC-LSTM,

the input, cell output and states are all 1D vectors.
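As a rough illustration of one FC-LSTM step with peepholes on 1D vectors, here is a minimal NumPy sketch. The function names, the parameter dictionary layout, the random initialization and the toy sizes are all assumptions made for this example; the hidden state uses the usual update $h_t = o_t \circ \tanh(c_t)$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fc_lstm_step(x_t, h_prev, c_prev, p):
    """One FC-LSTM step with peephole connections; all states are 1D vectors."""
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["W_ci"] * c_prev + p["b_i"])
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["W_cf"] * c_prev + p["b_f"])
    c_t = f_t * c_prev + i_t * np.tanh(p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])
    # The output gate peeks at the updated cell state c_t, not c_prev.
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["W_co"] * c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def init_params(n_in, n_hid, rng):
    """Hypothetical small random initialization, for illustration only."""
    p = {}
    for g in "ifco":
        p[f"W_x{g}"] = 0.1 * rng.standard_normal((n_hid, n_in))
        p[f"W_h{g}"] = 0.1 * rng.standard_normal((n_hid, n_hid))
        p[f"b_{g}"] = np.zeros(n_hid)
    for g in "ifo":
        # Peephole weights act element-wise, so they are vectors.
        p[f"W_c{g}"] = 0.1 * rng.standard_normal(n_hid)
    return p

rng = np.random.default_rng(0)
params = init_params(n_in=3, n_hid=5, rng=rng)
h, c = np.zeros(5), np.zeros(5)
for x_t in rng.standard_normal((4, 3)):  # a toy sequence of 4 input vectors
    h, c = fc_lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

To feed spatial data into such a cell, each $P\times M\times N$ observation would first have to be flattened into a single vector, which discards the grid layout.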

3. Model

Drawbacks of FC-LSTM:

  • Although it can capture temporal correlations, it contains too much redundancy for spatial data.

  • The full connections used in the input-to-state and state-to-state transitions do not encode any spatial information.
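One way to get a feel for the redundancy is to count the weights of a single input-to-state transition. The sketch below is a back-of-the-envelope comparison only; the sizes (16 channels, a 16×16 grid, a 3×3 kernel) and both helper names are hypothetical and were chosen just for illustration.

```python
def fc_weights(p_in, p_hid, m, n):
    """Weights of one dense map between flattened (P, M, N) tensors."""
    return (p_in * m * n) * (p_hid * m * n)

def conv_weights(p_in, p_hid, k):
    """Weights of one k x k convolution from p_in to p_hid channels."""
    return k * k * p_in * p_hid

# Hypothetical sizes: 16 input channels, 16 hidden channels, 16x16 grid, 3x3 kernel.
fc = fc_weights(p_in=16, p_hid=16, m=16, n=16)
conv = conv_weights(p_in=16, p_hid=16, k=3)
print(fc)    # 16777216
print(conv)  # 2304
```

Under these assumed sizes the dense map needs several thousand times more weights than the convolution, and none of them is tied to the neighborhood structure of the grid.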

Key property of convLSTM: all inputs $\mathcal{X}_1, \cdots,\mathcal{X}_t$, cell outputs $\mathcal{C}_1,\cdots,\mathcal{C}_t$, hidden states $\mathcal{H}_1,\cdots,\mathcal{H}_t$, and gates $i_t, f_t, o_t$ are 3D tensors whose last two dimensions are spatial. Here $*$ denotes convolution and $\circ$ the Hadamard product. $$ \begin{aligned} i_{t} &=\sigma\left(W_{x i} * \mathcal{X}_{t}+W_{h i} * \mathcal{H}_{t-1}+W_{c i} \circ \mathcal{C}_{t-1}+b_{i}\right) \\ f_{t} &=\sigma\left(W_{x f} * \mathcal{X}_{t}+W_{h f} * \mathcal{H}_{t-1}+W_{c f} \circ \mathcal{C}_{t-1}+b_{f}\right) \\ \mathcal{C}_{t} &=f_{t} \circ \mathcal{C}_{t-1}+i_{t} \circ \tanh \left(W_{x c} * \mathcal{X}_{t}+W_{h c} * \mathcal{H}_{t-1}+b_{c}\right) \\ o_{t} &=\sigma\left(W_{x o} * \mathcal{X}_{t}+W_{h o} * \mathcal{H}_{t-1}+W_{c o} \circ \mathcal{C}_{t}+b_{o}\right) \\ \mathcal{H}_{t} &=o_{t} \circ \tanh \left(\mathcal{C}_{t}\right) \end{aligned} $$
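The convLSTM update can be sketched in plain NumPy as follows. This is a minimal illustration, not the authors' implementation. It assumes zero-padded "same" convolutions so that the states keep the $M\times N$ grid size, and it implements $*$ as cross-correlation, as deep learning libraries commonly do. The function names, parameter layout, initialization and toy sizes are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv2d_same(x, w):
    """Zero-padded 'same' 2D cross-correlation.

    x: (C_in, M, N), w: (C_out, C_in, k, k) with odd k -> (C_out, M, N).
    """
    k = w.shape[-1]
    pad = k // 2
    _, M, N = x.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((w.shape[0], M, N))
    for a in range(k):
        for b in range(k):
            # Contribution of kernel tap (a, b), summed over input channels.
            out += np.einsum("oc,cmn->omn", w[:, :, a, b], xp[:, a:a + M, b:b + N])
    return out

def conv_lstm_step(X_t, H_prev, C_prev, p):
    """One convLSTM step; inputs, states and gates are all (channels, M, N)."""
    conv = conv2d_same
    i_t = sigmoid(conv(X_t, p["W_xi"]) + conv(H_prev, p["W_hi"]) + p["W_ci"] * C_prev + p["b_i"])
    f_t = sigmoid(conv(X_t, p["W_xf"]) + conv(H_prev, p["W_hf"]) + p["W_cf"] * C_prev + p["b_f"])
    C_t = f_t * C_prev + i_t * np.tanh(conv(X_t, p["W_xc"]) + conv(H_prev, p["W_hc"]) + p["b_c"])
    o_t = sigmoid(conv(X_t, p["W_xo"]) + conv(H_prev, p["W_ho"]) + p["W_co"] * C_t + p["b_o"])
    H_t = o_t * np.tanh(C_t)
    return H_t, C_t

def init_params(c_in, c_hid, M, N, k, rng):
    """Hypothetical small random initialization, for illustration only."""
    p = {}
    for g in "ifco":
        p[f"W_x{g}"] = 0.1 * rng.standard_normal((c_hid, c_in, k, k))
        p[f"W_h{g}"] = 0.1 * rng.standard_normal((c_hid, c_hid, k, k))
        p[f"b_{g}"] = np.zeros((c_hid, 1, 1))  # one bias per channel, broadcast over the grid
    for g in "ifo":
        p[f"W_c{g}"] = 0.1 * rng.standard_normal((c_hid, M, N))  # element-wise peepholes
    return p

rng = np.random.default_rng(0)
params = init_params(c_in=4, c_hid=8, M=6, N=6, k=3, rng=rng)
H, C = np.zeros((8, 6, 6)), np.zeros((8, 6, 6))
for X_t in rng.standard_normal((5, 4, 6, 6)):  # a toy sequence of 5 frames
    H, C = conv_lstm_step(X_t, H, C, params)
print(H.shape, C.shape)
```

Compared with the FC-LSTM step, only the matrix products are swapped for convolutions; the gating logic and the peephole terms keep the same form, so each state at a grid cell depends only on a local neighborhood of the inputs and previous states.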

! TODO