Feature/design of v2 layer converter #2104
Conversation
Add example of branching topology
doc/design/v2_layer.md
## What are the problems
Paddle V2 API gives a flexible way to configure neural network topology. The user can create a neural network topology layer by layer. We use the final layer to represent the neural network topology.
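To make the quoted description concrete, here is a toy sketch (illustrative names only, not Paddle's actual classes): each layer keeps references to its inputs, so the final layer object implicitly carries the whole plain topology, recoverable by walking back through `inputs`.

```python
class Layer:
    """Toy stand-in for a v2-style layer; `inputs` records the connections."""
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)

def fc(name, input):
    # A fully connected layer is just a node pointing back at its input.
    return Layer(name, [input])

data = Layer("data")
hidden = fc("hidden", data)
cost = fc("cost", hidden)  # the final layer stands in for the whole topology

def collect(layer, seen=None):
    # Walk back through `inputs` to recover every reachable layer.
    seen = [] if seen is None else seen
    if layer not in seen:
        seen.append(layer)
        for i in layer.inputs:
            collect(i, seen)
    return seen

print([l.name for l in collect(cost)])  # ['cost', 'hidden', 'data']
```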
neural network => Neural Network
The user => Users
Done.
doc/design/v2_layer.md
* Memory. We use memory layers in Recurrent Neural Network; a memory layer represents the output of some layer in the last time step. However, the memory layer connects to its input layer implicitly by sharing the same name. We also cannot traverse to memory layer because maybe there is no layer using this memory layer.
* Recurrent Group. The recurrent group is a sub-topology config using in Recurrent Neural Network. It represents the layers in each recurrent time-step. We could traverse back to some topology in a recurrent group, but the sub-topology is non-splittable. The recurrent group should be either entirely in the topology or not in the topology.

## Thinking how to resolve these problems
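The name-sharing problem in the Memory bullet can be shown with the same kind of toy model (illustrative names, not Paddle's real classes): the memory node is linked to another layer only by sharing its name, with no `input` edge in either direction, so a walk back from the final layer can never visit it.

```python
class Layer:
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)

hidden = Layer("state")
mem = Layer("state")          # toy "memory": same name, but no input edge
out = Layer("out", [hidden])  # final layer

def reachable(layer):
    # Iterative back-traversal over explicit `inputs` edges only.
    seen, stack = set(), [layer]
    while stack:
        l = stack.pop()
        if l not in seen:
            seen.add(l)
            stack.extend(l.inputs)
    return seen

# Traversal from `out` finds `hidden`, but the memory node is invisible to it.
print(mem in reachable(out))  # False
```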
Thinking how to resolve these problems => How to resolve this problem
Done.
I am not familiar with how PaddlePaddle represents and runs RNN, so I am not yet ready to judge whether (1) we should introduce "tape" or (2) we should change the way we represent RNN, to handle the problem stated in this PR -- "some layers are not traversable".
Also, I am curious: if RNN layers are not traversable, how could we have those RNN-based examples on book.paddlepaddle.org?
doc/design/v2_layer.md
We use `cost` to represent the entire topology of the neural network. We use the last node to traverse the entire topology by `depth first search` algorithm. The connection between each layer is represented by the `input` parameter. It is fit for representing a plain neural network topology. However, there are some special layers in Paddle, which are not connected explicitly by `input` parameter. They are:

* Evaluator. An evaluator is used to compute metrics(such as error rate, f1 score) in Paddle. An evaluator can not be the input of other layers. So we cannot access evaluators by simply traversing back from the final layer.
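The evaluator problem discussed in this thread can be illustrated with a toy graph (hypothetical names): the evaluator references a layer through its own input, but no layer takes the evaluator as input, so a depth-first walk that starts only from the cost never reaches it.

```python
class Node:
    def __init__(self, name, inputs=()):
        self.name, self.inputs = name, list(inputs)

data = Node("data")
fc = Node("fc", [data])
cost = Node("cost", [fc])
evaluator = Node("error_rate", [fc])  # reads fc, but nothing reads it

def dfs(node, seen=None):
    # Depth-first back-traversal along `inputs` edges.
    seen = set() if seen is None else seen
    if node not in seen:
        seen.add(node)
        for i in node.inputs:
            dfs(i, seen)
    return seen

# Starting from the cost alone misses the evaluator...
assert evaluator not in dfs(cost)
# ...unless the evaluator itself is treated as an extra "final" node.
assert evaluator in dfs(cost) | dfs(evaluator)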
The logic here is broken. From the first statement
An evaluator can not be the input of other layers.
we cannot derive the conclusion
So we cannot access evaluators by simply traversing back from the final layer.
because logically an evaluator could be the "final layer".
Done.
doc/design/v2_layer.md
* Memory. We use memory layers in Recurrent Neural Network; a memory layer represents the output of some layer in the last time step. However, the memory layer connects to its input layer implicitly by sharing the same name. We also cannot traverse to memory layer because maybe there is no layer using this memory layer.
Is the problem here that we shouldn't have a memory layer at all? Or, should the concept of memory not be implemented as a layer?
Memory is a very fundamental concept in Paddle topology, and it is currently just a special layer.
doc/design/v2_layer.md
* Recurrent Group. The recurrent group is a sub-topology config using in Recurrent Neural Network. It represents the layers in each recurrent time-step. We could traverse back to some topology in a recurrent group, but the sub-topology is non-splittable. The recurrent group should be either entirely in the topology or not in the topology.
I just realized that I am not familiar with how PaddlePaddle represents RNN yet. Any suggestions on how I can get familiar with it so I could help review this design doc? @reyoung
I added demo RNN code in this PR; I hope it helps explain the problem.
Force-pushed from 8437fd5 to ea3581d.
@@ -0,0 +1,151 @@
# Using tape to refactor current paddle.v2 configuration parsing
Using tape to refactor current paddle.v2 configuration parsing
=> Using Tape to Refactor Current Paddle.V2 Configuration Parsing
http://www.titlecase.com is a useful tool; thanks to @helinwang for the recommendation.
Some understanding about Tape.
```python
with Tape():
    paddle.train(topology(False))
```
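One way to read the snippet above: a Tape could be a context manager that records every layer/operation in the order the user's config code runs, so the framework can replay the recording instead of traversing back from a final layer. A minimal sketch under that assumption (none of these names are Paddle's real API):

```python
_current_tape = None

class Tape:
    """Toy tape: records operations in user-definition order."""
    def __init__(self):
        self.entries = []
    def __enter__(self):
        global _current_tape
        _current_tape = self
        return self
    def __exit__(self, *exc):
        global _current_tape
        _current_tape = None

def record(op_name):
    # Every layer constructor would call this; order == definition order.
    if _current_tape is not None:
        _current_tape.entries.append(op_name)

with Tape() as t:
    record("data")
    record("fc")
    record("cost")

print(t.entries)  # ['data', 'fc', 'cost'] -- definition order, no traversal
```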
After reading this, here are some of my thoughts; I hope to keep discussing with @reyoung.

My understanding
- From this design doc I can understand the core idea of V2's config parsing and the problems that derive from it. Introducing Tape makes the current core problems easier to solve.
- But I cannot yet judge whether introducing the Tape concept is strictly necessary.

Why introduce the Tape concept
- My understanding is that, within V2's current BFS approach, the three problems raised in this doc could also be solved, but each would be patched with ad-hoc manual rules; without the right design, such patching may never end.
- Introducing Tape can solve three problems:
  - Error messages contain huge call stacks; the real error is buried, so once something goes wrong it is very hard to locate.
  - During BFS there are currently three kinds of layers with special behavior (logically they are all just layers in the neural network): (1) evaluator; (2) memory; (3) recurrent_layer_group. Their handling requires manually added rules.
  - Special layers not yet discovered, or introduced in the future, may each need dedicated handling logic, which makes the code hard to maintain.

Advantages of Tape
I can see that the advantages of Tape include:
- Tape processes the network topology in the same order as the intuitive logic of defining a neural network (i.e., the order in which the user defines the network in the config). So when something goes wrong, a normal error can be reported instead of today's "abnormal errors" (the user knows nothing about the BFS logic, yet all current errors are traversal errors), and the error messages are easy to understand.
- There is no need to consider complex traversal logic, which simplifies the handling of recurrent_layer_group.

Solving the three problems within the BFS approach
- Evaluator
  - A neural network is a directed acyclic graph. Evaluator and Cost occupy equivalent positions in the network and only appear at its end. Following the breadth-first-search logic, Evaluators should be creatable; it is just that, without explicitly distinguishing them from Costs, we may not know the network's optimization criterion.
  - Why can't we keep using Cost (possibly multiple costs) to represent the network topology, and explicitly designate Evaluators to distinguish them from Costs?
- About Memory
  - Memory is a special layer and an important concept in recurrent neural networks; it has special functionality and cannot be removed. Memory corresponds to special layers (the various Agent Layers).
  - Memory has no inputs; it does not take other layers' outputs as input.
  - Memory's inputs differ from other layers': they are not obtained through input, but through:
    - boot_layer, which specifies the input at time step 0 (already handled)
    - name, which specifies the input from time step 1 onward (not considered during traversal; currently there are bugs where some layers cannot be created)
  - We could directly add handling rules to fix error 2 above.
- About recurrent_layer_group
  - recurrent_layer_group is a submodel in PaddlePaddle and cannot be split; this must be taken into account. I am not entirely sure whether v2 currently considers it.
  - recurrent_layer_group can be nested twice in PaddlePaddle (the step function of a recurrent_layer_group can itself be a recurrent_layer_group); v2 currently does not implement this nesting. The logic would surely be convoluted, though it could certainly be written; I want to look further at v2's current handling.

My two questions
There are two questions I have not fully understood:
- Does the Tape concept need to be exposed to users? If so, which logical interfaces are exposed to them? Or is Tape a concept users will never perceive?
- With Tape, do we still need the BFS process? Is Tape meant to assist BFS, or to abandon BFS entirely?
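The boot_layer/name distinction in the Memory bullets above can be sketched as a simple lookup rule (hypothetical helper, not real Paddle code): at step 0 the memory reads its boot layer's output; from step 1 on it reads the previous-step output of the layer that shares its name.

```python
def memory_input(step, boot_value, prev_outputs, name):
    """Resolve what a memory reads at a given time step.

    boot_value:   value fed at step 0 (the boot_layer's output)
    prev_outputs: {layer_name: output} from the previous time step
    """
    if step == 0:
        return boot_value
    return prev_outputs[name]  # implicit link: same name, previous step

# Toy run: a "state" memory over three time steps.
outputs = {}
history = []
for t in range(3):
    m = memory_input(t, 0.0, outputs, "state")
    outputs = {"state": m + 1}  # the same-named layer's new output
    history.append(m)

print(history)  # [0.0, 1.0, 2.0]
```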
> Evaluator and Cost occupy equivalent positions in the network

I don't think this is right; an Evaluator can actually appear at any position in the neural network.

> name, which specifies the input from time step 1 onward (not considered during traversal; currently there are bugs where some layers cannot be created)

If traversal never reaches the layer sharing the Memory's name, there is no other way to reach that layer: the original implementation records no extra information anywhere, so we can only rely on traversing back from the output layer.

> Does the Tape concept need to be exposed to users? If so, which logical interfaces are exposed to them? Or is Tape a concept users will never perceive?

To support Paddle's existing functionality, Tape does not need to be exposed to ordinary users.
However, to implement dynamic neural networks, users would be allowed to clear or reset the tape. The API could be:

with Tape():
    ...

> With Tape, do we still need the BFS process? Is Tape meant to assist BFS, or to abandon BFS entirely?

With Tape, no DFS search is needed.
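The reply above suggests that dynamic neural networks would let the user clear or reset the tape. A toy sketch of that semantics (assumed behavior; `Tape` here is a stand-in, not an implemented Paddle class), where re-entering the context discards the previous recording:

```python
class Tape:
    """Toy tape whose recording is cleared each time the context is entered."""
    def __init__(self):
        self.entries = []
    def __enter__(self):
        self.entries.clear()  # reset: the old recording is discarded
        return self
    def __exit__(self, *exc):
        return False

tape = Tape()
with tape as t:
    t.entries.append("step-1 graph")
with tape as t:  # dynamic net: rebuild the graph on each iteration
    t.entries.append("step-2 graph")

print(tape.entries)  # ['step-2 graph'] -- only the latest recording survives
```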
> I don't think this is right; an Evaluator can actually appear at any position in the neural network.

- My understanding is this: in current PaddlePaddle, an Evaluator, like a Cost, never becomes the input of a following Layer/Evaluator. Both Evaluator and Cost are nodes of the directed acyclic graph with only incoming edges and no outgoing ones (out-degree 0), but an Evaluator does not take part in optimization and has to be designated explicitly.
- One point I do not quite understand: "use one variable (currently it appears to be a cost) to represent the Topology". Why emphasize a single variable?
  - If a network has multiple Costs, how is the topology represented at the moment?
  - A network may have multiple Costs, and Evaluators currently also connect to other layers through inputs, with no exception.
  - In my understanding it is more reasonable to define the topology by the last layers of the directed acyclic graph (including multiple Costs and Evaluators); why is it not done that way?
  - Potentially, if a Cost can be followed by another Cost (e.g. a sum), and an Evaluator by another Evaluator, that would break the assumption that Cost/Evaluator is the last layer.
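The out-degree argument above can be checked mechanically on a toy DAG (illustrative names only): costs and evaluators are all sinks with out-degree 0, so graph structure alone cannot say which sinks drive optimization; an explicit marker is needed.

```python
edges = [                  # (from, to) edges of a toy topology
    ("data", "fc"),
    ("fc", "cost_a"),
    ("fc", "cost_b"),      # multiple costs are possible
    ("fc", "error_rate"),  # evaluator: also a sink
]

out_degree = {}
for src, dst in edges:
    out_degree[src] = out_degree.get(src, 0) + 1
    out_degree.setdefault(dst, 0)

sinks = sorted(n for n, d in out_degree.items() if d == 0)
print(sinks)  # ['cost_a', 'cost_b', 'error_rate'] -- structure can't split them

# Only an explicit marker distinguishes optimization targets from metrics.
is_cost = {"cost_a": True, "cost_b": True, "error_rate": False}
targets = [n for n in sinks if is_cost[n]]
print(targets)  # ['cost_a', 'cost_b']
```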
Is anyone writing the code for this now?
Thank you for contributing code to PaddlePaddle. Since Paddle V1/V2 is no longer maintained and the related code has been removed from the develop branch, we are closing this PR. You are welcome to contribute to the latest version of Paddle, Fluid.
Redesign of the v2 layer converter.
This design will fix the following defects:
The demo code is in #2096; it may be easier to review there.