scope design doc #2548

jacquesqiao · 2017-06-21T10:26:17Z

@reyoung @kuke and @jacquesqiao will coedit this doc on the design of Scope

Markdown link is here

wangkuiyi · 2017-06-21T14:41:29Z

doc/design/scope.md

+
+private:
+  /// variable name -> variable
+  std::unordered_map<std::string, VariablePtr> vars_;


Why unordered_map rather than map? I am not sure which is better -- I'd always believed the former is a hash map and the latter is a red-black tree.

Anyway, I noticed that Caffe2 is using map in its Workspace:

private: BlobMap blob_map_;

where BlobMap is defined as a CaffeMap here:

typedef CaffeMap<string, unique_ptr<Blob> > BlobMap;

and CaffeMap is defined here

using CaffeMap = std::map<Key, Value>; // using CaffeMap = std::unordered_map;

Yangqing has mentioned why he prefers map over unordered_map here:

// Note(Yangqing): NVCC does not play well with unordered_map on some platforms,
// forcing us to use std::map instead of unordered_map.

wangkuiyi · 2017-06-21T14:42:50Z

doc/design/scope.md

+using VariablePtr = std::shared_ptr<Variable>;
+
+class Scope final {
+public:


Just a reminder -- I've put .clang-format files that strictly follow Google style in the new source directories paddle/platform and paddle/framework. So this line would trigger a clang-format error, because Google style requires a single space before public:.

wangkuiyi · 2017-06-21T14:47:20Z

doc/design/scope.md

+
+```cpp
+class Variable;
+using VariablePtr = std::shared_ptr<Variable>;


I don't think we need this using alias because in a different scenario we might use different smart pointers. I'd suggest that we write everything in the definition of vars_ like:

using string = std::string;  // So we can switch to other implementation of strings later.
std::map<string, unique_ptr<Variable> > vars_;

Maybe we should have a global header called TypeDefs.h. In this header, we specify all default types for Map, String, Set, etc. Like

namespace paddle {
template <typename KEY, typename VAL>
using Map = std::unordered_map<KEY, VAL>;
using String = std::string;
template <typename T>
using Set = std::unordered_set<T>;
}

Let us do such kind of summarization later -- after we make sure that some types are common to various situations.

wangkuiyi · 2017-06-21T14:47:31Z

doc/design/scope.md

+
+```cpp
+class Variable;
+using VariablePtr = std::shared_ptr<Variable>;


Also, Caffe2 is using unique_ptr in its Workspace definition:

typedef CaffeMap<string, unique_ptr<Blob> > BlobMap;

Why do we use shared_ptr here?

To my understanding, unique_ptr is the right choice here because it cannot be copied and transfers ownership of the pointed-to object when moved, whereas shared_ptr increments the reference count when copied.

When using unique_ptr and returning a raw pointer, users must not hold this pointer in a private member or any global variable, because once the scope is destroyed, all pointers obtained from it earlier become invalid.

I think using shared_ptr inside a Scope but returning a weak_ptr from GetVariable can resolve this issue. But that code is a little bit complicated.

Maybe we should have an agreement: we cannot store any variable pointer in our code.

Let us not save pointers obtained from a unique_ptr as class data members. I don't think gemm needs to hold such a pointer when we write an operator.
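For reference, the shared_ptr/weak_ptr alternative mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: `Variable` is a stand-in struct, and `NewVar` and `HandleExpiresWithScope` are hypothetical names, not the API from the design doc. The point is that a weak_ptr handle reports that it has expired once the scope is gone, instead of dangling like a stored raw pointer would.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the real Variable class.
struct Variable {
  int value = 0;
};

// Sketch: the scope owns variables via shared_ptr and hands out only weak_ptr.
class Scope {
 public:
  // Creates the variable if needed and returns a non-owning handle to it.
  std::weak_ptr<Variable> NewVar(const std::string& name) {
    std::shared_ptr<Variable>& slot = vars_[name];
    if (!slot) slot = std::make_shared<Variable>();
    return slot;  // implicit shared_ptr -> weak_ptr conversion
  }

 private:
  std::unordered_map<std::string, std::shared_ptr<Variable>> vars_;
};

// Returns true if a handle obtained from a scope expires together with it.
bool HandleExpiresWithScope() {
  std::weak_ptr<Variable> handle;
  {
    Scope scope;
    handle = scope.NewVar("X");
    if (auto var = handle.lock()) var->value = 42;  // safe while scope lives
    assert(!handle.expired());
  }
  // The scope is destroyed here, so lock() yields nullptr instead of dangling.
  return handle.expired() && handle.lock() == nullptr;
}
```

The cost is the extra `lock()` at every access, which is the complication the thread refers to; the unique_ptr plus "do not store the pointer" convention avoids it.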

wangkuiyi · 2017-06-21T14:51:11Z

doc/design/scope.md

+
+```cpp
+Scope global;
+auto x = newVar("X");  // x is created in scope global, implicitly.


Just a reminder -- the following functions/methods do not follow Google naming style:

newVar ==> NewVar
addOp ==> AddOp
run ==> Run

wangkuiyi · 2017-06-21T14:53:14Z

doc/design/scope.md

+};
+```
+
+You need to specify a scope to run a Net. One net can run in different scopes and update different variable in the scope. If you did not specify one, It will run in a default scope.


Agreed!

I second removing the requirement that a net belongs to a workspace and am happy to see that our Scope doesn't have a line like the one in caffe2::Workspace:

class Workspace {
  ...
  NetMap net_map_;

wangkuiyi · 2017-06-21T14:58:03Z

doc/design/scope.md

+
+  //! Get Variable in this scope.
+  //! @return nullptr if no such variable.
+  const VariablePtr& getVar(const std::string& name) const;


Here the return type must be Variable* if we use unique_ptr rather than shared_ptr in vars_.

Also, following Google code style, this method should be named GetVar.

wangkuiyi · 2017-06-21T14:59:50Z

doc/design/scope.md

+  const VariablePtr& getVar(const std::string& name) const;
+
+  //! Create or get a variable in this scope.
+  VariablePtr& createOrGetVar(const std::string& name);


Maybe rename it to GetOrCreateVar? After all, the primary purpose of this method is to "get" rather than to "create" -- it creates only if it cannot get.
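A minimal sketch of the two accessors as suggested in these review comments, assuming unique_ptr storage and raw-pointer returns. `Variable` is a stand-in struct and the bodies are illustrative, not the code that was merged.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the real Variable class.
struct Variable {
  int value = 0;
};

class Scope {
 public:
  // Returns nullptr if no such variable exists in this scope.
  Variable* GetVar(const std::string& name) const {
    auto it = vars_.find(name);
    return it == vars_.end() ? nullptr : it->second.get();
  }

  // Primarily a "get": a variable is created only when the lookup fails.
  Variable* GetOrCreateVar(const std::string& name) {
    if (Variable* var = GetVar(name)) return var;
    auto inserted =
        vars_.emplace(name, std::unique_ptr<Variable>(new Variable));
    return inserted.first->second.get();
  }

 private:
  // The scope is the sole owner; callers receive non-owning raw pointers.
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};
```

Calling `GetOrCreateVar("X")` twice returns the same pointer, which is why "get" reads as the primary verb.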

jacquesqiao · 2017-06-22T09:52:25Z

doc/design/scope.md

-1. Scope should destruct all Variables within it when itself is destructed.
-
-    Because Variable can only be got from Scope, when destroying Scope, we also need to destroy all the Vars in it.
+   Variable can not belong to many scopes. If you want to use variables from parent scope, you can use `parent scope`.


Would "multi" be better than "many"?

Maybe "multiple"?

QiJune · 2017-06-22T10:18:00Z

doc/design/scope.md

+
+1.  We can create local variables in a local scope. When that local scope are destroyed, all local variables should also be destroyed.
+2.  Variables in a parent scope can be retrieved from local scopes of that parent scope, i.e., when user get a variable from a scope, it will try to search this variable in current scope. If there is no such variable in the local scope, `scope` will keep searching from its parent, until the variable is found or there is no parent.
+


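In programming languages, scopes can be nested to multiple levels. Can scopes be nested here as well? For example, if a variable is not found in the local scope, look in the parent, then in the parent's parent, and so on until it is found.

I reached the end of the document and saw that nesting is supported.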

QiJune · 2017-06-22T10:22:56Z

doc/design/scope.md

+class Scope {
+ public:
+  Scope(const std::shared_ptr<Scope>& scope): parent_(scope) {}
+


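The parameter here is passed as a reference to a pointer. Does that mean the parent_ pointer may change later? In other words, can a Scope's parent scope be modified by the Scope itself?

Arguments of complex types should be passed by const reference. parent_ then invokes the shared_ptr copy constructor and keeps its own copy of the pointer to the scope.

If this were changed to take std::shared_ptr<Scope> by value, a temporary would be created when the argument is passed. The cost of shared_ptr is that every time a copy is created, a global mutex has to be taken for that variable and the counter incremented. So the overhead is not especially small.

A Scope cannot modify its own parent scope.

Let us not overuse smart pointers. I believe the following would be enough for this case.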

explicit Scope(const Scope& parent) : parent_(parent) {}

FYI, Caffe2 has the following:

explicit Workspace(Workspace* const shared) : shared_(shared) {}

QiJune · 2017-06-22T10:24:54Z

doc/design/scope.md

+
+ private:
+  std::shared_ptr<Scope> parent_;
+  std::unordered_map<std::string, std::unique_ptr<Attribute>> vars_;


This looks like a typo: an Attribute was placed here.

QiJune · 2017-06-22T10:26:36Z

doc/design/scope.md

+    if (var != nullptr) {
+      return var;
+    } else if (parent_ != nullptr) {
+      return parent_->Get(name);


Since parent is also a Scope and the Get method is called on it here, Scope would need to define a Get interface, right? Or is this a typo, and it should be return parent_->GetVariable(name)?

hedaoyuan · 2017-06-22T11:23:24Z

doc/design/scope.md

+```
+## Only scope can create a variable
+
+To ensure `only scope can create a variable`, we should mark `Variable`'s constructor as a private member function, and Scope is a friend class of Variable. And then only `CreateVariable` can construct `Variable`.


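If I have a variable_test.cc, must this test file include scope.h?

I am wondering whether we want the code itself to strictly prevent users from creating a Variable anywhere else, or whether this is only a convention.

It seems more sound to enforce it in code, so that creating a Variable elsewhere produces a compile error.

In that case, variable_test.cc would have to include scope.h in order to write unit tests.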

I second @hedaoyuan . I don't think we need the restriction that Variables can only be created by Scope.
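To make the trade-off concrete, here is a minimal sketch of the private-constructor-plus-friend restriction quoted from the doc. The member names and the `static_assert` are illustrative assumptions rather than the merged code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <unordered_map>

class Scope;

class Variable {
 public:
  int value = 0;

 private:
  // User-provided and private, so code outside Scope cannot construct one.
  Variable() {}
  friend class Scope;  // only Scope may call the constructor
};

class Scope {
 public:
  // Creates the variable on first use and returns a non-owning pointer.
  Variable* CreateVariable(const std::string& name) {
    std::unique_ptr<Variable>& slot = vars_[name];
    // Plain `new` is required: std::make_unique is not a friend of Variable.
    if (!slot) slot.reset(new Variable);
    return slot.get();
  }

 private:
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};

// Compile-time check that Variable cannot be constructed from outside Scope.
static_assert(!std::is_default_constructible<Variable>::value,
              "Variable must only be created through Scope");
```

With this layout a `variable_test.cc` cannot write `Variable v;` and has to go through `Scope::CreateVariable`, which is exactly the coupling questioned above.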

hedaoyuan

I remember that this afternoon we discussed the third point in this comment (#2545 (comment)).
The grad of the value is recorded by the Scope, but I do not see that in the design.

reyoung · 2017-06-22T14:12:26Z

> I remember that this afternoon we discussed the third point in this comment (#2545 (comment)).
> The grad of the value is recorded by the Scope, but I do not see that in the design.

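How to find a parameter's gradient from its name is not part of the Scope design. Scope only stores the mapping from names to variables.

As things stand, there are two ways to find a parameter's gradient from its name:

Method 1: Parameter names and gradient names follow a convention. For example, every parameter is named xxx_param and every gradient xxx_param_grad, so the two can be matched to each other.
Method 2: Store a separate mapping table somewhere else, whose keys are parameter names and whose values are gradient names.

However, which of the two to choose, and how to implement it, should be settled in a separate issue and PR.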
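The two methods listed above can be sketched as follows. This is only an illustration: `GradNameByConvention` and `GradNameTable` are hypothetical names, and the `_grad` suffix is taken from the example in the comment, not from any decided convention.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Method 1: derive the gradient name from the parameter name by convention.
std::string GradNameByConvention(const std::string& param_name) {
  return param_name + "_grad";
}

// Method 2: an explicit parameter-name -> gradient-name table kept
// outside of Scope.
class GradNameTable {
 public:
  void Register(const std::string& param, const std::string& grad) {
    table_[param] = grad;
  }

  // Returns an empty string when the parameter has no registered gradient.
  std::string Lookup(const std::string& param) const {
    auto it = table_.find(param);
    return it == table_.end() ? std::string() : it->second;
  }

 private:
  std::unordered_map<std::string, std::string> table_;
};
```

Either way, the resulting gradient name is then looked up in the Scope like any other variable name, so Scope itself stays a plain name-to-variable map.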

jacquesqiao · 2017-06-22T14:13:04Z

One question:
Must a Variable really be created only by a Scope? Might Op.Run() create some temporary Variables internally, and why?

reyoung · 2017-06-22T14:16:56Z

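> An Op may create some temporary Variables internally

Ops are stateless; temporary variables must not be stored in an Op.
- The Op's run function is const, so it cannot modify the class's private members anyway.
If Ops were stateful, a neural network could not switch Scopes at run time. If an Op allocated a temporary variable on GPU 1 and we then switched to a Scope on GPU 2, it simply could not run.
If each call to run needs temporary local storage for its computation, it should create a Tensor rather than a Variable. A Variable is an input or output passed between Ops, whereas a Tensor can be created and destroyed at any time.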
```
 void run(...) const {
    Tensor<Place> tmp;
 }
```
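If an Op needs certain variables to act as its state, such as MovingMeans and MovingStd in BatchNorm, those should also be Variables stored in the Scope.

Therefore a Variable can only be created by a Scope.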

wangkuiyi · 2017-06-23T02:37:44Z

I don't think Operators are "stateless". An RNNOp would even create a sub-scope as it runs.

Actually, I used to think that Net and Scope can be decoupled, but the above seems to be a counterexample. What do you think about this?

reyoung · 2017-06-23T02:45:11Z

> I don't think Operators are "stateless". An RNNOp would even create a sub-scope as it runs.

An RNN Op will indeed create its sub-scopes. But it should return the sub-scopes as the Operator's output. The sub-scopes should not be stored locally in the operator class.

In general, if an Op holds GPU memory in a private data member, we cannot switch Scopes when running this Op.

If an Op needs some temporary memory while computing, we can just use Tensor<GPUPlace> to create a temporary tensor. Since we will write an efficient memory allocation module, creating a temporary tensor while running should be fast.

If an Op needs some temporary memory that is shared among multiple Ops, it should be a variable in a scope by our definition.
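A toy sketch of what "stateless" buys here: the op stores only variable names and a constant, so the same op object can run against different scopes. Everything in it is a hypothetical simplification (a `Scope` that maps names to plain floats, an op called `ScaleOp`), not the real Operator interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical minimal scope: name -> float stands in for name -> Variable.
using Scope = std::map<std::string, float>;

// Holds only variable names and a constant; all data lives in the scope
// that is passed to Run.
class ScaleOp {
 public:
  ScaleOp(std::string in, std::string out, float factor)
      : in_(std::move(in)), out_(std::move(out)), factor_(factor) {}

  // const: running the op cannot change the op itself.
  void Run(Scope* scope) const {
    // The temporary lives only for this call. at() throws if the input
    // variable is missing from the scope.
    float tmp = scope->at(in_) * factor_;
    (*scope)[out_] = tmp;
  }

 private:
  std::string in_;
  std::string out_;
  float factor_;
};
```

Running one `ScaleOp("x", "y", 2.0f)` on two scopes leaves each scope with its own `y`, and the op itself is unchanged.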

QiJune · 2017-06-23T02:49:59Z

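In Caffe2, besides the BlobMap in the workspace that holds the information about all blobs, the Python side also provides a ModelHelper class. It abstracts the concept of a Model and is used to index all parameter-related blobs. DyNet likewise has a Model abstraction that indexes all parameters.
Of course, we could have only Scope and let users write the parameter-update logic of the optimizer themselves, updating the corresponding parameters. But that may put a heavy burden on users, so I suggest introducing a Model abstraction.

There are then two possible approaches:

1. If the C++ side is to be able to configure a network and train it on its own, then I think we need a Model concept in addition to Scope;
2. The C++ side keeps only the Scope concept, and the model is built on the Python side (which ties us tightly to Python).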

reyoung · 2017-06-23T02:58:00Z

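> If the C++ side is to be able to configure a network and train it on its own, then I think we need a Model concept in addition to Scope;

Let's implement this on the C++ side. It is better not to be tightly bound to Python.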

hedaoyuan · 2017-06-23T03:01:32Z

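> Ops are stateless; temporary variables must not be stored in an Op.
> If Ops were stateful, a neural network could not switch Scopes at run time. If an Op allocated a temporary variable on GPU 1 and we then switched to a Scope on GPU 2, it simply could not run.

I also do not think an Op has to be stateless. Besides, whether an Op is stateless has nothing to do with whether it has temporary variables. A convolution Op having temporary variables does not get in the way of anything.
Also, I recall that in our discussion an Op takes a Device template parameter. Are we sure a Scope can be switched from GPU 1 to GPU 2?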

reyoung · 2017-06-23T05:19:09Z

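> Whether an Op is stateless has nothing to do with whether it has temporary variables

From an offline discussion: Daoyuan is concerned that allocating a temporary with Tensor<GPU> tmp; on every call to Run is time-consuming, especially when the variable is large.

We certainly expect an efficient alloc algorithm to make allocating and freeing temporaries cheap. But if allocating and freeing Tensor<GPU> tmp turns out to be expensive, we can add the temporary variable to the Op's outputs. That way this Variable is also created in the Scope and does not go away.

That is: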

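#include <string>
#include <vector>

class OpBase {
 protected:
  // Names of the output Variables; protected so that derived Ops can add to it.
  std::vector<std::string> outputs_;
};

class SomeOpUseHugeMemory : public OpBase {
 public:
  SomeOpUseHugeMemory() {
    // Declaring the temporary as an output makes the Scope create and keep it.
    outputs_.push_back("HugeTempVariableName");
  }
};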

jacquesqiao · 2017-06-23T05:41:13Z

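> If the C++ side is to be able to configure a network and train it on its own, then I think we need a Model concept in addition to Scope;

I looked at ModelHelper. It appears to be a wrapper around the net and parameters on the C++ side, where the net is core.Net() and a parameter is a BlobReference. With the current design, we will probably need a similar layer of abstraction when implementing the Python part.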

hedaoyuan · 2017-06-23T06:02:48Z

doc/design/scope.md

+Just like [scope](https://en.wikipedia.org/wiki/Scope_(computer_science)) in programming languages, `Scope` in the neural network can also be a local scope. There are two attributes about local scope.
+
+1.  We can create local variables in a local scope. When that local scope are destroyed, all local variables should also be destroyed.
+2.  Variables in a parent scope can be retrieved from local scopes of that parent scope, i.e., when user get a variable from a scope, it will try to search this variable in current scope. If there is no such variable in the local scope, `scope` will keep searching from its parent, until the variable is found or there is no parent.


Looks like a local scope can have multiple parent scopes.
Suppose a local scope Ls has two parent scopes PsA and PsB, where PsA has a variable called a and PsB has a variable called b.
If Ls wants to use both PsA.a and PsB.b, how can it do that?

Currently, a user cannot access both PsA.a and PsB.b from one local scope.

The scope is a linked list. It looks up the local variable first, and a local variable hides parent variables of the same name.

Is that just for now, or never? Will it be considered later?

I think this situation is not very useful right now.
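The linked-list lookup described in this thread can be sketched as below, assuming a single non-owning parent pointer along the lines of the `explicit Scope(...)` suggestions earlier in the review. `Variable` is a stand-in struct and the method bodies are illustrative.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the real Variable class.
struct Variable {
  int value = 0;
};

class Scope {
 public:
  Scope() = default;
  // Non-owning parent pointer: the parent must outlive the child scope.
  explicit Scope(const Scope* parent) : parent_(parent) {}

  // Creates the variable in this scope only, never in a parent.
  Variable* CreateVariable(const std::string& name) {
    std::unique_ptr<Variable>& slot = vars_[name];
    if (!slot) slot.reset(new Variable);
    return slot.get();
  }

  // Searches this scope first, then walks up the single parent chain.
  // Returns nullptr when no scope in the chain has the variable.
  Variable* GetVariable(const std::string& name) const {
    for (const Scope* s = this; s != nullptr; s = s->parent_) {
      auto it = s->vars_.find(name);
      if (it != s->vars_.end()) return it->second.get();
    }
    return nullptr;
  }

 private:
  const Scope* parent_ = nullptr;
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};
```

Because each scope has at most one parent, a local variable `a` shadows the parent's `a`, and there is no way to reach variables of two unrelated parents from one local scope, matching the answer given above.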

The original discussion is in:
* PaddlePaddle#2548 (comment)
* PaddlePaddle#2579 (comment)

This commit is just a proposal; let's do this kind of summarization in this PR.

scope design doc

bcac91a

jacquesqiao assigned reyoung, jacquesqiao and kuke Jun 21, 2017

reyoung added 2 commits June 21, 2017 21:12

Rearrange docs

002a6c9

Update code

674b1d3

wangkuiyi reviewed Jun 21, 2017

wangkuiyi mentioned this pull request Jun 21, 2017

add net_design doc #2547

Closed

fix code style

7a48507

jacquesqiao mentioned this pull request Jun 22, 2017

Scope Design Documentation #2546

Closed

reyoung and others added 19 commits June 22, 2017 14:21

Add scope doc

3e09978

Merge branch 'scope' of https://github.com/jacquesqiao/Paddle into feature/scope_design

acf0b75

Add Scope Parent & Local section

04ad9b6

Parent & local scope done

581e4c1

Refining english

0b70361

Refine English

76e2a3c

some properties of scope

8282138

Merge branch 'scope' of https://github.com/jacquesqiao/Paddle into scope

4413726

Update API

d7aca77

Add interfaces

2d5507f

Merge branch 'scope' of https://github.com/jacquesqiao/Paddle into feature/scope_design

64a1cdf

add overview for scope design doc

73b1c5b

Update interface

1f0056b

Merge branch 'scope' of https://github.com/jacquesqiao/Paddle into scope

0b07583

Update key attributes

17eed33

some detailed explaination of the Scope properties

37fd48b

refine style of markdown

c3a4b8b

Use unique_ptr instead of shared_ptr/weak_ptr. But user can not hold this pointers.

db96c0e

fix a mistake share by nets -> share by scopes

f104ce2

reyoung added 6 commits June 22, 2017 17:20

To google code style

eab0e52

Change typo

63a56b4

Remove delete

921fa13

Typo

5d88249

Rearrange description.

f8a209c

Change title

c5ad89a

jacquesqiao commented Jun 22, 2017

Fix markdown

237efc2

QiJune reviewed Jun 22, 2017

hedaoyuan reviewed Jun 22, 2017

Typo

3bac2d0

reyoung mentioned this pull request Jun 23, 2017

move net_design to framework #2553

Closed

hedaoyuan reviewed Jun 23, 2017

reyoung mentioned this pull request Jun 23, 2017

Propose a typedefs header for paddle framework #2584

Closed

wangkuiyi approved these changes Jun 27, 2017

jacquesqiao merged commit 718eff9 into PaddlePaddle:develop Jun 27, 2017

jacquesqiao added this to Doing in PaddlePaddle Refactoring: Phase 1 Jun 27, 2017

jacquesqiao moved this from Doing to Done in PaddlePaddle Refactoring: Phase 1 Jun 27, 2017

