Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move net_design to framework #2553

Closed
wants to merge 22 commits into from
Closed

Conversation

Superjomn
Copy link
Contributor

@Superjomn Superjomn commented Jun 22, 2017

I fixed all the comments in the previous PR

// The minimum a network should be implemented.
class NetworkBase {
public:
BaseNetwork(const NetDef &def, Scope *scope);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2547 (comment)
If Network won`t hold scope, why we feed a scope to constructor?

A simple implemention is as followed:

```c++
class Network final : public BaseNetwork {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you fogot modifying 'BaseNetwork'.

@wangkuiyi
Copy link
Collaborator

Please consider the following use case while we go on improving the design:

Workspace w1(GPUPlace(0));
Workspace w2(CPUPlace());

w1.CreateVariables(net_desc);
w1.InitVariables(net_desc);
w1.GetVar("weights")->FixValue();

w2.CreateVariables(net_desc);
w2.InitVariables(net_desc);
w2.GetVar("hello")->SetValue(...);

auto net = CreateNet(net_desc);

net.Run(&w1);

net.Run(&w2);


// run all the operators and return success(true) or not, all the
// variables are located in `scope`.
virtual bool Run(Scope* scope) = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a discussion in here. Should our Operator be stateless? If our Operator is stateless, then NetBase::Run() should be const.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

A method of factory pattern can be defined like

```c++
std::unique<NetworkBase* CreateNet(const NetDef& def) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lost a '>'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


protected:
// Create operators accordding to `def`.
bool CreateNet(const NetDef &def);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What`t the difference ScratchNet::CreateNet and CreateNet in line#43. Or shall we rename ScratchNet::CreateNet to a better name, such as: CreateOps ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure ScratchNet.CreateNet just will create operators, maybe some other things should do like some network-level initializations.

Maybe we should rename it to InitNet ? @wanghaoshuang

@Superjomn Superjomn requested a review from wangkuiyi June 23, 2017 05:44
class NetworkBase {
public:
// `def` is a proto message that describe the structure of a network.
NetworkBase(const NetDef& def);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe addOperator method is very useful, especially when we generating backward ops and generating optimization ops.

Also, the constructor of NetworkBase should invoke addOperator as well.

class NetworkBase {
 public:
  Error AddBackwardOps(const std::string& lossVariableName, float lossGrad = 1.0f) {
    for (eachOp reversely) {
      for (gradOp in GetGradientOps(eachOp)) {
        addOperator(gradOp)
      }
    }
  }

  Error AddOptimizerOps(const Map<string, string> paramToGradMap) {
    for (paramName, gradName in paramToGradMap) {
      addOperator(SGD(input={gradName, paramName}, output={paramName}));
    }
  }
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to unify all the AddXXOps to AddOp(const OpDef &def) in base class ?

Or move the above interfaces into ScratchNet? @reyoung

class NetworkBase {
public:
// `def` is a proto message that describe the structure of a network.
NetworkBase(const NetDef& def);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, If NetworkBase is a base class, the constructor may not include NetDef protobuf. Because it is not used in base class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


protected:
// keys of the input variables feed into the network.
std::vector<string> inputs_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a network needs inputs_ & outputs_ field?

public:
// `def` is a proto message that describe the structure of a network.
NetworkBase(const NetDef& def);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think addOperator should return a Index. Maybe code like this,

class NetworkBase {
public:
  using OpIndex = size_t;

  OpIndex addOperator(const OpDesc& op);
};

So that Run method can take a begin and end argument like:

bool Run(Scope* scope, OpIndex begin = -1UL, OpIndex end = -1UL);

It will be very helpful if a network can run partially.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reyoung Network maybe run from multiple start-operators to multiple end-operators?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we use the OpName instead of the OpIndex?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @wanghaoshuang. It seems that operators within [begin_index, end_index] cannot define arbitrary sub-DAG.


// run all the operators and return success(true) or not, all the
// variables are located in `scope`.
virtual bool Run(Scope* scope) = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Run can return paddle::Error.

A very basic implemention is as followed, all it does is simply to run every operators in sequence.

```c++
class ScratchNet final : public NetworkBase {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand why the basic implementation is called Scratch

Copy link
Contributor Author

@Superjomn Superjomn Jun 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build a network from scratch?

Or name it "BasicNet"? @reyoung

Or SimpleNet as same as Caffe2?


## A Simple Network Implemention

A very basic implemention is as followed, all it does is simply to run every operators in sequence.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implemention -> implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. as followed --> as follows.
  2. ", all" --> "All"

// Add a operator which is identified as `type` and has attributes described
// in `attrs`, the `inputs` are the keys of readonly input variables,
// `outputs` are keys of mutable output variables.
bool AddOp(const std::string &type, const std::vector<string> &inputs,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe addOp should be an interface of NetworkBase. Because there may be some instance will manipulate NetworkBase in CPP, such as adding gradient ops, adding optimization ops, etc.

If we want user passing a protobuf message contains backward ops, we need to write AddGradientOps in Python. It will let Paddle coupled with Python language, which we did not want to.


private:
// the operators owned by `Network`.
std::vector<std::unique_ptr<Operator>> ops_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just std::vector<Operator> ops_ is enough? I am not sure. But it seems in interface, we are not using ops_ as a pointer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


```c++
// create an empty scope located on CPU device.
Scope scope(CPUPlace());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think scope do not take Place as an argument. A scope is just an association from name to variable. Each tensor in variables could have different Place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sample code
As described in the sample code, Scope has a place, maybe more discussion here? @wangkuiyi @reyoung

Benefit from the decoupling of `ScratchNet.Run` and `Scope`, `ScratchNet` is compatible with future RNN design,
for example we can implement a simple recurrent neural network as followed

```c++
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this part should be rewritten.

Scope contains a parent argument. The local scope can access the parent scope variables. It should not copy variables between scopes.

Scope parent;

// in RNN
NetworkBase& stepNet;
std::vector<Scope> timestepScopes;
for (size_t i=0; i < seqLen; ++i) {
  timestepScopes.push_back(Scope(&parent));
  stepNet.Run(&timestepScopes);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will give another commit soon to fix this.


Network is designed as the container of operators, to make it more extendable,
we decompling it from the related variable resources.
A `scope` is provided to `Run` so that the network structure can be reused
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think

Run(Scope* scope) takes a scope as the parameter and so that it can run in different scopes

is better for understanding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


protected:
// Create operators accordding to `def`.
bool CreateNet(const NetDef &def);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe InitNet() is better? Because I think CreatNet will return a created Net and InitNet will fulfill and init itself.

@@ -0,0 +1,159 @@
# Network Design

`Network` is the container and controller of a set of operators in a network,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Network is the container and controller of a set of operators in a network.
=>
Network is the container and controller of a set of operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

# Network Design

`Network` is the container and controller of a set of operators in a network,
users can build a real network from a `NetDef` in protobuf message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or use AddOp to modify this Net

public:
// `def` is a proto message that describe the structure of a network.
NetworkBase(const NetDef& def);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @wanghaoshuang. It seems that operators within [begin_index, end_index] cannot define arbitrary sub-DAG.

virtual bool Run(Scope* scope) const = 0;

protected:
// keys of the input variables feed into the network.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feed --> fed


All network implementations should build networks from a protobuf message which
describes the structure of a real network; `Run` method should be implemented by
all implementations to offer a universal method to forward or backward compute a network.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be implemented by all implementations --> should be implemented by all its inheriting class

}
```

Network is designed as the container of operators, to make it more extendable,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--> Network is designed as a container for operators, with related variable resources decoupled from it to make it more extendable.

A `scope` is provided to `Run` so that the network structure can be reused
in different scopes.

Finally, `NetworkBase` can be used as followed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as followed --> as follows:


## A Simple Network Implemention

A very basic implemention is as followed, all it does is simply to run every operators in sequence.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. as followed --> as follows.
  2. ", all" --> "All"

@reyoung
Copy link
Collaborator

reyoung commented Jun 23, 2017

@wangkuiyi

我觉得NetworkBase中,包含AddOperator(const proto::OperatorDesc& );的接口是很必要的。

因为:

  1. Paddle让用户配置的只有神经网络的前向部分。而Backward和SGD的一些Op(每一个前向Op都对应一个或者多个Backward Op,每一个参数都对应着一个SGD Op)都是由Paddle框架添加的。
  2. 如何添加BackwardSGD Op的逻辑应该写在C++端。因为这段逻辑如果写在Python端,如果我们需要其他语言的language binding,我们还要再其他语言重写一遍。
    1.添加BackwardSGD Ops的函数,应该直接操作NetworkBase类型。没必要通过protobuf中转一遍。因为这些函数都是C++函数了。
  3. 所以,在AddBackwardOps函数中,需要调用NetworkBase::AddOperator(const proto::OperatorDesc&)接口。

NetworkBase::AddOperator接口是必要的,也得是public的。

Scope scope(CPUPlace());

// create and init variables described in `net_desc`.
scope.CreateVariables(net_desc);
Copy link
Contributor Author

@Superjomn Superjomn Jun 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NetworkBase has a member function called InferShape(Scope*) to infer shapes and create variables in scope, is its function duplicate with scope.CreateVariables?

Because the variables' information is located in OpDef and there is no field called vars in NetDef that holds all the information of variables required by the network, is scope.CreateVariables necessary?
@wangkuiyi @reyoung @jacquesqiao

Copy link
Collaborator

@reyoung reyoung Jun 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable information is not stored in NetDef, and in OpDef, there will only be input/output names for Variable. So CreateVariables be performed only by NetDef.

Maybe there should be a high-level concept called NetworkBuilder or Model. A NetworkBuilder takes a NetworkBase and a Scope as its inputs. In NetworkBuilder, the user can add layers, add backward ops, and get all parameters. During AddLayer functions, the variable is created by Network Builder.

NetworkBase network;
Scope scope;
Variable* image = scope.CreateVariable("Image");
Variable* label = scope.CreateVariable("Label");
NetworkBuilder builder(&network, &scope);

Variable* fc_out = builder.FCLayer(input=image, size=100, activation="Sigmoid");
Variable* prediction = builder.FCLayer(input=fc_out, size=10, activation="Sigmoid");
Variable* loss = builder.CrossEntropy(input=prediction, label=label);
Variable* avg_loss = builder.Mean(loss);

auto allParams = builder.Parameters();
builder.BackwardFrom(avg_loss)
builder.AddOptimization(1e-4, "adam");

// train one mini-batch
network.run(&scope);

Copy link
Contributor Author

@Superjomn Superjomn Jun 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we merge NetworkBuiler into NetworkBase?
NetworkBase is the base class of all "network" implementations, with a different implementation, NetBuilder may be different.
For example, SimpleNet 's network structure is different with DAGNet 's structure,
so a different NetBuilder.ApplyGradient may be needed. In other words, NetBuilder is coupled to specific implementation of NetworkBase.

If we merge NetBuilder's functions into NetworkBase 's specific implementations, then it is natural to have different "BackwardFrom" implementations.
@reyoung

Copy link
Collaborator

@reyoung reyoung Jun 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the logic of ApplyGradient is same for each implementation of NetworkBase. The different between PlainNet and other implementation is how to implement AddOperator.

Copy link
Contributor Author

@Superjomn Superjomn Jun 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I create another PR with NetBuilder
NetBuilder is treated as a user-friendly syntax API.
NetBase is the detailed implementation of Network-related APIs.

Users can create BaseNet using NetBuilder, but the detail of BaseNet is not visible to them.
With this two-level design, we will focus our use-related API on NetBuilder and implementation details only on NetBase.

Both of them are not coupled with each other.

@reyoung

@Superjomn Superjomn closed this Jun 29, 2017
@Superjomn Superjomn moved this from Doing to Done in PaddlePaddle Refactoring: Phase 1 Aug 2, 2017
heavengate pushed a commit to heavengate/Paddle that referenced this pull request Aug 16, 2021
* add senet

* add annotations according to review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

7 participants