
Paddle Function #892

Closed · hedaoyuan opened this issue Dec 14, 2016 · 10 comments

hedaoyuan commented Dec 14, 2016

The main purpose of refactoring Paddle's computation-related code is to use a unified interface to represent each function. A function here can be understood as an operator in a neural network. Refactoring this computation-related code is mainly intended to solve the following three issues.

  • Lack of a unified function interface description.
    In Paddle, the definition of Layer has a unified interface, but a Layer often calls a number of functions, and these functions do not have a unified interface description.
    For example, if I just want to write an FFT-based convolution forward calculation for model inference, which interface should I refer to?
  • Refactor the interfaces listed in issue Refactoring the Computation API in Paddle Based on Function #977.
    Most of these functions are defined as member functions of Matrix, but there is no specification of how a Matrix member function interface should be defined. This has led to problems with interface readability and interface testing (Making it easier to write unittest for comparing gpu and cpu version of a function #385).
  • Import third-party libraries, such as cuDNN and NNPACK, to improve the computational performance of Paddle on different hardware platforms (see Need to add some new algorithm(like direct, winograd) for convolution calculation. #2177).
    Importing a third-party library requires a piece of code that wraps the interface provided by that library so it can be called from Paddle. So what needs to be considered is how to simplify the writing of this wrapper code.

A unified function interface description

Paddle has a lot of computation-related code (issue #997); by analyzing these interfaces, we can understand what issues need to be solved.

  1. A calculation function that has one input and one output (Paddle Function #892 (comment)).
  2. A calculation function that has multiple inputs and multiple outputs (Paddle Function #892 (comment)).
  3. Some of the interfaces assign the result to the output, while others add the result to the output (Paddle Function #892 (comment)).
  4. The same function is implemented in multiple versions for different devices. At present, for the two devices CPU and GPU, this means implementing the two member functions CpuMatrix::memberFunction and GpuMatrix::memberFunction.
  5. The same function is implemented in multiple versions for different data types (such as double and float). At present, this is done by switching the definition of real at compile time.
  6. Matrix and Vector are used to represent two-dimensional and one-dimensional dense data types, respectively.
  7. A Matrix plus a Vector are used as two arguments to represent a single argument of the sequence type.
  8. SparseMatrix represents a sparse matrix of the NO_VALUE or FLOAT_VALUE type.

Once we have designed an interface that solves these problems, we can reorganize the code (#892 (comment)) behind the interfaces listed in #997.
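To make the target concrete, here is a minimal sketch of what such a unified interface could look like. The names FunctionBase and BufferArg follow the prototypes that appear later in this thread; the BufferArgs alias, the config map, and everything else in the sketch are placeholders rather than the actual Paddle API.

#include <map>
#include <string>
#include <vector>

// Placeholder: describes one buffer argument (shape, data pointer, device, ...).
class BufferArg;
// A list of arguments, covering both the one-input/one-output case and the
// multiple-inputs/multiple-outputs case with a single calc() signature.
typedef std::vector<BufferArg*> BufferArgs;

class FunctionBase {
public:
  virtual ~FunctionBase() {}
  // One-time configuration (strides, paddings, scale, ...); a plain key/value
  // map is used here purely for illustration.
  virtual void init(const std::map<std::string, double>& config) {}
  // A single calculation entry point shared by all functions; the CPU and GPU
  // versions implement the same interface, so callers never touch
  // CpuMatrix/GpuMatrix member functions directly.
  virtual void calc(const BufferArgs& inputs, const BufferArgs& outputs) = 0;
};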

hedaoyuan commented Dec 15, 2016

Reorganize the code

The code is reorganized to make these computational functions more readable and easier to test.
Consider adding CPU and GPU implementations for an algorithm X.

  • The original code structure
  1. The CPU code for algorithm X is implemented as a member function of the CpuMatrix class in the Matrix.cpp file.
  2. The GPU code for algorithm X is implemented in a .cu file in the cuda module and then wrapped in a member function of the GpuMatrix class in the Matrix.cpp file.
  3. In order to compile the CPU-only version, one either has to add PADDLE_ONLY_CPU macros to each member function of GpuMatrix (which is ugly) or implement an empty interface in cuda/stub (which causes some confusion).
  4. A test case is added in the test_matrixCompare.cpp file to compare the CPU and GPU implementations (writing the test case is quite tedious).
  • New code structure
  1. All the code is placed in the function directory: the CPU code in the x_op.cpp file and the GPU code in the x_op_gpu.cu file.
  2. The test case is implemented in the x_op_test.cpp file to compare the CPU and GPU implementations.

One obvious drawback of the original code structure is that the Matrix.cpp file keeps growing and is changed frequently.
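To make the new layout concrete, here is a hedged sketch of what x_op_test.cpp could look like for a hypothetical algorithm X. The file names and the PADDLE_ONLY_CPU macro come from the discussion above; the entry points xOpCpu/xOpGpu and their signatures are invented for illustration only.

// function/x_op_test.cpp -- compares the CPU and GPU implementations.
#include <gtest/gtest.h>
#include <cstddef>
#include <vector>

// Hypothetical declarations from function/x_op.h; the real signatures would
// take Paddle's buffer types rather than std::vector.
void xOpCpu(const std::vector<float>& in, std::vector<float>& out);
void xOpGpu(const std::vector<float>& in, std::vector<float>& out);

TEST(XOp, CpuGpuConsistency) {
  std::vector<float> input(1024, 1.0f);
  std::vector<float> cpuOut(1024, 0.0f);
  xOpCpu(input, cpuOut);
#ifndef PADDLE_ONLY_CPU  // skip the GPU path in the CPU-only build
  std::vector<float> gpuOut(1024, 0.0f);
  xOpGpu(input, gpuOut);
  for (size_t i = 0; i < cpuOut.size(); ++i) {
    EXPECT_NEAR(cpuOut[i], gpuOut[i], 1e-5);
  }
#endif
}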

hedaoyuan commented Dec 15, 2016

Function Interface

According to the issues listed in the unified function interface description above, the simplest form of the interface computes one output from one input.

Output = Function(Input);

In order to support multiple inputs and multiple outputs, the interface form needs to be slightly modified.

Outputs = Function(Inputs);

In order to enable the calculation result to be assigned to the output or to be added to the output, the interface needs to be designed to support both modes.

Outputs = Function(Inputs);
Outputs += Function(Inputs);
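The difference between the two modes is easiest to see on a plain element-wise computation. The snippet below is only an illustration with raw arrays, not the actual Function interface; the ASSIGN_TO/ADD_TO mechanism that realizes this appears later in the thread.

#include <cstddef>

// Assign mode: the result overwrites the output buffer (out = Function(in)).
void assignMode(const float* in, float* out, size_t n) {
  for (size_t i = 0; i < n; ++i) out[i] = 2.0f * in[i];   // placeholder computation
}

// Add mode: the result is accumulated into the output buffer
// (out += Function(in)), e.g. when several functions share one output.
void addMode(const float* in, float* out, size_t n) {
  for (size_t i = 0; i < n; ++i) out[i] += 2.0f * in[i];  // same computation, accumulated
}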

@hedaoyuan

FAQ
Some questions from @tianbingsz about developing on top of Function:

  1. FunctionBase::calc has only one signature; when refactoring the Matrix API, some functions are awkward to express with it. Can more virtual function signatures be added?
    There is no need to add more virtual signatures. FunctionBase was designed from the start around a single calc interface. The current class Tensor can only represent dense data types; it will be extended later.
  2. FunctionCompare::cmpWithArg also has only one interface. In some test cases the arguments are initialized outside cmpWithArg, and they get overwritten when cmpWithArg is called.
    Here, too, a single interface is all that is needed. The intent is that a test case only has to specify the Function under test and the argument types (and once the Function Arguments Type Checking from the commit above is implemented, even the argument types will no longer be needed); the allocation and initialization of the arguments are handled inside the interface, which simplifies writing test cases.

hedaoyuan commented Jan 5, 2017

Function Argument Type
The calc interface of Function takes three arguments: inputs, outputs, and inouts. Each argument is a BufferArgs structure, and a BufferArgs structure holds a list of BufferArg; each BufferArg describes the type of one input/output/inout.

The argument types of Function in Paddle mainly fall into the following four kinds:

  1. TENSOR_NORMAL describes an argument that is a dense buffer of arbitrary dimensions; it corresponds to the BufferArg structure.
  2. TENSOR_SEQUENCE_ID describes an argument that is a one-dimensional vector. Unlike TENSOR_NORMAL, the values in the buffer form an increasing sequence; it is used to describe special buffers such as sequenceStartPositions and implicitly carries the number of sequences. It corresponds to the SequenceIdArg structure and is usually combined with a TENSOR_NORMAL to describe a TENSOR_SEQUENCE_DATA.
  3. TENSOR_SEQUENCE_DATA describes a buffer holding sequence data. Paddle's sequence data allows a mini-batch to contain multiple sequences of unequal length; it is implemented as a TENSOR_NORMAL together with a TENSOR_SEQUENCE_ID and corresponds to the SequenceArg structure (a concrete example follows the enum below).
  4. TENSOR_SPARSE describes a SparseMatrix buffer. Paddle's sparse support currently covers the CSR and CSC formats and the NO_VALUE and FLOAT_VALUE data types; it corresponds to the SparseMatrixArg structure.
enum BufferType {
  TENSOR_NORMAL = 0,
  TENSOR_SEQUENCE_ID = 1,
  TENSOR_SEQUENCE_DATA = 2,
  TENSOR_SPARSE = 3
};
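As a purely illustrative example of how one sequence argument is described by two buffers: a mini-batch holding three sequences of lengths 2, 3, and 1 uses a dense data buffer of 6 rows (the TENSOR_NORMAL part) plus a vector of start offsets (the TENSOR_SEQUENCE_ID part). The numbers below are made up for illustration.

// A mini-batch with 3 sequences of lengths 2, 3 and 1; dim is the feature
// size of one time step.
const int dim = 4;
// TENSOR_NORMAL part: the dense data, (2 + 3 + 1) = 6 rows by dim columns.
float data[6][dim];
// TENSOR_SEQUENCE_ID part: increasing start offsets; the 4 entries
// implicitly carry the number of sequences (4 - 1 = 3).
int sequenceStartPositions[] = {0, 2, 5, 6};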

hedaoyuan commented Jan 6, 2017

A new design (#892 (comment))

---------------------- obsolete-------------------------------
Function outputs and inouts arguments
The prototype of the Function::calc interface contains three arguments: inputs, outputs, and inouts.

  virtual void calc(const BufferArgs& inputs,
                    const BufferArgs& outputs,
                    const BufferArgs& inouts) {}

The difference between outputs and inouts is as follows: if the Function writes its result to the output buffer in Assign mode, the output buffer is passed via outputs; if the result is produced in Add mode, the output buffer is passed via inouts. Accordingly, the implementation of Function::calc has to distinguish the two cases by inspecting the arguments (a sketch follows the list below).
Add mode mainly covers the following scenarios:

  1. When a MixedLayer connects several input buffers and Functions to the same output buffer, the output buffer needs to be passed via inouts.
  2. When one inputLayer is connected to several outputLayers, each outputLayer needs to pass the inputGrad argument via inouts when calling the backward Function.
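A minimal sketch of how a calc implementation could tell the two cases apart under this (now obsolete) convention; the function name is hypothetical and BufferArgs is assumed to expose a size() query.

// Hypothetical calc body following the obsolete outputs/inouts convention.
void calcSketch(const BufferArgs& inputs,
                const BufferArgs& outputs,
                const BufferArgs& inouts) {
  if (outputs.size() > 0) {
    // Assign mode: the result overwrites the buffer passed via outputs,
    //   outputs[i] = f(inputs)
  } else {
    // Add mode: the result is accumulated into the buffer passed via inouts,
    //   inouts[i] += f(inputs)
  }
}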

@tianbingsz

@hedaoyuan, regarding scenario 1 (MixedLayer): why does it need inouts, and is the output there Assign or Add? Could you give a small example?

@hedaoyuan

If a MixedLayer connects only one inputLayer to its output, then output = Function(input); is all that is needed.
If it connects several inputs, each with its own Function, then:
output = Function1(input1);
output += Function2(input2);

tianbingsz commented Jan 9, 2017 via email

hedaoyuan commented Jan 12, 2017

Some Primary Commit

Function prototype

/**
 * \brief Base class for Function.
 * A basic Function implementation requires overriding the init and calc
 * interfaces.
 *
 * Function inputs are read-only; Function outputs have two modes: ASSIGN_TO
 * and ADD_TO.
 * If output.getArgType() == ASSIGN_TO, this is assign mode, and the calculation
 * result of the Function is assigned to the output BufferArg.
 * If output.getArgType() == ADD_TO, this is add mode, and the calculation
 * result of the Function is added to the output BufferArg.
 *
 * For example:
 * ASSIGN_TO: output = Function(inputs)
 * ADD_TO: output += Function(inputs)
 * If a Function has more than one output, each output can have a different mode.
 */
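A hedged sketch of a calc body that honors both modes described above: getArgType(), ASSIGN_TO, and ADD_TO come from the comment, while the element-wise computation and the data<float>()/numElements() accessors are assumptions made only for illustration.

#include <cstddef>

// Sketch of an element-wise function supporting both output modes.
void elementwiseScale(const BufferArg& input, BufferArg& output, float scale) {
  const float* in = input.data<float>();   // assumed accessor
  float* out = output.data<float>();       // assumed accessor
  size_t n = input.numElements();          // assumed accessor
  if (output.getArgType() == ASSIGN_TO) {
    for (size_t i = 0; i < n; ++i) out[i] = scale * in[i];   // output = f(input)
  } else {  // ADD_TO
    for (size_t i = 0; i < n; ++i) out[i] += scale * in[i];  // output += f(input)
  }
}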

@hedaoyuan

Function comments

Function is the abstraction of computation in Paddle and is called by other modules (Layer, ParameterUpdate, and so on), so every Function needs clear comments covering the concrete implementation, the types of the input and output arguments, and which output modes (ASSIGN_TO/ADD_TO) the Function supports. The purpose of the comments is to let users understand the implementation logic of a Function and call it correctly without having to read the code itself.

Some basic guidelines for Function comments:

  1. If a Function implements a formula from a paper, cite the source and the formula in the comment, and clearly describe how the Function's arguments map to the symbols in the formula.
  2. The types of the arguments must be stated clearly in the comment so that callers can pass them correctly. (A hedged example follows the list.)
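Following these guidelines, a Function comment might look like the example below. The function name, the formula, and the parameter list are made up purely for illustration; if the formula came from a paper, the citation would replace the placeholder note.

/**
 * \brief Cosine similarity forward computation (hypothetical example).
 *
 * Computes out[i] = scale * <x[i], y[i]> / (||x[i]|| * ||y[i]||) for each
 * row i, where x and y correspond to the in1 and in2 arguments below.
 * (Cite the source paper and formula here, and map each argument to the
 * symbols used in that paper.)
 *
 * \param[in]  in1   TENSOR_NORMAL, shape [batchSize, dim].
 * \param[in]  in2   TENSOR_NORMAL, shape [batchSize, dim] or [1, dim].
 * \param[out] out   TENSOR_NORMAL, shape [batchSize, 1]; supports both
 *                   ASSIGN_TO and ADD_TO.
 */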
