
fea/init tensorrt engine #10003

Merged
34 commits merged into PaddlePaddle:develop on Apr 25, 2018

Conversation

@Superjomn (Contributor) commented Apr 18, 2018

fixes: #10004

@Superjomn force-pushed the fea/tensorrt_engine branch 2 times, most recently from c64c5ac to 4da8cbd on April 18, 2018 02:33
@Superjomn requested review from luotao1 and Xreki on April 18, 2018 02:36
@Xreki added the 预测 (formerly "Inference"; covers C-API inference issues, etc.) label on Apr 18, 2018
@Superjomn force-pushed the fea/tensorrt_engine branch 2 times, most recently from f7d80a2 to 74ea1f6 on April 18, 2018 05:28
@Xreki (Contributor) left a comment

I suggest describing the design rationale of the engine in more detail in the issue. Without a design document, it is hard to judge whether all of the interfaces are really necessary.

distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
Contributor

Please adjust the format of the copyright header. Also, the year in the copyright of newly added files should be 2018.

#include "paddle/fluid/framework/framework.pb.h"

namespace paddle {

Contributor

Following Paddle's namespace conventions, there should be one more level here: namespace inference.
Also, one question to confirm: will all TensorRT-related code, including tensorrt_op, be placed under the inference directory?

Contributor Author

The directory layout will probably be:

  • inference/engine.h, the high-level engine interface
  • inference/tensorrt, holding everything TensorRT-related
    • engine_op[.h/.cc], containing TensorrtEngineOp
    • convert[.h/.cc], helpers that convert fluid op -> tensorrt layer
  • inference/alajin, holding the alajin counterpart, similar to tensorrt

virtual void Build(const PbType& paddle_model) = 0;

// Execute the engine; this will run the inference network.
virtual void Execute(int batch_size) = 0;
Contributor

Wouldn't it be better for the Execute function to take Paddle's LoDTensor type as its parameter?

Contributor Author

This engine is only a utility class; it will be called by TensorrtEngineOp.

My understanding is that TensorrtEngineOp will wrap the engine into a fluid op.

The engine's inputs and outputs, and how many of each there are, are not fixed; DeclInput and DeclOutput are used to create the TensorRT input and output nodes respectively.

This class exposes quite a few small interfaces that help build the TensorRT network and the runtime engine, and these intermediate interfaces are basically all necessary. The most important consumer is TensorrtEngineOp; in addition it will be used in various UTs, for example to check that a fluid_op and the converted tensorrt layer produce no diff: this class makes it easy to run the tensorrt layer and fetch its results (see the sketch below).
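For illustration, a rough sketch of that UT-style usage, stitched together from the interfaces quoted elsewhere in this PR (the construction of engine_ and the weight/bias helpers are assumptions):

// Declare I/O, add one layer by hand, freeze, then run; no fluid op involved.
auto* x = engine_->DeclInput("x", TensorrtEngine::data_type::kFLOAT,
                             nvinfer1::DimsCHW{1, 1, 1});
auto* fc = TRT_ENGINE_ADD_LAYER(engine_, FullyConnected, *x, 1,
                                weight.get(), bias.get());
engine_->DeclOutput(fc, 0, "y");    // mark output #0 of the layer as "y"
engine_->FreezeNetwork();           // build the TensorRT runtime engine
float x_v = 1234;
engine_->SetInputFromCPU("x", (void*)&x_v, 1 * sizeof(float));
engine_->Execute(1);                // batch_size = 1
// ... then read "y" back from the engine's buffer and diff against the fluid op.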

Contributor

DeclInput and DeclOutput are used to create the TensorRT input and output nodes respectively

Apart from the actual conversion step, these two are very similar in function to ConvertInput and ConvertOutput in the Convert class. Is there a better design?

The four letters "Decl" are not clear in meaning; could it simply be called Add?

Contributor Author

Will rename it to DeclareInput, meaning it adds a data node to the TensorRT network.
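A minimal sketch of what that declaration boils down to (an assumption, not the PR's exact code): registering a named data node on the underlying nvinfer1::INetworkDefinition.

nvinfer1::ITensor* TensorrtEngine::DeclInput(const std::string& name,
                                             nvinfer1::DataType dtype,
                                             const nvinfer1::Dims& dims) {
  // addInput creates the network's data node; the network owns the tensor.
  auto* input = infer_network_->addInput(name.c_str(), dtype, dims);
  PADDLE_ENFORCE(input != nullptr, "failed to declare input %s", name);
  return input;
}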

@Superjomn (Contributor Author)
This engine is a utility class; a high-level design doc will be added as soon as possible. But this class itself should not need interface design discussion; interface design will only happen at the TensorrtEngineOp level. @Xreki


/*
* EngineBase is the base class of all inference engines. An inference engine
* takes a paddle program as input, and output the result in paddle Tensor
Contributor

  1. output->outputs
  2. paddle Tensor format->fluid tensor format

* EngineBase is the base class of all inference engines. An inference engine
* takes a paddle program as input, and output the result in paddle Tensor
* format. It can be used to optimize performance of computation subgraphs, for
* example, break down the original model into subgraphs and execute each
Contributor

Would using the block concept be more consistent? Same below.

  1. original model -> original block?
  2. subgraphs -> sub-block?

* When inference, the resnet50 model can put most of the model into subgraph
* and run it on a TensorRT engine.
*
* There are several engines such as TensorRT and other internal frameworks, so
Contributor

In "other internal frameworks", "internal" can be removed.

*/
class EngineBase {
public:
// TODO fix it latter
Contributor

What is line 39 (the TODO) supposed to fix? Could you describe it in more detail?
What does PbType refer to?

@Superjomn (Contributor Author) commented Apr 24, 2018

Here PbType denotes the format of the desc; it will be renamed to something clearer.

// them, and a macro like this is more extensible when the underlying TensorRT
// library adds support for new layers.
#define TRT_ENGINE_ADD_LAYER(engine__, layer__, ARGS...) \
engine__->network()->add##layer__(ARGS);
Contributor

Can this macro definition be removed?

  • Calling the original functions directly is also quite clear;
  • Since the convert class also needs to add different layers, it would need to include the engine class's header file; isn't that somewhat unreasonable?

Contributor Author

This macro will:

  • Provide a single, uniform interface for adding layers, so we don't need one function per layer type such as addFullyConnected (see the call-site sketch below).
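A call-site sketch: the FullyConnected and Activation argument lists follow the standard nvinfer1::INetworkDefinition API, and Weight::get() returning nvinfer1::Weights is an assumption.

// One macro covers every layer type: the token paste turns "FullyConnected"
// into a call to engine_->network()->addFullyConnected(...).
auto* fc = TRT_ENGINE_ADD_LAYER(engine_, FullyConnected, *x, n_out,
                                weight.get(), bias.get());
auto* act = TRT_ENGINE_ADD_LAYER(engine_, Activation, *fc->getOutput(0),
                                 nvinfer1::ActivationType::kRELU);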

PADDLE_ENFORCE(output != nullptr);
output->setName(name.c_str());
infer_network_->markOutput(*output);
buffer_sizes_[name] = 0;
Contributor

Why does line 156 set it to 0?

@Superjomn (Contributor Author) commented Apr 20, 2018

Added a comment.
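Presumably along these lines (my reading of the quoted code, not necessarily the exact comment that was added):

// An output's buffer size is unknown until the engine is frozen and run,
// so register the binding with size 0 as a placeholder to be set later.
buffer_sizes_[name] = 0;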

}

void*& TensorrtEngine::buffer(const std::string& name) {
PADDLE_ENFORCE(infer_engine_ != nullptr, "call freezenetwork first.");
Contributor

Add a space between "freeze" and "network".

Contributor Author

FreezeNetwork

}

void TensorrtEngine::DeclOutput(nvinfer1::ILayer* layer, int offset,
const std::string& name) {
Contributor

The first two parameters should also be const.

Contributor Author

const layer

}

void TensorrtEngine::SetInputFromCPU(const std::string& name, void* data,
size_t size) {
Contributor

The size_t size parameter should also be const.

TensorrtEngine::Weight bias(TensorrtEngine::data_type::kFLOAT, raw_bias,
size);
auto* x = engine_->DeclInput("x", TensorrtEngine::data_type::kFLOAT,
nvinfer1::DimsCHW{1, 1, 1});
Contributor

  1. There is no need to wrap data_type; directly using nvinfer's original type is clearer (see the sketch after this list). nvinfer1::DimsCHW here already uses nvinfer's format.
  2. If it is to be wrapped, does it have to live under the TensorrtEngine class? Then when the convert class uses these types, does it have to go through the TensorrtEngine class?
  3. The same applies to the Weight class.
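Judging from call sites such as TensorrtEngine::data_type::kFLOAT above, the wrapper is presumably just a nested alias (an assumption), so there is no conversion cost either way:

class TensorrtEngine : public EngineBase {
 public:
  // An alias, not a new type: TensorrtEngine::data_type::kFLOAT is
  // interchangeable with nvinfer1::DataType::kFLOAT.
  using data_type = nvinfer1::DataType;
  // ...
};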

Contributor Author

The engine's .cc will need to call convert anyway; the two calling each other should not be a problem.
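One common way to keep such cross-calls without a header cycle (a general C++ pattern, not necessarily what this PR ends up doing) is to forward-declare in the header and include engine.h only from convert.cc:

// convert.h: no #include "engine.h" needed here.
namespace paddle {
namespace inference {
namespace tensorrt {
class TensorrtEngine;  // forward declaration breaks the include cycle
// Hypothetical helper signature, for illustration only.
void ConvertOp(const framework::proto::OpDesc& op, TensorrtEngine* engine);
}  // namespace tensorrt
}  // namespace inference
}  // namespace paddle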

float x_v = 1234;
engine_->SetInputFromCPU("x", (void*)&x_v, 1 * sizeof(float));
LOG(INFO) << "to execute";
engine_->Execute(1);
@luotao1 (Contributor) commented Apr 19, 2018

Isn't the "Use the model to create an engine and an execution context" part going to be encapsulated as well?

Logger logger;
nvinfer1::IRuntime* runtime = createInferRuntime(logger);
nvinfer1::ICudaEngine* engine =
runtime->deserializeCudaEngine(model->data(), model->size(), nullptr);
model->destroy();
nvinfer1::IExecutionContext* context = engine->createExecutionContext();
// Execute the network.
float input = 1234;
float output;
Execute(*context, &input, &output);
EXPECT_EQ(output, input * 2 + 3);

Contributor Author

With this engine, the serialize step is simply not needed here.
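In other words, the whole runtime/deserialize block above collapses; roughly (a sketch, assuming FreezeNetwork builds the ICudaEngine and its execution context in place):

engine_->FreezeNetwork();  // builds the ICudaEngine in memory; no serialize step
engine_->Execute(1);       // runs through the engine's own execution context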

@Xreki added this to Integrate TensorRT in Inference Framework on Apr 20, 2018
* There are two alternative ways to use it, one is to build from a paddle
* protobuf model, another way is to manually construct the network.
*/
class TensorrtEngine : public EngineBase {
Contributor

TensorrtEngine -> TensorRTEngine: capitalize "RT" to be consistent with "TensorRT".

Contributor Author

ok

@luotao1 mentioned this pull request on Apr 23, 2018
@luotao1 (Contributor) left a comment

LGTM

@@ -1 +1,4 @@
nv_test(test_tensorrt SRCS test_tensorrt.cc DEPS dynload_cuda device_context dynamic_loader)
if(WITH_TESTING)
Contributor

The if(WITH_TESTING) guard is unnecessary here, because nv_test already checks it internally. It can be fixed in a later PR.

Contributor Author

OK

4 // kINT32
};

// The following two API are implemented in TensorRT's header file, cannot load
Contributor

API -> APIs

Contributor Author

OK

@Superjomn merged commit 2d57158 into PaddlePaddle:develop on Apr 25, 2018
@Superjomn deleted the fea/tensorrt_engine branch on April 25, 2018 12:29
Labels
预测 (formerly "Inference"; covers C-API inference issues, etc.)
Projects
Inference Framework: Integrate TensorRT
Development

Successfully merging this pull request may close these issues:

trt_engine_op/init engine

3 participants