Feature/fc converter #11043

Superjomn · 2018-05-30T10:30:58Z

No description provided.

…/fc_converter

luotao1 · 2018-06-01T02:41:57Z

paddle/fluid/inference/tensorrt/convert/fc_op.cc

+    framework::OpDesc op_desc(op, nullptr, nullptr);
+    PADDLE_ENFORCE_EQ(op_desc.Input("X").size(), 1);
+    PADDLE_ENFORCE_EQ(op_desc.Input("Y").size(), 1);     // Y is a weight
+    PADDLE_ENFORCE_EQ(op_desc.Output("Out").size(), 1);  // Y is a weight


The comment on line 61 is wrong.

luotao1 · 2018-06-01T02:45:02Z

paddle/fluid/inference/tensorrt/convert/fc_op.cc

+  for (int h = 0; h < shape.h(); ++h) {
+    for (int w = 0; w < shape.w(); ++w) {
+      odata[h * ostrides.h() + w * ostrides.w()] =
+          idata[h * ostrides.h() + w * ostrides.w()];


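Judging from the implementation, this is just odata[i] = idata[i], so is the reorder still needed?

This was copied from TF. I tried it and, strangely, the function seems to be required. There is also a Reorder4 function; I will look at it more closely when it is needed later.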

When you write Reorder4 later, this part could be tidied up as well:

    for (int h = 0; h < shape.h(); ++h) {
      for (int w = 0; w < shape.w(); ++w) {
        int index = h * ostrides.h() + w * ostrides.w();
        odata[index] = idata[index];
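The question in this thread turns on whether the input and the output are indexed with the same strides. As a minimal sketch, assuming the reorder is meant to read with its own input strides (the hunk above indexes both sides with `ostrides`), a 2-D strided copy stops being `odata[i] = idata[i]` as soon as the two stride pairs differ. `HW`, `Reorder2Sketch`, and `CKtoKCSketch` are hypothetical names used only for illustration; `HW` stands in for `nvinfer1::DimsHW` so the example builds without TensorRT.

```cpp
#include <vector>

// Hypothetical stand-in for nvinfer1::DimsHW (assumption, not the real type).
struct HW {
  int h;
  int w;
};

// Copies a 2-D block element by element.
//   shape    - extent of the block along the two logical axes (h, w)
//   idata    - source buffer, read using istrides
//   istrides - element step in idata for one step along h and along w
//   odata    - destination buffer, written using ostrides
//   ostrides - element step in odata for one step along h and along w
template <typename T>
void Reorder2Sketch(HW shape, const T* idata, HW istrides, T* odata,
                    HW ostrides) {
  for (int h = 0; h < shape.h; ++h) {
    for (int w = 0; w < shape.w; ++w) {
      odata[h * ostrides.h + w * ostrides.w] =
          idata[h * istrides.h + w * istrides.w];
    }
  }
}

// Transposes a row-major C x K matrix into a row-major K x C matrix by
// walking the output index space (h = k index, w = c index).
template <typename T>
std::vector<T> CKtoKCSketch(const std::vector<T>& in, int c, int k) {
  std::vector<T> out(in.size());
  HW shape{k, c};
  HW istrides{1, k};  // input element (c_idx, k_idx) lives at c_idx * k + k_idx
  HW ostrides{c, 1};  // output element (k_idx, c_idx) lives at k_idx * c + c_idx
  Reorder2Sketch(shape, in.data(), istrides, out.data(), ostrides);
  return out;
}
```

With identical stride pairs the function degenerates to a plain copy, which is what the reviewer observed; with different pairs it performs the layout change.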

luotao1 · 2018-06-01T02:45:42Z

paddle/fluid/inference/tensorrt/convert/fc_op.cc

+namespace inference {
+namespace tensorrt {
+
+template <typename T>


Please add comments describing what Reorder2 and ReorderCKtoKC do and what their parameters mean.

luotao1 · 2018-06-01T02:47:54Z

paddle/fluid/inference/tensorrt/convert/fc_op.cc

+    PADDLE_ENFORCE_NOT_NULL(Y_v);
+    auto* Y_t = Y_v->GetMutable<framework::LoDTensor>();
+    // This may trigger a CPU->GPU copy.
+    // TODO(Superjomn) use some smarter mutable_data.


If Y_v is read from the scope, it can already be on the GPU from the start, so no copy is needed here.

Updated the comment.

luotao1 · 2018-06-01T03:24:37Z

paddle/fluid/inference/tensorrt/convert/ut_helper.h

+  void DeclParamVar(const std::string& name, const nvinfer1::Dims& dims) {
+    DeclVar(name, dims);
+  }
+
  void DeclOutputVar(const std::string& name, const nvinfer1::Dims& dims) {
    DeclVar(name, dims);
  }


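DeclParamVar and DeclOutputVar are identical. Do we really need two separate wrappers?

The call sites do need different semantics. This distinguishes parameters from outputs; otherwise both places would just call DeclVar and the difference would not be visible.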
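The reply above argues for intent-revealing wrappers. A minimal sketch of that idea follows, assuming a simplified helper class; `VarDeclSketch`, `Has`, and the use of `std::vector<int>` for dimensions are hypothetical and do not mirror the real ut_helper.h.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the unit-test helper.
class VarDeclSketch {
 public:
  // Declares a variable that holds a weight (parameter) of the op.
  void DeclParamVar(const std::string& name, const std::vector<int>& dims) {
    DeclVar(name, dims);
  }

  // Declares a variable that receives the op's output.
  void DeclOutputVar(const std::string& name, const std::vector<int>& dims) {
    DeclVar(name, dims);
  }

  // Reports whether a variable with this name has been declared.
  bool Has(const std::string& name) const { return vars_.count(name) > 0; }

 private:
  // Shared implementation; the public wrappers differ only in name.
  void DeclVar(const std::string& name, const std::vector<int>& dims) {
    vars_[name] = dims;
  }

  std::map<std::string, std::vector<int>> vars_;
};
```

A test body then reads `DeclParamVar("fc-W", ...)` followed by `DeclOutputVar("fc-Out", ...)`, so the role of each variable is visible without a comment.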

luotao1 · 2018-06-01T03:26:00Z

paddle/fluid/inference/tensorrt/convert/ut_helper.h


    ASSERT_FALSE(op_desc_->OutputArgumentNames().empty());
    for (const auto& output : op_desc_->OutputArgumentNames()) {
      std::vector<float> fluid_out;
-      std::vector<float> trt_out(200);
+      std::vector<float> trt_out(200, 2008.);


Please add a comment explaining what 200 and 2008 mean.
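One way to address this request is to name the two literals. The sketch below assumes that 200 is an upper bound on the number of output elements and that 2008 is an arbitrary sentinel marking slots the engine never wrote; the thread does not confirm either meaning, and all names here are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Assumed meaning: upper bound on the number of output elements read back.
constexpr std::size_t kMaxOutputElems = 200;
// Assumed meaning: arbitrary sentinel for "not yet written by the engine".
constexpr float kUnwrittenSentinel = 2008.f;

// Builds an output buffer in which every slot starts as the sentinel.
std::vector<float> MakeOutputBuffer() {
  return std::vector<float>(kMaxOutputElems, kUnwrittenSentinel);
}

// Counts the slots whose value differs from the sentinel.
std::size_t CountWritten(const std::vector<float>& buf) {
  std::size_t n = 0;
  for (float v : buf) {
    if (v != kUnwrittenSentinel) ++n;
  }
  return n;
}
```

Pre-filling with a recognizable value makes it easier to tell a buffer that was never written from one that holds real results.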

luotao1 · 2018-06-01T03:26:41Z

paddle/fluid/operators/tensorrt_engine_op.cc

@@ -31,8 +31,10 @@ void paddle::operators::TensorRTEngineKernel<DeviceContext, T>::Prepare(
  auto max_workspace = context.Attr<int>("max_workspace");
  engine_.reset(new inference::tensorrt::TensorRTEngine(
      max_batch_, max_workspace, nullptr));
+  // TODO(Superjomn) parameters should be passed be analysised and passed from
+  // outside.


Please update the comment on lines 34-35; it contains "be" twice.

…/fc_converter

luotao1

LGTM

luotao1 · 2018-06-01T06:33:36Z

paddle/fluid/inference/tensorrt/convert/fc_op.cc

+  for (int h = 0; h < shape.h(); ++h) {
+    for (int w = 0; w < shape.w(); ++w) {
+      odata[h * ostrides.h() + w * ostrides.w()] =
+          idata[h * ostrides.h() + w * ostrides.w()];


When you write Reorder4 later, this part could be tidied up as well:

    for (int h = 0; h < shape.h(); ++h) {
      for (int w = 0; w < shape.w(); ++w) {
        int index = h * ostrides.h() + w * ostrides.w();
        odata[index] = idata[index];

Superjomn added 5 commits May 29, 2018 17:33

init

768cf55

init

7fdbd92

init

8df9096

fix compile error

96fe6a1

fix ut

5871e04

Superjomn force-pushed the feature/fc_converter branch from 943a90c to 5871e04 Compare May 31, 2018 12:39

Superjomn added 2 commits May 31, 2018 20:40

Merge branch 'develop' of github.com:PaddlePaddle/Paddle into feature…

20739a8

…/fc_converter

fix compile

5d24919

Superjomn requested a review from luotao1 June 1, 2018 00:26

luotao1 reviewed Jun 1, 2018

Superjomn added 2 commits June 1, 2018 13:56

fix follow review

d5bd249

Merge branch 'develop' of github.com:PaddlePaddle/Paddle into feature…

c0829e5

…/fc_converter

luotao1 approved these changes Jun 1, 2018

Superjomn merged commit 0c0c5df into PaddlePaddle:develop Jun 1, 2018

Superjomn deleted the feature/fc_converter branch June 1, 2018 07:39

Superjomn mentioned this pull request Jun 1, 2018

fix compile error #11119

Merged

luotao1 mentioned this pull request Jun 12, 2018

fc/mul TRT converter #10632

Closed
