[MKL-DNN] Fully Connected #15226

Merged: 35 commits, May 24, 2019. Changes shown are from all commits.

Commits
0217527
Reimplement the FC forward operator
yihuaxu Aug 30, 2018
333123d
fuse mul and elementwise add to fc
bingyanghuang Aug 24, 2018
a086125
Fix FC MKLDNN integration by transposing weights
Sand3r- Oct 3, 2018
a6ed9d6
Add FC MKLDNN Pass
Sand3r- Oct 9, 2018
3a758f4
FC MKLDNN Pass: change memcpy to std::copy
Sand3r- Oct 12, 2018
1b038bf
Fix MKLDNN FC handling of mismatch input and weights dims
Sand3r- Oct 17, 2018
165a65c
Lower tolerance for MKL-DNN in resnet50 test
Sand3r- Oct 17, 2018
15cb840
Adjust FC to support MKLDNN Op placement
Sand3r- Oct 19, 2018
278b3ed
Adjust Placement Op to set use_mkldnn attribute for graph
Sand3r- Oct 22, 2018
161c9ee
MKLDNN FC: fix weights format so that gemm version is called
Sand3r- Nov 6, 2018
a36e01c
FC MKLDNN: Remove tolerance decrease from tester_helper
Sand3r- Dec 12, 2018
98aef65
FC MKL-DNN: Refactor the code, change input reorder to weight reorder
Sand3r- Dec 12, 2018
df52aa1
MKL-DNN FC: Introduce operator caching
Sand3r- Dec 18, 2018
4160b65
FC MKL-DNN: Fix the tensor type in ExpectedKernelType
Sand3r- Dec 18, 2018
48373a6
FC MKL-DNN: fix style changes
Sand3r- Dec 18, 2018
f689a45
FC MKL-DNN: fallback to native on non-supported dim sizes
Sand3r- Jan 8, 2019
20d919e
FC MKLDNN: fix CMake paths
Sand3r- Feb 26, 2019
85226b5
FC MKLDNN: Refine placement pass graph mkldnn attribute
Sand3r- Feb 26, 2019
5bd0f7b
Fix Transpiler error for fuse_conv_eltwise
Sand3r- Mar 13, 2019
bb5b170
Fix missing STL includes in files
Sand3r- Mar 14, 2019
b5efc68
FC MKL-DNN: Enable new output size computation
Sand3r- Apr 2, 2019
5853915
FC MKL-DNN: enable only when fc_mkldnn_pass is enabled
Sand3r- Apr 3, 2019
f2eb6a4
FC MKL-DNN: Allow Weights to use oi or io format
Sand3r- Apr 3, 2019
492ecbc
FC MKL-DNN: Adjust UT to work with correct dims
Sand3r- Apr 5, 2019
5979bfa
Enable MKL DEBUG for resnet50 analyzer
Sand3r- Apr 15, 2019
7f900c4
FC MKL-DNN: Improve Hashing function
Sand3r- Apr 15, 2019
36dcdd3
FC MKL-DNN: Fix shape for fc weights in transpiler
Sand3r- Apr 17, 2019
6a63263
FC MKL-DNN: Update input pointer in re-used fc primitive
Sand3r- Apr 17, 2019
110945c
Add log for not handling fc fuse for unsupported dims
Sand3r- Apr 17, 2019
886d74d
FC MKL-DNN: Move transpose from pass to Op Kernel
Sand3r- May 13, 2019
f7ecfb7
FC MKL-DNN: Disable transpose in unit test
Sand3r- May 14, 2019
68c0fda
FC MKL-DNN: Remove fc_mkldnn_pass from default list
Sand3r- May 22, 2019
7dd6ce3
Correct Flag for fake data analyzer tests
Sand3r- May 22, 2019
678904c
FC MKL-DNN: Add comment about fc mkldnn pass disablement
Sand3r- May 22, 2019
04cbeeb
FC MKL-DNN: Disable fc in int8 tests
Sand3r- May 23, 2019
2 changes: 1 addition & 1 deletion cmake/generic.cmake
@@ -385,7 +385,7 @@ function(cc_test TARGET_NAME)
set_property(TEST ${TARGET_NAME} PROPERTY ENVIRONMENT FLAGS_cpu_deterministic=true)
set_property(TEST ${TARGET_NAME} PROPERTY ENVIRONMENT FLAGS_init_allocated_mem=true)
set_property(TEST ${TARGET_NAME} PROPERTY ENVIRONMENT FLAGS_limit_of_tmp_allocation=4294967296) # 4G
-set_property(TEST ${TARGET_NAME} PROPERTY ENVIRONMENT FLAGS_cudnn_deterministic=true)
+set_property(TEST ${TARGET_NAME} PROPERTY ENVIRONMENT FLAGS_cudnn_deterministic=true ${MKL_DEBUG_FLAG})
# No unit test should exceed 10 minutes.
set_tests_properties(${TARGET_NAME} PROPERTIES TIMEOUT 600)
endif()
1 change: 1 addition & 0 deletions paddle/fluid/framework/ir/CMakeLists.txt
@@ -88,6 +88,7 @@ if(WITH_MKLDNN)
pass_library(conv_brelu_mkldnn_fuse_pass inference mkldnn)
pass_library(conv_concat_relu_mkldnn_fuse_pass inference mkldnn)
pass_library(conv_elementwise_add_mkldnn_fuse_pass inference mkldnn)
pass_library(fc_mkldnn_pass inference mkldnn)
pass_library(cpu_quantize_placement_pass base mkldnn)
pass_library(cpu_quantize_pass inference mkldnn)
pass_library(cpu_quantize_squash_pass inference mkldnn)
2 changes: 2 additions & 0 deletions paddle/fluid/framework/ir/fc_fuse_pass.cc
@@ -13,6 +13,7 @@
// limitations under the License.

#include "paddle/fluid/framework/ir/fc_fuse_pass.h"
#include <memory>
#include <string>
#include <unordered_set>
#include <vector>
@@ -80,6 +81,7 @@ void FCFusePass::ApplyImpl(ir::Graph* graph) const {
}

desc.SetType("fc");

auto fc_node = g->CreateOpNode(&desc); // OpDesc will be copied.
GraphSafeRemoveNodes(graph, {mul, elementwise_add, mul_out});

30 changes: 30 additions & 0 deletions paddle/fluid/framework/ir/graph_pattern_detector.cc
@@ -14,7 +14,10 @@

#include <algorithm>
#include <array>
#include <memory>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

#include "paddle/fluid/framework/ir/graph_helper.h"
@@ -896,6 +899,33 @@ PDNode *patterns::FC::operator()(paddle::framework::ir::PDNode *x,
}
}

PDNode *patterns::FCMKLDNN::operator()(paddle::framework::ir::PDNode *x,

Contributor:
Since we have PDNode *patterns::FC::operator(), why do we need PDNode *patterns::FCMKLDNN::operator() again? We don't have any PDNode *patterns::xxxMKLDNN::operator() in this file.

Contributor Author:
It's because the FC pattern detector searches for a mul + elementwise_add pattern, while FCMKLDNN searches for an FC op pattern.

Contributor:
I didn't get this point, actually. From the pattern side, is there any difference from the original FC pattern? Maybe you only need to add use_mkldnn=True?

Contributor Author (Sand3r-, Apr 24, 2019):
1. Yes, it is different:
   • Paddle's FC pattern searches for a Mul op followed by an Elementwise Add op.
   • MKL-DNN's FC pattern searches just for the FC op created by fc_fuse_pass; it doesn't search for mul/elementwise_add.
2. It is not enough to set use_mkldnn to true, because the weights ("W") of the operator need to be transposed.
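
To make the layout point concrete, here is a minimal sketch of the reorder under discussion, assuming a plain row-major float matrix. The helper name and the use of std::vector are illustrative only; the PR itself operates on tensor data inside the pass (and, in later commits, inside the kernel):

```cpp
#include <cstddef>
#include <vector>

// Sketch: convert a row-major "io" weight matrix (in_features x out_features),
// the layout fc's W arrives in, into "oi" (out_features x in_features),
// the layout MKL-DNN's inner-product primitive expects for its weights.
std::vector<float> TransposeIoToOi(const std::vector<float>& w_io,
                                   std::size_t in_features,
                                   std::size_t out_features) {
  std::vector<float> w_oi(w_io.size());
  for (std::size_t i = 0; i < in_features; ++i) {
    for (std::size_t o = 0; o < out_features; ++o) {
      w_oi[o * in_features + i] = w_io[i * out_features + o];
    }
  }
  return w_oi;
}
```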

Contributor:
The reason you transpose the weights of this op in the pass is to avoid a duplicated transpose in the op's Compute(). How much time does this weight transpose take?

However, from the framework's perspective, an xxx_mkldnn_op should have the same behavior as the corresponding xxx_op: they may have different kernels, but the same inputs/outputs/weights. That is, fc's weights should be transposed in the mkldnn kernel.

Currently,
  • this fc_mkldnn_op is hard to extend to training. If fc_mkldnn_op were to support training, how would you implement it?
  • "it is not enough to set use_mkldnn to true" may confuse users.

@jianhang-liu What do you think about it?

Contributor Author:
There is no need to do that: if we save the optimized model with all the passes applied, then once it is loaded again it will execute just fine in an MKL-DNN environment, because the passes will already have been applied and the weights transposed.

There is no point in running a saved optimized model in another environment anyway, because passes such as conv + batch_norm + bias have already introduced changes that are only applicable in an MKL-DNN-only environment (there is no support for bias in the reference conv).

Is that what you meant?

Contributor:
Got it for FP32, but for INT8 we will be running the saved optimized model in another environment. #17097 is based on this PR and uses the weight transpose as well.

Contributor Author:
As far as I understand, if the kernel from #17097 is adapted to use transposed weights, then everything should be set and ready for running the saved optimized model in the INT8 Python environment. If not, it is always possible to transpose the weights back using the transpiler.

Contributor Author:
The current implementation doesn't modify the weights variable itself. It transposes the weights and stores the result internally in the execution context.
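
As a rough illustration of that transpose-and-cache idea, consider the sketch below. The class and the string key are invented for illustration; the PR's actual caching (see the "Introduce operator caching" and "Improve Hashing function" commits) lives inside the MKL-DNN FC code and hashes more state:

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Sketch: the first run pays for the weight transpose and primitive
// creation; later runs that compute the same key reuse the stored
// object instead of redoing the work.
template <typename TPrimitive>
class PrimitiveCache {
 public:
  std::shared_ptr<TPrimitive> Get(const std::string& key) const {
    auto it = cache_.find(key);
    return it == cache_.end() ? nullptr : it->second;
  }
  void Put(const std::string& key, std::shared_ptr<TPrimitive> prim) {
    cache_[key] = std::move(prim);
  }

 private:
  std::unordered_map<std::string, std::shared_ptr<TPrimitive>> cache_;
};
```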

Contributor Author:
> Since we have PDNode *patterns::FC::operator(), why do we need PDNode *patterns::FCMKLDNN::operator() again? We don't have any PDNode *patterns::xxxMKLDNN::operator() in this file.

The pattern is needed to detect the fully connected layer created by fc_fuse_pass.

bool with_bias) {
// Create shared nodes.
x->assert_is_op_input("fc", "Input");

auto *fc_op = pattern->NewNode(fc_repr())->assert_is_op("fc");
// Create variables
// Filter
auto *fc_weight_var = pattern->NewNode(weights_repr())
->AsInput()
->assert_is_persistable_var()
->assert_is_op_input("fc", "W");
// Bias
auto *fc_bias_var = pattern->NewNode(bias_repr())
->AsInput()
->assert_is_persistable_var()
->assert_is_op_input("fc", "Bias");
// Output
auto *fc_out_var = pattern->NewNode(output_repr())
->AsOutput()
->assert_is_op_output("fc", "Out")
->assert_is_only_output_of_op("fc");

fc_op->LinksFrom({x, fc_weight_var, fc_bias_var}).LinksTo({fc_out_var});
return fc_out_var;
}

PDNode *patterns::Embedding::operator()(PDNode *x) {
x->assert_is_op_input("lookup_table", "Ids");
auto *lookup_table_op =
19 changes: 19 additions & 0 deletions paddle/fluid/framework/ir/graph_pattern_detector.h
@@ -517,6 +517,25 @@ struct FC : public PatternBase {
PATTERN_DECL_NODE(Out);
};

// MKL-DNN's FC with bias
// op: fc
// named node:
// fc
// w, bias, output
struct FCMKLDNN : public PatternBase {
FCMKLDNN(PDPattern* pattern, const std::string& name_scope)
: PatternBase(pattern, name_scope, "fc_mkldnn") {}

PDNode* operator()(PDNode* x, bool with_bias);

// declare operator node's name
PATTERN_DECL_NODE(fc);
// declare variable node's name
PATTERN_DECL_NODE(weights);
PATTERN_DECL_NODE(bias);
PATTERN_DECL_NODE(output);
};

// Embedding
struct Embedding : public PatternBase {
Embedding(PDPattern* pattern, const std::string& name_scope)
77 changes: 77 additions & 0 deletions paddle/fluid/framework/ir/mkldnn/fc_mkldnn_pass.cc
@@ -0,0 +1,77 @@
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "paddle/fluid/framework/ir/mkldnn/fc_mkldnn_pass.h"
#include <algorithm>
#include <memory>
#include <string>
#include <vector>
#include "paddle/fluid/framework/eigen.h"
#include "paddle/fluid/framework/lod_tensor.h"
#include "paddle/fluid/platform/enforce.h"

namespace paddle {
namespace framework {
namespace ir {

void FCMKLDNNPass::ApplyImpl(ir::Graph* graph) const {

Contributor:
Since we have fc_fuse_pass and mkldnn_placement_pass, why do we need fc_mkldnn_pass? After fc_fuse_pass and mkldnn_placement_pass, we could call the mkldnn kernel of fc_op.

Contributor Author:
It's because the weights of the layer need to be transposed in the fc mkldnn pass. This allows MKL-DNN's algorithm to execute much more efficiently. I can further explain why that is, if necessary.

Contributor:
OK, this makes sense, but could we reuse some code? It seems only the weights need a reorder.

Another question: is this only for inference? If it works in training, would it cause gradient-update issues, or weight-printing issues, since the format may not be nchw?

Contributor Author (Sand3r-, Apr 24, 2019):
What code would be reused in this case? The pass takes care of checking whether the input has the correct dimensions, and applies the transpose only in that case.

This op is designed to work for inference only.

Contributor Author:
fc_mkldnn_pass now enables MKL-DNN's fully connected layer only if the input has 2 or 4 dimensions.
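
Because fc_mkldnn_pass is deliberately left out of the default MKL-DNN pass list (see the paddle_pass_builder.cc change below), callers have to append it explicitly, as the tester changes in this PR do. A minimal sketch of that configuration (the wrapper function is hypothetical; the two calls mirror the tester code):

```cpp
#include "paddle/fluid/inference/api/paddle_analysis_config.h"

// Enable MKL-DNN, then opt in to the FC MKL-DNN kernel by appending
// the pass that flags eligible fc ops with use_mkldnn=true.
void EnableMkldnnFc(paddle::AnalysisConfig* cfg) {
  cfg->EnableMKLDNN();
  cfg->pass_builder()->AppendPass("fc_mkldnn_pass");
}
```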

PADDLE_ENFORCE(graph);
Init("fc_mkldnn_pass", graph);

auto* scope = param_scope();
PADDLE_ENFORCE(scope);

GraphPatternDetector gpd;
auto* x = gpd.mutable_pattern()
->NewNode("fc_mkldnn_pass/x")
->AsInput()
->assert_is_op_input("fc", "Input");
patterns::FCMKLDNN fc_pattern(gpd.mutable_pattern(), "fc_mkldnn_pass");
fc_pattern(x, true /*with bias*/);

int found_fc_count = 0;
auto handler = [&](const GraphPatternDetector::subgraph_t& subgraph,
Graph* g) {
VLOG(4) << "Handle FC MKL-DNN pass";
if (!(graph->Has("use_mkldnn") && graph->Get<bool>("use_mkldnn"))) {
VLOG(3) << "do not perform fc fuse";
return;
}
GET_IR_NODE_FROM_SUBGRAPH(fc, fc, fc_pattern);
GET_IR_NODE_FROM_SUBGRAPH(weights, weights, fc_pattern);
GET_IR_NODE_FROM_SUBGRAPH(bias, bias, fc_pattern);
GET_IR_NODE_FROM_SUBGRAPH(output, output, fc_pattern);

OpDesc* desc = fc->Op();
auto in_size = fc->inputs[0]->Var()->GetShape().size();
if (in_size != 2 && in_size != 4) {
VLOG(3) << "Do not enable FC MKL-DNN for dimensions different than 2 & 4";
return;
}
desc->SetAttr("use_mkldnn", true);
PADDLE_ENFORCE(subgraph.count(x));

found_fc_count++;
};

gpd(graph, handler);

AddStatis(found_fc_count);
}

} // namespace ir
} // namespace framework
} // namespace paddle

REGISTER_PASS(fc_mkldnn_pass, paddle::framework::ir::FCMKLDNNPass);
38 changes: 38 additions & 0 deletions paddle/fluid/framework/ir/mkldnn/fc_mkldnn_pass.h
@@ -0,0 +1,38 @@
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <memory>
#include "paddle/fluid/framework/ir/fuse_pass_base.h"
#include "paddle/fluid/framework/ir/graph.h"
#include "paddle/fluid/framework/ir/graph_pattern_detector.h"
#include "paddle/fluid/framework/ir/pass.h"

namespace paddle {
namespace framework {
namespace ir {

/*
* Transpose weights of FC to comply with MKL-DNN interface
*/
class FCMKLDNNPass : public FusePassBase {
public:
virtual ~FCMKLDNNPass() {}

protected:
void ApplyImpl(ir::Graph* graph) const;
};

} // namespace ir
} // namespace framework
} // namespace paddle
4 changes: 4 additions & 0 deletions paddle/fluid/framework/ir/mkldnn/mkldnn_placement_pass.cc
@@ -13,6 +13,7 @@ See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/fluid/framework/ir/mkldnn/mkldnn_placement_pass.h"
#include <memory>
#include <string>
#include <unordered_set>

@@ -24,6 +25,9 @@ void MKLDNNPlacementPass::ApplyImpl(ir::Graph* graph) const {
VLOG(3) << "Applies MKL-DNN placement strategy.";
const auto& op_types_list =
Get<std::unordered_set<std::string>>("mkldnn_enabled_op_types");
if (!graph->Has("use_mkldnn")) {
graph->Set<bool>("use_mkldnn", new bool(true));
}
for (const Node* n : graph->Nodes()) {
if (n->IsOp()) {
auto* op = n->Op();
23 changes: 13 additions & 10 deletions paddle/fluid/inference/api/paddle_pass_builder.cc
@@ -146,16 +149,19 @@ void CpuPassStrategy::EnableMKLDNN() {
if (!use_mkldnn_) {
passes_.insert(passes_.begin(), "mkldnn_placement_pass");

-  for (auto &pass : std::vector<std::string>(
-           {"depthwise_conv_mkldnn_pass",    //
-            "conv_bn_fuse_pass",             // Execute BN passes again to
-            "conv_eltwiseadd_bn_fuse_pass",  // preserve correct pass order
-            "conv_bias_mkldnn_fuse_pass",    //
-            "conv3d_bias_mkldnn_fuse_pass",  //
-            "conv_elementwise_add_mkldnn_fuse_pass",
-            "conv_concat_relu_mkldnn_fuse_pass",
-            "conv_relu_mkldnn_fuse_pass",  //
-            "conv_brelu_mkldnn_fuse_pass"})) {
+  for (auto &pass : std::vector<std::string>({
+           "depthwise_conv_mkldnn_pass",    //
+           "conv_bn_fuse_pass",             // Execute BN passes again to
+           "conv_eltwiseadd_bn_fuse_pass",  // preserve correct pass order
+           "conv_bias_mkldnn_fuse_pass",    //
+           "conv3d_bias_mkldnn_fuse_pass",  //
+           "conv_elementwise_add_mkldnn_fuse_pass",
+           "conv_concat_relu_mkldnn_fuse_pass",
+           "conv_relu_mkldnn_fuse_pass",  //
+           "conv_brelu_mkldnn_fuse_pass",  //
+           // Disabled due to topology-dependent speed-up
+           // "fc_mkldnn_pass"
+       })) {
passes_.push_back(pass);
}
}
12 changes: 7 additions & 5 deletions paddle/fluid/inference/tests/api/CMakeLists.txt
@@ -33,8 +33,10 @@ function(inference_analysis_api_int8_test target model_dir data_dir filename)
--paddle_num_threads=${CPU_NUM_THREADS_ON_CI}
--iterations=2)
endfunction()

-function(inference_analysis_api_test_with_fake_data target install_dir filename model_name)
+function(inference_analysis_api_test_with_fake_data target install_dir filename model_name mkl_debug)
+  if(mkl_debug)
+    set(MKL_DEBUG_FLAG MKL_DEBUG_CPU_TYPE=7)
+  endif()
download_model(${install_dir} ${model_name})
inference_analysis_test(${target} SRCS ${filename}
EXTRA_DEPS ${INFERENCE_EXTRA_DEPS}
@@ -143,15 +145,15 @@ inference_analysis_api_test_with_refer_result(test_analyzer_mobilenet_transpose

# googlenet
inference_analysis_api_test_with_fake_data(test_analyzer_googlenet
"${INFERENCE_DEMO_INSTALL_DIR}/googlenet" analyzer_resnet50_tester.cc "googlenet.tar.gz")
"${INFERENCE_DEMO_INSTALL_DIR}/googlenet" analyzer_resnet50_tester.cc "googlenet.tar.gz" false)

# resnet50
inference_analysis_api_test_with_fake_data(test_analyzer_resnet50
"${INFERENCE_DEMO_INSTALL_DIR}/resnet50" analyzer_resnet50_tester.cc "resnet50_model.tar.gz")
"${INFERENCE_DEMO_INSTALL_DIR}/resnet50" analyzer_resnet50_tester.cc "resnet50_model.tar.gz" true)

# mobilenet with depthwise_conv op
inference_analysis_api_test_with_fake_data(test_analyzer_mobilenet_depthwise_conv
"${INFERENCE_DEMO_INSTALL_DIR}/mobilenet_depthwise_conv" analyzer_resnet50_tester.cc "mobilenet_model.tar.gz")
"${INFERENCE_DEMO_INSTALL_DIR}/mobilenet_depthwise_conv" analyzer_resnet50_tester.cc "mobilenet_model.tar.gz" false)

# int8 image classification tests
if(WITH_MKLDNN)
1 change: 1 addition & 0 deletions paddle/fluid/inference/tests/api/analyzer_bert_tester.cc
@@ -152,6 +152,7 @@ void profile(bool use_mkldnn = false) {

if (use_mkldnn) {
config.EnableMKLDNN();
config.pass_builder()->AppendPass("fc_mkldnn_pass");
}

std::vector<std::vector<PaddleTensor>> outputs;
3 changes: 2 additions & 1 deletion paddle/fluid/inference/tests/api/analyzer_dam_tester.cc
@@ -200,8 +200,9 @@ void profile(bool use_mkldnn = false) {
cfg.EnableMKLDNN();
// Enable all the mkldnn supported ops except conv3d in dam
  std::unordered_set<std::string> op_list = {"softmax", "elementwise_add",
-                                             "relu"};
+                                             "relu", "fc"};
  cfg.SetMKLDNNOp(op_list);
+  cfg.pass_builder()->AppendPass("fc_mkldnn_pass");
}

std::vector<std::vector<PaddleTensor>> outputs;
1 change: 1 addition & 0 deletions paddle/fluid/inference/tests/api/analyzer_mm_dnn_tester.cc
@@ -100,6 +100,7 @@ void profile(bool use_mkldnn = false) {

if (use_mkldnn) {
cfg.EnableMKLDNN();
cfg.pass_builder()->AppendPass("fc_mkldnn_pass");
}

std::vector<std::vector<PaddleTensor>> input_slots_all;
2 changes: 2 additions & 0 deletions paddle/fluid/inference/tests/api/analyzer_resnet50_tester.cc
@@ -48,6 +48,7 @@ void profile(bool use_mkldnn = false) {

if (use_mkldnn) {
cfg.EnableMKLDNN();
cfg.pass_builder()->AppendPass("fc_mkldnn_pass");
}
std::vector<std::vector<PaddleTensor>> outputs;

@@ -79,6 +80,7 @@ void compare(bool use_mkldnn = false) {
SetConfig(&cfg);
if (use_mkldnn) {
cfg.EnableMKLDNN();
cfg.pass_builder()->AppendPass("fc_mkldnn_pass");
}

std::vector<std::vector<PaddleTensor>> input_slots_all;
@@ -149,6 +149,7 @@ void SetConfig(AnalysisConfig *cfg, bool use_mkldnn = false) {
}
if (use_mkldnn) {
cfg->EnableMKLDNN();
cfg->pass_builder()->AppendPass("fc_mkldnn_pass");
}
// Enable seqpool_concat_fuse_pass, disabled by default since it takes much
  // time
@@ -189,6 +189,7 @@ void profile(bool use_mkldnn = false) {
std::vector<std::vector<PaddleTensor>> outputs;
if (use_mkldnn) {
cfg.EnableMKLDNN();
cfg.pass_builder()->AppendPass("fc_mkldnn_pass");
}

std::vector<std::vector<PaddleTensor>> input_slots_all;
1 change: 1 addition & 0 deletions paddle/fluid/inference/tests/api/analyzer_vis_tester.cc
@@ -85,6 +85,7 @@ void profile(bool use_mkldnn = false) {
SetConfig(&cfg);
if (use_mkldnn) {
cfg.EnableMKLDNN();
cfg.pass_builder()->AppendPass("fc_mkldnn_pass");
}
// cfg.pass_builder()->TurnOnDebug();
std::vector<std::vector<PaddleTensor>> outputs;