
Merge mkldnn output grad #4759

Merged 5 commits, Oct 15, 2017
Conversation

@tensor-tang (Contributor) commented Oct 12, 2017

fix #4697

  • add merge output grad for branches
  • add gtest comparing with the CPU network; only forward is supported yet

@tensor-tang changed the title from "[WIP] Merge mkldnn output grad" to "Merge mkldnn output grad" on Oct 13, 2017
cpuOutVal_ = out;
}
// when output is cpu device, change the mkldnn output value and make they
Reviewer (Contributor): they -> them

@tensor-tang (Author): thx, done

* and reset the merge grad primitive if needed.
* note: when this layer have serval output,
* do not support mixing with cpu device,
* because can not get memory desc from cpu device.
Reviewer (Contributor): Suggest rewording: "when this layer has several outputs, it cannot be mixed with the cpu device, since the memory desc cannot be obtained from the cpu device."

@tensor-tang (Author): thx, done


auto sumPD = mkldnn::sum::primitive_desc(
    tmpOutGrad_->getMemoryDesc(), scales, srcPDs);
mergeGrad_.reset(new mkldnn::sum(sumPD, srcs, *tmpOutGrad_));
Reviewer (Contributor): This is calling the sum interface to merge the grads, right?

@tensor-tang (Author): Yes, it calls mkldnn::sum to do the merge.
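The exchange above can be illustrated with a small stand-alone sketch. The function below is hypothetical (plain C++, not the MKL-DNN API): it computes the same element-wise weighted sum over branch gradients that `mkldnn::sum` performs over its source memories, with one scale per source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the merge-grad step: like mkldnn::sum, it
// accumulates an element-wise weighted sum of several source buffers
// (one output grad per branch) into a single merged gradient.
std::vector<float> mergeOutputGrad(
    const std::vector<std::vector<float>>& branchGrads,
    const std::vector<double>& scales) {
  assert(!branchGrads.empty());
  assert(branchGrads.size() == scales.size());
  std::vector<float> merged(branchGrads[0].size(), 0.0f);
  for (size_t i = 0; i < branchGrads.size(); ++i) {
    assert(branchGrads[i].size() == merged.size());
    for (size_t j = 0; j < merged.size(); ++j) {
      merged[j] += static_cast<float>(scales[i]) * branchGrads[i][j];
    }
  }
  return merged;
}
```

With all scales set to 1.0 (as in this PR), merging the grads {1, 2} and {3, 4} from two branches yields {4, 6}.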

if (outputMap_.size() <= 1) {
  return;
}
std::vector<double> scales;
Reviewer (Contributor): The scales here can be initialized to 1.0 directly; there is no need to push them one by one afterwards. I see they are all 1.0.

@tensor-tang (Author): No problem.
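The suggestion can be sketched as follows; `makeScales` is a hypothetical helper name used only for illustration, not a function in the PR.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper illustrating the review suggestion: instead of
//   for (size_t i = 0; i < n; ++i) scales.push_back(1.0);
// construct the vector with every entry already set to 1.0 in one step.
std::vector<double> makeScales(size_t n) {
  return std::vector<double>(n, 1.0);
}
```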

${CMAKE_CURRENT_BINARY_DIR}/test_CompareMKLDNNandCPU
--config_file_a=trainer/tests/sample_trainer_config_branch_net.conf --use_mkldnn_a=True
--config_file_b=trainer/tests/sample_trainer_config_branch_net.conf --use_mkldnn_b=False
--use_gpu=False
Reviewer (Contributor): Will more tests comparing networks be added later?
Can the COMMAND here be wrapped in CMake to simplify lines 45-56? E.g.:
test_CompareMKLDNNandCPU --config_file=trainer/tests/sample_trainer_config_branch_net.conf

@tensor-tang (Author): The current plan is to split tests by kind, networks with branches and without; when a new layer is added, we only need to change the content of the conf file.

It can indeed be simplified, done.
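One way the repeated COMMAND could be wrapped is sketched below; the macro name, the test name, and the use of `add_test` are assumptions for illustration, not the actual change in this PR.

```cmake
# Hypothetical CMake helper: runs the MKLDNN-vs-CPU comparison binary
# on a given trainer config, so each new config is a one-line call.
macro(compare_mkldnn_with_cpu conf_file)
  add_test(NAME test_CompareMKLDNNandCPU
    COMMAND ${CMAKE_CURRENT_BINARY_DIR}/test_CompareMKLDNNandCPU
        --config_file_a=trainer/tests/${conf_file} --use_mkldnn_a=True
        --config_file_b=trainer/tests/${conf_file} --use_mkldnn_b=False
        --use_gpu=False)
endmacro()

compare_mkldnn_with_cpu(sample_trainer_config_branch_net.conf)
```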

@luotao1 (Contributor) left a comment: LGTM

@luotao1 merged commit 17b4cea into PaddlePaddle:develop Oct 15, 2017
@tensor-tang moved this from Doing to Done in Optimization on Intel Platform Oct 16, 2017
@tensor-tang deleted the merge_grad branch October 16, 2017 01:35
Successfully merging this pull request may close these issues: Merge topdiffs before backward