Propagate the input layout requirements for convertTo nodes (#3831)

Summary:
The layout requirements for convertTo nodes should be the same as the requirements of their inputs, exactly the same behavior as quantization nodes. For example, if the input is in NCHW format, we should propagate that information to the convertTo node.

Fixes #3826
Pull Request resolved: #3831

Test Plan:
```
Test-case from the issue:
model-runner -m predict_net.pb -m init_net.pb -backend=Interpreter -convert-to-fp16
Model: predict_net.pb
shape: ( 1 64 112 112 )
max: 328.750  min: -342.750
[[[[-4.305, -1.395, -1.973, -55.750, -28.594, 30.094, 1.790, -25.000, 6.543, 0.682, -17.844, 63.406, -13.500, -17.703, -63.969, 9.273, 28.141, 12.992, -30.406, -8.078, 3.312, 24.797, 2.342, 17.484, 13.156, -3.906, -52.469, -41.531, -4.105, 15.273, -9.305, -13.758, 28.453, -6.852, 13.828, -7.699, 3.133, -10.570, -12.523, -52.469, 7.520, -5.211, 9.406, -14.516, 0.541, -2.070, -1.676, -32.312, -4.531, 12.336, -26.359, 43.219, 42.219, 30.422, 3.301, -38.750, 14.617, 38.750, -29.219, -50.719, 0.854, -0.113, 4.035, -1.172, -23.875, -15.938, -8.805, 67.875, 7.152, -16.422, 56.875, -3.996, -42.562, 27.516, 6.699, -33.281, 28.078, -1.342, -6.727, -3.949, -23.953, 11.305, -29.656, -32.094, -67.562, 34.406, -38.656, 40.719, 31.188, 22.047, -83.938, -20.734, -5.492, -10.516, -11.422, 10.211, -13.719, 14.133, -31.797, 3.926, ...]
```

Differential Revision: D18765965

Pulled By: shajrawi

fbshipit-source-id: 17f2bd511754f0055259cb6a6cb8fe094097e7e7
shajrawi authored and facebook-github-bot committed Dec 2, 2019
1 parent 0d57af5 commit 2b296422716df4464eda2f619d6c52b28038d15c
Showing with 62 additions and 1 deletion.
  1. +6 −0 docs/Backends.md
  2. +2 −0 docs/NewOperators.md
  3. +10 −0 docs/TensorLayout.md
  4. +22 −1 lib/Graph/TensorLayout.cpp
  5. +22 −0 tests/unittests/TensorLayoutTest.cpp
@@ -195,6 +195,12 @@ BB.newBackendSpecificNode("CPUMaxSplat")
    .setDocstring("A Max node with one splat input; CPU specific.");
```

If tensor layout requirements are enabled for the backend, one should take
special care to update the layout verifier when adding a new node.
See `TensorLayout.md` for more information.
To extend the example above, if the new node is data parallel, a `.dataParallel()`
line should be added, as in the sketch below.
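For example, the declaration might gain the hint like this (a sketch only: the operand and member declarations here are illustrative, not the actual `CPUMaxSplat` definition):

```
BB.newBackendSpecificNode("CPUMaxSplat")
    .addInput("Input")
    .addResultFromCtorArg()
    .addMember(MemberType::Float, "SplatValue")
    // Data parallel: the layout verifier accepts any input layout.
    .dataParallel()
    .setDocstring("A Max node with one splat input; CPU specific.");
```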

During `transformPostLowering()`, this `CPUMaxSplat` node replaces the
aforementioned pattern. However, there must be a corresponding instruction for
this Node to be lowered to during the IRGen phase. Thus, we need a corresponding
@@ -8,6 +8,8 @@
#### High level IR
* Create a new Glow high level IR node in `ClassGen/NodeGen.cpp`. Run `ninja all` to generate the node. In the build directory, check `glow/AutoGenNodes.h` to ensure the node has been generated.
* Implement the `verify()` method for the new node in `Graph/Nodes.cpp`.
* Implement the node's layout requirements, if any; see `TensorLayout.md` for details, specifically the notes section under `Canonical Tensor Layout`.
* Implement a node creation method in `Graph/Graph.cpp` (see the sketch after this list).
* Implement the logic to load models that contain the operator in `Importer/Caffe2ModelLoader.cpp` or `Importer/ONNXModelLoader.cpp`, depending on which type of model the operator comes from. Add the operator to `Importer/CommonOperatorLoader.h` instead if the loading logic can be shared between Caffe2 and ONNX. Add as much validation logic as possible here in the loader, because it is crucial to catch errors at this stage; once the operator is loaded, it is assumed that Glow will be able to run it successfully, so any issues must be caught here.
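As a sketch, a creation method in `Graph/Graph.cpp` for a hypothetical single-input `Foo` node usually just allocates the node and registers it with `addNode` (the `Foo` names are illustrative):

```
FooNode *Function::createFoo(llvm::StringRef name, NodeValue input) {
  // Reuse the input's type for the result; nodes that change shape or
  // element type would compute a new output type here instead.
  return addNode(new FooNode(name, input.getType(), input));
}
```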
#### Low level IR
@@ -108,6 +108,16 @@ derives from `TensorLayoutCommon` and overrides the following functions:
- This function takes an operator `Node *node` and returns the layout requirements of the Nth result `n`.
- It returns common layout constraints; for example, `ConvolutionNode` results should be in `NHWC` format.

Notes:

1. Some nodes can accept any layout as input: they are either data parallel, e.g. `Add`,
   or, while not data parallel, do not care about the order of dimensions for their
   operation, e.g. `ReshapeNodeKind`. When adding such nodes to Glow, this behavior
   should be explicitly specified, for example by adding `.dataParallel()` in NodeGen.
2. Some nodes propagate the layout information of their input, e.g. the `convertTo` node.
   When adding such nodes to Glow, the canonical layout verifier should be made aware
   of them; we currently do that in `getNthInputLayoutRequirements`, as the sketch
   below illustrates.
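To illustrate the second note, a backend-specific verifier that derives from `TensorLayoutCommon` could express the same propagation by overriding `getNthInputLayoutRequirements`; this sketch (with a hypothetical `MyBackendTensorLayout` class) mirrors the canonical implementation added in this change:

```
std::string MyBackendTensorLayout::getNthInputLayoutRequirements(const Node *node,
                                                                 size_t n) {
  // convertTo only changes the element type, so the requirement for its
  // input is simply whatever layout that input already produces.
  if (const auto *CTN = llvm::dyn_cast<ConvertToNode>(node)) {
    auto input = CTN->getInput();
    return getNthResultLayoutRequirements(input.getNode(), input.getResNo());
  }
  // Fall back to the common rules for everything else.
  return TensorLayoutCommon::getNthInputLayoutRequirements(node, n);
}
```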

## Placeholders and Constants

An important thing to note is that some operators may have a `Placeholder` or
@@ -455,6 +455,10 @@ std::string TensorLayoutCommon::getNthInputLayoutRequirements(const Node *node,
    auto input = QN->getInput();
    return getNthResultLayoutRequirements(input.getNode(), input.getResNo());
  }
  if (const auto *CTN = llvm::dyn_cast<ConvertToNode>(node)) {
    auto input = CTN->getInput();
    return getNthResultLayoutRequirements(input.getNode(), input.getResNo());
  }
  if (const auto *QPN = llvm::dyn_cast<QuantizationProfileNode>(node)) {
    switch (n) {
    case QuantizationProfileNode::InputIndices::InputIdx: {
@@ -478,6 +482,19 @@ static unsigned getInputIdx(const Node *N, NodeValue in) {
  return N->getNumInputs();
}

/// \returns true if getting the input's layout would cause an infinite loop:
/// these nodes derive their layout requirements from their own input, so
/// querying them here would just recurse back.
static bool inputDoesNotKnowRequirements(const Node *node) {
  switch (node->getKind()) {
  case Kinded::Kind::TransposeNodeKind:
  case Kinded::Kind::QuantizeNodeKind:
  case Kinded::Kind::QuantizationProfileNodeKind:
  case Kinded::Kind::ConvertToNodeKind:
    return true;
  default:
    return false;
  }
}

std::string TensorLayoutCommon::getNthResultLayoutRequirements(const Node *node,
                                                               size_t n) {
  DCHECK_LT(n, node->getNumResults()) << "Wrong output number";
@@ -492,6 +509,9 @@ std::string TensorLayoutCommon::getNthResultLayoutRequirements(const Node *node,
  }
  // Dynamically form the layout description for transposes.
  auto input = TN->getInput();
  // Walk past producers that do not know their own layout requirements
  // until we reach one that does.
  while (inputDoesNotKnowRequirements(input.getNode())) {
    input = input.getNode()->getNthInput(0);
  }
  auto inputLayout =
      getNthInputLayoutRequirements(node, TransposeNode::InputIdx);
  auto inputLayoutHelper = TensorLayoutDescription(inputLayout);
@@ -524,7 +544,8 @@ std::string TensorLayoutCommon::getNthResultLayoutRequirements(const Node *node,
  auto result = node->getNthResult(n);
  auto *user = (*result.getUsers().begin()).getUser();
  unsigned inputIdx = getInputIdx(user, result);
  if (inputDoesNotKnowRequirements(user) ||
      inputIdx >= user->getNumInputs() || llvm::isa<TransposeNode>(user)) {
    return getLayoutsForDims()[dims.size()].getSerializedLayout();
  }
  auto layout = getNthInputLayoutRequirements(user, inputIdx);
@@ -16,6 +16,8 @@
#include "BackendTestUtils.h"

#include "glow/Backend/Backend.h"
#include "glow/Converter/Float16Converter.h"
#include "glow/Converter/TypeAToTypeBFunctionConverter.h"
#include "glow/Graph/Graph.h"
#include "glow/Graph/TensorLayout.h"
#include "llvm/Support/raw_ostream.h"
@@ -91,6 +93,26 @@ TEST_P(TensorLayoutTest, convBadLayout) {
  EXPECT_FALSE(verifyLayouts(*F_, CanonicalTensorLayout::getInstance(), false));
}

// Check that we propagate the layout information for convertTo nodes:
TEST_P(TensorLayoutTest, convertTo) {
  CHECK_IF_ENABLED();

  auto *input = mod_.createPlaceholder(ElemKind::FloatTy, {1, 3, 3, 1}, "input",
                                       false, "NWCH");
  auto *resultNCHW = F_->createTranspose("transposeInput", input, NHWC2NCHW);
  auto *save = F_->createSave("save", resultNCHW);
  bindings_.allocate(save->getPlaceholder());

  EXPECT_TRUE(verifyLayouts(*F_, CanonicalTensorLayout::getInstance()));

  // Converting to fp16 inserts ConvertTo nodes; their layout requirements
  // must propagate from their inputs for the verifier to keep passing.
  PrecisionConfiguration precConfig;
  TypeAToTypeBFunctionConverter converter(*F_, ElemKind::FloatTy,
                                          ElemKind::Float16Ty, precConfig);
  converter.convert();

  EXPECT_TRUE(verifyLayouts(*F_, CanonicalTensorLayout::getInstance()));
}

// Check TensorLayoutDescription's parser with simple input.
TEST_P(TensorLayoutTest, parseTestSimple) {
  CHECK_IF_ENABLED();
