### Adding a New Op

#### Reasons:

 + It's not easy or possible to express your operation as a composition of existing ops.
 + It's not efficient to express your operation as a composition of existing primitives.
 + You want to hand-fuse a composition of primitives that a future compiler would find difficult to fuse.

#### How
 + Register the new op in a C++ file.
 + Implement the op in C++. The CPU and GPU each need their own version.
 + Create a Python wrapper.
 + Write a function to compute gradients for the op (optional).
 + Test the op. If you define gradients, you can verify them with the Python gradient checker.
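
To make the last step more concrete, here is a minimal standalone sketch of what a gradient checker does in principle: it compares a hand-written (analytic) gradient against a finite-difference estimate. `NumericGradient` and `MaxGradientError` are hypothetical names used only for illustration; this is not TensorFlow's gradient checker, just the idea behind it.

```c++
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: estimates df/dx[i] with a central difference.
std::vector<double> NumericGradient(
    const std::function<double(const std::vector<double>&)>& f,
    std::vector<double> x, double eps = 1e-5) {
  std::vector<double> grad(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double saved = x[i];
    x[i] = saved + eps;
    const double f_plus = f(x);
    x[i] = saved - eps;
    const double f_minus = f(x);
    x[i] = saved;  // restore before moving to the next coordinate
    grad[i] = (f_plus - f_minus) / (2.0 * eps);
  }
  return grad;
}

// Hypothetical helper: largest absolute difference between two gradients.
double MaxGradientError(const std::vector<double>& analytic,
                        const std::vector<double>& numeric) {
  double max_err = 0.0;
  for (std::size_t i = 0; i < analytic.size(); ++i) {
    max_err = std::max(max_err, std::fabs(analytic[i] - numeric[i]));
  }
  return max_err;
}
```

For example, with `f(x) = x[0] * x[0] + 3 * x[1]` the analytic gradient at `(2, -1)` is `(4, 3)`, and `MaxGradientError` against `NumericGradient` should come out very small. A large error usually points to a bug in the hand-written gradient.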
 

#### Details for each step

##### How to register the new op?
```c++
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/shape_inference.h"

using namespace tensorflow;

REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
      c->set_output(0, c->input(0));
      return Status::OK();
    });
```

1. Include `op.h`, plus `shape_inference.h` if the op needs shape inference.
2. Register the op with the `REGISTER_OP` macro.
3. The one thing to watch is that `.SetShapeFn()` takes a shape function as its parameter; usually we pass a lambda expression.
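
To show why chained calls and a lambda fit together here, the following is a toy model of the builder pattern that needs no TensorFlow headers. `ToyContext`, `ToyOpBuilder` and `MakeZeroOut` are hypothetical names invented for this sketch; the real classes have richer interfaces and return a `Status` rather than a `bool`.

```c++
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an inference context: one input, one output shape.
struct ToyContext {
  std::vector<int> input_shape;
  std::vector<int> output_shape;
};

// Hypothetical builder; each setter returns *this so calls can be chained.
class ToyOpBuilder {
 public:
  using ShapeFn = std::function<bool(ToyContext*)>;

  explicit ToyOpBuilder(std::string name) : name_(std::move(name)) {}

  ToyOpBuilder& Input(std::string spec) {
    inputs_.push_back(std::move(spec));
    return *this;
  }
  ToyOpBuilder& Output(std::string spec) {
    outputs_.push_back(std::move(spec));
    return *this;
  }
  ToyOpBuilder& SetShapeFn(ShapeFn fn) {
    shape_fn_ = std::move(fn);
    return *this;
  }

  // Runs the stored shape function; returns false if none was set.
  bool InferShape(ToyContext* c) const {
    return shape_fn_ ? shape_fn_(c) : false;
  }
  const std::string& name() const { return name_; }

 private:
  std::string name_;
  std::vector<std::string> inputs_;
  std::vector<std::string> outputs_;
  ShapeFn shape_fn_;
};

// Mirrors the ZeroOut registration: the output shape equals the input shape.
inline ToyOpBuilder MakeZeroOut() {
  return ToyOpBuilder("ZeroOut")
      .Input("to_zero: int32")
      .Output("zeroed: int32")
      .SetShapeFn([](ToyContext* c) {
        c->output_shape = c->input_shape;
        return true;
      });
}
```

The lambda is only stored at registration time. It runs later, when something asks for the output shape, which is why it receives the context as an argument instead of capturing any tensors.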

```c++
REGISTER_OP("my_op_name")
//     .Attr("<name>:<type>")
//     .Attr("<name>:<type>=<default>")
//     .Input("<name>:<type-expr>")
//     .Input("<name>:Ref(<type-expr>)")
//     .Output("<name>:<type-expr>")
//     .Doc(R"(
// <1-line summary>
// <rest of the description (potentially many lines)>
// <name-of-attr-input-or-output>: <description of name>
// <name-of-attr-input-or-output>: <description of name;
//   if long, indent the description on subsequent lines>
// )");
//
// Note: .Doc() should be last.
// For details, see the OpDefBuilder class in op_def_builder.h.
```
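
The `<name>:<type>` and `<name>:<type>=<default>` strings in the template follow a small grammar. As a rough illustration of how such a string breaks down, here is a standalone sketch; `ToyAttrSpec`, `Trim` and `ParseAttrSpec` are hypothetical names, and the sketch only covers the two plain forms shown in the template, not everything the real parser accepts.

```c++
#include <cassert>
#include <string>

// Hypothetical result type for a parsed "<name>:<type>[=<default>]" string.
struct ToyAttrSpec {
  std::string name;
  std::string type;
  std::string default_value;
  bool has_default = false;
};

// Strips leading and trailing spaces and tabs.
std::string Trim(const std::string& s) {
  const auto first = s.find_first_not_of(" \t");
  if (first == std::string::npos) return "";
  const auto last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Splits at the first ':' and then, if present, at the first '='.
ToyAttrSpec ParseAttrSpec(const std::string& spec) {
  ToyAttrSpec out;
  const auto colon = spec.find(':');
  if (colon == std::string::npos) {
    out.name = Trim(spec);  // no type given; leave the rest empty
    return out;
  }
  out.name = Trim(spec.substr(0, colon));
  const std::string rest = spec.substr(colon + 1);
  const auto eq = rest.find('=');
  if (eq == std::string::npos) {
    out.type = Trim(rest);
  } else {
    out.type = Trim(rest.substr(0, eq));
    out.default_value = Trim(rest.substr(eq + 1));
    out.has_default = true;
  }
  return out;
}
```

For instance, an illustrative spec such as `"preserve_index: int = 0"` splits into the name `preserve_index`, the type `int`, and the default `0`.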

---

##### Implement the kernel for the op