
[C++ API] Implement builder style construction #7597

Merged
merged 8 commits into pytorch:master on May 17, 2018

Conversation

@goldsborough (Contributor) commented May 15, 2018

This PR implements our discussed "builder" style construction mechanism, where the class is fused with the builder itself. It is similar to the KWARGS mechanism in autogradpp, with some differences.

Largely, this PR:

  1. Creates a TORCH_PARAMETER macro (name up for discussion, maybe TORCH_PROPERTY?) used to give modules parameters/properties. It requires only 2 arguments instead of the 5 that AUTOGRAD_KWARG did.
  2. Rewrites core modules to follow a construction mechanism (sketched after this list) where
    1. Required arguments go to the constructor,
    2. Optional arguments are passed via setters/getters,
    3. Variable construction and other heavy lifting is moved to a reset() function,
    4. reset() is called inside build(), which finalizes construction, and inside clone().
  3. Does some heavy refactoring of the Convolution and RNN modules to make better use of polymorphism, e.g.:
    1. In Conv, the call to at::conv<dimension>d(...) is now left to a virtual function, implemented in Conv1d, Conv2d and Conv3d.
    2. Large rewrite of the RNN classes to also move their autograd implementations into virtual methods, and in general to avoid using the RNN mode for anything beyond CuDNN. The RNN mode should not replace inheritance and polymorphism!
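
For concreteness, here is a minimal sketch of what the fused-builder usage looks like. It is illustrative only; the Linear/with_bias names and the simplified build() below are assumptions, not the exact API added by this PR.

#include <cstdint>
#include <memory>

struct Linear {
  // 1. Required arguments go to the constructor.
  Linear(int64_t in, int64_t out) : in_(in), out_(out) {}

  // 2. Optional arguments become chainable setters (what TORCH_PARAMETER generates).
  Linear& with_bias(bool value) {
    with_bias_ = value;
    return *this;
  }

  // 3. Variable construction and other heavy lifting live in reset().
  void reset() { /* create the weight (and bias) Variables here */ }

  // 4. build() finalizes construction by calling reset(); clone() calls reset() too.
  std::shared_ptr<Linear> build() {
    auto module = std::make_shared<Linear>(std::move(*this));
    module->reset();
    return module;
  }

  int64_t in_;
  int64_t out_;
  bool with_bias_ = true;
};

// Usage: required arguments up front, optional ones chained, then build():
//   auto linear = Linear(10, 3).with_bias(false).build();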

Recommended review order:

  1. include/torch/nn/module.h
  2. include/torch/nn/modules/*.h and src/nn/modules/*.cpp
  3. tests

CC @ezyang @ebetica @apaszke @jgehring

@ebetica (Contributor) left a comment

Reset parameters disappeared somewhere. I'm curious why we're departing from PyTorch in this department? I wonder why it's even there in the first place. @apaszke maybe you can shed some light on this?

variable_list CUDNN_forward(variable_list);
variable_list autograd_forward(variable_list);

void flatten_parameters_for_cudnn();

enum RNNMode { RNN_RELU = 0, RNN_TANH = 1, LSTM = 2, GRU = 3 };
// These must line up with the CUDNN mode codes:
// https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnRNNMode_t
enum class CuDNNMode { RNN_RELU, RNN_TANH, LSTM, GRU };

std::vector<Variable> hhb_;

size_t number_of_gates_;
bool has_cell_state_;

}} // namespace torch::nn

#define TORCH_PARAMETER(T, name) \

@ebetica (Contributor) left a comment

Okay besides my nits above.

return this->name##_; \
} \
\
protected: \

@apaszke (Contributor) left a comment

I'm not sure why certain things were changed the way they were. The weight names are now inconsistent with those in Python, and I really don't think we should mess with visibility in our macros. The fact that visibility automagically changes just because you declared a parameter will be a constant source of unclear errors (the user doesn't have the line that makes things protected in their code!)

  std::shared_ptr<Derived> build() {
    auto module = std::make_shared<Derived>(static_cast<Derived&&>(*this));
    module->reset();
    return std::move(module);
  }

#define TORCH_PARAMETER(T, name) \
public: \
auto name(T new_##name)->decltype(*this) { \

return this->name##_; \
} \
\
protected: \

@@ -19,32 +19,32 @@ class ContainerListImpl : public CloneableModule<Derived> {
}

std::shared_ptr<Module> add(std::shared_ptr<Module> m) {
return append(m).children_.back();
return append(m).modules_.back();

REQUIRE(model->param("test.l1.bias").size(0) == 3);
REQUIRE(model->param("test.l1.weight").size(0) == 3);
REQUIRE(model->param("test.l1.weight").size(1) == 10);
REQUIRE(model->param("test.l1.weights").size(0) == 3);

@goldsborough (Contributor, Author)

Sorry for the weight naming changes, I didn't know that's what they were called in Python. "weight" sounded more like a single float to me than a tensor of "weights", but I'll change it back.

As for the visibility of the value inside TORCH_PARAMETER: this one is tricky. I know messing with visibility labels is awkward and may cause trouble if users use the macro outside of core modules. I would like to avoid making all variables public. Making the declaration separate would add boilerplate, since everything would have to be repeated twice; at that point we could just as well have gone with "separate builder style construction" (where the builder class is separate from the actual class).

@ebetica (Contributor) commented May 16, 2018

I actually think there's an argument to be made for making things as public as possible anyway, if we're to encourage hackability. Not changing the visibility is more important, and a bigger source of bugs, than people using the private variables directly. The point of private variables, getters, and setters is that you can run arbitrary logic inside them, so you shouldn't touch the private variables; but with a parameter macro like this we presumably never worry about that issue at all.

@ezyang (Contributor) commented May 16, 2018

Yes, make 'em public! If the user really cares they can stop using the macro and write the getters/properties themselves.
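
To make that concrete, here is one possible shape of the macro with the backing member left public. This is a sketch only, not necessarily the expansion that ended up in the PR:

// Same chainable setter and getter as in the diff above, but without a
// trailing `protected:` label, so declaring a parameter never changes
// visibility behind the user's back.
#define TORCH_PARAMETER(T, name)               \
 public:                                       \
  auto name(T new_##name)->decltype(*this) {   \
    this->name##_ = new_##name;                \
    return *this;                              \
  }                                            \
  const T& name() const {                      \
    return this->name##_;                      \
  }                                            \
  T name##_

// Default values still work via a member initializer, e.g.:
//   TORCH_PARAMETER(double, rate) = 0.5;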

std::unique_ptr<Module> clone() const override {
auto ptr = std::unique_ptr<Module>(
new Derived(*static_cast<const Derived*>(this)));
virtual void reset() = 0;

return std::move(module);
}

std::shared_ptr<Module> clone() const override {

bool transposed = false,
bool with_bias = true,
int groups = 1);
struct ExpandingSize {

variable_list forward(variable_list input) override;

TORCH_PARAMETER(double, rate) = 0.5;

@ezyang (Contributor) commented May 16, 2018

This is not the fault of this patch, but I was thinking about safety: shouldn't the parameters array store Variable& and not Variable, so that if someone writes

mod.foo_ = my_new_var_param

the parameters array is updated properly?

@goldsborough (Contributor, Author) commented May 16, 2018

@ezyang Oh man, the fact that mod.foo_ = my_new_var_param works just gives me shivers. Theoretically I would say no, because if you move a Module, all your references become garbage. If we were really hardcore about only ever storing Modules inside shared_ptrs, it would work, but I don't think we can impose that constraint on user modules.
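
A tiny stand-alone illustration of the concern (generic C++, not torch code; FakeModule and the raw float* stand in for a module and a Variable& entry):

#include <iostream>
#include <utility>
#include <vector>

struct FakeModule {
  float foo_ = 1.0f;
  // The "parameters array" holds a pointer into the module itself.
  std::vector<float*> parameters_{&foo_};
};

int main() {
  FakeModule a;
  FakeModule b = std::move(a);  // moving the module carries the old pointers along
  b.foo_ = 2.0f;
  // b.parameters_[0] still points into `a`, not `b`:
  std::cout << *b.parameters_[0] << "\n";  // prints 1, not 2
}

Keeping modules behind shared_ptr (as build() does) would sidestep this, but as noted above that constraint cannot be enforced for arbitrary user modules.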

@goldsborough goldsborough merged commit cba19e5 into pytorch:master May 17, 2018
@goldsborough goldsborough deleted the construction branch May 17, 2018 21:10
onnxbot added a commit to onnxbot/onnx-fb-universe that referenced this pull request May 17, 2018
petrex added a commit to petrex/pytorch that referenced this pull request May 23, 2018
* upstream/master:
  Makes AccumulateGrad high priority in backwards passes (pytorch#7604)
  [C++ API] Implement builder style construction (pytorch#7597)
  C10D: Added TCPStore to support C10D store interface (pytorch#7560)
  [auto] Update onnx to ba86ec2 - Protobuf typing (onnx/onnx#982) onnx/onnx@ba86ec2
  Add LBFGS optimization algorithm to C++ API (pytorch#7596)
weiyangfb pushed a commit to weiyangfb/pytorch that referenced this pull request Jun 11, 2018
* Implemented fused builder based construction mechanism

* "weights" -> "weight"

* Use int64_t instead of size_t everywhere in RNN

* Extracted Conv::ExpandingSize into its own thing

* Rename TORCH_PARAMETER to TORCH_ATTR

* Added documentation

* Fix weight names in batchnorm module