Method/Function overlap #4

zdevito · 2017-06-07T02:47:01Z

Some functions have different declarations for method/function variants:

[[
  name: lt
  return: argument 0
  options:
    - cname: ltValue
      arguments:
        - arg: THBoolTensor* result
          output: True
        - THTensor* self
        - real value
    - cname: ltTensor
      arguments:
        - arg: THBoolTensor* result
          output: True
        - THTensor* self
        - THTensor* other
]]

[[
  name: lt_
  return: self
  options:
    - cname: ltValueT
      arguments:
        - THTensor* self
        - THTensor* self
        - real value
    - cname: ltTensorT
      arguments:
        - THTensor* self
        - THTensor* self
        - THTensor* other
]]

[[
  name: lt
  variants:
    - function
  return: argument 0
  options:
    - cname: ltValue
      arguments:
        - arg: THBoolTensor* result
          output: True
        - THTensor* tensor
        - real value
    - cname: ltTensor
      arguments:
        - arg: THBoolTensor* result
          output: True
        - THTensor* tensor
        - THTensor* other
    - cname: ltValueT
      arguments:
        - arg: THTensor* result
          output: True
        - THTensor* tensor
        - real value
    - cname: ltTensorT
      arguments:
        - arg: THTensor* result
          output: True
        - THTensor* tensor
        - THTensor* other
]]

However, they overlap in name and arguments causing conflicts. In the case where a function is method only we should add a suffix to the name e.g. "lt_method" in Type, and have the Tensor method call that one to avoid this.

The text was updated successfully, but these errors were encountered:

zdevito · 2017-06-07T14:34:52Z

Actually just split lt into lt and ltT, check that we can backport into pytorch. Also do for all comparison operators....

killeent · 2017-06-07T18:50:30Z

Did that ^

Summary: Currently there is a mismatch in naming between Python BatchNorm `running_var` and C++ BatchNorm `running_variance`, which causes JIT model parameters loading to fail (pytorch/vision#728 (comment)): ``` terminate called after throwing an instance of 'c10::Error' what(): No such serialized tensor 'running_variance' (read at /home/shahriar/Build/pytorch/torch/csrc/api/src/serialize/input-archive.cpp:27) frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x85 (0x7f2d92d32f95 in /usr/local/lib/libc10.so) frame #1: torch::serialize::InputArchive::read(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, at::Tensor&, bool) + 0xdeb (0x7f2d938551ab in /usr/local/lib/libtorch.so.1) frame #2: torch::nn::Module::load(torch::serialize::InputArchive&) + 0x98 (0x7f2d9381cd08 in /usr/local/lib/libtorch.so.1) frame #3: torch::nn::Module::load(torch::serialize::InputArchive&) + 0xf9 (0x7f2d9381cd69 in /usr/local/lib/libtorch.so.1) frame #4: torch::nn::Module::load(torch::serialize::InputArchive&) + 0xf9 (0x7f2d9381cd69 in /usr/local/lib/libtorch.so.1) frame #5: torch::nn::operator>>(torch::serialize::InputArchive&, std::shared_ptr<torch::nn::Module> const&) + 0x32 (0x7f2d9381c7b2 in /usr/local/lib/libtorch.so.1) frame #6: <unknown function> + 0x2b16c (0x5645f4d1916c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest) frame #7: <unknown function> + 0x27a3c (0x5645f4d15a3c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest) frame #8: <unknown function> + 0x2165c (0x5645f4d0f65c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest) frame #9: <unknown function> + 0x1540b (0x5645f4d0340b in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest) frame #10: __libc_start_main + 0xf3 (0x7f2d051dd223 in /usr/lib/libc.so.6) frame #11: <unknown function> + 0x1381e (0x5645f4d0181e in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest) ``` Renaming C++ BatchNorm `running_variance` to `running_var` should fix this problem. This is a BC-breaking change, but it should be easy for end user to rename `running_variance` to `running_var` in their call sites. Pull Request resolved: pytorch#17371 Reviewed By: goldsborough Differential Revision: D14172775 Pulled By: yf225 fbshipit-source-id: b9d3729ec79272a8084269756f28a8f7c4dd16b6

Summary: Tracing models which attempts to return this in-place value doesn't turn out well. I haven't run any tests to confirm the results to be honest, but regardless of the outcome, the operation happens in-place, so it should work as before. Sample output from traced model attempting to set `max_norm` on `Embedding`: ``` a leaf Variable that requires grad has been used in an in-place operation. (check_inplace at /pytorch/torch/csrc/autograd/VariableTypeUtils.h:49) frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f0ecc5cc021 in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so) frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f0ecc5cb8ea in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so) frame #2: <unknown function> + 0x38ab2f (0x7f0ecb55ab2f in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1) frame #3: torch::autograd::VariableType::embedding_renorm_(at::Tensor&, at::Tensor const&, double, double) const + 0x76 (0x7f0ecb5b5966 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1) frame #4: <unknown function> + 0x56c958 (0x7f0ecb73c958 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1) frame #5: <unknown function> + 0x672286 (0x7f0ecb842286 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1) frame #6: torch::jit::InterpreterState::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) + 0x22 (0x7f0ecb83d842 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1) frame #7: <unknown function> + 0x65c6ac (0x7f0ecb82c6ac in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1) frame #8: <unknown function> + 0x3c8ab4 (0x7f0f06bc0ab4 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so) frame #9: <unknown function> + 0x3ad2c3 (0x7f0f06ba52c3 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so) frame #10: <unknown function> + 0x11663e (0x7f0f0690e63e in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so) <omitting python frames> frame pytorch#39: python_call + 0x11 (0x5563c3c521c1 in uwsgi) frame pytorch#40: uwsgi_request_wsgi + 0x100 (0x5563c3c54410 in uwsgi) frame pytorch#41: wsgi_req_recv + 0xac (0x5563c3becabc in uwsgi) frame pytorch#42: simple_loop_run + 0xc4 (0x5563c3c35be4 in uwsgi) frame pytorch#43: simple_loop + 0x10 (0x5563c3c35a00 in uwsgi) frame pytorch#44: uwsgi_ignition + 0x241 (0x5563c3c3a3a1 in uwsgi) frame pytorch#45: uwsgi_worker_run + 0x275 (0x5563c3c3ec35 in uwsgi) frame pytorch#46: <unknown function> + 0x8f22c (0x5563c3c3f22c in uwsgi) frame pytorch#47: <unknown function> + 0x3c13e (0x5563c3bec13e in uwsgi) frame pytorch#48: __libc_start_main + 0xf1 (0x7f0f138922e1 in /lib/x86_64-linux-gnu/libc.so.6) frame pytorch#49: _start + 0x2a (0x5563c3bec16a in uwsgi) : operation failed in interpreter: op_version_set = 0 def forward(self, input_1: Tensor) -> Tensor: _0 = torch.norm(self.item_embedding.weight, 2, 1, True) _1 = torch.div(self.item_embedding.weight, _0) m_weight = torch.t(_1) input_2 = torch.contiguous(input_1) weight_1 = torch.embedding_renorm_(self.item_embedding.weight, input_2, 1., 2.) ~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE x = torch.embedding(weight_1, input_2, -1, False, False) input_3 = torch.div(x, torch.norm(x, 2, 2, True)) max_batch_size = ops.prim.NumToTensor(torch.size(input_3, 0)) hx = torch.zeros([2, int(max_batch_size), 70], dtype=6, layout=0, device=torch.device("cpu")) _2 = [self.lstm_layer.weight_ih_l0, self.lstm_layer.weight_hh_l0, self.lstm_layer.weight_ih_l1, self.lstm_layer.weight_hh_l1] input_4, _3, _4 = torch.lstm(input_3, [hx, hx], _2, False, 2, 0.10000000000000001, False, False, True) input = torch.matmul(input_4, torch.t(self.rnn2item.weight)) tastevec = torch.div(input, torch.norm(input, 2, 2, True)) outputs = torch.matmul(tastevec, m_weight) ``` Pull Request resolved: pytorch#18684 Differential Revision: D14782041 Pulled By: ezyang fbshipit-source-id: 7b2fc19b7d5b6600263644498bb728319a19f39d

killeent closed this as completed Jun 7, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Method/Function overlap #4

Method/Function overlap #4

zdevito commented Jun 7, 2017

zdevito commented Jun 7, 2017

killeent commented Jun 7, 2017

Method/Function overlap #4

Method/Function overlap #4

Comments

zdevito commented Jun 7, 2017

zdevito commented Jun 7, 2017

killeent commented Jun 7, 2017