
Sort out whatever is happening with bernoulli, and regularize #5

Open
zdevito opened this issue Jun 7, 2017 · 1 comment
Comments

zdevito commented Jun 7, 2017

It is the only thing that refers to THFloatTensor and THDoubleTensor directly, and it is not clear what is even happening in this file.

```
#define THCudaDoubleTensor_BERNOULLI_TENSOR THCudaDoubleTensor_bernoulli_DoubleTensor
#define THCudaTensor_BERNOULLI_TENSOR THCudaTensor_bernoulli_FloatTensor

[[
  name: bernoulli
  defined_if: CUDA_FLOAT || CUDA_DOUBLE
  types:
    - Float
    - Double
  processors:
    - CUDA
  return: argument 0
  variants:
    - method
    - function
  cname: BERNOULLI_TENSOR
  before_call:
    THTensor_(resizeAs)(LIBRARY_STATE ((THPTensor*)$arg0)->cdata, ((THPTensor*)$arg1)->cdata);
  arguments:
    - arg: THTensor* output
      output: True
    - THTensor* self
]]

#undef THCudaDoubleTensor_BERNOULLI_TENSOR
#undef THCudaTensor_BERNOULLI_TENSOR

[[
  name: bernoulli_
  defined_if: CUDA_FLOAT || CUDA_DOUBLE || CUDA_HALF
  types:
    - floating_point
  processors:
    - CUDA
  return: self
  options:
    - cname: bernoulli
      arguments:
        - THTensor* self
        - arg: double p
          default: 0.5
    - cname: bernoulli_FloatTensor
      arguments:
        - THTensor* self
        - THCudaTensor* float_p
    - cname: bernoulli_DoubleTensor
      arguments:
        - THTensor* self
        - THCudaDoubleTensor* float_p
]]
```
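
For reference, the second declaration generates the in-place method with three overloads. From Python on a CUDA build they would be exercised roughly like this (a sketch using the legacy `torch.cuda.*Tensor` constructors; untested):

```python
import torch

x = torch.cuda.FloatTensor(10)

x.bernoulli_()        # first option: scalar p, default 0.5
x.bernoulli_(0.3)     # first option: explicit scalar p

p_float = torch.cuda.FloatTensor(10).uniform_()
x.bernoulli_(p_float)   # second option: per-element p from a float tensor

p_double = torch.cuda.DoubleTensor(10).uniform_()
x.bernoulli_(p_double)  # third option: per-element p from a double tensor
```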

killeent commented Jun 7, 2017

@zdevito the bernoulli function has three variants:

```
x = torch.Tensor(10)
x.bernoulli_(0.5) # sample from bernoulli distribution with p=0.5 over all elements in x

y_float = torch.Tensor(10).uniform_()
x.bernoulli_(y_float) # sample from bernoulli distribution with p sourced from the corresponding value in y_float at each index

y_double = torch.DoubleTensor(10).uniform_()
x.bernoulli_(y_double) # variant of the above, with a double-precision tensor as the source for p
```

The 2nd and 3rd variants explicitly have float and double Tensor arguments.

Now, for whatever reason (I didn't look into why), consider the following:

```
x = torch.DoubleTensor(10)
y = torch.Tensor(10)
z = torch.Tensor(10).uniform_()

torch.bernoulli(z, out=y) # ok!
torch.bernoulli(z, out=x) # invalid args, z/x must have same type
```

So essentially what the macros do is make the functions generated for the first declaration call the float-tensor bernoulli function when the type is CUDA_FLOAT, and the double-tensor bernoulli function when the type is CUDA_DOUBLE. In the second declaration, where we don't have an output, this works. It's unclear to me without further digging why this is so; it seems that we should be able to have any output Tensor type for bernoulli...
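
To make the type constraint concrete, here is an illustrative dispatch table (hypothetical Python, not real PyTorch code; only the TH function names are taken from the macros above). A binding exists only when the output type and the p-tensor type agree, which is exactly what the `torch.bernoulli(z, out=x)` failure shows:

```python
# Hypothetical sketch of the dispatch the BERNOULLI_TENSOR macros encode.
DISPATCH = {
    ("Float", "Float"): "THCudaTensor_bernoulli_FloatTensor",
    ("Double", "Double"): "THCudaDoubleTensor_bernoulli_DoubleTensor",
}

def bernoulli_out(output_type, p_type):
    # No cross-type entry exists, so a float p tensor with a double output
    # has nothing to dispatch to -- hence "invalid args" above.
    try:
        return DISPATCH[(output_type, p_type)]
    except KeyError:
        raise TypeError("z/x must have same type")

print(bernoulli_out("Float", "Float"))  # THCudaTensor_bernoulli_FloatTensor
```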

zdevito pushed a commit that referenced this issue Feb 22, 2019
Summary:
Currently there is a mismatch in naming between the Python BatchNorm `running_var` and the C++ BatchNorm `running_variance`, which causes JIT model parameter loading to fail (pytorch/vision#728 (comment)):
```
terminate called after throwing an instance of 'c10::Error'
  what():  No such serialized tensor 'running_variance' (read at /home/shahriar/Build/pytorch/torch/csrc/api/src/serialize/input-archive.cpp:27)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x85 (0x7f2d92d32f95 in /usr/local/lib/libc10.so)
frame #1: torch::serialize::InputArchive::read(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, at::Tensor&, bool) + 0xdeb (0x7f2d938551ab in /usr/local/lib/libtorch.so.1)
frame #2: torch::nn::Module::load(torch::serialize::InputArchive&) + 0x98 (0x7f2d9381cd08 in /usr/local/lib/libtorch.so.1)
frame #3: torch::nn::Module::load(torch::serialize::InputArchive&) + 0xf9 (0x7f2d9381cd69 in /usr/local/lib/libtorch.so.1)
frame #4: torch::nn::Module::load(torch::serialize::InputArchive&) + 0xf9 (0x7f2d9381cd69 in /usr/local/lib/libtorch.so.1)
frame #5: torch::nn::operator>>(torch::serialize::InputArchive&, std::shared_ptr<torch::nn::Module> const&) + 0x32 (0x7f2d9381c7b2 in /usr/local/lib/libtorch.so.1)
frame #6: <unknown function> + 0x2b16c (0x5645f4d1916c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #7: <unknown function> + 0x27a3c (0x5645f4d15a3c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #8: <unknown function> + 0x2165c (0x5645f4d0f65c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #9: <unknown function> + 0x1540b (0x5645f4d0340b in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #10: __libc_start_main + 0xf3 (0x7f2d051dd223 in /usr/lib/libc.so.6)
frame #11: <unknown function> + 0x1381e (0x5645f4d0181e in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
```
Renaming C++ BatchNorm `running_variance` to `running_var` should fix this problem.

This is a BC-breaking change, but it should be easy for end users to rename `running_variance` to `running_var` in their call sites.
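
On the Python side the buffer has always been registered as `running_var`, so a serialized Python model never contains a `running_variance` entry; a quick check of the state dict shows the names the C++ loader has to match:

```python
import torch

bn = torch.nn.BatchNorm2d(16)
print(sorted(bn.state_dict().keys()))
# ['bias', 'num_batches_tracked', 'running_mean', 'running_var', 'weight']
```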
Pull Request resolved: pytorch#17371

Reviewed By: goldsborough

Differential Revision: D14172775

Pulled By: yf225

fbshipit-source-id: b9d3729ec79272a8084269756f28a8f7c4dd16b6
zdevito pushed a commit that referenced this issue Mar 20, 2019
…7b7558 (pytorch#18070)

Summary:
Pull Request resolved: pytorch#18070

Previous import was d1f45b1a2b1585d0e9bc65e15e463db344fc3ff6

Included changes:
- **[2bcc406](houseroad/foxi@2bcc406)**: Merge pull request #7 from jackm321/tracing_fixes <Jack Montgomery>
- **[c39033c](houseroad/foxi@c39033c)**: Fixes for tracing events <Jack Montgomery>
- **[50912cf](houseroad/foxi@50912cf)**: Merge pull request #5 from jackm321/add_trace_events <Jack Montgomery>
- **[ba2fdcb](houseroad/foxi@ba2fdcb)**: Merge pull request #5 from jackm321/add_trace_events <Jack Montgomery>
- **[7d42b12](houseroad/foxi@7d42b12)**: address comments <Jack Montgomery>
- **[dcabd8d](houseroad/foxi@dcabd8d)**: Add trace events interface <Jack Montgomery>

Reviewed By: houseroad

Differential Revision: D14483201

fbshipit-source-id: f51ed869c9a89521079df89903abc0ac0a45ac7b
ailzhang pushed a commit that referenced this issue Apr 9, 2019
Summary:
Tracing models that attempt to return this in-place value doesn't turn out well.

I haven't run any tests to confirm the results, to be honest, but regardless of the outcome the operation happens in-place, so it should work as before.

Sample output from a traced model attempting to set `max_norm` on `Embedding`:
```
a leaf Variable that requires grad has been used in an in-place operation. (check_inplace at /pytorch/torch/csrc/autograd/VariableTypeUtils.h:49)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f0ecc5cc021 in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f0ecc5cb8ea in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #2: <unknown function> + 0x38ab2f (0x7f0ecb55ab2f in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #3: torch::autograd::VariableType::embedding_renorm_(at::Tensor&, at::Tensor const&, double, double) const + 0x76 (0x7f0ecb5b5966 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #4: <unknown function> + 0x56c958 (0x7f0ecb73c958 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #5: <unknown function> + 0x672286 (0x7f0ecb842286 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #6: torch::jit::InterpreterState::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) + 0x22 (0x7f0ecb83d842 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #7: <unknown function> + 0x65c6ac (0x7f0ecb82c6ac in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #8: <unknown function> + 0x3c8ab4 (0x7f0f06bc0ab4 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #9: <unknown function> + 0x3ad2c3 (0x7f0f06ba52c3 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #10: <unknown function> + 0x11663e (0x7f0f0690e63e in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame pytorch#39: python_call + 0x11 (0x5563c3c521c1 in uwsgi)
frame pytorch#40: uwsgi_request_wsgi + 0x100 (0x5563c3c54410 in uwsgi)
frame pytorch#41: wsgi_req_recv + 0xac (0x5563c3becabc in uwsgi)
frame pytorch#42: simple_loop_run + 0xc4 (0x5563c3c35be4 in uwsgi)
frame pytorch#43: simple_loop + 0x10 (0x5563c3c35a00 in uwsgi)
frame pytorch#44: uwsgi_ignition + 0x241 (0x5563c3c3a3a1 in uwsgi)
frame pytorch#45: uwsgi_worker_run + 0x275 (0x5563c3c3ec35 in uwsgi)
frame pytorch#46: <unknown function> + 0x8f22c (0x5563c3c3f22c in uwsgi)
frame pytorch#47: <unknown function> + 0x3c13e (0x5563c3bec13e in uwsgi)
frame pytorch#48: __libc_start_main + 0xf1 (0x7f0f138922e1 in /lib/x86_64-linux-gnu/libc.so.6)
frame pytorch#49: _start + 0x2a (0x5563c3bec16a in uwsgi)
:
operation failed in interpreter:
op_version_set = 0
def forward(self,
    input_1: Tensor) -> Tensor:
  _0 = torch.norm(self.item_embedding.weight, 2, 1, True)
  _1 = torch.div(self.item_embedding.weight, _0)
  m_weight = torch.t(_1)
  input_2 = torch.contiguous(input_1)
  weight_1 = torch.embedding_renorm_(self.item_embedding.weight, input_2, 1., 2.)
             ~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
  x = torch.embedding(weight_1, input_2, -1, False, False)
  input_3 = torch.div(x, torch.norm(x, 2, 2, True))
  max_batch_size = ops.prim.NumToTensor(torch.size(input_3, 0))
  hx = torch.zeros([2, int(max_batch_size), 70], dtype=6, layout=0, device=torch.device("cpu"))
  _2 = [self.lstm_layer.weight_ih_l0, self.lstm_layer.weight_hh_l0, self.lstm_layer.weight_ih_l1, self.lstm_layer.weight_hh_l1]
  input_4, _3, _4 = torch.lstm(input_3, [hx, hx], _2, False, 2, 0.10000000000000001, False, False, True)
  input = torch.matmul(input_4, torch.t(self.rnn2item.weight))
  tastevec = torch.div(input, torch.norm(input, 2, 2, True))
  outputs = torch.matmul(tastevec, m_weight)
```
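
A minimal way to hit the error above (a sketch; the exact point of failure depends on the PyTorch version, since this commit changes the behavior) is to trace an `Embedding` with `max_norm` set, whose forward renorms its own weight in-place, and then run the traced module:

```python
import torch

emb = torch.nn.Embedding(10, 4, max_norm=1.0)
idx = torch.tensor([1, 2, 3])

traced = torch.jit.trace(emb, idx)
# Pre-fix, replaying the recorded in-place embedding_renorm_ on the leaf
# weight (which requires grad) raised the error shown above.
out = traced(idx)
print(out.shape)  # torch.Size([3, 4]) once the in-place op is handled correctly
```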
Pull Request resolved: pytorch#18684

Differential Revision: D14782041

Pulled By: ezyang

fbshipit-source-id: 7b2fc19b7d5b6600263644498bb728319a19f39d