Serialize conj and neg bits #88182

kshitij12345 · 2022-11-01T08:30:24Z

TODO:

C++ Unpickler Fix (locally tested pickled in Python and unpickled in C++)
C++ Pickler Fix (locally tested pickled in C++ and unpickled in Python)
Do quant_tensor, sparse_tensor, etc require similar changes? (Sparse and Quant don't need this)
Add Comments
How to make sure C++ and Python are in sync? (Functions in pickler.h help in getting and setting Tensor Metadata (math-bits for now) on a tensor. They are the only place which should handle this.)

Notes:
Quantized tensors don't support complex dtypes, and for float dtypes they segfault with _neg_view: #88484

Sparse Tensor:

>>> a = torch.tensor([[0, 2.], [3j, 0]]).to_sparse()
>>> a.conj().is_conj()
False
>>> a._neg_view()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: Cannot access storage of SparseTensorImpl

pytorch-bot · 2022-11-01T08:30:26Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88182

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 5152d8c:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

torch/csrc/jit/serialization/unpickler.cpp

torch/csrc/jit/serialization/pickler.h

kshitij12345 · 2022-11-03T20:41:04Z

torch/utils/model_dump/__init__.py

+            if len(data.args) == 6:
+                storage, offset, size, stride, requires_grad, hooks = data.args
+            else:
+                storage, offset, size, stride, requires_grad, hooks, math_bits = data.args
            storage_info = get_storage_info(storage)
            return {"__tensor_v2__": [storage_info, offset, size, stride, requires_grad]}


This ties into a JavaScript rendering helper. That helper ignores MathBits when it renders this information.

pytorch/torch/utils/model_dump/code.js

Lines 173 to 178 in 65de9a2

if (data.__tensor_v2__) {
  const [storage, offset, size, stride, grad] = data.__tensor_v2__;
  const [dtype, key, device, numel] = storage;
  return this.renderTensor(
    "tensor", dtype, key, device, numel, offset, size, stride, grad, []);
}
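The argument handling in the model_dump change above can be sketched in plain Python. This is an illustration only: `summarize_rebuild_args` is a hypothetical name, the storage is passed through unchanged instead of going through `get_storage_info`, and the contents of the optional seventh field are assumed to be a dict.

```python
from typing import Any, Dict, List, Sequence


def summarize_rebuild_args(args: Sequence[Any]) -> Dict[str, List[Any]]:
    """Summarize tensor rebuild args, tolerating an optional seventh field."""
    if len(args) not in (6, 7):
        raise ValueError(f"expected 6 or 7 rebuild args, got {len(args)}")
    # The first six fields are shared by old and new pickles.
    storage, offset, size, stride, requires_grad, _hooks = args[:6]
    # Assumption: the optional seventh field holds extra tensor metadata.
    # It is read here but deliberately left out of the summary, which
    # mirrors how the rendering helper ignores it.
    _metadata = args[6] if len(args) == 7 else {}
    return {"__tensor_v2__": [storage, offset, list(size), list(stride), requires_grad]}


old_style = ("storage0", 0, (2, 2), (2, 1), False, {})
new_style = old_style + ({"conj": True},)
print(summarize_rebuild_args(old_style) == summarize_rebuild_args(new_style))
```

Slicing the first six fields keeps a single code path for both pickle layouts, so the summary cannot drift between them.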

kshitij12345 · 2022-11-03T20:54:57Z

torch/csrc/jit/serialization/pickler.h

+// set MathBits on Tensor from map
+inline void setTensorMathBits(
+    const at::Tensor& t,
+    c10::Dict<c10::IValue, c10::IValue> math_bits_idict) {


This overload is required for the C++ unpickler (unpickler.cpp), which returns a c10::Dict<IValue, IValue>.

add a comment?

kshitij12345 · 2022-11-04T09:03:19Z

It would be great to know whether this makes sense and is heading in the right direction. Thanks!

ezyang · 2022-11-04T19:19:43Z

torch/_tensor.py

@@ -358,6 +358,7 @@ def _reduce_ex_internal(self, proto):
                self.stride(),
                self.requires_grad,
                backward_hooks,
+                torch._utils.get_math_bits(self),


You can't do this. This will break serialization forward compatibility (FC).

What you need to do is test whether get_math_bits returns the default value (no bits set). If none are set, you must omit it from the argument list so that the pickle stays compatible with the old rebuild_tensor.

Makes sense. Thanks!
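The forward-compatibility rule described in this exchange can be sketched with a toy class. Everything here is hypothetical (`ToyTensor`, `rebuild_toy_tensor`, `old_rebuild_toy_tensor`); it is not the code in torch/_tensor.py, only the same idea of appending the extra argument when it carries non-default state.

```python
from typing import Any, Dict, Optional, Tuple


class ToyTensor:
    """Toy stand-in for a tensor: a payload plus one flag that defaults to False."""

    def __init__(self, values: Tuple[float, ...], conj: bool = False) -> None:
        self.values = values
        self.conj = conj

    def __reduce_ex__(self, protocol: int):
        args: Tuple[Any, ...] = (self.values,)
        if self.conj:
            # Append the extra argument only when it carries non-default state,
            # so default objects keep the old one-argument form that an older
            # rebuild function can still call.
            args += ({"conj": True},)
        return (rebuild_toy_tensor, args)


def rebuild_toy_tensor(values, metadata: Optional[Dict[str, bool]] = None) -> ToyTensor:
    """New-style rebuild function: the metadata argument is optional."""
    metadata = metadata or {}
    return ToyTensor(values, conj=metadata.get("conj", False))


def old_rebuild_toy_tensor(values) -> ToyTensor:
    """Old-style rebuild function that predates the metadata argument."""
    return ToyTensor(values)


# A default object still produces arguments the old reader accepts.
_, plain_args = ToyTensor((1.0, 2.0)).__reduce_ex__(2)
old_rebuild_toy_tensor(*plain_args)

# A flagged object round-trips through the new reader.
fn, flagged_args = ToyTensor((1.0, 2.0), conj=True).__reduce_ex__(2)
restored = fn(*flagged_args)
```

Only objects that actually use the new flag produce pickles an old reader cannot load, which is the compatibility trade-off requested in the review.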

ezyang · 2022-11-04T19:21:28Z

torch/csrc/Module.cpp

+  py_module.def(
+      "_set_tensor_mathbits",
+      static_cast<void (*)(
+          const at::Tensor&, std::unordered_map<int8_t, bool>)>(


I'm not a big fan of this being a map from ints to bools. If someone is inspecting the pickle data by hand, they will see an opaque numeric constant and will have to grovel through the C++ enum definition to resolve its name. Also, the enum doesn't even say that these numbers are serialized and must be kept consistent.

I think a more forward-looking format would be a dictionary from string to bool. This can be conceptualized as arbitrary extra metadata attached to the tensor, and it lets us add more properties in the future if needed. Additionally, you should NOT add entries to the dictionary if they have their default values.

I did start with a map of <string, bool>, but a string seemed like overkill for the key. What you have mentioned makes sense, though. I have updated the type of the map.

Thanks!
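A minimal sketch of the string-keyed format discussed above, assuming hypothetical key names (`"conj"`, `"neg"`) and helper names (`get_metadata`, `apply_metadata`); the actual keys used by the PR are not shown in this thread. Raising on unknown keys is one possible policy, chosen here only for illustration.

```python
from typing import Dict

# Hypothetical flag names with their default values.
DEFAULT_BITS: Dict[str, bool] = {"conj": False, "neg": False}


def get_metadata(bits: Dict[str, bool]) -> Dict[str, bool]:
    """Return only the flags that differ from their defaults."""
    return {name: value for name, value in bits.items() if value != DEFAULT_BITS[name]}


def apply_metadata(metadata: Dict[str, bool]) -> Dict[str, bool]:
    """Rebuild the full flag state from a sparse metadata dict."""
    unknown = set(metadata) - set(DEFAULT_BITS)
    if unknown:
        # Assumption: fail loudly rather than drop a flag we do not understand.
        raise ValueError(f"unknown metadata keys: {sorted(unknown)}")
    return {**DEFAULT_BITS, **metadata}


sparse = get_metadata({"conj": True, "neg": False})
full = apply_metadata(sparse)
```

Because defaults are never written, an all-default tensor yields an empty dict, and the keys that do appear are readable when someone inspects the pickle by hand.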

ezyang · 2022-11-04T19:26:50Z

Thanks a lot for fixing this, it is long overdue. I have some comments about the serialization format.

anjali411

Looks good overall. I think we should add a test for C++ serialization API https://github.com/pytorch/pytorch/blob/master/test/cpp/api/serialize.cpp

kshitij12345 · 2022-11-08T13:44:16Z

Looks good overall. I think we should add a test for C++ serialization API https://github.com/pytorch/pytorch/blob/master/test/cpp/api/serialize.cpp

Thanks for sharing that link. Will update the name and also add a test for C++!

anjali411 · 2022-11-09T11:41:11Z

test/cpp/api/serialize.cpp

+  }
+
+  {
+    auto expected = torch::conj(torch::_neg_view(x));


Suggested change

-    auto expected = torch::conj(torch::_neg_view(x));
+    auto expected = torch::imag(torch::conj(x));

Here we are trying to set both neg and conj bits.

Sorry, I meant to post this for the neg-bit-only case.

anjali411 · 2022-11-09T11:44:55Z

Looks great. It just occurred to me that it might be nice to preemptively add serialization support for the ZeroTensor bit. We can't currently get a ZeroTensor through a public API, so adding an assert would be fine too. This doesn't have to be in this PR; it can be done in a follow-up PR. Thank you!

kshitij12345 · 2022-11-09T12:10:52Z

Sure. Will have a follow-up PR for ZeroTensor. Thanks!
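The "assert instead of support" option mentioned above could look roughly like the following. The key name `"zero_tensor"` and the helper `check_serializable` are assumptions made for this sketch, not names taken from the follow-up PR.

```python
from typing import Dict

# Hypothetical key name for a flag the format does not support yet.
UNSUPPORTED_KEYS = ("zero_tensor",)


def check_serializable(bits: Dict[str, bool]) -> None:
    """Raise instead of silently dropping a flag the format cannot represent."""
    for name in UNSUPPORTED_KEYS:
        if bits.get(name, False):
            raise RuntimeError(f"cannot serialize a tensor with the {name!r} flag set")


check_serializable({"conj": True})  # supported flags pass through untouched
```

Failing at save time avoids writing a file that would load back as a different tensor.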

kshitij12345 · 2022-11-09T12:57:24Z

@pytorchbot merge

pytorchmergebot · 2022-11-09T12:59:00Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2022-11-09T13:19:27Z

Merge failed

Reason: HTTP Error 502: Bad Gateway

Details for Dev Infra team

Raised by workflow job

anjali411 · 2022-11-09T17:10:59Z

@pytorchbot merge

anjali411 · 2022-11-09T17:11:10Z

@pytorchbot merge -g

pytorchmergebot · 2022-11-09T17:15:07Z

Merge started

Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Follow-up : #88182 (comment) Pull Request resolved: #88803 Approved by: https://github.com/anjali411

Fixes pytorch#81690

TODO:
* [x] C++ Unpickler Fix (locally tested pickled in Python and unpickled in C++)
* [x] C++ Pickler Fix (locally tested pickled in C++ and unpickled in Python)
* [x] Do quant_tensor, sparse_tensor, etc require similar changes? (Sparse and Quant don't need this)
* [x] Add Comments
* [x] How to make sure C++ and Python are in sync? (Functions in `pickler.h` help in getting and setting Tensor Metadata (math-bits for now) on a tensor. They are the only place which should handle this.)

Notes:
Quant Tensor don't support complex dtypes and for float they segfault with `_neg_view` : pytorch#88484

Sparse Tensor:
```python
>>> a = torch.tensor([[0, 2.], [3j, 0]]).to_sparse()
>>> a.conj().is_conj()
False
>>> a._neg_view()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: Cannot access storage of SparseTensorImpl
```

Pull Request resolved: pytorch#88182
Approved by: https://github.com/ezyang, https://github.com/anjali411

[fix] MathBits: serialization

521133c

pytorchbot added the open source label Nov 1, 2022

kshitij12345 added 4 commits November 3, 2022 08:28

latest master

ac3f28a

move get and set mathbits to c++

ef49e72

latest master

6225275

handle pickling from c++

288bdaf

kshitij12345 commented Nov 3, 2022

torch/csrc/jit/serialization/unpickler.cpp (outdated, resolved)

pytorch-bot bot added the release notes: jit release notes category label Nov 3, 2022

kshitij12345 commented Nov 3, 2022

torch/csrc/jit/serialization/pickler.h (outdated, resolved)

kshitij12345 added 9 commits November 3, 2022 14:08

remove stale change

3138725

c++ handle old pickles

1d697f2

Merge branch 'pytorch:master' into fix/serialization/mathbits

1bf9318

guard against quantized

171c85b

merge

33d0ce8

hacky fix for jit

913ca38

guard against quantize

f91047d

use enum instead of string

36e62f3

make linter happy

de9ba4a

kshitij12345 commented Nov 3, 2022

kshitij12345 requested review from ezyang, albanD and anjali411 November 4, 2022 09:02

kshitij12345 marked this pull request as ready for review November 4, 2022 09:02

kshitij12345 changed the title ~~[WIP] [fix] MathBits: serialization~~ [fix] MathBits: serialization Nov 4, 2022

ezyang reviewed Nov 4, 2022

kshitij12345 added 4 commits November 8, 2022 06:23

address review

0da94ce

latest upstream

6a0f74d

latest viable/strict

ee654d0

update error message

a874e5f

anjali411 approved these changes Nov 8, 2022

kshitij12345 added 4 commits November 9, 2022 09:49

Merge branch 'master' of https://github.com/pytorch/pytorch into fix/serialization/mathbits

4e4edea

rename mathbits to metadata

543d35a

add cpp test

239df55

update variable name in model_dump code

5152d8c

anjali411 reviewed Nov 9, 2022

kshitij12345 added release notes: complex release notes category topic: bug fixes topic category labels Nov 9, 2022

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 9, 2022

pytorchmergebot added the Merged label Nov 9, 2022

pytorchmergebot closed this in eb9b156 Nov 9, 2022

kshitij12345 mentioned this pull request Nov 10, 2022

Error on ZeroTensor serialization #88803

Closed

pytorchmergebot pushed a commit that referenced this pull request Nov 11, 2022

Error on ZeroTensor serialization (#88803)

d15a6b0

Follow-up : #88182 (comment) Pull Request resolved: #88803 Approved by: https://github.com/anjali411

kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Dec 10, 2022

Error on ZeroTensor serialization (pytorch#88803)

43480bd

Follow-up : pytorch#88182 (comment) Pull Request resolved: pytorch#88803 Approved by: https://github.com/anjali411

lezcano changed the title ~~[fix] MathBits: serialization~~ Serialize conj and neg bits Feb 20, 2023
