Serialize conj and neg bits #88182
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88182
Note: Links to docs will display an error until the docs builds have been completed. ✅ No failures as of commit 5152d8c. This comment was automatically generated by Dr. CI and updates every 15 minutes.
```python
if len(data.args) == 6:
    # Older archives: six entries, no math bits.
    storage, offset, size, stride, requires_grad, hooks = data.args
else:
    # Newer archives also carry the math-bits entry; it is ignored by model_dump rendering.
    storage, offset, size, stride, requires_grad, hooks, math_bits = data.args
storage_info = get_storage_info(storage)
return {"__tensor_v2__": [storage_info, offset, size, stride, requires_grad]}
```
This ties into the JavaScript rendering helper below. We ignore MathBits in that JavaScript rendering of this information.
pytorch/torch/utils/model_dump/code.js, lines 173 to 178 in 65de9a2:
```js
if (data.__tensor_v2__) {
  const [storage, offset, size, stride, grad] = data.__tensor_v2__;
  const [dtype, key, device, numel] = storage;
  return this.renderTensor(
      "tensor", dtype, key, device, numel, offset, size, stride, grad, []);
}
```
```cpp
// set MathBits on Tensor from map
inline void setTensorMathBits(
    const at::Tensor& t,
    c10::Dict<c10::IValue, c10::IValue> math_bits_idict) {
```
This overload is required for the C++ unpickler.cpp, which returns a c10::Dict<IValue, IValue>.
Add a comment?
It would be great to know if this makes sense and is headed in the right direction. Thanks!
torch/_tensor.py (Outdated)
```diff
@@ -358,6 +358,7 @@ def _reduce_ex_internal(self, proto):
             self.stride(),
             self.requires_grad,
             backward_hooks,
+            torch._utils.get_math_bits(self),
```
You can't do this. This will break serialization FC (forward compatibility).
What you need to do is test whether get_math_bits returns the default value (no bits set). If none of them are set, you must omit it from the argument list so that the result stays compatible with the old rebuild_tensor.
Makes sense. Thanks!
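A minimal sketch of that suggestion, assuming the `get_math_bits`-style helper from the diff above returns an empty mapping when neither bit is set (the helper name and tuple layout are illustrative; the final PR may differ):

```python
import torch

def _reduce_args_with_math_bits(t, storage, backward_hooks):
    # The six arguments that the existing rebuild path already understands.
    args = (
        storage,
        t.storage_offset(),
        tuple(t.size()),
        t.stride(),
        t.requires_grad,
        backward_hooks,
    )
    # Assumed helper (added in this PR's diff); empty/falsy when no bit is set.
    math_bits = torch._utils.get_math_bits(t)
    if not math_bits:
        # Default case: keep the old six-argument form so files remain
        # loadable by older _rebuild_tensor_v2 implementations.
        return args
    return args + (math_bits,)
```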
torch/csrc/Module.cpp (Outdated)
```cpp
py_module.def(
    "_set_tensor_mathbits",
    static_cast<void (*)(
        const at::Tensor&, std::unordered_map<int8_t, bool>)>(
```
I'm not a big fan of this being a map from ints to bools. If someone is inspecting the Pickle data by hand, they will see an opaque numeric constant that they will have to grovel in the C++ enum definition to resolve the name of. Also, you don't even say on the enum that these numbers are serialized and must be kept consistent.
I think a more forward-looking format would be to have a dictionary from string to bool. This can be conceptualized as arbitrary extra metadata attached to the tensor, and lets us add more properties to it in the future if needed. Additionally, you should NOT add entries to the dictionary if they are the default values.
I did start with a map of `<string, bool>`, but `string` seemed like overkill for the key. What you have mentioned makes sense, though; I have updated the type of the map. Thanks!
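For illustration, a sketch (with hypothetical helper names) of what a string-keyed metadata dictionary could look like on both sides of serialization, with default values omitted as suggested:

```python
import torch

def math_bits_metadata(t):
    # Hypothetical producer: only non-default bits become entries, so a
    # plain tensor contributes no metadata at all.
    md = {}
    if t.is_conj():
        md["conj"] = True
    if t.is_neg():
        md["neg"] = True
    return md

def apply_math_bits(t, md):
    # Hypothetical consumer: re-create the recorded bits as views on load.
    if md.get("conj", False):
        t = t.conj()
    if md.get("neg", False):
        t = t._neg_view()
    return t
```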
Thanks a lot for fixing this; it is long overdue. I have some comments about the serialization format.
Looks good overall. I think we should add a test for the C++ serialization API in https://github.com/pytorch/pytorch/blob/master/test/cpp/api/serialize.cpp
Thanks for sharing that link. Will update the name and also add a test for C++!
```cpp
}

{
  auto expected = torch::conj(torch::_neg_view(x));
```
Suggested change:
```diff
- auto expected = torch::conj(torch::_neg_view(x));
+ auto expected = torch::imag(torch::conj(x));
```
Here we are trying to set both the `neg` and `conj` bits.
Sorry, I meant to post this for the neg-bit-only case.
Looks great. It just occurred to me that it might be nice to preemptively add serialization support for the ZeroTensor bit. We can't currently get a ZeroTensor through a public API, so adding an assert would be fine too. This doesn't have to be in this PR, i.e., it can be done in a follow-up PR. Thank you!
Sure. Will have a follow-up PR for ZeroTensor. Thanks!
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: HTTP Error 502: Bad Gateway. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
@pytorchbot merge -g
Merge started. Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Follow-up: #88182 (comment). Pull Request resolved: #88803. Approved by: https://github.com/anjali411
Fixes pytorch#81690

TODO:
* [x] C++ Unpickler fix (locally tested: pickled in Python and unpickled in C++)
* [x] C++ Pickler fix (locally tested: pickled in C++ and unpickled in Python)
* [x] Do quant_tensor, sparse_tensor, etc. require similar changes? (Sparse and Quant don't need this)
* [x] Add comments
* [x] How to make sure C++ and Python stay in sync? (Functions in `pickler.h` help in getting and setting tensor metadata (math bits for now) on a tensor. They are the only place which should handle this.)

Notes:
Quantized tensors don't support complex dtypes, and for float they segfault with `_neg_view`: pytorch#88484

Sparse tensors:
```python
>>> a = torch.tensor([[0, 2.], [3j, 0]]).to_sparse()
>>> a.conj().is_conj()
False
>>> a._neg_view()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: Cannot access storage of SparseTensorImpl
```

Pull Request resolved: pytorch#88182
Approved by: https://github.com/ezyang, https://github.com/anjali411
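For reference, a minimal round-trip check of the behavior this PR enables (illustrative only; the actual tests added in the PR may differ):

```python
import io

import torch

# Save a conjugate view and check that the conj bit survives the round trip.
x = torch.randn(3, dtype=torch.cfloat)
buf = io.BytesIO()
torch.save(x.conj(), buf)
buf.seek(0)
print(torch.load(buf).is_conj())  # expected: True with this change

# Same idea for the neg bit, via the private _neg_view helper used above.
buf = io.BytesIO()
torch.save(x._neg_view(), buf)
buf.seek(0)
print(torch.load(buf).is_neg())  # expected: True with this change
```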