
move TypedStorage handling to assertEqual #89557

Closed
wants to merge 17 commits

Conversation

pmeier
Collaborator

@pmeier pmeier commented Nov 23, 2022

#85303 added a patch to torch.testing.assert_close to handle torch.storage.TypedStorage instances. This change is not reflected in the docs and is not intended for the public API. This PR removes the patch once again and moves the behavior to TestCase.assertEqual instead. Meaning, TypedStorage instances are again not supported by the public API, but the behavior is unchanged for all internal use cases.
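To illustrate the direction of the change, here is a minimal sketch of what handling TypedStorage on the assertEqual side amounts to: view both storages as tensors and compare those, dtypes included. The helper name is hypothetical and not the actual implementation.

```python
import torch

def compare_typed_storages(actual, expected):
    # Hypothetical helper sketching what TestCase.assertEqual can do internally:
    # materialize each TypedStorage as a tensor of its own dtype and delegate to
    # assert_close, which also compares the dtypes.
    a = torch.tensor(actual.tolist(), dtype=actual.dtype)
    b = torch.tensor(expected.tolist(), dtype=expected.dtype)
    torch.testing.assert_close(a, b)

# Equal storages compare fine; a value or dtype mismatch raises AssertionError.
s1 = torch.tensor([1.0, 2.0]).storage()
s2 = torch.tensor([1.0, 2.0]).storage()
compare_typed_storages(s1, s2)
```

With this approach the public assert_close stays storage-free while internal tests keep their storage comparisons.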

cc @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen

@pmeier pmeier added the module: testing (Issues related to the torch.testing module (not tests)) and topic: not user facing (topic category) labels on Nov 23, 2022
@pytorch-bot

pytorch-bot bot commented Nov 23, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/89557


✅ No Failures

As of commit 254491c:
💚 Looks good so far! There are no failures yet. 💚


Collaborator Author

@pmeier pmeier left a comment


There are two distinct errors left:

  1. https://github.com/pytorch/pytorch/actions/runs/3539134271/jobs/5941041682#step:10:99929

    def test_save_different_dtype_unallocated(self):

    Unless I'm mistaken we should simply add exact_dtype=False to the assertEqual calls.

  2. https://github.com/pytorch/pytorch/actions/runs/3539134271/jobs/5940935728#step:10:29047

    def test_storage_setitem(self, device, dtype):

    The problem only occurs for the quantized dtypes. It happens because you can't instantiate a new tensor with such a dtype directly, but our logic currently tries to.
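For reference, the exact_dtype=False suggestion from the first point has a public counterpart in assert_close's check_dtype flag; a minimal sketch of the relaxed comparison:

```python
import torch

# With check_dtype=False, assert_close promotes both inputs to a common dtype
# and only compares values, mirroring what exact_dtype=False does for assertEqual.
a = torch.tensor([1, 2], dtype=torch.int32)
b = torch.tensor([1.0, 2.0], dtype=torch.float32)
torch.testing.assert_close(a, b, check_dtype=False)  # passes: values match
```

With the default check_dtype=True, the same call fails on the int32 vs. float32 mismatch.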

Inline review comment on torch/testing/_internal/common_utils.py (outdated, resolved)
@kurtamohler
Collaborator

kurtamohler commented Dec 6, 2022

As for your errors:

Unless I'm mistaken we should simply add exact_dtype=False to the assertEqual calls.

I think we want the dtypes to match when we perform the comparison; otherwise it wouldn't be possible to ensure an exact match, right? I'm not sure why the dtypes disagree. The loaded storage's dtype at least matches the saved storage's dtype in test/test_serialization.py in save_load_check, right? We definitely need that to be true.

The problem only occurs for the quantized dtypes. It happens because you can't instantiate a new tensor with such a dtype directly, but our logic currently tries to.

You may be able to use something like this trick to interpret the quantized type as the appropriate nonquantized type:

pytorch/torch/storage.py

Lines 530 to 543 in 7322f73

if self.dtype in [torch.quint8, torch.quint4x2, torch.quint2x4, torch.qint32, torch.qint8]:
    interpret_dtypes = {
        torch.quint8: torch.uint8,
        torch.quint4x2: torch.uint8,
        torch.quint2x4: torch.uint8,
        torch.qint32: torch.int32,
        torch.qint8: torch.int8,
    }
    tmp_dtype = interpret_dtypes[self.dtype]
    tmp_tensor = torch.tensor([], dtype=tmp_dtype, device=self._untyped_storage.device)
    tmp_tensor.set_(TypedStorage(
        wrap_storage=self._untyped_storage,
        dtype=tmp_dtype,
        _internal=True))
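A runnable sketch of that trick, under the assumption that the private _untyped_storage attribute and the _internal flag behave as in the snippet above; the quint8 storage here is built from raw uint8 bytes purely for demonstration:

```python
import torch

interpret_dtypes = {
    torch.quint8: torch.uint8,
    torch.qint32: torch.int32,
    torch.qint8: torch.int8,
}

# Demo setup: wrap raw bytes in a quint8 TypedStorage, since a tensor with a
# quantized dtype cannot be created directly.
raw = torch.tensor([0, 10, 20], dtype=torch.uint8).untyped_storage()
qs = torch.TypedStorage(wrap_storage=raw, dtype=torch.quint8)

# The trick: reinterpret the storage as the matching non-quantized dtype so it
# can back a regular tensor and be compared like any other storage.
tmp_dtype = interpret_dtypes[qs.dtype]
tmp_tensor = torch.tensor([], dtype=tmp_dtype, device=qs.device)
tmp_tensor.set_(torch.TypedStorage(
    wrap_storage=qs._untyped_storage,
    dtype=tmp_dtype,
    _internal=True))
print(tmp_tensor.tolist())
```

The resulting uint8 tensor exposes the raw values backing the quantized storage, which is enough for an element-wise comparison.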

@pmeier
Collaborator Author

pmeier commented Dec 7, 2022

I think we want the dtypes to match when we perform the comparison--otherwise it wouldn't be possible to ensure an exact match, right? I'm not sure why the dtypes disagree. The loaded storage's dtype at least matches the saved storage's dtype in test/test_serialization.py in save_load_check, right? We definitely need that to be true

I've looked more closely into the test, and as-is it makes little sense. Since we operate without any data

a = torch.tensor([], dtype=dtype, device=device)

the only thing we could be checking with

self.assertEqual(a, a_loaded)
self.assertEqual(b, b_loaded)

is the dtype.

However, due to some quirks in assertEqual, we aren't doing that either: for #67796 we agreed offline (I'm too lazy now to find the specific commit in the PR that added this) that storages should be compared as if they were a list of Python scalars. This is why, before this PR, we had them listed in sequence_types. Meaning, an empty tensor is turned into an empty list and no check happens at all:

>>> a = torch.tensor([], dtype=torch.int)
>>> b = a.float()
>>> torch.testing.assert_close(a, b)
AssertionError: The values for attribute 'dtype' do not match: torch.int32 != torch.float32.
>>> torch.testing.assert_close(a.tolist(), b.tolist())

Even if we had some data before, the check wouldn't have been doing a dtype check due to some legacy quirks in assertEqual:

class RelaxedNumberPair(NumberPair):
    """Pair for number-like inputs.

    In contrast to the builtin :class:`NumberPair`, this class also supports one input being a single element
    tensor-like or a :class:`enum.Enum`. (D)Type checks are disabled, meaning comparing 1 to 1.0 succeeds even when
    ``check_dtype=True`` is passed.
    """

In conclusion, before this PR assertEqual will only check the values, but never the dtypes of typed storages. Since there are no values, the test is doing nothing for typed storages.


Now for why this is failing after this PR. A minimal reproduction is

import io
import torch

dtype = torch.float64
other_dtype = torch.float32

t = torch.tensor([], dtype=dtype)
s = torch.TypedStorage(wrap_storage=t.storage().untyped(), dtype=other_dtype)

with io.BytesIO() as buffer:
    torch.save([t, s], buffer)
    buffer.seek(0)
    *_, s_loaded = torch.load(buffer)

assert s_loaded.dtype == s.dtype

Is the mismatch between s.dtype and t.dtype intentional? We create s from t.storage(), which has dtype, but we explicitly set dtype=other_dtype. If yes, the failing test in this PR actually is a valid failure, since s_loaded has dtype rather than other_dtype like s.

Interestingly, the behavior is correct when only serializing s, i.e. torch.save([s], buffer). Plus, if we actually try to do the same with some data, i.e. t = torch.tensor([0], dtype=dtype), we are not even allowed to serialize both at the same time:

RuntimeError: Cannot save multiple tensors or storages that view the same data as different types
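That error is easy to trigger; a minimal sketch, using Tensor.untyped_storage() as the modern spelling of storage().untyped(), and assuming the save-time check behaves as described above:

```python
import io
import torch

# With allocated data, a tensor and a differently-typed storage viewing the
# same bytes cannot be saved together: torch.save rejects two objects sharing
# a data_ptr under different dtypes.
t = torch.tensor([0], dtype=torch.float64)
s = torch.TypedStorage(wrap_storage=t.untyped_storage(), dtype=torch.float32)

buffer = io.BytesIO()
try:
    torch.save([t, s], buffer)
except RuntimeError as e:
    print(e)  # the "Cannot save multiple tensors or storages ..." error
```

For unallocated inputs the data_ptrs are both 0, so this check never fires, which is why the empty-tensor test can save both.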

@kurtamohler
Collaborator

kurtamohler commented Dec 8, 2022

Ah sorry, I forgot about this until now. Yes, the mismatch between s.dtype and a.dtype is intentional. When I wrote test_save_different_dtype_unallocated, I should have added a comment to explain it.

Historically, it has not been possible to save/load multiple tensors/storages that point to the same location (though I think it may be fairly straightforward to add that feature now that all the storage refactoring is complete). In torch.save we check the data_ptr of each tensor and storage and throw an error if two have the same data_ptr but different dtypes. However, if a tensor and storage are unallocated (as in tensor([])), their data_ptrs are both 0, and the tensor and storage are not actually pointing to the same memory location, since they're not pointing at any memory location. This means that doing this

a = torch.tensor([])
s = a.storage()

and this

a = torch.tensor([])
s = torch.Storage()

produce exactly the same a and s.
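This can be checked directly; a small sketch asserting the data_ptr() == 0 claim above, using UntypedStorage for the standalone case:

```python
import torch

# An empty tensor's storage and a freshly constructed empty storage are both
# unallocated: they hold no elements and their data_ptrs are 0, so they are
# indistinguishable to torch.save's shared-data check.
a = torch.tensor([])
s_from_tensor = a.untyped_storage()
s_standalone = torch.UntypedStorage()

print(s_from_tensor.data_ptr(), s_standalone.data_ptr())  # 0 0
print(len(s_from_tensor), len(s_standalone))  # 0 0
```

Since neither points at any memory, "viewing the same data" is vacuous here, which is why differently-typed unallocated storages may be saved together.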

We do allow saving and loading unallocated tensors/storages of different dtypes. The only reason why the test for this exists is because I broke this behavior at one point and had to fix it. The history of this is here: #58970

It probably would have been better if I had not used a.storage() in this test, and instead created the storage separately, so that it's clear that a and s aren't actually linked at all.

@kurtamohler
Collaborator

kurtamohler commented Dec 8, 2022

And it appears that the dtype mismatch between the saved and loaded storage is an upstream bug. If I change the test to do this:

self.assertEqual(a.dtype, a_loaded.dtype)
self.assertEqual(b.dtype, b_loaded.dtype)

it fails. I had assumed assertEqual checks the dtype. I will look into why this failure happens.

@pmeier
Collaborator Author

pmeier commented Dec 8, 2022

I had assumed assertEqual checks the dtype.

It will after this PR. Should we maybe xfail this test in this PR and merge it so you can have a testing ground?
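In plain unittest terms, the xfail idea is just expectedFailure; a minimal sketch with a stand-in test body (the real test is test_save_different_dtype_unallocated in test/test_serialization.py, and PyTorch's internal test harness may use its own decorator):

```python
import unittest

class TestSerialization(unittest.TestCase):
    # Stand-in body: mark the test as an expected failure until the upstream
    # dtype bug in save/load is fixed.
    @unittest.expectedFailure
    def test_save_different_dtype_unallocated(self):
        raise AssertionError("loaded storage dtype does not match saved storage dtype")

# Running the case records one expected failure and an overall success.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(TestSerialization).run(result)
print(len(result.expectedFailures))  # 1
```

The suite stays green while the known failure remains visible, giving a testing ground for the upstream fix.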

@pmeier
Collaborator Author

pmeier commented Dec 8, 2022

Should we maybe xfail this test

I took the liberty of doing that in 822f6b2. In case you want to go another round, I'll revert it later.

@pmeier pmeier marked this pull request as ready for review December 8, 2022 09:46
@pmeier pmeier requested a review from a team as a code owner December 8, 2022 09:46
@kurtamohler
Collaborator

Expected failure seems like a good idea to me

@github-actions github-actions bot added the module: mkldnn Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration label Dec 9, 2022
Collaborator

@mruberry mruberry left a comment


Stamped, but be sure to revert the fbgemm changes

@pmeier
Collaborator Author

pmeier commented Dec 12, 2022

@pytorchbot merge -g

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 12, 2022
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

Labels: ciflow/trunk (Trigger trunk jobs on your pull request), Merged, module: mkldnn (Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration), module: testing (Issues related to the torch.testing module (not tests)), open source, topic: not user facing (topic category)