
Ensure axis masking operations are not in-place #1481

Merged
merged 4 commits into from May 3, 2021

Conversation

carolineechen
Contributor

As reported in #1478, TimeMasking performs in-place operations that modify the input tensor. The same is true of FrequencyMasking.

This PR modifies the algorithm to not perform in-place operations on the input tensor and adds checks in the unit test to ensure that the input tensor is not being changed.

(Resolves #1478)
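The general pattern of the fix can be sketched as follows. This is not the exact torchaudio diff; the function name and the fixed start/end parameters are illustrative. The key point is using the out-of-place `masked_fill` (no trailing underscore), which returns a new tensor instead of mutating its input:

```python
import torch

def mask_along_axis_out_of_place(specgram, mask_start, mask_end, mask_value, axis):
    # Build a boolean mask over the chosen axis and apply it with the
    # out-of-place masked_fill, which returns a new tensor and leaves
    # `specgram` untouched (masked_fill_ would modify it in place).
    idx = torch.arange(specgram.size(axis))
    mask = (idx >= mask_start) & (idx < mask_end)
    shape = [1] * specgram.dim()
    shape[axis] = -1  # broadcast the 1-D mask along the target axis
    return specgram.masked_fill(mask.view(shape), mask_value)

specgram = torch.randn(2, 8, 8)
backup = specgram.clone()
masked = mask_along_axis_out_of_place(specgram, 2, 5, mask_value=0.0, axis=1)
```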

@vincentqb
Contributor

vincentqb commented Apr 30, 2021

I agree the functional should not modify in place. However, it is possible someone in the wild was relying on that, and this PR changes the behavior of the function silently. I do not see a way of detecting users who relied on that behavior, though. Should we raise a warning for one release to flag the change, or would calling this a bug fix in the release notes be enough?

On first thought, I'm leaning toward just calling this a bug fix in the release notes. Thoughts?
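Had the transitional-warning route been chosen instead, a minimal sketch could look like the following (the helper name and message text are hypothetical, not from this PR):

```python
import warnings

def _warn_masking_no_longer_inplace():
    # Hypothetical one-release notice that masking is no longer in-place.
    warnings.warn(
        "TimeMasking/FrequencyMasking no longer modify the input tensor "
        "in-place; use the returned tensor instead.",
        UserWarning,
        stacklevel=2,
    )

# Demonstrate that the warning fires exactly once per call.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    _warn_masking_no_longer_inplace()
```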

@carolineechen
Contributor Author

@vincentqb I agree that this should be considered a bug fix, since it is not expected behavior for input parameters to be changed in functional/transforms, and we have never documented this as an in-place operation.

@@ -211,11 +212,13 @@ def test_mask_along_axis(self, shape, mask_param, mask_value, axis):

assert mask_specgram.size() == specgram.size()
assert num_masked_columns < mask_param
self.assertEqual(specgram, specgram_copy)
Contributor

The intention here is to specifically test that the original is not changed. At a minimum, I'd expect a comment justifying why this check is here. However, I would usually expect a separate test for this, so that we know this was the intention of the test. Having a separate test would also make it easier to extend to other functionals if we want to.

Thoughts?
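A hypothetical standalone version of such a dedicated test might look like this. The `mask_along_axis` here is a local stand-in, not torchaudio's actual implementation, and its signature is an assumption for illustration:

```python
import torch
import unittest

def mask_along_axis(specgram, mask_param, mask_value, axis):
    # Local stand-in for the functional under test (assumption: the real
    # code lives in torchaudio.functional). Out-of-place by construction.
    value = int(torch.rand(1).item() * mask_param)
    start = int(torch.rand(1).item() * (specgram.size(axis) - value))
    idx = torch.arange(specgram.size(axis))
    mask = (idx >= start) & (idx < start + value)
    shape = [1] * specgram.dim()
    shape[axis] = -1
    return specgram.masked_fill(mask.view(shape), mask_value)

class TestMaskPreserve(unittest.TestCase):
    def test_mask_along_axis_preserve(self):
        """mask_along_axis should not modify the input Tensor
        https://github.com/pytorch/audio/issues/1478
        """
        torch.random.manual_seed(42)
        specgram = torch.randn(2, 100, 100)
        specgram_copy = specgram.clone()
        # Loop a few times to bound the chance that every draw yields a
        # zero-width mask and the in-place bug goes undetected.
        for _ in range(5):
            mask_along_axis(specgram, mask_param=100, mask_value=0.0, axis=1)
        self.assertTrue(torch.equal(specgram, specgram_copy))
```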

Collaborator

I agree with this point. Each unit test should test only one thing, so that when a unit test fails we know which aspect of the functionality is not met.

Can you make new test methods with a short description, and reference the issue number in the docstring? See examples of test descriptions like:

def test_save_tensor_preserve(self, dtype):
"""save function should not alter Tensor"""

def test_mp3(self):
"""Providing format allows to read mp3 without extension
libsox does not check header for mp3
https://github.com/pytorch/audio/issues/1040
The file was generated with the following command
ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -ar 16000 -f mp3 test_noext
"""


https://github.com/pytorch/audio/issues/1478
"""
torch.random.manual_seed(42)
Collaborator

This is a tricky one. The goal of this test is to ensure that "when a mask is randomly applied, the original Tensor is not altered", and in #1478 we learned that the bug happens stochastically. So we need to control the randomness in a way that always hits the condition that caused the issue in #1478.

Simply setting the random seed has both positive and negative effects here. The positive aspect is that the test is repeatable: assuming the environment (hardware and software, including everything from the OS to PyTorch and CUDA) is the same, we can repeat the test and expect it to produce the same result. The negative aspect is that we cannot be sure this specific seed value will keep hitting the condition on every future configuration (HW/SW). If something about the random generator changes in the future and this seed value stops hitting the condition, the test becomes moot.

There are a couple of approaches to overcome this.

  1. Set the test configuration to always hit the condition of the reported bug.
    For example, if we make the test mask all the elements of the input tensor, we can be sure the test meets the requirement. However, this diverges from the expected usage (masking part of the input, not all of it), and looking at the signature/docstring of the function under test, it is not straightforward to do so. (It could be, but the docstring is hard to understand.)
  2. Patch the random generator for the sake of testing.
    If there is no other solution, we can patch the random generator to behave in our favor. This kind of technique is often used for functions that rely on external resources (HTTP access, for example; moto makes it possible to test an AWS app without an internet connection). But it complicates the test logic, which increases the chance of writing a wrong test.
  3. Bound the probability of the test missing the condition.
    Another approach is to bound the probability that the test hits the correct condition. Say a single attempt hits the bug condition with probability p. If we repeat the procedure n times, and assume the pseudo-random draws are iid (a reasonable assumption in this context, even if not in the strict sense), then the probability that the test hits the condition at least once becomes 1 - (1 - p)^n. For p = 0.6 and n = 10, that is about 99.989 %. The good thing about this approach is that we can still set the seed value once (and only once, at the very beginning) and keep reproducibility.

Contributor Author

I agree with everything here, and option 3 makes the most sense to me. The success case in #1478 (when the input tensor is not changed) occurs when value = torch.rand(1) * mask_param is equal to 0, which results in mask_start == mask_end, so no masking takes place. A mask_param value of 100 in all these tests means a 1 % probability of value == 0, i.e. of not hitting the condition, so I think looping 5 times should be sufficient (> 99.9999 %). What do you think? Also, should this reasoning be explained in the documentation, or is linking the issue sufficient?

Additionally, I ran this test against the previous implementation and saw all of these tests fail, indicating that we do hit the condition (although I agree that the test as-is does not guarantee this will always hold if we change parameters or if the state of the random seed changes).
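Plugging the numbers from this comment into the 1 - (1 - p)^n bound discussed above (taking the 1 % per-attempt miss estimate as given):

```python
p_miss = 0.01        # per-attempt chance that value == 0 with mask_param = 100
n = 5                # proposed number of loop iterations
p_hit = 1 - p_miss ** n
# p_hit = 1 - 1e-10, i.e. greater than 99.9999 %
```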

Collaborator

Additionally, I have run this test on the previous implementation and see that all these tests fail, indicating that we do hit the condition

Good to know. Thanks for taking the proper care.

A mask_param value of 100 in all these tests indicates a 1% probability of value=0 and not hitting the condition. I think therefore looping it over 5 times should be sufficient (> 99.9999%), what do you think?

Yes, 5 sounds good enough.

Also, should this reasoning be explained in the documentation or is linking the issue sufficient?

Yes, explaining it in the docstring is more helpful for future maintainers. (which is how I learned this kind of technique in the past)

Collaborator

@mthrok mthrok left a comment

Looks good. Thanks!

@carolineechen carolineechen merged commit 7fd5fce into pytorch:master May 3, 2021
Successfully merging this pull request may close these issues.

torchaudio.transforms.TimeMasking is inplace
4 participants