fix a bug

YuanGongND · May 8, 2022 · b708675
1 parent a5112ec

Showing 2 changed files with 5 additions and 1 deletion.
diff --git a/README.md b/README.md
@@ -13,7 +13,7 @@
 
 ## News
 
-May, 2022: It was found that newer `torchaudio` package has different behavior with older ones in time and frequency masking and will cause a bug. Please stick to the version in `requirement.txt`.
+May, 2022: We found that newer versions of the `torchaudio` package behave differently from older ones in time and frequency masking, which caused a bug. We found a workaround and fixed it.
 
 March, 2022: We released a new preprint [*CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification*](https://arxiv.org/abs/2203.06760), where we proposed a knowledge distillation based method to further improve the AST model performance without changing its architecture.
 

diff --git a/src/dataloader.py b/src/dataloader.py
@@ -187,10 +187,14 @@ def __getitem__(self, index):
         freqm = torchaudio.transforms.FrequencyMasking(self.freqm)
         timem = torchaudio.transforms.TimeMasking(self.timem)
         fbank = torch.transpose(fbank, 0, 1)
+        # newer torchaudio versions only accept input shaped [1, freq, time], so add a leading dimension
+        fbank = fbank.unsqueeze(0)
         if self.freqm != 0:
             fbank = freqm(fbank)
         if self.timem != 0:
             fbank = timem(fbank)
+        # squeeze the leading dimension back out; it was added only to satisfy newer torchaudio versions
+        fbank = fbank.squeeze(0)
         fbank = torch.transpose(fbank, 0, 1)
 
         # normalize the input for both training and test