
Commit

fix a bug
YuanGongND committed May 8, 2022
1 parent a5112ec commit b708675
Showing 2 changed files with 5 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -13,7 +13,7 @@

## News

-May, 2022: It was found that newer `torchaudio` package has different behavior with older ones in time and frequency masking and will cause a bug. Please stick to the version in `requirement.txt`.
+May, 2022: It was found that newer `torchaudio` package has different behavior with older ones in time and frequency masking and will cause a bug. We find a workaround and fixed it.

March, 2022: We released a new preprint [*CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification*](https://arxiv.org/abs/2203.06760), where we proposed a knowledge distillation based method to further improve the AST model performance without changing its architecture.

4 changes: 4 additions & 0 deletions src/dataloader.py
@@ -187,10 +187,14 @@ def __getitem__(self, index):
         freqm = torchaudio.transforms.FrequencyMasking(self.freqm)
         timem = torchaudio.transforms.TimeMasking(self.timem)
         fbank = torch.transpose(fbank, 0, 1)
+        # this is just to satisfy new torchaudio version, which only accept [1, freq, time]
+        fbank = fbank.unsqueeze(0)
         if self.freqm != 0:
             fbank = freqm(fbank)
         if self.timem != 0:
             fbank = timem(fbank)
+        # squeeze it back, it is just a trick to satisfy new torchaudio version
+        fbank = fbank.squeeze(0)
         fbank = torch.transpose(fbank, 0, 1)

         # normalize the input for both training and test
