
AudioSpectrogram failed to be plotted under ClassificationInterpretation #95

Closed
riven314 opened this issue May 16, 2021 · 9 comments · Fixed by #97
Labels: bug (Something isn't working), released


riven314 commented May 16, 2021

I have made a public Kaggle kernel to reproduce the bug for your reference (see section 3, "Failed to Show AudioSpectrogram", for the main point):
https://www.kaggle.com/alexlwh/failed-to-show-audiospec-tensor

Essentially, I tried to show the predictions vs. the ground truth using ClassificationInterpretation as follows:

# imports needed by this snippet (in the kernel these come from the fastai star imports)
from fastcore.all import L, delegates, is_listy, patch
from fastai.data.core import TfmdDL
from fastai.interpret import ClassificationInterpretation, Interpretation

interp = ClassificationInterpretation.from_learner(learner)

@patch
def __getitem__(self: Interpretation, idxs):
    "Index into the stored inputs/preds/targs/decoded/losses at `idxs`"
    if not is_listy(idxs):
        idxs = [idxs]
    attrs = 'inputs,preds,targs,decoded,losses'
    return L([getattr(self, attr)[idxs] for attr in attrs.split(',')])

@patch
@delegates(TfmdDL.show_results)
def show_at(self: Interpretation, idxs, **kwargs):
    "Show inputs, decoded predictions and targets at `idxs`"
    inp, _, targ, dec, _ = self[idxs]
    self.dl.show_results((inp, dec), targ, **kwargs)

interp.show_at(0)

And then I got the following error:

/opt/conda/lib/python3.7/site-packages/fastaudio/core/spectrogram.py in show(self, ctx, ax, title, **kwargs)
     75     def show(self, ctx=None, ax=None, title="", **kwargs):
     76         "Show spectrogram using librosa"
---> 77         return show_spectrogram(self, ctx=ctx, ax=ax, title=title, **kwargs)
     78 
     79 

/opt/conda/lib/python3.7/site-packages/fastaudio/core/spectrogram.py in show_spectrogram(sg, title, ax, ctx, **kwargs)
     87         ia = ax.inset_axes((i / sg.nchannels, 0.2, 1 / sg.nchannels, 0.7))
     88         z = specshow(
---> 89             channel.cpu().numpy(), ax=ia, **sg._all_show_args(show_y=i == 0), **kwargs
     90         )
     91         ia.set_title(f"Channel {i}")

/opt/conda/lib/python3.7/site-packages/librosa/display.py in specshow(data, x_coords, y_coords, x_axis, y_axis, sr, hop_length, fmin, fmax, tuning, bins_per_octave, key, Sa, mela, thaat, ax, **kwargs)
    843     # Get the x and y coordinates
    844     y_coords = __mesh_coords(y_axis, y_coords, data.shape[0], **all_params)
--> 845     x_coords = __mesh_coords(x_axis, x_coords, data.shape[1], **all_params)
    846 
    847     axes = __check_axes(ax)

/opt/conda/lib/python3.7/site-packages/librosa/display.py in __mesh_coords(ax_type, coords, n, **kwargs)
    915     if ax_type not in coord_map:
    916         raise ParameterError("Unknown axis type: {}".format(ax_type))
--> 917     return coord_map[ax_type](n, **kwargs)
    918 
    919 

/opt/conda/lib/python3.7/site-packages/librosa/display.py in __coord_time(n, sr, hop_length, **_kwargs)
   1194 def __coord_time(n, sr=22050, hop_length=512, **_kwargs):
   1195     """Get time coordinates from frames"""
-> 1196     return core.frames_to_time(np.arange(n + 1), sr=sr, hop_length=hop_length)

/opt/conda/lib/python3.7/site-packages/librosa/core/convert.py in frames_to_time(frames, sr, hop_length, n_fft)
    186     samples = frames_to_samples(frames, hop_length=hop_length, n_fft=n_fft)
    187 
--> 188     return samples_to_time(samples, sr=sr)
    189 
    190 

/opt/conda/lib/python3.7/site-packages/librosa/core/convert.py in samples_to_time(samples, sr)
    304     """
    305 
--> 306     return np.asanyarray(samples) / float(sr)
    307 
    308 

TypeError: float() argument must be a string or a number, not 'NoneType'

I suspect that at some point AudioSpectrogram.sr fails to be propagated to ClassificationInterpretation, leaving interp.inputs.sr = None. (Not sure exactly at which point the issue happens yet):

interp.inputs._all_show_args()
>>
{'x_coords': None,
 'y_coords': None,
 'x_axis': 'time',
 'y_axis': 'mel',
 'sr': None,
 'hop_length': 1024,
 'fmin': None,
 'fmax': None,
 'tuning': 0.0,
 'bins_per_octave': 12,
 'key': 'C:maj',
 'Sa': None,
 'mela': None,
 'thaat': None,
 'cmap': 'viridis'}
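
For reference, a quick way to confirm where sr goes missing is to compare a batch pulled straight from the dataloaders against the tensors gathered by the interpretation object (a sketch, assuming the learner and interp defined above; variable names are illustrative):

xb, yb = learner.dls.one_batch()
print(xb.sr)             # the expected sample rate is still attached here
print(interp.inputs.sr)  # None: the metadata has been lost by this point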
riven314 changed the title from "AudioSpectrogram failed to be shown under ClassificationInterpretation" to "AudioSpectrogram failed to be plotted under ClassificationInterpretation" on May 16, 2021
scart97 added the bug (Something isn't working) label on May 17, 2021

scart97 commented May 17, 2021

I never tried to run ClassificationInterpretation on audio models, so it's not surprising that it's broken. Why did you have to @patch those two methods?

riven314 commented

@scart97
Just to try out a patch trick from Zach's tutorial: https://walkwithfastai.com/basics.interp
The bug persists even if I don't use the patch trick, e.g.:

# same failure without the patches: get predictions and show them directly
inputs, preds, targs, decoded, losses = learner.get_preds(
    dl=None, with_input=True,
    with_loss=True, with_decoded=True, act=None
)
learner.dls[1].show_results((inputs, decoded), targs)


scart97 commented May 18, 2021

Didn't find the root cause yet, but I could isolate the problem better. The issue is not directly related to ClassificationInterpretation: the input tensor doesn't have the correct sr attribute from the beginning when iterating over the data. To make it clearer, if you grab a batch with dls.one_batch() it has all the properties as expected, but iterating over the DataLoader causes problems:

[Screenshot from 2021-05-18 17-19-59: dls.one_batch() vs. iterating over the DataLoader]
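
In code, the comparison in the screenshot is roughly the following (a sketch, assuming the dls from the kernel):

xb, _ = dls.one_batch()
print(type(xb), xb.sr)         # AudioSpectrogram, sr intact

xb, _ = next(iter(dls.train))  # plain iteration over the DataLoader
print(type(xb), xb.sr)         # same type, but sr comes back as None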

I guess the problem needs to be fixed in fastai, but I don't have enough knowledge about the dataloader internals yet to properly understand what is happening here.


riven314 commented May 19, 2021

@scart97
Thanks for narrowing down the scope. I found something strange on top of that:
AudioSpectrogram manages to inherit sr when the iteration has NO multiprocessing, so I think something fishy happens when the DataLoader uses multiprocessing. For better visibility, I have updated my Kaggle kernel to reproduce the bug:
https://www.kaggle.com/alexlwh/failed-to-show-audiospec-tensor

e.g.
[Screenshot from 2021-05-19 23-06-07: iteration with vs. without multiprocessing]
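
A sketch of that check (assuming the kernel's dls; DataLoader.new is fastai's method for cloning a loader with overridden attributes):

# single-process iteration: the batch never crosses a process boundary
xb, _ = next(iter(dls.train.new(num_workers=0)))
print(xb.sr)   # sr preserved

# multiprocess iteration: batches are pickled between worker and main process
xb, _ = next(iter(dls.train.new(num_workers=2)))
print(xb.sr)   # None: metadata dropped in transit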


kevinbird15 commented May 22, 2021

After digging into this with Arto from the fastai Discord today, Arto submitted a PR that fixes TensorBase: fastai/fastai#3383

I believe we can do something similar as a temporary fix, but once we move to pytorch 1.8.1 we should be able to rely on Tensor.__reduce_ex__ directly, since the new version handles object.__dict__ properly (which ends up being the issue in both pytorch 1.7.1 and fastai, for slightly different reasons). Should I create a temporary workaround, similar to what Arto did in fastai, against the current version of fastaudio that uses fastai 2.2.7, or do we want to raise the version requirement to pytorch 1.8.1 instead (which would also fix the issue)?
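
For what it's worth, the failure can be reproduced without any dataloader at all, since the root cause is serialization of the tensor subclass (a minimal sketch against fastai 2.2.7 / pytorch 1.7.1, i.e. before the fix):

import pickle

import torch
from fastai.torch_core import TensorBase

t = TensorBase(torch.randn(2))
t.sr = 16000                           # attach metadata the way AudioTensor does
t2 = pickle.loads(pickle.dumps(t))     # the round trip DataLoader workers perform
print(getattr(t2, "sr", "<missing>"))  # "<missing>" before the fix: __dict__ is dropped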


riven314 commented May 22, 2021

I have tried out the fix, and it resolves the issue I raised here (i.e. Interpretation.show_at failing to work for AudioTensor).
Below is my updated public kernel with the fix:
https://www.kaggle.com/alexlwh/failed-to-show-audiospec-tensor


kevinbird15 commented May 22, 2021

So, adding that code back here, we would need to add something like this somewhere:

# fix to preserve metadata for subclass tensors in serialization
# src: https://github.com/fastai/fastai/pull/3383
import torch
from collections import OrderedDict
from fastcore.basics import patch
from fastai.torch_core import TensorBase, _fa_rebuild_qtensor, _fa_rebuild_tensor

def _rebuild_from_type(func, type, args, dict):
    # rebuild the plain tensor, cast back to the subclass, then restore __dict__
    ret = func(*args).as_subclass(type)
    ret.__dict__ = dict
    return ret

@patch
def __reduce_ex__(self: TensorBase, proto):
    torch.utils.hooks.warn_if_has_hooks(self)
    args = (type(self), self.storage(), self.storage_offset(), tuple(self.size()), self.stride())
    if self.is_quantized: args = args + (self.q_scale(), self.q_zero_point())
    args = args + (self.requires_grad, OrderedDict())
    f = _fa_rebuild_qtensor if self.is_quantized else _fa_rebuild_tensor
    # pass __dict__ along so subclass metadata (e.g. sr) survives pickling
    return (_rebuild_from_type, (f, type(self), args, self.__dict__))

Then we should be able to keep the same fastai 2.2.7 requirement and stick with all the same dependencies, right?

Where should these chunks of code be added in fastaudio? Maybe instead of patching TensorBase, this __reduce_ex__ should be added to AudioTensor? That would be my proposal at least.
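
Either way, a quick sanity check after applying the patch would be the same pickle round trip as above (sketch):

import pickle

import torch
from fastai.torch_core import TensorBase

# with the __reduce_ex__ patch above applied, metadata should now survive
t = TensorBase(torch.randn(2))
t.sr = 16000
assert pickle.loads(pickle.dumps(t)).sr == 16000  # __dict__ round-trips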


scart97 commented May 22, 2021

The fix needs to be applied at TensorBase, as it affects both the audio and the spectrogram tensors. The plan is to keep this patch in fastaudio until fastai ships a version that fixes it upstream; then we bump the minimal fastai version required by the library.
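
For context, both fastaudio tensor types subclass TensorBase, so a single patch at the base covers them (sketch; import paths assumed):

from fastai.torch_core import TensorBase
from fastaudio.core.signal import AudioTensor
from fastaudio.core.spectrogram import AudioSpectrogram

# one patch on TensorBase fixes serialization for both subclasses
assert issubclass(AudioTensor, TensorBase)
assert issubclass(AudioSpectrogram, TensorBase)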


scart97 commented May 22, 2021

🎉 This issue has been resolved in version 0.1.6 🎉

The release is available on the GitHub release page.

Your semantic-release bot 📦🚀
