Toggle Spectrogram Preview for Audio #384

Path-A · 2020-08-03T19:50:16Z

Is your feature request related to a problem? Please describe.
Classifying or segmenting audio with only a waveform preview can be time-consuming or difficult, especially with noisy audio data. Some data is more easily segmented by looking at frequency content over time.

Describe the solution you'd like
Include a toggle to preview a spectrogram representation of an audio clip. Common Python libraries that can generate these include librosa and scipy.signal.
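For readers who want to see what such a preview involves, here is a minimal sketch of a log-scaled (decibel) spectrogram built from a Hann-windowed short-time Fourier transform. It uses only NumPy so that the steps are visible; `scipy.signal.spectrogram` or librosa would do the same work in fewer lines. The function name `log_spectrogram` and the defaults (`n_fft=512`, `hop=256`) are assumptions for illustration, not settings taken from Label Studio.

```python
import numpy as np

def log_spectrogram(samples, sample_rate, n_fft=512, hop=256):
    """Return (times, freqs, db) for a mono signal using a Hann-windowed STFT."""
    samples = np.asarray(samples, dtype=float)
    if len(samples) < n_fft:
        raise ValueError("signal is shorter than one FFT frame")
    window = np.hanning(n_fft)
    n_frames = 1 + (len(samples) - n_fft) // hop
    # Slice the signal into overlapping frames, one row per frame.
    frames = np.stack([samples[i * hop : i * hop + n_fft] for i in range(n_frames)])
    # Magnitude of the one-sided FFT of each windowed frame.
    magnitude = np.abs(np.fft.rfft(frames * window, axis=1))
    # Decibels relative to the loudest bin; the floor avoids log(0).
    peak = max(magnitude.max(), 1e-10)
    db = 20.0 * np.log10(np.maximum(magnitude, 1e-10) / peak)
    # Frame centre times in seconds and bin centre frequencies in Hz.
    times = (np.arange(n_frames) * hop + n_fft / 2) / sample_rate
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return times, freqs, db.T  # db has shape (n_freqs, n_frames)
```

Passing the returned `db` array to any image-drawing routine, with `times` on the horizontal axis and `freqs` on the vertical axis, gives the kind of view requested here.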

Describe alternatives you've considered
I've manually generated the spectrograms and saved them as images to be used within the image classification labeling tool. The downsides of this are threefold.

1. Labeling audio this way does not allow for temporal segmentation. The user must classify the entire spectrogram, not simply a vertical slice of it. A user could, in theory, use the image annotation tool, but it would be tedious, and the user would need to convert each bounding box to its corresponding time range in the audio clip.
2. The user can no longer listen to the audio clip while viewing the spectrogram image.
3. The user-generated spectrograms temporarily require additional storage.
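To make the bounding-box conversion in the first point concrete, here is a small sketch. It assumes the spectrogram image spans the whole clip with a linear time axis and no margins or axis labels; the function and parameter names are hypothetical.

```python
def box_to_time_range(x_px, width_px, image_width_px, clip_duration_s):
    """Map a box's horizontal extent on a spectrogram image to (start_s, end_s)."""
    if image_width_px <= 0:
        raise ValueError("image width must be positive")
    # Fraction of the image width, scaled to the clip duration.
    start = x_px / image_width_px * clip_duration_s
    end = (x_px + width_px) / image_width_px * clip_duration_s
    # Clamp to the clip so boxes drawn past the edges stay valid.
    return max(0.0, start), min(clip_duration_s, end)
```

For example, a box starting at pixel 250 and 500 pixels wide on a 1000-pixel image of a 10-second clip maps to the range from 2.5 s to 7.5 s.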

Additional context
Each user's spectrogram needs may differ, such as their sound of interest being within the low or high frequency areas of the spectrogram. To keep implementation simple, use default spectrogram parameters that generalize well and potentially allow users to zoom in on this general spectrogram. A more robust solution would allow the user to specify a few parameters to generate the spectrogram that they would want. Lastly, I include an example of a log-scaled spectrogram with its accompanying waveform.

makseq · 2020-08-03T20:55:25Z

@Path-A Thank you for your issue. Are you familiar with React?

Path-A · 2020-08-04T00:05:01Z

@makseq Unfortunately not, although I've been wanting to learn.

feddybear · 2021-04-26T06:48:55Z

For future reference, I think the challenge here is more about understanding the multicanvas code of wavesurfer.
I was able to successfully use the spectrogram functions but could only implement it on the older single canvas implementation. Unfortunately, for long audio files, this is impractical because of the need to recalculate and redraw spectra when zooming, etc. If pre-segmentation is done, it becomes more practical. But segmentation of audio (especially for speech technology applications) isn't perfect.
Here's a sample demonstration as reference: LINK
It takes about 12-14 seconds per zoom value on a 3-minute file (using N=512 fft samples).
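One common way to avoid recomputing spectra on every zoom is to compute the short-time spectra once at the finest time resolution and then merge neighbouring columns for coarser zoom levels. The sketch below illustrates that idea only; it is not the approach used in the demonstration above, and the helper names, the hop of 256 samples, and the 44.1 kHz sample rate in the usage note are assumptions.

```python
def frame_count(n_samples, n_fft=512, hop=256):
    """Number of full STFT frames that fit in a signal of n_samples."""
    if n_samples < n_fft:
        return 0
    return 1 + (n_samples - n_fft) // hop

def pool_columns(columns, factor):
    """Merge every `factor` adjacent spectrogram columns by per-bin maximum."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    pooled = []
    for start in range(0, len(columns), factor):
        group = columns[start : start + factor]
        # zip(*group) walks bin by bin across the grouped columns.
        pooled.append([max(bin_values) for bin_values in zip(*group)])
    return pooled
```

With these assumptions, a 3-minute file at 44.1 kHz yields `frame_count(180 * 44100)`, which is 31006 columns at N=512. Zooming out by a factor of 8 then only requires pooling those cached columns rather than running the FFTs again.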

makseq · 2021-04-26T10:02:01Z

@feddybear Wow! It's very impressive! Do you have an account on our slack? https://label-studio.slack.com/

feddybear · 2021-04-26T11:35:24Z

> @feddybear Wow! It's very impressive! Do you have an account on our slack? https://label-studio.slack.com/

Hi @makseq, yeah, I also mentioned this in one of the spectrogram inquiries there. But I'm leaving it to someone more capable, especially in reading the wavesurfer multicanvas code. Hopefully it's also someone who knows signal processing, as the older implementation of spectrogram drawing in wavesurfer had some really weird canvas settings that didn't make sense (e.g. the height of the spectrogram).

makseq · 2021-04-26T15:47:01Z

@feddybear Let's move to slack. I know DSP, also we'll include our frontend team there.

makseq · 2021-04-26T15:48:45Z

@feddybear please, tag me again (@makseq). I can't find my mention there.

Tom-Lu · 2021-05-21T16:52:23Z

I'm also looking for a similar feature. Any progress so far?

makseq · 2021-05-22T22:06:25Z

@Tom-Lu
We have some news from our contributor:
https://github.com/feddybear/label-studio-frontend

I hope we will make this work to the end.

tpeet · 2021-07-08T07:05:10Z

I would also be very interested in this feature, as it is currently hard to select regions in audio with low SNR.

Path-A · 2021-09-30T17:12:36Z

Has there been any progress regarding development of this feature?

makseq · 2021-09-30T21:42:29Z

Only if @feddybear has any news.
We are currently focusing on the image / html tagging. Audio updates are planned for next year.

feddybear · 2021-10-01T06:04:56Z

Sorry, I have yet to integrate the spectrogram-related edits from the previous version to the latest one. Also, kinda occupied with other stuff outside of annotation.

Selimonder · 2022-02-22T17:28:17Z

Hello,

Has there been any progress regarding the development of this feature?

mikolajpabiszczak · 2022-05-18T09:59:58Z

I know it's irritating with people asking again & again, but it would be really useful to have this, any news about this?

makseq · 2022-05-19T23:38:04Z

Thank you for asking. We prioritize features based on your activity, so it isn't irritating :-)
@nicholasrq made some progress on the Audio Plus Engine, but I heard that we still haven't implemented spectrograms :-( I will draw our team's attention to this feature request.

faroit · 2022-05-24T15:14:37Z

@makseq @feddybear also 👍 for spectrograms!

mcgee0916 · 2023-05-25T00:57:57Z

Hello,
Has there been any progress on this feature?

cspindler · 2023-11-27T15:39:50Z

👍 Yes, spectrogram annotation would be fantastic. But it would be just the start - with the spectrogram view available, the following features would be super useful:

RectangleLabels in the spectrogram (time and frequency range)
Audio playback speed control (with and without time-stretching), especially slowing down to 0.1 of original speed.
Choices of frequency scaling (lin, log, mel, bark)
Zoom (time and frequency)
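As a rough illustration of the frequency-scaling choices in this list, the sketch below maps a frequency in Hz to a relative height on the vertical axis for each scale. The mel formula is the widely used HTK-style variant and the Bark formula is Traunmüller's approximation; other variants exist, and the names `SCALES` and `axis_position` are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """HTK-style mel scale: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def hz_to_bark(f_hz):
    """Traunmüller's approximation of the Bark scale."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

# Each entry warps a frequency in Hz onto the chosen axis.
SCALES = {
    "lin": lambda f: f,
    "log": lambda f: math.log10(f),  # requires f > 0
    "mel": hz_to_mel,
    "bark": hz_to_bark,
}

def axis_position(f_hz, f_min, f_max, scale="lin"):
    """Relative height (0 = bottom, 1 = top) of f_hz on the chosen axis."""
    warp = SCALES[scale]
    return (warp(f_hz) - warp(f_min)) / (warp(f_max) - warp(f_min))
```

The same mapping, inverted, would turn the top and bottom edges of a rectangle drawn on the spectrogram back into a frequency range.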

I'm not familiar with React at all, but that could change.

We're building a pipeline to annotate bats in high-frequency audio recordings, based on batdetect2. They also have a labelling UI that checks many of the boxes UI-wise, except for the handling of large numbers of tasks, users, storage backends, etc. - all the golden labelstud.io features.

paulpeyret-biophonia · 2024-01-30T09:34:20Z

Hello,

I would love to see this feature for spectrogram annotation with sound playback. There is huge demand from the whole bioacoustics and eco-acoustics community (who are still working with desktop apps like Audacity and Raven for annotation).
@cspindler I think you well described the need.

I understand that spectrogram calculation speed is a bottleneck here for the zooming feature inside the spectrogram.
Maybe these libs can help get decent speeds.
libAudioFlux/audioFlux#22

Hoping this feature will come soon.
Cheers

samvelkoch · 2024-04-25T20:32:54Z

Up for specs in audio labeling

sajarin · 2024-05-01T19:30:05Z

/jira create

Workflow run
Jira issue TRIAG-527 is created

DK2895 · 2025-01-20T12:25:57Z

Hi guys (@sajarin @makseq @feddybear),

Has there been any further progress on this feature? Thanks

isaac-jordan · 2025-02-04T12:01:25Z

+1 on wanting this spectrogram tooling. The lack of it is forcing us to consider using other tools for audio labelling.

l4j3b · 2025-02-07T20:38:36Z

@isaac-jordan Would you mind sharing the other tools you are considering please? I am in the same situation.

isaac-jordan · 2025-02-08T12:33:55Z

> @isaac-jordan Would you mind sharing the other tools you are considering please? I am in the same situation.

Two front-runners for us are https://github.com/mbsantiago/whombat and https://www.wildlifeacoustics.com/products/kaleidoscope

shemerey · 2025-02-11T09:48:58Z

By the way, the repo even contains web/node_modules/wavesurfer.js/src/plugin/spectrogram/fft.js. I thought that we could enable the plugin and use it, but unfortunately it looks like wavesurfer.js is not used any more.

xixinzhang · 2025-02-14T05:05:05Z

It's helpful but seems hard

Path-A assigned deppp Aug 3, 2020

niklub added the feature Feature request label Mar 29, 2021

makseq added this to the Label Studio 1.3 milestone Jul 9, 2021

niklub removed this from the Label Studio 1.3 milestone Aug 30, 2021

makseq added this to the Future milestone Oct 4, 2021

makseq added often asked audio editor Label Studio Frontend labels Oct 5, 2021

sajarin added community:reviewed Issue has been reviewed by the Label Studio Community Team. community:feature-request Feature Request from the community reviewed by the community team. labels May 1, 2024
