# Interactive inspection of rainforest recordings
> A (¿hopefully?) helpful but certainly entertaining tool to inspect rainforest recordings. 

## TL;DR

This notebook mainly implements two classes, `SingleRecordingInspector` and `SongOrSpeciesRecordingInspector`. These make use of `ipywidgets` and provide an interactive tool to inspect audio files, highlighting the regions with known species/song labels. 

What they do:
* `SingleRecordingInspector`: 
    * displays one specific spectrogram of interest
    * renders labeled boxes filtered by `f_min`, `f_max` and whether the box label is a true positive (tp) or a false positive (fp)
    * provides audio playback of the file
* `SongOrSpeciesRecordingInspector`: 
    * displays your choice of one or more spectrograms and audio for a specific `species_id` or `songtype_id`
    * renders labeled boxes
    
So, `SingleRecordingInspector` helps to understand what is in a single audio file and `SongOrSpeciesRecordingInspector` helps to understand what a specific signal (by species or song type) looks like across audio files.

Happy clicking! 😁

## References


- blog article about the mel spectrogram ([TDS](https://towardsdatascience.com/getting-to-know-the-mel-spectrogram-31bca3e2d9d0))

Steve Brunton explainer videos: 
- Spectrogram example in python ([yt](https://www.youtube.com/watch?v=TJGlxdW7Fb4))
- The Gabor transform ([yt](https://www.youtube.com/watch?v=EfWnEldTyPA))

Analysis notebooks:
* https://www.kaggle.com/prokaggler/rfcx-species-audio-detection-eda-pytorch
* https://www.kaggle.com/lakshya91/basic-audio-and-data-analysis
* https://www.kaggle.com/alkahapur/exploratory-data-analysis-and-modelling-using-cnn

## Imports

In [None]:
import numpy as np
import pandas as pd
import os
import matplotlib as mpl
import matplotlib.pyplot as plt

import torchaudio
import librosa
import librosa.display

import IPython.display as ipd
from IPython.display import display, HTML, clear_output

import ipywidgets as widgets
from ipywidgets.widgets.interaction import show_inline_matplotlib_plots

from fastcore.all import *

## Setup

In [None]:
base_path = Path('/kaggle/input/rfcx-species-audio-detection/')

## Inspecting a single file
> Preamble to our two `Inspector` 🧐 classes.

First, let's code something to parse flac files using `torchaudio`.

In [None]:
rec_id = 'c12e0a62b'
audio, sample_rate = torchaudio.load(base_path/f'train/{rec_id}.flac')

In [None]:
print(f'audio size: {audio.size()}, sample_rate {sample_rate}')
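
To read the printed shape: by default `torchaudio.load` returns the waveform as channels x samples, so the clip length in seconds is the sample count divided by the sampling rate. Here is a minimal sketch of that arithmetic in plain Python. The helper name `describe_waveform` and the example numbers (one channel, 48 kHz, 2,880,000 samples) are assumptions for illustration, not values read from the dataset.

```python
def describe_waveform(shape, sample_rate):
    """Summarize a (channels, samples) waveform shape."""
    n_channels, n_samples = shape
    return {
        'channels': n_channels,
        'samples': n_samples,
        'duration_s': n_samples / sample_rate,  # seconds of audio
    }

# Hypothetical values; replace with tuple(audio.size()) and sample_rate
info = describe_waveform((1, 2_880_000), 48_000)
print(info)
```

In the notebook itself the same call would be `describe_waveform(tuple(audio.size()), sample_rate)`.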

Second, let's listen to the audio file.

In [None]:
ipd.display(ipd.Audio(data=audio[0], rate=sample_rate))

Third, let's collect our label data.

In [None]:
def get_labels(base_path):
    # train_tp.csv holds the true positives, train_fp.csv the false positives
    df_train_tp = (pd.read_csv(base_path/'train_tp.csv')
                   .assign(positive=True))
    df_train_fp = (pd.read_csv(base_path/'train_fp.csv')
                   .assign(positive=False))
    return pd.concat((df_train_fp, df_train_tp), ignore_index=True)

In [None]:
%%time
df = get_labels(base_path)
df.head()
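
To make the label table easier to picture, here is a small sketch that builds the same kind of frame from two in-memory CSV snippets. It assumes that true-positive rows should carry `positive=True` (the "positive = tp" convention used by the inspectors below). The rows, the column order and the helper name `load_labels` are made up for illustration.

```python
import io
import pandas as pd

header = "recording_id,species_id,songtype_id,t_min,f_min,t_max,f_max\n"
# Hypothetical one-row stand-ins for train_tp.csv and train_fp.csv
tp_csv = io.StringIO(header + "aaa,3,1,1.0,1000.0,2.0,3000.0\n")
fp_csv = io.StringIO(header + "bbb,7,4,5.0,500.0,6.5,900.0\n")

def load_labels(tp_source, fp_source):
    """Concatenate tp and fp labels, flagging tp rows as positive."""
    df_tp = pd.read_csv(tp_source).assign(positive=True)
    df_fp = pd.read_csv(fp_source).assign(positive=False)
    return pd.concat((df_tp, df_fp), ignore_index=True)

toy = load_labels(tp_csv, fp_csv)
# Same masking pattern as used for a single recording
toy_sounds = toy.loc[toy['recording_id'] == 'aaa']
print(toy[['recording_id', 'positive']])
```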

In [None]:
mask = df['recording_id'] == rec_id
df_sounds = df.loc[mask]

Fourth, let's plot the spectrogram and on top of it the labels.

In [None]:
def plot_spectrogram(audio, sample_rate, df_sounds, rec_id,
                     positive_background_color='white',negative_background_color='white',
                     positive_frame_color='green', negative_frame_color='red',
                     positive_font_color='green', negative_font_color='red', figsize=(12,4), dpi=150):
    fig, ax = plt.subplots(nrows=1, ncols=1, figsize=figsize, dpi=dpi)
    ax.tick_params(axis='x', labelsize=10)
    ax.tick_params(axis='y', labelsize=10)

    tmp = librosa.stft(audio[0].numpy())
    tmp_db = librosa.amplitude_to_db(abs(tmp))
    img = librosa.display.specshow(tmp_db, sr=sample_rate, x_axis='time',
                             y_axis='hz',ax=ax)

    boxes = {'positive': [], 'negative': []}
    frames = {'positive': [], 'negative': []}
    for i, row in df_sounds.iterrows():
        xy = (row['t_min'], row['f_min'])
        width = row['t_max'] - row['t_min']
        height = row['f_max'] - row['f_min']
        box = mpl.patches.Rectangle(xy, width, height)
        frame = mpl.patches.Rectangle(xy, width, height)
        if row['positive']:
            boxes['positive'].append(box)
            frames['positive'].append(frame)
        else:
            boxes['negative'].append(box)
            frames['negative'].append(frame)
        msg = f'species: {row["species_id"]}\nsong: {row["songtype_id"]}'
        c = positive_font_color if row['positive'] else negative_font_color
        ax.annotate(msg, xy=(xy[0],xy[1]+height), color=c, fontsize=12)
    for k,c in zip(['positive','negative'],[positive_background_color, negative_background_color]):
        _boxes = mpl.collections.PatchCollection(boxes[k],facecolor=c, lw=2, alpha=.2)
        ax.add_collection(_boxes)
    for k,c in zip(['positive','negative'],[positive_frame_color, negative_frame_color]):
        _frames = mpl.collections.PatchCollection(frames[k],facecolor='None',edgecolor=c, lw=2, alpha=.7)
        ax.add_collection(_frames)
    cax = fig.colorbar(img)
    cax.ax.set_title('dB')
    ax.set_title(f'Spectrogram for recording_id {rec_id}', fontsize=20)
    return fig, ax
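
What the `librosa.stft` / `librosa.amplitude_to_db` pair computes can be sketched with NumPy alone: slice the signal into overlapping windowed frames, take the FFT of each frame, and convert the magnitudes to decibels. This is a simplified sketch, not librosa's implementation (it skips the centering/padding and the `top_db` clipping). The function name `stft_db`, the 48 kHz rate and the 3 kHz test tone are assumptions for illustration.

```python
import numpy as np

def stft_db(signal, n_fft=2048, hop_length=512, amin=1e-5):
    """Simplified magnitude STFT in dB (no padding, no top_db clipping)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    # Each column is one windowed frame of n_fft samples
    frames = np.stack([signal[i * hop_length:i * hop_length + n_fft] * window
                       for i in range(n_frames)], axis=1)
    magnitude = np.abs(np.fft.rfft(frames, axis=0))  # (n_fft//2 + 1, n_frames)
    return 20.0 * np.log10(np.maximum(magnitude, amin))

sample_rate = 48_000                        # assumed sampling rate
t = np.arange(sample_rate) / sample_rate    # one second of samples
tone = np.sin(2 * np.pi * 3000.0 * t)       # synthetic 3 kHz tone
spec_db = stft_db(tone)

# Frequency (Hz) of each row, then the row with the most energy on average
freqs = np.fft.rfftfreq(2048, d=1 / sample_rate)
peak_hz = freqs[spec_db.mean(axis=1).argmax()]
print(spec_db.shape, peak_hz)
```

Rows are frequency bins and columns are time frames, which is the layout `specshow` draws with time on the x-axis and Hz on the y-axis.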

In [None]:
%%time
plot_spectrogram(audio, sample_rate, df_sounds, rec_id);
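
The label boxes can be drawn straight from `t_min`, `t_max`, `f_min`, `f_max` because the axes are in seconds and Hz. To slice the underlying spectrogram array instead, the box has to be converted to frame and bin indices. Here is a rough sketch of that conversion, assuming `n_fft=2048` and `hop_length=512` (the librosa defaults, to the best of my knowledge) and a hypothetical 48 kHz recording. The helper name `box_to_stft_indices` is made up.

```python
def box_to_stft_indices(t_min, t_max, f_min, f_max, sample_rate,
                        n_fft=2048, hop_length=512):
    """Map a time/frequency box to (frame range, frequency-bin range)."""
    frames_per_second = sample_rate / hop_length
    bins_per_hz = n_fft / sample_rate
    frame_range = (int(t_min * frames_per_second), int(t_max * frames_per_second))
    bin_range = (int(f_min * bins_per_hz), int(f_max * bins_per_hz))
    return frame_range, bin_range

# Hypothetical box: 1.0-2.5 s, 1500-6000 Hz
frame_range, bin_range = box_to_stft_indices(1.0, 2.5, 1500.0, 6000.0,
                                             sample_rate=48_000)
print(frame_range, bin_range)
```

With `tmp_db` from `plot_spectrogram`, the box contents would then be roughly `tmp_db[bin_range[0]:bin_range[1], frame_range[0]:frame_range[1]]`.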

Lastly, let's collect all the file paths.

In [None]:
def get_flac_files(path:Path, condition=lambda x: x.suffix == '.flac'):
    return {val.name: val for val in path.ls().filter(condition)}

In [None]:
%%time
files = {k: get_flac_files(base_path/k) for k in ['train','test']}
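
`path.ls().filter(...)` comes from fastcore. For readers without fastcore, the same name-to-path dictionary can be sketched with the standard library only. The function name `get_flac_files_stdlib` and the temporary files are made up for the demo.

```python
import tempfile
from pathlib import Path

def get_flac_files_stdlib(path, suffix='.flac'):
    """Map file name -> Path for every file in `path` with the given suffix."""
    return {p.name: p for p in sorted(Path(path).iterdir()) if p.suffix == suffix}

# Demo on a throwaway directory with two .flac files and one other file
with tempfile.TemporaryDirectory() as tmp:
    for name in ('a.flac', 'b.flac', 'notes.txt'):
        (Path(tmp) / name).touch()
    found = get_flac_files_stdlib(tmp)
    names = sorted(found)

print(names)
```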

## The big show using `ipywidgets`
> Here we "only" re-use the above functions.

Both inspector (🧐) classes have the same structure. So let's create a template class. 

In [None]:
class SoundInspectorTemplate:
    'Template for inspector GUIs'
    def __init__(self, files, df):
        self.files = files
        self.df = df
            
    def plot(self, change):
        'Define what is plotted here'
        pass
    
    def render(self):
        'Define widgets and their arrangement here'
        pass

### Inspecting a single recording

In [None]:
@delegates() # fun decorator from fastcore: https://fastcore.fast.ai/meta.html#delegates
class SingleRecordingInspector(SoundInspectorTemplate):
    'Inspecting a single recording'
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        
    def _update_file_options(self, new_set, f_min, f_max, sample_type):
        if len(new_set) == 0: return []
        if f_max <= f_min: return []
        if len(sample_type) == 0: return []
        files = list(self.files[new_set].keys())
        types = [True] if sample_type == 'positive' else [False] if sample_type == 'negative' else [True,False]
        if new_set == 'train':
            mask_fmin = (self.df['f_min']>=f_min) 
            mask_fmax = (self.df['f_max']<=f_max) 
            mask_type = (self.df['positive'].isin(types))
            rec_ids = self.df.loc[mask_fmin & mask_fmax & mask_type,'recording_id'].unique()
            files = [f'{v}.flac' for v in rec_ids if f'{v}.flac' in files]
        return files
    
    def update_file_options_set(self, new_set):
        return self._update_file_options(new_set, self.f_min.value, self.f_max.value, self.samples.value)
    
    def update_file_options_fmin(self, f_min):
        return self._update_file_options(self.set.value, f_min, self.f_max.value, self.samples.value)
    
    def update_file_options_fmax(self, f_max):
        return self._update_file_options(self.set.value, self.f_min.value, f_max, self.samples.value)
    
    def update_file_options_sample_type(self, sample_type):
        return self._update_file_options(self.set.value, self.f_min.value, self.f_max.value, sample_type)    
    
    def plot(self, change):
        audio, sample_rate = torchaudio.load(self.files[self.set.value][self.file.value])
        rec_id = self.file.value.split('.')[0]
        mask = self.df['recording_id'] == rec_id
        msg = f'''
        <b>Visualizing {self.set.value}/{self.file.value}</b><br>
        <ul>
        <li>Path: {self.files[self.set.value][self.file.value]}</li>
        <li>Sampling rate: {sample_rate}</li>
        <li>Tensor size: {" x ".join(map(str,tuple(audio.size())))}</li>
        '''
        if self.samples.value == 'positive':
            mask = mask & (self.df['positive'] == True)
        if self.samples.value == 'negative':
            mask = mask & (self.df['positive'] == False)
        if self.set.value == 'train':
            mask = mask & (self.df['f_min'] >= self.f_min.value) & (self.df['f_max'] <= self.f_max.value)
            msg += f'''
        <li># species: {self.df.loc[mask,"species_id"].nunique()}</li>
        <li># song types: {self.df.loc[mask,"songtype_id"].nunique()}</li>
        </ul>
        {self.df.loc[mask].to_html()}
        '''
        else:
            msg += '</ul>' # the test set has no labels, so only close the list
        df = self.df.loc[mask]
        with self.out:
            clear_output()
            self.text.value = msg
            ipd.display(
                ipd.Audio(data=audio[0], rate=sample_rate),
                plot_spectrogram(audio, sample_rate, df, rec_id)
            );
            show_inline_matplotlib_plots() # shortest code i've found to remove matplotlib charts during update
    
    def render(self):
        self.set = widgets.Dropdown(value='train', options=['train','test'])
        self.file = widgets.Combobox(placeholder='Choose a file', value='c12e0a62b.flac',
                                     options=list(self.files['train'].keys()))
        self.samples = widgets.Dropdown(value='all', options=['positive', 'negative', 'all'])
        self.f_min = widgets.FloatText(value=0, placeholder='enter f_min')
        self.f_max = widgets.FloatText(value=5e4, placeholder='enter f_max')
        self.dlink_set_files = widgets.dlink((self.set,'value'),(self.file,'options'),self.update_file_options_set)
        self.dlink_fmin_files = widgets.dlink((self.f_min,'value'),(self.file,'options'),self.update_file_options_fmin)
        self.dlink_fmax_files = widgets.dlink((self.f_max,'value'),(self.file,'options'),self.update_file_options_fmax)
        self.dlink_sample_files = widgets.dlink((self.samples,'value'),(self.file,'options'),self.update_file_options_sample_type)
        self.out = widgets.Output()
        self.text = widgets.HTML()
        self.submit = widgets.Button(description='Submit')
        self.submit.on_click(self.plot)
        
        return widgets.HBox([
            widgets.VBox([widgets.Label('source'),widgets.Label('to highlight'),
                          widgets.Label('f_min (Hz)'),widgets.Label('f_max (Hz)'),widgets.Label('file')]),
            widgets.VBox([self.set,self.samples,self.f_min,self.f_max,self.file,self.submit,self.out,self.text])
        ])
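
The `widgets.dlink((source, 'value'), (target, 'options'), transform)` calls in `render` are what keep the `file` options in sync with the other fields: whenever the source trait changes, the transform is called with the new value and its result is written to the target trait. To the best of my knowledge `ipywidgets` re-exports `dlink` from `traitlets`, so the mechanism can be sketched without any GUI. The classes `SplitChooser` and `FileChooser` and the file names below are hypothetical stand-ins for the widgets.

```python
from traitlets import HasTraits, List, Unicode, dlink

class SplitChooser(HasTraits):
    """Stand-in for the `source` dropdown."""
    value = Unicode('train')

class FileChooser(HasTraits):
    """Stand-in for the `file` combobox."""
    options = List()

# Hypothetical file listing per split
FILES = {'train': ['a.flac', 'b.flac'], 'test': ['c.flac']}

def files_for(split):
    """Transform: new source value -> new target options."""
    return FILES.get(split, [])

source, target = SplitChooser(), FileChooser()
link = dlink((source, 'value'), (target, 'options'), files_for)

initial_options = list(target.options)  # set once when the link is created
source.value = 'test'                   # triggers files_for('test')
updated_options = list(target.options)
print(initial_options, updated_options)
```

The link is one-way: changing `target.options` by hand would not touch `source.value`.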

**How to use:**

First set each of the fields:
* `source` = source of the recording (`train` or `test` set)
* `to high...(light)` = which labels to draw boxes for (`positive` = tp, `negative` = fp, `all` = ¯\\\_(ツ)\_/¯)
* `f_min` = the lowest allowed frequency for a box (only `train` set)
* `f_max` = the highest allowed frequency for a box (only `train` set)
* `file` = the filename of the recording to render (the list will update based on the values in the other fields)

Hit "Submit"!

In [None]:
gui = SingleRecordingInspector(files, df)
gui.render()
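
The way the `file` list shrinks as the fields change can be tried outside the GUI. Below is a minimal sketch of the same filter on a toy label table: a recording stays in the list if at least one of its boxes lies fully inside `[f_min, f_max]` and matches the chosen label type. The table, the file names and the function name `filter_files` are made up for illustration.

```python
import pandas as pd

# Hypothetical labels: recording 'ccc' has one tp box and one fp box
labels = pd.DataFrame({
    'recording_id': ['aaa', 'bbb', 'ccc', 'ccc'],
    'f_min': [500.0, 2000.0, 100.0, 4000.0],
    'f_max': [1500.0, 6000.0, 900.0, 9000.0],
    'positive': [True, False, True, False],
})
available = ['aaa.flac', 'bbb.flac', 'ccc.flac']

def filter_files(labels, available, f_min, f_max, sample_type='all'):
    """File names with at least one matching box inside [f_min, f_max]."""
    if f_max <= f_min:
        return []
    types = {'positive': [True], 'negative': [False]}.get(sample_type, [True, False])
    mask = ((labels['f_min'] >= f_min) & (labels['f_max'] <= f_max)
            & labels['positive'].isin(types))
    rec_ids = labels.loc[mask, 'recording_id'].unique()
    return [f'{r}.flac' for r in rec_ids if f'{r}.flac' in available]

low_band = filter_files(labels, available, 0, 2000)
only_fp = filter_files(labels, available, 0, 10_000, 'negative')
print(low_band, only_fp)
```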

### Inspecting multiple recordings

In [None]:
@delegates() # fun decorator from fastcore: https://fastcore.fast.ai/meta.html#delegates
class SongOrSpeciesRecordingInspector(SoundInspectorTemplate):
    'Inspecting multiple recordings for one species or song type'
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
                
    def plot(self, change):
        mask = self.df['recording_id'].isin(self.recordings.value)
        mask = mask & (self.df[f'{self.mode.value}_id'] == int(self.selection.value))
        df = self.df.loc[mask]
        msg = f'''
        <b>Visualizing {self.mode.value}_id == {self.selection.value}</b> in recordings: {", ".join(self.recordings.value)}
        '''
        self.text.value = msg
        for i,(rec_id, _df) in enumerate(df.groupby('recording_id')):
            with self.out.children[i]:            
                clear_output()
                audio, sample_rate = torchaudio.load(self.files['train'][f'{rec_id}.flac'])
                ipd.display(
                    ipd.Audio(data=audio[0], rate=sample_rate),
                    plot_spectrogram(audio, sample_rate, _df, rec_id)
                );
                show_inline_matplotlib_plots() # shortest code i've found to remove matplotlib charts during update
    
    def update_selection_options(self,change):
        'updating self.selection.options based on self.mode.value'
        if len(change) == 0: return []
        col = f'{change}_id'
        return tuple(map(str,self.df[col].unique()))
    
    def _update_recording_options(self,selection,sample):
        if (len(selection) == 0) or (len(sample) == 0): return []
        col = f'{self.mode.value}_id'
        sample_options = [True] if (sample == 'positive') else [False] if (sample == 'negative') else [True,False]
        mask = (self.df[col] == int(selection)) & (self.df['positive'].isin(sample_options))
        return tuple(self.df.loc[mask,'recording_id'].unique())
        
    def update_recording_options_selection(self,change):
        'updating self.recordings.options based on self.selection.value'
        return self._update_recording_options(change,self.samples.value)
    
    def update_recording_options_samples(self,change):
        'updating self.recordings.options based on self.samples.value'
        return self._update_recording_options(self.selection.value,change)
                
    def _update_tab_titles(self):
        rec_ids = self.recordings.value
        self.out.children = [widgets.Output() for _ in rec_ids]
        for i, rec_id in enumerate(rec_ids):
            self.out.set_title(i,f'rec_id: {rec_id}')
        
    def update_tab_titles(self, change):
        if (change['owner'] != self.recordings) or (change['type'] != 'change') or isinstance(change['new'],dict) or (change['new']==change['old']) or (len(change['new']) == 0):
            return
        self._update_tab_titles()
    
    def render(self):
        self.mode = widgets.Dropdown(value='species', options=['species','songtype'])
        self.selection = widgets.Combobox(placeholder='Choose a species/song type', value='3',
                                     options=list(map(str,self.df['species_id'].unique())))
        self.dlink_mode_selection = widgets.dlink((self.mode,'value'),(self.selection, 'options'), self.update_selection_options)
        recs = self.df.loc[self.df['species_id']==3,'recording_id'].unique()
        self.recordings = widgets.SelectMultiple(options=recs,value=['c12e0a62b'])
        self.samples = widgets.Dropdown(value='all', options=['positive', 'negative', 'all'])
        self.dlink_selection_recordings = widgets.dlink((self.selection, 'value'),(self.recordings,'options'), self.update_recording_options_selection)
        self.dlink_samples_recordings = widgets.dlink((self.samples, 'value'),(self.recordings,'options'), self.update_recording_options_samples)
        self.out = widgets.Accordion(children=[widgets.Output() for _ in self.recordings.value])
        self._update_tab_titles()
        self.recordings.observe(self.update_tab_titles)
        self.text = widgets.HTML()
        self.submit = widgets.Button(description='Submit')
        self.submit.on_click(self.plot)
        
        return widgets.HBox([
            widgets.VBox([widgets.Label('id type'),widgets.Label('species/song id'),widgets.Label('to highlight'),
                          widgets.Label('recording ids'),]),
            widgets.VBox([self.mode,
                          self.selection,
                          self.samples,
                          self.recordings,
                          self.submit,self.text,self.out]),
        ])

**How to use:**

First set each of the fields:
* `id type` = the kind of id (`species` or `songtype`) to filter the recordings by
* `species/...(songtype)` = which species or song type id to filter the recordings by
* `to high...(light)` = which labels to draw boxes for (`positive` = tp, `negative` = fp, `all` = ¯\\\_(ツ)\_/¯)
* `recording ids` = the `recording_id` values to render (hold ctrl or shift to select several). The list of available `recording_id` values updates based on the other fields, which is sometimes not instantaneous.

Hit "Submit"!

Note: To avoid overwhelming the page, the output for each selected recording is hidden in its own collapsible accordion section. To inspect the results, click the respective section and 🧐.

In [None]:
gui = SongOrSpeciesRecordingInspector(files, df)
gui.render()
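
The selection logic behind this inspector can also be sketched on a toy table. As far as the code above goes, the `to highlight` field only narrows the list of recordings on offer, while `plot` then draws every box with the chosen id in the selected recordings. The data and the helper name `recordings_for` are made up for illustration.

```python
import pandas as pd

# Hypothetical labels: 'aaa' has a tp and an fp box for species 3
labels = pd.DataFrame({
    'recording_id': ['aaa', 'aaa', 'bbb', 'ccc'],
    'species_id': [3, 3, 3, 7],
    'songtype_id': [1, 1, 1, 4],
    'positive': [True, False, True, True],
})

def recordings_for(labels, mode, selection, sample='all'):
    """Recording ids with at least one matching label for `{mode}_id`."""
    types = {'positive': [True], 'negative': [False]}.get(sample, [True, False])
    mask = (labels[f'{mode}_id'] == int(selection)) & labels['positive'].isin(types)
    return tuple(labels.loc[mask, 'recording_id'].unique())

recs = recordings_for(labels, 'species', '3', 'positive')

# What plot() would iterate over: one group of boxes per selected recording
selected = labels[labels['recording_id'].isin(recs) & (labels['species_id'] == 3)]
boxes_per_recording = {rec_id: len(group)
                       for rec_id, group in selected.groupby('recording_id')}
print(recs, boxes_per_recording)
```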

That's it! 🥳