# Segmentation tool tutorial
pyannote**book** is a custom [Jupyter widget](https://ipywidgets.readthedocs.io/en/stable/) built on top of [pyannote.core](http://pyannote.github.io/pyannote-core/) and [wavesurfer.js](https://wavesurfer-js.org/).

It can be used to visualize and edit temporal audio labels. 

### Before you start:
This notebook lets you select one file at a time from the "audio" folder, automatically looks for the corresponding .rttm file in the "rttm_original" folder (if any), and displays the audio and its labeled segments in the pyannotebook widget.
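The pairing rule (same base name, different folder and extension) can be sketched with the standard library alone. The helper name `find_rttm` is hypothetical and not part of pyannotebook; the default folder is the one used in this notebook.

```python
from pathlib import Path
from typing import Optional


def find_rttm(audio_file: str, rttm_dir: str = "./rttm_original") -> Optional[Path]:
    """Return the .rttm path that shares the audio file's base name, or None.

    Hypothetical helper: it only checks for a file on disk, it does not parse it.
    """
    candidate = Path(rttm_dir) / (Path(audio_file).stem + ".rttm")
    return candidate if candidate.is_file() else None
```

Calling `find_rttm("sample.wav")` would return `rttm_original/sample.rttm` when that file exists, and `None` otherwise, in which case the widget simply starts without labels.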

The code is not complete yet: you still need to add a snippet that runs the vtc model, or an alternative (Audacity- or Praat-based), to obtain the starting .rttm files.
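As one possible starting point for the Audacity route, here is a minimal sketch that converts an exported Audacity label track into RTTM lines. It rests on two assumptions: that the labels were exported as tab-separated `start`, `end`, `label` rows, and that the ten-field `SPEAKER` line layout below matches what the rest of your pipeline expects (check it against an existing .rttm file). The function name `audacity_labels_to_rttm` is hypothetical.

```python
def audacity_labels_to_rttm(labels_text: str, uri: str) -> str:
    """Convert Audacity label-track text into RTTM lines (a sketch, not a full parser).

    Assumes each useful row is: start<TAB>end<TAB>label, with times in seconds.
    """
    lines = []
    for raw in labels_text.splitlines():
        fields = raw.split("\t")
        if len(fields) < 3 or not fields[2].strip():
            continue  # skip blank or unlabeled rows
        try:
            start, end = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # skip rows that do not start with two numbers
        duration = end - start
        if duration <= 0:
            continue  # skip point labels, which have no duration
        # RTTM fields are space-separated, so labels must not contain spaces
        label = fields[2].strip().replace(" ", "_")
        lines.append(
            f"SPEAKER {uri} 1 {start:.3f} {duration:.3f} <NA> <NA> {label} <NA> <NA>"
        )
    return "\n".join(lines) + ("\n" if lines else "")
```

The returned text could then be written to `./rttm_original/<name>.rttm`, with `uri` set to the audio file's base name so that it matches the file selected in the notebook.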

In [30]:
# imports
import ipywidgets as widgets
from IPython.display import display
from pyannotebook import Pyannotebook, load_rttm
from pyannote.core import Annotation, Segment
import os
from pathlib import Path

In [31]:
# utility functions
def create_file_selector(path):
    if not os.path.isdir(path):
        print(f"Path {path} doesn't exist.")
        return

    file_list = [f for f in os.listdir(path) if os.path.isfile(os.path.join(path, f))]
    if not file_list:
        print(f"No files found in {path}.")
        return

    selector = widgets.Dropdown(
        options=file_list,
        description='File List:',
        disabled=False,
    )
    return selector

### Let's select the audio file we want to work with

In [32]:
path = './audio'  # replace with your path
file_selector = create_file_selector(path)

if file_selector:
    display(file_selector)

Dropdown(description='File List:', options=('sample.wav',), value='sample.wav')

In [33]:
if file_selector:
    selected_file = file_selector.value
    print(f"Selected file: {selected_file}")
    # strip the extension, whatever its length (e.g. "sample.wav" -> "sample")
    filename = Path(selected_file).stem

Selected file: sample.wav


### Let's correct or create and label the audio segments of interest

In [34]:
# instantiate annotation widget
widget = Pyannotebook(f"./audio/{filename}.wav")
widget

Pyannotebook(children=(WavesurferWidget(active_label='a', audio='data:audio/x-wav;base64,UklGRjJMHQBXQVZFZm10I…

In [35]:
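# load existing annotations (if any) and assign them to the `annotation` property
rttm_path = f"./rttm_original/{filename}.rttm"
if Path(rttm_path).exists():
    # load_rttm returns a dict keyed by the URI written in the RTTM file;
    # we assume here that the URI matches the audio file name (e.g. "sample")
    annotation = load_rttm(rttm_path)[filename]
    widget.annotation = annotation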

### Keyboard shortcuts to speed up your labeling

Keyboard shortcuts only work when the widget is active, so hover your mouse over it before using them.

Key                                           | Description
:---------------------------------------------|:------------------------------------------------
<kbd>SPACE</kbd>                                       | Toggle play/pause
<kbd>ENTER</kbd>                                       | Create region at current time
<kbd>SHIFT</kbd>+<kbd>ENTER</kbd>                      | Split region at current time
<kbd>A</kbd>, <kbd>B</kbd>, <kbd>C</kbd>, ..., or <kbd>Z</kbd>                    | Update label of selected region
<kbd>LEFT</kbd> or <kbd>RIGHT</kbd>                             | 1. Edit start time of selected region (if any)<br/>2. Move time cursor (when paused)
<kbd>SHIFT</kbd>+<kbd>LEFT</kbd> or <kbd>SHIFT</kbd>+<kbd>RIGHT</kbd>             | Same, but faster.
<kbd>ALT</kbd>+<kbd>LEFT</kbd> or <kbd>ALT</kbd>+<kbd>RIGHT</kbd>                 | Edit end time of selected segment
<kbd>SHIFT</kbd>+<kbd>ALT</kbd>+<kbd>LEFT</kbd> or <kbd>SHIFT</kbd>+<kbd>ALT</kbd>+<kbd>RIGHT</kbd> | Same, but faster.
<kbd>TAB</kbd>                                         | Select next segment
<kbd>SHIFT</kbd>+<kbd>TAB</kbd>                                 | Select previous segment
<kbd>BACKSPACE</kbd>                                   | Delete selected region and select previous one
<kbd>DELETE</kbd> or <kbd>SHIFT</kbd>+<kbd>BACKSPACE</kbd>               | Delete selected region and select next one
<kbd>ESC</kbd>                                         | Unselect segment
<kbd>UP</kbd> or <kbd>DOWN</kbd>                                | Zoom in/out (work in progress)

### Don't forget to save your annotations

Reading the `annotation` property returns a [`pyannote.core.Annotation`](http://pyannote.github.io/pyannote-core/structure.html#annotation) instance...

In [39]:
# check that it does indeed return an `Annotation` instance
from pyannote.core import Annotation
assert isinstance(widget.annotation, Annotation)

... which can be iterated like this:

In [40]:
# iterate over regions and their respective labels
for segment, _, label in widget.annotation.itertracks(yield_label=True):
    print(f"{round(segment.start, 2)} - {round(segment.end, 2)}: {label}")
    # segment.start
    # segment.end
    # label

0 - 0.43: child
0.84 - 1.64: adult_male
3.44 - 5.14: child


... or saved to disk in [`RTTM`](https://catalog.ldc.upenn.edu/docs/LDC2004T12/rt03-fall-eval-plan-v9.pdf) file format like this:

In [41]:
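# create the output folder if needed, then write the annotation as RTTM
os.makedirs("./rttm_final", exist_ok=True)
with open(f"./rttm_final/{filename}.rttm", "w") as rttm:
    widget.annotation.write_rttm(rttm)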
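To sanity-check a saved file, it can help to read it back and sum the labeled time per label. The sketch below uses only the standard library and assumes the usual space-separated `SPEAKER` line layout, with the duration in the fifth field and the label in the eighth; `label_durations` is a hypothetical helper, not a pyannote function.

```python
from collections import defaultdict


def label_durations(rttm_text: str) -> dict:
    """Sum the duration (in seconds) of all SPEAKER lines, grouped by label."""
    totals = defaultdict(float)
    for line in rttm_text.splitlines():
        fields = line.split()
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue  # ignore blank or non-SPEAKER lines
        # fields[4] is the duration, fields[7] is the label (assumed layout)
        totals[fields[7]] += float(fields[4])
    return dict(totals)
```

For example, `label_durations(Path(f"./rttm_final/{filename}.rttm").read_text())` returns one total per label, which makes a missing or misspelled label easy to spot before the file is used downstream.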