# Demo inaSpeechSegmenter

This is a demo notebook which shows how to use inaSpeechSegmenter.

It has been tested on Google Colab. In some environments, you may need to install the ffmpeg command-line utility. You can do this from the notebook with the following command:

```
! apt install ffmpeg
```
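
To find out whether that install is needed, one option is to check if an `ffmpeg` executable is already on the `PATH`. The sketch below uses only the Python standard library; the helper name `ffmpeg_available` is hypothetical and is not part of inaSpeechSegmenter.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None


if ffmpeg_available():
    print("ffmpeg found, no extra install needed")
else:
    print("ffmpeg not found, run the apt command above")
```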

In [None]:
# Install the library
! pip install inaSpeechSegmenter

We use inaSpeechSegmenter. We also use pandas and seaborn for the data analytics section below.

In [None]:
# Load the libraries
from inaSpeechSegmenter import Segmenter
from inaSpeechSegmenter.export_funcs import seg2csv
import pandas as pd
import seaborn as sns

In [None]:
# Load the model
seg = Segmenter()

In [None]:
# Choose any mp3 file online
media = 'https://github.com/ina-foss/inaSpeechSegmenter/raw/master/media/musanmix.mp3'

In [None]:
# Run the segmentation
segmentation = seg(media)

In [None]:
# Look at the outcome
segmentation
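
To make the structure of the result easier to work with, here is a minimal sketch, assuming the segmentation is a list of `(label, start, stop)` tuples with times in seconds, which is the layout suggested by the `labels`, `start` and `stop` columns used in the analytics section. The sample values and the helper `total_duration_by_label` are hypothetical and only serve as an illustration.

```python
from collections import defaultdict

# Hypothetical sample in the assumed (label, start, stop) layout, times in seconds.
sample_segmentation = [
    ("noEnergy", 0.0, 0.5),
    ("music", 0.5, 10.0),
    ("male", 10.0, 14.0),
    ("female", 14.0, 20.0),
    ("male", 20.0, 22.5),
]


def total_duration_by_label(segments):
    """Sum the duration (stop - start) of all segments sharing a label."""
    totals = defaultdict(float)
    for label, start, stop in segments:
        totals[label] += stop - start
    return dict(totals)


for label, seconds in total_duration_by_label(sample_segmentation).items():
    print(f"{label}: {seconds:.1f} s")
```

If the assumption holds, the same helper could be called on `segmentation` directly, without going through the CSV file.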

In [None]:
# Export results to CSV
seg2csv(segmentation, 'myseg.csv')

## Data analytics

This section reads the output of the segmentation, computes the total length for each label ('music', 'noEnergy', 'male', 'female') and draws a bar plot.

In [None]:
# Read the results into a table (read_table expects tab-separated values by default)
df = pd.read_table("myseg.csv")

In [None]:
# Compute the length of each sequence
df["length"] = df['stop'] - df['start']

In [None]:
# Compute the aggregated length of all sequences by label
df[['labels', 'length']].groupby("labels").sum()

In [None]:
# Store the aggregated data in a new data frame
df_aggregated = df[['labels', 'length']].groupby("labels").sum()

In [None]:
# Add the index as a column
df_aggregated['labels'] = df_aggregated.index
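
The length, aggregation and index steps above can also be written as one chained expression. This is a sketch of an alternative, not a replacement for the cells above: it runs on a small hypothetical data frame with the same `labels`, `start` and `stop` columns, and uses `as_index=False` so that `labels` stays a regular column.

```python
import pandas as pd

# Hypothetical rows in the same column layout as the segmentation table.
sample = pd.DataFrame({
    "labels": ["music", "male", "female", "male"],
    "start": [0.0, 10.0, 14.0, 20.0],
    "stop": [10.0, 14.0, 20.0, 22.5],
})

summary = (
    sample.assign(length=sample["stop"] - sample["start"])  # per-segment length
    .groupby("labels", as_index=False)["length"]            # keep 'labels' as a column
    .sum()                                                   # total length per label
)
print(summary)
```

The resulting `summary` has one row per label and can be passed to `sns.barplot` in the same way as `df_aggregated`.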

In [None]:
# Draw the plot
sns.barplot(
    data=df_aggregated,
    x="labels",
    y="length",
)