# 01a - Segment Songs with WhisperSeg

## Setup

1. Following the instructions provided by WhisperSeg (https://github.com/nianlonggu/WhisperSeg), create a new conda environment with all of `wseg`'s dependencies.
2. Clone the WhisperSeg GitHub repo and add the cloned repository to your Python module search path (`sys.path`) so that its modules can be imported.

In [1]:
import sys
sys.path.insert(0, 'C:\\Grad_School\\Code_and_software\\Py_code\\WhisperSeg\\') #change this path to wherever you cloned WhisperSeg on your system
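
A slightly more defensive variant of the cell above could look like the sketch below. It assumes only that WhisperSeg was cloned to some local folder. The helper name `add_repo_to_path` is hypothetical and not part of WhisperSeg. It raises a clear error straight away if the folder is missing, rather than failing later at `from model import ...`.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir):
    """Put repo_dir at the front of sys.path, failing early if it does not exist.

    Hypothetical helper for illustration; not part of WhisperSeg.
    """
    repo = Path(repo_dir).expanduser().resolve()
    if not repo.is_dir():
        raise FileNotFoundError(f"WhisperSeg clone not found at: {repo}")
    repo_str = str(repo)
    # Avoid stacking duplicate entries when the cell is re-run.
    if repo_str not in sys.path:
        sys.path.insert(0, repo_str)
    return repo_str
```

It would be called once with your own clone location, for example `add_repo_to_path(r"C:\Grad_School\Code_and_software\Py_code\WhisperSeg")`.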

## Imports

In [2]:
import librosa
import json
import glob
import pandas as pd
from audio_utils import SpecViewer

spec_viewer = SpecViewer()

## Initialize Model

If your machine has a CUDA-capable GPU, set `device="cuda"` when initializing the model to speed up inference. Otherwise, set `device="cpu"`.

In [3]:
from model import WhisperSegmenterFast
segmenter = WhisperSegmenterFast( "nccratliri/whisperseg-large-ms-ct2", device="cuda" )


Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
binary_path: c:\Users\tkoch\anaconda3\envs\wseg_cuda\lib\site-packages\bitsandbytes\cuda_setup\libbitsandbytes_cuda116.dll
CUDA SETUP: Loading binary c:\Users\tkoch\anaconda3\envs\wseg_cuda\lib\site-packages\bitsandbytes\cuda_setup\libbitsandbytes_cuda116.dll...
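
If the notebook is shared between machines with and without a GPU, the device string can be chosen at run time instead of being hard-coded. The sketch below assumes PyTorch is installed in the WhisperSeg environment and falls back to `"cpu"` if it is not. `pick_device` is a hypothetical helper name. PyTorch seeing a GPU does not guarantee that the backend used by `WhisperSegmenterFast` can use it, so treat the result as a first guess.

```python
def pick_device():
    """Return "cuda" if PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is available in this environment
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The resulting string can then be passed as the `device` argument when constructing the segmenter, in place of the literal `"cuda"` used above.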


## Specify Zebra Finch-specific Hyperparameters

Justification for this set of hyperparameters is provided in the WhisperSeg paper (https://doi.org/10.1101/2023.09.30.560270). 

In [5]:
sr = 32000 
min_frequency = 0
spec_time_step = 0.0025
min_segment_length = 0.01
eps = 0.02
num_trials = 3
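
To get a feel for what these values imply, the sketch below collects them in a dictionary and derives a few quantities. It assumes that `spec_time_step` is the time in seconds between adjacent spectrogram columns and that `min_segment_length` is in seconds; check the WhisperSeg documentation for the authoritative definitions. The names `segment_kwargs`, `samples_per_step`, `columns_per_second`, and `min_segment_columns` are introduced here for illustration only.

```python
# Values copied from the hyperparameter cell above.
segment_kwargs = {
    "sr": 32000,
    "min_frequency": 0,
    "spec_time_step": 0.0025,
    "min_segment_length": 0.01,
    "eps": 0.02,
    "num_trials": 3,
}

# Derived quantities, assuming spec_time_step is seconds per spectrogram column.
samples_per_step = round(segment_kwargs["sr"] * segment_kwargs["spec_time_step"])
columns_per_second = round(1 / segment_kwargs["spec_time_step"])
min_segment_columns = round(
    segment_kwargs["min_segment_length"] / segment_kwargs["spec_time_step"]
)

print(samples_per_step, columns_per_second, min_segment_columns)
```

Under those assumptions, each spectrogram column spans 80 audio samples, there are 400 columns per second, and the shortest retained segment is 4 columns long. Because the dictionary keys match the keyword arguments used in the segmentation loop later in this notebook, the same dictionary could also be unpacked as `segmenter.segment(audio, **segment_kwargs)`.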

## Get List of Birds to Segment

In [None]:
All_Birds = ["B145", "B236", "B258", "B385", "B402", "B447", 
             "B507", "G255", "G397", "G402", "G413", "G437", 
             "G439", "G524", "G528", "O144", "O254", "O421", 
             "O440", "O512", "R402", "R425", "R469", "S132", 
             "S421", "S525", "S528", "Y389", "Y397", "Y425", 
             "Y440", "B524", "O434", "S389", "Y433", "Y453"] 

len(All_Birds) #number of birds to segment

36

## Segment All Birds in Dataset

In [None]:
for Bird_ID in All_Birds:
    print(Bird_ID)
    #get list of files to segment
    song_folder_path = 'E:\\Final_Bird_Dataset\\FP1_project_birds\\segmented_songs\\' + Bird_ID + "\\" #set path to song folder according to your file system. 
    all_songs = glob.glob(song_folder_path + "*.wav")

    #initialize empty dataframe to save syllable segmentations
    full_seg_table = pd.DataFrame()

    #loop over each .wav file in the specified song folder
    for song in all_songs:
        #load audio
        audio, __ = librosa.load(song, sr = sr)

        #segment file
        prediction = segmenter.segment(audio, 
                                       sr = sr,
                                       min_frequency = min_frequency, 
                                       spec_time_step = spec_time_step,
                                       min_segment_length = min_segment_length, 
                                       eps = eps, 
                                       num_trials = num_trials)

        #format segmentation as dataframe
        curr_prediction_df = pd.DataFrame(prediction)

        #add file name to dataframe (splitting on "\\" assumes Windows-style paths, as returned by glob on Windows)
        song_name = song.split("\\")[-1]
        curr_prediction_df['file'] = song_name

        #add current file's segments to full_seg_table
        full_seg_table = pd.concat([full_seg_table, curr_prediction_df])
    
    #save full_seg_table
    full_seg_table.to_csv('E:\\Final_Bird_Dataset\\WhisperSeg_Segmentation\\' + Bird_ID + "_wseg.csv") #set path to output file according to your file system. 
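
The per-file step inside the loop can also be sketched without pandas, which makes the table layout explicit. This sketch assumes that each `prediction` is a dictionary of equal-length lists under the keys `"onset"`, `"offset"`, and `"cluster"`; verify this against your installed WhisperSeg version. `prediction_to_rows` is a hypothetical helper, the `duration` field is an addition not present in the loop above, and the example values are made up for illustration.

```python
from pathlib import PureWindowsPath


def prediction_to_rows(prediction, song_path):
    """Flatten one prediction dict into per-segment rows tagged with the file name.

    Assumes prediction has equal-length lists under "onset", "offset", "cluster".
    """
    # PureWindowsPath splits on both "\\" and "/", so this works on any OS.
    file_name = PureWindowsPath(song_path).name
    rows = []
    for onset, offset, cluster in zip(
        prediction["onset"], prediction["offset"], prediction["cluster"]
    ):
        rows.append(
            {
                "onset": onset,
                "offset": offset,
                "cluster": cluster,
                "duration": offset - onset,  # segment length in seconds
                "file": file_name,
            }
        )
    return rows


# Made-up values for illustration only, not real segmentation output.
example_prediction = {
    "onset": [0.10, 0.35],
    "offset": [0.25, 0.50],
    "cluster": ["a", "a"],
}
rows = prediction_to_rows(example_prediction, "E:\\some_folder\\B145\\example_song.wav")
for row in rows:
    print(row["file"], row["onset"], row["offset"])
```

Using `PureWindowsPath(...).name` rather than `split("\\")` keeps the file-name extraction working if the notebook is later run on Linux or macOS with forward-slash paths.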