Session 2: Exploring the properties of your acoustic signal
Before class assignment: 

Create a Python notebook in which you:
-Load audio files with the opensoundscape.Audio class, selecting a few seconds of audio containing the target sound
-Create Audio widgets for playback
-Display spectrograms of the target sound using the opensoundscape.Spectrogram class
-Compare visualizations of the sound when varying the spectrogram creation parameters, especially the window_samples argument (e.g. what does the spectrogram of the target sound look like with window_samples=256 vs 1024?)
-Compare spectrograms and audio of the target sound with the confusion sounds
-Additionally visualize a few randomly selected files from your field data. What properties of the field audio data do you notice? Are these files different from or similar to your annotated data? Write these observations and reflections directly into your Python notebook in a markdown cell. 
-Add your notebook to your GitHub repository in a “notebooks/” subfolder. Use “Session2…” as the beginning of the file name. Write brief descriptions of each new file in your readme.md file. 


In [1]:
#OPSO & general imports 
from opensoundscape import Audio, Spectrogram
from opensoundscape.annotations import BoxedAnnotations

# General-purpose packages
import numpy as np
import pandas as pd
from glob import glob
from pathlib import Path

from matplotlib import pyplot as plt
plt.rcParams['figure.figsize']=[15,5] #for big visuals
%config InlineBackend.figure_format = 'retina'



In [2]:
# Load audio files with the opensoundscape.Audio class, selecting a few seconds of audio containing the target sound
audio_file = "/home/brg226/projects/rloc/VIRA_beg/raven_annotations/31493211.wav"
annotation_file = "/home/brg226/projects/rloc/VIRA_beg/raven_annotations/31493211.Table.1.selections.txt"

In [None]:
# Let’s look at a spectrogram of the audio file to see what we’re working with.
Spectrogram.from_audio(Audio.from_file(audio_file)).plot()

In [3]:
# BoxedAnnotations.from_raven_files() expects a list of annotation files and a list of audio files; a single filename is not accepted. So, wrap each filename in a list:
annot_list = [annotation_file]
audio_list = [audio_file]

In [4]:
# Create an object from Raven file
annotations = BoxedAnnotations.from_raven_files(
    raven_files=annot_list,
    audio_files=audio_list,
    annotation_column=None,  # Put None if you don't have an annotation column
)

# Inspect the object's .df attribute
# which contains the table of annotations
annotations.df.head()

Unnamed: 0,audio_file,annotation_file,annotation,start_time,end_time,low_f,high_f,Channel,Selection,View
0,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,,0.207647,0.822754,4901.8,6312.9,1,1,Spectrogram 1
1,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,,6.934149,7.721642,4938.9,6572.8,1,2,Spectrogram 1
2,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,,17.469654,17.986814,4047.7,5570.2,1,3,Spectrogram 1
3,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,,21.853158,22.507443,4679.0,6238.6,1,4,Spectrogram 1
4,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,,24.786791,25.354883,4976.1,6201.5,1,5,Spectrogram 1
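Before choosing clip lengths or spectrogram settings, it can help to summarize how long the annotated calls are and which frequencies they span. The snippet below is a minimal sketch that uses only the five rows displayed above (copied by hand into a plain list); the variable names are illustrative and not part of opensoundscape.

```python
# (start_time, end_time, low_f, high_f) copied from the five annotation rows above
boxes = [
    (0.207647, 0.822754, 4901.8, 6312.9),
    (6.934149, 7.721642, 4938.9, 6572.8),
    (17.469654, 17.986814, 4047.7, 5570.2),
    (21.853158, 22.507443, 4679.0, 6238.6),
    (24.786791, 25.354883, 4976.1, 6201.5),
]

# Duration of each annotated box in seconds
durations = [end - start for start, end, _, _ in boxes]
longest_s = max(durations)

# Lowest and highest annotated frequency across the boxes
lowest_hz = min(low for _, _, low, _ in boxes)
highest_hz = max(high for _, _, _, high in boxes)

print(f"longest annotated call: {longest_s:.2f} s")
print(f"annotated frequency range: {lowest_hz:.0f}-{highest_hz:.0f} Hz")
```

For these five rows, every call is shorter than one second and sits between roughly 4 and 6.6 kHz, which is useful context when picking a clip duration and a sample rate.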


Load multiple Raven annotation tables

In [5]:
# Path to the folder that contains the Raven selection tables and audio files

dataset_path = Path("/home/brg226/projects/rloc/VIRA_beg/raven_annotations/")

In [6]:
# Make a list of all of the selection table files
selections = glob(f"{dataset_path}/*.txt")

In [7]:
# Create a list of audio files, one corresponding to each Raven file
# (here the audio files sit in the same folder as the selection tables and
# share their names, with a .wav extension)
audio_files = [
    f.replace(".Table.1.selections.txt", ".wav")
    for f in selections
]
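Pairing selection tables with audio files by string replacement fails silently if a file is named differently or is missing. As a hedged sketch, the helpers below (hypothetical names `audio_path_for` and `missing_audio`, not part of opensoundscape) derive the expected .wav path with pathlib and report any selection table that has no matching recording. They assume the `<name>.Table.1.selections.txt` naming used in this notebook.

```python
from pathlib import Path


def audio_path_for(selection_file, suffix=".Table.1.selections.txt"):
    """Map a Raven selection table path to the .wav path with the same base name."""
    selection_file = Path(selection_file)
    if not selection_file.name.endswith(suffix):
        raise ValueError(f"unexpected selection table name: {selection_file.name}")
    base_name = selection_file.name[: -len(suffix)]
    return selection_file.with_name(base_name + ".wav")


def missing_audio(selection_files):
    """Return the selection tables whose matching audio file does not exist on disk."""
    return [s for s in selection_files if not audio_path_for(s).exists()]
```

Calling `missing_audio(selections)` before building the annotations object gives a quick check that every table has a recording next to it.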

In [8]:
all_annotations = BoxedAnnotations.from_raven_files(
    selections, annotation_column=None, audio_files=audio_files
)

# No annotation column was loaded, so label every box with the target class
all_annotations.df['annotation'] = "virail"
all_annotations.df.head(300)

Unnamed: 0,audio_file,annotation_file,annotation,start_time,end_time,low_f,high_f,Channel,Selection,View
0,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,virail,0.183247,0.858368,3297.7,4995.7,2,1,Spectrogram 1
1,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,virail,1.629935,2.092875,3346.9,4675.8,2,2,Spectrogram 1
2,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,virail,6.915169,7.416687,3789.8,5217.2,2,3,Spectrogram 1
3,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,virail,13.599893,14.062833,3642.2,4626.6,2,4,Spectrogram 1
4,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,virail,16.416112,16.859763,3273.0,4946.5,2,5,Spectrogram 1
...,...,...,...,...,...,...,...,...,...,...
268,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,virail,129.938821,130.321181,3980.6,5201.0,1,14,Spectrogram 1
269,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,virail,132.890038,133.184455,3835.4,5491.5,1,15,Spectrogram 1
270,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,virail,143.166660,143.476371,2934.6,5462.5,1,16,Spectrogram 1
271,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,/home/brg226/projects/rloc/VIRA_beg/raven_anno...,virail,89.703466,90.062884,4242.1,5491.5,1,17,Spectrogram 1


In [9]:
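# Split each recording into 1.5 s clips; a clip is labeled positive when an
# annotation overlaps it by at least 0.1 s
onehot = all_annotations.clip_labels(clip_duration=1.5, min_label_overlap=0.1)

# Keep only the clips that contain the target sound
positive_vira = onehot[onehot["virail"] == True]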

In [12]:
positive_vira.head(300)

file,start_time,end_time,virail
/home/brg226/projects/rloc/VIRA_beg/raven_annotations/252407901.wav,0.0,1.5,True
/home/brg226/projects/rloc/VIRA_beg/raven_annotations/252407901.wav,1.5,3.0,True
/home/brg226/projects/rloc/VIRA_beg/raven_annotations/252407901.wav,6.0,7.5,True
/home/brg226/projects/rloc/VIRA_beg/raven_annotations/252407901.wav,13.5,15.0,True
/home/brg226/projects/rloc/VIRA_beg/raven_annotations/252407901.wav,16.5,18.0,True
...,...,...,...
/home/brg226/projects/rloc/VIRA_beg/raven_annotations/242259551.wav,124.5,126.0,True
/home/brg226/projects/rloc/VIRA_beg/raven_annotations/242259551.wav,127.5,129.0,True
/home/brg226/projects/rloc/VIRA_beg/raven_annotations/242259551.wav,129.0,130.5,True
/home/brg226/projects/rloc/VIRA_beg/raven_annotations/242259551.wav,132.0,133.5,True
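To see why particular clips end up labeled True, the overlap rule can be reproduced by hand. This is a simplified sketch of the idea behind `clip_duration=1.5` and `min_label_overlap=0.1`, not opensoundscape's actual implementation: it ignores partial final clips and other options. The example boxes are the start and end times of the first five annotation rows displayed earlier, and the 18 s duration is an assumption chosen for illustration.

```python
def overlap_seconds(clip, annotation):
    """Seconds of overlap between a (start, end) clip and a (start, end) annotation."""
    clip_start, clip_end = clip
    ann_start, ann_end = annotation
    return max(0.0, min(clip_end, ann_end) - max(clip_start, ann_start))


def positive_clips(annotations, total_duration, clip_duration=1.5, min_label_overlap=0.1):
    """Return the non-overlapping clips that contain enough of any annotation."""
    n_clips = int(total_duration // clip_duration)
    clips = [(i * clip_duration, (i + 1) * clip_duration) for i in range(n_clips)]
    return [
        clip
        for clip in clips
        if any(overlap_seconds(clip, a) >= min_label_overlap for a in annotations)
    ]


# (start_time, end_time) of the first five annotations shown earlier
example_boxes = [
    (0.183247, 0.858368),
    (1.629935, 2.092875),
    (6.915169, 7.416687),
    (13.599893, 14.062833),
    (16.416112, 16.859763),
]

# 18.0 s is an assumed duration, long enough to cover these five boxes
for start, end in positive_clips(example_boxes, total_duration=18.0):
    print(start, end)
```

The last box starts at 16.42 s, so it overlaps the 15.0-16.5 clip by only about 0.08 s, which is below the 0.1 s threshold; only the 16.5-18.0 clip counts as positive for that call.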


In [17]:

df = pd.DataFrame(positive_vira)

# Save the DataFrame to CSV and pickle. The index is kept on purpose:
# it stores the file, start_time, and end_time of each clip.
df.to_csv('onehot_pos_virabeg.csv')
df.to_pickle("/home/brg226/projects/rloc/VIRA_beg/onehot_pos_virabeg.pkl")
df.head()


                                                                        virail
file                                               start_time end_time        
/home/brg226/projects/rloc/VIRA_beg/raven_annot... 0.0        1.5         True
                                                   1.5        3.0         True
                                                   6.0        7.5         True
                                                   13.5       15.0        True
                                                   16.5       18.0        True

Explore audio visualizations of the training data

In [None]:
audio_object = Audio.from_file('/home/brg226/projects/rloc/VIRA_beg/raven_annotations/31493211.wav')
audio_object

In [None]:
spectrogram_object = Spectrogram.from_audio(audio_object)
spectrogram_object.plot()

In [None]:
# View the documentation about this method
Spectrogram.from_audio?

In [None]:
audio = audio_object.resample(22000).trim(0, 5)

spec = Spectrogram.from_audio(audio, window_samples=55, overlap_samples=0)
spec.plot()

In [None]:
spec = Spectrogram.from_audio(audio, window_samples=256, overlap_samples=0)
spec.plot()

In [None]:
spec2 = Spectrogram.from_audio(audio, window_samples=1024, overlap_samples=0)
spec2.plot()
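The visual differences between these spectrograms follow from a simple trade-off: a longer window gives finer frequency bins but coarser time steps. The sketch below works through the arithmetic for the 22000 Hz resampled audio, using the window sizes 55 and 256 from the cells above and the 1024 mentioned in the assignment. The helper name `spectrogram_resolution` is illustrative and not an opensoundscape function.

```python
def spectrogram_resolution(sample_rate, window_samples, overlap_samples=0):
    """Return (frequency bin spacing in Hz, time step between columns in s)."""
    bin_spacing_hz = sample_rate / window_samples
    hop_samples = window_samples - overlap_samples
    time_step_s = hop_samples / sample_rate
    return bin_spacing_hz, time_step_s


sample_rate = 22000  # matches audio_object.resample(22000)
for window_samples in (55, 256, 1024):
    bin_hz, step_s = spectrogram_resolution(sample_rate, window_samples)
    print(
        f"window_samples={window_samples:>4}: "
        f"{bin_hz:.1f} Hz per bin, {step_s * 1000:.1f} ms per column"
    )
```

With 55 samples each frequency bin is 400 Hz wide, so the call looks smeared vertically but sharp in time; with 1024 samples the bins are about 21 Hz wide but each column covers roughly 47 ms.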

In [None]:
from opensoundscape.spectrogram import MelSpectrogram

# Mel spectrogram of seconds 24-30 of the annotated recording
melspec = MelSpectrogram.from_audio(
    audio_object.trim(24, 30), window_samples=2048, n_mels=400
)
melspec.plot()
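A mel spectrogram spaces its frequency bands according to a perceptual scale, so low frequencies get more vertical room than high ones. The sketch below uses one common definition of the mel scale (the HTK-style formula); this is an assumption for illustration, and the variant used internally by `MelSpectrogram` may differ. The frequencies are illustrative.

```python
import math


def hz_to_mel(frequency_hz):
    """Convert Hz to mels using the HTK-style formula (one common definition)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


# Compare how many mels a 1 kHz-wide band spans at low vs. high frequencies
low_band_mels = hz_to_mel(1000) - hz_to_mel(0)
high_band_mels = hz_to_mel(11000) - hz_to_mel(10000)

print(f"0-1000 Hz spans {low_band_mels:.0f} mels")
print(f"10000-11000 Hz spans {high_band_mels:.0f} mels")
```

Under this formula the lowest 1 kHz occupies several times more of the mel axis than the 1 kHz band near 10 kHz, which is why features in the upper part of a mel spectrogram look compressed compared with a linear spectrogram.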

Visualize confusion species

In [None]:
BLJA_object = Audio.from_file('/home/brg226/projects/rloc/VIRA_beg/raven_annotations/confusion_sp/XC1036141 - Blue Jay - Cyanocitta cristata.mp3')
RWBL_object = Audio.from_file('/home/brg226/projects/rloc/VIRA_beg/raven_annotations/confusion_sp/XC1004470 - Red-winged Blackbird - Agelaius phoeniceus.mp3')
BCCH_object = Audio.from_file('/home/brg226/projects/rloc/VIRA_beg/raven_annotations/confusion_sp/XC335758 - Black-capped Chickadee - Poecile atricapillus.mp3')

In [None]:
spectrogram_object = Spectrogram.from_audio(BLJA_object.trim(0,10))
spectrogram_object.plot()

In [None]:
spectrogram_object = Spectrogram.from_audio(RWBL_object.trim(0,10))
spectrogram_object.plot()

In [None]:
spectrogram_object = Spectrogram.from_audio(BCCH_object.trim(0,10))
spectrogram_object.plot()

Explore field data

In [None]:
field_audio_object = Audio.from_file('/home/brg226/projects/rloc/datasets/rloc2024a_sync/MSD-0382/20240515_000000.WAV')
field_audio_object
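The assignment asks for a few randomly selected field files, whereas the cell above loads one fixed path. One way to pick files reproducibly is sketched below. `pick_random_files` is a hypothetical helper, and the `*.WAV` pattern and the seed are assumptions based on the file name used above.

```python
import random
from pathlib import Path


def pick_random_files(folder, n=3, pattern="*.WAV", seed=0):
    """Return up to n randomly chosen files matching pattern, reproducibly."""
    # Sort first so the result depends only on the seed, not on filesystem order
    files = sorted(Path(folder).glob(pattern))
    rng = random.Random(seed)
    return rng.sample(files, min(n, len(files)))


# Example usage (the folder is the recorder directory used above):
# for path in pick_random_files("/home/brg226/projects/rloc/datasets/rloc2024a_sync/MSD-0382"):
#     print(path)
```

Each returned path can then be passed to `Audio.from_file` and plotted in the same way as the single field file.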

In [None]:
spectrogram_object = Spectrogram.from_audio(field_audio_object)
spectrogram_object.plot()

In [None]:
from opensoundscape.spectrogram import MelSpectrogram

# Mel spectrogram of the first 40 seconds of the field recording
melspec = MelSpectrogram.from_audio(
    field_audio_object.trim(0, 40), window_samples=2048, n_mels=400
)
melspec.plot()

The field data looks very similar to the training data, apart from an odd band of mechanical noise at around 100 Hz in the field data. The field recordings also contain more species than the recordings of the target and confusion species.