# Features and API changes in OpenSoundscape 0.8.0

With examples

## Top-level imports

We can now import some of the most frequently used classes at the top level:

In [39]:
from opensoundscape import Audio, Spectrogram, CNN

An alternative approach is to import the package and access the classes:

In [9]:
import opensoundscape as opso
opso.Audio.from_file

<bound method Audio.from_file of <class 'opensoundscape.audio.Audio'>>

# Machine Learning

## WandB Integration

Weights and Biases (WandB) is a great web-based tool for monitoring machine learning training and other processes. OpenSoundscape now natively supports logging to WandB during training and prediction with the `CNN` class. 

All you need to do is log in and initialize a wandb "session" - this will create a page on the WandB website where any metrics and graphics are automatically logged. You can easily inspect samples from your dataset, see how metrics change over the course of training, and compare metrics across multiple training runs. It's also a great place to add notes about what makes each training run unique, and it will keep track of the hyperparameters used during CNN training.

To use WandB, create an account and find your API key, which you will use to log in:

In [16]:
import wandb
# wandb.login(key='...') #put your api key here, find it at https://wandb.ai/settings

Now, create a new session:

In [40]:
wandb_session = wandb.init(
    entity='kitzeslab', #a shared group on the wandb platform
    project="trying wandb in opensoundscape", #a collection of runs
    name='test internal logging', #optionally, name this specific run
    # add any extra hyperparameters or info for this run in the `config` dictionary
    config=dict( 
        comment="Description: extra comment from training notebook",
    )
)

When `wandb.init()` succeeds, it prints a link to a web page where we can see anything we log to this `wandb_session`. The `.train()` and `.predict()` methods of `CNN` in OpenSoundscape will log useful metrics and preprocessed samples if we pass the session to their `wandb_session` argument:

In [None]:
model = CNN(architecture='efficientnet_b4', classes=['bird'], sample_duration=1.0)
samples = ['./resources/birds_10s.wav']
score_dataframe = model.predict(samples, wandb_session=wandb_session)
wandb.finish()  # let wandb know your run has finished

We can follow the link printed by `wandb.init()` to see the results!
(Hint: train() provides more interesting things to look at than predict())

Some things logged by predict() include:
- a random set of preprocessed samples that we can listen to and see
- the top-scoring samples per class
- progress through all prediction batches
- whether the run is complete, failed, or still running

The `model.wandb_logging` dictionary has some configurable parameters for logging samples:

In [None]:
model.wandb_logging

## EfficientNet architecture added
We snuck an example into the prediction cell above :) The model there was created with `architecture='efficientnet_b4'`.

## Changes to CNN.predict()
Again, we snuck an example into the prediction cell above - notice that model.predict() now returns just a single dataframe containing the CNN score outputs. We can still apply an activation layer with predict() if we want, but it no longer returns boolean 0/1 predictions.

To generate 0/1 predictions from an array or dataframe of continuous scores, use the functions `predict_multi_target_labels` and `predict_single_target_labels` from the `metrics` module:

In [None]:
predictions = opso.metrics.predict_multi_target_labels(score_dataframe,threshold=0)
predictions.head(2)

Note that because CNN.predict() doesn't generate binary predictions, it also no longer accepts the "threshold" and "binary_preds" arguments.
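To make the difference between the two labeling strategies concrete, here is a minimal plain-Python sketch. It is not the library's implementation: the helper names, the example scores, and the choice of `>=` for the threshold comparison are all assumptions made for illustration.

```python
def multi_target_labels(scores, threshold):
    """Label every class whose score meets the threshold (any number per sample)."""
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


def single_target_labels(scores):
    """Label only the highest-scoring class in each row (exactly one per sample)."""
    labels = []
    for row in scores:
        best = row.index(max(row))
        labels.append([1 if i == best else 0 for i in range(len(row))])
    return labels


# hypothetical raw scores for 3 clips (rows) and 2 classes (columns)
scores = [[2.1, -0.3], [-1.5, -0.2], [0.4, 0.9]]
multi = multi_target_labels(scores, threshold=0)
single = single_target_labels(scores)
print(multi)   # [[1, 0], [0, 0], [1, 1]]
print(single)  # [[1, 0], [0, 1], [0, 1]]
```

With multi-target labels a clip can have zero, one, or several classes present; with single-target labels exactly one class is chosen per clip.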

## Preference for 1 class for single-class models
This is the third thing we snuck into the prediction example above: when training a CNN to recognize one thing, we recommend having just one class (rather than two classes `present` and `absent`). Either is mathematically valid, but having one output node makes more sense and will cause you less hassle in general.
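One way to see why either choice is mathematically valid: a single output node passed through a sigmoid gives the same probability as a two-node softmax whose logits differ by the same amount. The sketch below checks this numerically with the standard library only; the logit values are hypothetical.

```python
import math


def sigmoid(z):
    """Probability from a single output node."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Probabilities from several output nodes (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


z = 1.7  # hypothetical logit for "present"
one_node = sigmoid(z)
# two nodes ("present", "absent") whose logits differ by z
two_node = softmax([3.0, 3.0 - z])[0]
print(abs(one_node - two_node) < 1e-12)  # True
```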

## Train on unsplit audio
It works... try it!

## Tracing and debugging preprocessing errors - coming soon

## GradCAM - coming soon


## New augmentations
Two new augmentation actions are available: random gain and add noise.

In [35]:
opso.actions.audio_add_noise

<function opensoundscape.preprocess.actions.audio_add_noise(audio, noise_dB=-30, signal_dB=0, color='white')>

In [34]:
opso.actions.audio_random_gain

<function opensoundscape.preprocess.actions.audio_random_gain(audio, dB_range=(-30, 0), clip_range=(-1, 1))>
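As a rough illustration of what a random-gain augmentation does, here is a plain-Python sketch that mirrors the `dB_range` and `clip_range` arguments shown in the signature above. It operates on a list of samples rather than an `Audio` object, and the dB-to-amplitude conversion (`10 ** (dB / 20)`) and the `rng` argument are assumptions of this sketch, not a copy of the library code.

```python
import random


def random_gain(samples, dB_range=(-30, 0), clip_range=(-1, 1), rng=random):
    """Scale samples by a gain drawn uniformly (in dB) from dB_range, then clip."""
    gain_dB = rng.uniform(*dB_range)
    factor = 10 ** (gain_dB / 20)  # convert decibels to an amplitude ratio
    low, high = clip_range
    return [min(max(s * factor, low), high) for s in samples]


rng = random.Random(0)  # seeded so the sketch is repeatable
samples = [0.0, 0.5, -0.5, 1.0]  # hypothetical audio samples
quieter = random_gain(samples, rng=rng)
```

Because the default range only contains gains of 0 dB or less, every output sample here is no louder than the corresponding input sample.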

## VGGish
VGGish is a pre-trained audio feature extractor. It's easy to use the PyTorch model hub version in OpenSoundscape.

See this [example](https://github.com/kitzeslab/bioacoustics-cookbook/blob/main/vggish.ipynb) in the bioacoustics cookbook of how to use it with OpenSoundscape.

## M1 chip / Apple Silicon GPU error workaround
OpenSoundscape now falls back to the CPU if torch's `logit` fails with a `NotImplementedError`.
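The general pattern behind this kind of workaround is a try/except around the device-specific call. The sketch below shows the idea without torch: `logit_on_device` is a hypothetical stand-in that pretends the operation is missing on the "mps" device, and none of these names come from OpenSoundscape.

```python
import math


def logit_on_device(p, device):
    """Hypothetical stand-in for an operation that some devices do not implement."""
    if device == "mps":
        raise NotImplementedError("logit is not implemented for this device")
    return math.log(p / (1 - p))


def logit_with_cpu_fallback(p, device):
    """Try the requested device first; retry on the CPU if it is unsupported."""
    try:
        return logit_on_device(p, device), device
    except NotImplementedError:
        return logit_on_device(p, "cpu"), "cpu"


value, used_device = logit_with_cpu_fallback(0.5, "mps")
print(value, used_device)  # 0.0 cpu
```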

# Audio Objects

## Robust metadata
Audio.save now saves metadata dictionaries as JSON (in the "comment" field of the standard metadata), allowing us to keep track of arbitrary fields and to modify and recover the "recording_start_time" (datetime.datetime) field.

Note that if you add more than 30-50 metadata fields, it will run out of space and fail silently.

In [41]:
a=Audio.from_file('./resources/birds_10s.wav')
a.metadata

{'samplerate': 32000,
 'format': 'WAV',
 'frames': 324263,
 'sections': 1,
 'subtype': 'PCM_16',
 'channels': 1,
 'duration': 10.13321875}

In [42]:
a.metadata['special note']="does not contain IBWO"
a.save('./resources/saved_file.wav')
Audio.from_file('./resources/saved_file.wav').metadata

{'samplerate': 32000,
 'format': 'WAV',
 'frames': 324263,
 'sections': 1,
 'subtype': 'PCM_16',
 'channels': 1,
 'duration': 10.13321875,
 'special note': 'does not contain IBWO',
 'opso_metadata_version': 'v0.1',
 'opensoundscape_version': '0.7.1'}
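The idea of storing a metadata dictionary as a JSON string can be sketched with the standard library alone. This is only an illustration of the round trip, not how `Audio.save` is implemented: the helper names, the ISO-format encoding of the datetime, and the example start time are assumptions.

```python
import json
from datetime import datetime


def metadata_to_comment(metadata):
    """Serialize a metadata dict to a JSON string, encoding the datetime as ISO text."""
    out = dict(metadata)
    if "recording_start_time" in out:
        out["recording_start_time"] = out["recording_start_time"].isoformat()
    return json.dumps(out)


def comment_to_metadata(comment):
    """Parse the JSON string back into a dict and restore the datetime object."""
    metadata = json.loads(comment)
    if "recording_start_time" in metadata:
        metadata["recording_start_time"] = datetime.fromisoformat(
            metadata["recording_start_time"]
        )
    return metadata


original = {
    "special note": "does not contain IBWO",
    "recording_start_time": datetime(2022, 6, 1, 5, 30),  # hypothetical start time
}
comment = metadata_to_comment(original)
recovered = comment_to_metadata(comment)
print(recovered == original)  # True
```

Since everything has to fit into a single text field, the length of the JSON string grows with every field added, which is consistent with the space limit noted above.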

## Audio class additions: 
- .silence()
- .noise()
- .normalize()

In [24]:
silent = Audio.silence(duration=3,sample_rate=32000)

In [31]:
pink = Audio.noise(duration=3,sample_rate=32000,dBFS=-5,color='pink')

## _properties_
New properties:
- rms
- dBFS

Converted from functions to properties:
- duration

In [47]:
silent.dBFS

  return 20 * np.log10(self.rms * np.sqrt(2))


-inf

In [46]:
silent.rms

0.0

In [45]:
silent.duration

3.0
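The warning fragment in the `dBFS` output above shows the expression `20 * np.log10(self.rms * np.sqrt(2))`, so dBFS here is measured relative to a full-scale sine wave. The sketch below reproduces that arithmetic on plain lists with the standard library; the function names and the explicit zero check (which returns `-inf` without a warning) are choices of this sketch.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def dBFS(samples):
    """Level in decibels relative to a full-scale sine wave."""
    level = rms(samples) * math.sqrt(2)
    if level == 0:
        return -math.inf  # silence has no finite level in decibels
    return 20 * math.log10(level)


n = 32000
# one second of a full-scale 100 Hz sine wave at 32 kHz
sine = [math.sin(2 * math.pi * 100 * k / n) for k in range(n)]
silence = [0.0] * n

print(round(rms(sine), 4))  # 0.7071
print(dBFS(silence))        # -inf
```

A full-scale sine wave has an RMS of about 0.7071, so its level comes out at approximately 0 dBFS.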

## normalize method

In [49]:
pink.normalize(peak_dBFS=-3)

<Audio(samples=(96000,), sample_rate=32000)>
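Peak normalization amounts to rescaling the samples so that the largest absolute value hits a target level. Here is a minimal sketch on a plain list, assuming the target amplitude is `10 ** (peak_dBFS / 20)`; that conversion and the handling of all-zero input are assumptions of this sketch rather than a description of `Audio.normalize`.

```python
def normalize(samples, peak_dBFS=0.0):
    """Rescale samples so the largest absolute value equals the target peak level."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # nothing to scale in a silent signal
    target = 10 ** (peak_dBFS / 20)  # convert decibels to a linear amplitude
    return [s / peak * target for s in samples]


samples = [0.1, -0.25, 0.2]  # hypothetical audio samples
normalized = normalize(samples, peak_dBFS=-3)
new_peak = max(abs(s) for s in normalized)
print(round(new_peak, 4))  # 0.7079
```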

# Spectrogram objects

## properties

In [36]:
s=Spectrogram.from_audio(a)
print(s.duration)

10.127999999999998


In [52]:
print(s.window_start_times)

[0.0000e+00 8.0000e-03 1.6000e-02 ... 1.0096e+01 1.0104e+01 1.0112e+01]
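The start times printed above are evenly spaced by 0.008 s. The sketch below shows how such a grid can arise from a sliding window. The window length of 512 samples and overlap of 256 samples are assumptions chosen because they reproduce the printed values for this 324263-sample, 32 kHz file; the function itself is illustrative and not part of OpenSoundscape.

```python
def window_start_times(n_samples, sample_rate, window_samples=512, overlap_samples=256):
    """Start time (in seconds) of each full window that fits in the signal."""
    hop = window_samples - overlap_samples  # samples between consecutive window starts
    n_windows = (n_samples - window_samples) // hop + 1
    return [i * hop / sample_rate for i in range(n_windows)]


starts = window_start_times(n_samples=324263, sample_rate=32000)
# the last window ends one window length after its start time
duration = starts[-1] + 512 / 32000

print(len(starts))             # 1265
print(starts[1], starts[-1])   # 0.008 10.112
```

Under these assumptions the spectrogram ends at about 10.128 s, slightly shorter than the 10.133 s audio file, because the trailing samples do not fill a complete window.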


# [Discussions](https://github.com/kitzeslab/opensoundscape/discussions) page!