# Generate Spectrograms for a dataset locally

## 1. Import the relevant libraries

Besides OSmOSE, using `pathlib` is strongly recommended when working with paths. OSmOSE understands string paths just fine, but they are less portable and less readable than `pathlib.Path` objects.
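
As a small illustration of that portability, the sketch below builds one relative path with `pathlib` and shows how it renders on POSIX and Windows systems. The folder and file names are hypothetical and only serve the example.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Hypothetical folder and file names, used only for illustration.
parts = ("datasets", "sample", "audio.wav")

# The "/" operator joins components with the separator of the current OS.
audio_file = Path(parts[0]) / parts[1] / parts[2]

# The "pure" classes render the same components for a given platform
# without touching the file system.
posix_form = str(PurePosixPath(*parts))
windows_form = str(PureWindowsPath(*parts))

print(audio_file.name, audio_file.suffix, audio_file.parent.name)
print(posix_form)
print(windows_form)
```

The same code runs unchanged on either platform, which is harder to guarantee when separators are hard-coded in strings.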

In [1]:
import os
from pathlib import Path
from IPython.display import Image, display
import random

import OSmOSE as osm

## 2. Prepare the dataset

### 2.1 Dataset properties

The following cell describes the properties of your data. Refer to the comments to get the meaning of each one. For a more detailed explanation, see the Spectrogram documentation.

`gps_coordinates` is a tuple containing the coordinates of the point where the data was gathered, in the format (lat, lon).
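
Because latitude and longitude are easy to swap, it can help to sanity-check the tuple before passing it on. The helper below is a hypothetical sketch, not part of OSmOSE; it only assumes decimal degrees in (lat, lon) order.

```python
def check_gps_coordinates(coords):
    """Return (lat, lon) as floats, or raise ValueError if out of range.

    Hypothetical helper for illustration; OSmOSE does not provide it.
    """
    lat, lon = coords
    if not -90 <= lat <= 90:
        raise ValueError(f"Latitude must be within [-90, 90], got {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"Longitude must be within [-180, 180], got {lon}")
    return float(lat), float(lon)


print(check_gps_coordinates((49, -2)))
```

A range check cannot catch every swap (both values of `(49, -2)` are also valid latitudes), so it is a safeguard rather than a guarantee.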

In [2]:
dataset_path = Path(r"C:\Users\Rumengol\Documents\OSmOSE\datasets\sample")

gps_coordinates = (49,-2)

spectrogram = osm.Aplose(dataset_path=dataset_path, gps_coordinates=gps_coordinates, local=True)

Cannot set osmose group on a non-Unix operating system.
It seems you are on a non-Unix operating system (probably Windows). The build() method will not work as intended and permission might be incorrectly set.


**Set and change spectrogram parameters**

`analysis_samplerate` is the sample rate to which your data is resampled before processing. To keep the original sample rate, set it to 0.
<a id='params'></a>

In [3]:
"""Data parameters"""
spectrogram.data_normalization = "" # Can be instrument or zscore
# Only if data_normalization is zscore
spectrogram.zscore_duration = "original"
# Only if data_normalization is instrument
spectrogram.gain_dB = 0
spectrogram.sensitivity = 1
spectrogram.peak_voltage = 0
spectrogram.spectro_normalization = "spectrum" # Can be spectrum or density

"""Spectrogram generation parameters"""

spectrogram.analysis_samplerate = 10
spectrogram.spectro_duration = 10 # Duration of each spectrogram, in seconds
spectrogram.zoom_level = 0 # Number of zoom levels. Set to 0 to deactivate zoom.

spectrogram.nfft = 100 # Number of points used for each Fast Fourier Transform
spectrogram.window_size = 100 # Size of the analysis window
spectrogram.window_type = "hamming"
spectrogram.overlap = 0 # Percentage of overlap between two consecutive windows
hp_filter_min_freq = 0

"""Spectrogram display parameters"""
spectrogram.dynamic_max = 0
spectrogram.dynamic_min = 0
spectrogram.colormap = "viridis"
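
To judge whether a set of parameters is sensible, it can help to compute the resolution they imply. The sketch below uses the standard short-time Fourier transform relations; it assumes the sample rate is in Hz, the window size is in samples, and the overlap is a percentage. `spectrogram_resolution` is a hypothetical helper, not an OSmOSE function.

```python
def spectrogram_resolution(samplerate, nfft, window_size, overlap_percent):
    """Return (frequency resolution in Hz, time step in seconds).

    Hypothetical helper based on standard STFT relations.
    """
    # Spacing between two consecutive frequency bins.
    freq_resolution = samplerate / nfft
    # Number of samples between the starts of two consecutive windows.
    hop = window_size * (1 - overlap_percent / 100)
    # Time between two consecutive spectrogram columns.
    time_step = hop / samplerate
    return freq_resolution, time_step


# Values taken from the parameter cell above.
freq_res, time_step = spectrogram_resolution(
    samplerate=10, nfft=100, window_size=100, overlap_percent=0
)
print(freq_res, time_step)
```

Under these assumptions, a 100-sample window at 10 Hz spans 10 seconds, which is the whole `spectro_duration`, so each spectrogram would contain a single column. A smaller `window_size` or a higher `analysis_samplerate` would give finer time resolution.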

### 2.2 Technical properties

Here you can change the technical aspects of spectrogram generation.
- `merge_on_reshape`: If `spectro_duration` differs from the original audio duration, new audio files with the desired duration are created. By default, if a new file overlaps two old ones, they are merged, assuming their timestamps correspond. If you do not want this behavior, set this parameter to False.
- `last_file_behavior`: Controls how the leftover data is treated after reshaping all files, or after reshaping each file if `merge_on_reshape` is False. You can `pad` it with silence until it reaches the desired duration, `truncate` the last file to the duration of the leftover data, or `discard` the leftover data altogether.
- `save_matrix`: NumPy matrices take up an enormous amount of disk space. Only save them if you know you need them.
- `save_image`: Set to False to disable image generation if you only want matrices. At least one of `save_matrix` and `save_image` must be True.
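
The three `last_file_behavior` options can be sketched on a plain list of samples. This is a simplified illustration of the idea, not OSmOSE's implementation: `split_into_segments` is a hypothetical helper, and silence is represented by zeros.

```python
def split_into_segments(samples, segment_length, last_file_behavior="pad"):
    """Cut a list of samples into segments of segment_length samples.

    Hypothetical helper for illustration; not part of OSmOSE.
    """
    if last_file_behavior not in ("pad", "truncate", "discard"):
        raise ValueError(f"Unknown last_file_behavior: {last_file_behavior}")

    segments = [
        samples[start:start + segment_length]
        for start in range(0, len(samples), segment_length)
    ]

    # Only the last segment can be shorter than segment_length.
    if segments and len(segments[-1]) < segment_length:
        if last_file_behavior == "pad":
            # Fill the missing samples with silence (zeros).
            missing = segment_length - len(segments[-1])
            segments[-1] = segments[-1] + [0] * missing
        elif last_file_behavior == "discard":
            # Drop the leftover data entirely.
            segments.pop()
        # "truncate": keep the shorter last segment as it is.
    return segments


samples = [1, 2, 3, 4, 5, 6, 7]
for behavior in ("pad", "truncate", "discard"):
    print(behavior, split_into_segments(samples, 3, behavior))
```

With seven samples and a segment length of three, only the handling of the final lone sample differs between the three options.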

In [4]:
merge_on_reshape = True
last_file_behavior = "pad" # or truncate or discard
save_matrix = False
save_image = True

spectrogram.number_adjustment_spectrogram = 1
spectrogram.batch_number = 1
force_init = False

date_template = ""

### 2.3 Initialize the dataset for spectrogram generation

The initialize method will first build your dataset to OSmOSE standard if it is not already prepared, then create the files required to generate spectrograms.

In [5]:
spectrogram.initialize(date_template=date_template,
                       force_init=force_init)

## 3. Generate adjustment spectrograms

These spectrograms are meant to evaluate the spectrogram parameters. If you are not satisfied with the result, change the parameters [in this cell](#params) and rerun the adjustment.

In [6]:
sum(2**i for i in range(spectrogram.zoom_level))


0

In [7]:
files = list(spectrogram.audio_path.glob("*.wav"))

# Sample at most number_adjustment_spectrogram files, without asking for more files than exist
files_to_process = random.sample(files, min(spectrogram.number_adjustment_spectrogram, len(files)))

for audio_file in files_to_process:
    spectrogram.generate_spectrogram(audio_file=audio_file, adjust=True, save_image=True)

spectro_list = os.listdir(spectrogram.path_output_spectrogram)
for spectro in spectro_list:
    display(Image(spectrogram.path_output_spectrogram.joinpath(spectro)))

[] 0 0
True


LibsndfileError: Error opening '2022_07_07T21_40_40_000.wav': System error.

## 4. Generate all spectrograms

In [None]:
for file in files:
    spectrogram.generate_spectrogram(audio_file=file,
                                     save_image=save_image,
                                     save_matrix=save_matrix,
                                     last_file_behavior=last_file_behavior,
                                     merge_files=merge_on_reshape)