Audio data file naming standard for Orcasound? #7
-
I think the format Ben showed here may be the best fit for this project. Having start and stop timestamps in the filename will make scanning and collating files by date range much easier. The extra dimensions we have that the current schema does not address are the time- and frequency-domain granularity. I think we could address these either through the filename (Ben's examples had precision in seconds, so this is close, but we would probably want to change it to something like seconds per sample?) or through the location. Here is an example for a 6-hour file with data at one-minute, 10 Hz granularity.
Ex 1: rpi_port_townsend/20230101T085500_20230101T145500_60s_10hz.parquet
One additional option is to include some "pre-made" frequency band settings, such as "3rd-octave", "12-octave", "10_log_hz", etc.
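To make the date-range argument concrete, here is a minimal Python sketch of how a name like Ex 1 could be parsed and filtered. Everything in it is an assumption for illustration: the names `MetricFile`, `parse_metric_path`, and `overlaps` are hypothetical, the `60s` and `10hz` fields are read as seconds per sample and frequency resolution in Hz, and the timestamps are kept naive because the example carries no time zone designator.

```python
import re
from datetime import datetime
from typing import NamedTuple


class MetricFile(NamedTuple):
    """Fields recovered from the proposed name (hypothetical structure)."""
    location: str
    start: datetime
    stop: datetime
    seconds_per_sample: int
    freq_resolution_hz: int


# Assumed layout: <location>/<start>_<stop>_<N>s_<M>hz.parquet
_PATTERN = re.compile(
    r"^(?P<location>[^/]+)/"
    r"(?P<start>\d{8}T\d{6})_(?P<stop>\d{8}T\d{6})_"
    r"(?P<secs>\d+)s_(?P<hz>\d+)hz\.parquet$"
)
_STAMP = "%Y%m%dT%H%M%S"


def parse_metric_path(path: str) -> MetricFile:
    """Split a path in the proposed format into typed fields."""
    m = _PATTERN.match(path)
    if m is None:
        raise ValueError(f"not in the proposed format: {path!r}")
    return MetricFile(
        location=m.group("location"),
        start=datetime.strptime(m.group("start"), _STAMP),
        stop=datetime.strptime(m.group("stop"), _STAMP),
        seconds_per_sample=int(m.group("secs")),
        freq_resolution_hz=int(m.group("hz")),
    )


def overlaps(f: MetricFile, range_start: datetime, range_stop: datetime) -> bool:
    """True if the file's [start, stop) interval intersects the query range."""
    return f.start < range_stop and f.stop > range_start


example = parse_metric_path(
    "rpi_port_townsend/20230101T085500_20230101T145500_60s_10hz.parquet"
)
```

With both timestamps in the name, a date-range query only needs the file listing; no file has to be opened to learn its duration.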
-
As we get serious about computing noise metrics and also consider when to implement a suite of improvements to the orcanode code, 2023 is a strategic time to try to agree upon any new standards we want for naming Orcasound audio data files, both the lossy streaming segments and the lossless recordings.
Please review this ongoing discussion of the associated date-time format issue in the orcanode repo. Feel free to offer links to other similar conversations you have seen in the bioacoustic community.
We can also discuss the current AWS data "structures" for streaming and archived audio data in separate S3 buckets:
streaming bucket > device + hydrophone location > audio format > UNIX epoch
e.g. streaming-orcasound-net > rpi_sunset_bay/ > hls/ > 1654712898/
archive bucket > device + hydrophone location > YYYY-MM-DD_HH-MM-SS_device_location-samplerate-numchannels.flac
e.g. archive-orcasound-net > rpi_port_townsend/ > 2023-01-21_16-42-31_rpi_port_townsend-48000-2.flac
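For comparison, the two current layouts above could be decoded roughly as follows. This is only a sketch built from the two examples shown: `parse_streaming_prefix` and `parse_archive_name` are hypothetical helper names, the archive timestamp is left naive because the filename does not state a time zone, and the UNIX epoch is interpreted as UTC by definition.

```python
import re
from datetime import datetime, timezone

# Assumed archive layout, taken from the example above:
# YYYY-MM-DD_HH-MM-SS_<device_location>-<samplerate>-<numchannels>.flac
_ARCHIVE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})_"
    r"(?P<node>.+)-(?P<rate>\d+)-(?P<channels>\d+)\.flac$"
)


def parse_streaming_prefix(prefix: str) -> dict:
    """Decode '<device_location>/<audio format>/<UNIX epoch>/'."""
    node, audio_format, epoch = prefix.strip("/").split("/")
    return {
        "node": node,
        "format": audio_format,
        # Epoch seconds are UTC by definition.
        "start": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
    }


def parse_archive_name(name: str) -> dict:
    """Decode an archive .flac filename into its fields."""
    m = _ARCHIVE.match(name)
    if m is None:
        raise ValueError(f"unrecognized archive name: {name!r}")
    return {
        "node": m.group("node"),
        # Naive datetime: the name carries no time zone information.
        "start": datetime.strptime(m.group("stamp"), "%Y-%m-%d_%H-%M-%S"),
        "sample_rate_hz": int(m.group("rate")),
        "channels": int(m.group("channels")),
    }


stream = parse_streaming_prefix("rpi_sunset_bay/hls/1654712898/")
archive = parse_archive_name("2023-01-21_16-42-31_rpi_port_townsend-48000-2.flac")
```

Note that neither layout records a stop time, and the two buckets encode the start time differently, which is part of what a shared naming standard would need to settle.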
Associated issues: