Sampling Rate, Frame Shift and Feature Extractors #768

DSMaVie · 2022-07-05T19:43:46Z

First off, let me say, I love Lhotse. :)

However, I have some problems with the feature extractor API. Usually everything in Lhotse is quite sampling-rate agnostic: the Recording class stores the sampling rate per recording, and even the extract function of the FeatureExtractor expects the sampling rate per sample.

But then, the FeatureExtractor also requires implementing the frame_shift property, which sometimes depends on the sampling rate. In those cases the sampling rate is assumed to be the same for every sample ingested by the FeatureExtractor, usually an entire CutSet. Even in the examples and recipes in the code base, the frame_shift sometimes depends on the sampling rate (which means the rate is assumed to be equal for every input to the extractor) and sometimes does not.
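To make the dependency concrete, here is a minimal sketch that does not use Lhotse at all. It assumes an extractor whose hop is configured in samples, as STFT-style front ends often are; `hop_length` and `frame_shift_seconds` are hypothetical names chosen for illustration.

```python
def frame_shift_seconds(hop_length: int, sampling_rate: int) -> float:
    """Frame shift in seconds for a hop configured in samples.

    Hypothetical helper for illustration; not a Lhotse function.
    """
    if hop_length <= 0 or sampling_rate <= 0:
        raise ValueError("hop_length and sampling_rate must be positive")
    return hop_length / sampling_rate


# The same hop of 160 samples gives a different frame shift per sampling rate,
# so a single frame_shift value only holds if every input shares one rate.
shift_16k = frame_shift_seconds(160, 16000)
shift_8k = frame_shift_seconds(160, 8000)
print(shift_16k, shift_8k)
```

With a hop fixed in samples, halving the sampling rate doubles the frame shift in seconds, which is why such an extractor cannot report one frame_shift for inputs with mixed rates.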

I am just wondering if there is a better way to handle this. The frame_shift in its current form is heavily validated but hardly explained, and I don't see any immediate reason for it to exist anyway. I find it cumbersome: it requires working against Lhotse's implementation of the FeatureExtractor API more than working with it.

I am not sure what the optimal solution would be, but from a cursory look through the code base, you could:

- Drop the frame_shift property and its validation, or only validate when frame_shift is actually implemented.
- Move the sampling_rate argument from the extract method to __init__, to make it clear that a fixed sampling rate is expected for the entire FeatureSet, which seems to be the case for most implementations I have seen so far anyway.
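A rough sketch of what the second option could look like, written as a standalone class rather than against Lhotse's real base class. `FixedRateExtractor`, `hop_length`, and the placeholder feature computed in `extract` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class FixedRateExtractor:
    """Hypothetical extractor bound to one sampling rate; not part of Lhotse."""

    sampling_rate: int  # Hz, fixed at construction time
    hop_length: int     # hop between frames, in samples

    @property
    def frame_shift(self) -> float:
        # Well defined because the sampling rate can no longer vary per call.
        return self.hop_length / self.sampling_rate

    def extract(self, samples: Sequence[float]) -> List[float]:
        # Placeholder feature: mean absolute amplitude of each full hop.
        hop = self.hop_length
        return [
            sum(abs(s) for s in samples[start:start + hop]) / hop
            for start in range(0, len(samples) - hop + 1, hop)
        ]


extractor = FixedRateExtractor(sampling_rate=16000, hop_length=160)
features = extractor.extract([0.5] * 16000)  # one second of constant signal
print(extractor.frame_shift, len(features))
```

Binding the rate in the constructor makes the single-rate assumption explicit, at the cost of needing one extractor instance per sampling rate.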

Again, I really love Lhotse and wanted to leave this feedback on a little piece of the code somewhere. I hope it helps :)

pzelasko · 2022-07-05T19:50:22Z

Maintainer

Thanks for reaching out. Can you show an example of a FeatureExtractor that does not have a frame_shift? In most applications frame_shift is a key property, since it tells you the elapsed duration between two consecutive frames.

Also, can you explain how frame_shift depends on the sampling_rate? We specifically express the frame_shift in seconds to make it sampling-rate agnostic.
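As an illustration of that convention (a sketch only, not Lhotse's implementation), a frame shift given in seconds maps frame indices to timestamps without any reference to the sampling rate; the rate is needed only to convert the shift into a hop in samples. The helper names below are hypothetical.

```python
from typing import List


def frame_start_times(num_frames: int, frame_shift: float) -> List[float]:
    """Start time in seconds of each frame; no sampling rate required."""
    return [index * frame_shift for index in range(num_frames)]


def shift_in_samples(frame_shift: float, sampling_rate: int) -> int:
    """Hop in samples implied by a frame shift in seconds at a given rate."""
    return round(frame_shift * sampling_rate)


frame_shift = 0.01  # seconds, identical for every recording
times = frame_start_times(3, frame_shift)
hops = {rate: shift_in_samples(frame_shift, rate) for rate in (8000, 16000, 44100)}
print(times, hops)
```

Here the timestamps are the same for every recording, while the hop in samples scales with each recording's own rate.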
