Description
Videos in SenselabDatasets that have audio do not convert to the exact same audio in HuggingFace. This is likely caused by a few different factors: extracting the audio from a video with torchvision does not produce the same waveform as using ffmpeg directly, and converting to a HuggingFace dataset via its Audio feature uses soundfile under the hood, which introduces additional distortions at some points.
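For reference, a minimal sketch of the two decode paths being compared; the video path and output filename are placeholders, not files from this repo:

```python
import subprocess

import torch
import torchaudio
import torchvision

VIDEO = "sample.mp4"  # hypothetical test file

# Path 1: torchvision demuxes and decodes the audio stream itself.
# aframes has shape (channels, samples); depending on the codec, values
# may already be floats in [-1, 1].
_, tv_audio, _ = torchvision.io.read_video(VIDEO, pts_unit="sec")
tv_audio = tv_audio.to(torch.float32)

# Path 2: ffmpeg extracts the audio to 32-bit float WAV, torchaudio loads it.
subprocess.run(
    ["ffmpeg", "-y", "-i", VIDEO, "-vn", "-c:a", "pcm_f32le", "extracted.wav"],
    check=True,
)
ff_audio, _ = torchaudio.load("extracted.wav")

# The decoders may pad or trim differently, so compare the common prefix.
n = min(tv_audio.shape[-1], ff_audio.shape[-1])
print(torch.equal(tv_audio[..., :n], ff_audio[..., :n]))    # typically False
print((tv_audio[..., :n] - ff_audio[..., :n]).abs().max())  # small but nonzero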
Steps to Reproduce
In dataset_test.py, we test a video and its extracted audio. The test currently checks that the converted tensors are close to each other (defined here as atol=1e-4), but the issue can be seen by checking for exact equality instead.
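A self-contained sketch of the kind of check involved; the helper and the synthetic perturbation are illustrative, not the actual test code:

```python
import torch

def check_roundtrip(original: torch.Tensor, roundtrip: torch.Tensor) -> None:
    """Compare a waveform before and after the HuggingFace roundtrip."""
    print("allclose(atol=1e-4):", torch.allclose(original, roundtrip, atol=1e-4))
    print("exactly equal:", torch.equal(original, roundtrip))
    print("max abs diff:", (original - roundtrip).abs().max().item())

# Illustrative stand-in: a waveform perturbed at the observed scale (~5e-5).
wave = torch.rand(2, 16_000) * 2 - 1
noise = (torch.rand_like(wave) - 0.5) * 1e-4  # uniform in [-5e-5, 5e-5]
check_roundtrip(wave, wave + noise)           # passes allclose, fails equal
```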
Expected Results
We would expect that, no matter which library is used to decode the audio from a video, converting to a HuggingFace dataset and back to a SenselabDataset would yield the same audio throughout the process, since the audio waveform is just a 2D tensor.
Actual Results
For videos, the audio diverges after converting to HuggingFace and back to Senselab, with a maximum absolute difference of around 5e-5, though notably not every value diverges. It's possible that differences in how the libraries handle silence, or near silence, cause the encodings to differ.
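If the near-silence hypothesis is worth testing, a diagnostic along these lines could check whether the divergent samples coincide with low-amplitude regions (the function and the threshold are hypothetical, not part of the codebase):

```python
import torch

def divergence_report(a: torch.Tensor, b: torch.Tensor, silence_thresh: float = 1e-3) -> None:
    """Summarize where two waveforms diverge and whether those samples are near-silent."""
    diff = (a - b).abs()
    diverged = diff > 0
    near_silent = a.abs() < silence_thresh
    print("fraction of samples diverged:", diverged.float().mean().item())
    print("max abs diff:", diff.max().item())
    if diverged.any():
        frac = (diverged & near_silent).float().sum() / diverged.float().sum()
        print("share of divergent samples that are near-silent:", frac.item())
```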
Additional Notes
Interestingly, this issue has not been seen when converting existing audio files, which leads me to believe it is a result of the different encodings of the audio extracted from a video (floating-point precision and bit depth).
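One back-of-the-envelope check on the bit-depth hypothesis: if any stage of the pipeline round-trips through 16-bit PCM, the quantization error alone is on the same order as the observed divergence. This is a sketch of the arithmetic, not a claim about which stage does it:

```python
import torch

# Rounding to the nearest int16 step loses at most half a step,
# i.e. 0.5 / 32767, roughly 1.5e-5 per sample, which is the same order
# of magnitude as the observed 5e-5 maximum divergence.
wave = torch.rand(1, 16_000) * 2 - 1               # float32 waveform in [-1, 1]
as_int16 = (wave * 32767).round().to(torch.int16)  # encode to 16-bit PCM
back = as_int16.to(torch.float32) / 32767          # decode back to float32
print((wave - back).abs().max().item())            # approximately 1.5e-5
```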