wav_io needs to accept WAV files that have a 'JUNK' chunk before the 'fmt ' chunk #1503

@yongtang

This issue comes from a comment on tensorflow/tensorflow#26247 (cc @MemoonaTahira):

I am doing transfer learning for audio following this tutorial, and I have a few WAV files with a 'BEXT' chunk. Decoding them throws an error.

2021-08-17 16:13:21.912804: W tensorflow/core/framework/op_kernel.cc:1692] OP_REQUIRES failed at audio_video_wav_kernels.cc:315 : Out of range: EOF reached
2021-08-17 16:13:21.916281: W tensorflow/core/framework/op_kernel.cc:1692] OP_REQUIRES failed at decode_wav_op.cc:55 : Invalid argument: Data too short when trying to read string
Traceback (most recent call last):
  File "C:\Users\Mona\anaconda3\envs\lisnen_work\lib\contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "C:\Users\Mona\anaconda3\envs\lisnen_work\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2833, in variable_creator_scope
    yield
  File "C:\Users\Mona\anaconda3\envs\lisnen_work\lib\site-packages\keras\engine\training.py", line 1184, in fit
    tmp_logs = self.train_function(iterator)
  File "C:\Users\Mona\anaconda3\envs\lisnen_work\lib\site-packages\tensorflow\python\eager\def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\Mona\anaconda3\envs\lisnen_work\lib\site-packages\tensorflow\python\eager\def_function.py", line 917, in _call
    return self._stateless_fn(*args, **kwds) # pylint: disable=not-callable
  File "C:\Users\Mona\anaconda3\envs\lisnen_work\lib\site-packages\tensorflow\python\eager\function.py", line 3039, in __call__
    return graph_function._call_flat(
  File "C:\Users\Mona\anaconda3\envs\lisnen_work\lib\site-packages\tensorflow\python\eager\function.py", line 1963, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "C:\Users\Mona\anaconda3\envs\lisnen_work\lib\site-packages\tensorflow\python\eager\function.py", line 591, in call
    outputs = execute.execute(
  File "C:\Users\Mona\anaconda3\envs\lisnen_work\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Function invocation produced OutOfRangeError: EOF reached
     [[{{node PartitionedCall/IO>AudioDecodeWAV}}]]
     [[IteratorGetNext]]
  (1) Invalid argument: Function invocation produced OutOfRangeError: EOF reached
     [[{{node PartitionedCall/IO>AudioDecodeWAV}}]]
     [[IteratorGetNext]]
     [[IteratorGetNext/_2]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_31415]

Function call stack:
train_function -> train_function

2021-08-17 16:13:29.065415: W tensorflow/core/framework/op_kernel.cc:1692] OP_REQUIRES failed at decode_wav_op.cc:55 : Invalid argument: Header mismatch: Expected fmt but found bext
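
The "Expected fmt but found bext" message suggests that the decoder expects 'fmt ' to be the first chunk after the RIFF/WAVE header. To confirm which chunks a given file contains and in what order, a minimal sketch using only the Python standard library could look like the following. The helper name `list_riff_chunks` is hypothetical and not part of TensorFlow or tensorflow-io; it assumes a plain little-endian RIFF file and does not handle RF64 or truncated files.

```python
import struct


def list_riff_chunks(raw: bytes):
    """Return [(chunk_id, payload_offset, payload_size), ...] for RIFF/WAVE bytes."""
    if len(raw) < 12 or raw[0:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12  # first chunk header starts right after 'RIFF' <size> 'WAVE'
    while pos + 8 <= len(raw):
        # Each chunk header is a 4-byte id followed by a little-endian uint32 size.
        chunk_id, size = struct.unpack_from("<4sI", raw, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), pos + 8, size))
        # Chunk payloads are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    return chunks
```

Calling it on the bytes of one of the failing files (for example `list_riff_chunks(open(path, "rb").read())`, where `path` is the file in question) would show whether a 'bext' or 'JUNK' chunk sits before 'fmt '.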

(I am using tensorflow-gpu 2.6. I first tried with just tfio 0.20.0,
and then again after installing tensorflow-io-nightly 0.20.0.dev20210815170710.)

I have recreated the function load_wav_16k_mono(), as discussed in this thread, using tfio.audio.decode_wav instead of tf.audio.decode_wav.

Both functions, load_wav_16k_mono() and load_wav_16k_mono_modified(), show exactly the same output in the debugger. However, when I use audio files processed through the modified function for training, I still get the same error.

Here is the full code:

import tensorflow as tf
import tensorflow_io as tfio


@tf.function
def load_wav_16k_mono(filename):
    """ Load a WAV file, convert it to a float tensor, resample to 16 kHz single-channel audio. """
    file_contents = tf.io.read_file(filename)
    wav, sample_rate = tf.audio.decode_wav(
        file_contents,
        desired_channels=1)
    wav = tf.squeeze(wav, axis=-1)
    sample_rate = tf.cast(sample_rate, dtype=tf.int64)
    wav = tfio.audio.resample(wav, rate_in=sample_rate, rate_out=16000)
    return wav


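@tf.function
def load_wav_16k_mono_modified(filename):
    """ Same as load_wav_16k_mono, but decode the samples with tfio.audio.decode_wav. """
    file_contents = tf.io.read_file(filename)

    # Decode the samples as int16 with tensorflow-io and keep the first channel.
    wav = tfio.audio.decode_wav(file_contents, dtype=tf.int16)
    wav = wav[:, 0]
    wav = tf.cast(wav, tf.float32)

    # Note: the sample rate is still read with tf.audio.decode_wav,
    # which parses the same file contents a second time.
    _, sample_rate = tf.audio.decode_wav(file_contents, desired_channels=1)
    sample_rate = tf.cast(sample_rate, dtype=tf.int64)
    wav = tfio.audio.resample(wav, rate_in=sample_rate, rate_out=16000)

    return wav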

testing_wav_file_name = tf.keras.utils.get_file('miaow_16k.wav',
                                                'https://storage.googleapis.com/audioset/miaow_16k.wav',
                                                cache_dir='./',
                                                cache_subdir='test_data')

load_wav_16k_mono(testing_wav_file_name)
load_wav_16k_mono_modified(testing_wav_file_name)

I also saw this code in one of the issues. It handles bext chunks, but tfio.audio.decode_wav probably does not?
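
Until the decoder accepts extra chunks, one possible workaround is to rewrite the file so that only the 'fmt ' and 'data' chunks remain, in that order. The sketch below does this with the Python standard library only. The helper name `keep_fmt_and_data` is hypothetical, the approach discards all metadata chunks (including 'bext'), and whether the TensorFlow and tensorflow-io decoders then accept the result is an assumption that has not been verified here.

```python
import struct


def keep_fmt_and_data(raw: bytes) -> bytes:
    """Rebuild RIFF/WAVE bytes containing only the 'fmt ' and 'data' chunks."""
    if len(raw) < 12 or raw[0:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    kept = {}
    pos = 12  # first chunk header starts right after 'RIFF' <size> 'WAVE'
    while pos + 8 <= len(raw):
        chunk_id, size = struct.unpack_from("<4sI", raw, pos)
        # Keep the first 'fmt ' and 'data' payloads; skip everything else.
        if chunk_id in (b"fmt ", b"data") and chunk_id not in kept:
            kept[chunk_id] = raw[pos + 8 : pos + 8 + size]
        pos += 8 + size + (size & 1)  # payloads are padded to even length
    if b"fmt " not in kept or b"data" not in kept:
        raise ValueError("missing 'fmt ' or 'data' chunk")
    body = b"WAVE"
    for chunk_id in (b"fmt ", b"data"):
        payload = kept[chunk_id]
        body += chunk_id + struct.pack("<I", len(payload)) + payload
        if len(payload) & 1:
            body += b"\x00"  # restore the pad byte for odd-sized payloads
    return b"RIFF" + struct.pack("<I", len(body)) + body
```

The cleaned bytes could then be written to a new file and loaded in place of the original. Because the metadata is dropped, this is only suitable when the extra chunks are not needed downstream.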
