Error: Error while processing frame #3

Closed
great-thoughts opened this issue Jul 4, 2016 · 2 comments

Comments

@great-thoughts

Hi John,
I was trying to use webrtcvad to classify my audio frames into speech and silence. When I segment my audio into 30 ms frames, the code runs with no errors. However, when I try 25 ms frames, I get this error:

 in is_speech(self, buf, sample_rate, length)
     25                 'buffer has %s frames, but length argument was %s' % (
     26                     int(len(buf) / 2.0), length))
---> 27         return _webrtcvad.process(self._vad, sample_rate, buf, length)
     28 
     29 

Error: Error while processing frame

This is my code:

source1 = path + "phone1.wav"
audio, sample_rate = read_wave(source1)
framesz=25.
frames = frame_generator(framesz, audio, sample_rate)
vad = webrtcvad.Vad(3)
frames = list(frames)
num_voiced = [1 if vad.is_speech(f, sample_rate) else 0 for f in frames]
@wiseman
Owner

wiseman commented Jul 5, 2016

The underlying webrtc code can only handle frames that are 10, 20, or 30 ms long. You can use webrtcvad.valid_rate_and_frame_length to check whether a sample rate/frame size is valid (see e.g. https://github.com/wiseman/py-webrtcvad/blob/master/test_webrtcvad.py#L20).
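To make the constraint concrete, here is a minimal pure-Python sketch of it. It is not the library's implementation: the helper names `frame_is_supported` and `samples_per_frame` are hypothetical, and the list of sample rates (8000, 16000, 32000, 48000 Hz) is taken from the py-webrtcvad README rather than from this thread.

```python
# Limits as documented for py-webrtcvad (assumption: README values),
# for 16-bit mono PCM input.
VALID_RATES_HZ = (8000, 16000, 32000, 48000)
VALID_FRAME_MS = (10, 20, 30)


def frame_is_supported(sample_rate, frame_ms):
    """Hypothetical helper: True if the rate/duration pair fits the limits above."""
    return sample_rate in VALID_RATES_HZ and frame_ms in VALID_FRAME_MS


def samples_per_frame(sample_rate, frame_ms):
    """Hypothetical helper: samples in one frame, or ValueError if unsupported."""
    if not frame_is_supported(sample_rate, frame_ms):
        raise ValueError(
            "unsupported combination: %s Hz, %s ms" % (sample_rate, frame_ms))
    return int(sample_rate * frame_ms // 1000)


# The 25 ms frame size from the question fails the check; 30 ms passes.
for ms in (25, 30):
    print(ms, frame_is_supported(16000, ms))
```

With the library installed, `webrtcvad.valid_rate_and_frame_length` remains the authoritative check; in the linked test file it is called with the sample rate and the frame length in samples.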

@wiseman wiseman closed this as completed Jul 5, 2016
@Prakash2608

import webrtcvad
import soundfile as sf
import numpy as np
import librosa

def extract_speech_segments(audio_path, output_path):
    # Load the audio file
    audio, sr = librosa.load(audio_path, sr = 16000)

    # Set the VAD parameters
    vad = webrtcvad.Vad()
    vad.set_mode(3)  # Aggressiveness level (0-3)

    # Set the frame duration for VAD analysis
    frame_duration = 30  # in milliseconds

    # Convert the frame duration to the number of samples
    frame_size = int(sr * (frame_duration / 1000.0))

    # Initialize variables
    speech_segments = []
    current_segment_start = 0
    current_segment_end = 0

    # Iterate over the audio frames
    for i in range(0, len(audio), frame_size):
        frame = audio[i:i + frame_size]

        # Convert the frame to int16 format
        frame = np.int16(frame * 32768)

        # Check if the frame contains speech
        if vad.is_speech(frame.tobytes(), sample_rate=sr):
            # If it's a new speech segment, update the current segment start
            if current_segment_start == 0:
                current_segment_start = i

            # Update the current segment end
            current_segment_end = i + frame_size

        # If the frame does not contain speech
        else:
            # If we were in a speech segment, add it to the list
            if current_segment_start != 0:
                speech_segments.append((current_segment_start, current_segment_end))
                current_segment_start = 0
                current_segment_end = 0

    # Save the speech segments as individual audio files
    for idx, (start, end) in enumerate(speech_segments):
        segment_audio = audio[start:end]
        segment_output_path = f"{output_path}_segment{idx}.wav"
        sf.write(segment_output_path, segment_audio, sr)

    return speech_segments

# Example usage

audio_path = '/kaggle/working/audio.wav'
output_path = '/kaggle/working/'
speech_segments = extract_speech_segments(audio_path, output_path)

This is my code.

Error Traceback (most recent call last)
Cell In[10], line 60
58 audio_path = '/kaggle/working/audio.wav'
59 output_path = '/kaggle/working/'
---> 60 speech_segments = extract_speech_segments(audio_path, output_path)

Cell In[10], line 33, in extract_speech_segments(audio_path, output_path)
30 frame = np.int16(frame * 32768)
32 # Check if the frame contains speech
---> 33 if vad.is_speech(frame.tobytes(), sample_rate=sr):
34 # If it's a new speech segment, update the current segment start
35 if current_segment_start == 0:
36 current_segment_start = i

File /opt/conda/lib/python3.10/site-packages/webrtcvad.py:27, in Vad.is_speech(self, buf, sample_rate, length)
23 if length * 2 > len(buf):
24 raise IndexError(
25 'buffer has %s frames, but length argument was %s' % (
26 int(len(buf) / 2.0), length))
---> 27 return _webrtcvad.process(self._vad, sample_rate, buf, length)

Error: Error while processing frame

I am getting this error even though I have checked the prerequisites.
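One possible cause, not confirmed anywhere in this thread: the loop above slices `audio[i:i + frame_size]` up to the end of the file, so the final frame is usually shorter than 30 ms, and the frame-length limit described earlier in the thread (10, 20, or 30 ms) would then reject it. The sketch below shows one way to yield only complete frames from a 16-bit mono PCM byte buffer. The function name `full_frames` is hypothetical, and the code uses only the standard library, so it illustrates the framing step rather than the VAD call itself.

```python
def full_frames(pcm_bytes, sample_rate, frame_ms=30):
    """Yield only complete frames of 16-bit mono PCM; a short tail is dropped."""
    # 2 bytes per int16 sample
    bytes_per_frame = sample_rate * frame_ms // 1000 * 2
    last_start = len(pcm_bytes) - bytes_per_frame
    # range() is empty when the buffer is shorter than one frame
    for start in range(0, last_start + 1, bytes_per_frame):
        yield pcm_bytes[start:start + bytes_per_frame]


# One second of silence plus 100 extra samples at 16 kHz: the extra
# samples do not fill a whole 30 ms frame, so they are left out.
pcm = bytes(2 * (16000 + 100))
frames = list(full_frames(pcm, 16000))
print(len(frames), {len(f) for f in frames})
```

Each yielded frame could then be passed to `vad.is_speech(frame, sample_rate)` as in the code above. Dropping the tail is a design choice; padding the last frame with zeros to full length is an alternative if the final few milliseconds matter.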
