Extended Recording on Low-Resource Devices #11

Closed
Nirvanatin opened this issue Jun 2, 2023 · 4 comments
Labels
help wanted (The issuer has requested some help), unrelated (A question loosely related to the library)

Comments

@Nirvanatin

How can the pawp_simple_recording_app.py example be adjusted to optimize it for extended recording sessions on devices with limited storage and RAM?

@s0d3s
Owner

s0d3s commented Jun 3, 2023

Hi🖐
You asked a rather abstract question, but I'll try to answer.

To begin with, this example was written to process the audio stream only after it has been completely recorded, which in itself is not optimal when device resources are limited.

It also uses a fairly "wide" sample format, paInt24. Narrowing it to paInt8 reduces the space required to store audio fragments.

To sum up, to reduce the memory the application needs:

  • Process audio fragments immediately instead of postponing them for later processing. For example, write them to a file right in the callback (if this is just a recording application).
  • "Reduce" your sample format to paInt8.

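Required memory table (uncompressed .WAV, etc.)

| Format  | Sample rate | Bit depth | Total memory (MB) |
|---------|-------------|-----------|-------------------|
| paInt24 | 44.1 kHz    | 24 bit    | 908.43            |
| paInt16 | 44.1 kHz    | 16 bit    | 605.62            |
| paInt8  | 44.1 kHz    | 8 bit     | 302.81            |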
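
The figures in the table appear consistent with one hour of stereo audio expressed in MiB (1 MiB = 1,048,576 bytes). The duration and channel count are not stated in the thread, so treat them as an inference from the arithmetic. A minimal sketch of the calculation, where `uncompressed_size_mib` is a hypothetical helper name:

```python
def uncompressed_size_mib(sample_rate_hz: int, bytes_per_sample: int,
                          channels: int, seconds: int) -> float:
    """Size of raw PCM audio in MiB (file header overhead is ignored)."""
    total_bytes = sample_rate_hz * bytes_per_sample * channels * seconds
    return total_bytes / (1024 * 1024)


# Assumption: 1 hour (3600 s) of 2-channel audio at 44.1 kHz.
for name, width in (("paInt24", 3), ("paInt16", 2), ("paInt8", 1)):
    print(name, round(uncompressed_size_mib(44_100, width, 2, 3600), 2))
```

The size scales linearly with every factor, so halving the channel count or the sample width halves the storage needed.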

What can I do to reduce the file size of recorded audio?

  • First of all, use a codec to compress your audio. I recommend FLAC.
  • (Optional) Convert the recording from stereo to mono.
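
As a rough illustration of the stereo-to-mono step, the sketch below averages the left and right samples of an interleaved buffer. It assumes 16-bit signed samples in native byte order (what a paInt16 stream would typically deliver on a little-endian machine); `stereo_to_mono_int16` is a hypothetical helper, not part of pyaudiowpatch.

```python
from array import array


def stereo_to_mono_int16(raw: bytes) -> bytes:
    """Downmix interleaved 16-bit stereo (L, R, L, R, ...) to mono."""
    samples = array("h")          # "h" = signed 16-bit, native byte order
    samples.frombytes(raw)
    mono = array(
        "h",
        # Average each left/right pair; the result always fits in 16 bits.
        ((samples[i] + samples[i + 1]) // 2
         for i in range(0, len(samples) - 1, 2)),
    )
    return mono.tobytes()
```

Called from the stream callback before writing, this would halve the amount of data stored, at the cost of a little CPU time per buffer.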

@s0d3s s0d3s added help wanted The issuer has requested some help unrelated A question loosely related to the library labels Jun 3, 2023
@Nirvanatin
Author

Hey there, I appreciate your response. I have modified your example code to immediately process audio fragments and save them as FLAC files. The code has undergone some hasty modifications and may not be reliable in various situations. However, I'm still facing the challenge of accomplishing this without relying on WAV format.

Additionally, I would greatly appreciate any suggestions you may have regarding simultaneously recording two audio inputs. At the moment, I'm dependent on OBS Studio and VB-Audio Virtual Cable to apply RNNoise suppression to my microphone inputs. Could you suggest a simpler solution that can be implemented entirely in Python?

import pyaudiowpatch as pyaudio
import wave
import soundfile as sf


filename = "loopback_record_class.wav"
compressed_filename = "loopback_record_class.flac"
data_format = pyaudio.paInt24


class ARException(Exception):
    """Base class for AudioRecorder's exceptions"""
    ...


class WASAPINotFound(ARException):
    ...


class InvalidDevice(ARException):
    ...


class AudioRecorder:
    def __init__(self, p_audio: pyaudio.PyAudio, wave_file: wave.Wave_write):
        self.p = p_audio
        self.wave_file = wave_file
        self.stream = None

    @staticmethod
    def get_default_wasapi_device(p_audio: pyaudio.PyAudio):
        try:  # Get default WASAPI info
            wasapi_info = p_audio.get_host_api_info_by_type(pyaudio.paWASAPI)
        except OSError:
            raise WASAPINotFound("Looks like WASAPI is not available on the system")

        # Get default WASAPI speakers
        sys_default_speakers = p_audio.get_device_info_by_index(wasapi_info["defaultOutputDevice"])

        if not sys_default_speakers["isLoopbackDevice"]:
            for loopback in p_audio.get_loopback_device_info_generator():
                if sys_default_speakers["name"] in loopback["name"]:
                    return loopback
            else:
                raise InvalidDevice("Default loopback output device not found.\n\nRun `python -m pyaudio` to check available devices")

        # The default output device is already a loopback device
        return sys_default_speakers

    def callback(self, in_data, frame_count, time_info, status):
        """Write frames to file immediately and return PA flag"""
        self.wave_file.writeframes(in_data)
        return (None, pyaudio.paContinue)

    def start_recording(self, target_device: dict):
        self.close_stream()

        self.stream = self.p.open(format=data_format,
                                  channels=target_device["maxInputChannels"],
                                  rate=int(target_device["defaultSampleRate"]),
                                  frames_per_buffer=pyaudio.get_sample_size(pyaudio.paInt24),
                                  input=True,
                                  input_device_index=target_device["index"],
                                  stream_callback=self.callback
                                  )

    def stop_stream(self):
        self.stream.stop_stream()

    def start_stream(self):
        self.stream.start_stream()

    def close_stream(self):
        if self.stream is not None:
            self.stream.stop_stream()
            self.stream.close()
            self.stream = None

    @property
    def stream_status(self):
        return "closed" if self.stream is None else "stopped" if self.stream.is_stopped() else "running"


if __name__ == "__main__":
    p = pyaudio.PyAudio()
    ar = None

    help_msg = 30 * "-" + "\n\n\nStatus:\nRunning=%s | Device=%s | output=%s\n\nCommands:\nlist\nrecord {device_index\\default}\npause\ncontinue\nstop {*.wav\\default}\n"
    target_device = None
    wave_file = None

    try:
        while True:
            print(help_msg % (ar.stream_status if ar else "closed", target_device["index"] if target_device else "None", filename))
            com = input("Enter command: ").split()

            if com[0] == "list":
                p.print_detailed_system_info()

            elif com[0] == "record":
                if wave_file:
                    wave_file.close()
                
                if len(com) > 1 and com[1].isdigit():
                    target_device = p.get_device_info_by_index(int(com[1]))
                else:    
                    try:
                        target_device = AudioRecorder.get_default_wasapi_device(p)
                    except ARException as E:
                        print(f"Something went wrong... {type(E)} = {str(E)[:30]}...\n")
                        continue
                
                wave_file = wave.open(filename, 'wb')
                wave_file.setnchannels(target_device["maxInputChannels"])
                wave_file.setsampwidth(pyaudio.get_sample_size(data_format))
                wave_file.setframerate(int(target_device["defaultSampleRate"]))
                
                ar = AudioRecorder(p, wave_file)
                ar.start_recording(target_device)

            elif com[0] == "pause":
                ar.stop_stream()
            elif com[0] == "continue":
                ar.start_stream()
            elif com[0] == "stop":
                ar.close_stream()
                wave_file.close()

                # Compress the recorded audio to FLAC format.
                # Note: sf.read loads the whole WAV file into memory at once.
                data, _ = sf.read(filename)
                sf.write(compressed_filename, data, int(target_device["defaultSampleRate"]), format="FLAC")

                
                print(f"The audio is written to [{filename}] and compressed to [{compressed_filename}]. Exit...")
                break

            else:
                print(f"[{com[0]}] is an unknown command")

    except KeyboardInterrupt:
        print("\n\nExit without saving...")
    finally:
        if ar:
            ar.close_stream()
        if wave_file:
            wave_file.close()
        p.terminate()

@s0d3s
Owner

s0d3s commented Jun 6, 2023

⚠ Compressing audio without buffering is not the best idea, because compressing small fragments may not be efficient enough.

But you can do it like this:

import pyaudiowpatch as pyaudio
import soundfile as sf
from typing import Optional

filename = "loopback_record_class.flac"

format_from_pya_2_sf = {
    pyaudio.paInt16: "int16",
    pyaudio.paInt32: "int32",
    # pass
}

data_format = pyaudio.paInt16

if data_format not in format_from_pya_2_sf:
    raise ValueError("Are you sure that SoundFile accepts this format?")

sf_data_format = format_from_pya_2_sf[data_format]


class ARException(Exception):
    """Base class for AudioRecorder's exceptions"""
    ...


class WASAPINotFound(ARException):
    ...


class InvalidDevice(ARException):
    ...


class AudioRecorder:
    def __init__(self, p_audio: pyaudio.PyAudio, output_file_name: str):
        self.p = p_audio
        self.output_file_name = output_file_name
        self.stream = None # type: Optional[pyaudio.Stream]
        self.output_sf = None # type: Optional[sf.SoundFile]

    @staticmethod
    def get_default_wasapi_device(p_audio: pyaudio.PyAudio):
        try:  # Get default WASAPI info
            wasapi_info = p_audio.get_host_api_info_by_type(pyaudio.paWASAPI)
        except OSError:
            raise WASAPINotFound("Looks like WASAPI is not available on the system")

        # Get default WASAPI speakers
        sys_default_speakers = p_audio.get_device_info_by_index(wasapi_info["defaultOutputDevice"])

        if not sys_default_speakers["isLoopbackDevice"]:
            for loopback in p_audio.get_loopback_device_info_generator():
                if sys_default_speakers["name"] in loopback["name"]:
                    return loopback

            else:
                raise InvalidDevice(
                    "Default loopback output device not found.\n\n"
                    "Run `python -m pyaudio` to check available devices"
                )

        # The default output device is already a loopback device
        return sys_default_speakers

    def callback(self, in_data, frame_count, time_info, status):
        """Write frames to file immediately and return PA flag"""
        self.output_sf.buffer_write(in_data, sf_data_format)
        return in_data, pyaudio.paContinue

    def start_recording(self, target_device: dict, output_file_name: Optional[str] = None):
        self.close_stream()

        sample_rate = int(target_device["defaultSampleRate"])

        self.output_sf = sf.SoundFile(
            output_file_name or self.output_file_name,
            mode="w",
            format="FLAC",
            channels=target_device["maxInputChannels"],
            samplerate=sample_rate,
        )

        self.stream = self.p.open(
            format=data_format,
            channels=target_device["maxInputChannels"],
            rate=sample_rate,
            frames_per_buffer=pyaudio.get_sample_size(data_format),
            input=True,
            input_device_index=target_device["index"],
            stream_callback=self.callback
        )

    def stop_stream(self):
        self.stream.stop_stream()

    def start_stream(self):
        self.stream.start_stream()

    def close_stream(self):
        if self.stream is not None:
            self.stream.stop_stream()
            self.stream.close()
            self.stream = None
            self.output_sf.close()

    @property
    def stream_status(self):
        return "closed" if self.stream is None else "stopped" if self.stream.is_stopped() else "running"


if __name__ == "__main__":
    p = pyaudio.PyAudio()
    ar = None

    help_msg = 30 * "-" + "\n\n\nStatus:\nRunning=%s | Device=%s | output=%s\n\nCommands:\nlist\nrecord {device_index\\default}\npause\ncontinue\nstop\n"
    target_device = None

    try:
        while True:
            print(
                help_msg % (
                    ar.stream_status
                    if ar else "closed", target_device["index"]
                    if target_device else "None", filename
                )
            )
            com = input("Enter command: ").split()

            if com[0] == "list":
                p.print_detailed_system_info()

            elif com[0] == "record":

                if len(com) > 1 and com[1].isdigit():
                    target_device = p.get_device_info_by_index(int(com[1]))
                else:
                    try:
                        target_device = AudioRecorder.get_default_wasapi_device(p)
                    except ARException as E:
                        print(f"Something went wrong... {type(E)} = {str(E)[:30]}...\n")
                        continue

                ar = AudioRecorder(p, filename)
                ar.start_recording(target_device)

            elif com[0] == "pause":
                ar.stop_stream()
            elif com[0] == "continue":
                ar.start_stream()
            elif com[0] == "stop":
                ar.close_stream()
                print(f"The audio is written to [{filename}]. Exit...")
                break

            else:
                print(f"[{com[0]}] is an unknown command")

    except KeyboardInterrupt:
        print("\n\nExit without saving...")
    finally:
        if ar:
            ar.close_stream()
        p.terminate()

Also, I wouldn't install soundfile just to use FLAC. I'd rather choose pyflac, but it's up to you.
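
The buffering concern raised in the warning above could be handled by collecting callback chunks and handing them to the encoder in larger blocks. The sketch below is one possible shape for that, not the method used in the thread: `ChunkBuffer`, its `sink` parameter, and the 64 KiB threshold are all hypothetical choices.

```python
from typing import Callable


class ChunkBuffer:
    """Accumulate small byte chunks and pass them on in larger blocks."""

    def __init__(self, sink: Callable[[bytes], None],
                 threshold_bytes: int = 64 * 1024):
        self._sink = sink                  # e.g. a function that writes to the FLAC file
        self._threshold = threshold_bytes  # flush once this many bytes are pending
        self._pending = bytearray()

    def write(self, chunk: bytes) -> None:
        self._pending += chunk
        if len(self._pending) >= self._threshold:
            self.flush()

    def flush(self) -> None:
        """Hand any pending bytes to the sink; call this once when recording stops."""
        if self._pending:
            self._sink(bytes(self._pending))
            self._pending.clear()
```

In the recorder above, the callback would call `write(in_data)` and `close_stream` would call `flush()` before closing the output file. The threshold bounds how much RAM the buffer can hold, which matters on a low-memory device.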

@s0d3s
Owner

s0d3s commented Jun 6, 2023

Now about your second question. RNNoise is a neural network implemented in C; it is not part of OBS itself (it is integrated via a plugin). So you can write a Python wrapper around the C code and use it directly. Someone has probably already implemented such a wrapper.

To record from several sources at once, you just need to open a second pyaudio.Stream instance. With callbacks there should be no problems, but with direct (blocking) reads you will probably need threads.
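
A minimal sketch of the two-stream, callback-based approach described above. It has not been run against real hardware; `make_callback` and `open_two_inputs` are hypothetical helpers, and the `writers` argument (one callable per device that accepts raw bytes) is an assumption about how the data would be stored.

```python
from typing import Callable, Sequence


def make_callback(write: Callable[[bytes], None], continue_flag: int):
    """Build a stream callback that forwards each buffer to `write`."""
    def callback(in_data, frame_count, time_info, status):
        write(in_data)
        return None, continue_flag  # input-only stream: no output data
    return callback


def open_two_inputs(devices: Sequence[dict],
                    writers: Sequence[Callable[[bytes], None]]):
    """Open one input stream per device info dict (e.g. microphone + loopback)."""
    # Imported lazily so make_callback can be used without the library installed.
    import pyaudiowpatch as pyaudio

    p = pyaudio.PyAudio()
    streams = []
    for info, write in zip(devices, writers):
        streams.append(p.open(
            format=pyaudio.paInt16,
            channels=info["maxInputChannels"],
            rate=int(info["defaultSampleRate"]),
            input=True,
            input_device_index=info["index"],
            stream_callback=make_callback(write, pyaudio.paContinue),
        ))
    return p, streams
```

Each stream invokes its own callback, so each source can be written to its own file. The two devices may run at different sample rates, so mixing them into a single file would need resampling, which this sketch does not attempt.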

Your questions are not related to this fork, so it would be more appropriate to post them on Stack Overflow. If you have no further questions regarding pyaudiowpatch, please close the issue.
