How can I make pitch detection using Aubio work with PyAudio? #78

Closed
notalentgeek opened this issue Dec 20, 2016 · 6 comments

notalentgeek commented Dec 20, 2016

Hello there! Some months ago I posted an issue here about how to get pitch detection working with alsaaudio. I got an answer, and it works the way I want.

However, I think (CMIIW) alsaaudio will not work on Windows. I saw another library called PyAudio, which I assume works cross-platform. Since PyAudio can capture audio from the microphone just like alsaaudio, I thought I could easily swap alsaaudio for PyAudio.

I use alsaaudio to feed data to the Aubio pitch detector object. I tried the same method with PyAudio, but the pitch returned is always 0.0 (I do notice that the samples from alsaaudio and PyAudio differ). Here is my code.

Here is my pitch detection code with alsaaudio.

from mod_thread import ModThread as mt
from shared import GetDateTime as gdt
from timer_second_change import TimerSecondChange as tms
import alsaaudio as alsa
import aubio
import numpy as num

class MicPVDetect(mt):

    def __init__(
        self,
        _threadName,
        _array,
        _iDB
    ):

        # Append this object into array.
        _array.append(self)

        mt.__init__(
            self,
            _array.index(self) + 1,
            _array.index(self) + 1,
            _threadName
        )

        # Insert database object.
        self.iDB = _iDB

        # Constants.
        self.BUFFER_SIZE = 2048
        self.METHOD = "default"
        self.SAMPLE_RATE = 44100
        self.HOP_SIZE = self.BUFFER_SIZE//2
        self.PERIOD_SIZE_IN_FRAME = self.HOP_SIZE
        # Constant for database.
        self.MODULE_NAME = "mic"

        # Set up audio input. Open the PCM (pulse code
        # modulation) device. The default type is
        # playback, so set `type` to `alsa.PCM_CAPTURE`
        # to capture from the microphone. The microphone
        # used is the one selected with the `alsamixer`
        # terminal command. This alsa library is
        # Linux-only.
        self.recorder = alsa.PCM(
            type = alsa.PCM_CAPTURE
        )
        self.recorder.setchannels(1)
        self.recorder.setformat(
            alsa.PCM_FORMAT_FLOAT_LE
        )
        self.recorder.setperiodsize(
            self.PERIOD_SIZE_IN_FRAME
        )
        self.recorder.setrate(self.SAMPLE_RATE)

        # Set up Aubio energy (volume) and pitch detection.
        self.pitchDetector = aubio.pitch(
            self.METHOD,
            self.BUFFER_SIZE,
            self.HOP_SIZE,
            self.SAMPLE_RATE
        )
        # Set the output unit, it can be "cent", "midi",
        # "Hz", ....
        self.pitchDetector.set_unit("Hz")
        # Ignore frames under this level (dB).
        self.pitchDetector.set_silence(-40)

        # Data received from mic.
        self.data = None

        # Set up timer object, to make sure that the
        # audio calculation runs only once every
        # second.
        self.tMS = tms()

    def run(self):

        while not self.killMe:

            self.tMS.Update()
            self.PVDetectStream()
            if self.tMS.chngSec:
                self.PVDetect()

    # Function to format string before put in database.
    def SetupStringForDB(
        self,
        _pitch,
        _volume
    ):

        arrayForDB = [self.MODULE_NAME]
        arrayForDB.extend(gdt())
        arrayForDB.extend([
            "pitch",
            _pitch,
            "volume",
            _volume
        ])

        #print(arrayForDB)

        return arrayForDB

    # Function that needs to run once every second.
    def PVDetect(self):

        # Convert the raw bytes from the alsa library
        # into samples in Aubio's float format.
        # (`fromstring` is deprecated for binary data.)
        samples = num.frombuffer(
            self.data,
            dtype = aubio.float_type
        )
        # Pitch of the current frame.
        pitch = self.pitchDetector(samples)[0]
        # Compute the energy (volume) of current frame.
        volume = num.sum(samples**2)/len(samples)
        volume = "{:.6f}".format(volume)

        # Database!
        self.iDB.mainArray.append(
            self.SetupStringForDB(str(pitch), str(volume))
        )

        #print("pitch = " + str(pitch))
        #print("volume = " + str(volume))

    # Function that needs to run on every tick.
    def PVDetectStream(self):

        # Read data from audio input.
        length, self.data = self.recorder.read()

Take a look at this part (converting the data into Aubio's sample format).

# Convert the raw bytes from the alsa library
# into samples in Aubio's float format.
# (`fromstring` is deprecated for binary data.)
samples = num.frombuffer(
    self.data,
    dtype = aubio.float_type
)
# Pitch of the current frame.
pitch = self.pitchDetector(samples)[0]

I tried to do the same with PyAudio here.

import aubio
import numpy as num
import pyaudio
import wave

# PyAudio object.
p = pyaudio.PyAudio()

# Open stream.
stream = p.open(format=pyaudio.paInt16,
    channels=2, rate=44100, input=True,
    frames_per_buffer=1024)

# Aubio's pitch detection.
pDetection = aubio.pitch("default", 2048,
    2048//2, 44100)
# Set unit.
pDetection.set_unit("Hz")
pDetection.set_silence(-40)

while True:

    data = stream.read(1024)
    samples = num.frombuffer(data,
        dtype=aubio.float_type)
    pitch = pDetection(samples)[0]

    print(pitch)

But the value of pitch is always 0.0.

Member

piem commented Dec 21, 2016

Hi @notalentgeek ,

Yes, alsa is Linux-only, that's correct. In your last snippet of code, the stream is opened with format=pyaudio.paInt16, but aubio expects samples as float32. You could also convert the integers to floats, simply by multiplying them by 3.0517578125e-05, which is 1./32768.

Let us know how that works,

best, piem
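
The scaling piem describes applies to the raw samples read from the stream, before they reach the pitch detector, not to the detected pitch. A minimal sketch, assuming NumPy and a stream opened with `pyaudio.paInt16`; the helper name `int16_bytes_to_float32` is made up for illustration:

```python
import numpy as np

# 1 / 32768, the factor quoted in the comment above (3.0517578125e-05).
INT16_SCALE = 1.0 / 32768


def int16_bytes_to_float32(data):
    """Convert raw 16-bit PCM bytes to float32 samples in [-1.0, 1.0)."""
    # Interpret the bytes as signed 16-bit integers (native byte order).
    ints = np.frombuffer(data, dtype=np.int16)
    # Scale to floats; the result stays float32, which aubio expects.
    return ints.astype(np.float32) * np.float32(INT16_SCALE)
```

With the stream from the question this would be called as `samples = int16_bytes_to_float32(stream.read(1024))`. With `channels=2` the buffer holds interleaved left/right values, so a mono stream (or a downmix) is still needed before the sample count matches the detector's hop size.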

@piem piem added the question label Dec 21, 2016
Author

notalentgeek commented Dec 21, 2016

I am not sure which part I need to multiply by 1./32768. I tried multiplying the pitch by it, but the result is still zero.

I changed stream = p.open(format=pyaudio.paInt16, to stream = p.open(format=pyaudio.paFloat32, and it now raises ValueError: input size of pitch should be 1024, not 2048 at pitch = pDetection(samples)[0].

EDIT: Thanks for your response though!!!!
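
The ValueError is consistent with the stream still being opened with `channels=2`: `stream.read(1024)` returns 1024 frames, and each frame holds one float per channel, so the buffer decodes to 2048 interleaved values while the detector's hop size is 2048//2 = 1024. Opening the stream with `channels=1` avoids this. If a stereo stream has to be kept, one option is to average the channels first. A rough sketch, assuming NumPy and float32 input; `downmix_to_mono` is a hypothetical helper, not part of aubio or PyAudio:

```python
import numpy as np


def downmix_to_mono(data, channels):
    """Average interleaved float32 frames (L0 R0 L1 R1 ...) into one channel."""
    interleaved = np.frombuffer(data, dtype=np.float32)
    # One row per frame, one column per channel.
    frames = interleaved.reshape(-1, channels)
    # Average across channels; keep float32 for aubio.
    return frames.mean(axis=1).astype(np.float32)
```

For a 1024-frame stereo read this returns 1024 samples, which matches the hop size used in the snippets in this thread.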

Author

notalentgeek commented Dec 21, 2016

FUCK YEAH IT IS WORKING!!!!

import aubio
import numpy as num
import pyaudio
import wave

# PyAudio object.
p = pyaudio.PyAudio()

# Open stream.
stream = p.open(format=pyaudio.paFloat32,
    channels=1, rate=44100, input=True,
    frames_per_buffer=1024)

# Aubio's pitch detection.
pDetection = aubio.pitch("default", 2048,
    2048//2, 44100)
# Set unit.
pDetection.set_unit("Hz")
pDetection.set_silence(-40)

while True:

    data = stream.read(1024)
    samples = num.frombuffer(data,
        dtype=aubio.float_type)
    pitch = pDetection(samples)[0]
    # Compute the energy (volume) of the
    # current frame.
    volume = num.sum(samples**2)/len(samples)
    # Format the volume output so that at most
    # it has six decimal numbers.
    volume = "{:.6f}".format(volume)

    print(pitch)
    print(volume)

Thanks!
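
The volume printed above is the mean-square energy of the frame on a linear scale, whereas `set_silence(-40)` is given in dB. To compare the two, the energy can be converted to decibels. A small sketch, assuming the silence threshold is compared against 10·log10 of a mean-square level like this one (that is an assumption about aubio's internals, not something stated in this thread); the function names and the -120 dB floor are made up for illustration:

```python
import math

import numpy as np


def mean_square_energy(samples):
    """Mean of the squared samples, as computed in the snippet above."""
    return float(np.sum(samples ** 2) / len(samples))


def energy_to_db(energy, floor_db=-120.0):
    """Convert a linear mean-square energy to dB, clamped at floor_db."""
    if energy <= 0.0:
        # log10(0) is undefined; report the floor for digital silence.
        return floor_db
    return max(10.0 * math.log10(energy), floor_db)
```

Under that assumption, a frame whose samples all have amplitude 0.01 has an energy of 1e-4, which is -40 dB and sits right at the threshold used in this thread.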

@piem piem closed this as completed in e7da8ba Dec 21, 2016
Member

piem commented Dec 21, 2016

Great! I just added python/demos/demo_pyaudio.py based on @jhoelzl's code in #6 and your version above.

best, Paul

@SarahTeoh

@piem I have tried python/demos/demo_pyaudio.py to detect the F0 of an audio stream from the microphone. However, I get values that are far above the normal F0 range of the human voice, e.g. 1000+ Hz. Any suggestions on how I can fix this?
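
This question has no reply in the thread. One possible way to discard implausible estimates, offered only as a sketch: aubio pitch objects expose a `get_confidence()` method, so each estimate can be checked against a confidence threshold and an expected frequency range before it is used. The function name `plausible_pitch` and the default values (60 to 500 Hz, confidence 0.8) are assumptions chosen for illustration and would need tuning for a real voice and microphone:

```python
def plausible_pitch(pitch_hz, confidence,
                    fmin=60.0, fmax=500.0, min_confidence=0.8):
    """Return pitch_hz if it looks like a usable estimate, else None."""
    # Reject frames where the detector itself is unsure.
    if confidence < min_confidence:
        return None
    # Reject estimates outside the expected range for the source.
    if not (fmin <= pitch_hz <= fmax):
        return None
    return pitch_hz


# Intended use inside the read loop (requires aubio, not run here):
#     pitch = pDetection(samples)[0]
#     confidence = pDetection.get_confidence()
#     f0 = plausible_pitch(pitch, confidence)
```

Frames for which the helper returns `None` can simply be skipped. This filters the symptom rather than explaining it, so it is worth also checking the stream's sample format and channel count as discussed earlier in the thread.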

@EuphonistiHack

Breadcrumbs for future developers:

I'm working on an application that handles the signal from a bass guitar, and found that I had to increase the buffer size significantly (up to 8192) to get a window large enough to reliably identify tones in the 40-60 Hz range.

Thanks for making this code public, though. This has saved me a TON of time and effort in figuring out the nuances of the aubio APIs!
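
The arithmetic behind that observation can be sketched as follows. At 44100 Hz one period of a 40 Hz tone spans about 1102 samples, so a 2048-sample window contains fewer than two periods, while an 8192-sample window contains more than seven. The helper name `periods_in_window` is made up for illustration:

```python
def periods_in_window(freq_hz, buffer_size, sample_rate=44100):
    """Number of full-or-partial periods of freq_hz that fit in the window."""
    # Length of one period, in samples.
    period_samples = sample_rate / freq_hz
    return buffer_size / period_samples


for size in (2048, 8192):
    duration_ms = 1000.0 * size / 44100
    print(size, round(duration_ms, 1), round(periods_in_window(40.0, size), 2))
```

The trade-off is latency: an 8192-sample window covers roughly 186 ms of audio, compared with about 46 ms for 2048 samples.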

dnkelly97 added a commit to dnkelly97/fretboard_mastery that referenced this issue Dec 3, 2021
…tar is played. This loop was based off of the pitch detection method posted here: aubio/aubio#78 (comment)