Implement beatnet by kobakos · Pull Request #1478 · ailia-ai/ailia-models

kobakos · 2024-05-22T09:01:53Z

#1450

kyakuno · 2024-08-09T04:01:30Z

@kobakos Could you share the beatnet ONNX file?

Fix parser explanation and typo

kyakuno · 2024-08-09T05:58:57Z

The model has been uploaded.
https://storage.googleapis.com/ailia-models/beatnet/beatnet_1.onnx

# Conflicts: # README.md

kyakuno · 2024-08-10T11:42:36Z

Importing madmom fails with the following error.

Traceback (most recent call last):
  File "/Users/kyakuno/Desktop/ailia/ailia-models-ax/audio_processing/beatnet/beatnet.py", line 4, in <module>
    from madmom.features import DBNDownBeatTrackingProcessor
  File "/usr/local/lib/python3.11/site-packages/madmom/__init__.py", line 24, in <module>
    from . import audio, evaluation, features, io, ml, models, processors, utils
  File "/usr/local/lib/python3.11/site-packages/madmom/audio/__init__.py", line 27, in <module>
    from . import comb_filters, filters, signal, spectrogram, stft
  File "madmom/audio/comb_filters.pyx", line 15, in init madmom.audio.comb_filters
  File "/usr/local/lib/python3.11/site-packages/madmom/processors.py", line 23, in <module>
    from collections import MutableSequence
ImportError: cannot import name 'MutableSequence' from 'collections' (/usr/local/Homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/collections/__init__.py)

kyakuno · 2024-08-10T11:43:38Z

This appears to be the cause.
CPJKU/madmom#535
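
The traceback shows `from collections import MutableSequence` failing because that name now lives only in `collections.abc`. A minimal sketch of a possible workaround follows: restore the old aliases on the `collections` module before madmom is imported. The helper name `patch_collections_aliases` and the list of names are assumptions made for illustration. This only addresses the import shown in the traceback, and other incompatibilities in the PyPI release may remain.

```python
import collections
import collections.abc

# Names that older libraries may still import from `collections` directly.
# This list is an assumption; extend it if other names turn out to be missing.
_LEGACY_NAMES = ("MutableSequence", "MutableMapping", "Mapping", "Sequence")


def patch_collections_aliases(names=_LEGACY_NAMES):
    """Re-export ABCs from collections.abc on the collections module.

    Returns the list of names that actually had to be patched.
    """
    patched = []
    for name in names:
        if not hasattr(collections, name):
            setattr(collections, name, getattr(collections.abc, name))
            patched.append(name)
    return patched


patch_collections_aliases()

# Only after the patch (hypothetical usage, not executed here):
# from madmom.features import DBNDownBeatTrackingProcessor
```

The patch has to run before the first `import madmom` anywhere in the process, for example at the very top of `beatnet.py`.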

kyakuno · 2024-08-10T11:44:27Z

@kobakos Would it be possible to replace madmom with librosa's stft and similar functions?

kyakuno · 2024-08-10T11:46:13Z

The current madmom-based implementation:

import numpy as np
from madmom.audio.signal import SignalProcessor, FramedSignalProcessor
from madmom.audio.stft import ShortTimeFourierTransformProcessor
from madmom.audio.spectrogram import (
    FilteredSpectrogramProcessor, LogarithmicSpectrogramProcessor,
    SpectrogramDifferenceProcessor)
from madmom.processors import ParallelProcessor, SequentialProcessor


# Feature extractor that computes a log-magnitude spectrogram and its differences.
# FeatureModule is assumed to be the base class from the BeatNet source (not shown here).

class LOG_SPECT(FeatureModule):
    def __init__(self, num_channels=1, sample_rate=22050, win_length=2048, hop_size=512, n_bands=(12,), mode='online'):
        sig = SignalProcessor(num_channels=num_channels, win_length=win_length, sample_rate=sample_rate)
        self.sample_rate = sample_rate
        self.hop_length = hop_size
        self.num_channels = num_channels
        multi = ParallelProcessor([])
        frame_sizes = [win_length]
        # one processing chain per (frame size, bands per octave) pair
        for frame_size, bands in zip(frame_sizes, n_bands):
            if mode == 'online' or mode == 'offline':
                frames = FramedSignalProcessor(frame_size=frame_size, hop_size=hop_size)
            else:   # for real-time and streaming modes
                frames = FramedSignalProcessor(frame_size=frame_size, hop_size=hop_size, num_frames=4)
            stft = ShortTimeFourierTransformProcessor()  # caching FFT window
            filt = FilteredSpectrogramProcessor(
                num_bands=bands, fmin=30, fmax=17000, norm_filters=True)
            spec = LogarithmicSpectrogramProcessor(mul=1, add=1)
            diff = SpectrogramDifferenceProcessor(
                diff_ratio=0.5, positive_diffs=True, stack_diffs=np.hstack)
            # process each frame size with spec and diff sequentially
            multi.append(SequentialProcessor((frames, stft, filt, spec, diff)))
        # stack the features and process everything sequentially
        self.pipe = SequentialProcessor((sig, multi, np.hstack))

    def process_audio(self, audio):
        # returns the features transposed to (feature_dim, frames)
        feats = self.pipe(audio)
        return feats.T

A librosa version generated by ChatGPT (untested):

import numpy as np
import librosa

class LOG_SPECT:
    def __init__(self, sample_rate=22050, win_length=2048, hop_length=512, n_mels=128):
        self.sample_rate = sample_rate
        self.win_length = win_length
        self.hop_length = hop_length
        self.n_mels = n_mels

    def process_audio(self, audio_path):
        # Load the file as mono audio resampled to self.sample_rate
        y, sr = librosa.load(audio_path, sr=self.sample_rate)

        # Calculate the short-time Fourier transform (STFT)
        stft = librosa.stft(y, n_fft=self.win_length, hop_length=self.hop_length)

        # Get the power spectrogram (power_to_db below expects power, not magnitude)
        power_spec = np.abs(stft) ** 2

        # Convert to mel scale
        mel_spec = librosa.feature.melspectrogram(S=power_spec, sr=sr, n_mels=self.n_mels)

        # Convert to log scale
        log_mel_spec = librosa.power_to_db(mel_spec)

        # Compute the first order difference (delta)
        delta = librosa.feature.delta(log_mel_spec)

        # Combine the log mel spectrogram and its delta
        log_mel_spec_with_delta = np.vstack([log_mel_spec, delta])

        return log_mel_spec_with_delta.T

# Usage example
# feature_extractor = LOG_SPECT()
# features = feature_extractor.process_audio('path/to/audio/file.wav')
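
Note that the madmom pipeline quoted earlier applies a log-spaced filterbank, `log10(1 + x)` scaling, and a positive first-order difference stacked next to the spectrogram, whereas the ChatGPT draft uses mel bands, dB scaling, and `librosa.feature.delta`, so the two are unlikely to produce the same model input. A rough NumPy-only sketch of the last two madmom stages is given below, assuming an already filtered magnitude spectrogram of shape `(frames, bands)`. The function name and the fixed one-frame `lag` are assumptions (madmom derives the lag from `diff_ratio`), so this is not a verified drop-in replacement.

```python
import numpy as np


def log_spec_with_positive_diff(filtered_spec, mul=1.0, add=1.0, lag=1):
    """Log-scale a (frames, bands) spectrogram and append its positive differences.

    Sketch of LogarithmicSpectrogramProcessor(mul, add) followed by a
    positive-only temporal difference, stacked horizontally.
    """
    if lag < 1:
        raise ValueError("lag must be at least 1 frame")
    spec = np.log10(mul * np.asarray(filtered_spec, dtype=float) + add)
    diff = np.zeros_like(spec)
    # difference to the frame `lag` steps earlier; the first `lag` frames stay 0
    diff[lag:] = spec[lag:] - spec[:-lag]
    # keep only increases in energy (positive differences)
    np.maximum(diff, 0.0, out=diff)
    return np.hstack((spec, diff))


demo = np.array([[0.0, 9.0], [9.0, 0.0], [99.0, 0.0]])
features = log_spec_with_positive_diff(demo)
print(features.shape)  # (3, 4): 2 log bands + 2 difference bands per frame
```

Transposing the result (`features.T`) would match the orientation returned by `process_audio` in the madmom version.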

kobakos · 2024-08-22T09:53:29Z

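The features.py that triggers the error is needed not only by LOG_SPECT but also when using DBNDownBeatTrackingProcessor and particle_filtering_cascade, so fixing LOG_SPECT alone would still leave the error in both the DBN and particle filter inference modes. I also considered copying only the required code out of madmom into the same directory, but since madmom uses Cython that does not look easy either.
The implementation on GitHub has the import problem fixed, but the pip release is still the old version, so handling this through requirements.txt looks difficult.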
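
Since both inference modes still depend on madmom, one option is to at least make the failure explicit at the point where a mode is selected. The sketch below defers the import and re-raises with a hint. The helper names `import_or_explain` and `load_dbn_processor` and the hint text are hypothetical. This does not remove the dependency, it only reports it more clearly.

```python
import importlib


def import_or_explain(module_name, hint):
    """Import a module by name, turning ImportError into an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"Could not import '{module_name}': {exc}. {hint}"
        ) from exc


# Hypothetical hint text; adjust to whatever fix the project settles on.
MADMOM_HINT = (
    "The PyPI release of madmom may fail to import on recent Python versions; "
    "see CPJKU/madmom#535."
)


def load_dbn_processor():
    """Import madmom lazily, only when DBN inference is actually requested."""
    features = import_or_explain("madmom.features", MADMOM_HINT)
    return features.DBNDownBeatTrackingProcessor
```

With this layout the script can still start, parse arguments, and print help text on machines where madmom is broken.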

Add beatnet

efecccc

kobakos requested a review from kyakuno May 22, 2024 09:01

add madmom

a84ebb2

Update beatnet.py

9c164e1

Fix parser explanation and typo

kyakuno added 3 commits August 10, 2024 20:38

Merge branch 'master' into beatnet

4595680

# Conflicts: # README.md

Update readme

fa59c37

Add license

a4b852c

kyakuno added the waiting_enhancement label Aug 10, 2024

kyakuno closed this Apr 19, 2025

kobakos wants to merge 6 commits into master from beatnet