Otosaku/OtosakuFeatureExtractor-iOS

🎧 OtosakuFeatureExtractor

A lightweight Swift-based feature extraction library for transforming raw audio chunks into log-Mel spectrograms, suitable for use in CoreML and on-device inference.

Built with ❤️ for on-device audio intelligence.


📦 Installation

You can add OtosakuFeatureExtractor as a Swift Package dependency:

.package(url: "https://github.com/Otosaku/OtosakuFeatureExtractor-iOS.git", from: "1.0.2")

Then add it to the target dependencies:

.target(
    name: "YourApp",
    dependencies: [
        .product(name: "OtosakuFeatureExtractor", package: "OtosakuFeatureExtractor")
    ]
)

🔁 Audio Processing Pipeline

[Raw Audio Chunk (Float64)] 
       ↓ pre-emphasis
[Pre-emphasized audio] 
       ↓ STFT (with Hann window)
[STFT result (complex)]
       ↓ Power Spectrum
[|FFT|^2]
       ↓ Mel Filterbank Projection (matrix multiply)
[Mel energies]
       ↓ log(ε + x)
[Log-Mel Spectrogram]
       ↓ MLMultiArray
[CoreML-compatible tensor]
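
The stages above can be sketched for a single frame in plain Swift. This is a minimal illustration, not the library's implementation: the pre-emphasis coefficient (0.97), the epsilon (1e-6), the naive DFT, and all function names here are assumptions chosen for readability. A production implementation would normally use an FFT (for example via Accelerate) and process many overlapping frames.

```swift
import Foundation

// Assumed constants: neither value is taken from the library.
let preEmphasisCoefficient = 0.97
let epsilon = 1e-6

/// Pre-emphasis: y[0] = x[0], y[n] = x[n] - a * x[n - 1].
func preEmphasize(_ x: [Double], coefficient a: Double) -> [Double] {
    guard !x.isEmpty else { return [] }
    var y = [Double](repeating: 0.0, count: x.count)
    y[0] = x[0]
    for n in 1..<x.count {
        y[n] = x[n] - a * x[n - 1]
    }
    return y
}

/// Power spectrum |X[k]|^2 for k = 0...N/2, computed with a naive O(N^2) DFT.
func powerSpectrum(_ frame: [Double]) -> [Double] {
    let n = frame.count
    return (0...(n / 2)).map { k -> Double in
        var re = 0.0
        var im = 0.0
        for (t, sample) in frame.enumerated() {
            let angle = -2.0 * Double.pi * Double(k) * Double(t) / Double(n)
            re += sample * cos(angle)
            im += sample * sin(angle)
        }
        return re * re + im * im
    }
}

/// Mel projection followed by log(epsilon + x).
/// `filterbank` is laid out as [melBand][frequencyBin].
func logMel(power: [Double], filterbank: [[Double]], epsilon: Double) -> [Double] {
    return filterbank.map { row -> Double in
        precondition(row.count == power.count, "filterbank width must equal the number of FFT bins")
        var energy = 0.0
        for (weight, p) in zip(row, power) {
            energy += weight * p
        }
        return log(epsilon + energy)
    }
}

/// Runs one frame through pre-emphasis, windowing, power spectrum, and log-Mel projection.
func logMelFrame(_ frame: [Double], window: [Double], filterbank: [[Double]]) -> [Double] {
    precondition(frame.count == window.count, "frame and window must have the same length")
    let emphasized = preEmphasize(frame, coefficient: preEmphasisCoefficient)
    let windowed = zip(emphasized, window).map { $0 * $1 }
    return logMel(power: powerSpectrum(windowed), filterbank: filterbank, epsilon: epsilon)
}

// Toy usage: an 8-sample frame, a rectangular window, and 2 bands over 5 bins.
let frame = [Double](repeating: 1.0, count: 8)
let window = [Double](repeating: 1.0, count: 8)
let toyFilterbank: [[Double]] = [[1, 1, 0, 0, 0], [0, 0, 1, 1, 1]]
let features = logMelFrame(frame, window: window, filterbank: toyFilterbank)
print(features.count) // one value per mel band
```

With the files described in the Usage section, the same shapes would be a 400-sample frame, a 400-sample window, and an 80 x 201 filterbank.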

🧪 Usage

1. Initialize the Extractor

You must provide a directory containing:

  • filterbank.npy — shape [80, 201], float32 or float64
  • hann_window.npy — shape [400], float32 or float64

import OtosakuFeatureExtractor

let extractor = try OtosakuFeatureExtractor(directoryURL: featureFolderURL)
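
The two shapes are related: a real-input FFT of length 400 yields 400 / 2 + 1 = 201 non-redundant frequency bins, which matches the filterbank width. The check below is a hypothetical helper (not part of the library API) that you could run on your own arrays before writing them to .npy files. It assumes the FFT size equals the window length.

```swift
import Foundation

/// Hypothetical sanity check, not part of OtosakuFeatureExtractor.
/// Returns true when every filterbank row has windowLength / 2 + 1 columns,
/// assuming the FFT size equals the window length.
func shapesAreConsistent(filterbank: [[Double]], window: [Double]) -> Bool {
    guard !filterbank.isEmpty, !window.isEmpty else { return false }
    let expectedBins = window.count / 2 + 1
    return filterbank.allSatisfy { $0.count == expectedBins }
}

// Placeholder arrays with the documented shapes: [80, 201] and [400].
let placeholderFilterbank = [[Double]](
    repeating: [Double](repeating: 0.0, count: 201),
    count: 80
)
let placeholderWindow = [Double](repeating: 1.0, count: 400)
print(shapesAreConsistent(filterbank: placeholderFilterbank, window: placeholderWindow))
```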

📥 Downloads

💬 Want a model trained on custom keywords?
Drop me a message at otosaku.dsp@gmail.com — let’s talk!


2. Process a Chunk of Audio

The input must be a raw audio chunk of type Array<Double>, typically sampled at 16 kHz.

let logMel: MLMultiArray = try extractor.processChunk(chunk: audioChunk)

audioChunk should be at least 400 samples long to match the FFT window size.
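
Audio capture APIs often deliver Float samples, so a conversion and chunking step is usually needed before calling processChunk. The helper below is a sketch of one way to do that. makeChunks and its parameters are hypothetical names, not library API, and dropping a short trailing chunk is an assumed policy (you might prefer zero-padding instead).

```swift
import Foundation

/// Hypothetical helper: converts Float samples to Double and splits them into
/// consecutive chunks of `chunkSize` samples. A trailing chunk shorter than
/// `minimumLength` is dropped, since it could not fill one 400-sample window.
func makeChunks(from samples: [Float], chunkSize: Int, minimumLength: Int = 400) -> [[Double]] {
    precondition(chunkSize >= minimumLength, "chunkSize must be at least minimumLength")
    var chunks: [[Double]] = []
    var start = 0
    while start < samples.count {
        let end = min(start + chunkSize, samples.count)
        if end - start >= minimumLength {
            chunks.append(samples[start..<end].map { Double($0) })
        }
        start = end
    }
    return chunks
}

// 1000 samples with chunkSize 400: two full chunks, the 200-sample tail is dropped.
let recorded = [Float](repeating: 0.5, count: 1000)
let audioChunks = makeChunks(from: recorded, chunkSize: 400)
print(audioChunks.count)
```

Each element of audioChunks can then be passed as the chunk argument shown above.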


3. (Optional) Save Log-Mel Features to JSON

saveLogMelToJSON(logMel: logMel)

📚 Dependencies


📁 File Structure

OtosakuFeatureExtractor/
├── Sources/
│   └── OtosakuFeatureExtractor/
│       └── OtosakuFeatureExtractor.swift
├── filterbank.npy
└── hann_window.npy

🗣️ Attribution

Project by @otosaku-ai under the Otosaku brand.


🧪 License

MIT License