Kotlin Speech Features

Quick Links

📒 Introduction

This library is a complete port of python_speech_features in pure Kotlin, available for Android and iOS projects.

It provides common speech features for automatic speech recognition (ASR), including MFCCs and filterbank energies.
To learn more about MFCCs, read more.
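As background for the filterbank features mentioned above, here is a minimal sketch of the mel-scale conversion that mel filterbanks are typically built on. It uses the common formula mel = 2595 * log10(1 + hz / 700), which the original python_speech_features also uses. The function names `hzToMel`, `melToHz`, and `melFilterEdges` are hypothetical and are not part of this library's API.

```kotlin
import kotlin.math.log10
import kotlin.math.pow

// Hypothetical helpers for illustration; not part of this library's API.

/** Convert a frequency in Hz to mels. */
fun hzToMel(hz: Double): Double = 2595.0 * log10(1.0 + hz / 700.0)

/** Convert a value in mels back to Hz (inverse of hzToMel). */
fun melToHz(mel: Double): Double = 700.0 * (10.0.pow(mel / 2595.0) - 1.0)

/** Return nFilt + 2 filter edge frequencies in Hz, evenly spaced on the mel scale. */
fun melFilterEdges(nFilt: Int, lowFreq: Double, highFreq: Double): DoubleArray {
    val lowMel = hzToMel(lowFreq)
    val highMel = hzToMel(highFreq)
    return DoubleArray(nFilt + 2) { i ->
        melToHz(lowMel + (highMel - lowMel) * i / (nFilt + 1))
    }
}
```

For example, `melFilterEdges(nFilt = 64, lowFreq = 0.0, highFreq = 8000.0)` yields 66 edge frequencies that start at 0 Hz, end at 8000 Hz, and are packed more densely at low frequencies.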

Features

🙋 How to use

We support multiple platforms using Kotlin Multiplatform.

Android

Integration

Add jitpack.io to your project's repositories:

allprojects {
  repositories {
    google()
    maven { url 'https://jitpack.io' }
  }
}

Add the dependency:

dependencies {
    implementation "com.github.MerlynMind:kotlin_speech_features:${version}"
}

Example implementation

A sample app is included in this repo to help understand the implementation.

Convert your audio signal into a float array (a demo is provided in the sample app).
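The sample app shows the project's own conversion. As a rough sketch of what this step can look like, the snippet below decodes header-less, little-endian 16-bit PCM bytes into a float array. The helper name `pcm16ToFloatArray` is hypothetical, the input format is an assumption, and the code relies on `java.nio`, so it applies to JVM/Android targets only.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper for illustration; not part of this library's API.
// Assumes raw little-endian 16-bit PCM with no WAV header.

/** Decode 16-bit PCM bytes into a FloatArray of raw sample values. */
fun pcm16ToFloatArray(pcm: ByteArray): FloatArray {
    val samples = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer()
    // One float per 16-bit sample; a trailing odd byte, if any, is ignored.
    return FloatArray(samples.remaining()) { i -> samples.get(i).toFloat() }
}
```

If you read from a WAV file, skip its header before decoding; the sample values are left unscaled here so that any normalization stays a separate step.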

Initialize speech features

private val speechFeatures = SpeechFeatures()

Perform any of the 4 operations:

// Each call is independent; use whichever feature type you need.
val mfccResult = speechFeatures.mfcc(MathUtils.normalize(wav), nFilt = 64)
val fbankResult = speechFeatures.fbank(MathUtils.normalize(wav), nFilt = 64)
val logfbankResult = speechFeatures.logfbank(MathUtils.normalize(wav), nFilt = 64)
val sscResult = speechFeatures.ssc(MathUtils.normalize(wav), nFilt = 64)

The result will contain matrices with the expected features. Pass these features on for further processing (e.g. classification, speech recognition).
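One common post-processing step before classification is mean normalization, where the per-coefficient mean across frames is subtracted. The sketch below assumes the feature matrix is laid out as `Array<FloatArray>` indexed by `[frame][coefficient]`; that layout and the helper name `meanNormalize` are assumptions for illustration, so check the actual return type of the call you use.

```kotlin
// Hypothetical helper for illustration; not part of this library's API.
// Assumes features are indexed as [frame][coefficient].

/** Subtract each coefficient's mean over all frames, returning a new matrix. */
fun meanNormalize(features: Array<FloatArray>): Array<FloatArray> {
    if (features.isEmpty()) return features
    val nCoeff = features[0].size
    // Mean of every coefficient column across all frames.
    val means = FloatArray(nCoeff) { c ->
        features.map { it[c] }.average().toFloat()
    }
    return Array(features.size) { t ->
        FloatArray(nCoeff) { c -> features[t][c] - means[c] }
    }
}
```

The input matrix is left unmodified, so the raw features remain available if a later stage needs them.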

iOS

Integration

In Xcode, go to File > Add Packages...
Paste the URL of this repo into the search box
Select the package found
Click the Add Package button

Example implementation

A sample app is included in this repo to help understand the implementation.

Convert your audio signal into a KotlinIntArray and normalize it.

import KotlinSpeechFeatures

let signal = [Int](1...1000) // Example signal
let normalized = MathUtils.Companion.init().normalize(sig: toKotlinIntArray(arr: signal))

// Copy a Swift [Int] into a KotlinIntArray element by element.
func toKotlinIntArray(arr: [Int]) -> KotlinIntArray {
    // Size by element count, not capacity, which may be larger.
    let result = KotlinIntArray(size: Int32(arr.count))
    for (i, value) in arr.enumerated() {
        result.set(index: Int32(i), value: Int32(value))
    }
    return result
}

Initialize speech features
```
let speechFeatures = SpeechFeatures()
```

Perform any of the 4 operations:

// Each call is independent; use whichever feature type you need.
let mfccResult = speechFeatures.mfcc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, numCep: 13, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, ceplifter: 22, appendEnergy: true, winFunc: nil)
let fbankResult = speechFeatures.fbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
let logfbankResult = speechFeatures.logfbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
let sscResult = speechFeatures.ssc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)

The result will contain matrices with the expected features. Pass these features on for further processing (e.g. classification, speech recognition).
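To get a feel for how many rows (frames) such a matrix has, the arithmetic behind `sampleRate`, `winLen`, and `winStep` can be sketched in Kotlin, the library's own language. This assumes the port follows the framing of the original python_speech_features, where the final partial frame is zero-padded; the function name `frameCount` is hypothetical.

```kotlin
import kotlin.math.ceil
import kotlin.math.roundToInt

// Hypothetical helper for illustration; not part of this library's API.
// Assumes python_speech_features-style framing with a zero-padded last frame.

/** Number of analysis frames produced for a signal of numSamples samples. */
fun frameCount(numSamples: Int, sampleRate: Int, winLen: Double, winStep: Double): Int {
    val frameLen = (winLen * sampleRate).roundToInt()   // 0.025 s at 16 kHz is 400 samples
    val frameStep = (winStep * sampleRate).roundToInt() // 0.01 s at 16 kHz is 160 samples
    if (numSamples <= frameLen) return 1
    return 1 + ceil((numSamples - frameLen).toDouble() / frameStep).toInt()
}
```

With the parameters used above, one second of 16 kHz audio gives `frameCount(16000, 16000, 0.025, 0.01) == 99`, so under these assumptions an MFCC result with `numCep: 13` would be a 99 x 13 matrix.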

JavaScript

Coming soon...

✍️ Contributing

Interested in contributing to the library? Thank you so much for your interest! We are always looking for improvements to the project, and contributions from open-source developers are greatly appreciated.

Clone the repo and create a new branch:

git clone https://github.com/merlynmind/kotlin_speech_features && cd kotlin_speech_features
git checkout -b name_for_new_branch

Make your changes and test them
Submit a pull request with a comprehensive description of the changes

🌟 Spread the word!

If you want to say thank you and/or support active development of this library:

Add a GitHub Star to the project!
Tweet about the project on Twitter! Tag @MerlynMind and/or #heyMerlyn

Thank you so much for your interest in growing the reach of our library!

🧡 Credits

Arjun Sunil - Original author of Kotlin Speech Features
Raquib-Ul Alam - For major refactoring and making the code presentable
Rob Smith - For mentoring and helping us navigate the task

📝 References

Original library - Python Speech Features
Reference Library - C Speech Features
Sample english.wav was obtained as follows:

wget http://voyager.jpl.nasa.gov/spacecraft/audio/english.au
sox english.au -e signed-integer english.wav
