Skip to content
forked from Picovoice/cheetah

On-device streaming speech-to-text engine powered by deep learning

License

Notifications You must be signed in to change notification settings

queenwu/cheetah

 
 

Repository files navigation

Cheetah

Made in Vancouver, Canada by Picovoice

Twitter URL YouTube Channel Views

Cheetah is an on-device streaming speech-to-text engine. Cheetah is:

  • Private; All voice processing runs locally.
  • Accurate [1]
  • Compact and Computationally-Efficient [2]
  • Cross-Platform:
    • Linux (x86_64)
    • macOS (x86_64, arm64)
    • Windows (x86_64)
    • Android
    • iOS
    • Raspberry Pi (4, 3)
    • NVIDIA Jetson Nano

Table of Contents

AccessKey

AccessKey is your authentication and authorization token for deploying Picovoice SDKs, including Cheetah. Anyone who is using Picovoice needs to have a valid AccessKey. You must keep your AccessKey secret. You would need internet connectivity to validate your AccessKey with Picovoice license servers even though the voice recognition is running 100% offline.

AccessKey also verifies that your usage is within the limits of your account. Everyone who signs up for Picovoice Console receives the Free Tier usage rights described here. If you wish to increase your limits, you can purchase a subscription plan.

Demos

Python Demos

Install the demo package:

pip3 install pvcheetahdemo
cheetah_demo_mic --access_key ${ACCESS_KEY}

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.

C Demos

If using SSH, clone the repository with:

git clone --recurse-submodules git@github.com:Picovoice/cheetah.git

If using HTTPS, clone the repository with:

git clone --recurse-submodules https://github.com/Picovoice/cheetah.git

Build the demo:

cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build

Run the demo:

./demo/c/build/cheetah_demo_mic -a ${ACCESS_KEY} -l ${LIBRARY_PATH} -m ${MODEL_PATH}

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console, ${LIBRARY_PATH} with the path to appropriate library under lib, and ${MODEL_PATH} to path to default model file (or your custom one).

SDKs

Python

Install the Python SDK:

pip3 install pvcheetah

Create an instance of the engine and transcribe audio in real-time:

import pvcheetah

handle = pvcheetah.create(access_key='${ACCESS_KEY}')

def get_next_audio_frame():
    pass

while True:
    partial_transcript, is_endpoint = handle.process(get_next_audio_frame())
    if is_endpoint:
        final_transcript = handle.flush()

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.

C

Create an instance of the engine and transcribe audio in real-time:

#include <stdio.h>
#include <stdlib.h>

#include "pv_cheetah.h"

pv_cheetah_t *handle = NULL;
const pv_status_t status = pv_cheetah_init("${ACCESS_KEY}", "${MODEL_PATH}", -1.f, &handle);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

extern const int16_t *get_next_audio_frame(void);

while (true) {
    char *partial_transcript = NULL;
    bool is_endpoint = false;
    const pv_status_t status = pv_cheetah_process(
            handle,
            get_next_audio_frame(),
            &partial_transcript,
            &is_endpoint);
    if (status != PV_STATUS_SUCCESS) {
        // error handling logic
    }
    // do something with transcript
    free(partial_transcript);
    if (is_endpoint) {
        char *final_transcript = NULL;
        const pv_status_t status = pv_cheetah_flush(handle, &final_transcript);
        if (status != PV_STATUS_SUCCESS) {
            // error handling logic
        }
        // do something with transcript
        free(final_transcript);
    }
}

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${MODEL_PATH} to path to default model file (or your custom one). Finally, when done be sure to release resources acquired using pv_cheetah_delete(handle).

Releases

V1.0.0 — January 25th, 2022

  • Initial release.

About

On-device streaming speech-to-text engine powered by deep learning

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 74.9%
  • C 25.1%