i have one question #75

erquren · 2023-03-02T15:58:25Z

This is nice work, and I really like it.
But when I run this project, I have a question.

I tried installation methods 1 and 3 from the website.
With method 1, it works well: the words are recognized, and the output is split into separate segments when there is a pause in the speech.
With method 3, the words are still recognized, but the output is not split when there is a long pause in the speech. Everything is printed on one line.

Can you tell me how to solve this?

Thank you
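
To make the expected behavior concrete, here is a minimal sketch of the splitting described above. It does not use sherpa-onnx at all: `split_into_segments` and the event lists are hypothetical names and data made up for illustration. Each event stands for one chunk of audio, carrying the text decoded so far and a flag saying whether a long enough pause (an endpoint) was detected.

```python
from typing import Iterable, List, Tuple


def split_into_segments(events: Iterable[Tuple[str, bool]]) -> List[str]:
    """Group streaming results into one line per segment.

    Each event is (partial_text, is_endpoint): the text decoded so far for
    the current segment, and whether an endpoint was detected after this
    chunk of audio. This is an illustrative helper, not a library function.
    """
    segments: List[str] = []
    current = ""
    for partial_text, is_endpoint in events:
        if partial_text:
            current = partial_text
        if is_endpoint:
            if current:
                segments.append(current)
            current = ""  # plays the role of resetting the stream
    if current:
        segments.append(current)  # flush whatever is left at the end
    return segments


# Hypothetical data: endpoints are detected at the pauses (expected behavior).
events_with_endpoints = [
    ("hello", False),
    ("hello world", False),
    ("hello world", True),
    ("how are", False),
    ("how are you", True),
]

# Hypothetical data: no endpoint is ever reported (the behavior in this issue),
# so the partial text keeps growing and ends up as a single line.
events_without_endpoints = [
    ("hello", False),
    ("hello world", False),
    ("hello world how are", False),
    ("hello world how are you", False),
]

print(split_into_segments(events_with_endpoints))
print(split_into_segments(events_without_endpoints))
```

With endpoints, the sketch yields two segments. Without them, all of the text lands in one segment, which matches the single-line output seen with method 3.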

csukuangfj · 2023-03-02T16:18:06Z

Are you using Python, and which operating system are you using?

csukuangfj · 2023-03-02T16:22:27Z

Also, could you describe in detail which commands you are using?

erquren · 2023-03-03T03:24:41Z

#!/usr/bin/env python3

# Real-time speech recognition from a microphone with sherpa-onnx Python API
# with endpoint detection.
#
# Please refer to
# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html
# to download pre-trained models

import argparse
import sys
from pathlib import Path

try:
    import sounddevice as sd
except ImportError as e:
    print("Please install sounddevice first. You can use")
    print()
    print("  pip install sounddevice")
    print()
    print("to install it")
    sys.exit(-1)

import sherpa_onnx


def assert_file_exists(filename: str):
    assert Path(
        filename
    ).is_file(), f"{filename} does not exist!\nPlease refer to https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it"


def get_args():
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )

    parser.add_argument(
        "--tokens",
        type=str,
        help="Path to tokens.txt",
    )

    parser.add_argument(
        "--encoder",
        type=str,
        help="Path to the encoder model",
    )

    parser.add_argument(
        "--decoder",
        type=str,
        help="Path to the decoder model",
    )

    parser.add_argument(
        "--joiner",
        type=str,
        help="Path to the joiner model",
    )

    parser.add_argument(
        "--wave-filename",
        type=str,
        help="""Path to the wave filename. Must be 16 kHz,
        mono with 16-bit samples""",
    )

    return parser.parse_args()


def create_recognizer():
    args = get_args()
    assert_file_exists(args.encoder)
    assert_file_exists(args.decoder)
    assert_file_exists(args.joiner)
    assert_file_exists(args.tokens)
    # Please replace the model files if needed.
    # See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html
    # for download links.
    recognizer = sherpa_onnx.OnlineRecognizer(
        tokens=args.tokens,
        encoder=args.encoder,
        decoder=args.decoder,
        joiner=args.joiner,
        enable_endpoint_detection=True,
        rule1_min_trailing_silence=2.4,
        rule2_min_trailing_silence=1.2,
        rule3_min_utterance_length=300,  # it essentially disables this rule
    )
    return recognizer


def main():
    print("Started! Please speak")
    recognizer = create_recognizer()
    sample_rate = 16000
    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms
    stream = recognizer.create_stream()

    last_result = ""
    segment_id = 0
    frame_num = 0
    with sd.InputStream(channels=1, dtype="float32", samplerate=sample_rate) as s:
        while True:
            frame_num += 1
            # print(f"frame_num: {frame_num}")
            samples, _ = s.read(samples_per_read)  # a blocking read
            samples = samples.reshape(-1)
            stream.accept_waveform(sample_rate, samples)
            while recognizer.is_ready(stream):
                recognizer.decode_stream(stream)

            is_endpoint = recognizer.is_endpoint(stream)

            result = recognizer.get_result(stream)

            if result and (last_result != result):
                last_result = result
                print(f"{segment_id}: {result}")

            if is_endpoint:
                if result:
                    segment_id += 1
                recognizer.reset(stream)


if __name__ == "__main__":
    devices = sd.query_devices()
    print(devices)
    default_input_device_idx = sd.default.device[0]
    print(f'Use default device: {devices[default_input_device_idx]["name"]}')

    try:
        main()
    except KeyboardInterrupt:
        print("\nCaught Ctrl + C. Exiting")
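
The three rule parameters passed to the recognizer in the script above decide when a pause counts as an endpoint. The sketch below shows one way to read them. It assumes Kaldi-style endpointing semantics (rule 1 fires on trailing silence alone, rule 2 fires on a shorter silence once something has been decoded, rule 3 fires on total utterance length), with all values in seconds. `EndpointConfig` and `is_endpoint` are hypothetical names used only here; this is not the library's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class EndpointConfig:
    # Defaults mirror the values used in the script above (seconds).
    rule1_min_trailing_silence: float = 2.4
    rule2_min_trailing_silence: float = 1.2
    rule3_min_utterance_length: float = 300.0


def is_endpoint(
    config: EndpointConfig,
    trailing_silence: float,
    utterance_length: float,
    decoded_something: bool,
) -> bool:
    """Return True if any of the three (assumed) rules fires."""
    # Rule 1: long trailing silence, even if nothing has been decoded yet.
    if trailing_silence >= config.rule1_min_trailing_silence:
        return True
    # Rule 2: shorter trailing silence, but only after some text was decoded.
    if decoded_something and trailing_silence >= config.rule2_min_trailing_silence:
        return True
    # Rule 3: the utterance has simply become too long.
    if utterance_length >= config.rule3_min_utterance_length:
        return True
    return False


config = EndpointConfig()

# 1.5 s of silence after some decoded text: rule 2 fires.
print(is_endpoint(config, trailing_silence=1.5, utterance_length=5.0, decoded_something=True))

# 1.5 s of silence with nothing decoded yet: no rule fires.
print(is_endpoint(config, trailing_silence=1.5, utterance_length=5.0, decoded_something=False))
```

Under this reading, setting rule 3 to 300 seconds means it almost never fires in practice, so splitting depends on rules 1 and 2. If endpoint detection is broken inside the library, none of the rules fire and the output never splits, whatever values are passed.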

erquren · 2023-03-03T03:26:31Z

I am running this ONNX model:

https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tree/main

erquren · 2023-03-03T03:31:23Z

Re: "Are you using Python, and which operating system are you using?"

https://k2-fsa.github.io/sherpa/onnx/python/install.html#method-1-from-pre-compiled-wheels
https://k2-fsa.github.io/sherpa/onnx/python/install.html#method-3-for-developers

These are the two installation methods I used: method 1 and method 3.

I have both x86 and ARM devices, and I see the same behavior on both.

csukuangfj · 2023-03-03T03:37:23Z

@dongqianzhuan

Sorry, it is a bug introduced in #69

I am fixing it.

csukuangfj · 2023-03-03T04:11:42Z

@dongqianzhuan

Please use the latest master. It should be fixed in #74.

erquren · 2023-03-03T06:22:06Z

It works normally now!
Thank you.

erquren closed this as completed Mar 3, 2023
