i have one question #75

Closed
erquren opened this issue Mar 2, 2023 · 8 comments

Comments

@erquren
Contributor

erquren commented Mar 2, 2023

This is nice work, and I really like it.
However, I ran into a problem when running this project.

I tried installing the project with methods 1 and 3 from the installation page.
When I install with method 1, it works well: the speech is recognized, and the output is split into segments whenever there is a pause in the speech.
But when I install with method 3, the speech is still recognized, but the output is not split even after a long silence. Everything is printed as one line.

Can you tell me how to solve this?

Thank you

@csukuangfj
Collaborator

Are you using Python, and which operating system are you using?

@csukuangfj
Collaborator

Also, could you describe in detail which commands you are using?

@erquren
Contributor Author

erquren commented Mar 3, 2023

#!/usr/bin/env python3

# Real-time speech recognition from a microphone with sherpa-onnx Python API
# with endpoint detection.
#
# Please refer to
# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html
# to download pre-trained models

import argparse
import sys
from pathlib import Path

try:
    import sounddevice as sd
except ImportError:
    print("Please install sounddevice first. You can use")
    print()
    print("  pip install sounddevice")
    print()
    print("to install it")
    sys.exit(-1)

import sherpa_onnx


def assert_file_exists(filename: str):
    assert Path(
        filename
    ).is_file(), f"{filename} does not exist!\nPlease refer to https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it"


def get_args():
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )

    parser.add_argument(
        "--tokens",
        type=str,
        help="Path to tokens.txt",
    )

    parser.add_argument(
        "--encoder",
        type=str,
        help="Path to the encoder model",
    )

    parser.add_argument(
        "--decoder",
        type=str,
        help="Path to the decoder model",
    )

    parser.add_argument(
        "--joiner",
        type=str,
        help="Path to the joiner model",
    )

    parser.add_argument(
        "--wave-filename",
        type=str,
        help="""Path to the wave filename. Must be 16 kHz,
        mono with 16-bit samples""",
    )

    return parser.parse_args()


def create_recognizer():
    args = get_args()
    assert_file_exists(args.encoder)
    assert_file_exists(args.decoder)
    assert_file_exists(args.joiner)
    assert_file_exists(args.tokens)
    # Please replace the model files if needed.
    # See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html
    # for download links.
    recognizer = sherpa_onnx.OnlineRecognizer(
        tokens=args.tokens,
        encoder=args.encoder,
        decoder=args.decoder,
        joiner=args.joiner,
        enable_endpoint_detection=True,
        rule1_min_trailing_silence=2.4,
        rule2_min_trailing_silence=1.2,
        rule3_min_utterance_length=300,  # it essentially disables this rule
    )
    return recognizer


def main():
    print("Started! Please speak")
    recognizer = create_recognizer()
    sample_rate = 16000
    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms
    stream = recognizer.create_stream()

    last_result = ""
    segment_id = 0
    frame_num = 0
    with sd.InputStream(channels=1, dtype="float32", samplerate=sample_rate) as s:
        while True:
            frame_num += 1
            # print(f"frame_num: {frame_num}")
            samples, _ = s.read(samples_per_read)  # a blocking read
            samples = samples.reshape(-1)
            stream.accept_waveform(sample_rate, samples)
            while recognizer.is_ready(stream):
                recognizer.decode_stream(stream)

            is_endpoint = recognizer.is_endpoint(stream)

            result = recognizer.get_result(stream)

            if result and (last_result != result):
                last_result = result
                print(f"{segment_id}: {result}")

            if is_endpoint:
                if result:
                    segment_id += 1
                recognizer.reset(stream)


if __name__ == "__main__":
    devices = sd.query_devices()
    print(devices)
    default_input_device_idx = sd.default.device[0]
    print(f'Use default device: {devices[default_input_device_idx]["name"]}')

    try:
        main()
    except KeyboardInterrupt:
        print("\nCaught Ctrl + C. Exiting")
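
For readers following the thread, the three `rule*` arguments passed to `OnlineRecognizer` in the script above can be read as Kaldi-style endpointing rules. The sketch below is a pure-Python illustration under that assumption. `EndpointRule`, `rule_fires`, `is_endpoint`, and the `must_contain_nonsilence` field are hypothetical names chosen for this example, not the sherpa-onnx API, and the real implementation presumably counts decoded frames rather than seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointRule:
    """Hypothetical illustration type; not part of sherpa_onnx."""

    must_contain_nonsilence: bool  # rule only applies once something was decoded
    min_trailing_silence: float    # seconds of silence required at the end
    min_utterance_length: float    # seconds of total audio required


def rule_fires(rule, decoded_something, trailing_silence, utterance_length):
    """Return True if a single rule is satisfied."""
    if rule.must_contain_nonsilence and not decoded_something:
        return False
    return (
        trailing_silence >= rule.min_trailing_silence
        and utterance_length >= rule.min_utterance_length
    )


def is_endpoint(rules, decoded_something, trailing_silence, utterance_length):
    """An endpoint is declared as soon as any one rule fires."""
    return any(
        rule_fires(r, decoded_something, trailing_silence, utterance_length)
        for r in rules
    )


# Values mirror the script above; which rules require decoded speech
# is an assumption based on the Kaldi convention.
RULES = (
    EndpointRule(False, 2.4, 0.0),    # rule1: long silence, even with no speech
    EndpointRule(True, 1.2, 0.0),     # rule2: shorter silence after some speech
    EndpointRule(False, 0.0, 300.0),  # rule3: very long utterance
)
```

Under these assumptions, a pause of 1.5 seconds after some recognized text should end the segment through rule 2, which is the splitting behavior reported as missing with the method 3 install.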

@erquren
Contributor Author

erquren commented Mar 3, 2023

@erquren
Contributor Author

erquren commented Mar 3, 2023

Re: "Are you using Python, and which operating system are you using?"

I installed with method 1 and method 3 from the Python installation guide:
https://k2-fsa.github.io/sherpa/onnx/python/install.html#method-1-from-pre-compiled-wheels
https://k2-fsa.github.io/sherpa/onnx/python/install.html#method-3-for-developers

I tested on both x86 and ARM devices, and the behavior is the same on both.
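
When two install methods behave differently, it can help to confirm which copy of the package Python actually imports. The helper below is a minimal sketch using only the standard library. `describe_install` is a hypothetical name for this example, and it assumes the wheel is published under the distribution name `sherpa-onnx`. A developer build that is picked up through `PYTHONPATH` may have no distribution metadata, in which case the version comes back as `None`.

```python
import importlib.util
from importlib import metadata


def describe_install(module_name, dist_name):
    """Report where a module would be loaded from and its installed version.

    Hypothetical helper for illustration; it does not import the module.
    """
    spec = importlib.util.find_spec(module_name)
    location = spec.origin if spec is not None else None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # No pip metadata, e.g. a source build found via PYTHONPATH.
        version = None
    return {"location": location, "version": version}


if __name__ == "__main__":
    # "sherpa-onnx" as the distribution name is an assumption.
    print(describe_install("sherpa_onnx", "sherpa-onnx"))
```

Comparing the reported location between the two environments shows whether the script is running against the pre-compiled wheel or the local build.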

@csukuangfj
Collaborator

@dongqianzhuan

Sorry, this is a bug introduced in #69.

I am fixing it.

@csukuangfj
Collaborator

@dongqianzhuan

Please use the latest master. It should be fixed by #74.

@erquren
Contributor Author

erquren commented Mar 3, 2023

It works normally now!
Thank you.

erquren closed this as completed Mar 3, 2023