## web_microphone_websocket demo works fine, but the android_mic_streaming demo does not
I have trained an Arabic model (domain language vocabulary of only 50k words) using the following configuration:
```
--train_files myfile.csv
--checkpoint_dir '/checkpoint-4-new'
--alphabet_config_path 'alphabet.txt'
--dev_files dev.csv
--test_files test.csv
--summary_dir 'summaries-4-new'
--train_batch_size 32
--dropout_rate 0.35
--dev_batch_size 50
--test_batch_size 30
--test_output_file '/test/test.txt'
--scorer_path '/full.kenlm.scorer'
--n_hidden 1024
--export_dir '/export'
--export_tflite true
--learning_rate 0.0001
--epochs 25
--max_to_keep 2
--use_allow_growth "true"
--stderrthreshold debug
--noearly_stop
--automatic_mixed_precision
--augment reverb[p=0.1,delay=50.0~30.0,decay=10.0:2.0~1.0]
```
I built the scorer using the following commands:
```
python3 generate_lm.py --input_txt 1000.Lines.LM.txt --output_dir --top_k 50000 --kenlm_bins /home/ubuntu/DeepSpeech_latest/EX-HD/kenlm/bin/ --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" --binary_a_bits 255 --binary_q_bits 8 --binary_type trie --discount_fallback

./generate_scorer_package --alphabet alphabet.txt --lm /lm.binary --vocab /vocab-50000.txt --package /new.scorer --default_alpha 0.931289039105002 --default_beta 1.1834137581510284
```
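On the Android side, the scorer then gets attached to the exported TFLite model. The snippet below is only a minimal sketch, assuming the 0.9.x libdeepspeech Java bindings (`org.mozilla.deepspeech.libdeepspeech`) and hypothetical on-device paths:

```kotlin
import org.mozilla.deepspeech.libdeepspeech.DeepSpeechModel

// Hypothetical on-device paths; the demo copies the model and scorer into app storage.
const val TFLITE_MODEL_PATH = "/sdcard/deepspeech/output_graph.tflite"
const val SCORER_PATH = "/sdcard/deepspeech/new.scorer"

fun createModel(): DeepSpeechModel {
    // Load the acoustic model exported with --export_tflite.
    val model = DeepSpeechModel(TFLITE_MODEL_PATH)
    // Attach the external KenLM scorer built with generate_scorer_package.
    model.enableExternalScorer(SCORER_PATH)
    // Optionally re-apply the alpha/beta used when packaging the scorer.
    model.setScorerAlphaBeta(0.931289039105002f, 1.1834137581510284f)
    return model
}
```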
With this setup I got the following test results: WER 0.055143, CER 0.017157, loss 12.519335.
The model gives good results with the web_microphone_websocket demo; my problem is with the android_mic_streaming demo.
The problem is that it recognizes only one utterance and then gets stuck. This happens with short sentences; with long sentences it recognizes more than one utterance.
For example, if my language model contains the following sentences:
1. *As he crossed toward the pharmacy*
2. *As he crossed toward the pharmacy at the corner he involuntarily turned his head because of a burst of light that had ricocheted from his temple*
3. *the man who then stole his car*
4. *a blindingly white parallelogram of the sky being unloaded from the van — a dresser with mirror, across which, as across a cinema screen, passed a flawlessly clear reflection of boughs, sliding and swaying not arboreally, but with a human vacillation, produced by the nature of those who were carrying this sky, these boughs, this sliding facade*
If I say ‘As he crossed toward the pharmacy’, it recognizes it and then gets stuck.
If I say ‘at the corner he involuntarily turned his head’, it recognizes it.
If I then pause and say ‘because of a burst of light that’, it continues to recognize it and keeps recognizing until the end of the long sentence.
If I speak any part of sentence 4, it does not get stuck and keeps recognizing until the end of the sentence.
Where is my mistake? Note that the Android demo works fine with the English model, and my training data is about 3000 hours.
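For reference, the recognition loop in the Android demo is roughly the following. This is only a minimal sketch, assuming the libdeepspeech Java bindings; `isRecording` and `readAudioChunk` are placeholders for the demo's AudioRecord handling:

```kotlin
import org.mozilla.deepspeech.libdeepspeech.DeepSpeechModel

// Minimal sketch of a libdeepspeech streaming loop (not the demo's exact code).
// `readAudioChunk` is a placeholder returning 16-bit PCM samples from AudioRecord.
fun transcribeWhileRecording(
    model: DeepSpeechModel,
    isRecording: () -> Boolean,
    readAudioChunk: () -> ShortArray
): String {
    // One streaming context per utterance; it must be re-created after finishStream().
    val stream = model.createStream()
    while (isRecording()) {
        val chunk = readAudioChunk()
        model.feedAudioContent(stream, chunk, chunk.size)
        // Partial hypothesis shown while speaking.
        val partial = model.intermediateDecode(stream)
        println("partial: $partial")
    }
    // finishStream() consumes the context; a new createStream() is needed
    // before the next utterance can be decoded.
    return model.finishStream(stream)
}
```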