Sampling frequency mismatch - consider adding --allow_{upsample,downsample} #18

Closed
dtreskunov opened this issue Jan 25, 2020 · 4 comments

@dtreskunov

Using the fr model from kaldi-android-demo with vosk 0.3, running the following code crashes the Python interpreter.

from vosk import Model, KaldiRecognizer

try:
  model = Model('fr')
  rec = KaldiRecognizer(model, 8000)
  data = b'\x00' * 1000
  print(rec.AcceptWaveform(data))
except Exception as e:
  print('Exception!', e)
  raise e

print('OK')

root@77cd514bb1ec:/models# python3 /server/bug_repro.py
vosk --min-active=200 --max-active=3000 --beam=10.0 --lattice-beam=2.0 --acoustic-scale=1.0 --frame-subsampling-factor=3 --endpoint.silence-phones=1:2:3:4:5:6:7:8:9:10 --endpoint.rule2.min-trailing-silence=0.5 --endpoint.rule3.min-trailing-silence=1.0 --endpoint.rule4.min-trailing-silence=2.0
LOG (vosk[5.5.599~1-11bed]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (vosk[5.5.599~1-11bed]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (vosk[5.5.599~1-11bed]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (vosk[5.5.599~1-11bed]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (vosk[5.5.599~1-11bed]:Collapse():nnet-utils.cc:1472) Added 1 components, removed 2
LOG (vosk[5.5.599~1-11bed]:CompileLooped():nnet-compile-looped.cc:345) Spent 0.0316849 seconds in looped compilation.
ERROR (vosk[5.5.599~1-11bed]:MaybeCreateResampler():online-feature.cc:99) Sampling frequency mismatch, expected 16000, got 8000
Perhaps you want to use the options --allow_{upsample,downsample}

[ Stack-Trace: ]
/usr/local/lib/python3.7/site-packages/vosk/_vosk.so(kaldi::MessageLogger::LogMessage() const+0x7a6) [0x7f1b546ba6ae]
/usr/local/lib/python3.7/site-packages/vosk/_vosk.so(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x13) [0x7f1b5435cd43]
/usr/local/lib/python3.7/site-packages/vosk/_vosk.so(kaldi::OnlineGenericBaseFeature<kaldi::MfccComputer>::MaybeCreateResampler(float)+0x23a) [0x7f1b5456baa2]
/usr/local/lib/python3.7/site-packages/vosk/_vosk.so(kaldi::OnlineGenericBaseFeature<kaldi::MfccComputer>::AcceptWaveform(float, kaldi::VectorBase<float> const&)+0x55) [0x7f1b5456c477]       
/usr/local/lib/python3.7/site-packages/vosk/_vosk.so(kaldi::OnlineNnet2FeaturePipeline::AcceptWaveform(float, kaldi::VectorBase<float> const&)+0x1c) [0x7f1b543adc2a]
/usr/local/lib/python3.7/site-packages/vosk/_vosk.so(KaldiRecognizer::AcceptWaveform(kaldi::Vector<float>&)+0x1c) [0x7f1b5435742c]
/usr/local/lib/python3.7/site-packages/vosk/_vosk.so(KaldiRecognizer::AcceptWaveform(char const*, int)+0x17e) [0x7f1b543575ee]
/usr/local/lib/python3.7/site-packages/vosk/_vosk.so(+0x21fe56) [0x7f1b54354e56]
/usr/local/lib/libpython3.7m.so.1.0(_PyMethodDef_RawFastCallKeywords+0x155) [0x7f1b555a64c5]
/usr/local/lib/libpython3.7m.so.1.0(_PyCFunction_FastCallKeywords+0x20) [0x7f1b555a6350]
/usr/local/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x4525) [0x7f1b5561bb95]
/usr/local/lib/libpython3.7m.so.1.0(_PyFunction_FastCallKeywords+0xfa) [0x7f1b555a748a]
/usr/local/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x672) [0x7f1b55617ce2]
/usr/local/lib/libpython3.7m.so.1.0(_PyEval_EvalCodeWithName+0x2f1) [0x7f1b55616b71]
/usr/local/lib/libpython3.7m.so.1.0(PyEval_EvalCodeEx+0x39) [0x7f1b55616879]
/usr/local/lib/libpython3.7m.so.1.0(PyEval_EvalCode+0x1b) [0x7f1b5561683b]
/usr/local/lib/libpython3.7m.so.1.0(+0x20b97e) [0x7f1b5569a97e]
/usr/local/lib/libpython3.7m.so.1.0(PyRun_FileExFlags+0x97) [0x7f1b55699d37]
/usr/local/lib/libpython3.7m.so.1.0(PyRun_SimpleFileExFlags+0x18e) [0x7f1b55699b5e]
/usr/local/lib/libpython3.7m.so.1.0(+0x210108) [0x7f1b5569f108]
/usr/local/lib/libpython3.7m.so.1.0(_Py_UnixMain+0x2e) [0x7f1b5569edfe]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7f1b5510809b]
python3(_start+0x2a) [0x55f3e267308a]

terminate called after throwing an instance of 'kaldi::KaldiFatalError'
  what():  kaldi::KaldiFatalError
Aborted
@nshmyrev
Collaborator

The Android models are wideband (except Russian), so they will not work with 8 kHz audio, only 16 kHz. That is what the error is about.

@dtreskunov
Author

dtreskunov commented Jan 26, 2020

Yes, the error message is clear - but I was wondering if adding --allow_upsample and --allow_downsample to extra_args in model.cc would allow Kaldi to resample the input to the correct rate.

That way you wouldn't need to worry about whether the model is 8 kHz or 16 kHz.

@nshmyrev
Collaborator

To decode 8 kHz audio you need an 8 kHz model; upsampling will not fix anything.
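Since the mismatch aborts the interpreter rather than raising a catchable Python exception, one option is to check the audio's sample rate on the caller side before anything is handed to the recognizer. The following is only a minimal sketch using the standard-library `wave` module; `check_wav_rate` and `SampleRateMismatch` are hypothetical names introduced for illustration and are not part of vosk, and the expected rate (16000 for the wideband model in the log above) has to be supplied by the caller.

```python
import wave


class SampleRateMismatch(ValueError):
    """Hypothetical error type: audio rate differs from what the model expects."""


def check_wav_rate(path, expected_rate):
    """Return the WAV file's sample rate, or raise if it differs from expected_rate.

    expected_rate is an assumption supplied by the caller (e.g. 16000 for a
    wideband model); it is not read from the model itself.
    """
    with wave.open(path, "rb") as wf:
        actual = wf.getframerate()
    if actual != expected_rate:
        raise SampleRateMismatch(
            f"model expects {expected_rate} Hz audio, but {path} is {actual} Hz"
        )
    return actual
```

Calling `check_wav_rate("audio.wav", 16000)` before constructing `KaldiRecognizer(model, 16000)` would surface the problem as an ordinary Python exception. It does not make a wideband model usable for 8 kHz audio; a matching model is still needed for that.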

@nshmyrev
Collaborator

Updated with 378fa34
