segmentation fault using side arpa or binary #159
Comments
Can you give more detail? When you say "arpa model", do you mean a model trained with the 2-letter arpa symbols as your vocab? I have used IPA/Arpa for my decoding, and this library does not currently support multichar symbols or unicode, as discussed in #31. That caused errors for me, but they looked nothing like this. Maybe try running a non-threaded version until you get it working, so we can see if threading is the issue?
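One way to check up front whether a label list runs into the multichar/unicode limitation is to scan it for symbols that are longer than one character or fall outside ASCII. This is only a sketch: `find_unsupported_labels` and the sample `labels` list are hypothetical and not part of ctcdecode.

```python
def find_unsupported_labels(labels):
    """Return (index, label) pairs that are multi-character or non-ASCII."""
    return [
        (i, label)
        for i, label in enumerate(labels)
        if len(label) != 1 or not label.isascii()
    ]


# Hypothetical label list mixing supported and unsupported symbols.
labels = ["_", "a", "b", "AH", "п", " "]
problems = find_unsupported_labels(labels)
print(problems)
```

An empty result means every label is a single ASCII character; anything listed is a candidate for the kind of failure discussed here.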
It seems the problem is here: I use unicode symbols, as my model is in Russian. As I understand it, unicode is not supported.
Your understanding is correct: unicode is not supported and will not work, but there is a way to hack around it. It doesn't matter what vocab/symbols/tokens you pass in, because the results you get back are encoded as ints. This means you can pass a bunch of ASCII symbols as your vocab and then map the returned ints back to your real vocab. Here's an example
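A minimal sketch of that mapping, assuming a hypothetical eight-symbol Cyrillic vocabulary and a hand-written list of ints standing in for real decoder output. The helper names `make_proxy_vocab` and `ids_to_text` are illustrative and not part of ctcdecode.

```python
# Hypothetical real vocabulary: index 0 is the CTC blank, then a few
# Cyrillic letters and a space.
real_vocab = ["_", "п", "р", "и", "в", "е", "т", " "]


def make_proxy_vocab(size, first=0x21):
    """Return `size` distinct printable single-byte ASCII symbols."""
    if size > 0x7E - first + 1:
        raise ValueError("not enough printable ASCII symbols for this vocab")
    return [chr(first + i) for i in range(size)]


def ids_to_text(ids, vocab):
    """Map decoder output indices back to the real symbols."""
    return "".join(vocab[i] for i in ids)


# proxy_vocab is what would be passed to the decoder instead of real_vocab.
proxy_vocab = make_proxy_vocab(len(real_vocab))

# Stand-in for the int indices a decoder would return (assumed, not real output).
decoded_ids = [1, 2, 3, 4, 5, 6]
text = ids_to_text(decoded_ids, real_vocab)
print(text)
```

Because the indices are shared between the two lists, position `i` in `proxy_vocab` always corresponds to position `i` in `real_vocab`. If a language model is used for scoring, it would presumably have to be built on the same proxy symbols. Whether the blank and the space need to keep their own symbols is not covered in this thread.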
Hope this helps, and let me know if you try it and get stuck somewhere.
Thanks for making it clear.
Hi.
I am trying to test ctcdecode with my arpa model.
The tests pass, but when I use my own generated arpa, I get a segfault.
Here is the gdb output:
(gdb) run my_test.py
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/bin/python3 my_test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffa5b06700 (LWP 26274)]
[New Thread 0x7fffa5305700 (LWP 26275)]
[New Thread 0x7fffa0b04700 (LWP 26276)]
[New Thread 0x7fff9e303700 (LWP 26277)]
[New Thread 0x7fff9db02700 (LWP 26278)]
[New Thread 0x7fff99301700 (LWP 26279)]
[New Thread 0x7fff98b00700 (LWP 26280)]
[New Thread 0x7fff942ff700 (LWP 26281)]
[New Thread 0x7fff91afe700 (LWP 26282)]
[New Thread 0x7fff8f2fd700 (LWP 26283)]
[New Thread 0x7fff8cafc700 (LWP 26284)]
[New Thread 0x7fff8a2fb700 (LWP 26285)]
[New Thread 0x7fff87afa700 (LWP 26286)]
[New Thread 0x7fff852f9700 (LWP 26287)]
[New Thread 0x7fff82af8700 (LWP 26288)]
[New Thread 0x7fff640d5700 (LWP 26289)]
[New Thread 0x7fff638d4700 (LWP 26290)]
[New Thread 0x7fff630d3700 (LWP 26291)]
[New Thread 0x7fff628d2700 (LWP 26292)]
Thread 18 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff638d4700 (LWP 26290)]
fst::SortedMatcher<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl > > > > >::BinarySearch (this=0x7fff54000cc0) at /tmp/pip-req-build-yz1tiwhq/third_party/openfst-1.6.7/src/include/fst/matcher.h:360
360 /tmp/pip-req-build-yz1tiwhq/third_party/openfst-1.6.7/src/include/fst/matcher.h: No such file or directory.
And here is the backtrace output:
(gdb) bt
#0 fst::SortedMatcher<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl > > > > >::BinarySearch (this=0x7fff54000cc0) at /tmp/pip-req-build-yz1tiwhq/third_party/openfst-1.6.7/src/include/fst/matcher.h:360
#1 fst::SortedMatcher<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl > > > > >::Search (this=0x7fff54000cc0) at /tmp/pip-req-build-yz1tiwhq/third_party/openfst-1.6.7/src/include/fst/matcher.h:384
#2 fst::SortedMatcher<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl > > > > >::Find (match_label=2, this=) at /tmp/pip-req-build-yz1tiwhq/third_party/openfst-1.6.7/src/include/fst/matcher.h:256
#3 PathTrie::get_path_trie (this=this@entry=0x7fff638d3be8, new_char=new_char@entry=1, new_timestep=0, cur_log_prob_c=cur_log_prob_c@entry=-16.6958027, reset=reset@entry=true)
at /tmp/pip-req-build-yz1tiwhq/ctcdecode/src/path_trie.cpp:61
#4 0x00007fff7b0ae2c4 in DecoderState::next (this=this@entry=0x7fff638d3b80, probs_seq=std::vector of length 353, capacity 353 = {...})
at /tmp/pip-req-build-yz1tiwhq/ctcdecode/src/ctc_beam_search_decoder.cpp:107
#5 0x00007fff7b0afc41 in ctc_beam_search_decoder (probs_seq=std::vector of length 353, capacity 353 = {...}, vocabulary=..., beam_size=, cutoff_prob=,
cutoff_top_n=, blank_id=, log_input=0, ext_scorer=0x24da860) at /tmp/pip-req-build-yz1tiwhq/ctcdecode/src/ctc_beam_search_decoder.cpp:224
#6 0x00007fff7b0b06da in std::__invoke_impl<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > >, std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > (&)(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer), std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >&, std::vector<std::string, std::allocatorstd::string >&, unsigned long&, double&, unsigned long&, unsigned long&, int&, Scorer*&> (__f=) at /usr/include/c++/7/bits/invoke.h:60
#7 std::__invoke<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > (&)(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer), std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >&, std::vector<std::string, std::allocatorstd::string >&, unsigned long&, double&, unsigned long&, unsigned long&, int&, Scorer*&> (__fn=) at /usr/include/c++/7/bits/invoke.h:96
#8 std::_Bind<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ((std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >, std::vector<std::string, std::allocatorstd::string >, unsigned long, double, unsigned long, unsigned long, int, Scorer))(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer*)>::__call<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > >, , 0ul, 1ul, 2ul, 3ul, 4ul, 5ul, 6ul, 7ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul, 2ul, 3ul, 4ul, 5ul, 6ul, 7ul>) (__args=..., this=) at /usr/include/c++/7/functional:469
#9 std::_Bind<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ((std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >, std::vector<std::string, std::allocatorstd::string >, unsigned long, double, unsigned long, unsigned long, int, Scorer))(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer*)>::operator()<, std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > >() (this=) at /usr/include/c++/7/functional:551
#10 std::__invoke_impl<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > >, std::_Bind<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ((std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >, std::vector<std::string, std::allocatorstd::string >, unsigned long, double, unsigned long, unsigned long, int, Scorer))(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer*)>&>(std::__invoke_other, std::_Bind<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ((std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >, std::vector<std::string, std::allocatorstd::string >, unsigned long, double, unsigned long, unsigned long, int, Scorer))(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer*)>&) (__f=...) at /usr/include/c++/7/bits/invoke.h:60
#11 std::__invoke<std::_Bind<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ((std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >, std::vector<std::string, std::allocatorstd::string >, unsigned long, double, unsigned long, unsigned long, int, Scorer))(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer*)>&>(std::_Bind<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ((std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >, std::vector<std::string, std::allocatorstd::string >, unsigned long, double, unsigned long, unsigned long, int, Scorer))(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer*)>&) (__fn=...) at /usr/include/c++/7/bits/invoke.h:96
#12 std::__future_base::_Task_state<std::_Bind<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ((std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >, std::vector<std::string, std::allocatorstd::string >, unsigned long, double, unsigned long, unsigned long, int, Scorer))(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer*)>, std::allocator, std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ()>::_M_run()::{lambda()#1}::operator()() const (
__closure=) at /usr/include/c++/7/future:1421
#13 std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > >, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::_Bind<std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ((std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > >, std::vector<std::string, std::allocatorstd::string >, unsigned long, double, unsigned long, unsigned long, int, Scorer))(std::vector<std::vector<double, std::allocator >, std::allocator<std::vector<double, std::allocator > > > const&, std::vector<std::string, std::allocatorstd::string > const&, unsigned long, double, unsigned long, unsigned long, int, Scorer*)>, std::allocator, std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > ()>::_M_run()::{lambda()#1}, std::vector<std::pair<double, Output>, std::allocator<std::pair<double, Output> > > >::operator()() const (this=0x7fff638d3df0) at /usr/include/c++/7/future:1339