thread 'main' panicked at 'internal error: entered unreachable code', helix-core/src/movement.rs:203:9 #123

Closed
CBenoit opened this issue Jun 5, 2021 · 8 comments · Fixed by #129 or #135

Comments

@CBenoit
Member

CBenoit commented Jun 5, 2021

I'm running on commit 407b37c

I got this error when writing some Japanese text in my buffer. The issue is likely with the character 。. I changed the code to display the character; here is the full trace:

thread 'main' panicked at 'internal error: entered unreachable code: '。'', helix-core/src/movement.rs:203:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/std/src/panicking.rs:493:5
   1: core::panicking::panic_fmt
             at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/core/src/panicking.rs:92:14
   2: helix_core::movement::categorize
             at ./helix-core/src/movement.rs:203:9
   3: helix_core::movement::move_prev_word_start
             at ./helix-core/src/movement.rs:113:30
   4: hx::commands::move_prev_word_start::{{closure}}
             at ./helix-term/src/commands.rs:241:9
   5: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:280:13
   6: core::option::Option<T>::map
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/option.rs:487:29
   7: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/adapters/map.rs:101:9
   8: <smallvec::SmallVec<A> as core::iter::traits::collect::Extend<<A as smallvec::Array>::Item>>::extend
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.6.1/src/lib.rs:1663:36
   9: <smallvec::SmallVec<A> as core::iter::traits::collect::FromIterator<<A as smallvec::Array>::Item>>::from_iter
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.6.1/src/lib.rs:1648:9
  10: core::iter::traits::iterator::Iterator::collect
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/traits/iterator.rs:1764:9
  11: helix_core::selection::Selection::transform
             at ./helix-core/src/selection.rs:281:13
  12: hx::commands::move_prev_word_start
             at ./helix-term/src/commands.rs:240:21
  13: hx::ui::editor::EditorView::command_mode
             at ./helix-term/src/ui/editor.rs:530:21
  14: <hx::ui::editor::EditorView as hx::compositor::Component>::handle_event
             at ./helix-term/src/ui/editor.rs:618:33
  15: hx::compositor::Compositor::handle_event
             at ./helix-term/src/compositor.rs:112:19
  16: hx::application::Application::handle_terminal_events
             at ./helix-term/src/application.rs:144:32
  17: hx::application::Application::event_loop::{{closure}}
             at ./helix-term/src/application.rs:108:21
  18: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
  19: hx::application::Application::run::{{closure}}
             at ./helix-term/src/application.rs:266:9
  20: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
  21: hx::main::{{closure}}
             at ./helix-term/src/main.rs:157:5
  22: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
  23: tokio::park::thread::CachedParkThread::block_on::{{closure}}
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/park/thread.rs:263:54
  24: tokio::coop::with_budget::{{closure}}
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/coop.rs:106:9
  25: std::thread::local::LocalKey<T>::try_with
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:272:16
  26: std::thread::local::LocalKey<T>::with
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:248:9
  27: tokio::coop::with_budget
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/coop.rs:99:5
  28: tokio::coop::budget
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/coop.rs:76:5
  29: tokio::park::thread::CachedParkThread::block_on
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/park/thread.rs:263:31
  30: tokio::runtime::enter::Enter::block_on
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/runtime/enter.rs:151:13
  31: tokio::runtime::thread_pool::ThreadPool::block_on
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/runtime/thread_pool/mod.rs:71:9
  32: tokio::runtime::Runtime::block_on
             at /home/auroden/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/runtime/mod.rs:452:43
  33: hx::main
             at ./helix-term/src/main.rs:159:5
  34: core::ops::function::FnOnce::call_once
             at /home/auroden/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5

UPDATE: it happens when moving using commands like b, e or w with various unicode characters (another one is ).

@kirawi
Member

kirawi commented Jun 5, 2021

Oh boy, here we go again. At least it wasn't a regression. I'll go look at this tomorrow and hopefully fix it the same day.

@archseer
Member

archseer commented Jun 5, 2021

It's categorize not handling UTF-8: https://github.com/helix-editor/helix/blob/master/helix-core/src/movement.rs#L202

Kakoune falls back to categorizing everything as punctuation: https://github.com/mawww/kakoune/blob/master/src/unicode.hh#L123

@archseer
Member

archseer commented Jun 5, 2021

It's not fully correct to do that, but it should work with movement since we usually don't care about the category, just whether two characters fit in the same category.
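
The point that movement only needs category equality can be sketched roughly as follows. This is a minimal illustration, not the Helix implementation: the `Category` enum and the `categorize` and `is_boundary` functions are assumed names, and the final `else` branch mirrors the Kakoune-style punctuation fallback mentioned above.

```rust
// Hypothetical sketch: names and structure are assumptions, not Helix's code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    Whitespace,
    Eol,
    Word,
    Punctuation,
}

fn categorize(ch: char) -> Category {
    if ch == '\n' {
        Category::Eol
    } else if ch.is_whitespace() {
        Category::Whitespace
    } else if ch.is_alphanumeric() || ch == '_' {
        Category::Word
    } else {
        // Kakoune-style fallback: anything else counts as punctuation.
        Category::Punctuation
    }
}

/// A word boundary lies between two characters whose categories differ.
fn is_boundary(a: char, b: char) -> bool {
    categorize(a) != categorize(b)
}

fn main() {
    // '。' is not ASCII punctuation, but the fallback still gives it a category.
    assert_eq!(categorize('。'), Category::Punctuation);
    // Same category on both sides: no boundary, whatever the category is called.
    assert!(!is_boundary('。', '、'));
    // Different categories: a boundary.
    assert!(is_boundary('a', '。'));
}
```

Because `is_boundary` only compares categories, a slightly wrong label is harmless as long as neighbouring characters that belong together receive the same label.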

@pickfire
Contributor

pickfire commented Jun 5, 2021

I wonder how we can handle these. I think the set should be limited, so maybe we can just add them? Just 。,、!?「」-……ー, oh wait, there are quite a lot. Should we just add the common ones until someone complains again? I don't think there is a good way to solve this.

A quick hack to make others aware of that. I saw the unreachable as well, but I thought it was safe. Oh, I accidentally committed to the master branch.

unreachable!("unknown '{}' character category", ch)

pickfire added a commit that referenced this issue Jun 5, 2021
Better error for #123
@archseer
Member

archseer commented Jun 5, 2021

I think for the time being we could just match Kakoune and fall back to punctuation, that way there's at least no panic. For categorization I saw https://docs.rs/unicode_categories/0.1.1/unicode_categories/index.html but it doesn't seem to be actively maintained.

@CBenoit
Member Author

CBenoit commented Jun 5, 2021

I would go for the Kakoune approach because I don't really see what else could end up in that unreachable!, and it works well enough in Kakoune.

CBenoit added a commit to CBenoit/helix that referenced this issue Jun 5, 2021
`is_ascii_punctuation` will only work for ASCII punctuations, and when
we have unicode punctuation we jump into the `unreachable`.
This patch fallback into categorizing everything in this branch as
punctuation (kakoune approach).

Fixes helix-editor#123
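
The failure mode described in this commit message can be reproduced with the standard library alone. The sketch below uses only `char` methods from `std`. The helper `is_ascii_classifiable` is a hypothetical name introduced for illustration, and it assumes the pre-fix code relied on ASCII-only checks.

```rust
/// Hypothetical helper: true if any ASCII-only predicate recognizes `ch`.
fn is_ascii_classifiable(ch: char) -> bool {
    ch.is_ascii_whitespace()
        || ch.is_ascii_punctuation()
        || ch.is_ascii_alphanumeric()
        || ch == '_'
}

fn main() {
    // ASCII punctuation is recognized.
    assert!('.'.is_ascii_punctuation());
    // The ideographic full stop is punctuation in Unicode, but it is not ASCII.
    assert!(!'。'.is_ascii_punctuation());
    assert!(!'。'.is_ascii());
    // A chain of ASCII-only checks therefore has no branch left for it.
    assert!(is_ascii_classifiable(','));
    assert!(!is_ascii_classifiable('。'));
}
```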
@kirawi
Member

kirawi commented Jun 5, 2021

> I think for the time being we could just match Kakoune and fall back to punctuation, that way there's at least no panic. For categorization I saw https://docs.rs/unicode_categories/0.1.1/unicode_categories/index.html but it doesn't seem to be actively maintained.

I don't think that'll be a problem: the crate looks more or less complete, so it shouldn't need much maintenance. I'll look into integrating it and leave #129 as a fallback, though I'm curious what you mean by us not usually caring about the category while moving?

Edit:
Actually, it looks like regex already has support for this? Could we leverage that? We could also look into icu4x.

@kirawi
Member

kirawi commented Jun 6, 2021

I think I found a good solution using the `unicode-general-category` crate:

use unicode_general_category::{get_general_category, GeneralCategory};

fn main() {
    // ASCII and CJK punctuation that should all share one general category.
    const TEST_CASE: &str = ".,!?;:。、!?;:";

    for chr in TEST_CASE.chars() {
        assert_eq!(get_general_category(chr), GeneralCategory::OtherPunctuation);
    }

    // Hiragana is a letter, so it must not be classified as punctuation.
    assert_ne!(get_general_category('お'), GeneralCategory::OtherPunctuation);
}

kirawi mentioned this issue Jun 6, 2021
CBenoit added a commit to CBenoit/helix that referenced this issue Jun 6, 2021
`is_ascii_punctuation` will only work for ASCII punctuations, and when
we have unicode punctuation we jump into the `unreachable`.
This patch fallback into categorizing everything in this branch as
punctuation (kakoune approach).

Fixes helix-editor#123
CBenoit added a commit to CBenoit/helix that referenced this issue Jun 6, 2021
`is_ascii_punctuation` will only work for ASCII punctuations, and when
we have unicode punctuation (or other) we jump into the `unreachable`.
This patch fallback into categorizing everything in this branch as
`Unknown`.

Fixes helix-editor#123

helix-editor#135: add better support for
unicode categories.
archseer pushed a commit that referenced this issue Jun 7, 2021
`is_ascii_punctuation` will only work for ASCII punctuations, and when
we have unicode punctuation (or other) we jump into the `unreachable`.
This patch fallback into categorizing everything in this branch as
`Unknown`.

Fixes #123

#135: add better support for
unicode categories.
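
A rough sketch of the `Unknown` fallback this commit describes. The variant set and the function body are illustrative guesses, not the merged code; only the idea of replacing the `unreachable` branch with an `Unknown` category comes from the commit message.

```rust
// Hypothetical sketch: the enum and function are assumptions for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    Whitespace,
    Eol,
    Word,
    Punctuation,
    Unknown,
}

fn categorize(ch: char) -> Category {
    if ch == '\n' {
        Category::Eol
    } else if ch.is_ascii_whitespace() {
        Category::Whitespace
    } else if ch.is_ascii_alphanumeric() || ch == '_' {
        Category::Word
    } else if ch.is_ascii_punctuation() {
        Category::Punctuation
    } else {
        // Formerly the `unreachable!` branch; now a catch-all instead of a panic.
        Category::Unknown
    }
}

fn main() {
    assert_eq!(categorize(','), Category::Punctuation);
    // Non-ASCII characters no longer panic; they share one bucket.
    assert_eq!(categorize('。'), Category::Unknown);
    assert_eq!(categorize('お'), Category::Unknown);
}
```

Under these assumptions every non-ASCII character lands in the same bucket, so a letter and a punctuation mark from the same script would not be separated by word motions. That limitation is what the follow-up in #135 on Unicode category support is meant to address.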