Indenting gets confused with multibyte chars #26

Open
hukka opened this issue Jun 17, 2018 · 2 comments

hukka commented Jun 17, 2018

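If a line contains characters that take multiple bytes in UTF-8, parinfer-rust does not let the next line sit at the same visual indent level. Instead it requires as many indentation spaces as there are bytes before the target column in the previous line. Below, the first form is the indentation it insists on, the second is the ASCII equivalent, and the third is what I get when I align the line visually: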

(def äää {:foo 1
             :bar 2})

(def aaa {:foo 1
          :bar 2})

(def äää {:foo 1}
          :bar 2)
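
The three extra spaces in the first form match the three extra bytes in `äää`, which is consistent with the column being measured in bytes rather than characters. A minimal sketch of that difference using only the standard library follows; `byte_column` and `char_column` are hypothetical helper names for illustration and are not taken from parinfer-rust's code.

```rust
/// Byte offset of `needle` in `line`; this is what `str::find` returns.
fn byte_column(line: &str, needle: &str) -> Option<usize> {
    line.find(needle)
}

/// Number of Unicode scalar values (code points) before `needle`.
fn char_column(line: &str, needle: &str) -> Option<usize> {
    line.find(needle).map(|i| line[..i].chars().count())
}

fn main() {
    // "ä" is written as the escape U+00E4 so the precomposed form is certain.
    let accented = "(def \u{e4}\u{e4}\u{e4} {:foo 1";
    let ascii = "(def aaa {:foo 1";

    for line in [accented, ascii] {
        println!(
            "{}\n  byte column of :foo = {:?}, char column of :foo = {:?}",
            line,
            byte_column(line, ":foo"),
            char_column(line, ":foo")
        );
    }
}
```

For the accented line the byte column is 13 and the character column is 10; for the ASCII line both are 10, matching the indentation widths in the examples above.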

hukka commented Jun 17, 2018

I suppose the standard library can't do this on its own, and something like https://crates.io/crates/unicode-segmentation is needed to do the "iteration over grapheme clusters", as the docs put it.

hukka commented Jun 17, 2018

Or perhaps http://unicode-rs.github.io/unicode-width/unicode_width/index.html is better. I'm way out of my depth here. I can see the problem with some European languages, but I have no idea how easily this could be solved "generally", or whether it's even possible given current terminals, fonts and OS font rendering.

FWIW my specific problem would probably go away even by counting code points, which, I realize, is a horrible, horrible hack for doing Unicode "right".
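
To show why counting code points is only a partial fix, here is a small standard-library sketch. `code_point_width` is a hypothetical name, and the remarks about rendered width describe typical terminal behavior rather than anything parinfer-rust does.

```rust
/// Column estimate obtained by counting Unicode scalar values (code points).
fn code_point_width(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    let precomposed = "\u{e4}"; // ä as one code point, 2 bytes in UTF-8
    let decomposed = "a\u{308}"; // a + COMBINING DIAERESIS, renders like ä
    let wide = "\u{65e5}"; // 日, usually drawn two cells wide in terminals

    for s in [precomposed, decomposed, wide] {
        println!(
            "{:?}: {} bytes, {} code points",
            s,
            s.len(),
            code_point_width(s)
        );
    }
}
```

The precomposed `ä` counts as one column, which would fix the case in this issue. The decomposed form counts as two even though it renders as one character, and the wide character counts as one even though it typically occupies two cells. Those remaining cases are what grapheme segmentation and display-width tables are meant to handle.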
