UnicodeEncodeError: ascii codec can't encode character u'\u2019' #2066
Comments
The reason I am filing a ticket here is that I don't have any non-ASCII characters there. It seems the problems others reported previously were related to international locale characters... |
What is your system locale setting? I had a similar issue on my Arch Linux machine, but when I changed my locale setting from zh_CN to en_US, the issue disappeared. |
On a Mac, the only place I can think of where the language is set is when initializing the machine, and it was English...
|
I am not a Mac user, so I cannot help you there. On Linux, you can just type "locale" in the terminal to check your locale setting. |
Ah, I just filed a ticket for the same thing: #2069 |
In my case I am writing bash scripts, and I can also reproduce it in C/C++ files. |
Seems to be OSX + any syntax highlighted file + |
This problem does not seem to occur every time. I just tried again and it has disappeared... five minutes ago it was still there. |
So far the problem has not come back... I think the following might help:
and then reinstall YCM. |
I'll give that a try; if it solves mine, I'll update here and close my issue too. |
Also, you might want to check your vimrc. I have removed all my own settings and left only the plugin configuration for now. If the problem stays away for a while, I will add my vimrc settings back to see whether they cause any problems.
|
Yeah, your first suggestion didn't do the trick, unfortunately. |
Does running YcmRestartServer help? The problem is back for me as well, but restarting the server with YcmRestartServer seems to help. |
Also try killing all ycm processes: |
No luck. I've got work to do, so I'll disable it for now. Hopefully this can be resolved soon. |
@dragonxlwang It would be really useful if you could check out my YouCompleteMe and ycmd branches, which I hope resolve this (and a number of other) unicode issues:
Feedback on whether or not it resolves your particular issue will speed up getting it PR'd :) Thanks! |
@puremourning It seems that after the last few updates I can no longer reproduce the problem... The only thing I noticed is that after typing the first character, the path completion disappears. However, it doesn't throw any errors... |
The error is usually still there when you enable With my change, you should still get completions, but the ones containing unicode chars are not listed once you have typed a query. Unfortunately, supporting queries on non-ASCII strings is not possible with ycmd's search infrastructure. What I have fixed is the crash that occurred when it tried to. |
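A minimal sketch of the behaviour described in the comment above, assuming Python 3; it is not ycmd's actual code. `encode_fails` reproduces the kind of error in the issue title by encoding U+2019 with the ASCII codec, and `filter_candidates` is a hypothetical filter (name and prefix matching are illustrative assumptions) that skips non-ASCII candidates once a query has been typed instead of crashing.

```python
def is_ascii(text):
    """Return True if every character in text is in the ASCII range."""
    return all(ord(ch) < 128 for ch in text)


def encode_fails(text):
    """Return True if encoding text as ASCII raises UnicodeEncodeError."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


def filter_candidates(candidates, query):
    """Hypothetical filter, not ycmd's implementation.

    With an empty query every candidate is returned. Once a query is
    typed, candidates containing non-ASCII characters are skipped rather
    than handed to an ASCII-only matcher. Prefix matching is a
    simplification chosen for this sketch.
    """
    if not query:
        return list(candidates)
    return [c for c in candidates
            if is_ascii(c) and c.lower().startswith(query.lower())]
```

With candidates such as `["Documents", "D\u2019oh", "Desktop"]`, an empty query would list all three, while the query `"D"` would list only the two ASCII names.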
Hi @puremourning, after checking out your version, I still get the error after enabling noshowmode I have to run Please let me know if I missed anything... |
by running
I could update the submodules in the YCM branch, I suppose... |
Following your instructions, some submodules still seem to be missing... I will wait for a further update. |
That's very strange :/ Anyway, I have updated the submodules in my YouCompleteMe branch, so the new instructions are:
The CI build is running, so I'm confident that all the modules are correctly set up. |
Again, when typing ``~/'' I get the same error. All I did was
|
I recently updated to a newer version of YCM and this problem occurs again (it had not for quite a while)... @puremourning, is there any follow-up? Thanks! This problem is OS X specific and might be more general than it seems: on a freshly installed Mac, the problem occurs if the directory candidates include a filename containing a space, like |
[READY] Fix issues with multi-byte characters

## Summary

This change introduces more general support for non-ASCII characters in buffers handled by YCMD.

In ycmd's public API, all offsets are byte offsets into the UTF-8 encoded buffers. We also assume (because, we have no other choice) that files stored on disk are also UTF-8 encoded.

Internally, almost all of ycmd's functionality operates on unicode strings (python 2 `unicode()` and python 3 `str()` objects, transparently via `future`). Many of the downstream completion engines expect unicode code points as the offsets in their APIs.

One special case is the `ycm_core` library (identifier completer and clang completer), which requires instances of the _native_ `str` type. All strings used within the c++ using `boost::python` require passing through `ToCppStringCompatible`.

Previously, we were largely just assuming that `code point == byte offset` - i.e. all buffers contained only ASCII characters. This worked up to a point, but more by luck than judgement in a number of places.

## References

In combination with a YCM change and PR #453, I hope this:

- fixes #109
- fixes ycm-core/YouCompleteMe#2096
- fixes ycm-core/YouCompleteMe#2088
- fixes ycm-core/YouCompleteMe#2069
- fixes ycm-core/YouCompleteMe#2066
- fixes ycm-core/YouCompleteMe#1378

## Overview of changes

The changes fall into the following areas:

- Providing access to and conversion to/from code points and byte offsets (`request_wrap.py`)
- Changing certain algorithms/features to work entirely in codepoint space when they are trying to operate on logical 'characters' within the buffer (see known issues for why this isn't perfect, but probably most of the way there)
- Changing the completers to convert between the external (on both sides) and internal representations by using the shortcuts provided in `request_wrap.py`
- Adding tests for each of the completers for both completions and subcommands

## Completer-specific notes

Pretty much all of the completers I tested required some changes:

- clang uses utf-8 and byte offsets, but had some bugs with the `GetDoc` parsing stuff
- OmniSharp speaks codepoint offsets
- Tern speaks codepoint offsets
- JediHTTP speaks codepoint offsets
- tsserver speaks codepoint offsets
- gocode speaks byte offsets
- racer i did not test

## Further work / Known issues

- we act blissfully ignorant of the case where a unicode character consumes multiple code points (such as where there is a modifier after the code point)
- when typing a unicode character, we still get an exception from `bitset` (see #453 for that fix)
- the filtering and sorting system is 100% designed for ASCII only, and it is not in the scope of this PR to change that. Currently after any filtering operation, words containing non-ASCII characters are excluded.
- I did not get round to testing rust using racer
- there are further changes required to YouCompleteMe client (a further PR is coming for that)

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/valloric/ycmd/455)
<!-- Reviewable:end -->
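The distinction between byte offsets and code point offsets drawn in the PR description can be illustrated with a short sketch. This is not the code in `request_wrap.py`; the two function names are hypothetical, offsets are assumed to be 0-based, and Python 3 `str` is assumed.

```python
def codepoint_to_byte_offset(line, codepoint_offset):
    """Convert a 0-based code point offset into a 0-based UTF-8 byte offset."""
    # Encode only the text before the offset and count the resulting bytes.
    return len(line[:codepoint_offset].encode("utf-8"))


def byte_to_codepoint_offset(line, byte_offset):
    """Convert a 0-based UTF-8 byte offset into a 0-based code point offset.

    byte_offset must fall on a character boundary; otherwise decoding
    the truncated byte string raises UnicodeDecodeError.
    """
    return len(line.encode("utf-8")[:byte_offset].decode("utf-8"))


line = "caf\u00e9 = 1"                          # U+00E9 takes two bytes in UTF-8
space_bytes = codepoint_to_byte_offset(line, 4)  # byte offset of the first space
space_codepoints = byte_to_codepoint_offset(line, space_bytes)

# The known issue about modifiers: "e" followed by U+0301 (combining acute
# accent) displays as one character but is two code points.
combining = "e\u0301"
```

For pure ASCII text the two offsets coincide, which is why the earlier assumption that `code point == byte offset` worked until a character such as U+2019 (three bytes in UTF-8) appeared in a buffer.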
vim version:
and I am getting the following error log:
Mac OS X 10.11.3
And YCM: commit f44435b