Unable to enter Chinese prompt #646

Closed
LainNya opened this issue Mar 31, 2023 · 8 comments
Labels
windows Issues specific to Windows

Comments

@LainNya

LainNya commented Mar 31, 2023

Hi! I'm using main.exe compiled on Windows. When I type a Chinese prompt, the model does not seem to understand it. While debugging, I found that std::getline(std::cin, line) returns empty lines. I then tried Japanese and got the same result.
(Since I am a native Chinese speaker, this question was translated by DeepL)

@TingTingin

have you tried this https://github.com/ymcui/Chinese-LLaMA-Alpaca?

@LainNya
Author

LainNya commented Mar 31, 2023

> have you tried this https://github.com/ymcui/Chinese-LLaMA-Alpaca?

Yes, this is the model I use. I ran the command in Git Bash and it seems to work fine?

@LainNya
Author

LainNya commented Mar 31, 2023

The Chinese value seen in the debugger looks like this:
[image]
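
Since the screenshot is not reproduced here, a text alternative may help anyone trying to reproduce the problem: dump the raw bytes that std::getline actually returned. This is only a minimal sketch; `hex_bytes` is a hypothetical helper and is not part of llama.cpp.

```cpp
#include <cstddef>
#include <string>

// Hypothetical debugging helper (not part of llama.cpp):
// render every byte of `s` as two lowercase hex digits, separated by spaces.
std::string hex_bytes(const std::string& s) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if (i != 0) {
            out += ' ';
        }
        out += digits[b >> 4];    // high nibble
        out += digits[b & 0x0F];  // low nibble
    }
    return out;
}
```

Printing `hex_bytes(line)` to stderr right after the `std::getline(std::cin, line)` call shows what arrived. Correctly delivered UTF-8 for 你好 would be `e4 bd a0 e5 a5 bd`; an empty string or a different byte sequence suggests the input was lost or re-encoded before it reached the program.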

@boholder

boholder commented Apr 1, 2023

Let me provide more information about this issue. Hope this helps to solve the problem.

@LainNya found out that:
In the CMD and PowerShell terminals on Windows (and also in the Cygwin64 Terminal, as I tested), llama.cpp cannot read the Chinese text that is entered. In other terminals, such as Git Bash and the Windows Subsystem for Linux (which is really just a Linux environment), it works.

It seems that std::getline doesn't support the UTF-8 character set.

This is a detailed explanation of the cause, with a solution, that I found, though I can't judge whether it is correct since I'm not familiar with C++:

https://cboard.cprogramming.com/cplusplus-programming/145590-non-english-characters-cout-post1086757.html#post1086757

I'm not sure if this should be chalked up to a Windows terminal compatibility issue with C++.

Anyway, use Git Bash or the Windows Subsystem for Linux as a workaround for now.
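
For readers wondering what a fix for the std::getline behaviour described above could look like, here is a rough sketch. It reads the Windows console as UTF-16 with `ReadConsoleW` and converts the result to UTF-8 by hand. It is an assumption-laden illustration, not the code used by llama.cpp or by the linked forum post: `read_line_utf8`, `utf16_to_utf8`, and the 4096-character buffer are all made up for this example, and Ctrl+Z handling and very long lines are ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Append one Unicode code point to `out`, encoded as UTF-8.
static void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: convert UTF-16 to UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {            // high surrogate
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;                                   // consume the low surrogate
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {     // stray low surrogate
            cp = 0xFFFD;
        }
        append_utf8(out, cp);
    }
    return out;
}

// Hypothetical helper: read one line from standard input as UTF-8.
bool read_line_utf8(std::string& line) {
#ifdef _WIN32
    HANDLE h = GetStdHandle(STD_INPUT_HANDLE);
    DWORD mode = 0;
    if (GetConsoleMode(h, &mode)) {                    // stdin is an interactive console
        wchar_t buf[4096];
        DWORD n = 0;
        if (!ReadConsoleW(h, buf, 4096, &n, nullptr) || n == 0) {
            return false;
        }
        std::u16string w(buf, buf + n);                // wchar_t is UTF-16 on Windows
        while (!w.empty() && (w.back() == u'\r' || w.back() == u'\n')) {
            w.pop_back();                              // drop the trailing "\r\n"
        }
        line = utf16_to_utf8(w);
        return true;
    }
#endif
    // Redirected input on Windows, or any other platform: plain byte-oriented read.
    return static_cast<bool>(std::getline(std::cin, line));
}
```

The `GetConsoleMode` check is there so that piped or redirected input still goes through the ordinary `std::getline` path; only an interactive console takes the UTF-16 route.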

@josStorer

I modified the getline part: I rewrote a simple getline using _getwch. It works, but it is not cross-platform.

Here is the repo: https://github.com/josStorer/llama.cpp-unicode-windows
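
For illustration only, a getline built on a per-character read such as `_getwch` might be shaped roughly like this. This is not the code from the repository above: `read_line_wide` and `read_console_line` are hypothetical names, the character source is passed in as a callback so the loop can be exercised on any platform, and echoing the typed characters and the two-part codes that `_getwch` returns for function and arrow keys are left out.

```cpp
#include <cwchar>
#include <functional>
#include <string>

#ifdef _WIN32
#include <conio.h>  // _getwch
#endif

// Hypothetical sketch: build one line from wide characters delivered one at a
// time by `next_char`. Reading stops at Enter ('\r' when using _getwch), at
// '\n', or at WEOF. Backspace removes the last stored character.
std::wstring read_line_wide(const std::function<std::wint_t()>& next_char) {
    const std::wint_t cr = static_cast<std::wint_t>(L'\r');
    const std::wint_t lf = static_cast<std::wint_t>(L'\n');
    const std::wint_t bs = static_cast<std::wint_t>(L'\b');

    std::wstring line;
    for (;;) {
        const std::wint_t c = next_char();
        if (c == WEOF || c == cr || c == lf) {
            break;
        }
        if (c == bs) {
            if (!line.empty()) {
                line.pop_back();  // note: removes only half of a surrogate pair
            }
            continue;
        }
        line.push_back(static_cast<wchar_t>(c));
    }
    return line;
}

#ifdef _WIN32
// Windows-only wiring: take the characters straight from the console.
std::wstring read_console_line() {
    return read_line_wide([]() -> std::wint_t { return _getwch(); });
}
#endif
```

The resulting wide string would still need to be converted to UTF-8 before being handed to the tokenizer.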


@tomsnunes

This Unicode problem had been fixed for me by PR #420, but I can confirm that it is currently not possible to enter or display UTF-8 characters.

@prusnak
Sponsor Collaborator

prusnak commented Apr 8, 2023

#840 has been merged. Please pull the latest master and test whether it fixes your issue.

@boholder

boholder commented Apr 9, 2023

LGTM after replacing my binaries with the latest ones from https://github.com/ggerganov/llama.cpp/releases/tag/master-aaf3b23
(The screenshot will expire after one month.)
Modifying the prompt also works. Nice job! 👍

6 participants