grep: Invalid collation character #6
Comments
Must be an issue with your locale. Try calling If that doesn't fix it, let me know. If that does fix it, I can add it to the script.
@polm I worked around it by substituting the grep string's special characters (i.e. "wide latin" characters) with their respective Unicode hex numbers i.e.
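A minimal sketch of this kind of workaround, assuming the pattern contains full-width Latin letters: rather than typing the character into the script, build it from its UTF-8 bytes with `printf` and match it as a fixed string. The sample input and the variable names `wide_a` and `count` are hypothetical and not taken from clean-lex.sh; the bytes shown are those of U+FF21 (FULLWIDTH LATIN CAPITAL LETTER A).

```shell
# U+FF21 (fullwidth "A") is EF BC A1 in UTF-8, i.e. octal 357 274 241.
# Octal escapes are used because POSIX printf does not guarantee \x escapes.
wide_a=$(printf '\357\274\241')

# Hypothetical sample input: one line with the fullwidth letter, one with ASCII "A".
# -F treats the pattern as a fixed string, so no bracket expression
# (and therefore no collation lookup) is involved.
count=$(printf '\357\274\241\nA\n' | LC_ALL=C grep -c -F "$wide_a")
echo "$count"
```

Only the line containing the fullwidth letter should be counted. A real script would probably need ranges of such characters, which this sketch does not cover.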
Well that's weird. Can you tell me what your LOCALE and OS are and confirm whether or not the The number of entries I mentioned is for UniDic 2.3.0, which is still the latest version. So if you're just getting 100 it's possible something is going wrong.
@polm I just realised I have downloaded the spoken Japanese dict (as that is version 3.0.1.1). I'll download the written language one (tomorrow) and check again :)
@freebiesoft Any update on this?
Closing due to lack of activity.
@polm FYI
With With
Thanks for the report. I'll add the collate setting to the script to avoid issues like this.
Oh wait... your locale settings look right and it's still not doing what I would expect. Hm. I'll take a look at this later, but I would recommend using ggrep.
Sorry for misleading you. Mine is indeed not a locale issue.
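One way such a collate setting could look, as a sketch only: force the C locale for the single grep call so that bracket ranges are compared by byte value. `LC_ALL` is used here because it takes precedence over `LC_COLLATE` when both are set. The sample data and the variable names are hypothetical, and this may not resolve the error with every grep implementation.

```shell
# Hypothetical sample data standing in for the lexicon file.
sample=$(printf 'apple\nBanana\ncherry\n')

# In the C locale the range a-z covers only the ASCII lowercase letters,
# so "Banana" is not matched. The setting applies to this one command only.
matches=$(printf '%s\n' "$sample" | LC_ALL=C grep -c '^[a-z]')
echo "$matches"
```

Setting the variable on the command line keeps the rest of the script running under the user's own locale.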
So I'm looking into the difference between BSD and GNU grep and I'm still unsure why this is happening. Here are some questions I have:
@srctaha Can you upload the 8962 entries that BSD grep picks up as a gist or something so I can check it?
Thanks, that's helpful. The results contain punctuation, kaomoji, and alphabetic entries with more than one character, which is weird... I'll see if I can figure out what's up.
FYI, given
For at least
Thanks for the details about Macs. Since installing GNU grep isn't that hard, and since I'm still not sure exactly what's going on here, I'll just leave this alone for now, but I may come back to it later.
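For anyone adapting the script in the meantime, here is a hedged sketch of preferring GNU grep when it is installed. It assumes GNU grep is available under the name `ggrep`, which is how Homebrew's grep formula installs it by default; the `GREP` variable and the sample input are hypothetical and not part of clean-lex.sh.

```shell
# Prefer GNU grep when it is available as ggrep; otherwise fall back
# to whatever grep is first on PATH.
if command -v ggrep >/dev/null 2>&1; then
  GREP=ggrep
else
  GREP=grep
fi

# Hypothetical usage: count lines that start with an ASCII digit.
digits=$(printf '1 one\ntwo\n3 three\n' | LC_ALL=C "$GREP" -c '^[0-9]')
echo "$digits"
```

Routing every call through one variable means the choice of grep is made in a single place.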
When I run the clean-lex.sh file I get
grep: Invalid collation character
error