tr: Illegal byte sequence #36

Closed
sureshjoshi opened this issue Oct 19, 2021 · 5 comments

@sureshjoshi

On almost every invocation of ./pwd.sh w someusername 30, I receive a "tr: Illegal byte sequence" response. On the very few invocations that work (one in every 20), my random safe/filename is only 1-2 characters long.

It appears to be a macOS problem (I'm on 11.6) - and the workaround described here appears to work: https://unix.stackexchange.com/questions/45404/why-cant-tr-read-from-dev-urandom-on-osx

LC_CTYPE=C tr -dc "[:lower:]" < /dev/urandom | fold -w8 | head -n1

I don't have any locale variables set in my environment by default, and my zshrc doesn't set any either - so I think the data pulled from urandom is going haywire.

I'm also not sure if my workaround above is "the" solution, or just a workaround.

@drduh
Owner

drduh commented Oct 24, 2021

Thanks for reporting this issue.

Can you let me know if it works with LC_CTYPE=en_US.UTF-8? I recommend setting it in your zshrc for now, like https://github.com/drduh/config/blob/master/zshrc#L25-L29

@sureshjoshi
Author

Actually, it turns out that no, using that CTYPE doesn't work - which has me all kinds of surprised. I wasn't expecting it to fail. I've tried setting both LC_CTYPE and LC_ALL to that locale, and I still get the illegal byte sequence.

@mrpudn

mrpudn commented Jan 7, 2022

I am also having this issue on macos with zsh.

Here's the offending command pipeline and sample output:

$ tr -dc "foobar" < /dev/urandom | fold -w8 | head -n1
tr: Illegal byte sequence

Like @sureshjoshi, I was unable to get this line working by prefixing LC_CTYPE=en_US.UTF-8:

$ LC_CTYPE=en_US.UTF-8 tr -dc "foobar" < /dev/urandom | fold -w8 | head -n1
tr: Illegal byte sequence

LC_CTYPE=en_US.UTF8 seems to work though for some reason:

$ LC_CTYPE=en_US.UTF8 tr -dc "foobar" < /dev/urandom | fold -w8 | head -n1
orbrrrba

LC_CTYPE=C also works:

$ LC_CTYPE=C tr -dc "foobar" < /dev/urandom | fold -w8 | head -n1
ffrbrooo

As @drduh mentioned, this is probably something that should be configured in our .zshrc.

Edit: This probably explains the above:

$ LANG=en_US.utf8 locale
LANG="en_US.utf8"
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"

en_US.utf8 is probably invalid, so it just defaults to C for the locale, which works above. I can't seem to get en_US.utf-8 to work, though.
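
One way to test the "invalid name falls back to C" guess is to check which spellings the system actually lists. This is only a sketch: `locale_exists` is a hypothetical helper, not something from pwd.sh, and it compares names exactly against `locale -a`, so on systems that normalize locale spellings it may report a name as missing even though it is accepted.

```shell
# Hypothetical helper (not part of pwd.sh): succeeds only if the exact
# locale name appears in the output of `locale -a`.
locale_exists () {
  locale -a 2>/dev/null | grep -Fxq -- "$1"
}

# Compare the spellings tried in this thread.
for name in en_US.UTF-8 en_US.UTF8 en_US.utf8 C; do
  if locale_exists "${name}"; then
    echo "${name}: listed by locale -a"
  else
    echo "${name}: not listed (a fallback to C is likely)"
  fi
done
```

If a spelling is not listed, that would be consistent with the `locale` output above showing "C" for every category.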

By default, my locale looks like this (nothing in .zshrc):

$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

So I'm kind of stumped now. The only thing I can think to do is to patch this line in my copy of pwd.sh by prefixing it with LC_ALL=C. This works for me for the time being.
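
For reference, the patch described above might look like the following minimal sketch. It is not the actual pwd.sh code: `random_name` and its default length of 8 are hypothetical, and only the `LC_ALL=C` prefix comes from this thread. The likely reason it helps is that in a UTF-8 locale the macOS tr tries to decode its input as multibyte characters, and random bytes from /dev/urandom are rarely valid UTF-8, whereas the C locale treats every byte as one character.

```shell
# Hypothetical helper, not the real pwd.sh implementation.
# LC_ALL=C applies only to the tr process, so the rest of the script
# (and the user's shell) keeps its normal locale.
random_name () {
  length="${1:-8}"
  # Keep only lowercase ASCII letters from the random byte stream,
  # wrap the stream at the requested width, and take the first line.
  LC_ALL=C tr -dc "[:lower:]" < /dev/urandom | fold -w "${length}" | head -n1
}

random_name 30
```

Setting the variable on the command itself, rather than exporting it in .zshrc, keeps the fix inside the script and does not depend on each user's shell configuration.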

@drduh
Owner

drduh commented Aug 21, 2022

We can make the same fix as drduh/Purse#3

@drduh
Owner

drduh commented Dec 26, 2022

Fix pushed, thanks for reporting and investigating!

@drduh drduh closed this as completed Dec 26, 2022