Broken encoding of Unicode updates from the command line #12

Open
astanin opened this Issue · 14 comments

5 participants

@astanin

twidge 1.0.2 prints Unicode tweets correctly, but corrupts them when an update is sent as a command-line argument:

$ twidge update 'unicode message'

An example of a corrupted update: http://twitter.com/jetxee/status/15322966107
It should have read: "... проверка twidge 1.0.2"

Sending an update via stdin works correctly:

$ twidge update
unicode message^D
@astanin

I don't know how to find the current locale in Haskell and decode the output of getArgs accordingly. GHC itself should probably handle this.

Probably related GHC bugs:

http://hackage.haskell.org/trac/ghc/ticket/3307
http://hackage.haskell.org/trac/ghc/ticket/3309
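
One way to picture the corruption, offered as a sketch rather than a diagnosis: as far as I know, GHC releases of that era returned each byte of a command-line argument as its own Char from getArgs, without decoding it. If the program then UTF-8-encodes that String before sending it (an assumption about twidge, not something confirmed in this thread), every non-ASCII byte gets encoded a second time. The helper `encodeUtf8` below is hypothetical and uses base only.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- | Encode one code point as UTF-8. Each resulting byte is stored in
-- its own Char (0..255), which is how a byte string looks inside a String.
encodeChar :: Char -> String
encodeChar c
  | n < 0x80    = [chr n]
  | n < 0x800   = map chr [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = map chr [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = map chr [ 0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12)
                          , cont (n `shiftR` 6), cont n ]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)

-- | Hypothetical helper: UTF-8-encode a whole String.
encodeUtf8 :: String -> String
encodeUtf8 = concatMap encodeChar

main :: IO ()
main = do
  let intended = "проверка"            -- 8 code points
      rawArg   = encodeUtf8 intended   -- what an undecoded getArgs would hand over
      sent     = encodeUtf8 rawArg     -- encoding the bytes again: double-encoded
  print (length intended, length rawArg, length sent)  -- prints (8,16,32)
```

The doubling from 16 to 32 bytes is the mojibake: the server receives valid UTF-8, but for the wrong characters.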

@astanin

I use this patch as a private workaround: http://gist.github.com/423901

@jgoerzen
Owner

Is there a way to make that work with utf8-string 0.3.4? It's currently being shipped in Debian, for instance, and I'd like to be able to be compatible with it if I can.

@astanin

0.3.4 doesn't have isUTF8Encoded. We can either skip the UTF-8 check and decode anyway (likely to break for those with other locales), or copy isUTF8Encoded from 0.3.5 into twidge under a different name.

isUTF8Encoded: http://hackage.haskell.org/packages/archive/utf8-string/0.3.6/doc/html/src/Codec-Binary-UTF8-String.html#isUTF8Encoded
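
A third possibility, sketched here under assumptions: do the check and the decoding in one strict pass written against base only, so the utf8-string version no longer matters. The names `decodeUtf8Bytes` and `decodeArgIfUtf8` are hypothetical and are not part of utf8-string or twidge. The decoder is simplified and does not reject every overlong form or surrogate.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Maybe (fromMaybe)

-- | Strictly decode a String that holds one UTF-8 byte per Char.
-- Returns Nothing on malformed input, including any Char above 0xFF.
decodeUtf8Bytes :: String -> Maybe String
decodeUtf8Bytes [] = Just []
decodeUtf8Bytes (c:cs)
  | b < 0x80              = fmap (c :) (decodeUtf8Bytes cs)
  | b >= 0xC2 && b < 0xE0 = multi 1 (b .&. 0x1F)  -- 2-byte sequence
  | b >= 0xE0 && b < 0xF0 = multi 2 (b .&. 0x0F)  -- 3-byte sequence
  | b >= 0xF0 && b < 0xF5 = multi 3 (b .&. 0x07)  -- 4-byte sequence
  | otherwise             = Nothing
  where
    b = ord c
    isCont x = ord x >= 0x80 && ord x < 0xC0
    -- Consume k continuation bytes and fold their payload bits together.
    multi k lead =
      let (ks, rest) = splitAt k cs
          n = foldl (\acc x -> (acc `shiftL` 6) .|. (ord x .&. 0x3F)) lead ks
      in if length ks == k && all isCont ks && n <= 0x10FFFF
           then fmap (chr n :) (decodeUtf8Bytes rest)
           else Nothing

-- | Decode only when the argument is valid UTF-8; otherwise leave it alone.
decodeArgIfUtf8 :: String -> String
decodeArgIfUtf8 s = fromMaybe s (decodeUtf8Bytes s)

main :: IO ()
main = do
  let utf8Bytes   = map chr [0xD0, 0xBF]  -- "п" as two raw UTF-8 bytes
      latin1Bytes = map chr [0xF1]        -- "ñ" as a single ISO-8859-15 byte
  print (map ord (decodeArgIfUtf8 utf8Bytes))    -- prints [1087]
  print (map ord (decodeArgIfUtf8 latin1Bytes))  -- prints [241]
```

The caveat is the same as for any such heuristic: an argument that was already decoded correctly and whose characters all fall below 0x100 could, in rare cases, still pass as valid UTF-8 and be decoded a second time.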

@murrayf

... Spanish "tildes" (á é í ó ú) are not shown correctly on twitter.com when sent with the twidge update command.

@murrayf

... the Spanish letter eñe (ñ) is not supported either. The ISO charset covering all these symbols is ISO-8859-15.
Hope it helps!

@jgoerzen
Owner

What version of twidge are you using, murrayf? Are you piping the data to twidge on stdin or giving it on the command line?

@murrayf

Hello again John,
I'm using version 1.0.2 from the Ubuntu Maverick deb package. These errors come from the update command.

@jgoerzen
Owner

One potential problem is that your system locale is something other than UTF-8. Twitter and twidge are both designed to operate with UTF-8 only.

Can you check on that?

@murrayf

... this is the output of the locale command on my system:
LANG=es_ES.utf8
LC_CTYPE="es_ES.utf8"
LC_NUMERIC="es_ES.utf8"
LC_TIME="es_ES.utf8"
LC_COLLATE="es_ES.utf8"
LC_MONETARY="es_ES.utf8"
LC_MESSAGES="es_ES.utf8"
LC_PAPER="es_ES.utf8"
LC_NAME="es_ES.utf8"
LC_ADDRESS="es_ES.utf8"
LC_TELEPHONE="es_ES.utf8"
LC_MEASUREMENT="es_ES.utf8"
LC_IDENTIFICATION="es_ES.utf8"
LC_ALL=
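
The same check can be made from inside a Haskell program, which also shows what getArgs actually delivers. This is a minimal sketch, assuming a GHC whose base exports `localeEncoding` from System.IO; `isUtf8Name` is a hypothetical helper, not a library function.

```haskell
import Data.Char (toUpper)
import Data.List (isPrefixOf)
import System.Environment (getArgs)
import System.IO (localeEncoding)

-- | Hypothetical helper: does an encoding name look like UTF-8?
-- Accepts spellings such as "UTF-8" and "utf8".
isUtf8Name :: String -> Bool
isUtf8Name name = "UTF8" `isPrefixOf` map toUpper (filter (/= '-') name)

main :: IO ()
main = do
  -- The encoding the runtime derived from the locale at startup.
  let name = show localeEncoding
  putStrLn ("locale encoding: " ++ name)
  putStrLn ("looks like UTF-8: " ++ show (isUtf8Name name))
  -- Print each argument as a list of code points, so raw bytes
  -- and properly decoded characters can be told apart.
  args <- getArgs
  mapM_ (print . map fromEnum) args
```

Run with the argument ñ, a runtime that decodes arguments should print [241], while one that passes raw UTF-8 bytes through would print [195,177].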

@jgoerzen
Owner

OK. And are you providing the update as a command-line parameter or on stdin?

@murrayf

As a command-line parameter. I'm not sure, but as you can see in the message above, LC_ALL= is empty by default; I don't know whether it has to be that way.

@AlexeyPrishchepo

It doesn't post updates with UTF-8 symbols for me either.

@tatxo

A workaround that works for me is to echo the text and pipe it to twidge (my locale is UTF-8):
$ echo "Algo en español" | twidge update
