mimic GNU Gettext with encoding conversions #41

Merged: 1 commit merged into ruby-gettext:master on Sep 22, 2015

375gnu commented Sep 21, 2015

Hi, Debian users have found one more issue: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=799194.

Hello!
As discussed on bug #799050 [1] and on debian-ruby@l.d.o [2], I've
found a weird behavior of ruby-gettext.

[1] https://bugs.debian.org/799050#22
[2] https://lists.debian.org/debian-ruby/2015/09/msg00042.html

Steps to reproduce (with apt-listbugs/0.1.17 installed)

  $ locale
  LANG=en_US.UTF-8
  LANGUAGE=en_US:en
  LC_CTYPE="en_US.UTF-8"
  LC_NUMERIC="en_US.UTF-8"
  LC_TIME="en_US.UTF-8"
  LC_COLLATE="en_US.UTF-8"
  LC_MONETARY="en_US.UTF-8"
  LC_MESSAGES="en_US.UTF-8"
  LC_PAPER="en_US.UTF-8"
  LC_NAME="en_US.UTF-8"
  LC_ADDRESS="en_US.UTF-8"
  LC_TELEPHONE="en_US.UTF-8"
  LC_MEASUREMENT="en_US.UTF-8"
  LC_IDENTIFICATION="en_US.UTF-8"
  LC_ALL=
  $ ruby -e 'require "gettext" ; GetText::bindtextdomain("apt-listbugs") ; puts GetText.gettext("Forwarded")'
  Forwarded
  $ LANGUAGE='fr' ruby -e 'require "gettext" ; GetText::bindtextdomain("apt-listbugs") ; puts GetText.gettext("Forwarded")'
  Transférés
  $ LANGUAGE='fr' ruby -e 'require "gettext" ; GetText::bindtextdomain("apt-listbugs") ; puts GetText.gettext("Forwarded").encoding'
  UTF-8

Everything's fine so far.

  $ LANGUAGE='fr' LC_CTYPE='C' ruby -e 'require "gettext" ; GetText::bindtextdomain("apt-listbugs") ; puts GetText.gettext("Forwarded").encoding'
  ASCII-8BIT

Here the encoding is wrong and the content of the returned string
includes rubbish characters:

  $ LANGUAGE='fr' LC_CTYPE='C' irb
  irb(main):001:0> require "gettext"
  => true
  irb(main):002:0> GetText::bindtextdomain("apt-listbugs")
  [...]
  irb(main):003:0> GetText.gettext("Forwarded")
  => "Transf\xC3\xA9r\xC3\xA9s"
  irb(main):004:0> exit

The awkward finding is that, if I print the string, it gets magically
converted back to UTF-8:

  $ LANGUAGE='fr' LC_CTYPE='C' ruby -e 'require "gettext" ; GetText::bindtextdomain("apt-listbugs") ; puts GetText.gettext("Forwarded")'
  Transférés

But I cannot compute the width of the string with ruby-unicode:

  $ LANGUAGE='fr' LC_CTYPE='C' ruby -e 'require "gettext" ; GetText::bindtextdomain("apt-listbugs") ; require "unicode" ; puts Unicode.width(GetText.gettext("Forwarded"))'
  -e:1:in `width': "\xC3" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)
          from -e:1:in `<main>'

What's wrong?
Please investigate and/or forward my bug report upstream.

Thanks for your time!
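
The symptom in the report can be reproduced without gettext or apt-listbugs. The sketch below assumes the string returned under LC_CTYPE='C' holds the UTF-8 bytes of the translation but is tagged ASCII-8BIT, as the irb session above suggests. The helper name `transcodable_to_utf8?` is made up for this illustration. It also hints at why `puts` looks fine: the bytes are presumably written out unchanged and a UTF-8 terminal renders them correctly, while an explicit conversion to UTF-8 fails.

```ruby
# Hypothetical helper (not part of ruby-gettext): can the string be
# transcoded to UTF-8 using the encoding it is currently tagged with?
def transcodable_to_utf8?(str)
  str.encode(Encoding::UTF_8)
  true
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  false
end

# UTF-8 bytes of the French translation, tagged as binary (ASCII-8BIT).
raw = "Transf\xC3\xA9r\xC3\xA9s".b

puts raw.encoding == Encoding::ASCII_8BIT  # true
puts transcodable_to_utf8?(raw)            # false: "\xC3" has no mapping from ASCII-8BIT
puts raw.length                            # 12 (counted in bytes)

# Relabelling the same bytes as UTF-8 (no transcoding) gives a valid string.
relabeled = raw.dup.force_encoding(Encoding::UTF_8)
puts relabeled.valid_encoding?             # true
puts relabeled.length                      # 10 (counted in characters)
```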

375gnu commented Sep 21, 2015

With my change ruby-gettext now behaves like GNU Gettext in such cases.
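
For readers who have not looked at the diff, here is a rough sketch of the kind of conversion being discussed. It is not the code from this PR. `to_output_charset` is a hypothetical helper, and it assumes the output charset under LC_CTYPE='C' is US-ASCII. The idea is to tag the catalog bytes with the charset the catalog declares, then transcode to the output charset, replacing characters that cannot be represented rather than returning a binary string.

```ruby
# Hypothetical helper, not ruby-gettext's API.
# translation:     raw bytes read from the message catalog
# catalog_charset: charset declared by the catalog, e.g. "UTF-8"
# output_charset:  charset wanted by the caller's locale, e.g. "US-ASCII"
def to_output_charset(translation, catalog_charset, output_charset)
  source = Encoding.find(catalog_charset)
  target = Encoding.find(output_charset)

  # Tag a copy of the bytes with the catalog's charset (no transcoding yet).
  labeled = translation.dup.force_encoding(source)
  return labeled if source == target

  # Transcode, substituting a replacement for anything the target lacks.
  labeled.encode(target, invalid: :replace, undef: :replace)
end

raw = "Transf\xC3\xA9r\xC3\xA9s".b

same = to_output_charset(raw, "UTF-8", "UTF-8")
puts same.encoding    # UTF-8
puts same.length      # 10

ascii = to_output_charset(raw, "UTF-8", "US-ASCII")
puts ascii.encoding   # US-ASCII
puts ascii            # Transf?r?s
```

With a result like this, the returned string always carries a usable encoding, so a call such as Unicode.width would no longer hit the ASCII-8BIT conversion error from the report.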

kou commented Sep 21, 2015

Thanks for your report.
Should we always use "?" as replacement character?
How about using the default behavior? (We can use the default behavior by removing :replace => '?'.)

http://ruby-doc.org/core-2.2.0/String.html#method-i-encode

:replace

Sets the replacement string to the given value. The default replacement string is “\uFFFD” for Unicode encoding forms, and “?” otherwise.
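
To make the difference concrete, here is a small sketch using plain String#encode, independent of ruby-gettext. It shows that leaving out :replace gives "?" for a non-Unicode target such as US-ASCII and U+FFFD for a Unicode target, while an explicit :replace => '?' forces "?" in both cases. The sample strings are illustrative only.

```ruby
utf8 = "Transf\u00E9r\u00E9s"   # "Transférés" as a UTF-8 string

# Non-Unicode target: the default replacement is already "?".
default_ascii  = utf8.encode("US-ASCII", undef: :replace)
explicit_ascii = utf8.encode("US-ASCII", undef: :replace, replace: "?")
puts default_ascii                     # Transf?r?s
puts default_ascii == explicit_ascii   # true

# Unicode target: the default replacement is U+FFFD instead.
# Bytes >= 0x80 in a binary string have no defined mapping to UTF-8.
binary = "\xC3\xA9".b
default_utf8  = binary.encode("UTF-8", undef: :replace)
explicit_utf8 = binary.encode("UTF-8", undef: :replace, replace: "?")
puts default_utf8 == "\uFFFD\uFFFD"    # true
puts explicit_utf8                     # ??
```

So for an ASCII output charset both variants give the same result, and the default only differs when the target is a Unicode encoding.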

375gnu commented Sep 21, 2015

Sounds reasonable. I've pushed an updated PR.

kou added a commit that referenced this pull request Sep 22, 2015
mimic GNU Gettext with encoding conversions

Patch by Hleb Valoshka. Thanks!!!
kou merged commit ee60ccf into ruby-gettext:master Sep 22, 2015

kou commented Sep 22, 2015

Thanks!
I've merged and released a new version!
