The Umlaut problem: UTF-8 chars get converted to unicode entities #20

Closed
Soleone opened this issue Jun 16, 2009 · 10 comments

@Soleone

Soleone commented Jun 16, 2009

When using a UTF-8 character (like the German umlaut ü) in the Rakefile, it seems to get converted to two (!) escape sequences.

For example in Rakefile it reads:
gem.authors = ["Markus Hüberman"]

And in the resulting gemspec file after releasing it gets converted to:
s.authors = ["Markus H\303\274berman"]

@technicalpickles
Owner

I'm not super familiar with UTF-8 in general, so I'm not sure I can fix this myself.

I've found some notes about doing a File.open to write as utf-8, but it seems to be Ruby 1.9 specific.
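
For reference, the Ruby 1.9 approach mentioned here could look roughly like the sketch below. It is only an illustration, not Jeweler's actual code: the file name `example.gemspec` and the author string are made up, and the `"w:UTF-8"` mode string is only understood by Ruby 1.9 and later.

```ruby
require "tmpdir"

# Hypothetical author name; "\u00FC" is the escape for ü.
author = "Markus H\u00FCberman"
contents = nil

Dir.mktmpdir do |dir|
  # Hypothetical file name, used only for this illustration.
  path = File.join(dir, "example.gemspec")

  # "w:UTF-8" opens the file for writing with UTF-8 as its external encoding.
  File.open(path, "w:UTF-8") do |file|
    file.puts %(s.authors = ["#{author}"])
  end

  # Read the file back, telling Ruby that its bytes are UTF-8.
  contents = File.read(path, :encoding => "UTF-8")
end

puts contents
```

Since Ruby 1.8 has no notion of per-file encodings, this would not help there, which matches the concern above.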

@chastell

Hm, I’m using Ruby 1.9 and UTF-8 in my summary field and both the gemspec file and the resulting gem’s metadata contain the proper UTF-8 strings – i.e., this bug does not manifest itself in my case.

Markus: I tried your name in my gem and it worked. Did you specify the right Ruby file encoding (my Rakefile’s first line is ‘# encoding: UTF-8’)?

@chastell

BTW: \303\274 is the correct byte sequence for ü in UTF-8, written as octal escapes:

>> "Markus Hüberman" == "Markus H\303\274berman"
=> true
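
The equality above can be checked byte by byte. A small sketch (the variable names are only for illustration) that prints the two bytes UTF-8 uses for ü in decimal and in octal:

```ruby
# "\u00FC" is ü; writing it as an escape keeps the example independent of
# the encoding of the file it is pasted into.
umlaut = "\u00FC"

bytes = umlaut.bytes.to_a                   # the two bytes UTF-8 uses for ü
octal = bytes.map { |byte| byte.to_s(8) }   # the same bytes written in base 8

puts bytes.inspect   # [195, 188]
puts octal.inspect   # ["303", "274"]

# The escaped spelling and the \u spelling produce exactly the same bytes.
same_bytes = "Markus H\303\274berman".bytes.to_a == "Markus H\u00FCberman".bytes.to_a
puts same_bytes      # true
```

So the escaped gemspec is not corrupt; it is the same string written in a less readable form.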

@SFEley

SFEley commented Jan 25, 2010

It would make sense for the generated Rakefile to add # encoding: UTF-8 as its first line. This is harmless in Ruby 1.8 but very important in Ruby 1.9 for consistency.

@danopia

danopia commented Feb 10, 2010

It probably writes like that under 1.8 because it looks like it uses .inspect to dump the string back out. 1.8 isn't aware of UTF-8; it just sees special chars and escapes them. 1.9 is more intelligent about UTF-8, so it'll output the string as expected.
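
The effect can be approximated on a modern Ruby. This is only a sketch of the idea, not the code path Jeweler or RubyGems actually takes: tagging the string as binary with `String#b` (Ruby 2.0+) makes `inspect` treat it as raw bytes, much as 1.8 did. Modern Ruby prints hex escapes where 1.8 printed octal ones.

```ruby
name = "H\u00FCberman"   # UTF-8 string; "\u00FC" is ü

# String#b returns a copy tagged as binary, so Ruby no longer knows that
# the two bytes C3 BC together form one UTF-8 character.
raw = name.b

puts raw.inspect   # "H\xC3\xBCberman"
```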

@ghost

ghost commented Apr 20, 2010

To fix this, add the following line (including the hash) at the top of each generated gemspec file:
# encoding: utf-8
That should resolve the issue.
I could be wrong, but this works for me locally.

@ghost

ghost commented Apr 20, 2010

Sorry, just realised SFEley also wrote this. +1 on this working :-)

@amatsuda
Contributor

Just sent a pull request that adds the magic comment at the top of the Rakefile. #165

The original problem of escaped UTF-8 characters still remains, though. I guess this is a RubyGems problem.

@technicalpickles
Owner

I've merged in amatsuda's change. It adds # encoding: utf-8 to the Rakefile of new projects.

Between that and generated gemspecs including # encoding: utf-8, things should be good. Unfortunately, it's not working on 1.8, and existing projects would need to add that line to their own Rakefile.
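
To see what a generated gemspec starts with, the source can be produced in memory with `Gem::Specification#to_ruby`. This is a sketch with made-up gem details. The exact output depends on the installed RubyGems version, so none is shown here; on the versions I have looked at, the first line is an encoding magic comment.

```ruby
require "rubygems"

# All values here are made up for illustration.
spec = Gem::Specification.new do |s|
  s.name    = "example"
  s.version = "0.1.0"
  s.summary = "Illustration only"
  s.authors = ["Markus H\u00FCberman"]   # "\u00FC" is ü
end

# The Ruby source that would be written to a .gemspec file.
source = spec.to_ruby

first_line   = source.lines.first
authors_line = source.lines.grep(/s\.authors/).first

puts first_line     # expected to be the encoding magic comment
puts authors_line   # shows how this RubyGems version serializes the name
```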

@gmallard

You might be able to cure the 1.8 difficulties with:

$KCODE = "U"

Just a thought.
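
A guarded version of that suggestion, as a sketch only: `$KCODE` affects Ruby 1.8 alone, so the assignment is skipped elsewhere. The `legacy_ruby` name is made up for this example, and whether this changes what ends up in the gemspec is untested here.

```ruby
# $KCODE only has an effect on Ruby 1.8; later versions ignore it (and may
# warn), so the assignment is guarded by a version check.
legacy_ruby = RUBY_VERSION < "1.9"
$KCODE = "U" if legacy_ruby

# On 1.8 with $KCODE set to "U", String#inspect is expected to leave UTF-8
# characters unescaped. On newer Rubies $KCODE plays no part in this line.
puts "Markus H\303\274berman".inspect
```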

erithmetic pushed a commit to erithmetic/bueller that referenced this issue Feb 6, 2012
webmat pushed a commit to webmat/jeweler that referenced this issue Mar 10, 2012
This issue was closed.