Ruby 1.9 character encoding changes #188

Closed
blakesmith opened this Issue · 15 comments
@blakesmith

With Ruby 1.8, incorrectly encoded UTF-8 characters are silently ignored. If you have a post with incorrectly encoded UTF-8 characters in the content body, they will show up in your rendered page as question marks (unknown characters).

A user upgrading from Ruby 1.8 to Ruby 1.9 whose site seemed to be working fine would get a confusing error when trying to render the site (assuming it had incorrectly encoded UTF-8 characters):

/Users/blake/projects/jekyll/lib/jekyll/convertible.rb:26:in `read_yaml': invalid byte sequence in UTF-8 (ArgumentError)
        from /Users/blake/projects/jekyll/lib/jekyll/post.rb:39:in `initialize'
        from /Users/blake/projects/jekyll/lib/jekyll/site.rb:110:in `new'
        from /Users/blake/projects/jekyll/lib/jekyll/site.rb:110:in `block in read_posts'
        from /Users/blake/projects/jekyll/lib/jekyll/site.rb:108:in `each'
        from /Users/blake/projects/jekyll/lib/jekyll/site.rb:108:in `read_posts'
        from /Users/blake/projects/jekyll/lib/jekyll/site.rb:169:in `read_directories'
        from /Users/blake/projects/jekyll/lib/jekyll/site.rb:79:in `read'
        from /Users/blake/projects/jekyll/lib/jekyll/site.rb:71:in `process'
        from ../jekyll/bin/jekyll:150:in `'

This doesn't really help the user fix the problem post. This commit will at least display the problem post so that the user knows what needs to be fixed for the site to render successfully.

This is mainly an issue of how Ruby decides to handle String encodings by default. You can read more about it here: http://blog.grayproductions.net/articles/ruby_19s_string
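To illustrate the idea of pointing the user at the offending file, here is a minimal sketch. It is not the code from the commit; `invalid_utf8_files` is a hypothetical helper name, and it assumes posts are meant to be UTF-8.

```ruby
# Hypothetical helper (not part of Jekyll): returns the paths whose
# contents are not valid UTF-8, so the user knows which files to fix.
def invalid_utf8_files(paths)
  paths.reject do |path|
    # Read the raw bytes, label them as UTF-8 without transcoding,
    # and ask Ruby whether the byte sequence is valid in that encoding.
    File.binread(path).force_encoding(Encoding::UTF_8).valid_encoding?
  end
end

# Example usage, assuming posts live in _posts/:
#   invalid_utf8_files(Dir.glob("_posts/*")).each { |p| warn "Bad encoding: #{p}" }
```

Checking with `valid_encoding?` up front avoids the bare ArgumentError and lets the error message name the file.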

@lmmendes

In my case I was getting the following error:

/usr/local/rvm/gems/ruby-1.9.1-p378/gems/jekyll-0.7.0/lib/jekyll/convertible.rb:26:in `read_yaml': invalid byte sequence in US-ASCII (ArgumentError)
    from /usr/local/rvm/gems/ruby-1.9.1-p378/gems/jekyll-0.7.0/lib/jekyll/page.rb:24:in `initialize'
    from /usr/local/rvm/gems/ruby-1.9.1-p378/gems/jekyll-0.7.0/lib/jekyll/site.rb:185:in `new'
    from /usr/local/rvm/gems/ruby-1.9.1-p378/gems/jekyll-0.7.0/lib/jekyll/site.rb:185:in `block in read_directories'
    from /usr/local/rvm/gems/ruby-1.9.1-p378/gems/jekyll-0.7.0/lib/jekyll/site.rb:175:in `each'

I solved the problem by declaring the following locale in my shell:

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
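The locale matters because, on Unix-like systems, Ruby 1.9 derives `Encoding.default_external` from it, and that default is applied to files read without an explicit encoding. As a rough sketch of an alternative that does not depend on the shell environment, a file can be read with the encoding stated explicitly. `read_utf8` is a hypothetical helper name, not something Jekyll provides.

```ruby
# Show what Ruby picked up from the environment (for example US-ASCII
# under a C/POSIX locale, UTF-8 under en_US.UTF-8).
puts Encoding.default_external

# Hypothetical helper: read a file as UTF-8 regardless of the locale.
def read_utf8(path)
  File.read(path, :encoding => "UTF-8")
end
```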

@tatey

Just got bitten by this after recently switching to 1.9 as my default Ruby. Thanks for the patch.

@lloydh

I think I'm running into this problem, but only when running the jekyll command via SSH, not if I run jekyll directly on the host machine. Jekyll also runs without errors on the client machine — it's only over SSH that I encounter this problem:

/usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/lib/jekyll/convertible.rb:26:in `read_yaml': invalid byte sequence in US-ASCII (ArgumentError)
    from /usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/lib/jekyll/post.rb:39:in `initialize'
    from /usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/lib/jekyll/site.rb:119:in `new'
    from /usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/lib/jekyll/site.rb:119:in `block in read_posts'
    from /usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/lib/jekyll/site.rb:117:in `each'
    from /usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/lib/jekyll/site.rb:117:in `read_posts'
    from /usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/lib/jekyll/site.rb:211:in `read_directories'
    from /usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/lib/jekyll/site.rb:88:in `read'
    from /usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/lib/jekyll/site.rb:79:in `process'
    from /usr/local/lib/ruby/gems/1.9.1/gems/jekyll-0.10.0/bin/jekyll:164:in `<top (required)>'
    from /usr/local/bin/jekyll:19:in `load'
    from /usr/local/bin/jekyll:19:in `<main>'

I haven't tried lmmendes' fix yet (sorry, how/where do I declare those locales, and just on the host machine, or both?) but does anybody have any ideas why SSH is creating these problems?

Thanks.

@Kwpolska

Put these two lines in your .bashrc:

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8

@lloydh

Thanks Kwpolska.

I ended up having to put those lines in my .profile, but they did the trick.

@dengwh

I just got a similar error.
My environment is Windows XP with Ruby 1.9.2.
Any recommendations for Windows?

Thanks.

@fhemberger

@dengwh, for Windows set the same environment variables. In your cmd.exe, type

set LC_ALL=en_US.UTF-8
set LANG=en_US.UTF-8

@stereobooster

@dengwh, for Windows you can use

chcp 65001  

seems connected to #117

@sdsalyer

I'm trying to get a post-receive hook to work on Arch Linux with Ruby 1.9 and I'm getting this ASCII error. I've tried adding the UTF-8 settings to my .profile, but I'm still getting the error. I assume the git hook doesn't use my .profile, though. Any further suggestions?

EDIT: I just applied the patch to this file and it works fine now. Duh... and thank you!

@stereobooster

connected to #226, #201

@ehtb

This fix worked for me, whereas the others didn't: http://stackoverflow.com/a/8274677/1303499

@svnpenn

I had a text file with a ü, but had accidentally saved it with ANSI encoding. Changing the encoding to UTF-8 fixed it for me. @stereobooster's patch would be very helpful though.
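For anyone who would rather convert such a file in Ruby than in an editor, a minimal sketch follows. It assumes "ANSI" here means Windows-1252, which is common on Western-language Windows setups but is only an assumption; `ansi_to_utf8` is a hypothetical helper name.

```ruby
# Hypothetical helper: reinterpret raw bytes as Windows-1252 (an assumed
# meaning of "ANSI") and transcode them to UTF-8.
def ansi_to_utf8(bytes)
  # dup so the caller's string is not relabelled in place
  bytes.dup.force_encoding("Windows-1252").encode("UTF-8")
end

# Example usage, with a hypothetical file name:
#   File.binwrite("post.md", ansi_to_utf8(File.binread("post.md")))
```

If the file was saved in a different legacy code page, the source encoding name would need to change accordingly.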

@kevinSuttle

Still getting errors but it just started out of nowhere:

/Users/kevinsuttle/.rbenv/versions/1.9.3-p194/lib/ruby/gems/1.9.1/gems/jekyll-0.11.2/lib/jekyll/convertible.rb:29:in `read_yaml': invalid byte sequence in UTF-8 (ArgumentError)

This isn't new by the way. See issues 117, 188, 493, 135.

@crazymaster crazymaster referenced this issue in moto-net/moto-net.github.com
Closed

Cannot generate pages with jekyll on Windows #5

@parkr
Owner

Merged in #718.

@parkr parkr closed this
@heidsoft

Liquid Exception: invalid byte sequence in UTF-8 in index.html
