Error about invalid byte sequence in read_yaml #144

Closed
chartjes opened this issue Sep 10, 2011 · 28 comments

Comments

@chartjes

When running 'rake generate' I am getting the following error message:

/Users/chartjes/.rvm/gems/ruby-1.9.2-p290/gems/jekyll-0.11.0/lib/jekyll/convertible.rb:29:in `read_yaml': invalid byte sequence in US-ASCII (ArgumentError)

How can I go about fixing this problem? I have 427 posts in markdown and having gone through all of them I cannot find the offending character.
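
One way to narrow down which posts contain the offending bytes is to scan each file for lines that are not pure ASCII, since the error is raised when Ruby treats the file as US-ASCII. The following is a minimal sketch only; the helper name `non_ascii_lines` and the `source/_posts` path are assumptions for illustration, not part of Jekyll or Octopress.

```ruby
# Hypothetical helper: returns the 1-based numbers of the lines in a file
# that contain at least one byte outside the 7-bit ASCII range.
def non_ascii_lines(path)
  hits = []
  # Read raw bytes so Ruby does not try to interpret the encoding.
  File.binread(path).each_line.with_index(1) do |line, number|
    hits << number unless line.ascii_only?
  end
  hits
end

# Assumed posts directory; adjust to wherever the .markdown files live.
Dir.glob("source/_posts/*.markdown").sort.each do |path|
  lines = non_ascii_lines(path)
  puts "#{path}: lines #{lines.join(', ')}" unless lines.empty?
end
```

A non-ASCII line is not necessarily invalid; it only marks where Ruby would stumble while reading the file as US-ASCII.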

@imathis
Owner

imathis commented Sep 10, 2011

Could this be a utf8 issue?

http://stackoverflow.com/questions/3140111/jekyll-does-not-parse-utf-8

@chartjes
Author

It could very well be. I'm using Vim on OS X and I'd bet it's set to encode everything as UTF-8. So how would I go about re-encoding all those files?

Chris Hartjes
http://www.littlehart.net/atthekeyboard
@chartjes

@imathis
Owner

imathis commented Sep 10, 2011

I use MacVim on OS X too, so that's not the issue. Didn't you convert from WordPress? Search for 'ASCII' on http://zanshin.net/2011/08/11/switching-to-octopress/; that might help you.

@chartjes
Author

Yes, I was converting from WordPress. That post doesn't offer any insight into how to actually re-encode the files so that Jekyll / Octopress will be happy

@imathis
Owner

imathis commented Sep 10, 2011

:( I was hoping it'd set you on a course. I'll look further.

@imathis
Owner

imathis commented Sep 10, 2011

Here's the script he used: https://gist.github.com/1133266 note the 'encode' bit on line 44. Hope this helps.

@chartjes
Author

That would be cool if my stuff was exported as XML…I had to use the Jekyll migration script that pulled stuff in from the database.

@chartjes
Author

Using the file command that is suggested in that blog post tells me I have "HTML document text" and "ASCII English Text" documents in my .markdown files

@imathis
Owner

imathis commented Sep 10, 2011

Your avatar is intimidating me into helping you figure this out, but I don't know what to suggest :/

@zanshin
Contributor

zanshin commented Sep 11, 2011

Why not just create a small ruby script to read each .markdown file that was output from your Jekyll conversion, do the encode bit from my script to clean things up, and re-save the file?
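
A rough sketch of what such a clean-up script could look like is given here. It is not the gist's code: it uses `String#scrub` (Ruby 2.1 or later) to replace invalid UTF-8 bytes, and the helper names `clean_utf8` and `reencode_markdown` as well as the `"?"` replacement character are assumptions. Back up the posts before running anything like it.

```ruby
# Hypothetical helper: returns the given bytes as valid UTF-8 text,
# replacing any invalid byte sequence with the replacement string.
def clean_utf8(raw, replacement = "?")
  text = raw.dup.force_encoding(Encoding::UTF_8)
  text.valid_encoding? ? text : text.scrub(replacement)
end

# Hypothetical helper: rewrites every .markdown file under dir whose bytes
# are not valid UTF-8 and returns the paths of the files it changed.
def reencode_markdown(dir)
  changed = []
  Dir.glob(File.join(dir, "**", "*.markdown")).sort.each do |path|
    raw = File.binread(path)
    cleaned = clean_utf8(raw)
    next if cleaned.bytes == raw.bytes   # already valid, leave untouched
    File.binwrite(path, cleaned)         # write the cleaned bytes back
    changed << path
  end
  changed
end
```

Calling `reencode_markdown("source/_posts")` (an assumed path) would rewrite only the files that needed fixing and return their names, which makes it easy to review what was touched.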

@imathis
Owner

imathis commented Sep 17, 2011

@chartjes I think you solved this right? Closing, but you're welcome to reopen.

@imathis closed this as completed Sep 17, 2011

@chartjes
Author

Yeah, the problem was solved by reinstalling Octopress and following the directions about RVM.

@imathis
Owner

imathis commented Sep 17, 2011

Great, thanks.

@pfleidi

pfleidi commented Jan 24, 2012

Had the same problem. For me, exporting the right locale setting in my shell environment fixed that:

 export LC_CTYPE=en_US.UTF-8
 export LANG=en_US.UTF-8

You can validate your settings by running the locale command in your current shell session.
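
The same check can be made from Ruby, which also shows the encoding Ruby derived from the environment. This is a sketch under assumptions: `locale_is_utf8?` is a hypothetical helper, and it follows the usual POSIX precedence of LC_ALL, then LC_CTYPE, then LANG.

```ruby
# Hypothetical helper: reports whether the locale variables select UTF-8.
# The first of LC_ALL, LC_CTYPE and LANG that is set and non-empty wins.
def locale_is_utf8?(env = ENV)
  value = %w[LC_ALL LC_CTYPE LANG].map { |key| env[key] }
                                  .find { |v| v && !v.empty? }
  !value.nil? && value.match?(/utf-?8/i)
end

puts "Locale selects UTF-8: #{locale_is_utf8?}"
# Ruby picks its default external encoding from the locale at startup.
puts "Ruby default external encoding: #{Encoding.default_external}"
```

If the second line still reports US-ASCII after the exports, the variables are probably not reaching the shell that runs rake.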

@aetherwu

@pfleidi is right. thank you!

@lscott3

lscott3 commented Jul 17, 2012

@pfleidi Thanks! This fixed my issue as well!

@andrewmichaelsmith

@pfleidi Thanks, that did the job

@PHironaka

Should I write blog posts in .textile to avoid UTF-8 issues?

@dearprakash

@PHironaka That's not necessary. If you encounter this issue, just use the tip from @pfleidi.

I use Coda2 and I had issues even when I set the encoding in the editor. The shell is the right place to fix it.

@imathis
Owner

imathis commented Aug 18, 2012

@dearprakash I've seen times where Coda2 has issues with UTF-8. Recently I helped someone over Twitter who found that his encoding problems were caused by Coda2 not respecting the setting and outputting mixed content.

@PHironaka if I were you, I'd use Markdown. My experience with Textile hasn't been too good. It's really hard to extend compared to Markdown and features like backtick code blocks have been a real pain to get working in Textile. Also there's not a lot of community support for Textile as Markdown seems to be what everyone is developing around, building Markdown support in lots of writing applications (and GitHub's comment system). I think Textile is on the way out.

@PHironaka

Thank you @imathis and @dearprakash for the help! I'm guessing it's just an error in my _config.yml file. Is there some sort of validation tool to check for errors in YAML?

@imathis
Owner

imathis commented Aug 18, 2012

@PHironaka A tip: make sure your strings are quoted. Some unescaped characters freak out the parser.
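
As one possible validation step, Ruby's bundled YAML library can parse the file and report where a syntax error occurs. This is a sketch only; `yaml_syntax_error` is a hypothetical helper, and it checks syntax alone, not whether the settings make sense to Jekyll.

```ruby
require "yaml"

# Hypothetical helper: returns nil when the YAML text parses cleanly,
# otherwise a short description of the first syntax error.
def yaml_syntax_error(source)
  YAML.parse(source)
  nil
rescue Psych::SyntaxError => e
  "line #{e.line}, column #{e.column}: #{e.problem}"
end

# Check the config file if one exists in the current directory.
config = "_config.yml"
if File.exist?(config)
  problem = yaml_syntax_error(File.read(config))
  puts(problem ? "#{config}: #{problem}" : "#{config}: no syntax errors found")
end
```

An unquoted value containing a colon followed by a space, such as `title: My Blog: a subtitle`, is the kind of string this catches; wrapping the value in quotes makes it parse.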

@itsPG

itsPG commented Dec 9, 2012

Thanks, this helped me a lot

@jeremyckahn

If Ubuntu users are also having this issue, this thread helped me figure it out.

I'm having another issue that is preventing me from generating my site, but I believe that issue is unrelated.

@zlx

zlx commented Mar 19, 2013

@pfleidi Perfect, I can't thank you enough.

@peterwillcn

Just add this line to ~/.bash_profile:

 export RUBYOPT="-KU -E utf-8:utf-8"
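
For context, the -E utf-8:utf-8 part sets Ruby's default external and internal encodings. A rough in-process equivalent is sketched here, on the assumption that you would rather set this at the top of a script such as a Rakefile than in the shell; nobody in this thread tested that placement, and the -KU part (source encoding) is not covered.

```ruby
# Roughly what -E utf-8:utf-8 does, applied from inside the program.
# External: encoding assumed for data read from files and pipes.
Encoding.default_external = Encoding::UTF_8
# Internal: encoding that data is transcoded to after it is read.
Encoding.default_internal = Encoding::UTF_8

puts "external: #{Encoding.default_external}"
puts "internal: #{Encoding.default_internal}"
```

These assignments only affect strings read after they run, so they need to come before any code that loads the posts.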

@jiangxiaopeng

@pfleidi thanks very much!

@zosiu

zosiu commented Mar 10, 2014

@pfleidi thanks!
