Problem extracting titles from Drupal 7 site #90

Closed
davad opened this Issue Nov 21, 2013 · 13 comments

@davad

davad commented Nov 21, 2013

I'm having this problem with 0.1.0.beta1.

All of the posts I export from my Drupal 7 blog end up with lines like this for the title metadata:

title: !binary |-
  SGVsbG8gV29ybGQ=

To debug it, I added a 'puts title' after line 69 of the Drupal 7 importer.

It looks like it's extracting the title correctly from the Drupal database, but somehow it's not being written to the metadata correctly. Any idea what's going on?
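One quick check is to decode the Base64 payload from the front matter shown above. The sketch below is plain Ruby, independent of the importer, and uses only the sample value from this report. If the payload decodes to the expected title, the title data itself is intact and is only being tagged as binary when it is serialized.

```ruby
# Base64 payload copied from the exported front matter above.
encoded = "SGVsbG8gV29ybGQ="

# "m" is the Base64 directive for String#unpack. The result is a
# byte string tagged with the binary (ASCII-8BIT) encoding, not UTF-8.
decoded = encoded.unpack("m").first

puts decoded           # prints "Hello World"
puts decoded.encoding  # reports the binary encoding of the decoded bytes
```

The bytes are fine; only the encoding label differs from what a UTF-8 title would carry.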

@parkr

Member

parkr commented Nov 21, 2013

Can you try the rc1? The Drupal importers have been updated a lot since beta1.

@davad

davad commented Nov 21, 2013

I actually had beta1 and rc1 installed side-by-side. I'm not sure which it was using before. I removed beta1 to make sure it was using rc1 and am still having the same problem.

@parkr

Member

parkr commented Nov 21, 2013

@parkr

Member

parkr commented Nov 21, 2013

I vote line 88.

@parkr

Member

parkr commented Nov 21, 2013

Should call YAML::dump instead of relying on the implicit #to_s call when it's passed as an arg to #puts.

@davad

davad commented Nov 21, 2013

I think you're right. I ran across this where the poster had a similar problem with object.to_s.to_yaml after upgrading to Ruby 1.9.3 (which is the version I'm using).

@davad

davad commented Nov 21, 2013

Tried applying that one line change in your pull request, and strange things happened:

--- ! "---\nlayout: default\ntitle: !binary |-\n  SG93IFRvIENyZWF0ZSBhbiBlQ29tbWVyY2Ugc2l0ZQ==\ncreated:
  1286502887\n"
---
<content>

@parkr

Member

parkr commented Nov 21, 2013

My god man! What terrible beast is that!
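The output quoted above has the shape you would expect if a string that already contains a YAML document were dumped a second time. That is only a guess at what happened here; the importer's actual code is not shown in this thread. The sketch below uses a hypothetical front_matter hash to reproduce the effect in isolation.

```ruby
require "yaml"

# Hypothetical front matter, for illustration only.
front_matter = { "layout" => "default", "title" => "Hello World" }

# First pass: the hash becomes a YAML document (a Ruby String).
once = front_matter.to_yaml

# Second pass: dumping that String treats the whole document as a
# single scalar, so the result is YAML wrapped inside YAML.
twice = YAML.dump(once)

puts once
puts twice
```

Loading `twice` gives back the string `once`, not the original hash, so whatever is written to the post should be serialized exactly once.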

@davad

davad commented Nov 21, 2013

Another (cleaner) solution might be to force the string to UTF-8. E.g. title.encode("UTF-8").
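One caveat worth checking before relying on encode: if the title arrives tagged as binary (ASCII-8BIT) and contains non-ASCII bytes, encode("UTF-8") tries to transcode and raises, whereas force_encoding("UTF-8") only relabels the existing bytes. The sketch below assumes that scenario; the sample title is made up, and String#b needs Ruby 2.0 or later.

```ruby
# Hypothetical title as a binary-tagged byte string ("Café" in UTF-8 bytes).
raw = "Caf\xC3\xA9".b

begin
  raw.encode("UTF-8")
  encode_failed = false
rescue Encoding::UndefinedConversionError => e
  # There is no defined conversion from binary bytes >= 0x80 to UTF-8.
  encode_failed = true
  puts "encode failed: #{e.message}"
end

# force_encoding changes only the encoding label, not the bytes.
relabeled = raw.dup.force_encoding("UTF-8")
puts relabeled.valid_encoding?
```

If the bytes are not actually valid UTF-8, valid_encoding? returns false after relabeling, so it is worth checking before writing the title out.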

@parkr

Member

parkr commented Nov 23, 2013

Did you try the suggestion at the bottom of: jekyll/jekyll#1132 ?

Might offer the final solution.

@davad

davad commented Nov 27, 2013

Nope. My LC_ALL environment variable wasn't set, but setting it with LC_ALL="en_US.UTF-8" didn't change anything.

antonizoon pushed a commit to antonizoon/jekyll-import that referenced this issue Mar 20, 2015

Lawrence Wu
Convert Post titles into UTF-8 strings, not binary junk
This patch is designed to solve this Drupal 7 import bug: jekyll#90

where titles are dumped as ugly binary strings, and not UTF-8 strings.

@antonizoon

antonizoon commented Mar 20, 2015

I have solved this issue in this pull request.

For these pesky titles (such as those with stray \xE2 junk that screws everything up), I applied strip and force_encoding("UTF-8") to the title string.
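A minimal sketch of the effect of that change, outside the importer: the raw_title value below is a made-up stand-in for a title read from the database as a binary-tagged string, and the hash keys are illustrative rather than taken from the patch.

```ruby
require "yaml"

# Hypothetical title: UTF-8 bytes for " Café " tagged as binary.
raw_title = " Caf\xC3\xA9 ".b

# Dumped as-is, the binary-tagged string is emitted with a !binary tag.
before = { "title" => raw_title }.to_yaml

# strip removes the surrounding whitespace; force_encoding relabels the
# remaining bytes as UTF-8 without changing them.
title = raw_title.strip.force_encoding("UTF-8")
after = { "title" => title }.to_yaml

puts before
puts after
```

Note that force_encoding does not repair invalid byte sequences; it only changes how Ruby (and therefore the YAML emitter) interprets them.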

@parkr

Member

parkr commented Mar 20, 2015

Fixed by #192.

@parkr parkr closed this Mar 20, 2015

@jekyll jekyll locked and limited conversation to collaborators Feb 27, 2017