Correctly slugify the title from wordpress.xml #565

Closed
tombell wants to merge 2 commits

2 participants

@tombell

If wp:post_name was empty in wordpress.xml, the migrator would correctly fall back to the title, but the title wasn't properly turned into a slug: forward slashes were not handled.

I've added a sort-of test case using a wordpress.xml exported from an example blog I made, modified so that one post is missing the post name. None of the other migrators had any test cases, so I kind of rode in blind creating this one. If the test seems like overkill, feel free to drop the commit tombell/jekyll@988a7a8.
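
For illustration, here is a minimal sketch comparing the old and new fallback expressions from the diff on the fixture's titles. The method names `old_slug` and `new_slug` are hypothetical wrappers added for this example; only the expressions inside them come from the patch.

```ruby
# Hypothetical wrapper around the old fallback from the diff:
# lowercase the title, then join whitespace-separated words with hyphens.
def old_slug(title)
  title.downcase.split.join('-')
end

# Hypothetical wrapper around the new fallback from the diff:
# replace every run of non-alphanumeric characters with one hyphen,
# then lowercase the result.
def new_slug(title)
  title.gsub(/[^[:alnum:]]+/, '-').downcase
end

title = "Another/Post With Slash"
puts old_slug(title)           # another/post-with-slash (slash survives)
puts new_slug(title)           # another-post-with-slash
puts new_slug("Hello world!")  # hello-world- (trailing punctuation becomes a hyphen)
```

The old expression leaves the slash in place, which a filesystem would treat as a path separator when the slug is used as a filename. Note that the new expression, as written, can leave a leading or trailing hyphen when the title starts or ends with punctuation.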

tombell added some commits
@tombell tombell Turn the title into a slug
This allows the title to be correctly used as a filename.
Fixes #523
73212a2
@tombell tombell Add test case for wordpressdotcom migrator
* Add hpricot to dev dependencies so we can use the migrator
* Add example wordpress.xml file for testing with
* Add `fixtures_dir` method to helper file
* Add test to see if 2 posts and 1 page are imported
988a7a8
@mojombo
Owner

I would prefer that we use Nokogiri instead of Hpricot; it's more modern and well maintained. Otherwise this is looking good.

@tombell

This pull request is somewhat redundant given the newer migrator->importer pull request. I'll work on swapping Hpricot for Nokogiri in that pull request instead.

@tombell tombell closed this
Commits on May 27, 2012
  1. @tombell

    Turn the title into a slug

    tombell authored
    This allows the title to be correctly used as a filename.
    Fixes #523
Commits on May 31, 2012
  1. @tombell

    Add test case for wordpressdotcom migrator

    tombell authored
    * Add hpricot to dev dependencies so we can use the migrator
    * Add example wordpress.xml file for testing with
    * Add `fixtures_dir` method to helper file
    * Add test to see if 2 posts and 1 page are imported
1  jekyll.gemspec
@@ -38,6 +38,7 @@ Gem::Specification.new do |s|
s.add_development_dependency('RedCloth', "~> 4.2")
s.add_development_dependency('rdiscount', "~> 1.6")
s.add_development_dependency('redcarpet', "~> 1.9")
+ s.add_development_dependency('hpricot', "~> 0.8.6")
# = MANIFEST =
s.files = %w[
4 lib/jekyll/migrators/wordpressdotcom.rb
@@ -19,7 +19,7 @@ def self.process(filename = "wordpress.xml")
permalink_title = item.at('wp:post_name').inner_text
# Fallback to "prettified" title if post_name is empty (can happen)
if permalink_title == ""
- permalink_title = title.downcase.split.join('-')
+ permalink_title = title.gsub(/[^[:alnum:]]+/, '-').downcase
end
date = Time.parse(item.at('wp:post_date').inner_text)
@@ -65,6 +65,8 @@ def self.process(filename = "wordpress.xml")
import_count.each do |key, value|
puts "Imported #{value} #{key}s"
end
+
+ import_count
end
end
end
149 test/fixtures/wordpress.xml
@@ -0,0 +1,149 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!-- This is a WordPress eXtended RSS file generated by WordPress as an export of your site. -->
+<!-- It contains information about your site's posts, pages, comments, categories, and other content. -->
+<!-- You may use this file to transfer that content from one site to another. -->
+<!-- This file is not intended to serve as a complete backup of your site. -->
+
+<!-- To import this information into a WordPress site follow these steps: -->
+<!-- 1. Log in to that site as an administrator. -->
+<!-- 2. Go to Tools: Import in the WordPress admin panel. -->
+<!-- 3. Install the "WordPress" importer from the list. -->
+<!-- 4. Activate & Run Importer. -->
+<!-- 5. Upload this file using the form provided on that page. -->
+<!-- 6. You will first be asked to map the authors in this export file to users -->
+<!-- on the site. For each author, you may choose to map to an -->
+<!-- existing user on the site or to create a new user. -->
+<!-- 7. WordPress will then import each of the posts, pages, comments, categories, etc. -->
+<!-- contained in this file into your site. -->
+
+<!-- generator="WordPress.com" created="2012-05-30 23:38"-->
+<rss version="2.0"
+ xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"
+ xmlns:content="http://purl.org/rss/1.0/modules/content/"
+ xmlns:wfw="http://wellformedweb.org/CommentAPI/"
+ xmlns:dc="http://purl.org/dc/elements/1.1/"
+ xmlns:wp="http://wordpress.org/export/1.2/"
+>
+
+<channel>
+ <title>Test</title>
+ <link>http://tombtest.wordpress.com</link>
+ <description>A fine WordPress.com site</description>
+ <pubDate>Wed, 30 May 2012 23:38:48 +0000</pubDate>
+ <language>en</language>
+ <wp:wxr_version>1.2</wp:wxr_version>
+ <wp:base_site_url>http://wordpress.com/</wp:base_site_url>
+ <wp:base_blog_url>http://tombtest.wordpress.com</wp:base_blog_url>
+
+ <wp:author><wp:author_id>1402643</wp:author_id><wp:author_login>tombell</wp:author_login><wp:author_email>tomb@tomb.io</wp:author_email><wp:author_display_name><![CDATA[tombell]]></wp:author_display_name><wp:author_first_name><![CDATA[]]></wp:author_first_name><wp:author_last_name><![CDATA[]]></wp:author_last_name></wp:author>
+
+ <wp:category><wp:term_id>1</wp:term_id><wp:category_nicename>uncategorized</wp:category_nicename><wp:category_parent></wp:category_parent><wp:cat_name><![CDATA[Uncategorized]]></wp:cat_name></wp:category>
+
+ <generator>http://wordpress.com/</generator>
+<cloud domain='tombtest.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
+<image>
+ <url>http://s2.wp.com/i/buttonw-com.png</url>
+ <title>Test</title>
+ <link>http://tombtest.wordpress.com</link>
+ </image>
+ <atom:link rel="search" type="application/opensearchdescription+xml" href="http://tombtest.wordpress.com/osd.xml" title="Test" />
+ <atom:link rel='hub' href='http://tombtest.wordpress.com/?pushpress=hub'/>
+
+ <item>
+ <title>Hello world!</title>
+ <link>http://tombtest.wordpress.com/2012/05/30/hello-world/</link>
+ <pubDate>Wed, 30 May 2012 23:37:02 +0000</pubDate>
+ <dc:creator>tombell</dc:creator>
+ <guid isPermaLink="false">http://tombtest.wordpress.com/?p=1</guid>
+ <description></description>
+ <content:encoded><![CDATA[Welcome to <a href="https://wordpress.com/">WordPress.com</a>! This is your very first post. Click the Edit link to modify or delete it, or <a title="Direct link to Add New in the Admin Dashboard" href="/wp-admin/post-new.php">start a new post</a>. If you like, use this post to tell readers why you started this blog and what you plan to do with it.
+
+Happy blogging!]]></content:encoded>
+ <excerpt:encoded><![CDATA[]]></excerpt:encoded>
+ <wp:post_id>1</wp:post_id>
+ <wp:post_date>2012-05-30 23:37:02</wp:post_date>
+ <wp:post_date_gmt>2012-05-30 23:37:02</wp:post_date_gmt>
+ <wp:comment_status>open</wp:comment_status>
+ <wp:ping_status>open</wp:ping_status>
+ <wp:post_name>hello-world</wp:post_name>
+ <wp:status>publish</wp:status>
+ <wp:post_parent>0</wp:post_parent>
+ <wp:menu_order>0</wp:menu_order>
+ <wp:post_type>post</wp:post_type>
+ <wp:post_password></wp:post_password>
+ <wp:is_sticky>0</wp:is_sticky>
+ <category domain="category" nicename="uncategorized"><![CDATA[Uncategorized]]></category>
+ <wp:comment>
+ <wp:comment_id>1</wp:comment_id>
+ <wp:comment_author><![CDATA[Mr WordPress]]></wp:comment_author>
+ <wp:comment_author_email></wp:comment_author_email>
+ <wp:comment_author_url>http://WordPress.com/</wp:comment_author_url>
+ <wp:comment_author_IP></wp:comment_author_IP>
+ <wp:comment_date>2012-05-30 23:37:02</wp:comment_date>
+ <wp:comment_date_gmt>2012-05-30 23:37:02</wp:comment_date_gmt>
+ <wp:comment_content><![CDATA[Hi, this is a comment.<br />To delete a comment, just log in, and view the posts' comments, there you will have the option to edit or delete them.]]></wp:comment_content>
+ <wp:comment_approved>1</wp:comment_approved>
+ <wp:comment_type></wp:comment_type>
+ <wp:comment_parent>0</wp:comment_parent>
+ <wp:comment_user_id>0</wp:comment_user_id>
+ </wp:comment>
+ </item>
+ <item>
+ <title>About</title>
+ <link>http://tombtest.wordpress.com/about/</link>
+ <pubDate>Wed, 30 May 2012 23:37:02 +0000</pubDate>
+ <dc:creator>tombell</dc:creator>
+ <guid isPermaLink="false">http://tombtest.wordpress.com/?page_id=2</guid>
+ <description></description>
+ <content:encoded><![CDATA[This is an example of a page. Unlike posts, which are displayed on your blog’s front page in the order they’re published, pages are better suited for more timeless content that you want to be easily accessible, like your About or Contact information. Click the Edit link to make changes to this page or <a title="Direct link to Add New in the Admin Dashboard" href="/wp-admin/post-new.php?post_type=page">add another page</a>.]]></content:encoded>
+ <excerpt:encoded><![CDATA[]]></excerpt:encoded>
+ <wp:post_id>2</wp:post_id>
+ <wp:post_date>2012-05-30 23:37:02</wp:post_date>
+ <wp:post_date_gmt>2012-05-30 23:37:02</wp:post_date_gmt>
+ <wp:comment_status>open</wp:comment_status>
+ <wp:ping_status>open</wp:ping_status>
+ <wp:post_name>about</wp:post_name>
+ <wp:status>publish</wp:status>
+ <wp:post_parent>0</wp:post_parent>
+ <wp:menu_order>0</wp:menu_order>
+ <wp:post_type>page</wp:post_type>
+ <wp:post_password></wp:post_password>
+ <wp:is_sticky>0</wp:is_sticky>
+ <wp:postmeta>
+ <wp:meta_key>_wp_page_template</wp:meta_key>
+ <wp:meta_value><![CDATA[default]]></wp:meta_value>
+ </wp:postmeta>
+ </item>
+ <item>
+ <title>Another/Post With Slash</title>
+ <link>http://tombtest.wordpress.com/2012/05/30/anotherpost-with-slash/</link>
+ <pubDate>Wed, 30 May 2012 23:37:47 +0000</pubDate>
+ <dc:creator>tombell</dc:creator>
+ <guid isPermaLink="false">http://tombtest.wordpress.com/?p=3</guid>
+ <description></description>
+ <content:encoded><![CDATA[This is some blog content]]></content:encoded>
+ <excerpt:encoded><![CDATA[]]></excerpt:encoded>
+ <wp:post_id>3</wp:post_id>
+ <wp:post_date>2012-05-30 23:37:47</wp:post_date>
+ <wp:post_date_gmt>2012-05-30 23:37:47</wp:post_date_gmt>
+ <wp:post_name></wp:post_name>
+ <wp:comment_status>open</wp:comment_status>
+ <wp:ping_status>open</wp:ping_status>
+ <wp:status>publish</wp:status>
+ <wp:post_parent>0</wp:post_parent>
+ <wp:menu_order>0</wp:menu_order>
+ <wp:post_type>post</wp:post_type>
+ <wp:post_password></wp:post_password>
+ <wp:is_sticky>0</wp:is_sticky>
+ <category domain="category" nicename="uncategorized"><![CDATA[Uncategorized]]></category>
+ <wp:postmeta>
+ <wp:meta_key>_edit_last</wp:meta_key>
+ <wp:meta_value><![CDATA[1402643]]></wp:meta_value>
+ </wp:postmeta>
+ <wp:postmeta>
+ <wp:meta_key>jabber_published</wp:meta_key>
+ <wp:meta_value><![CDATA[1338421069]]></wp:meta_value>
+ </wp:postmeta>
+ </item>
+</channel>
+</rss>
4 test/helper.rb
@@ -28,6 +28,10 @@ def source_dir(*subdirs)
File.join(File.dirname(__FILE__), 'source', *subdirs)
end
+ def fixtures_dir(*subdirs)
+ File.join(File.dirname(__FILE__), 'fixtures', *subdirs)
+ end
+
def clear_dest
FileUtils.rm_rf(dest_dir)
end
18 test/migrators/test_wordpressdotcom.rb
@@ -0,0 +1,18 @@
+require 'helper'
+require 'jekyll/migrators/wordpressdotcom'
+
+class TestWordpressDotCom < Test::Unit::TestCase
+ context 'migrating from wordpress.com' do
+ setup do
+ @wordpressxml = File.join(fixtures_dir, 'wordpress.xml')
+ end
+
+ should 'import post with a slash in the title from wordpress.xml' do
+ stub(FileUtils).mkdir_p(anything)
+ stub(File).open(anything, anything)
+ imported = Jekyll::WordpressDotCom.process(@wordpressxml)
+ assert_equal 1, imported['page']
+ assert_equal 2, imported['post']
+ end
+ end
+end