Author: Ethan J. Eldridge
Python tool to migrate WP content to JSON for use in Harp
- Creates JSON for Posts
- Creates JSON for Comments
- Creates JSON for Nav
- Creates JSON for Pages
- Creates .md files for the Posts
- Creates .md files for the Pages
- If PULL_TYPES is enabled, pulls down the entire wp_posts table and converts it into usable _data files.
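The conversion listed above can be pictured with a minimal sketch. This is not the script's actual code: `write_posts` is a hypothetical helper, the row keys (`post_name`, `post_title`, `post_content`) are the standard WordPress column names, and it assumes Python 3 and Harp's convention that `_data.json` keys match the file names without their extension.

```python
import json
import os


def write_posts(posts, out_dir):
    """Write one <slug>.md per post plus a _data.json holding the metadata."""
    os.makedirs(out_dir, exist_ok=True)
    data = {}
    for post in posts:
        slug = post["post_name"]
        # Assumption: Harp finds a file's metadata under the key equal to its slug.
        data[slug] = {"title": post["post_title"]}
        md_path = os.path.join(out_dir, slug + ".md")
        with open(md_path, "w", encoding="utf-8") as handle:
            handle.write(post["post_content"])
    json_path = os.path.join(out_dir, "_data.json")
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)
    return data
```

Calling `write_posts(rows, "blog")` with rows fetched from the posts table would leave a `blog` folder holding the markdown files next to their `_data.json`.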
Fill out the database credentials at the top of the file, then run `python wp2json4harp.py`. You'll now have an example.jade file and a few directories with _data.json inside.
Once you've run the script, you'll get a folder for each of the _data.json files. This makes things a bit easier to coordinate, and the example jade file shows how to access the information pulled from your blog.
It's pretty heavy on I/O from all the writes, but I pulled down a sizable WordPress database (347915 rows in the postmeta table and 34617 in the posts table) in less than a minute, so it works alright.
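To check what a run produced, something like the following could walk the output area and count the entries in each generated _data.json. It is only a sketch: `summarize_data_files` is a hypothetical helper, and it assumes Python 3 and that the files were written as UTF-8 (adjust to match your OUTPUT_ENCODING).

```python
import json
import os


def summarize_data_files(root_dir="."):
    """Return {directory: entry count} for every _data.json under root_dir."""
    summary = {}
    for current_dir, _subdirs, files in os.walk(root_dir):
        if "_data.json" in files:
            path = os.path.join(current_dir, "_data.json")
            # Assumption: the file is UTF-8; change this if OUTPUT_ENCODING differs.
            with open(path, encoding="utf-8") as handle:
                entries = json.load(handle)
            summary[os.path.relpath(current_dir, root_dir)] = len(entries)
    return summary
```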
If you'd like to try it out:
- Install WordPress
- Install Harp
- Download this script
- Configure it to your liking using the options below
- Run the script!
- Move the folders and files into your harp site area.
Some configuration details:
Configuration is at the top of the script, where you'll need to enter your database credentials. Optionally, you can fully configure the script using the constants below:
Configuration of Script
| Constant | What it does |
|---|---|
| MYSQL_HOST | The host of the database to connect to. |
| MYSQL_USER | The user to connect to the database as. |
| MYSQL_PASS | The password for the database user. |
| MYSQL_DB | The name of the database to connect to on the host. |
| WP_PREFIX | The prefix of your WordPress tables, typically `wp_`. |
| ONLY_PUBLISHED | Only retrieve posts and pages that have been published. |
| GENERATE_PAGES | Generate a markdown file for each page pulled from the WP database. These will exist in the PAGES_DIR. |
| GENERATE_POSTS | Generate a markdown file for each post pulled from the WP database. These will exist in the BLOG_DIR. |
| ROOT_DIR | Where to generate all the files this script creates. Leave empty (the default) to use the directory the script is run from. |
| ENCODING | The encoding used to decode content from the database. I've defaulted it to latin to handle some annoying unicode errors. |
| OUTPUT_ENCODING | The encoding used to write the _data.json files. |
| STRIP_NON_ASCII | Strips non-ASCII characters from data being written into _data.json. |
| PULL_TYPES | Set this to true to pull every post type out of the database and create a _data file for each. If you use this, the `*_DIR` constants mean nothing. |
| PAGES_DIR | The directory name where the pages will be stored. |
| BLOG_DIR | The directory name where the blog posts will be stored. |
| NAV_DIR | The directory where the navigation JSON will be stored. |
| COMMENTS_DIR | The directory where comments will be stored. |
| EXAMPLE_FILE | The name of the file that will be generated to show some of the posts and pages. |
| STOP_ON_ERR | Boolean value that causes errors to stop the script. |
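As a rough illustration of how a few of these constants might fit together, here is a sketch with placeholder values. None of it is taken from the script itself: the values, `build_post_query`, and `clean_text` are all assumptions, written for Python 3, using the standard WordPress column names and the `%s` placeholder style of MySQL drivers such as MySQLdb.

```python
# Placeholder values only; substitute your own credentials and preferences.
MYSQL_HOST = "localhost"
MYSQL_USER = "wp_user"
MYSQL_PASS = "change-me"
MYSQL_DB = "wordpress"
WP_PREFIX = "wp_"
ONLY_PUBLISHED = True
ENCODING = "latin-1"  # assumed spelling of the "latin" default
STRIP_NON_ASCII = True


def build_post_query(post_type, prefix=WP_PREFIX, only_published=ONLY_PUBLISHED):
    """Build a parameterized SELECT against the <prefix>posts table."""
    sql = (
        "SELECT ID, post_title, post_name, post_content "
        "FROM {0}posts WHERE post_type = %s".format(prefix)
    )
    params = [post_type]
    if only_published:
        # WordPress marks published rows with post_status = 'publish'.
        sql += " AND post_status = %s"
        params.append("publish")
    return sql, params


def clean_text(raw, encoding=ENCODING, strip_non_ascii=STRIP_NON_ASCII):
    """Decode bytes from the database and optionally drop non-ASCII characters."""
    text = raw.decode(encoding) if isinstance(raw, bytes) else raw
    if strip_non_ascii:
        text = text.encode("ascii", "ignore").decode("ascii")
    return text
```

The query and its parameters would be handed to a driver's `cursor.execute(sql, params)`, and `clean_text` applied to each text column before it is written out.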
TODO:
- Add nav stuff to the PULL_TYPES area as well to help out with navigation
- How does one pull in the comments to a post?
- Use getopt to make cmd line arguments instead of constants
- More examples!
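For the getopt item above, one possible shape is sketched here. The option names (`--host`, `--prefix`, `--only-published`) and the `parse_args` helper are hypothetical; only `getopt.getopt` itself comes from the Python standard library.

```python
import getopt


def parse_args(argv):
    """Map a few command-line options onto the script's constant names."""
    settings = {
        "MYSQL_HOST": "localhost",
        "WP_PREFIX": "wp_",
        "ONLY_PUBLISHED": False,
    }
    # "h:" and "p:" take a value; "only-published" is a plain switch.
    opts, _rest = getopt.getopt(argv, "h:p:", ["host=", "prefix=", "only-published"])
    for flag, value in opts:
        if flag in ("-h", "--host"):
            settings["MYSQL_HOST"] = value
        elif flag in ("-p", "--prefix"):
            settings["WP_PREFIX"] = value
        elif flag == "--only-published":
            settings["ONLY_PUBLISHED"] = True
    return settings
```

In the script this would be called as `parse_args(sys.argv[1:])`, with the returned values replacing the hard-coded constants.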