This tool performs two functions: preparing WordPress blog exports in WXR format for migration, and integrating with the Drupal Migrate module to perform the actual migration.


A number of transformations and preparations are performed, including:

  • Stripping comments (useful if using Disqus or a similar system)
  • Combining separate files into one monolithic file
  • Extracting author, tag, and image information into separate files
  • Replacing all WordPress shortcodes and filters with HTML
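
As a rough illustration of the first transformation, the sketch below removes comment elements from a WXR document using only the Python standard library. It is not the module's own code (the module is driven through drush); the function names and the sample document are hypothetical, and elements are matched by local name so the sketch does not depend on a particular WXR namespace version.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return an element tag without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def strip_comments(wxr_xml):
    """Return the WXR text with every comment element removed from its items."""
    root = ET.fromstring(wxr_xml)
    for item in list(root.iter("item")):
        for child in list(item):
            if local_name(child.tag) == "comment":
                item.remove(child)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed sample; a real export has many more elements.
SAMPLE = """<rss xmlns:wp="http://wordpress.org/export/1.2/" version="2.0">
<channel><item><title>Hello</title>
<wp:comment><wp:comment_id>1</wp:comment_id></wp:comment>
</item></channel></rss>"""

cleaned = strip_comments(SAMPLE)
```

The post title survives in `cleaned` while the comment subtree is gone, which is the effect wanted when comments live in Disqus or a similar system.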


The following migrations are included:

 Name of Migration      Mapping Review
 EwWpAuthor             /admin/content/migrate/EwWpAuthor
 EwWpImage              /admin/content/migrate/EwWpImage
 EwWpRecapShow          /admin/content/migrate/EwWpRecapShow
 EwWpRecapShowSeason    /admin/content/migrate/EwWpRecapShowSeason
 EwWpPost               /admin/content/migrate/EwWpPost
 EwWpCategory           /admin/content/migrate/EwWpCategory
 EwWpTag                /admin/content/migrate/EwWpTag
 EwWpTaxonomyRule       /admin/content/migrate/EwWpTaxonomyRule


Create the following folder:


For each WordPress blog that you wish to process, add a sub-folder containing all WXR (XML) files associated with the export. Unzipping the archive containing the WXR files works quite nicely.

# Enable the module.
drush -y en ew_wordpress_migrate
# Clear caches.
drush cc all
drush cc drush
# Verify Migrations are ready.
drush migrate-status
# Import Taxonomy rules.
drush mi EwWpTaxonomyRule

Tags and Categories

Since the target system has a number of custom taxonomies and vocabularies, a straight 1:1 import isn't possible. Some terms will become content types, such as a person or a creative work, and other terms will be mapped to a specific vocabulary like event, genre, and so forth. Additionally, some terms are just invalid and should be ignored instead of imported.

To handle all these transformations, a system has been devised that first consolidates all the tags and categories in one place, then exports a spreadsheet where the transformation logic can be defined. The resulting spreadsheet can be edited by stakeholders with the appropriate domain knowledge to make decisions about how the tags will be handled. When complete, the spreadsheet is re-imported, customizing the rules. Once the rules are populated, the actual migration can take place with the customized logic.


A number of steps are involved. First, pre-process all WordPress blogs using:

drush ew-wp-preimport

Then, run the following migrations to import all the tags.

drush mi EwWpCategory
drush mi EwWpTag

The tag rules need customization by staff; to do that, they must first be exported:

# Writes to public://migration/wordpress/target/migration_taxonomy_rules.csv
drush ew-wp-tax-rule-export
# Selectively export non-customized rules; useful for deltas.
drush ew-wp-tax-rule-export --skip-existing

The spreadsheet is then customized and placed in this module's folder.

The rules can then be imported.

drush mi EwWpTaxonomyRule

Once the rules are in place, the WordPress content migrations can commence.

The rule spreadsheet

The CSV file can be edited in any text or spreadsheet editor. Before import, it must use Unicode (UTF-8) character encoding and Unix (LF) line endings, and it must have a header row with the field names (the header row is skipped on import).
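
Spreadsheet editors often change the encoding or line endings on save, so a quick check before import can save a failed run. The following is a minimal sketch, not part of the module: the function name and the choice of `id` and `source_term` as required header fields are assumptions made for illustration.

```python
import csv
import io


def check_rule_csv(raw, required=("id", "source_term")):
    """Return a list of problems found; an empty list means the basics look fine."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return ["not valid UTF-8: %s" % exc]
    problems = []
    if "\r" in text:
        problems.append("contains CR characters; convert to Unix (LF) line endings")
    # The first row must be the header with the field names.
    header = next(csv.reader(io.StringIO(text, newline="")), [])
    missing = [name for name in required if name not in header]
    if missing:
        problems.append("header row is missing: " + ", ".join(missing))
    return problems
```

Usage: read the file in binary mode and pass the bytes, for example `check_rule_csv(open("migration_taxonomy_rules.csv", "rb").read())`.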

 Field Name             Editable  Possible Values  Purpose
 id                     No        Integers         Primary key of Table
 field_legacy_id        No        Integers         WordPress Term ID
 source_term            No        String           The original term
 blog_name              No        String           The source blog name, if any for context
 source_type            No        tag, category    Context
 rename_term            Yes       Any string       If set, will transform the term
 ignore_term            Yes       0, 1             If set to 1, will not import the term
 person                 Yes       0, 1             If set to 1, will target Person content type, creating as necessary
 movie                  Yes       0, 1             If set to 1, will target Creative Work type Movie, creating as necessary
 tv                     Yes       0, 1             If set to 1, will target Creative Work type TV, creating as necessary
 book                   Yes       0, 1             If set to 1, will target Creative Work type Book, creating as necessary
 music                  Yes       0, 1             If set to 1, will target Creative Work type Music, creating as necessary
 stage                  Yes       0, 1             If set to 1, will target Creative Work type Stage, creating as necessary
 web_series             Yes       0, 1             If set to 1, will target Creative Work type Web Series, creating as necessary
 video_game             Yes       0, 1             If set to 1, will target Creative Work type Video Game, creating as necessary
 taxonomy_genre         Yes       0, 1             If set to 1, will add term to specified Taxonomy
 taxonomy_ew_franchise  Yes       0, 1             If set to 1, will add term to specified Taxonomy
 taxonomy_franchise     Yes       0, 1             If set to 1, will add term to specified Taxonomy
 taxonomy_event         Yes       0, 1             If set to 1, will add term to specified Taxonomy
 taxonomy_freeform      Yes       0, 1             If set to 1, will add term to specified Taxonomy
 taxonomy_category      Yes       0, 1             If set to 1, will add term to specified Taxonomy
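
Because every editable field except rename_term accepts only 0 or 1, an edited spreadsheet can be checked row by row before re-import. This is a sketch under the assumption that anything other than the literal values 0 and 1 (including an empty cell) should be flagged; how the module itself treats such values is not documented here. The helper name is hypothetical.

```python
# Flag columns taken from the field table above.
FLAG_FIELDS = (
    "ignore_term", "person", "movie", "tv", "book", "music", "stage",
    "web_series", "video_game", "taxonomy_genre", "taxonomy_ew_franchise",
    "taxonomy_franchise", "taxonomy_event", "taxonomy_freeform",
    "taxonomy_category",
)


def invalid_flags(row):
    """Return the names of flag fields whose value is not exactly 0 or 1."""
    return [
        name for name in FLAG_FIELDS
        if str(row.get(name, "")).strip() not in ("0", "1")
    ]


# Example row as csv.DictReader would produce it (all values are strings).
row = dict.fromkeys(FLAG_FIELDS, "0")
row["tv"] = "1"
```

With `csv.DictReader`, each row can be passed to `invalid_flags` and any non-empty result reported back to the person editing the spreadsheet.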

To consolidate terms, use the rename_term field. For example, take the following two tags for the same TV show:

  • 666 Park

  • 666 Park Avenue

For "666 Park", set rename_term to "666 Park Avenue".

For both entries, set tv to 1.

Leave rename_term empty for "666 Park Avenue". The end result will be one Creative Work named "666 Park Avenue" of type TV.
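
The consolidation above can be pictured with a small sketch. It is an illustration of the rule semantics as described in this section, not the module's migration code; the function name is hypothetical and rows are plain dictionaries keyed by the spreadsheet field names.

```python
def resolve_tv_works(rows):
    """Collapse rule rows into the set of TV Creative Work names they target."""
    names = set()
    for row in rows:
        # Ignored terms and terms not flagged as TV contribute nothing.
        if row.get("ignore_term") == "1" or row.get("tv") != "1":
            continue
        # A non-empty rename_term replaces the original term.
        names.add(row.get("rename_term") or row["source_term"])
    return names


rows = [
    {"source_term": "666 Park", "rename_term": "666 Park Avenue",
     "tv": "1", "ignore_term": "0"},
    {"source_term": "666 Park Avenue", "rename_term": "",
     "tv": "1", "ignore_term": "0"},
]
print(resolve_tv_works(rows))  # {'666 Park Avenue'}
```

Both source terms resolve to the same name, so only one Creative Work results.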

Importing content from WordPress

Ensure that content types have all the necessary fields. Reverting features is the easiest way.

# Revert all features.
drush fra -y

Prepare the files for import.

drush ew-wp-preimport --all_operations
# Specify folder to process.

Perform the import.

drush migrate-import EwWpAuthor
drush migrate-import EwWpImage
drush migrate-import EwWpRecapShow
drush migrate-import EwWpRecapShowSeason
drush migrate-import EwWpPost
# Undo all migrations.
drush migrate-rollback --all -y

Testing images

If you don't feel like downloading every target image, there is a test mode available.

Put a test image in public://test.png, then set $test_mode to TRUE. The migration will use the same paths and file names, but the actual image will be the test image.

I recommend making the test image as small as possible.
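
If no small image is at hand, one can be generated. The sketch below builds a 1x1 grayscale PNG (well under 100 bytes) with the Python standard library; the function names are hypothetical, and where public://test.png resolves on disk depends on the site's file configuration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(kind, data):
    """Frame one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png():
    """Build a 1x1, 8-bit grayscale PNG containing a single mid-gray pixel."""
    # width, height, bit depth, color type 0 (grayscale),
    # compression, filter, and interlace methods all 0.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    scanline = b"\x00\x80"  # filter type 0, then one gray sample
    return (PNG_SIGNATURE
            + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", zlib.compress(scanline))
            + png_chunk(b"IEND", b""))


def write_tiny_png(path):
    """Write the generated image to the given path."""
    with open(path, "wb") as handle:
        handle.write(tiny_png())
```

Call `write_tiny_png` with the on-disk location of the public files directory plus `test.png`.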

Known limitations

The migration does not account for or import:

  • category
  • wp:postmeta
    • blog-related-* - TBD
  • pkg_label
  • module_title - DEPRECATE
  • module_classes - DEPRECATE
  • module_more_link - DEPRECATE
  • _publicize_pending - DEPRECATE
  • tagazine-media - DEPRECATE
  • sharing_disabled - DEPRECATE
  • geo_public - DEPRECATE
  • _ew_wp_meta
  • image variations (not known if actual crops or just resized)

Drush utilities


Prepares WXR files for ingestion with the WordPress Migrate module. Using the --all_operations option is recommended for convenience.


 source                                    Folder within sites/default/files/migration/wordpress/source


 --all_operations                          If set, will perform every operation, such as removing comments, combining, etc.
 --combine                                 If set, creates one monolithic WXR file.
 --create_authors                          If set, will create an additional file, authors.xml, containing all authors.
 --create_tags                             If set, will create an additional file, tags.xml, containing all tags & categories.
 --ew_image_extract                        If set, creates ew_image.json containing all metadata ew_image tags.
 --img_src_extract                         If set, creates images.json containing all img tags and metadata.
 --remove_authors                          If set, will remove authors from each WXR file.
 --remove_comments                         If set, will remove comments from each WXR file.
 --remove_tags                             If set, will remove tags & categories from each WXR file.
 --shortcodes                              If set, replaces shortcodes with HTML.


Export Taxonomy Term Rules for Migration into CSV format.


 --skip-existing                           If set, skips rules that have already been defined.

WordPress shortcodes and filters

This was an interesting problem: how to render the hundreds of standard and custom filters and shortcodes. The solution was to include a bare-minimum subset of WordPress function files (see contrib/wordpress/includes) along with some placeholder functions that do nothing. Search for "Colossal WordPress hack" to see what was done and how. If something is rendering strangely, look in that area first.
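
The idea of pairing real renderers with do-nothing placeholders can be shown in a few lines. The sketch below is not how the module works internally (it calls the bundled WordPress PHP functions); it swaps in a simple regular-expression dispatcher, and the handler names, the sample shortcodes, and the generated HTML are all hypothetical.

```python
import re

# [name attrs] optionally followed by content and a matching [/name].
SHORTCODE = re.compile(r"\[(\w+)([^\]]*)\](?:(.*?)\[/\1\])?", re.DOTALL)


def noop(attrs, content):
    """Placeholder handler: drop the shortcode markup, keep enclosed content."""
    return content or ""


def render_caption(attrs, content):
    """Hypothetical handler that wraps the enclosed content in a figure."""
    return '<figure class="caption">' + (content or "") + "</figure>"


HANDLERS = {"caption": render_caption, "gallery": noop}


def render_shortcodes(text):
    """Replace known shortcodes with HTML; leave unknown ones untouched."""
    def replace(match):
        handler = HANDLERS.get(match.group(1))
        if handler is None:
            return match.group(0)
        return handler(match.group(2).strip(), match.group(3))
    return SHORTCODE.sub(replace, text)
```

Registering a placeholder for a shortcode keeps its raw markup out of the migrated body, while unregistered shortcodes pass through unchanged so they remain easy to spot.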


When complete, disable and uninstall the module.

drush pm-disable ew_wordpress_migrate
drush pm-uninstall ew_wordpress_migrate


You can format the output to make it human-readable.


xmllint --format combined.xml --output combined-working.xml


python -m json.tool images.json > images-working.json


