Skip to content

Use new export API #525

Merged
merged 18 commits into from Dec 14, 2013

3 participants

@scribu
wp-cli member
scribu commented Jun 16, 2013

The current code that implements wp export is a mess, because the WordPress export code is a mess.

@nb proposed a proper export API (he says it's already being used on wordpress.com).

Ticket: http://core.trac.wordpress.org/ticket/22435
Diff: nb/WordPress@master...export-api

If we namespace it properly, we can start using it before it lands in Core.

Tasks:

  • do something with the --verbose flag
  • replace VerboseExportWriter class with action inside WP_Export_Split_Files_Writer
  • add parameter for specifying filename format #issuecomment-20013095
  • call export_wp action again?
  • manually test with large export
@scribu
wp-cli member
scribu commented Jun 16, 2013

@danielbachhuber Please give this a spin when you get the chance.

@danielbachhuber
wp-cli member
scribu and others added some commits Jun 16, 2013
@scribu scribu copy files from wp-includes/export verbatim 8329d4b
@scribu scribu load Oxymel via Composer adfdc88
@scribu scribu copy wp-includes/iterators.php to export/iterators.php c348058
@scribu scribu load export classes via Composer 9ae66fa
@scribu scribu first pass at loading and using the export API
* replace --file_item_count with --max_file_size
* --verbose is ignored, for now
1e50229
@scribu scribu add _wp_export_build_IN_condition() helper 128b9c0
@scribu scribu log each exported file path
had to copy over the whole WP_Export_Split_Files_Writer class.
4bfc6be
@scribu scribu remove --verbose parameter
The idea is to be reasonably verbose by default and completely silent
when --quiet is passed.
aa54258
@danielbachhuber danielbachhuber Restore af65fe4, which I accidentally overwrote :( fb5da0a
@scribu
wp-cli member
scribu commented Jun 25, 2013

I find myself overusing git rebase too. It makes the history cleaner, but it's a fake history. Plus, you lose the build results for intermediate commits.

Let's make a deal pact: no more rebasing of branches. Instead, when a branch gets dirty outdated, we just merge master into it and fix the conflicts.

@scribu
wp-cli member
scribu commented Jun 25, 2013

If a branch gets really outdated, we just abandon it and start a new one, with a different name.

@danielbachhuber
wp-cli member

Let's make a deal: no more rebasing of branches. Instead, when a branch gets dirty, we just merge master into it and fix the conflicts.

Works for me. Hopefully this leads to smaller branches too, so there aren't a bunch of merge commits clogging the history :)

I took a brief look at this. It seems like a good improvement, but I'm not yet sold on it as a great improvement.

Some thoughts:

  • Does it properly handle large exports (e.g. 200MB+)? We had some object cache reset hack previously to avoid memory leaks that's been removed.
  • Seems much faster. Do we need the CLI progress indicators back?
  • I included the CLI arguments in the filename originally so you could easily create multiple export files on a single day without worrying about them overwriting each other. It would be nice to restore those.
  • Not sure how I feel about removing file_item_count and replacing it with splitting based on file size.

cc @nb

@scribu
wp-cli member
scribu commented Jun 25, 2013

Does it properly handle large exports (e.g. 200MB+)? We had some object cache reset hack previously to avoid memory leaks that's been removed.

No idea; need to test.

Seems much faster. Do we need the CLI progress indicators back?

I'm not sure how easy it would be to add them back; we'd have to fork even more of the new export API.

I included the CLI arguments in the filename originally so you could easily create multiple export files on a single day without worrying about them overwriting each other. It would be nice to restore those.

I don't think including the CLI arguments verbatim was a good idea to start. We could add their md5 hash though.

Not sure how I feel about removing file_item_count and replacing it with splitting based on file size.

I think it makes sense; the file size is what you really care about when splitting.

@danielbachhuber
wp-cli member

we'd have to fork even more of the new export API.

Can we instead ask Nikolay to include actions within the export API that achieve the end results we want? I haven't done a diff to see what we've changed, but my preference would be no forking at all.

I don't think including the CLI arguments verbatim was a good idea to start. We could add their md5 hash though.

Care to explain? Including them verbatim achieves two things:

  1. Tells me which export the file corresponds with.
  2. Prevents a second export from overwriting the first.

An md5 hash only achieves the second.

@scribu
wp-cli member
scribu commented Jun 25, 2013

It's not a good idea because the arguments can get really long and might contain unusual characters (for a file name).

I understand it's convenient, but it doesn't justify the potential headaches.

@danielbachhuber
wp-cli member

It's not a good idea because the arguments can get really long and might contain unusual characters (for a file name).

Let that become a problem when it becomes a problem. I was sanitizing the filename previously.

@scribu
wp-cli member
scribu commented Jun 25, 2013

Copying the relevant code chunk here, for convenience:

$append = array( date( 'Y-m-d' ) );
foreach( array_keys( $args ) as $arg_key ) {
    if ( $defaults[$arg_key] <> $args[$arg_key] && 'post__in' != $arg_key )
        $append[]= "$arg_key-" . (string) $args[$arg_key];
}
$file_name_base = sanitize_file_name( $sitename . 'wordpress.' . implode( ".", $append ) );

Still not sold; it would be cleaner to allow the user to specify an "ID" for a batch of export files. For example:

wp export --batch-id=events-for-x --post_type=event

which would generate:

testground.wordpress.2013-06-events-for-x.0.xml
testground.wordpress.2013-06-events-for-x.1.xml
testground.wordpress.2013-06-events-for-x.2.xml

Or even allow them control of the entire filename format:

wp export --filename-format='events-for-x-{site}-{date}-{nr}.xml' --post_type=event
@scribu
wp-cli member
scribu commented Jun 25, 2013

I guess my beef with the old code is that 1) the filenames generated by default are too dynamic and 2) you can't change the default.

@nb
nb commented Jun 25, 2013

I am cool with adding actions and filters, I almost haven't thought how it will be extended, so I almost haven't added actions or filters.

@scribu
wp-cli member
scribu commented Jul 5, 2013

Added 'wp_export_new_file' hook: 743092d

Suggestions for better name and/or better placement (while achieving similar functionality) welcome.

@scribu
wp-cli member
scribu commented Dec 14, 2013

Guys, I'm going to merge this pretty much as-is.

My hunch is that people will be a lot more likely to submit incremental improvements once they see that the code is relatively clean.

@scribu scribu merged commit fd8cefb into master Dec 14, 2013

1 check passed

Details default The Travis CI build passed
@scribu scribu deleted the export-api branch Dec 14, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.