fix: retain paragraph #79

Merged
merged 2 commits into from
Jul 28, 2020

curbengh
Contributor

@curbengh curbengh commented Jul 23, 2020

Fixes #35

In the WP classic editor, exported posts don't contain <p>, which turndown requires to retain newlines; without <p> (or any other element), turndown removes the newlines (see mixmark-io/turndown#264).

I'm assuming most users used the modern editor, hence this workaround is not applied by default. To enable:

$ hexo migrate wordpress exported.xml --paragraph-fix
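
The wrapping step is plain string manipulation and can be tried without turndown. The following is a minimal sketch, assuming the fix mirrors the snippet posted later in this thread; `wrapParagraphs` is a hypothetical name used for illustration, not the plugin's API.

```javascript
// Hypothetical helper; the regex mirrors the snippet posted later in this thread.
const wrapParagraphs = str => {
  // Leave the content alone if it already contains <p>.
  if (/<p>/i.test(str)) return str;
  // Treat each blank line (LF or CRLF) as a paragraph boundary.
  return '<p>' + str.replace(/(\r?\n){2}/g, '</p>\n\n<p>') + '</p>';
};

const classic = 'First paragraph.\n\nSecond paragraph.';
console.log(wrapParagraphs(classic));
```

With the paragraphs wrapped this way, the HTML handed to turndown has explicit block boundaries instead of bare newlines.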

@coveralls

coveralls commented Jul 23, 2020

Coverage Status

Coverage increased (+0.09%) to 95.862% when pulling c3ffa5b on curbengh:restore-paragraph into b5b2565 on hexojs:master.

@jehy

jehy commented Jul 23, 2020

Code blocks are usually inserted via external plugins like WP-Syntax, so there can be different use cases... My code blocks use WP-Syntax, are added with <pre lang="php">, and look like this:

<content:encoded><![CDATA[
<p>Data base of wikimedia-based project in several monthes can gain awful size. Since there are no solutions from wikimedia itself, but you can use wonderful plugin "SpecialDeleteOldRevisions", which provides this functionality. It helps you to delete articles, filtering by<UL></p>
<li>Article Category</li>
<li>Revision creation time</li>
<li>Article name</li>
<p></UL>Also you have an option - if you want to delete deleted articles from database or not. I checked it on <a href="https://jehy.ru/wiki">my wiki</a> - everything works wonderful. But, as always, after some bugfix. I made this work and published fixed version.</p>
<p><a href="http://www.mediawiki.org/wiki/Extension:SpecialDeleteOldRevisions">Original plugin page</a></p>
<p><a href="https://jehy.ru/dload/specialdeleteoldrevisions.zip">My patched version for wiki 13.2</a></p>
<p>To install plugin, copy it's directory "SpecialDeleteOldRevisions" to your "/extensions", and add to LocalSettings.php the following lines:</p>
<pre lang="php">
 $wgGroupPermissions['sysop']['DeleteOldRevisions'] = true;
 include_once('extensions/SpecialDeleteOldRevisions/SpecialDeleteOldRevisions.php');
</pre>
<p>After it in "special pages" you will see new link - "Delete old revisions" - use it. And better make backup firstly ;).</p>
]]></content:encoded>

@curbengh
Contributor Author

The sample you gave has <p>, which I believe comes from the modern editor? Can you try creating another post with the classic editor?

@jehy

jehy commented Jul 24, 2020

The sample you gave has <p>, which I believe comes from the modern editor? Can you try creating another post with the classic editor?

I've written posts with WordPress for 14 years using different post formats, and I'm not even sure what was written in the editor, what I coded in HTML, and where I used post formatting plugins (you know, it is even possible to write WordPress posts in Markdown...).

So I suppose my posts are not the best source to explore. Maybe we need a clean new WordPress installation to handle the default formatting properly.

@curbengh
Contributor Author

In a sample provided by @adnan360, the post "Post with Image (Classic Editor)" doesn't have <p>, which causes #35. Looks like Classic Editor refers to this plugin.

@adnan360

Looks like Classic Editor refers to this plugin.

Yes, you're right. WP has phased out the classic editor in favor of the new Gutenberg editor since v5. But for backwards compatibility they have kept it supported with this plugin in case something breaks, or someone needs it.

I have added some inline code, code blocks and a block quote (to test) to the posts in a new gist here. Unfortunately, the classic editor does not have a "code" button to create a code block, so I manually went into "Text" mode and typed the code inside a <pre> tag. Everything else is the same as in the previous gist I shared.

@curbengh
Contributor Author

curbengh commented Jul 25, 2020

Unfortunately, classic editor does not have a "code" button to create a code block. So I manually went into "Text" mode and typed in code within <pre> tag.

So the workaround !/<pre>/i.test(str) wouldn't work unless <pre> is manually inserted before running this plugin. I removed the workaround. It would be easier for users to fix this (i.e. add ```) after import.
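
For reference, the limitation of that check can be sketched as follows. How the removed guard was combined with the paragraph fix is an assumption here, since the diff is not shown in this thread; `shouldWrap` is a hypothetical name.

```javascript
// Hypothetical reconstruction: apply the paragraph fix only when the post
// contains neither <p> nor <pre>.
const shouldWrap = str => !/<p>/i.test(str) && !/<pre>/i.test(str);

// Code typed in the classic editor without <pre> looks like ordinary text,
// so the guard cannot protect it.
console.log(shouldWrap('const a = 1;\n\nconst b = 2;')); // true
// A bare <pre> tag is detected and the fix is skipped.
console.log(shouldWrap('<pre>const a = 1;</pre>')); // false
// A tag with attributes, as in the WP-Syntax sample above, is not matched.
console.log(shouldWrap('<pre lang="php">$a = 1;</pre>')); // true
```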

@curbengh curbengh marked this pull request as ready for review July 25, 2020 04:30
@adnan360

Let me know if I'm missing something. But I've tried this branch (npm install curbengh/hexo-migrator-wordpress#restore-paragraph --save) and this is what I got:

[Screenshot: wp-new-line-01]

On the left, the Hexo site shows the lines in one paragraph; on the right, you can see the original post (in Text mode), which shows that there is a new line. But Hexo still renders the lines together.

HTML source on Hexo page shows:

<p>This is a post written in classic WP editor. So this is the excerpt before more tag.</p>
<a id="more"></a>
<p>This is the post body after more tag. This is an <code>inline code</code> example on the classic editor. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Justo nec ultrices dui sapien. ...

@curbengh
Contributor Author

The workaround is not enabled by default; you need to run:

$ hexo migrate wordpress exported.xml --paragraph_fix

@adnan360

The workaround is not enabled by default; you need to run:

$ hexo migrate wordpress exported.xml --paragraph_fix

OK. Working now with the parameter.
I think we should change the parameter syntax from --paragraph_fix to --paragraph-fix. Most CLI programs I use follow this convention.

@curbengh
Contributor Author

I think we should change parameter syntax from --paragraph_fix to --paragraph-fix

Updated. I will also update --import_image to --import-image.
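
Once parsed, hyphenated flags are usually read with bracket notation. The following is a minimal sketch, assuming the parsed arguments arrive as a plain object keyed by flag name (the shape that parsers such as minimist produce); `isEnabled` is a hypothetical helper, and accepting the old underscore spelling is an illustration, not something this PR is confirmed to do.

```javascript
// Hypothetical helper: check a hyphenated flag, falling back to the older
// underscore spelling of the same name.
const isEnabled = (args, flag) =>
  Boolean(args[flag] || args[flag.replace(/-/g, '_')]);

// Assumed shape for: hexo migrate wordpress exported.xml --paragraph-fix
const args = { _: ['wordpress', 'exported.xml'], 'paragraph-fix': true };

console.log(isEnabled(args, 'paragraph-fix')); // true
console.log(isEnabled(args, 'import-image')); // false
```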

@curbengh curbengh merged commit e0bb437 into hexojs:master Jul 28, 2020
@curbengh curbengh deleted the restore-paragraph branch July 28, 2020 06:56
@curbengh
Contributor Author

curbengh commented Jul 28, 2020

The fix seems to work fine with a code block:

const TurndownService = require('turndown');
const tomd = new TurndownService({ headingStyle: 'atx', codeBlockStyle: 'fenced' });

const paragraph_fix = true;

const md = str => {
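  // #35: classic editor exports lack <p>, so wrap text separated by blank
  // lines in <p> elements to let turndown keep the paragraph breaks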
  if (paragraph_fix && !/<p>/i.test(str)) {
    str = '<p>' + str.replace(/(\r?\n){2}/g, '</p>\n\n<p>') + '</p>';
  }

  return tomd.turndown(str);
};
const content = `
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Aenean vel elit scelerisque mauris pellentesque. Dictumst quisque sagittis purus sit amet volutpat. Urna cursus eget nunc scelerisque viverra mauris in aliquam. Non enim praesent elementum facilisis leo vel fringilla est ullamcorper. Ultrices sagittis orci a scelerisque purus semper.

<pre><code>
const TurndownService = require('turndown');
const tomd = new TurndownService({ headingStyle: 'atx', codeBlockStyle: 'fenced' });

const paragraph_fix = true;

console.log(tomd);

</code></pre>

Quis blandit turpis cursus in hac. Massa enim nec dui nunc mattis enim ut tellus. Justo eget magna fermentum iaculis eu non. Facilisis gravida neque convallis a cras semper. Est velit egestas dui id ornare arcu odio ut sem. Justo eget magna fermentum iaculis eu.
`;

console.log(md(content));

Output:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Aenean vel elit scelerisque mauris pellentesque. Dictumst quisque sagittis purus sit amet volutpat. Urna cursus eget nunc scelerisque viverra mauris in aliquam. Non enim praesent elementum facilisis leo vel fringilla est ullamcorper. Ultrices sagittis orci a scelerisque purus semper.

```

const TurndownService = require('turndown');
const tomd = new TurndownService({ headingStyle: 'atx', codeBlockStyle: 'fenced' });

const paragraph_fix = true;

console.log(tomd);

```

Quis blandit turpis cursus in hac. Massa enim nec dui nunc mattis enim ut tellus. Justo eget magna fermentum iaculis eu non. Facilisis gravida neque convallis a cras semper. Est velit egestas dui id ornare arcu odio ut sem. Justo eget magna fermentum iaculis eu.

Successfully merging this pull request may close these issues.

paragraphs lost post-migration
5 participants