Add `noMarkup` option to escape user HTML #224

hkdobrev · 2014-09-20T11:58:10Z

Resolves #106. Addresses #161.

This change introduces a new option - noMarkup. You could set it in the
setNoMarkup() method similar to the setBreaksEnabled() one.

Example usage:

<?php

$parsedown = new Parsedown();
$parsedown->setNoMarkup(true);
$parsedown->text('<div><strong>*Some text*</strong></div>');

// Outputs:
// <p>&lt;div>&lt;strong><em>Some text</em>&lts;/strong>&lt;/div></p>

Resolves #106. This change introduces a new option - `noMarkup`. You could set it the `setNoMarkup()` method similar to the `setBreaksEnabled()` one. Example usage: ``` php <?php $parsedown = new Parsedown(); $parsedown->setNoMarkup(true); $parsedown->text('<div><strong>*Some text*</strong></div>'); // Outputs: // <p><div><strong><em>Some text</em>&lts;/strong></div></p> ```

xPaw · 2014-09-20T12:18:52Z

Shouldn't it escape '>' like it does with '<'?

hkdobrev · 2014-09-20T12:31:42Z

@xPaw No need. http://stackoverflow.com/questions/9010678/html-should-i-encode-greater-than-or-not-gt

I think you'd need to escape < in an XML document.

It should also be escaped in attributes, but this is not relevant to this change.

I think Parsedown does not escape it for performance reasons and because there is no reason to do it.

xPaw · 2014-09-20T12:38:52Z

HTML spec suggests escaping it. See http://www.w3.org/html/wg/drafts/html/master/syntax.html#serializing-html-fragments

This includes & and ".

hkdobrev · 2014-09-20T13:34:49Z

@xPaw I believe this text in the spec is targeted towards user agents and other software parsing HTML or working with the DOM.

Even if Parsedown should escape <, this is not related to this pull request. This is the way Parsedown works regardless of the noMarkup option. You should probably discuss it here: #197 or open a new issue.

hkdobrev · 2014-09-20T14:28:02Z

BTW @javiereguiluz has raised the question about releasing in #106 (comment)

@erusev I'd like to add that if you want to be very strict about Semantic Versioning this should be released as a new minor version instead of a patch version, because it changes the interface and the major version is bigger than 0. It doesn't matter that it is backwards compatible.

Minor version Y (x.Y.z | x > 0) MUST be incremented if new, backwards compatible functionality is introduced to the public API.

Add `noMarkup` option to escape user HTML

erusev · 2014-09-21T20:20:16Z

@hkdobrev

I'd like to add that if you want to be very strict about Semantic Versioning this should be released as a new minor version instead of a patch version

👍

hkdobrev · 2014-09-21T20:27:56Z

@erusev I've just updated the wiki with the example usage.

https://github.com/erusev/parsedown/wiki/Usage

erusev · 2014-09-21T20:31:46Z

@hkdobrev I'm contemplating changing the name of the method to make it more consistent with the name of the other setter. Probably to setMarkupEscaped.

hkdobrev · 2014-09-21T20:38:14Z

@erusev I was considering the following names: disallowMarkup, setMarkupDisallowed, setEscapeMarkup, setDisallowHtml, setNoHtml.

I went for setNoMarkup for the following reasons:

It is consistent with what users have used with other parsers.
It is short and clear.
It is very clear that you are setting an option which is named noMarkup. In the case of setMarkupEscaped one could think you could pass it some markup which is already escaped. Generally such options should be named so it is clear they are options. Thus said I am not happy with breaksEnabled as well. It should be enableBreaks and the method would be setEnableBreaks.

erusev · 2014-09-21T23:21:51Z

@hkdobrev

The name needs to be an adjective, as it represents a boolean, and the attribute in question doesn't remove markup. It escapes it.

I appreciate the PR and I hope you wouldn't mind me making this change.

markseuffert · 2014-10-08T22:12:42Z

Does setMarkupEscaped(true) escape HTML elements and entities?

erusev · 2014-10-09T17:00:31Z

@markseu It does. You can test at http://parsedown.org/demo?set[MarkupEscaped]=1.

javiereguiluz · 2014-10-09T18:42:32Z

@markseu it does and you must be aware of this if you use the HTML generated by Parsedown afterwards. For instance, I use Parsedown to convert Markdown into HTML and then I use GeSHi to highlight the code listings. This is my workflow:

// use Parsedown to get HTML from Markdown
$parser = new \ParsedownExtra();
$parser->setMarkupEscaped(true);

$html = $parser->text($content);


// highlight code with GeSHi
$html = preg_replace_callback(
    '/^<pre><code( class="language\-(?<syntax>.*)")?>(?<source>.*)<\/code><\/pre>/Ums',
    'highlightCode',
    $html
);

The problem is that I need an ugly hack on highlightCode() function in order to unescape elements and entities. Otherwise, GeSHi would double escape them:

public function highlightCode($matches)
{
    $source = trim($matches['source']);

    // this is the ugly hack to avoid double escaping
    $source = str_replace(
        array('&lt;', '&gt;', '&amp;'),
        array('<',    '>',    '&'),
        $source
    );

    // highlight source code with GeSHi ...
}

This was referenced Sep 20, 2014

Safe Mode #161

Closed

Parse all html tags as normal strings #106

Closed

erusev added a commit that referenced this pull request Sep 21, 2014

Merge pull request #224 from hkdobrev/no-markup-option

e0965ce

Add `noMarkup` option to escape user HTML

erusev merged commit e0965ce into erusev:master Sep 21, 2014

hkdobrev deleted the no-markup-option branch September 21, 2014 20:23

hkdobrev mentioned this pull request Oct 9, 2014

Problem testing escaped HTML entity #235

Closed

JoostvanPinxten mentioned this pull request Oct 10, 2014

Make Parsedown options available GrahamCampbell/Laravel-Markdown#14

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `noMarkup` option to escape user HTML #224

Add `noMarkup` option to escape user HTML #224

hkdobrev commented Sep 20, 2014

xPaw commented Sep 20, 2014

hkdobrev commented Sep 20, 2014

xPaw commented Sep 20, 2014

hkdobrev commented Sep 20, 2014

hkdobrev commented Sep 20, 2014

erusev commented Sep 21, 2014

hkdobrev commented Sep 21, 2014

erusev commented Sep 21, 2014

hkdobrev commented Sep 21, 2014

erusev commented Sep 21, 2014

markseuffert commented Oct 8, 2014

erusev commented Oct 9, 2014

javiereguiluz commented Oct 9, 2014

Add noMarkup option to escape user HTML #224

Add noMarkup option to escape user HTML #224

Conversation

hkdobrev commented Sep 20, 2014

xPaw commented Sep 20, 2014

hkdobrev commented Sep 20, 2014

xPaw commented Sep 20, 2014

hkdobrev commented Sep 20, 2014

hkdobrev commented Sep 20, 2014

erusev commented Sep 21, 2014

hkdobrev commented Sep 21, 2014

erusev commented Sep 21, 2014

hkdobrev commented Sep 21, 2014

erusev commented Sep 21, 2014

markseuffert commented Oct 8, 2014

erusev commented Oct 9, 2014

javiereguiluz commented Oct 9, 2014

Add `noMarkup` option to escape user HTML #224

Add `noMarkup` option to escape user HTML #224