Token type `paragraph` becomes `text` in master #1247

vsemozhetbyt · 2018-05-02T12:42:47Z

Marked version: ~~0.3.19~~ master

Markdown flavor: CommonMark ?

Test Script

'use strict';

const marked = require('marked');

const md = `
A Paragraph.

> A blockquote

`;

const tokens = marked.lexer(md);
console.log(tokens);

const html = marked.parser(tokens);
console.log(`\n${html}`);

Expectation

The output from the old version 0.3.6

[ { type: 'paragraph', text: 'A Paragraph.' },
  { type: 'blockquote_start' },
  { type: 'paragraph', text: 'A blockquote' },
  { type: 'blockquote_end' },
  links: {} ]

<p>A Paragraph.</p>
<blockquote>
<p>A blockquote</p>
</blockquote>

Result

The output from the ~~0.3.19~~ master:

[ { type: 'text', text: 'A Paragraph.' },
  { type: 'space' },
  { type: 'blockquote_start' },
  { type: 'text', text: 'A blockquote' },
  { type: 'blockquote_end' },
  links: {} ]

<p>A Paragraph.</p>
<blockquote>
<p>A blockquote</p>
</blockquote>

Is this cange intended?

UziTech · 2018-05-02T13:20:48Z

v0.3.19 actually outputs the correct type, it is master that outputs 'text' instead of 'paragraph'

looks like the culprit is d08039e#diff-81ab0a5aa39b7a91951fc28d2d151496R36

Just adding that question mark back seems to fix this issue. I'm not sure if it introduces new issues.

@Feder1co5oave was that a mistake or is there a reason the question mark was removed?

looks like all tests pass when I just add the question mark back

UziTech · 2018-05-02T13:56:52Z

I added a PR that adds back the question mark and adds a test for this issue #1248

UziTech · 2018-05-02T14:25:10Z

@vsemozhetbyt thanks for reporting this 💯

Feder1co5oave · 2018-05-02T15:38:23Z

@UziTech it made sense to me at the time, because the types of token matched by the negative lookahead that follows (these are the ones capable of interrupting paragraphs) should be on a new line anyway, but it was a naive change.

The matching of one \n is implicit from the leading [^\n]+ that eats up everything up until the first newline. So in theory the ? should be stripped away, but in practice you cannot, because of the + quantifier at the end of that group.

paragraph: /^([^\n]+(?:\n(?!hr|heading|lheading| {0,3}>|<\/?(?:tag)(?: +|\\n|\/?>)|<(?:script|pre|style|!--))[^\n]+)+)/,
//                                                                                                             this ^

In the case of a one-line paragraph like in the reported example, the required newline (followed by some other [^\n]+ text on a new line) cannot be found.
So you either put back the ? or you change the final + to a *, the grammar should be the same (the latter makes more sense to me at the moment).

Also, I just noticed that the \\n after (?:tag) should really be a \n, it was a copy&paste mishap from the html rule above.

vsemozhetbyt changed the title ~~Token type paragraph becomes text in last versions~~ Token type paragraph becomes text in master May 2, 2018

UziTech mentioned this issue May 2, 2018

fix paragraph type #1248

Merged

4 tasks

UziTech added the L1 - broken Valid usage causes incorrect output OR a crash AND there is no known workaround for the issue label May 2, 2018

joshbruce closed this as completed in #1248 May 2, 2018

UziTech mentioned this issue May 2, 2018

fix paragraph #1249

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Token type `paragraph` becomes `text` in master #1247

Token type `paragraph` becomes `text` in master #1247

vsemozhetbyt commented May 2, 2018 •

edited

Loading

UziTech commented May 2, 2018 •

edited

Loading

UziTech commented May 2, 2018

UziTech commented May 2, 2018

Feder1co5oave commented May 2, 2018

Token type paragraph becomes text in master #1247

Token type paragraph becomes text in master #1247

Comments

vsemozhetbyt commented May 2, 2018 • edited Loading

Test Script

Expectation

Result

UziTech commented May 2, 2018 • edited Loading

UziTech commented May 2, 2018

UziTech commented May 2, 2018

Feder1co5oave commented May 2, 2018

Token type `paragraph` becomes `text` in master #1247

Token type `paragraph` becomes `text` in master #1247

vsemozhetbyt commented May 2, 2018 •

edited

Loading

UziTech commented May 2, 2018 •

edited

Loading