Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect behaviour for mixed list types? #437

Closed
aidantwoods opened this issue Oct 10, 2016 · 8 comments
Closed

Incorrect behaviour for mixed list types? #437

aidantwoods opened this issue Oct 10, 2016 · 8 comments
Milestone

Comments

@aidantwoods
Copy link
Collaborator

For the following data (I'll prepend this at the beginning of each section to avoid scrolling to compare with output):

* unordered, item 1
  * sub-unordered item 1
  * sub-unordered item 2
  1. sub-ordered item 1
  2. sub-ordered item 2
  * sub-unordered item 3
    1. sub-sub ordered item 1
* unordered, item 2

This data produces different results for each parser I've tested in, so I'm not clear on what the standard actually should be. That said my expectation would have been that, the parser would honour the solution closest the the user input.

This isn't a bug-report per-say, because I'm not entirely sure that it is a bug, but more a query to see if Parsedown is acting as expected?

The aforementioned expected result is as follows

Expected

Source

* unordered, item 1
  * sub-unordered item 1
  * sub-unordered item 2
  1. sub-ordered item 1
  2. sub-ordered item 2
  * sub-unordered item 3
    1. sub-sub ordered item 1
* unordered, item 2

Result

screen shot 2016-10-10 at 13 13 58

#### HTML
<ul>
  <li>unordered, item 1
    <ul>
      <li>sub-unordered item 1</li>
      <li>sub-unordered item 2</li>
    </ul>
    <ol>
      <li>sub-ordered item 1</li>
      <li>sub-ordered item 2</li>
    </ol>
    <ul>
      <li>sub-unordered item 3
        <ol>
          <li>sub-sub ordered item 1</li>
        </ol>
      </li>
    </ul>
  </li>
  <li>unordered, item 2</li>
</ul>

However, these are the results of tested parsers:

Parsedown

Source

* unordered, item 1
  * sub-unordered item 1
  * sub-unordered item 2
  1. sub-ordered item 1
  2. sub-ordered item 2
  * sub-unordered item 3
    1. sub-sub ordered item 1
* unordered, item 2

Result

screen shot 2016-10-10 at 13 13 37

#### HTML
<ul>
<li>unordered, item 1
<ul>
<li>sub-unordered item 1</li>
<li>sub-unordered item 2
<ol>
<li>sub-ordered item 1</li>
<li>sub-ordered item 2</li>
</ol></li>
<li>sub-unordered item 3
<ol>
<li>sub-sub ordered item 1</li>
</ol></li>
</ul></li>
<li>unordered, item 2</li>
</ul>

Markdown PHP 1.3

Source

* unordered, item 1
  * sub-unordered item 1
  * sub-unordered item 2
  1. sub-ordered item 1
  2. sub-ordered item 2
  * sub-unordered item 3
    1. sub-sub ordered item 1
* unordered, item 2

Result

screen shot 2016-10-10 at 13 13 34

#### HTML
<ul>
<li>unordered, item 1

<ul>
<li>sub-unordered item 1</li>
<li>sub-unordered item 2</li>
</ul>



<ol>
<li>sub-ordered item 1</li>
<li>sub-ordered item 2</li>
</ol>

<ul>
<li>sub-unordered item 3</li>
</ul>



<ol>
<li>sub-sub ordered item 1</li>
</ol></li>
<li>unordered, item 2</li>
</ul>

GitHub

Source

* unordered, item 1
  * sub-unordered item 1
  * sub-unordered item 2
  1. sub-ordered item 1
  2. sub-ordered item 2
  * sub-unordered item 3
    1. sub-sub ordered item 1
* unordered, item 2

Result

Obviously this one is horrendously wrong, still, it's a comparison point:
screen shot 2016-10-10 at 13 45 10

HTML

<ul>
<li>unordered, item 1

<ul>
<li>sub-unordered item 1</li>
<li>sub-unordered item 2</li>
<li>sub-ordered item 1</li>
<li>sub-ordered item 2</li>
<li>sub-unordered item 3

<ol>
<li>sub-sub ordered item 1</li>
</ol>
</li>
</ul>
</li>
<li>unordered, item 2</li>
</ul>
@PhrozenByte
Copy link
Contributor

PhrozenByte commented Oct 10, 2016

First of all: Perhaps the most serious problem of Markdown is, that it never really was standardized. The only common denominator of all Markdown parsers is a descriptive document from John Gruber and his original Markdown.pl parser as some sort of a reference parser. His description of Markdown is very ambiguous, partially even contradicts itself and even the reference parser violates its own rules in many ways. The result is, that all Markdown parsers behave more or less differently - especially in cases which aren't really covered by John Gruber's document or in cases in which Markdown.pl produces impractical results.

However, in the recent past some people started the CommonMark project to resolve these uncertainties. The project is very promising, it comes with both a pretty complete specification of Markdown and a (very slow) reference parser. Many Markdown parser projects have already declared that they will try to improve compliance to the CommonMark specs and Parsedown is no exception here.

Anyway, to cut a long story short, if there's a problem or open question about how Parsedown behaves or should behave, the CommonMark specs should be consulted.

In this case the expected result you're describing matches the specs (unfortunately there's no matching example in the specs, it results from the "of-the-same-type" rule). You can also have a look at the CommonMark reference parser. Therefore, yes, this should be considered as a bug.

@aidantwoods
Copy link
Collaborator Author

Okay, since it's a bug I'll take a look at a fix ;-)

It doesn't look too bad upon first glance, so I'll take a stab at it later on when I get a chance to test out some changes :)

@erusev
Copy link
Owner

erusev commented Oct 10, 2016

Great, as long as the solution doesn't conflict with CommonMark and doesn't add too much complexity, I'm in favor :)

@aidantwoods
Copy link
Collaborator Author

Getting the blockListContinue function to correctly identify one of the offending lists isn't too difficult a change.

i.e.

* unordered, item 1 (new list 1)
  * sub-unordered item 1 (new list 1.1)
  * sub-unordered item 2 
  + sub-unordered item 1 (new list 1.2)
  - sub-unordered item 1 (new list 1.3)
  1. sub-ordered item 1 (new list 1.4)
  2. sub-ordered item 2
  * sub-unordered item 4 (new list 1.5)
    1. sub-sub ordered item 1 (new list 1.5.1)
* unordered, item 2

Can produce the following
screen shot 2016-10-10 at 19 16 36

You'll notice the above isn't quite correct – a problem I'm having is checking whether the list nesting is the same as the list that has currently been started. I figured that $Block['indent'] === $Line['indent'] should be sufficient, however, it's not proving to be enough to determine list depth.

Any ideas on another piece of data I could use to determine the depth of the current line relative to the block? Or should the line indent vs. block indent be enough?

@erusev
Copy link
Owner

erusev commented Oct 10, 2016

@aidantwoods

I'm super busy these days working on @careteditor.

I could give it some thought later this month.

@PhrozenByte
Copy link
Contributor

@aidantwoods: See http://spec.commonmark.org/0.26/#example-255

A sublist must be indented the same number of spaces a paragraph would need to be in order to be included in the list item.

Thus comparing the block's and line's indent is completely fine to determine whether the list is continued or not, and whether a sublist is started or not.

fyi: You can use test/CommonMarkTest.php or test/CommonMarkTestWeak.php from #423 to test whether your changes comply to the CommonMark specs or not.

@aidantwoods
Copy link
Collaborator Author

@PhrozenByte
Oh yeah, I get that comparing the indent is inline with the spec (that's why I picked that method) – more my issue is that the code isn't behaving in the way I'm expecting it to regarding the reported indentation.

I'm creating a pull request, just so you can see the changes I've made to start this off – obviously it's not fit for merging yet.

@aidantwoods aidantwoods changed the title (Query) Is this the correct behaviour for mixed list types? Incorrect behaviour for mixed list types? Feb 26, 2018
@aidantwoods
Copy link
Collaborator Author

#439 is merged, closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants