Add support for normalizing Markdown #32

xoofx · 2016-06-28T14:58:19Z

Related to issue #17

This is still in wip.

support all CommonMark core components
- wip
Add normalize options (limit text width....etc.)
support all extensions

eamodio · 2016-10-16T18:31:00Z

Is there a blocking issue on PR? Or is there anything we can help out with? This would be a great feature to have -- as I would love to be able to programmatically create a MarkdownDocument and be able to generate the markdown string from it.

xoofx · 2016-10-16T19:05:04Z

Nothing blocking, just my limited spare time and lower personal interest for this specific feature.

eamodio · 2016-10-16T20:41:36Z

I'd like too help out on it - could you provide any overview of what is left?

xoofx · 2016-10-17T14:36:15Z

@eamodio sure, you are welcome. The first round is to support the core CommonMark syntax. The current commit changes on my branch contains some work for this (but still some syntaxes are not supported/well supported, like escape characters - I can explain later how to implement it)
Then, once the core part is done, we can start to select candidate for extensions (prioritize first the extensions that are part of the Github Flavored Markdown like pipetables, strikethrough... etc.).
Let me know if you need more details.

tthiery · 2017-09-15T11:59:51Z

I just stumbled over this. I need something similar (I want to analyze the AST, fetch some stuff from it, and then serialize some parts of the AST (normalized or not).

What is the progress? How heavy would this be to complete it? 2 hours, 10 hours, 20 hours? Willing to help out.

xoofx · 2017-09-15T12:23:32Z

What is the progress? How heavy would this be to complete it? 2 hours, 10 hours, 20 hours? Willing to help out.

I have no spare-time to work on this (as I have no personal requirement for this feature), so no progress has been done since I started to prototype the idea. For the core Markdown features (CommonMark only), I guess it requires a few hours of work (with tests...etc.). For all the extensions, quite more.

xoofx · 2017-09-15T13:34:11Z

Feel free to branch from my original PR

tthiery · 2017-09-18T18:43:15Z

I forked the repo and the normalize branch and saw into what is missing. This is my view (correct me if I am wrong)

rebase to the HEAD of master works
Renderer for normalization exist in similar scale than for the HtmlRenderer
Most of the unit test cases are missing.
There are no test cases for normalization. But maybe that is already covered in the parser tests.
TODOs for
- List
- CodeInline
- LineBreakInline
No options like
- Limit Text Width

Questions @xoofx

Are the renderer - ignoring TODOs - covering the complete syntax?
Extensions are not covered. Maybe better as separate PR?
Do we need unit tests for the "normalization" test cases? Or is that covered as part of the parser unit tests?
Is there any list of "normalized" markdown? I checked the CommonMark spec for (as a sample) http://spec.commonmark.org/0.27/#thematic-breaksbut they do not specify a preferred sample.
What about LinkReferenceDefinition?
I found some bugs, check this commit file content

xoofx · 2017-09-18T19:25:56Z

About normalization, it is just options to modify the way is printed back to markdown (like limit the column of paragraphs). You don't have to worry too much of it at this point.

The original PR is very far from complete even for the renderers that were implemented. It was just a bootstrap to check how much work it would require and if there was any changes required to some existing classes (like handling the indent). Most likely there are missing TODO as well.

Are the renderer - ignoring TODOs - covering the complete syntax?

Can't remember the state of each, so you have to check them. The CommonMark part is quite limited, so it is only a few classes.

Extensions are not covered. Maybe better as separate PR?

yes

Do we need unit tests for the "normalization" test cases? Or is that covered as part of the parser unit tests?

we will (for the part specific with options activated, like limit column). One possible automated part for all tests is to output some markdown, re-feed it to the parser, re-output it and check that it doesn't shift.

Is there any list of "normalized" markdown?

I don't know, but "normalized" is nothing more than just a few rendering options. We have the freedom to add whatever is relevant for common use cases. We will see which one are valuable. Obviously, limit the width could have an impact on many renderers, so some normalization are more generally intrusive than others (e.g a normalization could work on list numbering for example).

What about LinkReferenceDefinition?

They should be part of the rendering as well... not sure exactly what do you imply?

I found some bugs, check this commit file content

Sure, again, the original PR was a very small proof of concept and in the early stage of a WIP.
For example, there will be may be some challenges with detecting characters that need to be escaped also (for the inline part)

- Code fences with attributes - Escaping inline code - Escaping link titles - Encode hard line breaks - HTML Entity emitting - List intend.

tthiery · 2017-09-18T20:25:30Z

I have now unit tests for the straight forward things and fixed the bugs and TODOs so far.

I will go ahead by ...

cleaning the test suite
spec the normalization cases
not implementing the line breaking thing. Too complex for me right now ;) That will also go to an extra PR.

tthiery · 2017-10-20T14:17:16Z

@xoofx can you take a look at https://github.com/tthiery/markdig/blob/normalize/src/Markdig.Tests/TestNormalize.cs

I am curious about your opinion in testing AST to nomalized Markdown? I find the AST very hard to work with (I know.. damn youth... we enjoyed XLinq to much ;)). Is there a better way to annotate that? Or should we better base testing on alernative-to-normalizet text cases.

xoofx · 2017-10-20T15:32:45Z

I am curious about your opinion in testing AST to nomalized Markdown? I find the AST very hard to work with (I know.. damn youth... we enjoyed XLinq to much ;)). Is there a better way to annotate that? Or should we better base testing on alernative-to-normalizet text cases.

Yeah, it will be too laborious. I would prefer if we could test as I started by checking plain markdown text:

Input and renormalize ouput are equal, so just an AssertNormalize with only one input (that will be the output used to check the renormalize)
Input and renormalize output are different: same AssertNormalize with the two args (input and output)

tthiery · 2017-10-21T15:27:06Z

Okay, understood. How far should normalization go? I think I will start to support the variations offered by the AST (e.g. different fencing characters for a FencedCodeBlock). Later PR should focus on limiting this by (default) settings.

# Conflicts: # src/Markdig.Tests/Markdig.Tests.csproj

Add support for normalize (core CommonMark for now)

xoofx · 2017-10-24T15:40:34Z

Closing this issue as the first part (core CommonMark) is handled by the PR #154 which was a continuation of this PR.

Will open a separate issue to track the remaining work

xoofx added 3 commits June 28, 2016 10:53

wip normalize (issue #17)

9d52732

Add normalize renderers for core CommonMark components (wip #17)

6ac2429

Try to workaround tight/loose lists and paragraphs (wip #17)

b20b111

xoofx added the enhancement label Jun 28, 2016

This was referenced Sep 12, 2017

Added EnableHtmlForBlock flag on HtmlRenderer for issue #104 #140

Merged

Strip markdown #104

Closed

xoofx and others added 4 commits September 17, 2017 11:07

wip normalize (issue #17)

d58db53

Add normalize renderers for core CommonMark components (wip #17)

3cc405b

Try to workaround tight/loose lists and paragraphs (wip #17)

218a094

Add straight forward unit tests.

0c40895

Fixed issues found in straight forward normalization test cases

1bb35c5

- Code fences with attributes - Escaping inline code - Escaping link titles - Encode hard line breaks - HTML Entity emitting - List intend.

tthiery added 3 commits September 18, 2017 23:04

Relabeled unit tests.

34579b5

Add finish block method and adjusted naming in test suite.

c633926

Add test approach for AST based arrange.

6002195

tthiery and others added 3 commits October 24, 2017 15:51

Remove incomplete UnorderedList syntax test case

7174e32

Add normalization tests for space between leaves.

72c2c06

Merge tag 'v0.13.4' into normalize

fc8adc7

# Conflicts: # src/Markdig.Tests/Markdig.Tests.csproj

tthiery and others added 3 commits October 24, 2017 16:53

Merge remote-tracking branch 'upstream/normalize' into normalize

e2b3f81

Review Fix: Make IntLog10Fast method private static.

59630ae

Merge pull request #154 from tthiery/normalize

4cf4bfb

Add support for normalize (core CommonMark for now)

xoofx closed this Oct 24, 2017

xoofx mentioned this pull request Oct 24, 2017

Add support for normalize - markdown document rountrip #155

Open

35 tasks

xoofx deleted the normalize branch October 25, 2017 15:11

Uh oh!

Add support for normalizing Markdown #32

Add support for normalizing Markdown #32

Uh oh!

Conversation

xoofx commented Jun 28, 2016

Uh oh!

eamodio commented Oct 16, 2016

Uh oh!

xoofx commented Oct 16, 2016

Uh oh!

eamodio commented Oct 16, 2016

Uh oh!

xoofx commented Oct 17, 2016

Uh oh!

tthiery commented Sep 15, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

xoofx commented Sep 15, 2017

Uh oh!

xoofx commented Sep 15, 2017

Uh oh!

tthiery commented Sep 18, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

xoofx commented Sep 18, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tthiery commented Sep 18, 2017

Uh oh!

tthiery commented Oct 20, 2017

Uh oh!

xoofx commented Oct 20, 2017

Uh oh!

tthiery commented Oct 21, 2017

Uh oh!

xoofx commented Oct 24, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

tthiery commented Sep 15, 2017 •

edited

Loading

tthiery commented Sep 18, 2017 •

edited

Loading

xoofx commented Sep 18, 2017 •

edited

Loading