Skip to content

Conversation

@xoofx
Copy link
Owner

@xoofx xoofx commented Jun 28, 2016

Related to issue #17

This is still in wip.

  • support all CommonMark core components
    • wip
  • Add normalize options (limit text width....etc.)
  • support all extensions

@eamodio
Copy link

eamodio commented Oct 16, 2016

Is there a blocking issue on PR? Or is there anything we can help out with? This would be a great feature to have -- as I would love to be able to programmatically create a MarkdownDocument and be able to generate the markdown string from it.

@xoofx
Copy link
Owner Author

xoofx commented Oct 16, 2016

Nothing blocking, just my limited spare time and lower personal interest for this specific feature.

@eamodio
Copy link

eamodio commented Oct 16, 2016

I'd like too help out on it - could you provide any overview of what is left?

@xoofx
Copy link
Owner Author

xoofx commented Oct 17, 2016

@eamodio sure, you are welcome. The first round is to support the core CommonMark syntax. The current commit changes on my branch contains some work for this (but still some syntaxes are not supported/well supported, like escape characters - I can explain later how to implement it)
Then, once the core part is done, we can start to select candidate for extensions (prioritize first the extensions that are part of the Github Flavored Markdown like pipetables, strikethrough... etc.).
Let me know if you need more details.

@tthiery
Copy link

tthiery commented Sep 15, 2017

I just stumbled over this. I need something similar (I want to analyze the AST, fetch some stuff from it, and then serialize some parts of the AST (normalized or not).

What is the progress? How heavy would this be to complete it? 2 hours, 10 hours, 20 hours? Willing to help out.

@xoofx
Copy link
Owner Author

xoofx commented Sep 15, 2017

What is the progress? How heavy would this be to complete it? 2 hours, 10 hours, 20 hours? Willing to help out.

I have no spare-time to work on this (as I have no personal requirement for this feature), so no progress has been done since I started to prototype the idea. For the core Markdown features (CommonMark only), I guess it requires a few hours of work (with tests...etc.). For all the extensions, quite more.

@xoofx
Copy link
Owner Author

xoofx commented Sep 15, 2017

Feel free to branch from my original PR

@tthiery
Copy link

tthiery commented Sep 18, 2017

I forked the repo and the normalize branch and saw into what is missing. This is my view (correct me if I am wrong)

  • rebase to the HEAD of master works
  • Renderer for normalization exist in similar scale than for the HtmlRenderer
  • Most of the unit test cases are missing.
  • There are no test cases for normalization. But maybe that is already covered in the parser tests.
  • TODOs for
    • List
    • CodeInline
    • LineBreakInline
  • No options like
    • Limit Text Width

Questions @xoofx

  • Are the renderer - ignoring TODOs - covering the complete syntax?
  • Extensions are not covered. Maybe better as separate PR?
  • Do we need unit tests for the "normalization" test cases? Or is that covered as part of the parser unit tests?
  • Is there any list of "normalized" markdown? I checked the CommonMark spec for (as a sample) http://spec.commonmark.org/0.27/#thematic-breaksbut they do not specify a preferred sample.
  • What about LinkReferenceDefinition?
  • I found some bugs, check this commit file content

@xoofx
Copy link
Owner Author

xoofx commented Sep 18, 2017

About normalization, it is just options to modify the way is printed back to markdown (like limit the column of paragraphs). You don't have to worry too much of it at this point.

The original PR is very far from complete even for the renderers that were implemented. It was just a bootstrap to check how much work it would require and if there was any changes required to some existing classes (like handling the indent). Most likely there are missing TODO as well.

Are the renderer - ignoring TODOs - covering the complete syntax?

Can't remember the state of each, so you have to check them. The CommonMark part is quite limited, so it is only a few classes.

Extensions are not covered. Maybe better as separate PR?

yes

Do we need unit tests for the "normalization" test cases? Or is that covered as part of the parser unit tests?

we will (for the part specific with options activated, like limit column). One possible automated part for all tests is to output some markdown, re-feed it to the parser, re-output it and check that it doesn't shift.

Is there any list of "normalized" markdown?

I don't know, but "normalized" is nothing more than just a few rendering options. We have the freedom to add whatever is relevant for common use cases. We will see which one are valuable. Obviously, limit the width could have an impact on many renderers, so some normalization are more generally intrusive than others (e.g a normalization could work on list numbering for example).

What about LinkReferenceDefinition?

They should be part of the rendering as well... not sure exactly what do you imply?

I found some bugs, check this commit file content

Sure, again, the original PR was a very small proof of concept and in the early stage of a WIP.
For example, there will be may be some challenges with detecting characters that need to be escaped also (for the inline part)

- Code fences with attributes
- Escaping inline code
- Escaping link titles
- Encode hard line breaks
- HTML Entity emitting
- List intend.
@tthiery
Copy link

tthiery commented Sep 18, 2017

I have now unit tests for the straight forward things and fixed the bugs and TODOs so far.

I will go ahead by ...

  • cleaning the test suite
  • spec the normalization cases
  • not implementing the line breaking thing. Too complex for me right now ;) That will also go to an extra PR.

@tthiery
Copy link

tthiery commented Oct 20, 2017

@xoofx can you take a look at https://github.com/tthiery/markdig/blob/normalize/src/Markdig.Tests/TestNormalize.cs

I am curious about your opinion in testing AST to nomalized Markdown? I find the AST very hard to work with (I know.. damn youth... we enjoyed XLinq to much ;)). Is there a better way to annotate that? Or should we better base testing on alernative-to-normalizet text cases.

@xoofx
Copy link
Owner Author

xoofx commented Oct 20, 2017

I am curious about your opinion in testing AST to nomalized Markdown? I find the AST very hard to work with (I know.. damn youth... we enjoyed XLinq to much ;)). Is there a better way to annotate that? Or should we better base testing on alernative-to-normalizet text cases.

Yeah, it will be too laborious. I would prefer if we could test as I started by checking plain markdown text:

  • Input and renormalize ouput are equal, so just an AssertNormalize with only one input (that will be the output used to check the renormalize)
  • Input and renormalize output are different: same AssertNormalize with the two args (input and output)

@tthiery
Copy link

tthiery commented Oct 21, 2017

Okay, understood. How far should normalization go? I think I will start to support the variations offered by the AST (e.g. different fencing characters for a FencedCodeBlock). Later PR should focus on limiting this by (default) settings.

@xoofx
Copy link
Owner Author

xoofx commented Oct 24, 2017

Closing this issue as the first part (core CommonMark) is handled by the PR #154 which was a continuation of this PR.

Will open a separate issue to track the remaining work

@xoofx xoofx closed this Oct 24, 2017
@xoofx xoofx deleted the normalize branch October 25, 2017 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants