Skip to content

RFC for native markdown parsing on console #116

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 4, 2018

Conversation

adityapatwardhan
Copy link
Member

No description provided.

## Specification

This RFC proposes to use VT100 escape sequences to render markdown content.
`ConvertFrom-Markdown` cmdlet as part of `Microsoft.PowerShell.MarkdownRender` PowerShell module would consume a string or a file path and output a VT100 encoded string.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we introduce new module? We expect still many new cmdlets wil be created?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this can be moved to Microsoft.PowerShell.Utility

This RFC proposes to use VT100 escape sequences to render markdown content.
`ConvertFrom-Markdown` cmdlet as part of `Microsoft.PowerShell.MarkdownRender` PowerShell module would consume a string or a file path and output a VT100 encoded string.
Optionally, the output can be formatted as HTML.
For converting strings to VT100 coded strings, we will be writing an extension to [markdig](https://github.com/lunet-io/markdig).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

markdig already implement HTML renderer - nobody ask them about plans to implement VT100 renderer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

markdig has a good extendable architecture and they prefer extensions. We may publish a nuget package of the extension we write. so it can be used by others.

For converting strings to VT100 coded strings, we will be writing an extension to [markdig](https://github.com/lunet-io/markdig).
The extension will insert VT100 escape sequences as appropriate.

```PowerShell
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code block should be commented.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iSazonov Could you please elaborate on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems it is out of context - text above is not related to the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it thanks.

- **AsHTML** : When selected, the output will be HTML.
- **AsPlainText** : When selected, the output will be plain text.

The default output will be a string encoded with VT100 escape sequences.
Copy link
Contributor

@iSazonov iSazonov Mar 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cmdlet is not intended solely for the console output. So I guess users can expect another default (more popular HTML).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer suggestion from @DerpMcDerp, I will update the RFC to state that, ConvertFrom-Markdown returns an object, and the default formatter renders it as markdown.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConvertFrom-Markdown could return a PSObject with three properties

  1. Property Tokens that holds the tokens (ast?) parsed by markdig
  2. Property Html that holds the converted html text if -AsHTML is specified
  3. Property VT100EncodedString that holds the escape sequences.

The PSObject has a special PSTypeName that can be recognized by formatting system, and render it using the VT100 escape sequences when

  • VT100EncodedString is not null, and, host support VT100 sequences

Otherwise, it will be rendered in a list view with the available properties.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we could return an object. And replace AsPlainText with AsVT100Encoded.
If we use simplified markdown scheme why we need property Tokens? It seems useless.
Having two results (Html and VT100EncodedString) at once is too costly. I suggest use methods or one property Value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace AsPlainText with AsVT100Encoded

I agree.

why we need property Tokens

I think we will get the parsed tokens/Asts anyway during the conversion. @DerpMcDerp can you give an example where the parsed markdown ASTs may be useful?

Having two results (Html and VT100EncodedString) at once is too costly

We don't generate two results in one go. By default, only Html is populated; if -AsVT100Encoded is specified, then VT100EncodedString is populated. This is similar to MeasureInfo objects produced from Measure-Object.


## Supported Markdown Elements

We will be supporting a limited set of markdown elements in the initial version.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the completeness of the RFC I would expect that we will describe the escape sequences for all markdown elements to have an accurate parity with that the markdig supports for HTML. (We are free implement its step-by-step).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RFC describes escape sequences for all the markdown elements that will be supported in this version. Further work on the extension might have another RFC, if the changes are significant enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree if changing already implemented escapes will not be a breaking change.


## Open Questions

- Investigate line wrapping behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have related open question - pager.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will add that to open question.

@DerpMcDerp
Copy link

I think it would be better for markdown to:

  1. output a stream of objects.
enum Kind {
	Plain,
	Plain_,
	_Header1,
	Header1,
	Header1_,
	_LinkAlt,
	LinkAlt,
	LinkAlt_,
	_Link,
	Link,
	Link_
// etc.
}

struct Region {
	int Begin;
	int End;
}

class MarkdownToken {
	string Parent;
	Kind Kind;
	Region Region;
	
}

So:


Hey
# Foo

Would output:

Plain: "Hey"
Plain_: "\n"
_Header1: "# "
Header1: "Foo"
Header1_: "\n"
  1. Remove -AsHTML and -AsPlainText and use ConvertFrom-Markdown to convert a markdown stream to html.

  2. The default console formatter cmdlet should render the above output as formatted text

  3. The console formatter cmdlet should not strip away formatting symbols by default. e.g.

## Foo

would render as:

ESC[4;93m## Foo 1ESC[0m

rather than:

ESC[4;93mFoo 1ESC[0m


- **Path** : Accepts an array of file paths with markdown content.
- **LiteralPath** : Accepts an array of literal paths with markdown content.
- **InputObject** : Accepts an InputObject of `System.IO.FileSystemInfo`, `string` type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why FileSystemInfo object? It won't be able to process directory, so FileInfo should be good here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, FileInfo should be sufficient.


#### Headers

VT100 escape sequences for headers are as follows:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the escape sequences for different headers should be configurable, because users may have different background colors for their hosts, and some pre-defined color won't work well with that background.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we don't need to give the granularity that every markdown header color can be configured. Instead, we can provide 2 or 3 themes that would work for dark, light backgrounds respectively.

#### Links

Links in paragraphs will keep the link label and append the link URL surrounded by parenthesis.
The link will be colored to differentiate from plain text.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to have the text in [] quoted with "", otherwise there won't be a way to know what is the exact text that refers to the link. For example: this is a long text that refers to a link should be rendered as "this is a long text that refers to a link"(https://www.bing.com)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea. I will update.

@adityapatwardhan
Copy link
Member Author

@iSazonov @DerpMcDerp @daxian-dbw your comments have been addressed.


Set-MarkdownOption [-Header1Color <ConsoleColor>] [-Header2Color <ConsoleColor>] [-Header3Color <ConsoleColor>] [-Header4Color <ConsoleColor>] [-Header5Color <ConsoleColor>] [-Header6Color <ConsoleColor>] [-CodeBlockForegroundColor <ConsoleColor>] [-CodeBlockBackgroudColor <ConsoleColor>] [-ImageAltTextForegroundColor <ConsoleColor>] [-LinkForegroundColor <ConsoleColor>] [-ItalicsForegroundColor <ConsoleColor>] [<CommonParameters>]

Set-MarkdownOption [-Theme <MarkdownThemeType>] [<CommonParameters>]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think about custom themes - using ThemeName as string is more general.
If we store the theme type/name in powershell.config.json I believe the cmdlet is unnecessary - I'd prefer to have one universal cndlet to edit powershell.config.json (out of the RFC).
I believe Set-MarkdownOption [-Header1Color <ConsoleColor>] ... is unnecessaty too - how often should users change a theme? I suppose very seldom. If we store a theme in a simple file (json/yaml?) - why we don't address the file format/place in the RFC? - users can manually edit it easily.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we store the theme type/name in powershell.config.json

Having the theme setting in powershell.config.json doesn't seem necessary. I think having Set-MarkdownOption -Theme <name> in your profile would work.

If we store a theme in a simple file (json/yaml?) - why we don't address the file format/place in the RFC?

Do you mean to allow a user to alter a theme and create their own theme?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modelled it after Set-PSReadLineOption. And as @daxian-dbw said, all the customization can be added to the profile.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to allow a user to alter a theme and create their own theme?

Yes.

all the customization can be added to the profile.

I understand. I still think that customizing themes is a rare scenario and creating special cmdlets is redundant. It is also easier to create and distribute custom themes in a file than in a profile.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iSazonov I have updated the RFC to supported importing / exporting the settings to/from a file.

- Investigate line wrapping behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As currently we discussed (@SteveL-MSFT ) tabs is problematic - we should ignore a console settings and do rendering in PowerShell consolehost internally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "line wrapping" here refers to semantic line breaks?

Copy link
Member Author

@adityapatwardhan adityapatwardhan Mar 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some investigation, we do not have to special case for this. This is parsed automatically. I will remove this.


```

The properties can be individually to customized the rendering on console.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar: The properties can individually customize the rendering on the console.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Code blocks will be rendered with a lighter background color and a dark foreground text.
Background color will be applied for the console width.
No syntax highlighting will be done.
`ESC[500@` is used to have sufficiently wide background color.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of wide, you mean contrasting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually mean wide, so that the background spans the console width and not just the code string width.

@adityapatwardhan
Copy link
Member Author

@SteveL-MSFT Your feedback has been addressed.

2. Adds a new dependency of `markdig` to `System.Management.Automation`.

## Open Questions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantic linebreaks should be called out explicitly if supported or not

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a line the Paragraphs and line breaks section.

@adityapatwardhan
Copy link
Member Author

@SteveL-MSFT This RFC is ready for review by committee.

Copy link
Contributor

@joeyaiello joeyaiello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a couple open questions


Set-MarkdownOption -InputObject <psobject> [<CommonParameters>]

```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PSReadline 2.0 is moving away from the ConsoleColor and towards VT100 color sequences. If VT100 is already required for this Markdown support to work, might we want to do the same?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.


```

Export the current markdown settings to a file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of file are we exporting/importing the options as? .psd1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a JSON file, I will update to clarify.

@joeyaiello
Copy link
Contributor

@PowerShell/powershell-committee reviewed this today. A couple notes:

  • it'd be nice to see a full image example with all the different token renderings so we can see them next to each other
  • there should be a mention of the help system scenario given that it's our primary motivation for this work
  • the right verb should be Show instead of ConvertTo

@daxian-dbw
Copy link
Member

@joeyaiello I'm afraid ConvertFrom- is still needed as the purpose is to convert a markdown to an object and write that object to the pipeline. The object has the properties Tokens (hold the asts generated from parsing the markdown file), HTML (hold the html content representing the markdown) and VT100EncodedString (hold the VT100 sequence representing the markdown) properties.

We can consider whether or not Show-Markdown should be added in through.

Copy link
Contributor

@iSazonov iSazonov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave a comment

The `Tokens` property has the AST for the markdown document from `markdig`.
The `Html` property has the markdown document converted to `HTML`.
The `VT100EncodedString` property has the markdown documented with `VT100` escape sequences.
By default, the `Tokens`and `Html` properties will be populated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I uderstand we'll have Tokens always.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I means we should change the sentence:

By default, the `Html` property will be populated. The `Tokens` property is populated in all cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. Updated.

## Alternate Proposals and Considerations

Introduce a new sigil to define a markdown string.
This would be later rendered using formating files.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we plan to add [MarkDown] type to automatically format output strings?

[MarkDown] $var = " ... "
$var
$var | Out-Default

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a type accelerator [Markdown] is not in scope for this RFC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment is not about the accelerator - it is about the type so we can use ETS and auto-formatting.

Copy link
Member Author

@adityapatwardhan adityapatwardhan Apr 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the cmdlet returns a strongly typed object. I will add a line about it.

@adityapatwardhan
Copy link
Member Author

I have updated the RFC to address the committee's feedback.

```

Import the markdown settings from the specified file and returns a PSObject of the options.
This can be used as an `InputObject` for `Set-MarkdownOption`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we could add Import-MarkdownOption -ApplyOptions vs Import-MarkdownOption | Set-MarkdownOption.


#### Output Type

The output type will be `MarkdownInfo` object with properties for `Html`, `VT100EncodedString` and `Tokens`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With having MarkdownInfo I'd expect follow works (using ETS):

<command returns MarkdownInfo> | Out-Default
<command returns MarkdownInfo> | Out-Markdown
``

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will have a default formatter. Maybe that calls Show-Markdown.


```

Render the `VT100EncodedString` property of `MarkdownInfo` on console.
Copy link
Contributor

@iSazonov iSazonov Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related my previous comment - What about Out-Default, Out-Markdown and ETS?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With a default formatter, Out-Default will work.


```

Export the current markdown settings to a JSON file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we default JSON file? Where?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have a default JSON file. Values provided to Set-MarkdownOption via options or Import-MarkdownOption will be applied.

@iSazonov
Copy link
Contributor

iSazonov commented Apr 3, 2018

@joeyaiello
Copy link
Contributor

The @PowerShell/powershell-committee is generally okay with the functionality here, but we'd like a few details spec'd/documented here:

  • consider -PassThru for Set-MarkdownOption
  • process-persistent vs. runspace-persistent Markdown options
  • output type of Get-MarkdownOption

In the meantime, we're moving this to experimental to allow @adityapatwardhan to continue prototyping this work.

@joeyaiello joeyaiello merged commit b276b9a into PowerShell:master Apr 4, 2018
@felixfbecker
Copy link

What is the use case for having a VT100EncodedString property? If its sole purpose is for pretty output formatting in terminal hosts, it should be done through a formatting file (Format.ps1xml), not through the data object. I.e. this is the way this object should always be displayed, while the raw markdown or HTML is accessible through a property or by piping into a Convert cmdlet.

@adityapatwardhan
Copy link
Member Author

@felixfbecker Eventually the help system will be updated to use and display markdown. This will be used for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants