How to handle “virtual spacing” in the CST? #7
Comments
In a CST, would it be encoded as four spaces? My understanding of CSTs is that they always reflect the literal text on the page.
I think it would be present as a tab, so it would indeed reflect the literal text of the page, but there needs to be an “extra” thing that’s part of the indented code as its indent. Could maybe be an
🤔 Let's split this off into another discussion.
I'd tend to expect the indentation to always be in another node.
Not quite, one space after
Right! I’m thinking the nodes would look something like this:
Some more thoughts: The current attempt of micromark (the one checked in) is, I now believe, incorrect. It assumes Markdown can be parsed in blocks (which may work in an AST). A line is made up of block continuation and block openings: While tokenizing, this “big enough” can be accomplished by using a “conceptual” (non-real) character: a “virtual space”. Say we have: This does not affect tokens; if we’re doing something similar to mdast, the whitespace token should look something like:
This is now defined and solved in CMSM, by defining virtual spaces and content prefixes.
I’m going to post a couple of problems I foresee as I’m trying to wrap my head around what micromark will be.
Take the following example:
>␉␠indented.code("in a block quote")
It’s a block quote marker, followed by a tab (tabs are forced to be treated as four spaces).
The first “virtual space” of the tab is part of the block quote marker. The remaining three “virtual spaces” are part of the indent of the indented code.
Add the one extra real space and you’ve got a code indent of four spaces, making it proper indented code inside a block quote.
How is that represented as tokens? In a CST?
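To make the question concrete, here is a minimal sketch of one possible token layout, not micromark's actual API. It follows the simplification above that a tab always counts as four virtual spaces (a real tokenizer would size tabs by tab stop instead). The token names, the `Token` class, and the `virtual` field are all hypothetical. The idea is that the first token touching a tab owns the literal character, and later tokens covering the rest of that tab carry only a virtual width, so joining every token's text still reproduces the source line.

```python
from dataclasses import dataclass
from typing import List

TAB_WIDTH = 4  # assumption from the post: a tab counts as four virtual spaces


@dataclass
class Token:
    type: str     # hypothetical token name
    text: str     # literal source characters owned by this token (may be empty)
    virtual: int  # virtual spaces covered; 0 for non-whitespace tokens


def tokenize(line: str) -> List[Token]:
    """Split '>' + whitespace + content into marker, marker space, indent, content."""
    if not line.startswith(">"):
        raise ValueError("expected a block quote marker")

    # Expand whitespace after '>' into virtual cells: (source char, is first cell).
    cells = []
    i = 1
    while i < len(line) and line[i] in " \t":
        width = TAB_WIDTH if line[i] == "\t" else 1
        cells.extend((line[i], n == 0) for n in range(width))
        i += 1
    rest = line[i:]

    def take(kind: str, count: int) -> Token:
        nonlocal cells
        taken, cells = cells[:count], cells[count:]
        # Only the first cell of a character carries its literal text.
        text = "".join(ch for ch, first in taken if first)
        return Token(kind, text, len(taken))

    tokens = [Token("blockQuoteMarker", ">", 0)]
    if cells:
        tokens.append(take("blockQuoteMarkerSpace", 1))
    is_code = len(cells) >= 4
    if is_code:
        tokens.append(take("codeIndent", 4))
    if cells:
        tokens.append(take("whitespace", len(cells)))
    tokens.append(Token("codeText" if is_code else "text", rest, 0))
    return tokens


line = '>\t indented.code("in a block quote")'
for token in tokenize(line):
    print(token)
```

With this layout the tab character belongs to the block quote marker's space token, while the code indent token owns only the real space but still reports a width of four. Whether a CST should allow a token whose literal text is shorter than its virtual width is exactly the open design question.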