Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add template literal tag #26

Closed
wants to merge 3 commits into from
Closed

Add template literal tag #26

wants to merge 3 commits into from

Conversation

jablko
Copy link

@jablko jablko commented Sep 30, 2021

Initial checklist

  • I read the support docs
  • I read the contributing guide
  • I agree to follow the code of conduct
  • I searched issues and couldn’t find anything (or linked relevant results below)
  • If applicable, I’ve added docs and tests

Description of changes

What do you think about adding a template literal tag, perhaps like so?

It allows me to wrap an existing mdast AST in a Markdown "template". For example:

import { md } from "mdast-util-from-markdown";

const text = md`a*b`;
const emphasis = md`*${text}*`;
console.dir(emphasis, { depth: null });
Combined AST:
{
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        {
          type: 'emphasis',
          children: [
            {
              type: 'root',
              children: [
                {
                  type: 'paragraph',
                  children: [
                    {
                      type: 'text',
                      value: 'a*b',
                      position: {
                        start: { line: 1, column: 1, offset: 0 },
                        end: { line: 1, column: 4, offset: 3 }
                      }
                    }
                  ],
                  position: {
                    start: { line: 1, column: 1, offset: 0 },
                    end: { line: 1, column: 4, offset: 3 }
                  }
                }
              ],
              position: {
                start: { line: 1, column: 1, offset: 0 },
                end: { line: 1, column: 4, offset: 3 }
              }
            }
          ],
          position: {
            start: { line: 1, column: 1, offset: 0 },
            end: { line: 1, column: 14, offset: 13 }
          }
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 1, column: 14, offset: 13 }
      }
    }
  ],
  position: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 1, column: 14, offset: 13 }
  }
}

Possible approaches discussed in: syntax-tree/unist#54

The approach in this PR is tentatively mdast-util-directive: It interleaves the template strings with :expression, parses, then walks the result and swaps the :expression nodes for the template args.

@github-actions github-actions bot added the 👋 phase/new Post is being triaged automatically label Sep 30, 2021
@github-actions

This comment has been minimized.

@ChristianMurphy
Copy link
Member

related previous discussion on use of js template syntax unifiedjs/unified#40

@jablko jablko force-pushed the patch-1 branch 2 times, most recently from b3310ed to 2df3ea6 Compare September 30, 2021 00:59
@wooorm
Copy link
Member

wooorm commented Sep 30, 2021

Interesting stuff!

  • I don’t think directives should be used in this package — this package is in my opinion about being the simplest and smallest way to turn commonmark into an AST.
  • Is the goal of this to get an AST? To get HTML? Should it be at this level? micromark/here/remark/rehype/unified? Userland?
  • The visitor can be approved in several ways, before I give specific suggestions though, it might be good to read our guides https://unifiedjs.com/learn/recipe/tree-traversal/ and https://unifiedjs.com/learn/recipe/tree-traversal-typescript/
  • And last, tests and docs of course, but this depends on the earlier points

@jablko
Copy link
Author

jablko commented Sep 30, 2021

related previous discussion on use of js template syntax unifiedjs/unified#40

Thanks for this background!

Interesting stuff!

  • I don’t think directives should be used in this package — this package is in my opinion about being the simplest and smallest way to turn commonmark into an AST.

Makes sense.

  • Is the goal of this to get an AST? To get HTML? Should it be at this level? micromark/here/remark/rehype/unified? Userland?

My goal is to get an AST, so raw Markdown -> mdast AST.

Okay, will do.

  • And last, tests and docs of course, but this depends on the earlier points

I've now added a test, to solicit feedback. I'm very happy to write docs if this is acceptable in principle, but yeah depends on earlier points: Should it be at this level?

Copy link
Author

@jablko jablko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of constructing TextDirective nodes, then walking the tree and swapping them for the template args, I've now replaced the mdast extension with one that inserts the template args in the first place. This eliminates mdast-util-directive and the visitor.

Comment on lines +1164 to +1168
const parent = this.stack[this.stack.length - 1]
assert(parent, 'expected `parent`')
assert('children' in parent, 'expected `parent`')
// @ts-expect-error: Assume `Node` can exist as a child of `parent`.
parent.children.push(values.shift())
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from:

const parent = this.stack[this.stack.length - 1]
assert(parent, 'expected `parent`')
assert('children' in parent, 'expected `parent`')
// @ts-expect-error: Assume `Node` can exist as a child of `parent`.
parent.children.push(node)

@jablko
Copy link
Author

jablko commented Oct 1, 2021

I've now replaced micromark-extension-directive.

I first tried inserting events for expressions directly, something like:

  const prep = preprocess()
  const tokenize = parse().document().write
  const comp = compiler({
    mdastExtensions: [
      /** @type {Extension} */ (/** @type {unknown} */ ({expression}))
    ]
  })
  return comp(
    postprocess(
      strings.flatMap((string, i) => {
        const end = i === strings.length - 1
        const events = tokenize(prep(string, undefined, end))
        if (!end)
          events.push(
            /** @type {Event} */ (/** @type {unknown} */ (['expression']))
          )
        return events
      })
    )
  )

Based on: https://github.com/micromark/micromark/blob/d0a70fd29d5392835a3a5da2d9ae076cc756fab8/packages/micromark/dev/stream.js#L23-L25

However I found the resulting mdast (of md`*${text}*`) contained { type: 'text', value: '**' } vs. { type: 'emphasis', children: [text] }.

Next I tried inserting into chunks, something like:

  return comp(
    postprocess(
      strings.flatMap((string, i) => {
        const chunks = preprocess()(string, undefined, true)
        if (i < strings.length - 1) chunks[chunks.length - 1] = -6
        return tokenize(chunks)
      })
    )
  )

However I found if chunks is ['*', '*', codes.eof] I get { type: 'text', value: '**' }, and if it's ['*', -6, '*', codes.eof] (-6: an unassigned code) I get an unordered list.

So I'm back to inserting into the input, like interleaving the strings with :expression directives, except I picked an exotic Unicode character instead so it's unlikely to appear in the input already. U+FFFC OBJECT REPLACEMENT CHARACTER seemed like an okay choice?

What do you think?

@wooorm
Copy link
Member

wooorm commented Oct 8, 2021

U+FFFC OBJECT REPLACEMENT CHARACTER is pretty interesting! I think it’s okay to inject it, because it’s for tagged template strings that programmers are authoring themselves?

(i was reminded of this thread due to https://twitter.com/wooorm/status/1446353015389237265)

@wooorm
Copy link
Member

wooorm commented Oct 8, 2021

What’s the reason you’re focussing this on AST building, and not for example HTML or so building?

@wooorm
Copy link
Member

wooorm commented Dec 30, 2021

I’m definitely open to discussing this more—it’s quite interesting—but I don’t think this is the place to add such code.

@wooorm wooorm closed this Dec 30, 2021
@wooorm wooorm added the 🙅 no/wontfix This is not (enough of) an issue for this project label Dec 30, 2021
@github-actions

This comment has been minimized.

@github-actions github-actions bot added 👎 phase/no Post cannot or will not be acted on and removed 👋 phase/new Post is being triaged automatically labels Dec 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🙅 no/wontfix This is not (enough of) an issue for this project 👎 phase/no Post cannot or will not be acted on
Development

Successfully merging this pull request may close these issues.

None yet

3 participants