Implement template literals #1715

Merged
merged 15 commits into from
Mar 26, 2021

dcodeIO
Member

@dcodeIO dcodeIO commented Mar 6, 2021

Implements parsing and compilation of (tagged) template literals:

  • Literals without substitutions and without a tag become static (multi-line) strings, just like before
  • Literals with substitutions but without a tag become a StaticArray<string> with placeholders between the string parts for the substitution values, that is then join("")ed
  • If a literal does not have a tag, non-string arguments are implicitly converted to strings. Otherwise, the tag function receives values of the original types.
  • Other literals become calls to the respective tag function, with the first argument being either
    • a TemplateStringsArray including its .raw property when the function takes a TemplateStringsArray
    • an Array<string> without a .raw property if the first argument is of another type to avoid adding unnecessary static data for the .raw property
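The difference between the untagged and tagged cases can be illustrated in plain TypeScript. This is only a sketch of the semantics described in the list above, not compiler output; the tag function `describe` and the variable names are hypothetical.

```typescript
// Hypothetical tag function: it receives the substitution values with their
// original types (a number and a boolean here), not strings.
function describe(strings: TemplateStringsArray, a: number, b: boolean): string {
  return strings[0] + typeof a + strings[1] + typeof b + strings[2];
}

const count: number = 3;
const done: boolean = true;

// Untagged: non-string substitutions are implicitly converted to strings.
const untagged: string = `count=${count}, done=${done}`;

// Tagged: the tag function sees the number and the boolean as they are.
const tagged: string = describe`count=${count}, done=${done}`;
```

Here `untagged` is "count=3, done=true", while `tagged` is "count=number, done=boolean", since the tag inspects the types of the values it was given.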

One difference from JS: if a tagged template literal contains invalid escape sequences, which is allowed since Template Literals Revision, we do not actually insert an undefined into the TemplateStringsArray but keep the broken string. This is a bit questionable, as it avoids unnecessary null checks at the expense of not doing exactly what the spec says, solely justified by "but we don't have undefined". Thoughts?

  • I've read the contributing guidelines

@dcodeIO
Member Author

dcodeIO commented Mar 6, 2021

Turns out there's an edge case here with tagged template literals: the array has a .raw property containing the original strings with escape sequences left unprocessed, and any invalid escapes result in undefined array elements within the first argument (the restriction to error out on these was lifted as per the referenced proposal). The undefined is unfortunate, as we'd have to use a null there, and the .raw property requires wiring up an array extension. Any preferences on what to do? (Currently, invalid escapes are piped through without yielding an undefined, and there is no .raw yet.)
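For reference, the cooked versus raw distinction in standard JS/TS can be sketched as follows. The tag `inspect` is hypothetical, and the example deliberately uses a valid escape only; it does not exercise the invalid-escape case discussed above.

```typescript
// Hypothetical tag that reports the cooked and raw forms of the first string part.
function inspect(strings: TemplateStringsArray, ...values: number[]) {
  return {
    cooked: strings[0],   // escape sequences processed
    raw: strings.raw[0],  // source text as written, escapes untouched
    valueCount: values.length,
  };
}

const parts = inspect`a\n${1}`;
// parts.cooked is "a" followed by a newline character (2 characters).
// parts.raw is "a", a backslash, and "n" (3 characters).
```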

@MaxGraey
Member

MaxGraey commented Mar 7, 2021

How about just throwing an error for any invalid escapes for now?

@dcodeIO
Member Author

dcodeIO commented Mar 7, 2021

Interesting aspect is that the proposal (status) there tries to avoid exactly that, and it's already half-way implemented now so we may as well see what we can do.

@MaxGraey
Member

MaxGraey commented Mar 7, 2021

Hmm, but that proposal has been at Stage 1 for a long time.

@dcodeIO
Member Author

dcodeIO commented Mar 7, 2021

That's the state on the repo, yeah, but it seems to actually be a finished proposal.

@dcodeIO
Member Author

dcodeIO commented Mar 8, 2021

Also still thinking about codegen here. With tagged template literals using a user-defined function of the form (strings: TemplateStringsArray, arg1, arg2, arg3, ...) => any, which we'll want to support eventually, my intuition would be to also generate a default function for non-tagged template literals. That'd be peak performance I think, trading (significantly) larger code size for fewer loops and less static memory, then boiling down to mostly memcpy/itoa/dtoa into a new string allocation, only doing additional dynamic allocs where .toString on a value cannot be avoided. The alternative could be a catch-all function (strings: TemplateStringsArray, values: string[]) => string with all values pre-toString-ed, trading one intermediate dynamic alloc for each non-string value, stored into a static memory segment for the values, and several loops for (significantly) smaller code size.
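The two options weighed above could look roughly like this in plain TypeScript. Both function names are hypothetical and neither is taken from the compiler; the sketch only contrasts a function specialized for one literal shape with a single catch-all function over pre-converted values.

```typescript
// Option A (hypothetical): a function generated for one literal shape,
// here `${string} is ${number} years old`. Values are converted in place.
function concatStringNumber(s0: string, v0: string, s1: string, v1: number, s2: string): string {
  return s0 + v0 + s1 + v1.toString() + s2;
}

// Option B (hypothetical): one catch-all function shared by every literal.
// The caller converts all values to strings beforehand.
function concatAll(strings: readonly string[], values: readonly string[]): string {
  let out = strings[0];
  for (let i = 0; i < values.length; i++) {
    out += values[i] + strings[i + 1];
  }
  return out;
}

const personName = "Ada";
const personAge = 36;

const viaSpecialized = concatStringNumber("", personName, " is ", personAge, " years old");
const viaCatchAll = concatAll(["", " is ", " years old"], [personName, personAge.toString()]);
```

Both calls produce "Ada is 36 years old". Option A needs one such function per literal shape, which is where the larger code size comes from; option B needs one loop and one extra string per non-string value.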

@MaxGraey
Member

MaxGraey commented Mar 8, 2021

I suggest a tradeoff: generate separate versions for arities below a threshold of 6-8 arguments, similar to monomorphisation, but fall back to a single values: string[] argument when arity >= (6-8) args. We could also check the size of the body and shrinkLevel to adjust the threshold.

Actually, this approach would be great for any variadic function (except exported ones, I guess).

@dcodeIO
Member Author

dcodeIO commented Mar 10, 2021

Now compiles untagged template literals with substitutions to a static StaticArray<string>, leaving holes for the dynamic parts and setting them before calling staticArray.join(""). Somewhat of a compromise that integrates well with the rest of stdlib by means of code reuse (ends up utilizing joinStringArray).
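As a rough illustration of this lowering, written in plain TypeScript with a regular array standing in for the static StaticArray<string>: the names `helloParts` and `renderHello` are hypothetical, and the actual generated code may differ.

```typescript
// Stand-in for the static array emitted for
// `Hello ${name}, you have ${n} items`:
// string parts at even indices, holes (empty strings) at odd indices.
const helloParts: string[] = ["Hello ", "", ", you have ", "", " items"];

function renderHello(name: string, n: number): string {
  helloParts[1] = name;          // fill the first hole
  helloParts[3] = n.toString();  // fill the second hole, converting the number
  return helloParts.join("");    // reuse the generic join implementation
}

const first = renderHello("Ada", 2);
const second = renderHello("Bob", 10);
```

The array is allocated once and only its holes are overwritten on each evaluation, so `first` is "Hello Ada, you have 2 items" and `second` is "Hello Bob, you have 10 items".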

@dcodeIO dcodeIO changed the title from "Implement parsing of template literals" to "Implement template literals" Mar 10, 2021
@dcodeIO
Member Author

dcodeIO commented Mar 14, 2021

This should be mostly working now. Updated the opening post with the details.

@MaxGraey
Member

MaxGraey commented Mar 14, 2021

Could you also add String.raw()?

@dcodeIO
Member Author

dcodeIO commented Mar 14, 2021

From what I can tell so far, String.raw would have to be some sort of builtin. Take for example

class String {
  ...
  static raw(tsa: TemplateStringsArray, ...args: any[]): string {
    ...
  }
}

where args would have to be the values of any number of substitutions being present and processed, but we don't have rest parameters, and even if we had, we don't have an any type to properly assemble the string according to what, for instance, the following would do:

String.raw`Hello\n${1+2}` // "Hello\\n3"

@MaxGraey
Member

Yeah, that definitely should be a builtin, at least for now.

@MaxGraey
Member

This raw(tsa: TemplateStringsArray, ...args: unknown[]): string is, btw, a typical variadic function, which would be great to support.

@dcodeIO
Member Author

dcodeIO commented Mar 14, 2021

Perhaps one option would be to detect a signature like raw(tsa: TemplateStringsArray, values: string[]): string and implicitly convert all substitution arguments to strings when encountering it. That'd be sufficient to implement String.raw, but may explode in the TS checker.
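A minimal sketch of that idea in plain TypeScript, assuming the substitution values arrive already converted to strings. The names `rawFromStrings` and `capture` are hypothetical; `capture` exists only to obtain a TemplateStringsArray for the demonstration.

```typescript
// Hypothetical raw implementation over pre-stringified values:
// interleaves the raw string parts with the values.
function rawFromStrings(tsa: TemplateStringsArray, values: string[]): string {
  let out = tsa.raw[0];
  for (let i = 0; i < values.length; i++) {
    out += values[i] + tsa.raw[i + 1];
  }
  return out;
}

// Hypothetical helper tag that just returns its TemplateStringsArray.
function capture(tsa: TemplateStringsArray, ..._values: unknown[]): TemplateStringsArray {
  return tsa;
}

const captured = capture`Hello\n${1 + 2}`;
// The caller performs the string conversion that the compiler would insert implicitly.
const rawResult = rawFromStrings(captured, [(1 + 2).toString()]);
```

For this input, `rawResult` matches the String.raw example quoted earlier in the thread: "Hello", a literal backslash, "n", then "3".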

@dcodeIO
Member Author

dcodeIO commented Mar 14, 2021

I don't have a good solution for variadic functions in general yet: they lose type info, so one has to cast, which we cannot do between references and basic values yet, and implementing that would need boxing, which is a bit meh without a stack.

@jtenner
Contributor

jtenner commented Mar 14, 2021

Count me for very excited :D

@willemneal
Contributor

I tested this locally and everything worked! Is there anything else blocking this in its current form?

@dcodeIO
Member Author

dcodeIO commented Mar 26, 2021

Should be good to go I think, except that tagged template literals are currently somewhat unsafe due to the lack of ReadonlyArray. Since there is interest, I can gate tagged template literals behind an error for now, so we can get rolling with untagged template literals already and revisit once we have readonly arrays. Does that sound good?

@dcodeIO
Member Author

dcodeIO commented Mar 26, 2021

It's also still missing a String.raw builtin, but since that's tagged, we don't necessarily need it right now.

@willemneal
Contributor

Looks great! I think most users will use untagged, so the error is good for now. Thanks for the quick work!

@dcodeIO dcodeIO merged commit 1789577 into master Mar 26, 2021
@dcodeIO dcodeIO deleted the template-literal-parsing branch June 1, 2021 15:20
This was referenced Aug 1, 2021

4 participants