feat: Support Unicode (sub|super)script characters #3633
I'm having some issues.
This PR changes only the Parser, and the unit tests indicate that the Parser is working just fine, so I don't know what is causing these errors. Any suggestions would be welcome.
Curious. I can reproduce the behavior you're getting too. Commenting out the changes to I have a different concern, which might also end up helping. I worry that the PR as it is now will have a performance penalty on all of KaTeX, as every lexed symbol needs to be checked against the new regular expression. Here's an alternate proposal:
This will have an additional benefit, which is better interaction with macros. For example:
\def\SupTwo{²}
\def\SupThree{³}
\SupTwo⁺\SupThree
These won't coalesce with the current approach, but should with repeated calls to What do you think, @ronkok? If you think this is a reasonable idea, but you'd rather not implement it, I could take a stab.
Agree completely regarding the performance penalty to the Lexer. That concerned me, too. And now that I think about it, the That approach would not have fit so nicely in an earlier draft of my code. Hence the revision to the Lexer. I'll have a go at the
Codecov Report
@@            Coverage Diff             @@
##             main    #3633      +/-   ##
==========================================
+ Coverage   93.48%   93.50%   +0.01%
==========================================
  Files          89       90       +1
  Lines        6619     6636      +17
  Branches     1538     1543       +5
==========================================
+ Hits         6188     6205      +17
  Misses        400      400
  Partials       31       31
This PR now acquires tokens by repeated I now get good behavior from
The Jest test has apparently run successfully on GitHub, and the code is executing well in the Netlify preview. I think this PR is ready to go.
This question should not hold up any pending reviews but
I'm getting these from the Wikipedia page. There are some other ones on that page, too.
unicode-math has a much larger list with the Unicode values here.
Characters such as But I think that the Notepad++ / Consolas combination is a frequent entry point for beginning coders, and I am a little reluctant to encourage the creation of documents that some authors will not be able to read.
I feel like matching unicode-math's list makes sense. Presumably those who want to use this will use appropriate fonts. |
I have in mind the beginner who looks at existing code as a means to learn something. I do not hold this opinion strongly. I'm going to wait a day before I take any action.
In my experience, people generally understand what's going on when a character doesn't display correctly and know to switch to a font that supports it.
I understand your concern, but I'd lean heavily toward LaTeX compatibility here. Also, the characters render in the web browser (at least my Chrome on Windows), so it's natural to try to copy/paste them. Even if they don't render in the editor, I would expect them to render on the website.
Okay, I've matched the list in unicode-math. This should be ready to go.
This is looking great now! I found what I think is one typo, and a possible code simplification.
## [0.15.4](v0.15.3...v0.15.4) (2022-05-20)

### Features

* Support Unicode (sub|super)script characters ([#3633](#3633)) ([d8fc35e](d8fc35e))
🎉 This PR is included in version 0.15.4 🎉
The release is available on:
Your semantic-release bot 📦🚀
Math-mode Unicode (sub|super)script characters will now render as if you had written regular characters in a subscript or superscript. For instance, `A²⁺³` will render the same as `A^{2+3}`.

Resolves #1218.
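
To make the intended equivalence concrete, here is a minimal standalone sketch of how a run of adjacent Unicode script characters can be coalesced into a single braced group. It is not KaTeX's Parser code: the function name `expandUnicodeScripts` and both lookup tables are hypothetical, and the tables cover only a handful of characters rather than the full unicode-math list discussed above.

```javascript
// Illustrative subsets only; the merged PR follows the larger unicode-math list.
const SUPERSCRIPTS = new Map([
    ["⁰", "0"], ["¹", "1"], ["²", "2"], ["³", "3"],
    ["⁺", "+"], ["⁻", "-"], ["ⁿ", "n"],
]);
const SUBSCRIPTS = new Map([
    ["₀", "0"], ["₁", "1"], ["₂", "2"], ["₃", "3"],
    ["₊", "+"], ["₋", "-"],
]);

// Hypothetical helper: rewrite each run of Unicode superscript (or subscript)
// characters as one ^{...} (or _{...}) group, leaving other characters alone.
function expandUnicodeScripts(input) {
    const chars = Array.from(input); // iterate by code point, not UTF-16 unit
    let out = "";
    let i = 0;
    while (i < chars.length) {
        let table = null;
        if (SUPERSCRIPTS.has(chars[i])) {
            table = SUPERSCRIPTS;
        } else if (SUBSCRIPTS.has(chars[i])) {
            table = SUBSCRIPTS;
        }
        if (table === null) {
            out += chars[i];
            i++;
            continue;
        }
        // Coalesce: keep consuming while the next character is in the same table.
        let body = "";
        while (i < chars.length && table.has(chars[i])) {
            body += table.get(chars[i]);
            i++;
        }
        out += (table === SUPERSCRIPTS ? "^" : "_") + "{" + body + "}";
    }
    return out;
}

console.log(expandUnicodeScripts("A²⁺³")); // A^{2+3}
```

The sketch works on a plain string, so it does not show the macro interaction raised in the conversation; in the PR itself the coalescing happens on tokens inside the Parser.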