Improve handling of arbitrary Unicode in TeX input. #653

Merged on May 12, 2021 (2 commits)

@dpvc dpvc commented Mar 23, 2021

This PR augments the RANGES list to include the type of MathML element to create for each range, so that alphabetic ranges get mi and others get mo (or mn or whatever is appropriate). The function for obtaining the range for a character is moved to the OperatorDictionary.ts file, and removed from the MmlMo class, which now uses that new function. The TeX base configuration is modified to get the type of MathML element to create, rather than always using <mo>.

This should improve the results for Greek characters, for example, and for other languages, though there is probably more granularity that could help in some cases.

Note: This does mean some existing content may render differently. E.g., Greek letters entered as Unicode characters used to show up upright, but will now be italic, since they will be in <mi> elements (just like the ones that \alpha, etc., produce).
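
For readers who want a concrete picture of the description above, here is a minimal sketch of a range table that carries an element type, plus a lookup function. It is not the code from this PR: the names `SKETCH_RANGES`, `sketchGetRange`, and `nodeTypeFor` are hypothetical, the handful of ranges shown are standard Unicode blocks chosen for illustration, and the fallback to `mo` is an assumption based on the old behavior.

```typescript
// Illustrative sketch only: names and ranges are assumptions,
// not the actual RANGES table in OperatorDictionary.ts.
type NodeType = 'mi' | 'mo' | 'mn' | 'mtext';

// [first code point, last code point, MathML element to create]
type SketchRange = [number, number, NodeType];

// Must stay sorted by starting code point for the early exit below.
const SKETCH_RANGES: SketchRange[] = [
  [0x0030, 0x0039, 'mn'], // ASCII digits
  [0x0370, 0x03FF, 'mi'], // Greek and Coptic
  [0x0400, 0x04FF, 'mi'], // Cyrillic
  [0x2190, 0x21FF, 'mo'], // Arrows
  [0x2200, 0x22FF, 'mo'], // Mathematical Operators
];

// Find the range containing the first code point of the text, if any.
function sketchGetRange(text: string): SketchRange | null {
  const n = text.codePointAt(0);
  if (n === undefined) return null;
  for (const range of SKETCH_RANGES) {
    if (n <= range[1]) {
      // First range ending at or after n: either it contains n or nothing does.
      return n >= range[0] ? range : null;
    }
  }
  return null;
}

// Assumed fallback: characters outside every range still become <mo>.
function nodeTypeFor(text: string): NodeType {
  const range = sketchGetRange(text);
  return range ? range[2] : 'mo';
}
```

With this table, `nodeTypeFor('α')` and `nodeTypeFor('Ш')` give `'mi'`, while `nodeTypeFor('→')` gives `'mo'`. Keeping the table sorted lets the lookup stop at the first range whose upper bound is not below the code point.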

@dpvc dpvc added this to the 3.1.3 milestone Mar 23, 2021
@dpvc dpvc requested a review from zorkow March 23, 2021 18:01
@dpvc dpvc modified the milestones: 3.1.3, 3.2 Apr 1, 2021

dpvc commented Apr 1, 2021

I'm moving this to 3.2, since it changes behavior.

@zorkow zorkow left a comment

lgtm.

dpvc commented Apr 14, 2021

This one is for the 3.2 release, since it will change the output for some Unicode characters (using mi or mtext rather than mo). So I will wait until then for the merge.

@dpvc dpvc merged commit 1bbb778 into develop May 12, 2021
@dpvc dpvc deleted the better-ranges branch May 12, 2021 16:57
@pkra pkra mentioned this pull request May 13, 2021
pkra commented May 13, 2021

I'm no longer able to compile the develop branch. I think it's due to this change. Here's what I get:

> mathjax-full@3.1.4 compile
> npx tsc

ts/core/MmlTree/MmlNodes/mo.ts:28:42 - error TS6133: 'RANGES' is declared but its value is never read.

28 import {OperatorList, OPTABLE, RangeDef, RANGES, MMLSPACING} from '../OperatorDictionary.js';
                                            ~~~~~~

ts/core/MmlTree/MmlNodes/mo.ts:387:19 - error TS2663: Cannot find name 'getRange'. Did you mean the instance member 'this.getRange'?

387       let range = getRange(mo);
                      ~~~~~~~~

ts/core/MmlTree/MmlNodes/mo.ts:445:53 - error TS2339: Property 'RANGES' does not exist on type 'typeof MmlMo'.

445     let ranges = (this.constructor as typeof MmlMo).RANGES;
                                                        ~~~~~~


Found 3 errors.

pkra commented May 13, 2021

If there's interest, I'd be happy to make a PR that adds a simple GitHub action to test if develop compiles after the merge.

dpvc added a commit that referenced this pull request May 13, 2021
dpvc commented May 13, 2021

Argh! That was due to an incorrect merge conflict resolution. I have fixed it, so you should be able to compile now.

> I'd be happy to make a PR that adds a simple GitHub action to test if develop compiles after the merge.

Sure, that would be great. Thanks!

pkra commented May 13, 2021

> That was due to an incorrect merge conflict resolution. I have fixed it, so you should be able to compile now.

Thanks, Davide. It compiles now.

> Sure, that would be great.

Great. I'll work out a first draft.

pkra commented May 13, 2021

A follow-up on the actual code.

While testing today, I noticed that sha Ш (as direct input) used to be upright but is now italic. Is this an intentional change?

dpvc commented May 13, 2021

Yes. In the past, all unknown characters were put into <mo> elements. This branch tries to be more granular about it, and uses <mi> for letters and <mo> for symbols. Since Ш (U+0428) is a letter (CYRILLIC CAPITAL LETTER SHA), it is now translated to <mi>&#x0428;</mi>, and that means it is italic by default.
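
The "italic by default" part follows from the MathML rule for token elements: an <mi> whose content is a single character defaults to the italic math variant, while a multi-character <mi> and the other token elements default to normal. A small sketch of that rule, assuming the hypothetical helper names `defaultMathvariant` and `describeToken` (neither is part of MathJax):

```typescript
// Sketch of the MathML default-mathvariant rule for token elements.
// Helper names are hypothetical and not part of MathJax.
type TokenTag = 'mi' | 'mo' | 'mn' | 'mtext';

function defaultMathvariant(tag: TokenTag, text: string): 'italic' | 'normal' {
  // Count code points rather than UTF-16 units, so astral characters count once.
  const length = Array.from(text).length;
  return tag === 'mi' && length === 1 ? 'italic' : 'normal';
}

// Build a short human-readable description of how a token would render.
function describeToken(tag: TokenTag, text: string): string {
  return `<${tag}>${text}</${tag}> renders ${defaultMathvariant(tag, text)} by default`;
}
```

Under this rule the same Ш is upright inside <mo> (the old output) and italic inside <mi> (the new output). An explicit mathvariant="normal" on the <mi> overrides the default for content that needs the upright form.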

pkra commented May 13, 2021

Thanks, Davide.
