TeX translation could be improved for numbers #2772
Comments
Please note that any syntax conversions in MathJax are just that: conversions between purely syntactic representations, i.e., LaTeX, MathML, or AsciiMath. The goal is to retain visual equivalence, and a leap-of-faith interpretation could destroy that. I would argue that a clear separation of syntax and semantics is important. Semantic interpretation should be left to a higher-level recognition process. In our case that is SRE, which effectively uses only spatial layout and pattern recognition to build its own representation. (For an outdated version of what that tree looks like, have a look here: https://zorkow.github.io/semantic-tree-visualiser/visualise.html) This is embedded without interfering with the visual rendering. (Past issues show that this is not always successful, so I should say minimal interference.) SRE has a number of heuristics for numbers, some locale dependent, but only a few are currently exposed in MathJax. I might expose a few more for the next release. But I would never claim that any of these will be perfect or will indeed reflect the intentions of authors, so I am sure misinterpretations will be found in the future.
@NSoiffer
Sorry for not getting back to this sooner -- it got buried. I appreciate the requirement not to break the display, but wrapping the digits inside of an If you agree with the above (which I think might be a big "if"), then you have two choices:
There is no way to know what is in the author's mind, but some simple rules will likely yield well over 99% correctness. MathPlayer had some repair and MathCAT has stronger repair, but most systems won't, so JAWS and VoiceOver will likely speak the expression poorly, something that MathJax could prevent by merging these cases into a single
I was aware of two common digit block strategies: Western languages use blocks of three, and many Asian countries use blocks of four. This Wikipedia article mentions India as using a somewhat different style. In all these cases, locale helps resolve what to do. In looking up the Wikipedia page, I also discovered that ISO has a standard that says blocks of three are preferred, with whitespace as the preferred separator. It also mentions using that after the decimal point.
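The digit block strategies mentioned above could be checked along the following lines. This is only a sketch, not anything MathJax or SRE actually does: the function name `is_grouped_number` and the style labels are made up for illustration, and the Indian rule (a rightmost block of three, blocks of two before it) is filled in from general knowledge rather than from this thread.

```python
def is_grouped_number(blocks, style="western"):
    """Return True if the digit blocks (left to right) fit a grouping style.

    Hypothetical helper: `blocks` is what is left after splitting a
    candidate number at its separators, e.g. ["16", "807"].
    """
    # A single block is just an ordinary number; nothing to decide.
    if len(blocks) < 2:
        return False
    # Every block must consist of ASCII digits only.
    if not all(b.isascii() and b.isdigit() for b in blocks):
        return False

    first, middle, last = blocks[0], blocks[1:-1], blocks[-1]

    if style == "western":        # blocks of three: 1,234,567
        lead_max, mid_size, last_size = 3, 3, 3
    elif style == "east_asian":   # blocks of four: 1,2345,6789
        lead_max, mid_size, last_size = 4, 4, 4
    elif style == "indian":       # last block of three, then twos: 1,00,000
        lead_max, mid_size, last_size = 2, 2, 3
    else:
        raise ValueError(f"unknown grouping style: {style}")

    return (
        len(first) <= lead_max
        and all(len(b) == mid_size for b in middle)
        and len(last) == last_size
    )
```

For example, `is_grouped_number(["16", "807"])` accepts the number from this issue, while `is_grouped_number(["16", "80"])` does not, so the second case would be left alone.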
Issue Summary
The translation for numbers containing spaces or commas could be improved
Issue details:
16\,807
MathJax will produce the following MathML when using "copy to clipboard:mathml"
Notice that the number is broken up into two mn's. Also notice that SRE is interpreting the space as multiplication. Although it is possible that this is what is meant, I think it is far more likely that this is meant to be a single number. Context and digit block counting could be used to choose one interpretation over another.
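To make the idea of merging the two mn's concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the input fragment is a simplified stand-in (not MathJax's actual clipboard output), `merge_digit_blocks` is a hypothetical name, and keeping a thin space (U+2009) inside the merged mn is just one possible way to preserve the visual gap. It also merges across any mspace regardless of its width, which a real implementation would want to restrict.

```python
import xml.etree.ElementTree as ET

THIN_SPACE = "\u2009"  # assumed stand-in for the visual gap made by \,


def _is_digit_run(text):
    return bool(text) and text.isascii() and text.isdigit()


def merge_digit_blocks(parent):
    """Merge <mn>, <mspace/>, <mn> runs that look like one grouped number."""
    i = 0
    children = list(parent)
    while i + 2 < len(children):
        left, gap, right = children[i:i + 3]
        # Blocks already merged into `left` are separated by THIN_SPACE.
        left_blocks = (left.text or "").split(THIN_SPACE)
        mergeable = (
            left.tag == "mn" and gap.tag == "mspace" and right.tag == "mn"
            and all(_is_digit_run(b) for b in left_blocks)
            and len(left_blocks[0]) <= 3          # leading block: 1-3 digits
            and _is_digit_run(right.text)
            and len(right.text) == 3              # following block: 3 digits
        )
        if mergeable:
            left.text = left.text + THIN_SPACE + right.text
            left.tail = right.tail
            parent.remove(gap)
            parent.remove(right)
            children = list(parent)  # re-read; stay at i for longer numbers
        else:
            i += 1
    for child in parent:
        merge_digit_blocks(child)


# Hypothetical, simplified input for 16\,807 (no namespace, no mstyle).
root = ET.fromstring(
    '<math><mn>16</mn><mspace width="0.167em"/><mn>807</mn></math>'
)
merge_digit_blocks(root)
print(ET.tostring(root, encoding="unicode"))
```

A fragment such as two single-digit mn's separated by a space is left untouched, since the second block does not have three digits.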
A similar issue arises when using "," as the separator. E.g., 7^5=16,807. In this case, context clearly points to 16,807 being a single number.
This poor translation will affect speech. Potentially it affects braille generation also.
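As a rough illustration of how context can support the single-number reading, the sketch below parses a TeX fragment whose digit blocks are separated by either "," or "\," and compares the result with the value on the other side of the equation. The names `GROUPED` and `grouped_value` are hypothetical, and the pattern assumes Western blocks of three only.

```python
import re

# 1-3 leading digits, then one or more blocks of exactly three digits,
# all joined by the same separator: "," or the TeX thin space "\,".
GROUPED = re.compile(r"\d{1,3}(,|\\,)\d{3}(?:\1\d{3})*")


def grouped_value(tex):
    """Return the integer a grouped number stands for, or None if the
    text does not look like a grouped number."""
    if GROUPED.fullmatch(tex) is None:
        return None
    # Drop the separators ("," or "\,") and read the remaining digits.
    return int(re.sub(r"\\?,", "", tex))


# Context check for 7^5=16,807: the left-hand side evaluates to the
# same value as the right-hand side read as one number.
assert grouped_value("16,807") == 7 ** 5
assert grouped_value(r"16\,807") == 7 ** 5
```

Something like "16,80" returns None here, so it would not be treated as a single number under this rule.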
Technical details: