New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement strict mode (replacing unicodeTextInMathMode) #1278
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1278 +/- ##
==========================================
- Coverage 83.99% 83.87% -0.12%
==========================================
Files 62 64 +2
Lines 4017 4081 +64
Branches 670 689 +19
==========================================
+ Hits 3374 3423 +49
- Misses 546 559 +13
- Partials 97 99 +2
Continue to review full report at Codecov.
|
Add new "strict" setting (default value false) that can take a boolean (whether to throw an error or silently ignore), string ("ignore", "warn", or "error"), or a function possibly returning such a value. This enables a variety of ways of handling or ignoring transgressions from "true" LaTeX behavior, making KaTeX easier to use while still providing the ability for strict LaTeX adherance. Resolve KaTeX#1226, implementing that spec, for two existing transgressions from regular LaTeX: * src/functions/kern.js had some errors and warnings about use of (units in) math vs. text mode commands. * The former setting unicodeTextInMathMode (not in any released version) needed to be set to true to enable Unicode text symbols in math mode. Now these are controlled by the strict setting. By default, KaTeX is now very permissive, but if desired, the user can request warnings or errors.
README.md
Outdated
- `maxSize`: `number`. If non-zero, all user-specified sizes, e.g. in `\rule{500em}{500em}`, will be capped to `maxSize` ems. Otherwise, users can make elements and spaces arbitrarily large (the default behavior). | ||
- `strict`: `boolean` or `string` or `function` (default: `false`). If `false` or `"ignore`", ignore features that make writing LaTeX convenient but are not actually supported by (Xe)LaTeX (similar to MathJax). If `true` or `"error"` (LaTeX compatibility mode), throw an error for any such transgressions. If `"warn"`, warn about such behavior via `console.warn`. Provide a custom function `handler(errorCode, errorMsg, token)` to default behavior depending on the type of transgression (summarized by the string code `errorCode` and detailed in `errorMsg`). A list of such features and their `errorCode`s: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't compatibility mode
usually mean non-strict mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, "compatibility" maybe isn't the right term. Perhaps "adherence" or "faithfulness"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed a rewording of this paragraph that in particular uses "faithfulness" instead of "compatibility".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for aggregating various settings into strict
. Looking forward to getting this merged in so that we can continue to build out functionality while have a consistent way to provide a strict mode to users who need it.
src/Settings.js
Outdated
* Report nonstrict (non-LaTeX-compatible) input. | ||
* Can safely not be called if `this.strict` is false in JavaScript. | ||
*/ | ||
nonstrict(errorCode: string, errorMsg: string, token?: Token | ParseNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this. Maybe rename token
to tokenOrParseNode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was matching ParseError
's constructor. If I change one, I should probably change both... I could also see something like where
or context
being a good name.
"LaTeX-incompatible input and strict mode is set to 'error': " + | ||
`${errorMsg} [${errorCode}]`, token); | ||
} else if (strict === "warn") { | ||
typeof console !== "undefined" && console.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is console
not defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't think of such an environment, but presumably they might exist (or the global console could be overwritten?). In any case, I'm following existing code from the KaTeX codebase, in particular:
typeof console !== "undefined" && console.warn(
"No character metrics for '" + value + "' in style '" +
fontName + "'");
It might be good to define a katex.warn
or something that can be overwritten by the user, but defaults to console.warn
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow, I didn't realize we were doing that elsewhere.
if (query.strict.match(/^(1|t|y|e)/)) { | ||
options.strict = "error"; | ||
} if (query.strict && query.strict.match(/^(w)/)) { | ||
options.strict = "warn"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the simplicity of this.
src/Settings.js
Outdated
|
||
import type { MacroMap } from "./macros"; | ||
|
||
export type StrictFunction = | ||
(errorCode: string, errorMsg: string, token?: Token | ParseNode) => | ||
?(boolean | string); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exposing this level of customizability to users seems like overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This level of customizability lets users allow some nonstrict behavior and not others. @ylemkimon at least seemed to think this would be useful, and it's not exactly hard to support. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another point for allowing a function is that it enables custom warning functionality, if console.warn
isn't the desired warning output. For example, this would make our own testing easier, as jest
doesn't really handle warnings well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's nice to have customizability as long as we have simpler options to set them. We may add lots of features to the non-strict mode, and strict
option may become too generic.
src/Settings.js
Outdated
export type SettingsOptions = { | ||
displayMode?: boolean; | ||
throwOnError?: boolean; | ||
errorColor?: string; | ||
macros?: MacroMap; | ||
colorIsTextColor?: boolean; | ||
unicodeTextInMathMode?: boolean; | ||
strict?: boolean | string | StrictFunction; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's sufficient if strict?: boolean | "ignore" | "warn" | "error"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replaced string
with "ignore" | "warn" | "error"
but left StrictFunction
for now.
If possible (in particular, if it doesn't slow things down much), I think it would be good to merge this before the next release, because it makes behavior more permissive than the current release. So far we haven't released with the In other words, if we release now, we might break some code. If we merge this first, we shouldn't, because of more permissive defaults. (Alternatively, we could change the default Also, this PR removes the |
By the way, we might also discuss whether the right default for
|
this.settings.nonstrict("unicodeTextInMathMode", | ||
`Unicode text character "${text[0]}" used in math mode`, | ||
nucleus); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about allowing all unicodes(non-ASCII) on non-strict mode? (#1217, allowAllSymbols
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering about that, but planning to deal with it when we return to discussing #1217. allowAllSymbols
feels like more of a hack, not to be encouraged -- the rendering will probably be poor in general, and it's really a sign that we need to add symbols to the fonts. So I was thinking of keeping that a separate option... but I'm open to suggestions. (And maybe it is worth discussing here, in case we want to tweak the strict
functionality to deal with it...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's where the customizability should come in. Normal users would intend used symbols to be shown, but advanced users or who worries about the layout would want to know unsupported symbol has been used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I agree. I think this is more motivation to set the default strict
value to "warn"
, so at least people get a sense of why the rendering is funny in these cases.
I've gone ahead and changed the default setting for I also added functionality to check for warnings in Jest, turning them into a special type of |
@ylemkimon Thanks for the review!
|
|
The changes to make strict mode control |
Add new
"strict"
setting (default value false) that can take a boolean (whether to throw an error or silently ignore), string ("ignore"
,"warn"
, or"error"
), or a function possibly returning such a value. This enables a variety of ways of handling or ignoring transgressions from "true" LaTeX behavior, making KaTeX easier to use while still providing the ability for strict LaTeX adherance.Resolve #1226, implementing that spec, for two existing transgressions from regular LaTeX:
src/functions/kern.js
had some errors and warnings about use of (units in) math vs. text mode commands.unicodeTextInMathMode
(not in any released version) needed to be set to true to enable Unicode text symbols in math mode.Now these are controlled by the
strict
setting. By default, KaTeX is now very permissive, but if desired, the user can request warnings or errors.