Need a scheme/philosophy/plan for unifying/breaking apart "level 0" names #254

NSoiffer · 2022-01-05T18:48:22Z

Currently we have "level 1" (or perhaps better called "core"?) names that are meant to be known by applications and may have specialized ways of speaking them (e.g., for fraction "one half", "3 over n", "meters per second"). There is also the wild west of names in "level 3" (or perhaps better called "open"?).

This issue is about coming up with a design philosophy for when to unify/split names for level 1. There might be a separate issue needed for how names should be chosen (short vs long, etc.), but the focus here is on when to name something. Here are some examples:

interval: should this be just one name with four arguments (the brackets are arguments) or four names (open-interval, open-close-interval, etc.) with two arguments?
square root/root: should there be a separate name for square root? cube root? What about for the positive and negative values of roots? Or the real-valued roots?
sets: should there be a special name for the empty set? If not, is it OK if "set" takes a different number of arguments in the case of an empty set? Should there be separate versions for explicit sets ({1,2,3}) vs. sets with a "such that" symbol ({x | x^2 < 4})?
minus/negative and plus/positive: do we need to distinguish between unary and n-ary versions?
fraction/reciprocal: do we need to distinguish these, since sometimes you want to say "the reciprocal of x" instead of "1 over x" (or "fraction 1 over x end fraction")?
large operator (e.g., sum, union, integral) with limits: do we need special forms if the lower limit is of the form x=0 as opposed to just D or x∈D? What if there is only a lower limit?

We should develop a general plan such as "less is better" with exceptions such as "delimiters should not be arguments" so we can make consistent decisions. E.g., if we adopted the above two rules, then all of the above would have just one name with the exception of intervals which would have four.
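
To make the interval trade-off concrete, here is a minimal Python sketch of the two encodings. It is only an illustration: `close-open-interval` and `closed-interval` are hypothetical names (only `open-interval` and `open-close-interval` are mentioned above), and the `(name, arguments)` tuples stand in for whatever intent syntax is eventually chosen.

```python
# Hypothetical mapping from delimiter pairs to split interval names.
# Only "open-interval" and "open-close-interval" come from the discussion;
# the other two names are assumed for illustration.
SPLIT_NAMES = {
    ("(", ")"): "open-interval",
    ("(", "]"): "open-close-interval",
    ("[", ")"): "close-open-interval",
    ("[", "]"): "closed-interval",
}


def unified_intent(left, start, end, right):
    """One name, four arguments: the delimiters are passed as arguments."""
    return ("interval", [left, start, end, right])


def split_intent(left, start, end, right):
    """Four names, two arguments: the delimiters select the name."""
    return (SPLIT_NAMES[(left, right)], [start, end])


print(unified_intent("(", "a", "b", "]"))
print(split_intent("(", "a", "b", "]"))
```

Under "less is better" alone, the first form would win; adding "delimiters should not be arguments" forces the second form for intervals.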


dginev · 2022-01-05T19:14:40Z

An underpinning question behind some of these examples is:

"Which types of syntax are acceptable to look up from the annotated presentation tree?"

For all other types of syntax, we need to invent new names in the "intent" lists.

Either AT can handle encountering an <mo>|</mo> when examining the tree carrying the attribute intent="set($arg, $condition)" or we need a special "set-builder" symbol which is only used for the "such that" construction.
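
As a rough sketch of the first option, the Python below shows what "looking up syntax from the annotated presentation tree" could mean in practice. The markup is assumed for illustration (in particular the `arg` attributes and the sample condition are not taken from any agreed syntax), and `separator_of` is a hypothetical helper, not part of any AT.

```python
import xml.etree.ElementTree as ET

# Illustrative markup only: the arg attributes and the condition are assumptions.
MATHML = """
<mrow intent="set($arg, $condition)">
  <mo>{</mo>
  <mi arg="arg">x</mi>
  <mo>|</mo>
  <mrow arg="condition"><mi>x</mi><mo>&lt;</mo><mn>2</mn></mrow>
  <mo>}</mo>
</mrow>
"""


def separator_of(node):
    """Return the first direct-child <mo> that is neither a brace nor an argument."""
    for child in node:
        is_plain_mo = child.tag == "mo" and child.get("arg") is None
        if is_plain_mo and child.text not in ("{", "}"):
            return child.text
    return None


root = ET.fromstring(MATHML)
print(root.get("intent"), "uses separator", separator_of(root))
```

Here the AT has to inspect children that are not referenced by the intent value in order to learn that the construction is "such that"; a dedicated "set-builder" name would make that lookup unnecessary.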

NSoiffer · 2022-01-06T01:50:44Z

I strongly feel that anything not given by the value of intent or of its arguments is out of bounds. So for intervals, if only the start and end values are given as arguments, then there have to be differently named interval intents. For sets, the | is inside the argument to set, so it is findable. It can be tricky, though, with something like { x | |x| < 2} unless the absolute value has an intent.

FYI: MathCAT has the following phases:

Canonicalization (which includes fixes to the MathML from poor generation): "MathML" -> "canonical" MathML
Intent phase, including inferring intent when intent is not given: "canonical" MathML -> Intent (tree)
Speech generation phase: Intent -> String

So in my implementation, it is actually impossible to know anything outside of the value of intent when generating speech. It also means that, in the set example with absolute value, finding the | that corresponds to "such that" is not a problem.
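
The consequence of that phase separation can be sketched in a few lines of Python. This is a toy model, not MathCAT's code: the token list, the function names, and the speech wording are all assumptions made for illustration. The point is only that the last phase receives the intent tree and nothing else.

```python
# Toy three-phase pipeline; all names and the speech wording are hypothetical.
INTERVAL_NAMES = {
    ("(", ")"): "open-interval",
    ("(", "]"): "open-close-interval",
}


def canonicalize(tokens):
    """Phase 1: clean up poor input (here, just drop whitespace-only tokens)."""
    return [t for t in tokens if t.strip()]


def to_intent(tokens):
    """Phase 2: build the intent tree; the delimiters are consumed here."""
    left, start, _comma, end, right = tokens
    return (INTERVAL_NAMES[(left, right)], start, end)


def to_speech(intent):
    """Phase 3: sees only the intent tree, never the original delimiters."""
    name, start, end = intent
    return f"the {name.replace('-', ' ')} from {start} to {end}"


def speak(tokens):
    return to_speech(to_intent(canonicalize(tokens)))


print(speak(["(", "a", ",", " ", "b", "]"]))
# prints: the open close interval from a to b
```

Because `to_speech` never sees the brackets, the kind of interval has to be carried by the intent name (or by explicit arguments).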

davidfarmer · 2022-01-06T15:33:53Z

I suggest that more is better. A set with items listed:

{1, 2, 3, 4, 5}

is different than a set constructed with "set builder" notation:

{x : x \in \Z, 0 < x < 6}.

Note that I used a colon as the separator, not a vertical line. I would not want to
be told that I had to use a vertical line or some special symbol.

I'd like to see (almost?) every previous item in this thread as its own separate
intent entity. For example:
\sum_{1 \le x \le M}
is mathematically the same as
\sum_{x=1}^M
but they are pronounced differently. I'd like to see the logic separating those
two cases happen before the MathML is generated. Maybe some AT is capable
of handling that, but why offload something which can be done with intent and
which is common enough to be in Level 0 or 1?
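
A minimal Python sketch of that kind of pre-MathML logic might look like the following. The intent names `sum-from-to` and `sum-over` are made up for this example, and the test on the lower limit is deliberately naive.

```python
def sum_intent(lower, upper=None):
    """Pick a (hypothetical) intent for a sum from the shape of its limits."""
    if upper is not None and "=" in lower:
        # Form like \sum_{x=1}^M: spoken as "sum from x equals 1 to M".
        var, start = (part.strip() for part in lower.split("=", 1))
        return ("sum-from-to", var, start, upper)
    # Form like \sum_{1 \le x \le M}: spoken as "sum over ...".
    return ("sum-over", lower)


print(sum_intent("x=1", "M"))
print(sum_intent(r"1 \le x \le M"))
```

The authoring tool already knows which form the author wrote, so it can emit the matching intent without the AT having to guess.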

I think we can do this and also avoid the slippery slope of silliness like labeling
the number 4 with intent="4".

NSoiffer · 2022-01-20T19:29:10Z

Logging the WG's discussion summary:

Based on the WG meeting today, there was general consensus (but no official resolution) that more names are better than few names.

NSoiffer · 2023-01-12T18:49:02Z

No general philosophy, but more names are better as long as speech makes good use of them.

NSoiffer added the intent Issues involving the proposed "intent" attr label Jan 6, 2022

NSoiffer closed this as completed Jan 12, 2023
