Intent "alias" capability #40

Open
dginev opened this issue Jan 11, 2022 · 16 comments

Comments

@dginev
Contributor

dginev commented Jan 11, 2022

Description

We have a common problem of synonymous and near-synonymous names when dealing with mathematical concepts. The same exact construct, often using the same notation, can be narrated using different words, while understood by professionals to have the same meaning. This makes it difficult to maintain a list of Intent values where we have a single name for each listed concept.

There are many reasons for such synonyms existing. To enumerate a partial list:

  1. due to common words, which have preexisting common synonyms, e.g.
    • "opposite", "negation", "additive inverse"
    • "euclidean-metric" and "euclidean-distance"
    • "in", "member-of", and "element-of"
  2. due to different mathematical perspectives on the same object, e.g.
    • "greatest-common-divisor" and "greatest-common-factor"
    • piecewise "otherwise" and "elsewhere" (example in 2005.07738)
    • names of numeric literals, e.g. "undecillion" and "sextillion" can both refer to 10^36. There are other examples.
      • Also note that some of these names are ambiguous on their own, as they refer to different values in the "short scale" and the "long scale". For example, "undecillion" can be either 10^36 or 10^66. Hence, the implied scale must be known for use in a CAS application. This level of disambiguation is not a requirement for intent, where we only need a concept name that anchors the intended narration for AT.
  3. due to historical circumstances in academia e.g.
    • "euler-constant" and "euler-mascheroni-constant"
    • "euler-number" and "napier-constant"
  4. due to colloquial and technical names existing:
    • vector "norm" and "length"
    • "exclusive-or" and "exclusive-disjunction"
  5. due to visual and technical names existing:
    • "wedge-product" and "exterior-product"
  6. due to other, some, or all of the above:
    • "falling-factorial", "descending-factorial", "falling-sequential-product", "lower-factorial", "pochhammer-symbol"
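
The scale ambiguity noted under item 2 can be checked with a little arithmetic. The following is a minimal sketch, assuming the usual conventions that the n-th "-illion" equals 10^(3n+3) on the short scale and 10^(6n) on the long scale. The table and function names are made up for illustration and are not part of any Intent list.

```python
# Latin prefix index n for a few "-illion" names (hypothetical helper table).
ILLION_INDEX = {"million": 1, "billion": 2, "sextillion": 6, "undecillion": 11}


def illion_exponent(name: str, scale: str) -> int:
    """Return k such that the named number equals 10**k in the given scale."""
    n = ILLION_INDEX[name]
    if scale == "short":
        return 3 * n + 3
    if scale == "long":
        return 6 * n
    raise ValueError(f"unknown scale: {scale!r}")


print(illion_exponent("undecillion", "short"))  # 36
print(illion_exponent("sextillion", "long"))    # 36
print(illion_exponent("undecillion", "long"))   # 66
```

The first two calls agree, which is why both names can refer to 10^36, while the third shows why "undecillion" alone does not pin down a value.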

To say these synonyms "exist" is to say that practitioners use them, and so will practitioners using the new Intent standard, as long as we have an "Open" level. So one way or another, we will need to make provisions for them. If no special mechanism exists, the baseline support I could imagine would be:

Baseline treatment

Add each synonymous name independently to the Intent "Open" list, copying over any relevant additional information from its main entry.

This approach has no explicit connection between the (near-)synonyms, so AT will see them as completely independent.

Proposed "alias" mechanism

Currently, I have experimented with adding a column called "alias" to the list, where each main entry can receive additional known names. AT can then do an extended table lookup, and reuse any implementation for the main entry narration also for the alias narrations.
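
To make the extended table lookup concrete, here is a minimal sketch. It assumes the list is available as rows with a "name" field and an "alias" field; that data layout, the choice of which name is primary in each row, and the function names are all assumptions for illustration, not the actual format of the list.

```python
# Hypothetical rows of an intent list with an extra "alias" column.
# Which name is primary in each row is an arbitrary choice for this sketch.
INTENT_LIST = [
    {"name": "greatest-common-divisor", "alias": ["greatest-common-factor"]},
    {"name": "euler-constant", "alias": ["euler-mascheroni-constant"]},
    {"name": "element-of", "alias": ["in", "member-of"]},
]


def build_lookup(rows):
    """Map every primary name and every alias to its primary name."""
    lookup = {}
    for row in rows:
        for name in [row["name"], *row["alias"]]:
            if name in lookup:
                raise ValueError(f"duplicate name in list: {name}")
            lookup[name] = row["name"]
    return lookup


LOOKUP = build_lookup(INTENT_LIST)


def resolve(intent_name):
    """Return the main entry for a name, or None if it is not in the list."""
    return LOOKUP.get(intent_name)


print(resolve("greatest-common-factor"))  # greatest-common-divisor
print(resolve("member-of"))               # element-of
print(resolve("frobnicate"))              # None
```

Under the baseline treatment, each alias would instead be its own independent row, and nothing would tie "member-of" back to "element-of".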

Benefit 1: each time our group starts a lexicographical discussion about "what is the Best name" to use in the list, we don't have to spend the time and effort making that decision. At the end of the day, these decisions are often arbitrary, not just in our group, but in mathematical practice in general. Rather than debating whether e.g. "log", "common-logarithm" or "logarithm" should be the Best name in our "Core" list, the aliasing mechanism allows us to make a "soft" choice for the primary name, where anyone who prefers an alternative name (again, established in actual mathematical practice) can add it as an alias and use it in their annotations.

Note: This idea of a "soft" preference is also something that appeals to me on the AT implementation side. If a user annotated "common-logarithm", that is a soft preference to use any speech specially dedicated to that string (for example, to be more specific), and vice versa, if a user annotated "log", it could be a soft preference to be more succinct. Neil has made a very good case that only the AT can make this decision correctly, in the narration mode best suited for an individual user's needs. So the AT makes the final decision.

The "soft" preference comes into play where, all else being equal, the author's wish may be respected - hopefully to the benefit of conveying the expression as close as possible to how the author wanted it received.
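
One way to picture the "soft" preference is as a tie-breaker that only applies when the AT user has not asked for a particular style. The sketch below is illustrative only: the speech table, the two style names, and the mapping from written names to styles are assumptions, not anything specified by the group.

```python
# Hypothetical speech table: concept -> {style: spoken form}.
SPEECH = {
    "common-logarithm": {"succinct": "log", "verbose": "common logarithm"},
}
# Both written names resolve to the same concept (alias lookup).
PRIMARY = {"log": "common-logarithm", "common-logarithm": "common-logarithm"}
# Style that each written name softly suggests (illustrative only).
NAME_HINT = {"log": "succinct", "common-logarithm": "verbose"}


def narrate(written_name, user_style=None):
    """Pick speech: an explicit user setting wins, else the author's hint."""
    concept = PRIMARY[written_name]
    style = user_style or NAME_HINT.get(written_name, "verbose")
    return SPEECH[concept][style]


print(narrate("log"))                        # log
print(narrate("common-logarithm"))           # common logarithm
print(narrate("log", user_style="verbose"))  # common logarithm
```

The last call shows the AT overriding the author's hint because the user asked for verbose speech, which matches the point that the AT makes the final decision.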

Benefit 2: rules where AT has special narration implemented can be directly reused, and maintained together, with the concept's aliases.


In the end this is an organizational question for the official lists, and what will be most convenient there long-term. We started a group discussion in our first Math WG call of 2022, and I think the group generally found that this suggestion adds complexity either for limited value or too soon. One of the sentiments is that we should have the most minimalist outcome possible for the first Intent proposal.

As we agreed in the meeting, I am opening the issue to explore the trade-offs fully. Discussion and feedback welcome!

@davidfarmer

I am not sure I am thinking about this the right way, but what I see here is a two-step process.

Step 1: specify the meaning/intent of the expression.

Step 2: specify how to indicate a preferred way (among the ways which are arguably
mathematically equivalent) to voice that expression.

I hope that we come up with a good way to do Step 1. Examples I keep returning to
are |x| and (a,b) and a × b. Without Step 1, or specialized knowledge and the
context in which the expression occurs, it is impossible to be sure what the expression
means.

Once Step 1 specifies the meaning of (a,b), there may still be several
ways to pronounce it.

Of course, if one knows that the subject is K-14 then Step 1 may be unnecessary since
many expressions are unambiguous. The expression a ∈ A is unambiguous in that
context, for example.

But, as noted at the beginning of this issue, a ∈ A can be pronounced several
different ways. Since the meaning is unambiguous, should we allow the author
to specify how it is pronounced? I will be provocative and suggest "no", for two
reasons. (But at the end see a suggested have-it-both-ways.)

My first reason is that good AT is able to adapt to the needs of the user. Many expressions
have a short way to voice them, which is the preference of someone who is familiar with
the subject. I would like to hear a ∈ A as "a in A". If I were a student learning the subject,
I might like to hear "a is an element of A". In either case, the reader wants to let the AT
know how expansive to be when pronouncing the expression. It may impede learning
for the student to hear "in", and I will be annoyed at the waste of my time if I constantly
hear "is an element of".

Another example is sinh(x), which is unambiguous. I have heard it pronounced "sinch"
and "shine" on opposite sides of the Atlantic. And when first learning it, the teacher may
have said "hyperbolic sine". I see that example, and the a ∈ A example, as something the
AT can handle, even in the extreme cases where the author is intending to write for
a beginner.

My second reason is that Step 1, in the cases where the expression is ambiguous,
is the most important thing we are doing in this group, and we need to get it right.
Adding the complication of requiring more than just the meaning/intent,
makes the technical implementation more difficult and also makes it less likely
authors will go to that effort (if it appears Step 2 is needed frequently).

If intent only addressed Step 1, and there were a different attribute to do Step 2,
then maybe that is workable. But having two intents for the same concept, just
because there are different names or pronunciations for that concept, seems like
asking for confusion.

Apologies if I misunderstood the suggested implementation of Step 1 and Step 2.
(And further apologies if my distinction between the two steps has mischaracterized
the issue.)

@brucemiller

brucemiller commented Jan 13, 2022 via email

@polx
Contributor

polx commented Jan 13, 2022

I feel strongly against claiming equivalence where any single person on earth may speak against such.
I do believe that authors are the best people to know what their readers will need to hear, unless the latter have made some manual adjustments.

However, I do not see a problem with simply grouping the default intents' rules for a better overview.

@davidfarmer

In the Zoom chat @polx suggested that the aliases could help authors to identify the
official intent value for the concept.

@NSoiffer
Contributor

It seems to me there are two different ideas:

  1. finding/using a level 1 name
  2. author control over the speech

For '1', I think this can be solved by listing a bunch of names in the description so that any search finds the (single) core name.

An advantage of this approach is that it leaves the other names to be used as if they were "open" names and hence will get spoken as an unknown name (typically 'my alias of xxx'). This solves the "author control" problem ('2').

The only issue is if someone wants to force the speech for a name that is in core and might be spoken differently by AT. A solution to this would be to add a "-" to the start or end of the name. AT needs to convert hyphens to spaces to prevent some speech engines from speaking the hyphen (possibly as "minus" or "dash" or...), so if we say something in the spec about this, then the name "-open-interval" is not recognized as the core name "open-interval" but would be spoken as "open interval of 1 comma 5" instead of (maybe) "the open interval from 1 to 5". The downside is that the author is stuck with however the AT says unknown names, although this too can be overridden with something like intent="open interval of 1 comma 5" (hopefully this is done only in the rarest of situations).
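
The leading-hyphen idea can be sketched as follows. This is only an illustration of the behavior described above: the core speech table, the "name of arg comma arg" pattern for unknown names, and the function name are assumptions, and a real AT would have its own rules.

```python
CORE_SPEECH = {
    # Hypothetical core rule for a two-argument open interval.
    "open-interval": lambda args: f"the open interval from {args[0]} to {args[1]}",
}


def speak(name, args):
    """Speak intent `name` applied to `args` (already-spoken strings)."""
    rule = CORE_SPEECH.get(name)  # "-open-interval" is not a key, so no match
    if rule is not None:
        return rule(args)
    # Unknown name: hyphens become spaces so no engine reads "minus" or "dash".
    words = " ".join(name.replace("-", " ").split())
    return f"{words} of {' comma '.join(args)}"


print(speak("open-interval", ["1", "5"]))   # the open interval from 1 to 5
print(speak("-open-interval", ["1", "5"]))  # open interval of 1 comma 5
```

Because the lookup is an exact string match, the extra hyphen is enough to bypass the core rule, and the hyphen-to-space step then removes it from the spoken text.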

@NSoiffer
Contributor

To follow up on the WG discussion today about alternatives to the word "alias", here are a few from a thesaurus (there are surprisingly few):

  • assumed-name
  • pseudonym or ananym -- both are similar in meaning in the sense they relate to anonymity of a person
  • nickname
  • moniker

I'm not keen on any of these synonyms, and that may well be because "alias" is not close to what is desired.

To throw out a few others after poking around in the thesaurus:

  • synonym
  • label
  • tag
  • handle
  • key words
  • nomen (I had to look that one up, but it is somewhat apt)

Maybe one of those will stimulate some idea for a better name than "alias"...

@physikerwelt
Member

physikerwelt commented Jan 20, 2022

I would suggest following the naming convention from Wikidata: "label" for the single main label and "also known as" for the others.

@brucemiller

brucemiller commented Jan 20, 2022 via email

@NSoiffer
Contributor

Following up on @physikerwelt's suggestion: how about "label" and "similar-names"? I feel "also known as" still strongly implies that the other names could be used in place of the "label" with the same result.

@dginev
Contributor Author

dginev commented Jan 27, 2022

A new neutral phrasing that came to me today:

"supplemental names"

@mathematicsformisfits

Following this discussion as an author and a 'perpetual student' -- unfortunately the scope of authoring math for pedagogy and authoring math for research do not always align. Pedagogical needs are far more varied, and would occupy the alias debate indefinitely. Research on the other hand is more 'top-down', defining new terms itself. I would argue that pedagogical needs outweighs research for three reasons:

  • pedagogical content is consumed in far greater volume
  • pedagogical content is authored by far more people
  • research audiences can adapt more readily and/or contact the source if there is grave confusion

This is not to downplay making research accessibility-friendly out-of-the-box. But the need to account for multiple conventions applies more critically at the K through undergraduate levels.

With that in mind, the ultimate goal is for authors to use these tools. I imagine an authoring tool that one could customize with a 'preset' of aliases for a given scope of symbols they intend to use. Would the specifications as grouped in the intents list allow for something like this?

Then, an author could create their own such aliases if what they need isn't available. I respect the need to limit authors' complete freedom to override convention; at the same time, some authors are writing for very specific students and it would be best to allow as much customization as possible.

But does that spill over into ARIA? The math role discussion suggests they are revising the math role to not interfere with how MathML handles accessibility.

@dginev
Contributor Author

dginev commented Apr 15, 2022

Thank you for the thoughtful reply @mathematicsformisfits !

I agree we need to get more clarity about the extent we want to be inter-operating with ARIA before everything settles. It makes a very big difference to the intent syntax whether we design it with a mindset that we have aria-label and aria-braillelabel available for low-level overrides, or if we design in a way where we don't. My current thinking is to allow ARIA for overrides, which separates concerns between author-enforced narration control (delegated to ARIA) and author-provided concept names and operator tree (delegated to Intent).

We've recently discussed we should find a way to solicit more ARIA feedback to our designs, so I'd expect this issue to remain open a while longer.

That said, I am obliged to take exception to your use of outweighs:

"I would argue that pedagogical needs outweighs research for three reasons"

I think what you have done is to motivate well that pedagogical uses are important and must be supported well. You have not motivated well that they outweigh research use, in the sense of "the standard should actively disregard the needs of research texts". For that you would have to demonstrate that research texts 1) are not in scope for accessibility or 2) have completely different requirements than educational materials, making them incompatible with Intent.

I would have to challenge you on both of those points. Both kinds of materials are important, and we should aim to provide support for remediating both, already in this first pass.

@mathematicsformisfits

mathematicsformisfits commented Apr 18, 2022

@dginev you are right, and I may have been unclear with that statement. All content should fall within the scope. Thank you for pointing that out.

Allow me to contribute two examples:

  • The symbols < > as angle brackets could have a number of meanings: vectors, bra/ket inner product, cyclic group, skein relations, and so on...

So would the step 1/level 1 include multiple meanings, and then permit a particular phrasing with intent in step 2?

Then, if a particular phrasing was not available, an author could add their own with ARIA?

That would account for any use, and so my distinction between use cases is unnecessary.

On the other hand, first time students are sensitive to different phrasing, e.g.

  • the fraction "1/3" is colloquially spoken as "one third", "1 divided by 3", "1 over 3", and a screen reader will say it as such. But then x^{1/3} is colloquially spoken as "the cube root of x" or "x to the one third power". A screen reader defaults to "x to the 1 divided by 3" or even "x superscript 1 slash 3".

  • There is most certainly occasion to say "x to the 1 divided by 3", so I would not want to prevent that. This is an entirely different pedagogical discussion -- I'm thankful AT could provide the occasion to bring it up, and would not see it as a drawback!

  • There is the onscreen difference between typing a fraction as 1/3 vs. the smaller, vertical fraction \frac{1}{3} (please bear with my LaTeX representation, I use MathJax to convert). In regular text, I prefer the vertical fraction, but in an exponent, I prefer the appearance of 1/3 to the vertical fraction and \sqrt[3]{x}. Certainly, to get the proper phrasing, I should use the mathematical markup, but if I wanted to be nitpicky with its appearance, I would want to force a certain visual rendering with a certain phrasing convention. It may seem arbitrary, but experience dictates these distinctions can make a difference to students.

  • SRE can get around this with ARIA to say "x to the one third"
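
The different readings of x^{1/3} listed above can be laid out side by side in a small sketch. The style names, word tables, and function below are hypothetical and only mirror the phrasings quoted in the bullets. They do not describe how SRE or any other screen reader actually works.

```python
# Hypothetical speech fragments; a real AT would have far richer rules.
ROOT_NAMES = {2: "square root", 3: "cube root"}
FRACTION_NAMES = {(1, 2): "one half", (1, 3): "one third", (2, 3): "two thirds"}


def speak_fractional_power(base, num, den, style="literal"):
    """Speak base^(num/den) in one of three illustrative styles."""
    if style == "root" and num == 1 and den in ROOT_NAMES:
        return f"the {ROOT_NAMES[den]} of {base}"
    if style == "fraction" and (num, den) in FRACTION_NAMES:
        return f"{base} to the {FRACTION_NAMES[(num, den)]} power"
    # Fallback: read the exponent as a plain division.
    return f"{base} to the {num} divided by {den}"


print(speak_fractional_power("x", 1, 3, style="root"))      # the cube root of x
print(speak_fractional_power("x", 1, 3, style="fraction"))  # x to the one third power
print(speak_fractional_power("x", 1, 3))                    # x to the 1 divided by 3
```

All three outputs describe the same expression, so the choice between them is a matter of phrasing rather than of meaning.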

In research, this latter case would be moot. So my apologies -- I do not mean to say it should not be given equal attention. I have just found more need for customization when creating teaching materials.

@dginev
Contributor Author

dginev commented Jun 20, 2022

Replying to @mathematicsformisfits, but also a general comment to where we find ourselves with this issue:

"But does that spill over into ARIA? The math role discussion suggests they are revising the math role to not interfere with how MathML handles accessibility."

We have had an update from ARIA, essentially delegating the task of designing the full accessibility experience over MathML to the Math WG. As such, we will have to build the practical pieces we need in the MathML spec itself, likely starting with the main features in MathML 4, and then adding more in MathML 5, if "intent" is adopted and further capabilities are needed.

As such, we would have to make a number of decisions on:

  • if the "alias" capability is needed now/at all
  • if the "speech hints" capability is needed now/at all
  • if analogues to aria-label, aria-braillelabel, aria-description, aria-details, aria-describedby, aria-labelledby are needed now/at all, and if so
    • whether we should lean on the existing ARIA specification,
    • or redo variants of them in intent, by extending its value grammar or adding additional new attributes.

We also have to make a decision as to which of these features should be general, and which are specific to "Intent Core" or "Intent Open".

I still think the best way to reach conclusions here is to work out the full details of what the lists of intent names will contain, as well as prototyping more examples of how they are used in AT, ideally in real texts. We are running short on time, so this will likely be only as informed as we manage to make it through our work this summer.

@NSoiffer
Contributor

NSoiffer commented Jan 5, 2023

This is sort of outside of what we want to put in the spec, but it is useful and should be dealt with somewhere. Moving to docs so that someone remembers to write something up about this.

NSoiffer transferred this issue from w3c/mathml on Jan 5, 2023

@dginev
Contributor Author

dginev commented Aug 3, 2023

The MathWG discussion on August 3rd, 2023 offered some fresh motivation to not forget aliasing for too long. There are different benefits in how one makes choices between e.g. "big-o" vs "order-of-approximation" and "power" vs "exponentiation", also noting that there is a "power" in physics.
