Description
We have a common problem of synonymous and near-synonymous names when dealing with mathematical concepts. The same exact construct, often using the same notation, can be narrated using different words, while understood by professionals to have the same meaning. This makes it difficult to maintain a list of Intent values where we have a single name for each listed concept.
There are many reasons for such synonyms existing. To enumerate a partial list:
- due to common words, which have preexisting common synonyms, e.g.
- "opposite", "negation", "additive inverse"
- "euclidean-metric" and "euclidean-distance"
- "in", "member-of", and "element-of"
- due to different mathematical perspective on the same object, e.g.
- "greatest-common-divisor" and "greatest-common-factor"
- piecewise "otherwise" and "elsewhere" (example in 2005.07738)
- names of numeric literals, e.g. "undecillion" (short scale) and "sextillion" (long scale) can both refer to 10^36. There are other examples.
- Also note that some of these names are ambiguous on their own, because they refer to different values in the "short scale" and the "long scale". For example, "undecillion" can be either 10^36 (short scale) or 10^66 (long scale). Hence, the implied scale must be known for use in a CAS application. This level of disambiguation is not a requirement for intent, where we only need a concept name that anchors the intended narration for AT.
- due to historical circumstances in academia, e.g.
- "euler-constant" and "euler-mascheroni-constant"
- "euler-number" and "napier-constant"
- due to colloquial and technical names existing:
- vector "norm" and "length"
- "exclusive-or" and "exclusive-disjunction"
- due to visual and technical names existing:
- "wedge-product" and "exterior-product"
- due to other, some, or all of the above:
- "falling-factorial", "descending-factorial", "falling-sequential-product", "lower-factorial", "pochhammer-symbol"
To say these synonyms "exist" is to say that practitioners use them, and so will practitioners using the new Intent standard, as long as we have an "Open" level. So one way or another, we will need to make provisions for them. If no special mechanism exists, the baseline support I could imagine would be:
Baseline treatment
Add each synonymous name independently to the Intent "Open" list, copying over any relevant additional information from its main entry.
This approach has no explicit connection between the (near-)synonyms, so AT will see them as completely independent.
Proposed "alias" mechanism
Currently, I have experimented with adding a column called "alias" to the list, where each main entry can receive additional known names. AT can then do an extended table lookup, and reuse any implementation for the main entry narration also for the alias narrations.
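The extended table lookup could be sketched roughly as follows. This is only an illustration under stated assumptions: the table layout, the speech templates, and the names `INTENT_TABLE`, `build_alias_index`, and `narrate` are hypothetical and not part of any Intent list or AT implementation.

```python
# Hypothetical excerpt of an intent list with an "alias" column.
# Entry layout and speech templates are made up for illustration.
INTENT_TABLE = [
    {
        "name": "greatest-common-divisor",
        "alias": ["greatest-common-factor"],
        "speech": "the greatest common divisor of {0} and {1}",
    },
    {
        "name": "falling-factorial",
        "alias": ["descending-factorial", "lower-factorial"],
        "speech": "the falling factorial of {0} and {1}",
    },
]


def build_alias_index(table):
    """Map every primary name and every alias to its main entry."""
    index = {}
    for entry in table:
        for name in [entry["name"], *entry["alias"]]:
            if name in index:
                # A name may belong to one entry only, otherwise lookup is ambiguous.
                raise ValueError(f"name listed twice: {name}")
            index[name] = entry
    return index


def narrate(index, annotated_name, *args):
    """Reuse the main entry's narration for a primary name or any of its aliases."""
    entry = index.get(annotated_name)
    if entry is None:
        return None  # unknown name: the caller falls back to its default handling
    return entry["speech"].format(*args)


ALIAS_INDEX = build_alias_index(INTENT_TABLE)
print(narrate(ALIAS_INDEX, "greatest-common-factor", "a", "b"))
```

In this sketch an alias costs one extra dictionary key, whereas the baseline treatment would need a full second entry with its own copy of the narration rule.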
Benefit 1: each time our group starts a lexicographical discussion about "what is the Best name" to use in the list, we don't have to spend the time and effort making that decision. At the end of the day, these decisions are often arbitrary, not just in our group, but in mathematical practice in general. Rather than debating whether e.g. "log", "common-logarithm" or "logarithm" should be the Best name in our "Core" list, the aliasing mechanism allows us to make a "soft" choice for the primary name, where anyone who prefers an alternative name (again, established in actual mathematical practice) can add it as an alias and use it in their annotations.
Note: This idea of a "soft" preference is also something that entices me on the AT implementation side. If a user annotated "common-logarithm", that is a soft preference to use any speech specially dedicated to that string (for example, to be more specific). Vice versa, if a user annotated "log", it could be a soft preference to be more succinct. Neil has made a very good case that this decision can only be made correctly by the AT, in the narration mode best suited to an individual user's needs. So the AT makes the final decision.
The "soft" preference comes into play where, all else being equal, the author's wish may be respected - hopefully to the benefit of conveying the expression as close as possible to how the author wanted it received.
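One possible reading of this "soft" preference, written as a sketch: the AT's own settings win, the author's annotated name is honored when a dedicated rule exists for it, and the primary entry is the fallback. The function `choose_speech`, the rule strings, and the choice of "log" as the primary name are assumptions made for the example, not decisions of the group.

```python
def choose_speech(annotated, primary, rules, user_overrides=None):
    """Pick speech for a concept, treating the author's name as a soft preference.

    annotated      -- the name the author wrote (primary name or an alias)
    primary        -- the primary name that `annotated` resolves to
    rules          -- speech strings keyed by name (hypothetical AT rules)
    user_overrides -- speech the AT user's settings force, keyed by primary name
    """
    user_overrides = user_overrides or {}
    # 1. The AT makes the final decision: a user-level setting always wins.
    if primary in user_overrides:
        return user_overrides[primary]
    # 2. All else being equal, respect the author's wording if a rule exists for it.
    if annotated in rules:
        return rules[annotated]
    # 3. Otherwise reuse the narration of the primary entry.
    if primary in rules:
        return rules[primary]
    # 4. Last resort: read the annotated name itself.
    return annotated.replace("-", " ")


# Hypothetical rules; "log" is assumed to be the primary name here.
RULES = {"log": "log", "common-logarithm": "the common logarithm"}

print(choose_speech("common-logarithm", "log", RULES))   # more specific wording
print(choose_speech("log", "log", RULES))                # more succinct wording
print(choose_speech("common-logarithm", "log", RULES, {"log": "log base ten"}))
```

The ordering of the four steps is the design choice being illustrated: the author's name only matters after the AT's settings have had their say.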
Benefit 2: rules where AT has special narration implemented can be directly reused for, and maintained together with, the concept's aliases.
In the end this is an organizational question for the official lists, and what will be most convenient there long-term. We started a group discussion in our first Math WG call of 2022, and I think the group generally found that this suggestion adds complexity either for limited value or too soon. One of the sentiments is that we should aim for the most minimalist outcome possible for the first Intent proposal.
As we agreed in the meeting, I am opening the issue to explore the trade-offs fully. Discussion and feedback welcome!