One-letter no-arity intents useful? #480

Closed
polx opened this issue Nov 18, 2023 · 3 comments
Labels
intent Issues involving the proposed "intent" attr

Comments

@polx
polx commented Nov 18, 2023

Lots of atomic symbols are being considered within the explorations towards the core list of intents. We need to find arguments for or against including them.

For single characters whose pronunciation is equivalent to the Unicode name, there is agreement that there is no need to include them in the core list.

For symbols that carry a conceptual value, it is not clear what the advantage would be of including them as an intent property or as an intent, as opposed to, say, letting the author (or the author's producing system) output an explicit intent name.

There could be value in adding properties for typical letters which, through their usage, refer to a common concept. Radius, volume, area, angle or segment-length values, or similar concepts, could be defined in the core list. Would that yield a better speak-aloud? Would a perplexed user, when lost, ask for a more verbose speak-aloud, at which point our more verbose name would kick in?

It appears that there are textbooks and reading environments where the formula for the area inside a circle, A = π∙r², is pronounced using the complete detailed words "area is equal to pi times radius squared". How could this be operationalised without making every appearance of A or r be spoken as "area" or "radius"?

@polx polx added the intent Issues involving the proposed "intent" attr label Nov 18, 2023
@dginev
Contributor

dginev commented Nov 21, 2023

Some extra related context: We have currently hidden a number of zero-argument concepts outside of Core, while adding Core properties to represent them.

Notably:

  • the property :unit which likely implies 20+ concepts such as second, meter, volt, watt, ampere ...
  • the property :chemical-element which likely implies 100+ element names such as helium, oxygen, tin, antimony, ...

As Paul mentioned, we also have a (currently uncounted) selection of concepts in the "self-voicing Unicode" category, such as planck-constant ℎ (U+210E), degree-celsius ℃ (U+2103), ohm-sign Ω (U+2126), end-of-proof ∎ (U+220E), n-ary-product ∏ (U+220F) ...

I think we may not yet have been sufficiently clear about whether some of these are "outside of Core" (= Open concepts), or "skipped from the Core list" (= Core concepts, which are only skipped for visual brevity). Ideally this issue also clears up that question.

I think the main practical difference for a concept being Core, for this discussion, is whether AT support is expected as a minimal requirement. And what does such support entail for concepts that match a simple literal reading (e.g. radius and _radius, or second and _second)? Is translation the key feature which is enabled? E.g. second is clearly to be translated as the unit, while _second can be an ordinal (first, second, third...).
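To make the second / _second distinction concrete, here is a minimal sketch. The table `CONCEPT_SPEECH`, the function `speak_intent`, and the German entries are hypothetical illustrations, not taken from the spec or from any AT. The sketch assumes a concept name is looked up and translated, while an underscore-prefixed literal is read as written.

```python
# Hypothetical per-language speech table for zero-argument concepts.
# The entries are illustrative only.
CONCEPT_SPEECH = {
    "en": {"second": "second", "radius": "radius"},
    "de": {"second": "Sekunde", "radius": "Radius"},
}


def speak_intent(intent: str, lang: str = "en") -> str:
    """Return speech text for a zero-argument intent value."""
    if intent.startswith("_"):
        # Literal: drop the leading underscore and read the words as written.
        # No translation is attempted, because the meaning is unknown.
        return intent[1:].replace("_", " ").replace("-", " ")
    table = CONCEPT_SPEECH.get(lang, {})
    # Known concept: use the translated name. Unknown concept: fall back
    # to reading the name itself (an assumption of this sketch).
    return table.get(intent, intent.replace("-", " "))
```

With this table, `speak_intent("second", "de")` gives the unit name in German, while `speak_intent("_second", "de")` stays untranslated, which is the behaviour the question above is probing.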

Edit: apologies, misclicked.

@dginev dginev closed this as completed Nov 21, 2023
@dginev dginev reopened this Nov 21, 2023
@NSoiffer
Contributor

I freely admit that my feeling on this is based on my implementation design, which separates speech for Unicode chars from speech for notations. However, I think that it is a fairly common design. I know that @MurrayIII had that separation in his math editor.

What I've found is that my Unicode implementation needs to be more complete than my notation rules. That's because the notation rules fall back to reading the underlying syntax or, in bad cases, capture more than they should. An example of the latter is saying "power" when something is just a superscript. Even when something is misread as a power, it can be understood. For example "x to the star power" will make someone stop and think "what!?". But they will understand and move on. However, not having a name for a Unicode character makes the speech extremely hard to understand. For example "Unicode 2 5 a b, A B C" is next to useless.
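As a rough sketch of the character-level fallback chain described here (this is not MathCAT's code; `CHAR_SPEECH` and `speak_char` are hypothetical names), a curated table is tried first, then the Unicode character name from Python's standard `unicodedata` module, and only as a last resort the spelled-out code point:

```python
import unicodedata

# Hypothetical hand-curated speech names; a real AT table would be far larger.
CHAR_SPEECH = {
    "\u220f": "product",          # N-ARY PRODUCT
    "\u210e": "planck constant",  # PLANCK CONSTANT
}


def speak_char(ch: str) -> str:
    """Return a speech string for a single character."""
    if ch in CHAR_SPEECH:
        return CHAR_SPEECH[ch]
    # The Unicode name is often clumsy as speech, but it is still far
    # better than reading out a code point.
    name = unicodedata.name(ch, "")
    if name:
        return name.lower()
    # Last resort: the hard-to-understand case, digits spelled out.
    return "unicode " + " ".join(f"{ord(ch):x}")
```

Under this sketch U+25AB is spoken by its Unicode name rather than as "Unicode 2 5 a b"; only code points without a name (private-use characters, for instance) reach the final branch.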

On the other hand, in practice, very few characters are used up through Calculus. In a paper I wrote, I found that 50 non-keyboard characters (non-ASCII is a good approximation) covered 99.95% of the characters that were used. Still, encountering one of those 0.05% characters is a very poor experience.
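For anyone wanting to repeat this kind of measurement on other textbooks, a minimal sketch follows. It assumes one particular reading of the figure: coverage is the share of all character occurrences that are either ASCII or among the `n` most frequent non-ASCII characters. The function name `coverage_of_top` is hypothetical, and the method used in the paper may differ.

```python
from collections import Counter


def coverage_of_top(text: str, n: int) -> float:
    """Share of character occurrences that are ASCII or among the
    n most frequent non-ASCII characters in `text`."""
    total = len(text)
    if total == 0:
        return 1.0
    # Count only the non-ASCII ("non-keyboard") characters.
    non_ascii = Counter(ch for ch in text if ord(ch) > 127)
    ascii_count = total - sum(non_ascii.values())
    top_count = sum(count for _, count in non_ascii.most_common(n))
    return (ascii_count + top_count) / total
```

Running this over a corpus for increasing `n` would show how quickly the curve flattens, which is the data the 200 to 300 character estimate in this thread currently lacks.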

A few years ago, @davidfarmer sent me more textbooks to analyze, but I haven't found the time to do that yet.

In the absence of more analysis, I would encourage a more complete listing of characters to include in core rather than a more restrictive listing. Both creating the list and implementing the speech are much simpler than documenting and implementing a notation that should be handled. It also means authoring tools can mostly worry about intents on characters when they have a special way of being spoken or are highly ambiguous (| comes to mind). My guess is that, not counting alphabets and script variants, 200 to 300 characters would be sufficient to cover >99.999% of all characters encountered. But lacking data, that's just a guess.

@NSoiffer
Contributor

At the May 16 meeting, we agreed that we should create a list separate from the core-concept list, that lists all the characters that either have the (Unicode) math property or otherwise have some reason they might show up. This list should include some suggested speech names for the characters. Potentially there will be fields for different languages.

Neil will verify his MathCAT list of 4000+ characters includes all the Unicode chars with math properties and then pass that on to @davidcarlisle for adding to his unicode.xml list (used for XML Entities rec). From that, he will then produce a draft W3C note or some other document for reference by AT vendors.
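A rough sketch of that verification step, under a loudly stated assumption: Python's `unicodedata` does not expose the Unicode `Math` property, so the general category `Sm` (Symbol, math) is used here as a proxy. `Sm` is only a subset of `Math`, so a complete check would need the property data from the Unicode Character Database instead. The function name `missing_math_symbols` and the `covered` set are hypothetical.

```python
import sys
import unicodedata


def missing_math_symbols(covered: set[str]) -> list[str]:
    """List characters in general category Sm that are absent from `covered`.

    Sm is a proxy for (and a subset of) the Unicode Math property.
    """
    missing = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Sm" and ch not in covered:
            missing.append(ch)
    return missing
```

An empty result would mean the speech list covers every `Sm` character known to the Unicode version bundled with the Python interpreter in use.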
