Proposed intent with functional notation #451

Closed
davidfarmer opened this issue Mar 17, 2023 · 50 comments
Labels
intent Issues involving the proposed "intent" attr

Comments

@davidfarmer
Contributor

davidfarmer commented Mar 17, 2023

Proposal for a functional grammar for intent

This seems workable on the examples we have discussed, with proper
markup (meaning judicious use of mrows) and recognizing that
micromanaging the pronunciation often makes things worse.

Challenging cases welcome, of course! In particular, examples that
were headless or leading underscore under other proposals.

{comments in curly brackets}

knownintent       = we have to decide on the list {e.g., 'absolute-value', "superscript", "number", ...}
reflist           = '(' '$' identifier (',' '$' identifier)* ')'
namereflist       = '(' names (',' '$' identifier)* ')'
identifier        = letter+
letter            = we have to decide what is a "letter"
name              = letter+ (('-'|\s) letter+)*   {note: spaces are allowed.  Not a deal-breaker.}
names             = name ('|' name)*
numbervalue       = we have to decide if . and , are both allowed, and minus sign { why not? }
type              = 'named' | 'adhoc' | 'value' { "isa" is sort-of a type, but is treated differently }
category          = 'function' | 'group' | 'number' | 'operation' | 'system-of-equations' | ...  {several more things}

intent            = (knownintent reflist?) |
                    (isa ":" category) |
                    (type ":" category namereflist) |
                    (type ":number" "(" numbervalue ")")
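
To make it easier to try examples against the grammar above, here is a minimal sketch of it as a Python recognizer. The open questions are filled in with assumptions that are not part of the proposal: a "letter" is an ASCII letter, the knownintent list is a three-item placeholder, the categories are the five listed plus `extension` (which appears in the examples), `isa` is a literal keyword, and a numbervalue is digits with an optional leading minus and at most one `.` or `,`.

```python
import re

# Assumptions (undecided in the proposal): ASCII letters only, a placeholder
# list of known intents, and a short fixed list of categories.
LETTER = r"[A-Za-z]"
IDENT = rf"{LETTER}+"
NAME = rf"{LETTER}+(?:[-\s]{LETTER}+)*"
NAMES = rf"{NAME}(?:\|{NAME})*"
REF = rf"\${IDENT}"
REFLIST = rf"\({REF}(?:,{REF})*\)"
NAMEREFLIST = rf"\({NAMES}(?:,{REF})*\)"
NUMBERVALUE = r"-?\d+(?:[.,]\d+)?"
KNOWN = r"(?:absolute-value|superscript|number)"
TYPE = r"(?:named|adhoc|value)"
CATEGORY = r"(?:function|group|number|operation|system-of-equations|extension)"

INTENT = re.compile(
    rf"(?:{KNOWN}(?:{REFLIST})?"
    rf"|isa:{CATEGORY}"
    rf"|{TYPE}:{CATEGORY}{NAMEREFLIST}"
    rf"|{TYPE}:number\({NUMBERVALUE}\))"
)


def is_valid_intent(text: str) -> bool:
    """Return True if the whole string matches the sketched grammar."""
    return INTENT.fullmatch(text) is not None


for example in ["absolute-value($x)", "number(3.14)", "value:number(3.14)"]:
    print(example, is_valid_intent(example))
```

Under these assumptions `number(3.14)` is rejected, because `number` is matched as a knownintent whose arguments must be `$` references, while `value:number(3.14)` is accepted by the last alternative.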

Example of knownintent

<mrow intent="absolute-value($x)"><mo>(</mo><mi arg="x">A</mi><mo>)</mo></mrow>

Example of isa

<mrow intent="isa:system-of-equations">ABC</mrow> tells AT that ABC is a system of equations.
Similarly for isa:matrix and isa:cases.

Things like isa:operation or isa:group probably have no effect initially.

Examples of named

<mi intent="named:extension(free algebra)">R</mi>

<mi intent="named:function(Bessel function of the first kind|Bessel-J)">J</mi>

The "named" type is used to indicate that the item has an existing name.
The "|" separates different names, with the more verbose coming first.
(We can consider omitting the option to have multiple names.)

The "named" type tells AT that it can use the literal value if desired.

Examples of adhoc

<mo intent='adhoc:operation(foo)'>&#x229e;</mo>

(that symbol is a plus in a box)

The "adhoc" type is used to indicate that the author is making up the name,
or that the name is nonstandard. There should not be "|" alternatives for
an adhoc name.

The "adhoc" type tells AT that it can use the literal value if desired.

Examples of value

<mo intent='value:operation(times)'>*</mo>

The "value" type is used to indicate that AT should use that value
instead of the literal content. There should not be "|" alternatives for
a value name. (value is implicit for knownintent)

Some special cases

The "superscript" core intent is used to have the correct pronunciation
for things like

<msup><mi>H</mi><mn intent="superscript">2</mn></msup>

While it is true that (in the context of (co)homology) a person
would pronounce H^2 as "H 2", they also would pronounce H_2 the same way.
The superscript intent tells AT that the 2 is just a superscript/index, not a power,
so it will probably say "H sup 2", which is better.

The "number" intent (is it too confusing to have it as both an intent
and a category?) covers cases which were mentioned on a call, such as:

<mrow intent="number(3.14)"><mn color="red">3</mn><mo>.</mo><mn color="blue">14</mn></mrow>

In this, and many examples, it is necessary to have suitable mrows
in order to fit the proposed intent grammar. (This allows keeping numbers
out of the arguments of intent, except inside the number intent or number category.)

Some special features

In many cases, at least with the initial implementations, the "category"
is ignored and

intent="named:X(foo)" is probably pronounced "foo", no matter the category X.

In some cases the "category" can be a useful signal to AT.
For example, if the category is "function" then AT can know to say "of"
before the reference.

Otherwise, the AT just says the name and the references in order.
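
As a rough illustration of the reading rule described in the last few paragraphs, here is a small Python sketch. The function name `speak` and its inputs are hypothetical: it assumes the intent has already been parsed, that the references have already been turned into speech, and that AT picks the first (most verbose) name, which the proposal leaves up to the AT.

```python
def speak(category: str, names: list[str], refs: list[str]) -> str:
    """Sketch of a default reading for type:category(names, $refs...).

    Assumption: the first (most verbose) name is the one spoken.
    """
    words = [names[0]]
    if category == "function" and refs:
        # The category signals that "of" belongs before the arguments.
        words.append("of")
    # In every other case the name and the references are read in order.
    words.extend(refs)
    return " ".join(words)


print(speak("function", ["Bessel function of the first kind", "Bessel-J"], ["x"]))
print(speak("operation", ["foo"], ["a", "b"]))
```

The category only changes the output in the "function" branch, which matches the expectation that most categories are ignored by initial implementations.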

@davidcarlisle
Collaborator

In this, and many examples, it is necessary to have suitable mrows
in order to fit the proposed intent grammar. (This allows keeping numbers
out of the arguments of intent, except inside the number intent.)

in all the previous versions, the places where you end up with longish compound intents are places where you can't easily add mrows.

e.g., something like this with x=1.00 \\ y=10.50, where numbers are split to force decimal alignment but you can reconstitute them in intent, so something like intent="$op($x,1.00)" would currently be allowed.

This may not be a great example, since if decimal alignment worked you wouldn't have to split the number. But you may want the split for coloring or other reasons, as you show in your mrow, and in a table row you cannot group subterms.

<mtable>
 <mtr>
  <mtd><mi intent="x">x</mi></mtd>
  <mtd><mo>=</mo></mtd>
  <mtd><mn>1</mn></mtd>
  <mtd><mn>.00</mn></mtd>
 </mtr>
 <mtr>
  <mtd><mi>y</mi></mtd>
  <mtd><mo>=</mo></mtd>
  <mtd><mn>10</mn></mtd>
  <mtd><mn>.50</mn></mtd>
 </mtr>
</mtable>

@davidcarlisle
Collaborator

I think

type ":" category ":" namereflist

should be

type ":" category namereflist

with just one : to match the example

named:function(Bessel function of the first kind|Bessel-J)

@brucemiller
Contributor

Actually it looks like the example was intended to match type ":" category names, rather than namereflist, but I'm not sure. names doesn't seem to be used anywhere.

It also seems as if only references can be used as arguments to functions? (I'm kinda lost)

@davidfarmer
Contributor Author

davidfarmer commented Mar 17, 2023

Corrected.

@davidfarmer
Contributor Author

I corrected another typo: names now occurs on the 3rd line of the grammar.

And yes, I am proposing that, other than the first "names" entry of the namereflist,
only identifiers occur as arguments.

This forces there to be a nice structure on the markup, so you can refer by identifier.

@w3c w3c deleted a comment from brucemiller Mar 17, 2023
@davidcarlisle
Collaborator

davidcarlisle commented Mar 17, 2023

namereflist has , not ( before the first ref, is that intentional?

intent="named:function(Bessel function of the first kind|Bessel-J, $x)"

I would have expected

intent="named:function(Bessel function of the first kind|Bessel-J)($x)"

@davidcarlisle
Collaborator

there does not appear to be any equivalent of @infix ?

eg

 <mmultiscripts intent='choose@infix($n,$k)'>
  <mi>C</mi>
  <mi arg='k'>k</mi>
  <mrow/>
  <mprescripts/>
  <mrow/>
  <mi arg='n'>n</mi>
 </mmultiscripts>

from list4

(assume choose is not in the known list)

@davidfarmer
Contributor Author

davidfarmer commented Mar 17, 2023 via email

@brucemiller
Contributor

@davidfarmer Seriously? You deleted my comment? Please don't do that.

@davidcarlisle
Collaborator

davidcarlisle commented Mar 17, 2023

you can not always put the function on the mo, for delimiters and other reasons, sometimes it has to be on the mrow, or as above, on mmultiscripts and so you need a functional form with arguments.

when would you use

intent="named:function(some name,$x,$y)"

?

@davidfarmer
Contributor Author

davidfarmer commented Mar 17, 2023 via email

@davidfarmer
Contributor Author

davidfarmer commented Mar 17, 2023 via email

@davidcarlisle
Collaborator

I agree that there needs to be something that lets you specify an infix or postfix reading.

currently you have

<mmultiscripts intent='named:function(choose,$n,$k)'>

I suppose you could have

<mmultiscripts intent='named:infix-function(choose,$n,$k)'>

but it still looks very odd to me with ,$n,$k rather than ($n,$k)

@davidfarmer
Contributor Author

davidfarmer commented Mar 17, 2023 via email

@davidcarlisle
Collaborator

davidcarlisle commented Mar 17, 2023

well I asked above if you intended to have , not ( before the refs in a namereflist and you confirmed that was intentional, so here I just suggested infix-function but kept the (choose,$n,$k) form you specified

@davidcarlisle
Collaborator

an important feature of previous versions is that there is no syntactic difference between core and open concept lists, as the list of "known concepts" will in practice be variable.

As far as I can tell you are using knownintent for colon-free references to the core list but value:operation(concept) for the open list. I would drop value: and allow

name reflist?

for possibly unknown intents.

@davidcarlisle
Collaborator

intent="named.X(foo)" is probably pronounced "foo", no matter the category X.

did you mean named:X(foo) with : not . ?

@davidfarmer
Contributor Author

davidfarmer commented Mar 17, 2023 via email

@davidcarlisle
Collaborator

As to dropping the value: : I can see doing that. But without the category , as in function(foo) or number(foo)
there is missing information which may make it hard for AT to do the right thing in some cases.

well yes which is how we ended up with properties/hints in other variants, but largely dropped here (I think)

@dginev
Contributor

dginev commented Mar 18, 2023

In this proposal, does the inverted "median of x at index i" example (TeX \overline{x}_i), MathML:

<msub intent="median(index($x,$i))">
  <mover accent="true">
    <mi arg="x">x</mi>
    <mo>¯</mo>
  </mover>
  <mi arg="i">i</mi>
</msub>

end up identical? And if the concepts weren't known, would it instead be the same structure but with intent:

intent="adhoc:operation(my-median)(adhoc:operation(my-index)($x,$i))"

or would these be named functions?

intent="named:function(my-median)(named:function(my-index)($x,$i))"

@davidcarlisle
Collaborator

@dginev if I understand the proposal you could not do median(index($x,$i)) or the adhoc or named versions as you can not nest function calls you can only have $xxx as arguments of a function.

@davidfarmer
Contributor Author

There are several things going on in @dginev 's example.

  1. As @davidcarlisle noted, this proposal does not allow nesting
    functions: you have to refer to the $arg .

  2. Since "my-median" is not a standard name for an existing
    concept, this would be adhoc. If you wanted AT to say median
    or mean, and those have the usual meaning, then using named is appropriate.
    It is the concept and the name of the concept which matter, not the notation.
    That way named can be used for nonstandard notation of a common
    concept (or for standard notation if not in core). And adhoc can be used when
    the author introduces new terminology.

  3. This is a good example because it points out that mover will
    need special treatment, just as does msup. To give a good answer I'll need
    more information about the rules AT uses for the default pronunciation of
    \overline{x}_i .
    But assuming it is "bar x sub i" or maybe "x bar sub i",
    then <mo intent="adhoc:decoration(my-median)">¯</mo>
    would have it say "my-median" instead of "bar".

But more likely we will encounter other situations where the preferred
reading is in a different order. There are many reasonably common use
cases for "decorated" objects, such as \hat{f} for Fourier transform.
Adapting a suggestion from @davidcarlisle , to force the "my-median" before
the x (whether or not the AT would do that anyhow), we could do

  <mover>
    <mi>x</mi>
    <mo intent="adhoc:prefix-decoration(my-median)">¯</mo>
  </mover>

I also realize that I unfortunately did not allow intent="$x $y $z".
(I will put an updated schema in another comment, and also include
the prefix-, etc suggestion). So, another way to do it which
guarantees the "my-median" before the x is:

  <mover accent="true" intent="$y $x">
    <mi arg="x">x</mi>
    <mo arg="y" intent="adhoc:decoration(my-median)">¯</mo>
  </mover>

In this discussion it does not seem to matter that the "sub i"
is there, because it is outside the mover, so AT should be
trusted to handle that correctly.

@davidfarmer davidfarmer added the intent Issues involving the proposed "intent" attr label Mar 19, 2023
@davidcarlisle
Collaborator

davidcarlisle commented Mar 19, 2023

@davidfarmer

In this discussion it does not seem to matter that the "sub i" is there, because it is outside the mover

that is surely the point of Deyan's example? It is not the "mean of $x$", it is the "mean of $x_i$", with just a typographical quirk of placing the bar just over the x rather than extending it over the subscript. However, there is no container you can label with $xsubi to use as the argument to median, which is why you currently need a nested function call

intent="median(index($x,$i))"

That is, you want to read it as the logical markup

<mover accent="true"  intent="median($xsubi)">
  <msub arg="xsubi" intent="index($x,$i)">
    <mi arg="x">x</mi>
    <mi arg="i">i</mi>
  </msub>
  <mo>¯</mo>
</mover>

without forcing that layout.
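
To illustrate the equivalence, here is a minimal Python sketch that expands `$` references against `arg` attributes. The `expand` helper is hypothetical and its lookup rule (first descendant carrying a matching `arg`) is a simplifying assumption, not the rule from any draft. It is applied both to the logical markup above and to the actual presentation markup carrying the nested intent.

```python
import re
import xml.etree.ElementTree as ET


def expand(elem: ET.Element) -> str:
    """Expand an element's intent, substituting $refs recursively."""
    intent = elem.get("intent")
    if intent is None:
        if len(elem) == 0:
            return (elem.text or "").strip()
        return " ".join(expand(child) for child in elem)

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Assumption: the first descendant with a matching arg wins.
        for node in elem.iter():
            if node is not elem and node.get("arg") == name:
                return expand(node)
        return match.group(0)  # an unresolved reference is left as written

    return re.sub(r"\$([A-Za-z]+)", substitute, intent)


LOGICAL = """
<mover accent="true" intent="median($xsubi)">
  <msub arg="xsubi" intent="index($x,$i)">
    <mi arg="x">x</mi>
    <mi arg="i">i</mi>
  </msub>
  <mo>¯</mo>
</mover>
"""

ACTUAL = """
<msub intent="median(index($x,$i))">
  <mover accent="true">
    <mi arg="x">x</mi>
    <mo>¯</mo>
  </mover>
  <mi arg="i">i</mi>
</msub>
"""

print(expand(ET.fromstring(LOGICAL)))
print(expand(ET.fromstring(ACTUAL)))
```

Both inputs expand to the same string, `median(index(x,i))`, but only the second keeps the bar over the x alone, and it needs the nested call to do so.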

@davidcarlisle
Collaborator

In general, nested function calls and/or literal values are used in the previous proposals to handle cases where the mathematical structure does not match the presentation mathml element structure. It is hard to see how you can handle these cases while restricting function arguments to $argref.

You give an easy example re-constituting a coloured number where there is a containing mrow but a more realistic coloured example might be

[image: rendering of the MathML below]

<math>
 <mtable columnspacing="0pt">
  <mtr intent="$op($var,10.00)">
   <mtd><mi arg="var">x</mi></mtd><mtd><mo arg="op">=</mo></mtd><mtd><mn mathcolor="red">10</mn></mtd><mtd><mn mathcolor="green">.00</mn></mtd>
  </mtr>
  <mtr intent="$op($var,12.10)">
   <mtd><mi arg="var">y</mi></mtd><mtd><mo arg="op">=</mo></mtd><mtd><mn mathcolor="red">12</mn></mtd><mtd><mn mathcolor="green">.10</mn></mtd>
  </mtr>
 </mtable>
</math>

@davidfarmer
Contributor Author

As also discussed in #448 , we have to decide if intent is supposed to go
beyond its original scope of allowing disambiguation of what is written.
In particular, is it allowed to rearrange the presentation tree?

In the "my-median of x sub i" example, the markup clearly indicates
(my-median x) sub i. And that is what the sighted person sees.
If it means my-median(x sub i), then the sighted reader somehow has
to figure that out on their own.

The intent should clarify, such as indicating to pronounce it "my-median"
instead of "bar" or "overline". But to make the reading change what the
markup says, and providing different information than the sighted reader
sees, seems like asking for trouble.

For the example of mtable with numbers split across different mtds,
that markup is bad for accessibility. I don't see that intent is
there to remediate inaccessible markup. But, in this case the intended reading
can be done without nesting, only having references as arguments,
and only putting literal numbers inside a number intent:

  <mtr intent="$var $op $value">
   <mtd><mi arg="var">y</mi></mtd><mtd><mo arg="op">=</mo></mtd><mtd arg="value" intent="number(12.10)"><mn mathcolor="red">12</mn></mtd><mtd><mn mathcolor="green">.10</mn></mtd>
  </mtr>

The fact that the intent on $value is not the literal number value of
its contents, seems forgivable because of the inaccessible markup.

@dginev
Contributor

dginev commented Mar 19, 2023

But to make the reading change what the markup says, and providing different information than the sighted reader sees, seems like asking for trouble.

I think as the chief current trouble-maker I should clarify that the people who tend to ask for trouble don't go away, but find the trouble elsewhere. Which is fine really, as long as everyone expects that it will inevitably happen :)

Deciding that cases where "presentation and intent structures do not align" are out of scope for this syntax proposal is a reasonable outcome. But then you get the inevitable follow-up, where someone who is decided on using that presentation MathML will use the more restrictive syntax to achieve that as either a parallel tree, or a single tree with extra wrapping mrows:

  • parallel mrows:
<mrow intent="$intent-branch">
  <mover accent="true">
    <msub>
      <mi>x</mi>
      <mi>i</mi>
    </msub>
    <mo>¯</mo>
  </mover>
  <mrow arg="intent-branch" intent="median($indexed-arg)">
    <mrow arg="indexed-arg" intent="index(x,i)"/>
  </mrow>
</mrow>
  • wrapping mrows:
<mrow intent="median($indexed-arg)">
  <mrow intent="index($x,$i)" arg="indexed-arg"> 
    <msub>
      <mover accent="true">
        <mi arg="x">x</mi>
        <mo>¯</mo>
      </mover>
      <mi arg="i">i</mi>
    </msub>
  </mrow>
</mrow>

My main point being that a restricted syntax will mostly make it more awkward to "ask for trouble", but will not eliminate the possibility (as long as Presentation MathML remains as flexible as it currently is).

@davidcarlisle
Collaborator

davidcarlisle commented Mar 19, 2023

@davidfarmer

In the "my-median of x sub i" example, the markup clearly indicates
(my-median x) sub i

No, sorry I do not see it that way at all.

If you start from a "semantic" tex markup such as \mean{x_i} then the macro definitions must typeset $\bar{x}_i$ not $\bar{x_i}$. If it makes the latter it is simply bad tex. So a primary aim of intent is to allow this while disambiguating the original meaning, hence intent="mean(index($x,$i))"

Note this happens all the time. If you have $X_i$ marked as <msub intent="foo($i)"><mi>X</mi><mi arg="i">i</mi></msub>

then need $X_i^2$ you have <msubsup intent="power(foo($i),$n)"><mi>X</mi><mi arg="i">i</mi><mn arg="n">2</mn></msubsup>

and again, you need nested function calls as neither foo nor power have an element that corresponds to an argument.

@brucemiller
Contributor

In the "my-median of x sub i" example, the markup clearly indicates (my-median x) sub i.

If by "markup" you mean the pure MathML without the intent, then: No, the markup indicates "(x with overbar) subscript i".

And that is what the sighted person sees. If it means my-median(x sub i), then the sighted reader somehow has to figure that out on their own.

Exactly; and they do. Knowing the hypothetical (but common) context, they would recognize that overbar stands for median and that, whatever kind of collection "x" is (vector, array, list, whatever), such collections don't have medians but their elements do. So the sighted reader would figure out that the expression must mean "median(index(x,i))", and that "index(median(x),i)" would be wrong.

I don't see that intent is there to remediate inaccessible markup.

Hmm. I thought that was exactly what it was for.

To me, notation ambiguity is just a form of inaccessibility. Both sighted and non-sighted people are just as capable of figuring out that overbar stands for median, that "J" stands for Bessel, etc. But without the visual cues and context, it is much harder for the latter to do, unfairly so. Is that the wrong perspective?

@davidfarmer
Contributor Author

davidfarmer commented Mar 19, 2023 via email

@davidcarlisle
Collaborator

Is it better to hear "mean of quantity x sub i endquantity", which is what it means but not what the markup literally indicates?

The markup is not something the reader should be aware of at all, it is just a technical necessity.

I think $X_i^2$ should be pronounced however you are pronouncing $X_i$ followed by "squared". The fact that in MathML, as in TeX, a sub-sup combination is a separate markup than a nested subscript does not affect the reading.

I do not see how you can specify an intent for $X_i^2$ in this proposal as there is no element corresponding to $X_i$, but I don't see the restriction is needed for this proposal, you could allow nested arguments with minimal change to the grammar.

@davidfarmer
Contributor Author

davidfarmer commented Mar 19, 2023 via email

@davidcarlisle
Collaborator

I am hoping that this functional approach is workable, and I understand that if intent goes outside the presentation tree, then nested arguments are necessary.

I'm not sure what you mean by "outside" here but in any case I'd see specifying intents for $\bar{X}_i$ or $X_i^n$ as core motivating examples for intent, so if you could post a version of the grammar that supported that, there are other parts that probably need discussion, but without that it's hard to see how to make it workable.

@davidcarlisle
Collaborator

davidcarlisle commented Mar 21, 2023

Most of the above discussion was about arguments to functions, so some comments on the other parts of the proposal, with comparisons to https://w3c.github.io/mathml/#mixing_intent_grammar

knownintent       = we have to decide on the list {e.g., 'absolute-value', "superscript", "number", ...}

I think listing names in the grammar is too fragile, better to accept any identifier here, with the system
handling "known intents" and just reading unknown ones as-is, so

concept-or-literal := NCName

reflist           = '(' '$' identifier (',' '$' identifier)* ')'

As noted above, I can't see any way to make a restriction to $argref so perhaps

arglist ='(' intent (',' intent)* ')'

namereflist       = '(' names (',' '$' identifier)* ')'

As discussed above (foo, $a, $b) is unusual syntax for a function call (lisp-like, but with commas).
Despite a personal fondness for lisp I suggest

namereflist = '(' names ')' arglist

identifier        = letter+
letter            = we have to decide what is a "letter"

Probably should be NCName or [\pL][\pL\pMn\-\Md]+ or some such as discussed for other proposals

name              = letter+ (('-'|\s) letter+)*   {note: spaces are allowed.  Not a deal-breaker.}
names             = name ('|' name)*

This (long name | other name) proposal is the main new feature here, it could possibly be incorporated in to the other proposals if we decided to go that way.

numbervalue       = we have to decide if . and , are both allowed, and minus sign { why not? }
type              = 'named' | 'adhoc' | 'value' { "isa" is sort-of a type, but is treated differently }

I can't say I like the names adhoc or value but that's just details.

category          = 'function' | 'group' | 'number' | 'operation' | 'system-of-equations' | ...  {several more things}

as for knownintent, I think baking a fixed list in the grammar is too fragile, also as discussed for other proposals you end up needing multiple overlapping ones, so I would allow :function:infix:complex:whatever and use

property := ":" NCName

intent            = (knownintent reflist?) |

concept-or-literal arglist
due to suggested name changes above, but why can you not have category/properties here?

                    (isa ":" category) |

(isa property+)

                    (type ":" category namereflist)

(type property+ namereflist)

                    (type ":number" "(" numbervalue ")")

If you make category/property an open list this just becomes a special case of the previous clause but with an interpretation that property number means the arglist has exactly one arg and any commas are part of the number
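
To show how small the change to allow nesting is, here is a minimal recursive-descent sketch in Python covering only the `arglist = '(' intent (',' intent)* ')'` part of the suggestions above. Everything else is an assumption made for brevity: heads are ASCII names or `$` references rather than full NCNames, numbers are plain decimals with `.`, and properties and `|` alternatives are left out.

```python
import re

# Assumed token shapes: $ref or name, a plain decimal number, or punctuation.
TOKEN = re.compile(r"\s*(\$?[A-Za-z][A-Za-z0-9-]*|\d+(?:\.\d+)?|[(),])")
PUNCT = {"(", ")", ","}


def tokenize(text: str) -> list[str]:
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_intent(tokens: list[str], i: int):
    """Parse one intent starting at tokens[i]; return (tree, next index)."""
    if i >= len(tokens) or tokens[i] in PUNCT:
        raise ValueError("expected an intent")
    head = tokens[i]
    i += 1
    if head.startswith("$"):
        node = ("ref", head[1:])
    elif head[0].isdigit():
        node = ("num", head)
    else:
        node = ("name", head)
    # An arglist may follow a name or a reference, and its members are
    # themselves intents, which is what permits nesting.
    if node[0] != "num" and i < len(tokens) and tokens[i] == "(":
        i += 1
        args = []
        while True:
            arg, i = parse_intent(tokens, i)
            args.append(arg)
            if i < len(tokens) and tokens[i] == ",":
                i += 1
                continue
            break
        if i >= len(tokens) or tokens[i] != ")":
            raise ValueError("expected ')'")
        i += 1
        node = ("call", node, args)
    return node, i


def parse(text: str):
    tokens = tokenize(text)
    tree, end = parse_intent(tokens, 0)
    if end != len(tokens):
        raise ValueError("trailing input")
    return tree


print(parse("power(foo($i),$n)"))
print(parse("$op($var,10.00)"))
```

Restricting arguments to `$` references would amount to replacing the recursive `parse_intent` call inside the loop with a check that the next token starts with `$`.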

@davidfarmer
Contributor Author

There is a lot for me to unpack here. I will work on modifying the grammar,
but it would help to clarify if nested arguments are really needed.
I am submitting a separate issue for that.

@brucemiller
Contributor

I'm having a hard time getting an overview perspective of this proposal; Can you give a sense of the advantages of this proposal over the others?

@brucemiller
Contributor

brucemiller commented Mar 21, 2023

but it would help to clarify if nested arguments are really needed.

Perhaps it isn't if you can get the same effect.
Given the common MathML

<msub>
  <mover accent="true">
    <mi>x</mi>
    <mo>¯</mo>
  </mover>
  <mi>i</mi>
</msub>

Without modifying the MathML, how should an annotator that knows the meaning is "the mean of the i-th element of x" encode the intent in your system?

@davidfarmer
Contributor Author

I hope this is at least a partial answer to the question of what
I was trying to propose and the advantages I hoped to get from it.

I am thinking about the interface I am creating which will convert
user input to MathML with intent.

For core intents, I am not particularly concerned: those will have a
specified markup which I can produce and which we can expect AT
to handle properly. (There are a couple of key cases which may
require more discussion, such as how to indicate that
<msup><mi>H</mi><mn>2</mn></msup> is "H 2" and not "H squared",
probably pronounced "H sup 2" by AT.)

The hard part is how I will enable authors to indicate special treatment
for markup not in core. We have seen some examples of trying to include
literal words, so that AT can say what the author would say if the
formula were read aloud. Those examples convinced me that this is a bad idea,
because quite often the result was worse. Thus, we need a functional syntax.

My other conclusion was that, as I figure out how I will allow authors
to specify non-core intent, I do not want authors thinking in terms of
how they pronounce the expression. A workable alternative is for them
to indicate what something is (or what it is not, such as the "2" in "H^2"
is not an exponent).

For example, a particular symbol may represent a function in one
context for one author, and an infix operator in another. Knowing
that something is a function helps with pronunciation, so I need a
way for intent to specify that something is a function.
(Maybe not the best example, because of &ApplyFunction;.)
And if that symbol has a name, the author will want to indicate the name.
That is: a mathematical name which may be different than the
Unicode name.
(And as mentioned much earlier in this issue, I think it is good to
distinguish between an established name (which did not make it into core)
and a name which is not generally known and perhaps invented by the author.)

There also is the issue of what authors want to indicate, even if we
might argue that it is not really necessary. For example, the author
may want to say "J" is a Bessel function. They may complain if
they are not allowed to supply that apparently useful information.
So, I wanted a way to encode the name, but in a way that AT knows
it is okay to just say "J". Similarly for authors specifying that
"G" is a group. Maybe AT will not use that now, but if we allowed
isa:group as the intent, that would make some authors happy. I would
prefer not to do something like a data-isa="group" attribute.

The previous paragraph describes things like specifying that the content of an
mo is a function. A related situation is specifying that a large multi-layered
expression is a system of equations, or a matrix, or a "cases", or some other type
of expression. That is important information for AT.

Another issue is numbers. I don't like the idea of requiring "." as the decimal
separator, and there also are complex numbers and scientific
notation. All of those are numbers. So I suggested a number intent
as a wrapper, as in number(3,14159).

I'd like to be able to output those types of intent. And unless
someone can figure out a way to only allow speech strings that make things better,
I'd like to disallow those.

@davidcarlisle
Collaborator

I'd like to be able to output those types of intent. And unless
someone can figure out a way to only allow speech strings that make things better,
I'd like to disallow those.

Actually I'd say a main effect of the proposal here is that it offers arbitrary speech strings for people who don't like _ .

I must admit I assumed that was the main motivation, as it's the main new feature.

intent="named:function(arbitrary English sentence here)"

seems valid (you could replace named with adhoc etc but as far as I can tell all allow the equivalent of

_(_arbitrary, _English, _sentence, _here)

without the ugly _

@davidfarmer
Contributor Author

davidfarmer commented Mar 22, 2023 via email

@davidcarlisle
Collaborator

ok so my example should have been that this proposal makes it easy to have

<msub intent="named:function(Jay naught)"><mi>J</mi><mn>0</mn></msub>

The named:function part of the markup makes a difference.

use adhoc:operation in my examples if you prefer. Unless I misunderstand you completely, the effect of

intent="adhoc:operation(an english sentence)"

is to ignore the mathml markup completely and generate the speech string an english sentence

@dginev
Contributor

dginev commented Mar 22, 2023

unless someone can figure out a way to only allow speech strings that make things better,
I'd like to disallow those.

For the record I hold the opposite design bias:

Unless AT can generally guarantee great coverage of all edge cases we can expect to encounter in a broad sample of real-world uses of math syntax, I would like the authors to always have an "escape hatch" where they can remediate linguistic realities that were not foreseen during the WG's limited charter and survey scope.

@davidfarmer
Contributor Author

davidfarmer commented Mar 22, 2023 via email

@davidcarlisle
Collaborator

davidcarlisle commented Mar 23, 2023

apart from making it easier to supply space separated words (and | separated choices for such strings) the other main feature is categories. These seem mostly a syntactic variant of :properties as used in the current draft, except more restricted unless you take the suggestion to allow more than one. The syntax is a lot more complicated though.

@davidcarlisle
Collaborator

you ask

The "number" intent (is it too confusing to have it as both an intent and a category?)

I think the answer is yes as you give the example

<mrow intent="number(3.14)">

but that doesn't parse. number there is a knownintent so does not allow digits.

To parse 3.14 as a number in the grammar above you would need something like

<mrow intent="value:number(3.14)">

but even this is a bit confusing as it looks like a type ":" category but is in fact a separate grammatical form with separate parse rules for the argument.

I think it would be clearer if we want a separate grammatical form for numbers allowing comma to use a separate syntax, say [3,14] so you could use that anywhere as foo-bar($x,[3,14]) but the feeling on last week's call was not to allow decimal comma in the syntax, which means quoting is not needed and foo-bar($x,3.14) works.
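
For comparison, a minimal Python sketch of what the bracket quoting would mean for splitting arguments. The `[3,14]` form is only the hypothetical syntax floated above, and the `split_args` helper is illustrative: it handles one flat argument string and does not deal with nested parentheses.

```python
def split_args(arglist: str) -> list[str]:
    """Split on commas, keeping any [..] group together as one argument."""
    args, current, depth = [], [], 0
    for ch in arglist:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    args.append("".join(current).strip())
    return args


print(split_args("$x,[3,14]"))  # quoted decimal comma stays in one argument
print(split_args("$x,3.14"))    # decimal point needs no quoting
print(split_args("$x,3,14"))    # unquoted comma yields three arguments
```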

@davidfarmer
Contributor Author

davidfarmer commented Mar 23, 2023 via email

@davidcarlisle
Collaborator

What is a "number"? Complex numbers? Scientific notation?

yes I wondered about that too. Certainly here (documenting a numerical software library) 0.314e1 is as much a number as 2

As is the possibility of push-back if we do not allow commas.

if we were discussing text strings I think there would be push back but I know of no system using comma separated function arguments that allows decimal comma. That said, some quoting method or using spaces for argument separation both work.

@brucemiller
Contributor

The current grammar would treat 0.314e1 as a literal (or depending on implementation, perhaps as a number 0.314 followed by a literal e1, likely an error). Other more liberal proposals that don't specifically call out number would also treat it as a literal. In either case, a :number property might be a reasonable clarification.

Comma remains a problem: 1,235 defaults to a list of two numbers (eg. function arguments). But even if we had a way of quoting the comma, it might be a small number (<2) or a large number (greater than a thousand) depending on locality (of the author? of the listener?) and what the author expected since we said they could use comma :>
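
A small Python sketch of the three readings of `1,235`. The `read_literal` helper and its two convention labels are hypothetical, and in the decimal convention it simply swaps the comma for a point.

```python
def read_literal(text: str, comma: str) -> float:
    """Interpret a numeric literal under an explicit comma convention.

    comma is "decimal" or "thousands"; both labels are made up here.
    """
    if comma == "decimal":
        return float(text.replace(",", "."))
    if comma == "thousands":
        return float(text.replace(",", ""))
    raise ValueError(f"unknown comma convention: {comma}")


print(read_literal("1,235", "decimal"))    # 1.235
print(read_literal("1,235", "thousands"))  # 1235.0
print("1,235".split(","))                  # ['1', '235']
```

Without an explicit convention attached to the literal, nothing in the string itself selects among the three.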

@davidcarlisle
Collaborator

@brucemiller yes at

https://mathml-refresh.github.io/intent-lists/intent4.html#IDdecimalcomma

I have <mn intent='1,234:decimal-comma'>1,234</mn> and <mn intent='1,234:thousands-comma'>1,234</mn> although mathcat doesn't like them (not sure I like them either but they are a placeholder for whatever is decided)

@NSoiffer
Contributor

Both @physikerwelt and @polx were pretty clear last week that allowing two different forms of a "decimal separator" has turned out to be a bad idea in practice. I was worried about imposing my cultural bias on others, but it seems that everyone has accepted the "." in practice and not only are ok with it, but strongly want it to stay that way to keep numbers simpler.

Note: this is about intent values, not the actual display value.

@davidfarmer
Contributor Author

To be replaced by a new issue listing a few of the desirable features which maybe
should be possible with functional intent.
