Synchronize mathtext docs and handling #26173

Merged (2 commits) on Aug 14, 2023

@oscargus oscargus commented Jun 23, 2023

PR summary

Solves parts of #26174 (second commit).

This more or less guarantees that (at least for a while) there will be no discrepancy between the symbols listed as supported and the symbols that are actually handled correctly. That is, if a symbol is listed as a relational operator, it will be rendered with space around it.
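
A rough sketch of the idea, not the actual Matplotlib code: if the documentation table and the spacing logic read from the same sets, a symbol cannot be listed as a relation without also being spaced as one. The set contents and the names `RELATION_SYMBOLS`, `ARROW_SYMBOLS`, `needs_spacing` and `doc_sections` are assumptions made up for illustration.

```python
# Hypothetical stand-ins for the parser's symbol sets (not the real contents).
RELATION_SYMBOLS = {r"\leq", r"\geq", r"\simeq"}
ARROW_SYMBOLS = {r"\to", r"\leftarrow"}


def needs_spacing(sym: str) -> bool:
    """Parser side: decide whether a symbol gets space around it."""
    return sym in RELATION_SYMBOLS or sym in ARROW_SYMBOLS


def doc_sections() -> list:
    """Docs side: build the symbol table from the very same sets."""
    return [
        ["Relation symbols", sorted(RELATION_SYMBOLS)],
        ["Arrow symbols", sorted(ARROW_SYMBOLS)],
    ]


# Every documented symbol is, by construction, one the parser spaces.
documented = {sym for _, syms in doc_sections() for sym in syms}
assert all(needs_spacing(sym) for sym in documented)
```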

Also adds a few new sections to the supported symbols list. Not sure if "Western Europe" is the best description for å, Å etc. (although they are used in Swedish, and some of the others are used in Norwegian etc.).

Also lists the directly supported script, fraktur and blackboard characters, if nothing else to see which ones work and which should maybe be added...

The ones that are not yet added are:

vec = ⃗
textasciicircum = ^
imageof = ⊷
origof = ⊶
solbar = ⌿
k = ̨
textexclamdown = ¡
llcorner = ⌞
invnot = ⌐
lq = ‘
greater = >
lrcorner = ⌟
not = ̸
bar = ̄
rightharpoonaccent = ⃑
textasciitilde = ~
leftharpoonaccent = ⃐
carriagereturn = ↵
overarc = ̑
candra = ̐
quad =  
minus = −
kernelcontraction = ∻
spadesuitopen = ♤
textquotedblleft = “
grave = ̀
acwopencirclearrow = ↺
dot = ̇
turnednot = ⌙
rasp = ʼ
H = ̋
ulcorner = ⌜
underbar = ̱
lasp = ʽ
c = ̧
colon = :
textasciiacute = ´
breve = ̆
textquestiondown = ¿
__sqrt__ = √
emdash = —
less = <
tilde = ̃
textasciigrave = `
smallsetminus = ∖
cdotp = ·
hat = ̂
acute = ́
d = ̣
t = ͡
dddot = ⃛
ocirc = ̊
ddddot = ⃜
thickspace =  
endash = –
urcorner = ⌝
prurel = ⊰
ddot = ̈
check = ̌
rq = ’
textquotedblright = ”
% = %
_ = _
# = #
guillemotleft = «
ring = ˚
guilsinglright = ›
macron = ¯
guillemotright = »
asterisk = *
guilsinglleft = ‹
plus = +
smallintclockwise = ∱
smallvarointclockwise = ∲
smallointctrcclockwise = ∳

The accents should probably be removed as they are handled separately. For example ddot is mapped to combiningdiaeresis.

Then, one may wonder if we want to openly state that we support \plus rather than simply using +?


@oscargus
Contributor Author

Another change is that the extension now prints the rendered symbols as in the list above, which clearly showed that leftbrace had the wrong mapping.

@oscargus
Contributor Author

The test fails because mathtext also adds space around the "spaced operators" in subscripts, while LaTeX doesn't. (\to was not in the list of arrows, but should be spaced at normal size.)

@oscargus
Contributor Author

oscargus commented Jun 23, 2023

Here is the outcome: https://output.circle-artifacts.com/output/job/0c35025a-735d-4800-a574-b790431306bd/artifacts/0/doc/build/html/users/explain/text/mathtext.html

A bit annoyingly *, +, and - are all interpreted as list bullets...

@QuLogic
Member

QuLogic commented Jul 6, 2023

A bit annoyingly *, +, and - are all interpreted as list bullets...

Should maybe be fixed? Perhaps a backslash prefix will work?

_mathtext.Parser._arrow_symbols],
["Dot symbols",
4,
r"""\cdots \vdots \ldots \ddots \adots \Colon \therefore \because""".split()],
Member

Aren't therefore and because in Relation operators?

Contributor Author

Yes, they are there as well.

I just saw a table of "dot symbols" somewhere (I do not remember where, maybe the "List of LaTeX symbols..." or whatever it is called) that included them, so I thought it might make sense. Easy to modify though.

@oscargus
Contributor Author

oscargus commented Jul 7, 2023

Should maybe be fixed? Perhaps a backslash prefix will work?

Indeed. I just had no idea how to achieve it. I have tried to read up, but haven't found anything. I guess making an HTML table directly is one way to go though?

@QuLogic
Member

QuLogic commented Jul 7, 2023

Usually, a backslash before characters will stop reST from treating them as whatever markup, so pretty sure that'll work for the bullets as well.
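
A minimal sketch of that escaping, assuming the symbol table is emitted one cell at a time. `escape_rst_cell` is a hypothetical helper for illustration, not the function used in the PR.

```python
# The three characters from this thread that reST reads as bullet markers.
RST_BULLETS = {"*", "+", "-"}


def escape_rst_cell(text: str) -> str:
    """Prefix a lone bullet character with a backslash so reST keeps it literal."""
    if text in RST_BULLETS:
        return "\\" + text
    return text


# Only the bare bullet characters are changed; other symbols pass through.
cells = [escape_rst_cell(c) for c in ["*", "+", "-", r"\plus"]]
```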

@oscargus oscargus force-pushed the symboldocs branch 2 times, most recently from d52726a to d7f95e6 Compare July 13, 2023 02:20
@oscargus
Contributor Author

Fixed. Seems to work.

@ksunden ksunden modified the milestones: v3.8.0, v3.8-doc Aug 8, 2023
@oscargus oscargus removed this from the v3.8-doc milestone Aug 9, 2023
@oscargus oscargus added this to the v3.8.0 milestone Aug 9, 2023
@oscargus
Contributor Author

oscargus commented Aug 9, 2023

I think this should go into 3.8, since "half-made" changes towards this are already in and this is the final step.

@@ -910,7 +910,7 @@
'O' : 216,
'hookleftarrow' : 8617,
'trianglerighteq' : 8885,
'nsime' : 8772,
Member

Has this just been wrong since 2006?!

Contributor Author

I guess so. However, I do not think it was documented, so I wonder if anyone ever used it.

Member

@tacaswell tacaswell left a comment

I have some concern about the nsime -> nsimeq change; however, it appears that nsimeq is the correct symbol.

https://milde.users.sourceforge.net/LUCR/Math/mathpackages/txfonts-symbols.pdf

If we need an API change note, we can add that in a follow up PR.

ksunden previously requested changes Aug 9, 2023
@@ -1066,7 +1068,7 @@
'hermitmatrix' : 8889,
'barvee' : 8893,
'measuredrightangle' : 8894,
'varlrtriangle' : 8895,
Member

This looks to have been correct? lrtriangle is U+25ff (9727) and varlrtriangle is U+22bf (8895).

reference: https://ctan.mirrors.hoobly.com/fonts/stix/doc/stix.pdf

Both are triangles, but STIX assigns this code point the same name as the old one.
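
The code points quoted above can be checked with the standard library alone. This is only a sanity check on the numbers; the dictionary below is a hand-written stand-in for illustration, not Matplotlib's actual table.

```python
import unicodedata

# Decimal code points as quoted in the review comment.
symbols = {"lrtriangle": 9727, "varlrtriangle": 8895}

for name, codepoint in symbols.items():
    char = chr(codepoint)
    # Print the TeX name, the hex code point, the glyph and its Unicode name.
    print(f"\\{name}: U+{codepoint:04X} {char} {unicodedata.name(char)}")
```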

@@ -1066,7 +1068,7 @@
'hermitmatrix' : 8889,
'barvee' : 8893,
'measuredrightangle' : 8894,
- 'varlrtriangle' : 8895,
+ 'lrtriangle' : 8895,
Member

Suggested change
- 'lrtriangle' : 8895,
+ 'varlrtriangle' : 8895,

Contributor Author

I'll add lrtriangle then. It didn't make sense to have varlrtriangle if we don't have lrtriangle...

Contributor Author

I'm wondering if one should write a script that goes through the source of the STIX documentation and adds all those symbols? The only problem is adding them to the documentation...

Contributor Author

Reverted to varlrtriangle. Any additional symbols will be in another PR.

@@ -1734,7 +1734,17 @@ class _MathStyle(enum.Enum):
\cap \triangleleft \dagger
\cup \triangleright \ddagger
\uplus \lhd \amalg
- \dotplus \dotminus'''.split())
+ \dotplus \dotminus \Cap
Member

This is going to change how some things render; do we need a note about this?

@ksunden
Member

ksunden commented Aug 10, 2023

The doc build seems to still be getting lrtriangle instead of varlrtriangle

Seems to be due to

if ignore_variant and sym != r"\varnothing":
    sym = sym.replace(r"\var", "\\")

Probably the thing to do is to add just this one here.
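
A minimal sketch of that exception, assuming the stripping logic lives in a small helper. `KEEP_VARIANT` and `strip_variant` are hypothetical names; only the `replace` call and the `\varnothing` special case come from the snippet above.

```python
# Symbols whose "\var" prefix is part of the real name and must be kept.
KEEP_VARIANT = {r"\varnothing", r"\varlrtriangle"}


def strip_variant(sym: str, ignore_variant: bool = True) -> str:
    """Drop the "\\var" prefix unless the symbol is a listed exception."""
    if ignore_variant and sym not in KEEP_VARIANT:
        sym = sym.replace(r"\var", "\\")
    return sym
```

With this, `\varepsilon` is still reduced to `\epsilon`, while `\varlrtriangle` and `\varnothing` pass through unchanged.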

@oscargus
Contributor Author

Ah, that may have been an additional reason for the name change...

@oscargus
Contributor Author

This is updated

@ksunden ksunden merged commit ac95c22 into matplotlib:main Aug 14, 2023
38 of 39 checks passed
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Aug 15, 2023
ksunden added a commit that referenced this pull request Aug 15, 2023
…173-on-v3.8.x

Backport PR #26173 on branch v3.8.x (Synchronize mathtext docs and handling)
@oscargus oscargus deleted the symboldocs branch August 16, 2023 08:59
@ksunden ksunden mentioned this pull request Sep 15, 2023
5 tasks