Expanded UnicodeMath support #8

Open
bwiernik opened this issue Mar 18, 2020 · 4 comments

Comments

@bwiernik

Is there any possibility you might expand this plugin to support UnicodeMath syntax more broadly? I really prefer the nearly plain-text syntax of UnicodeMath to LaTeX, especially for things like fractions.

@marhop
Owner

marhop commented Mar 29, 2020

Hi,

Thanks for the link to the Unicode Technical Note, I did not know about this! This is a really nice idea and it feels like the perfect logical consequence of this filter.

I fear it is a little out of scope at the moment though, mostly for technical reasons. This filter currently uses a very simple implementation that reads exactly one character at a time (no further context!), looks it up in a translation table and, if found, replaces it with the corresponding LaTeX command. Implementing UnicodeMath as specified in the Tech Note would instead require a real parser that is capable of handling more complex syntactic constructs like (2+3)/5.
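To make the limitation concrete, here is a minimal Python sketch of the per-character lookup described above. It is not the filter's actual code: the table `SYMBOLS`, the function `translate` and the trailing space after each command are assumptions made for illustration only.

```python
# Hypothetical, tiny translation table (an assumption for illustration);
# a real table would cover many more symbols.
SYMBOLS = {
    "α": r"\alpha",
    "β": r"\beta",
    "≤": r"\leq",
    "∞": r"\infty",
}


def translate(text: str) -> str:
    """Replace each known character with a LaTeX command, one at a time.

    No context is kept between characters, so nothing that spans several
    characters (such as a fraction) can be recognized.
    """
    out = []
    for ch in text:
        cmd = SYMBOLS.get(ch)
        # Unknown characters pass through unchanged; a trailing space
        # (an assumption) keeps a command from merging with what follows.
        out.append(ch if cmd is None else cmd + " ")
    return "".join(out)


if __name__ == "__main__":
    print(translate("α≤β"))      # each symbol is replaced on its own
    print(translate("(2+3)/5"))  # passes through unchanged
```

With this approach `(2+3)/5` comes out exactly as it went in, whereas UnicodeMath would read it as the fraction `\frac{2+3}{5}`. Getting there means recognizing the parenthesized group and the `/` operator together, which is the job of a parser rather than a lookup table.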

I'm not saying I won't think about it (because I probably will) but you should not expect anything close to a working solution anytime soon ... Sorry.

Best,
Martin

@bwiernik
Author

It occurs to me that pandoc already has parsers for the Microsoft Office implementation of UnicodeMath (labeled as readOMML and writeOMML). Do you know if it is possible to call those functions from a filter?

@marhop
Owner

marhop commented Apr 13, 2020

Yeah, that should be possible by importing the texmath library (GitHub/API docs), in which these functions are implemented, into a Pandoc filter.

I fear though that this is not really what you are looking for. OMML is not the same as UnicodeMath, but an XML dialect (remotely similar to MathML) that is used by Microsoft Word for the definition of math structures. From a slightly more recent version of the Unicode Technical Note, page 3:

In Word, the structures are defined in OMML (Office MathML) and built up by Word, while for the other apps, the structures are defined in UnicodeMath and built up by RichEdit.

I guess that means while you can enter UnicodeMath in Word (you can, right? I don't know) it is stored as OMML internally. That's why Pandoc does not need to read/write UnicodeMath but "only" OMML to process Word documents.
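To illustrate the difference, the following Python sketch shows roughly what the fraction (2+3)/5 looks like as OMML and pulls the numerator and denominator back out of the XML. The fragment `OMML_FRACTION` is hand-written for this example, not output captured from Word, and the helpers `q`, `run_text` and `fraction_to_latex` are hypothetical names; this is not how texmath or Pandoc implement their OMML reader.

```python
import xml.etree.ElementTree as ET

# Namespace used by Office Math Markup Language elements.
M = "http://schemas.openxmlformats.org/officeDocument/2006/math"

# Hand-written OMML for the fraction (2+3)/5 (illustrative only).
OMML_FRACTION = f"""
<m:oMath xmlns:m="{M}">
  <m:f>
    <m:num><m:r><m:t>2+3</m:t></m:r></m:num>
    <m:den><m:r><m:t>5</m:t></m:r></m:den>
  </m:f>
</m:oMath>
"""


def q(tag: str) -> str:
    """Return the namespace-qualified name ElementTree expects."""
    return "{" + M + "}" + tag


def run_text(node: ET.Element) -> str:
    """Concatenate the text of all m:t elements below node."""
    return "".join(t.text or "" for t in node.iter(q("t")))


def fraction_to_latex(omml: str) -> str:
    """Convert a single top-level m:f fraction into a LaTeX \\frac."""
    root = ET.fromstring(omml.strip())
    frac = root.find(q("f"))
    if frac is None:
        raise ValueError("no m:f element found")
    num = run_text(frac.find(q("num")))
    den = run_text(frac.find(q("den")))
    return r"\frac{" + num + "}{" + den + "}"


if __name__ == "__main__":
    print(fraction_to_latex(OMML_FRACTION))
```

The structure is already explicit in the XML, so no expression parsing is needed here. The same fraction typed as UnicodeMath is just the string `(2+3)/5`, and a reader has to work out the grouping itself.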

@bwiernik
Author

Ah, got it. The writings of the UnicodeMath author about this are somewhat hard to follow regarding exactly what everything is.
