Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unicode support in .ott files #85

Open
PeterSewell opened this issue Aug 31, 2021 · 3 comments
Open

Unicode support in .ott files #85

PeterSewell opened this issue Aug 31, 2021 · 3 comments

Comments

@PeterSewell
Copy link
Contributor

Some users would like to be able to use unicode in their sources, e.g. with emacs set-input-method TeX. I imagine that this wouldn't be too hard (though not a trivial change), as most of the Ott internals are just operating on byte sequences. One would have to adapt to unicode-friendly top-level lexing of the .ott source files (with https://github.com/ocaml-community/sedlex instead of ocamllex?), perhaps map the obvious unicode subset (e.g. whatever is supported by that input method) to actual TeX in the TeX output (unless tex distros all support well enough with standard packages), make sure the generated GLR parser is happy with unicode (perhaps a no-op, as it presumably just works on byte sequences and can continue to do so), and fix a few functions that check whether identifiers are alphanumeric etc.

@PeterSewell
Copy link
Contributor Author

PeterSewell commented Sep 4, 2021

From a very casual look at LaTeX support for unicode math, there seem to be two basic approaches for the output to that: either use some version of LaTeX and packages that natively understand unicode, or have Ott translate. For the former, @ianstark points to https://ctan.org/pkg/unicode-math, but it says it needs XeTeX or LuaTeX and is restricted to a relatively small set of fonts, so perhaps the latter would be more generally useful. For that, one could probably reuse the unicode-to-latex mapping from that package http://mirrors.ctan.org/macros/unicodetex/latex/unicode-math/unicode-math-table.tex, or from http://milde.users.sourceforge.net/LUCR/Math/, or (perhaps more usable in practice?) from whatever the emacs TeX inputmethod uses, or the union of the last and whatever the vim analogue is (taking due care with licencing, obviously). One would have to decide how well to support combining characters. For prover backends, one should check what unicode they each support and provide a translation - probably based on the TeX names - where they don't.

@PeterSewell
Copy link
Contributor Author

I'm hoping that someone else will take this on at some point. @neel-krishnaswami suggested that it would be useful.

@goldfirere
Copy link

I would find this useful. Though I think it's possible to break the task into two pieces:

  1. Get the ott lexer to understand unicode input
  2. Get the TeX formatter to output correctly

My use-case requires only (1), in that I'm happy to provide TeX homs for all my unicode inputs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants