From d2a9d87925ccb538a5b2aa93962e18b4a392d701 Mon Sep 17 00:00:00 2001
From: Hubert Chathi
Date: Fri, 26 Jul 2019 10:39:05 -0400
Subject: [PATCH 1/7] add proposal for using LaTeX for maths display

---
 proposals/xxxx-maths.md | 127 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 127 insertions(+)
 create mode 100644 proposals/xxxx-maths.md

diff --git a/proposals/xxxx-maths.md b/proposals/xxxx-maths.md
new file mode 100644
index 0000000000..ee2fa823e5
--- /dev/null
+++ b/proposals/xxxx-maths.md
@@ -0,0 +1,127 @@
+# Support for LaTeX in messages
+
+Some people write using an odd language that has strange symbols. No, I'm not
+talking about computer programmers; I'm talking about mathematicians. In order
+to aid these people in communicating, Matrix should define a standard way of
+including mathematical notation in messages.
+
+This proposal presents a format using LaTeX, in contrast with a [previous
+proposal](https://github.com/matrix-org/matrix-doc/pull/1722/) that used
+MathML.
+
+See also:
+
+- https://github.com/vector-im/riot-web/issues/1945
+
+
+## Proposal
+
+A new attribute `data-mx-maths` will be added for use in `<span>` or `<div>`
+elements. Its value will be mathematical notation in LaTeX format. `<span>`
+is used for inline math, and `<div>` for display math. The contents of the
+`<span>` or `<div>` will be a fallback representation or the desired notation
+for clients that do no support mathematical display, or that are unable to
+render the entire `data-mx-maths` attribute. The fallback representation is
+left up to the sending client and could be, for example, an image, or an HTML
+approximation, or the raw LaTeX source. When using an image as a fallback, the
+sending client should be aware of issues that may arise from the receiving
+client using a different background colours.
+
+Example (with line breaks and indentation added to `formatted_body` for clarity):
+
+```json
+{
+    "content": {
+        "body": "This is an equation: sin(x)=a/b",
+        "format": "org.matrix.custom.html",
+        "formatted_body": "This is an equation:
+
+            sin(x)=a/b
+          ",
+        "msgtype": "m.text"
+    },
+    "event_id": "$eventid:example.com",
+    "origin_server_ts": 1234567890
+    "sender": "@alice:example.com",
+    "type": "m.room.message",
+    "room_id": "!soomeroom:example.com"
+}
+```
+
+
+## Other solutions
+
+[MSC1722](https://github.com/matrix-org/matrix-doc/pull/1722/) proposes using
+MathML as the format of transporting mathematical notation. It also summarizes
+some other solutions in its "Other Solutions" section.
+
+In comparison with MathML, LaTeX has several advantages and disadvantages.
+
+The first advantage, which is quite obvious, is that LaTeX is much less verbose
+and more readable than MathML. In many cases, the LaTeX code is a suitable
+fallback for the rendered notation.
+
+LaTeX is a suitable input method for many people, and so converting from a
+user's input to the message format would be a no-op.
+
+However, balanced against these advantages, LaTeX has several disadvantages as
+a message format. Some of these are covered in the "Potential issues" and
+"Security considerations".
+
+
+## Potential issues
+
+### "LaTeX" as a format is poorly defined
+
+There are several extensions to LaTeX that are commonly used, such as
+AMS-LaTeX. It is unclear which extensions should be supported, and which
+should not be supported. Different LaTeX-rendering libraries support different
+sets of commands.
+
+This proposal suggests that the receiving client should render the LaTeX
+version if possible, but if it contains unsupported commands, then it should
+display the fallback. Thus, it is up to the receiving client to decide what
+commands it will support, rather than dictating what commands must be
+supported. This comes at a cost of possible inconsistency between clients, but
+is somewhat mitigated by the use of a fallback.
+
+### Lack of libraries for displaying mathematics
+
+see the corresponding section in [MSC1722](https://github.com/matrix-org/matrix-doc/pull/1722/)
+
+
+## Security considerations
+
+LaTeX is a [Turing complete programming
+language](https://web.archive.org/web/20160110102145/http://en.literateprograms.org/Turing_machine_simulator_%28LaTeX%29);
+it is possible to write a LaTeX document that contains an infinite loop, or
+that will require large amounts of memory. While it may be fun to write a
+[LaTeX file that can control a Mars
+Rover](https://wiki.haskell.org/wikiupload/8/85/TMR-Issue13.pdf#chapter.2), it
+is not desireable for a mathematical formula embedded in a Matrix message to
+control a Mars Rover. Clients should take precautions when rendering LaTeX.
+Clients that use a rendering library should only use one that can process the
+LaTeX safely.
+
+Clients should not render mathematics by calling the `latex` executable without
+proper sandboxing, as `latex` was not written to handle untrusted input. (see,
+for example, https://hovav.net/ucsd/dist/texhack.pdf,
+https://0day.work/hacking-with-latex/, and
+https://hovav.net/ucsd/dist/tex-login.pdf .)
+
+Certain commands such as `\newcommand` are potentially dangerous; clients
+should either decline to process those commands, or should take care to ensure
+that they are handled in safe ways. In general, LaTeX commands should be
+filtered using a whitelist rather than blacklist.
+
+In general, LaTeX places a heavy burden on client authors to ensure that it is
+processed safely.
+
+
+## Conclusion
+
+Math(s) is hard, but LaTeX makes it easier to write mathematical notation.
+However, using LaTeX as a format for including mathematics in Matrix messages
+has some serious downsides. Nevertheless, if clients handle the LaTeX
+carefully, or rely on the fallback representation, the concerns can be
+addressed.

From 64e36264b388df083144ba8fe603cedfc3110643 Mon Sep 17 00:00:00 2001
From: Hubert Chathi
Date: Fri, 26 Jul 2019 10:45:17 -0400
Subject: [PATCH 2/7] rename to match MSC number

---
 proposals/{xxxx-maths.md => 2191-maths.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename proposals/{xxxx-maths.md => 2191-maths.md} (99%)

diff --git a/proposals/xxxx-maths.md b/proposals/2191-maths.md
similarity index 99%
rename from proposals/xxxx-maths.md
rename to proposals/2191-maths.md
index ee2fa823e5..95a85b1e78 100644
--- a/proposals/xxxx-maths.md
+++ b/proposals/2191-maths.md
@@ -41,7 +41,7 @@ Example (with line breaks and indentation added to `formatted_body` for clarity)
         "msgtype": "m.text"
     },
     "event_id": "$eventid:example.com",
-    "origin_server_ts": 1234567890
+    "origin_server_ts": 1234567890,
     "sender": "@alice:example.com",
     "type": "m.room.message",
     "room_id": "!soomeroom:example.com"

From d27cfddfbc019e1607b71cb7b8aa7319603cf109 Mon Sep 17 00:00:00 2001
From: Hubert Chathi
Date: Mon, 19 Oct 2020 19:45:43 -0400
Subject: [PATCH 3/7] change title

---
 proposals/2191-maths.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/2191-maths.md b/proposals/2191-maths.md
index 95a85b1e78..ca0259b4b3 100644
--- a/proposals/2191-maths.md
+++ b/proposals/2191-maths.md
@@ -1,4 +1,4 @@
-# Support for LaTeX in messages
+# Markup for mathematical messages
 
 Some people write using an odd language that has strange symbols. No, I'm not
 talking about computer programmers; I'm talking about mathematicians. In order

From 1587a801c68a24efdfec63124f5f7ac90302d7ba Mon Sep 17 00:00:00 2001
From: Hubert Chathi
Date: Tue, 13 Feb 2024 16:26:25 -0500
Subject: [PATCH 4/7] update based on feedback

---
 proposals/2191-maths.md | 37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/proposals/2191-maths.md b/proposals/2191-maths.md
index ca0259b4b3..21c23f4ead 100644
--- a/proposals/2191-maths.md
+++ b/proposals/2191-maths.md
@@ -1,4 +1,4 @@
-# Markup for mathematical messages
+# MSC2191: Markup for mathematical messages
 
 Some people write using an odd language that has strange symbols. No, I'm not
 talking about computer programmers; I'm talking about mathematicians. In order
@@ -83,11 +83,17 @@ version if possible, but if it contains unsupported commands, then it should
 display the fallback. Thus, it is up to the receiving client to decide what
 commands it will support, rather than dictating what commands must be
 supported. This comes at a cost of possible inconsistency between clients, but
-is somewhat mitigated by the use of a fallback.
+is somewhat mitigated by the use of a fallback. Clients should, however,
+aim to support, at minimum, the basic LaTeX2e maths commands and the TeX maths
+commands, with the exception of commands that could be security risks (see
+below).
+
+To improve compatibility, the sender's client may warn the sender if they are
+using a command that comes from another package, such as AMS-LaTeX.
 
 ### Lack of libraries for displaying mathematics
 
-see the corresponding section in [MSC1722](https://github.com/matrix-org/matrix-doc/pull/1722/)
+see the corresponding section in [MSC1722](https://github.com/matrix-org/matrix-spec-proposals/pull/1722/files#diff-4a271297299040dbfa622bfc6d2aab02f9bc82be0b28b2a92ce30b14c5621f94R148-R164)
 
 
 ## Security considerations
@@ -104,18 +110,23 @@ Clients that use a rendering library should only use one that can process the
 LaTeX safely.
 
 Clients should not render mathematics by calling the `latex` executable without
-proper sandboxing, as `latex` was not written to handle untrusted input. (see,
-for example, https://hovav.net/ucsd/dist/texhack.pdf,
-https://0day.work/hacking-with-latex/, and
-https://hovav.net/ucsd/dist/tex-login.pdf .)
-
-Certain commands such as `\newcommand` are potentially dangerous; clients
-should either decline to process those commands, or should take care to ensure
-that they are handled in safe ways. In general, LaTeX commands should be
-filtered using a whitelist rather than blacklist.
+proper sandboxing, as the `latex` executable was not written to handle
+untrusted input. (see, for example, <https://hovav.net/ucsd/dist/texhack.pdf>,
+<https://0day.work/hacking-with-latex/>, and
+<https://hovav.net/ucsd/dist/tex-login.pdf>.) Some LaTeX rendering libraries
+are better suited for processing untrusted input.
+
+Certain commands, such as [those that can create
+macros](https://katex.org/docs/supported#macros), are potentially dangerous;
+clients should either decline to process those commands, or should take care to
+ensure that they are handled in safe ways (such as by limiting recursion). In
+general, LaTeX commands should be filtered by allowing known-good commands
+rather than forbidding known-bad commands. Some LaTeX libraries may have
+options for doing this.
 
 In general, LaTeX places a heavy burden on client authors to ensure that it is
-processed safely.
+processed safely. Some LaTeX rendering libraries provide security advice, for
+example, .
 
 
 ## Conclusion

From 6829f0556290bb2385b98033480247e7efa2a546 Mon Sep 17 00:00:00 2001
From: Hubert Chathi
Date: Tue, 13 Feb 2024 17:06:37 -0500
Subject: [PATCH 5/7] up to clients how to deal with potentially-dangerous commands

---
 proposals/2191-maths.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/proposals/2191-maths.md b/proposals/2191-maths.md
index 21c23f4ead..49bbcdc7fb 100644
--- a/proposals/2191-maths.md
+++ b/proposals/2191-maths.md
@@ -83,10 +83,10 @@ version if possible, but if it contains unsupported commands, then it should
 display the fallback. Thus, it is up to the receiving client to decide what
 commands it will support, rather than dictating what commands must be
 supported. This comes at a cost of possible inconsistency between clients, but
-is somewhat mitigated by the use of a fallback. Clients should, however,
-aim to support, at minimum, the basic LaTeX2e maths commands and the TeX maths
-commands, with the exception of commands that could be security risks (see
-below).
+is somewhat mitigated by the use of a fallback. Clients should, however, aim
+to support, at minimum, the basic LaTeX2e maths commands and the TeX maths
+commands, with the possible exception of commands that could be security risks
+(see below).
 
 To improve compatibility, the sender's client may warn the sender if they are
 using a command that comes from another package, such as AMS-LaTeX.

From fd783691766287d9d1007db94c14b44bc681a1f0 Mon Sep 17 00:00:00 2001
From: Hubert Chathi
Date: Tue, 13 Feb 2024 17:21:21 -0500
Subject: [PATCH 6/7] fix typo

Co-authored-by: Travis Ralston
---
 proposals/2191-maths.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/2191-maths.md b/proposals/2191-maths.md
index 49bbcdc7fb..335ef502fc 100644
--- a/proposals/2191-maths.md
+++ b/proposals/2191-maths.md
@@ -20,7 +20,7 @@ A new attribute `data-mx-maths` will be added for use in `<span>` or `<div>`
 elements. Its value will be mathematical notation in LaTeX format. `<span>`
 is used for inline math, and `<div>` for display math. The contents of the
 `<span>` or `<div>` will be a fallback representation or the desired notation
-for clients that do no support mathematical display, or that are unable to
+for clients that do not support mathematical display, or that are unable to
 render the entire `data-mx-maths` attribute. The fallback representation is
 left up to the sending client and could be, for example, an image, or an HTML
 approximation, or the raw LaTeX source. When using an image as a fallback, the

From 4ad26f827a831b09f7bfede9b6511d9afc942a3b Mon Sep 17 00:00:00 2001
From: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Date: Tue, 20 Feb 2024 17:38:33 +0000
Subject: [PATCH 7/7] small typo fix

---
 proposals/2191-maths.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/2191-maths.md b/proposals/2191-maths.md
index 335ef502fc..28a26f3807 100644
--- a/proposals/2191-maths.md
+++ b/proposals/2191-maths.md
@@ -25,7 +25,7 @@ render the entire `data-mx-maths` attribute. The fallback representation is
 left up to the sending client and could be, for example, an image, or an HTML
 approximation, or the raw LaTeX source. When using an image as a fallback, the
 sending client should be aware of issues that may arise from the receiving
-client using a different background colour.
+client using a different background colours.
+client using a different background colour.
 
 Example (with line breaks and indentation added to `formatted_body` for clarity):
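
Reviewer note, not part of the patch series: the proposal asks receiving clients to render the LaTeX only when every command is one they support, to show the fallback otherwise, and to filter by allowing known-good commands. A minimal sketch of that decision follows, in Python. The names `ALLOWED_COMMANDS`, `unsupported_commands` and `choose_representation` are hypothetical, the allow-list contents are an illustrative assumption rather than anything the proposal mandates, and treating control symbols such as `\,` as harmless is likewise an assumption.

```python
from __future__ import annotations

import re

# Hypothetical allow-list (an assumption for illustration). A real client would
# derive it from the commands its rendering library is known to handle safely.
ALLOWED_COMMANDS = frozenset({"sin", "cos", "frac", "sqrt", "alpha", "beta", "pi"})

# A backslash followed by letters (a control word such as "\frac") or by one
# other character (a control symbol such as "\," or "\{").
_COMMAND_RE = re.compile(r"\\([A-Za-z]+|.)", re.DOTALL)


def unsupported_commands(latex: str) -> set[str]:
    """Return the control words in `latex` that are not on the allow-list.

    Control symbols (non-letter characters after the backslash) are ignored
    here; that is a simplifying assumption of this sketch.
    """
    found = {match.group(1) for match in _COMMAND_RE.finditer(latex)}
    return {name for name in found if name.isalpha() and name not in ALLOWED_COMMANDS}


def choose_representation(latex: str, fallback: str) -> tuple[str, str]:
    """Return ("latex", latex) if every command is allowed, else ("fallback", fallback)."""
    if unsupported_commands(latex):
        return ("fallback", fallback)
    return ("latex", latex)
```

With this sketch, a value that defines a macro (for example one using `\newcommand`) is never handed to the renderer, so the client displays the element's fallback contents instead. An allow-list check of this kind does not replace the sandboxing and recursion limits discussed under "Security considerations".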