Handle the <math> HTML tag #54

Closed
BoboTiG opened this issue May 19, 2020 · 5 comments · Fixed by #225

BoboTiG commented May 19, 2020

Wiktionary page: https://fr.wiktionary.org/wiki/octonion

Wikicode:

<math>x=x_0+x_1{\rm i}+x_2{\rm j}+x_3{\rm k}+x_4{\rm l}+x_5{\rm il}+x_6{\rm jl}+x_7{\rm kl}</math>

Output:

x=x_0+x_1{\rm i}+x_2{\rm j}+x_3{\rm k}+x_4{\rm l}+x_5{\rm il}+x_6{\rm jl}+x_7{\rm kl}

Expected:

Not sure whether we can display it, or how.

BoboTiG commented Jun 1, 2020

Kobo can display pictures, so we should try generating them with TeX or LaTeX. Each picture must be embedded as a base64-encoded data URI. See https://fr.m.wikipedia.org/wiki/Aide:Formules_TeX and https://pgaskin.net/dictutil/dicthtml/format.html#gif-jpg-etc.

@BoboTiG BoboTiG closed this as completed Jun 1, 2020
@BoboTiG BoboTiG reopened this Jun 1, 2020

BoboTiG commented Jun 1, 2020

diff --git a/.github/workflows/auto-updates.yml b/.github/workflows/auto-updates.yml
index 45850b5..f8ab3b7 100644
--- a/.github/workflows/auto-updates.yml
+++ b/.github/workflows/auto-updates.yml
@@ -24,6 +24,9 @@ jobs:
       with:
         python-version: 3.8
 
+    - name: Install LaTeX requirements
+      run: sudo apt install dvipng texlive-latex-base
+
     - name: Install requirements
       run: python -m pip install -r requirements.txt
 
diff --git a/scripts/utils.py b/scripts/utils.py
index 63a28c6..b0da64c 100644
--- a/scripts/utils.py
+++ b/scripts/utils.py
@@ -4,7 +4,7 @@ from contextlib import suppress
 from datetime import datetime
 from functools import lru_cache
 from pathlib import Path
-from typing import Tuple
+from typing import Match, Tuple
 from warnings import warn
 
 from .constants import DOWNLOAD_URL
@@ -277,9 +277,29 @@ def clean(word: str, text: str, locale: str) -> str:
     text = sub(r"\s{2,}", " ", text)
     text = sub(r"\s{1,}\.", ".", text)
 
+    # Handle the <math> case
+    text = sub(r"<math>(.+?)</math>", convert_math, text)
+
     return text.strip()
 
 
+def convert_math(match: Match[str]) -> str:
+    """Render a LaTeX formula with pnglatex and return it as an inline PNG <img> tag."""
+    expr: str = match.group(1) if isinstance(match, re.Match) else match
+
+    import subprocess
+    cmd = ["./pnglatex", "-f", expr, "-o", "file.png", "-d", "96"]
+    subprocess.check_call(cmd)
+
+    import base64
+    img = '<img src="data:image/png;base64,'
+    with open("file.png", "rb") as f:
+        img += base64.encodebytes(f.read()).decode()
+    img += '"/>'
+
+    return img
+
+
 def transform(word: str, template: str, locale: str) -> str:
     """Convert the data from the *template" template.
     This function also checks for template style.

This requires https://github.com/mneri/pnglatex at the root of the repository.

It works, but I am not sure we want that: it will break Kobo devices running a firmware < 4.20.*.

Screenshot from the browser:
preview
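
A hedged variation on the patch above, as a minimal sketch rather than the code that was eventually merged. It keeps the same `pnglatex` flags as the diff (`-f`, `-o`, `-d`), but writes to a temporary directory instead of a fixed `file.png`, uses a non-greedy pattern so that two formulas in one definition are converted separately, and encodes with `base64.b64encode` so the data URI contains no line breaks. The names `MATH_TAG`, `render_with_pnglatex`, `to_img_tag` and `convert_all_math` are hypothetical, and it is an assumption that `pnglatex` accepts an output path outside the current directory.

```python
import base64
import re
import subprocess
import tempfile
from pathlib import Path
from typing import Callable

# Non-greedy, so "<math>a</math> and <math>b</math>" yields two matches.
# DOTALL lets a formula span several lines.
MATH_TAG = re.compile(r"<math>(.+?)</math>", re.DOTALL)


def render_with_pnglatex(expr: str) -> bytes:
    """Render *expr* to PNG bytes with the pnglatex script used in the diff.

    Assumes ./pnglatex exists and honours the -f/-o/-d options shown above.
    """
    with tempfile.TemporaryDirectory() as tmp:
        output = Path(tmp) / "formula.png"
        cmd = ["./pnglatex", "-f", expr, "-o", str(output), "-d", "96"]
        subprocess.check_call(cmd)
        return output.read_bytes()


def to_img_tag(png: bytes) -> str:
    """Embed PNG bytes in an <img> tag as a base64 data URI."""
    # Unlike encodebytes(), b64encode() does not insert newlines.
    payload = base64.b64encode(png).decode("ascii")
    return f'<img src="data:image/png;base64,{payload}"/>'


def convert_all_math(
    text: str, render: Callable[[str], bytes] = render_with_pnglatex
) -> str:
    """Replace every <math>...</math> span in *text* with an inline image."""
    return MATH_TAG.sub(lambda m: to_img_tag(render(m.group(1))), text)
```

Passing the renderer as a parameter makes the substitution testable without a LaTeX installation: calling `convert_all_math(text, render=lambda expr: expr.encode())` exercises the regex and the encoding while skipping the subprocess call.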

@BoboTiG BoboTiG changed the title from "<math> not handled" to "Handle the <math> HTML tag" Nov 3, 2020
BoboTiG added a commit that referenced this issue Nov 12, 2020
Requires a firmware >= 4.20.
@BoboTiG BoboTiG mentioned this issue Nov 12, 2020
4 tasks
lasconic commented

FWIW

$ cat data/fr/data.json | grep "<math>" | wc -l
59

BoboTiG commented Nov 13, 2020

Yes, but considering all locales, it may be worth handling. For English:

$ cat data/en/data.json | grep "<math>" | wc -l
464

Not a big priority though.
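
The `grep | wc -l` pipelines above count matching lines, so a line holding several formulas is counted once. A small sketch that counts opening tags instead is shown below; it assumes only the `data/<locale>/data.json` layout visible in the commands, treats each file as plain text, and uses the hypothetical names `count_tag` and `count_per_locale`. Its numbers may therefore differ from the 59 and 464 reported here.

```python
import re
from pathlib import Path
from typing import Dict


def count_tag(text: str, tag: str) -> int:
    """Count opening tags such as <math> or <math display="block"> in *text*."""
    pattern = re.compile(rf"<{re.escape(tag)}(?:\s[^>]*)?>")
    return len(pattern.findall(text))


def count_per_locale(root: Path, tag: str = "math") -> Dict[str, int]:
    """Map each locale directory under *root* to its number of opening tags."""
    counts: Dict[str, int] = {}
    for data_file in sorted(root.glob("*/data.json")):
        content = data_file.read_text(encoding="utf-8")
        counts[data_file.parent.name] = count_tag(content, tag)
    return counts
```

Running `count_per_locale(Path("data"))` from the repository root would give one figure per locale in a single pass.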

lasconic commented

The <chem> tag could also be supported by the same code.

Interesting reading: https://en.wikipedia.org/wiki/Help:Displaying_a_formula

Use of <chem> in the French Wiktionary: https://fr.wiktionary.org/wiki/dim%C3%A9thylpolysiloxane
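
One possible way to share the code between both tags, as a rough sketch: match either tag with a single pattern and wrap chemistry content in `\ce{...}`, which is how the help page linked above describes `<chem>` (as shorthand for a `\ce` formula from the mhchem package). The names `FORMULA_TAG`, `to_latex` and `extract_formulas` are hypothetical, and whether the LaTeX installation used for rendering actually ships mhchem is not checked here.

```python
import re
from typing import List

# \1 forces the closing tag to match the opening one.
FORMULA_TAG = re.compile(r"<(math|chem)>(.+?)</\1>", re.DOTALL)


def to_latex(match: "re.Match[str]") -> str:
    """Return the LaTeX source to render for one <math> or <chem> match."""
    tag, body = match.group(1), match.group(2)
    # Assumption: chemistry formulas are rendered through mhchem's \ce macro.
    return rf"\ce{{{body}}}" if tag == "chem" else body


def extract_formulas(text: str) -> List[str]:
    """List the LaTeX source of every formula found in *text*, in order."""
    return [to_latex(m) for m in FORMULA_TAG.finditer(text)]
```

The resulting strings could then be handed to whichever renderer produces the PNG, so only the pattern and the `\ce` wrapping differ between the two tags.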
