Rendering errors (<chem> and <math>) #1182

Closed
Moonbase59 opened this issue Jan 31, 2022 · 19 comments
Labels
bug Something isn't working

Comments

@Moonbase59
Contributor

Moonbase59 commented Jan 31, 2022

Note from @BoboTiG: issue tightly coupled to #1183, interesting details can be found there too.


I did a fresh download and render of the EN wiktionary today, and got the following errors:

>>> Loading data/en/data_wikicode-20220120.json ...
>>> Loaded 1,038,672 words from data/en/data_wikicode-20220120.json
<chem> ERROR with ^-N=\overset{+}N=N^- in [azide]
<math> ERROR with \begin{align}\frac{\pi}{2} & = \prod_{n=1}^{\infty} \frac{ 4n^2 }{ 4n^2 - 1 } = \prod_{n=1}^{\infty} \left(\frac{2n}{2n-1} \cdot \frac{2n}{2n+1}\right) \\[6pt]& = \Big(\frac{2}{1} \cdot \frac{2}{3}\Big) \cdot \Big(\frac{4}{3} \cdot \frac{4}{5}\Big) \cdot \Big(\frac{6}{5} \cdot \frac{6}{7}\Big) \cdot \Big(\frac{8}{7} \cdot \frac{8}{9}\Big) \cdot \; \cdots \\\end{align} in [Wallis product]
<math> ERROR with \begin{align}a_0 &+ a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n \\ &= a_0 + x \bigg(a_1 + x \Big(a_2 + x \big(a_3 + \cdots + x(a_{n-1} + x \, a_n) \cdots \big) \Big) \bigg).\end{align} in [Horner's rule]
<math> ERROR with \frac = \frac in [circle of Apollonius]
<math> ERROR with \begin{align}\rho(g, h) (0,x_1,\ldots,x_k) &= g(x_1,\ldots,x_k) \\\rho(g, h) (y+1,x_1,\ldots,x_k) &= h(y,\rho(g, h) (y,x_1,\ldots,x_k),x_1,\ldots,x_k)\,\end{align} in [primitive recursion]
>>> Saved 697,169 words into data/en/data-20220120.json
>>> Render done!
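
To triage failures like the ones in this log, the error lines can be collected with a small script. This is only a sketch: it assumes the `<tag> ERROR with <expression> in [<word>]` line shape seen above, and `parse_render_errors` is a hypothetical helper, not part of the project.

```python
import re

# Matches lines such as: <math> ERROR with \frac = \frac in [circle of Apollonius]
ERROR_LINE = re.compile(r"^<(chem|math)> ERROR with (.*) in \[([^\]]+)\]$")


def parse_render_errors(log_text):
    """Return (tag, expression, word) tuples for every error line in the log."""
    failures = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            failures.append(match.groups())
    return failures


sample_log = r"""
>>> Loaded 1,038,672 words from data/en/data_wikicode-20220120.json
<chem> ERROR with ^-N=\overset{+}N=N^- in [azide]
<math> ERROR with \frac = \frac in [circle of Apollonius]
>>> Render done!
"""

for tag, expr, word in parse_render_errors(sample_log):
    print(f"{word}: <{tag}> {expr}")
```

Grouping by word makes it easier to see which entries lost content before rendering, for example `\frac = \frac`, where the fraction arguments are already missing.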
@BoboTiG
Owner

BoboTiG commented Feb 1, 2022

We are aware of such issues. Most math and chem expressions can be converted to GIF, but some do not pass our LaTeX parser.

Any help is welcome. I bet Wikimedia uses specific modules for that.

You can find more info on #1096.

BoboTiG changed the title from "[EN] Rendering errors (<chem> and <math>)" to "Rendering errors (<chem> and <math>)" on Feb 1, 2022
BoboTiG added the "bug" (Something isn't working) label on Feb 1, 2022
@Moonbase59
Contributor Author

Moonbase59 commented Feb 1, 2022

Oh well, I was expecting such problems. Formulae are always a problem. Unfortunately, readers don’t usually support MathJax or MathML (although that should work in EPUB3).

So transformation is always a big issue, especially since devices differ so much in display ppi, which often makes small images absolutely illegible. Are you actually using (La)TeX to produce the GIFs?

Their quality is not great. I wonder if we could eventually switch to 8-bit transparent PNGs instead, to get slightly better output. Does anyone know how well that is supported on readers?

(I just tried a few, stripping all metadata and saving as 8-bit grayscale+alpha PNG; they aren’t that much bigger. Example: 194 bytes → 209 bytes.)
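
As a rough illustration of what removing metadata from a PNG involves, the sketch below drops every chunk that is not needed to decode the image. The tool actually used for the test above is not stated, so this is an assumption-laden, standard-library-only example; `make_chunk` and `strip_png_metadata` are hypothetical names.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Chunks needed to decode the image; tRNS carries palette/gray transparency.
KEEP = {b"IHDR", b"PLTE", b"tRNS", b"IDAT", b"IEND"}


def make_chunk(kind, data):
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data)
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def strip_png_metadata(png):
    """Drop every chunk whose type is not listed in KEEP."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    out, pos = [PNG_SIGNATURE], len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        end = pos + 12 + length  # 4 length + 4 type + data + 4 CRC
        if kind in KEEP:
            out.append(png[pos:end])
        pos = end
    return b"".join(out)


# Tiny 1x1 image, 8-bit grayscale + alpha (colour type 4), with a tEXt chunk.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 4, 0, 0, 0)
pixels = zlib.compress(b"\x00\x00\xff")  # filter byte, gray=0, alpha=255
original = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", ihdr)
    + make_chunk(b"tEXt", b"Software\x00example")
    + make_chunk(b"IDAT", pixels)
    + make_chunk(b"IEND", b"")
)
stripped = strip_png_metadata(original)
print(len(original), "->", len(stripped))
```

Note that this also discards colour-related chunks such as gAMA, which is usually harmless for black-on-transparent formulae but may matter for other images.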

@BoboTiG
Owner

BoboTiG commented Feb 1, 2022

Actually, I think Kobo only supports GIFs. I need to check again though.

@lasconic
Collaborator

lasconic commented Feb 1, 2022

dictgen mentions GIF and JPG: https://pgaskin.net/dictutil/dictgen/
In theory, Kobo should support PNG (https://help.kobo.com/hc/fr/articles/360017763713-Formats-de-fichiers-pris-en-charge-par-votre-application-Kobo-eReader-et-Kobo-Books), but I am not sure whether that support extends to dictionaries.

@lasconic
Collaborator

lasconic commented Feb 1, 2022

"azide" is new, but the other math expression errors are known: #1096

One way to debug: add "-d -1" and print the exception:

except Exception as e:
    print(e)
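
A slightly fuller sketch of the same idea: wrap each render call in `try`/`except` and print the underlying exception next to the failing expression. `render_all` and `fake_render` are hypothetical stand-ins, since the project's real render function is not shown in this thread.

```python
def render_all(formulas, render):
    """Render each (word, expression) pair and report failures with their cause."""
    rendered, failed = {}, []
    for word, expr in formulas:
        try:
            rendered[word] = render(expr)
        except Exception as e:
            # Print the underlying exception, not just the expression that failed.
            print(f"<math> ERROR with {expr} in [{word}]: {type(e).__name__}: {e}")
            failed.append(word)
    return rendered, failed


def fake_render(expr):
    """Hypothetical stand-in for the real LaTeX-to-image call."""
    if expr.count("{") != expr.count("}"):
        raise ValueError("unbalanced braces")
    return f"<img alt='{expr}'/>"


rendered, failed = render_all(
    [("pi", r"\frac{\pi}{2}"), ("broken", r"\frac{1}{2")],
    fake_render,
)
```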

@lasconic
Collaborator

lasconic commented Feb 1, 2022

I just tested, and PNGs work. I am not sure exactly what sort of PNG it is... I just changed

Image.open(buf).convert("L").save(im, format="gif", optimize=True)

and
return f'<img style="{IMG_CSS}" src="data:image/gif;base64,{b64encode(raw).decode()}"/>'

replacing "gif" with "png" in both.
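
The format switch can be made explicit by building the data URI from a format name. This is a sketch, not the project's code: `image_tag` and `MIME_TYPES` are hypothetical, and the `IMG_CSS` value here is a placeholder because the real one is not shown in this thread.

```python
from base64 import b64encode

IMG_CSS = "height: 100%"  # placeholder value; the project's real CSS is not shown here

MIME_TYPES = {"gif": "image/gif", "png": "image/png", "svg": "image/svg+xml"}


def image_tag(raw, fmt):
    """Embed raw image bytes in an <img> tag as a base64 data URI."""
    mime = MIME_TYPES[fmt]
    return f'<img style="{IMG_CSS}" src="data:{mime};base64,{b64encode(raw).decode()}"/>'


tag = image_tag(b"\x89PNG\r\n\x1a\n", "png")
print(tag)
```

Keeping the MIME type next to the format name avoids having to change "gif" in two separate places.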

Then I created an EN dict containing only the word "graph":

mkdir test_wik
python -m wikidict en --gen-dict=graph --output=test_wik

Resulting dictionary in Kobo format: dicthtml-en-en.zip (3,939 bytes)

With GIF: dicthtml-en-en.zip (4,047 bytes)

Tested on a Kobo Aura with the latest firmware, 4.31.19086.

@Moonbase59
Contributor Author

Moonbase59 commented Feb 1, 2022

Sounds great, thanks for testing. I must get a Kobo soon… What imaging lib does it use? Maybe we can find out more (like how to specify 8-bit greyscale, alpha, no metadata) to keep the images small.

I wonder if we could even make it use SVG. That would be best, since it is scalable. Some experimenting to do here, I guess.

@lasconic
Collaborator

lasconic commented Feb 1, 2022

I was also curious whether SVG could be used on Kobo. And it can!
dicthtml-en-en.zip (11,819 bytes)

    dvioptions = ["-d -1"]
    with BytesIO() as buf, BytesIO() as im:
        preview(
            f"${expr}$",
            output="svg",
            viewer="BytesIO",
            outputbuffer=buf,
            dvioptions=dvioptions,
            packages=tuple(packages),
        )

        buf.seek(0)
        raw = buf.read()

    return f'<img style="{IMG_CSS}" src="data:image/svg+xml;base64,{b64encode(raw).decode()}"/>'

PNG: [screenshot screen_001]

GIF: [screenshot screen_003]

SVG: [screenshot screen_005]

@BoboTiG
Owner

BoboTiG commented Feb 1, 2022

Ooooohhhh I am in love with SVG! Why didn't we try this sooner? :D

@BoboTiG
Owner

BoboTiG commented Feb 1, 2022

If we go the SVG way, we also need to check what is the output when PyGlossary handles the word, and how it finally looks (cc @Moonbase59). Could you share the StarDict file, @lasconic?

@BoboTiG
Owner

BoboTiG commented Feb 1, 2022

Looking again at examples, GIF & PNG seem so archaic now :o

@lasconic
Collaborator

lasconic commented Feb 1, 2022

Unfortunately, PyGlossary is not happy:

Traceback (most recent call last):
  File "ebook-reader-dict/isoEnv/lib/python3.9/site-packages/pyglossary/glossary.py", line 905, in _read
    reader.open(filename)
  File "ebook-reader-dict/isoEnv/lib/python3.9/site-packages/pyglossary/plugins/ebook_kobo_dictfile.py", line 71, in open
    TextGlossaryReader.open(self, filename)
  File "ebook-reader-dict/isoEnv/lib/python3.9/site-packages/pyglossary/text_reader.py", line 84, in open
    self._open(filename)
  File "ebook-reader-dict/isoEnv/lib/python3.9/site-packages/pyglossary/text_reader.py", line 80, in _open
    self.loadInfo()
  File "ebook-reader-dict/isoEnv/lib/python3.9/site-packages/pyglossary/text_reader.py", line 131, in loadInfo
    self._pendingEntries.append(self.newEntry(word, defi))
  File "ebook-reader-dict/isoEnv/lib/python3.9/site-packages/pyglossary/text_reader.py", line 113, in newEntry
    return self._glos.newEntry(
  File "ebook-reader-dict/isoEnv/lib/python3.9/site-packages/pyglossary/glossary.py", line 742, in newEntry
    return Entry(
  File "ebook-reader-dict/isoEnv/lib/python3.9/site-packages/pyglossary/entry.py", line 285, in __init__
    raise TypeError(f"invalid defi type {type(defi)}")
TypeError: invalid defi type <class 'tuple'>
Reading file 'test_wik/dict-en-en.df' failed.

@BoboTiG
Owner

BoboTiG commented Feb 1, 2022

Actually, the error is present on the main branch too, meaning it is not SVG-related.
(Let's add a simple test to cover the use case ;)

@Moonbase59
Contributor Author

Moonbase59 commented Feb 1, 2022

WOW! Thanks so much for trying, and for the screenshot comparisons. Maybe we should generate something and post it to MobileRead and/or the E-Reader Forum, to get some other actual users to try it.

The SVG looks so much better, and hopefully on any device…

@Moonbase59
Contributor Author

Moonbase59 commented Feb 1, 2022

Quoting @BoboTiG: "check what is the output when PyGlossary handles the word"

We probably need to talk to @ilius about supporting the writing of lots of small SVGs instead ;-) Plus, of course, it must not destroy anything that might be in a real dictionary, such as JPG/PNG images (Cambridge has lots of JPGs, and the German Duden even has PDFs). After all, in the far future we might wish to include images from Wiktionary…

@ilius
Contributor

ilius commented Feb 1, 2022

Converting a bitmap format (like PNG or GIF) to SVG (scalable vector graphics) is not on the cards, really.
I'm not sure how to explain why.

@Moonbase59
Contributor Author

Moonbase59 commented Feb 1, 2022

That depends: where do you convert from? The SVG would already be in the dict, base64-encoded. Could it not just be taken and written out? See #1182 (comment)

Of course, trying to convert a raster image to SVG makes no sense.
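
As a sketch of "taken and written out": the base64 payloads can be pulled out of a definition's HTML and decoded back into SVG documents. This makes no claim about how PyGlossary works internally; `extract_svgs` is a hypothetical helper, and the sample HTML is made up for illustration.

```python
import re
from base64 import b64decode, b64encode

# Matches src="data:image/svg+xml;base64,...." inside a definition's HTML.
SVG_DATA_URI = re.compile(r'src="data:image/svg\+xml;base64,([A-Za-z0-9+/=]+)"')


def extract_svgs(definition_html):
    """Decode every embedded base64 SVG and return the documents as text."""
    payloads = SVG_DATA_URI.findall(definition_html)
    return [b64decode(payload).decode("utf-8") for payload in payloads]


svg = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"/>'
encoded = b64encode(svg.encode()).decode()
html = f'<p>graph <img src="data:image/svg+xml;base64,{encoded}"/></p>'

for index, document in enumerate(extract_svgs(html)):
    print(index, document)  # each one could be written to its own .svg file
```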

@lasconic
Collaborator

lasconic commented Feb 1, 2022

@BoboTiG
Owner

BoboTiG commented Feb 1, 2022

Let's move the conversation to #1183. It is starting to be hard to follow :)
