Support <hiero> mediawiki extension #703
Comments
Should we handle it instead? It seems to be pictures. |
What do you mean? Convert the pictures to GIF and embed them like we do for math? |
I did not look at the PHP file that handles the template, but I guess it is "only" a bunch of files referenced by a key (here "R11"). If that is the case, we could handle it and use inline GIFs as we do for math and chem, yes. |
The pictures are there. WDYT of displaying a GIF for the template? |
It seems more like several GIFs for "Ptah". I do not know if it is worth handling the template. Let me know your thoughts :) |
It's a bit more complicated than just one GIF indeed. The extension outputs an HTML table and is able to put symbols on top of each other.
63 in English, out of 677,008 words
Could be worth it, especially if most of them are sequential and "simple"... |
For French, here are the codes.
Some are simple like R11, but most of them contain * or :, which is less simple and would require a table or some CSS... |
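To illustrate why `*` and `:` make things harder, here is a minimal sketch of splitting a hiero code into its layout. It assumes the usual WikiHiero conventions (`-` or whitespace separates blocks, `:` stacks signs vertically, `*` places them side by side, with `*` binding tighter than `:`). The `parse_hiero` name is hypothetical, and cartouches, parentheses, and the other syntax handled by the real tokenizer are ignored.

```python
import re


def parse_hiero(code: str) -> list[list[list[str]]]:
    """Split a hiero code into blocks -> stacked rows -> side-by-side signs.

    Hypothetical helper, not part of WikiHiero: only "-", ":" and "*" are handled.
    """
    blocks = []
    # "-" (or whitespace) separates consecutive blocks.
    for block in re.split(r"[-\s]+", code.strip()):
        if not block:
            continue
        # ":" stacks rows on top of each other; "*" juxtaposes signs within a row.
        rows = [row.split("*") for row in block.split(":")]
        blocks.append(rows)
    return blocks


print(parse_hiero("R11"))    # a single sign: one block, one row, one sign
print(parse_hiero("p:t-H"))  # "p" over "t", followed by "H"
```

A simple code such as R11 yields a single sign, while anything with more than one row per block is what would need a table (or some CSS) to render.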
Convert the PNGs to GIF and store them base64-encoded in a map. The resulting file is 655 KB.

import os
from PIL import Image
from io import BytesIO
from base64 import b64encode

files = os.listdir(".")
results = {}
for f in files:
    if f.endswith(".png"):
        # File names are assumed to look like "<prefix>_<code>.png"; keep only the code.
        code = f.split("_", 1)[1].split(".")[0]
        png = Image.open(f)
        im = BytesIO()
        # Convert to greyscale and save as an optimized GIF, in memory.
        png.convert("L").save(im, format="gif", optimize=True)
        im.seek(0)
        raw = im.read()
        results[code] = f'<img src="data:image/gif;base64,{b64encode(raw).decode()}"/>'

# Print the map as a Python dict literal, with the entry count as a trailing comment.
print("hiero = {")
for t, r in sorted(results.items()):
    print(f'    "{t}": \'{r}\',')
print(f"}}  # {len(results):,}")
|
In short, we probably need to reproduce the whole set of PHP scripts to get decent support, in particular the tokenizer: https://github.com/wikimedia/mediawiki-extensions-wikihiero/blob/366b1226891e609650b4c7f7d925b718c779517c/includes/HieroTokenizer.php Also, some hiero codes use phonemes rather than the code used in the PNG file name, so we need a copy of https://github.com/wikimedia/mediawiki-extensions-wikihiero/blob/366b1226891e609650b4c7f7d925b718c779517c/includes/WikiHiero.php#L259 It will be hard to unit test the output, since it's only img tags with base64 and a bunch of HTML... A bit too much for a Sunday :) |
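A rough sketch of the phoneme lookup, and of one way to keep the output testable: pass the image map in as a parameter so tests can use short placeholders instead of base64 blobs. The `PHONEMES` subset, `resolve`, and `render_block` are hypothetical names for illustration; the four mappings shown are standard Gardiner uniliterals, but the full table should be copied from WikiHiero.php rather than from here.

```python
from html import escape

# Illustrative subset only (assumption): the complete table lives in WikiHiero.php.
PHONEMES = {"p": "Q3", "t": "X1", "H": "V28", "n": "N35"}


def resolve(token: str, images: dict[str, str]) -> str:
    """Return the <img> tag for a token, translating phonemes to sign codes first."""
    code = PHONEMES.get(token, token)
    # Unknown codes fall back to the escaped raw text instead of failing.
    return images.get(code, escape(token))


def render_block(rows: list[list[str]], images: dict[str, str]) -> str:
    """Render one block as a small HTML table, one table row per stacked row."""
    trs = "".join(
        "<tr><td>" + "".join(resolve(tok, images) for tok in row) + "</td></tr>"
        for row in rows
    )
    return f"<table>{trs}</table>"


# Placeholder images make the HTML structure easy to assert on.
fake = {"Q3": "[Q3]", "X1": "[X1]", "V28": "[V28]"}
print(render_block([["p"], ["t"]], fake))
```

With the real map swapped in for `fake`, the same functions would emit the base64 `<img>` tags, while the unit tests only need to check the table structure.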
Clearly too much, yes :) Thanks for the analysis and pre-work ;) |
Nice one! |
I was wondering what you think about your patch. Is it worth giving it a try on my side? |
It's kind of linked to the HTML table one, #1024, since table support is needed. So I would tackle HTML tables first to get some info on how well they work on Kobo before tackling this one. |
Attached is a dictionary containing the French words with hiero from #703 (comment) |
Very clean! I think the cell width should be adapted to the width of the picture it contains.
But we can live with it as-is 👍 |
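One possible way to adapt the cell width, sketched under the assumption that the GIF bytes are still available when the HTML is generated: the width sits in the GIF header (two little-endian bytes at offset 6), so it can be read without PIL. `gif_size` and `sized_cell` are hypothetical helpers, and whether the Kobo renderer honours an inline `width` on a `td` is untested.

```python
import struct


def gif_size(raw: bytes) -> tuple[int, int]:
    """Read (width, height) from the logical screen descriptor of a GIF."""
    if raw[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF")
    # Width and height are unsigned 16-bit little-endian values at offsets 6 and 8.
    width, height = struct.unpack("<HH", raw[6:10])
    return width, height


def sized_cell(img_tag: str, raw: bytes) -> str:
    """Wrap an <img> tag in a table cell whose width matches the picture."""
    width, _ = gif_size(raw)
    return f'<td style="width: {width}px">{img_tag}</td>'


# Minimal fake header (38x12) standing in for a real converted glyph.
header = b"GIF89a" + struct.pack("<HH", 38, 12) + b"\x00\x00\x00"
print(sized_cell("<img/>", header))
```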
https://fr.wiktionary.org/wiki/Sekhmet is not displayed very well either. |
Yes, I feel like I'm pushing the limits of the HTML renderer on the Kobo... Here is Sekhmet in Chrome (rendered bigger to be the right size on the Kobo...). Somehow the styling in the Kobo browser is not the same (do we know which renderer it is? Probably WebKit, but which version?). Maybe it's not the browser but a default CSS applied to tables... Any idea whether we can see this CSS somewhere? And Ramsès |
I could get as far as https://github.com/kobolabs/qt-everywhere-opensource-src-4.6.2/blob/master/src/3rdparty/webkit/VERSION to find the WebKit version, but the hash is not helpful. I am not fully sure about this information, but I got the 4.6.2 version of Qt Embedded from the latest Kobo firmware (https://kbdownload1-a.akamaihd.net/firmwares/kobo7/Feb2021/kobo-update-4.26.16704.zip), so it should be right. |
OK, so if they use WebKit for dictionary rendering, it's the one included in Qt 4.6.2. I investigated the style... I believe I found the problem for Ramsès, but not yet for Sekhmet. New French dictionary: |
About the default CSS (I cannot say whether it is used in the dictionary area, though):

* { padding: 0; margin: 0; }
body { font: %1px %2; }
table, thead, tbody, tr, td, th { font-size: inherit; font-family: inherit; }

(still looking for more data) |
Interesting page for testing: https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:WikiHiero/Exemples |
The new version is way better 💪 |
https://fr.wiktionary.org/wiki/Aton needs more space in column 2. Maybe it is a vertical alignment issue like for Sekhmet. |
Wikicode:
Output:
Expected:
Model link, if any: https://www.mediawiki.org/wiki/Extension:WikiHiero
https://www.mediawiki.org/wiki/Special:MyLanguage/Extension:WikiHiero/Syntax
https://github.com/wikimedia/mediawiki-extensions-wikihiero/blob/366b1226891e609650b4c7f7d925b718c779517c/includes/WikiHiero.php