Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] Support 'R:TLFi' template #679

Closed
lasconic opened this issue Feb 5, 2021 · 12 comments · Fixed by #688 or #696
Closed

[FR] Support 'R:TLFi' template #679

lasconic opened this issue Feb 5, 2021 · 12 comments · Fixed by #688 or #696

Comments

@lasconic
Copy link
Collaborator

lasconic commented Feb 5, 2021

Wikicode:

{{R:TLFi}}

Output:


Expected:

« pedzouille », dans <i>TLFi, Le Trésor de la langue française informatisé</i>, 1971–1994 → consulter cet ouvrage

Model link, if any: https://fr.wiktionary.org/wiki/Mod%C3%A8le:R:TLFi

Not sure if we want to keep "→ consulter cet ouvrage"

@BoboTiG WDYT ?

@BoboTiG
Copy link
Owner

BoboTiG commented Feb 5, 2021

Not sure if we want to keep "→ consulter cet ouvrage"

No, we should not keep it.

lasconic added a commit to lasconic/ebook-reader-dict that referenced this issue Feb 7, 2021
lasconic added a commit to lasconic/ebook-reader-dict that referenced this issue Feb 8, 2021
lasconic added a commit to lasconic/ebook-reader-dict that referenced this issue Feb 8, 2021
@lasconic lasconic reopened this Feb 8, 2021
@lasconic
Copy link
Collaborator Author

lasconic commented Feb 8, 2021

Unfortunately, I have to reopen this issue...

{{R:TLFi}} is different from {{R|TLFi}}. We only want to render the first one and not the second one. Currently we cannot make the difference because {{R:TLFi}} is converted to {{R|TLFi}} early in the process. I'm not sure if the PR is doing more good than harm... so I would propose to revert it for now...

See
https://fr.wiktionary.org/w/index.php?title=m%C3%A9tis
https://fr.wiktionary.org/w/index.php?title=pedzouille

@BoboTiG
Copy link
Owner

BoboTiG commented Feb 9, 2021

An idea: when cleaning the HTML, we could convert {{R:XXX}} to {{XXX}}. Then, it would be as simple as defining a template handler for XXX. WDYT?

lasconic added a commit to lasconic/ebook-reader-dict that referenced this issue Feb 9, 2021
@lasconic
Copy link
Collaborator Author

lasconic commented Feb 9, 2021

Another one at : https://fr.wiktionary.org/w/index.php?title=barricade

It's a good idea, but if we make a template handler for XXX, we could just let it alone and do a template handler for R:XXX directly ?

@BoboTiG
Copy link
Owner

BoboTiG commented Feb 9, 2021

That makes more sense yes :)
I dit not find where we do the {{R:XXX}} -> {{R|XXX}} conversion. But if we can just handle {{R:XXX}} then it is a clean solution IMO.

@lasconic
Copy link
Collaborator Author

lasconic commented Feb 9, 2021

Very early here:

# {{formatnum:-1000000}}

@BoboTiG
Copy link
Owner

BoboTiG commented Feb 9, 2021

Another approach could be to convert {{formatnum:-1000000}} to {{formatnum|-1000000}} in cleanup() instead. We could use a predefined list of patterns we want to convert. That would make the process more robust, WDYT?

@BoboTiG
Copy link
Owner

BoboTiG commented Feb 9, 2021

Very quickly, without the patterns list:

diff --git a/wikidict/utils.py b/wikidict/utils.py
index 213099b..89e6177 100644
--- a/wikidict/utils.py
+++ b/wikidict/utils.py
@@ -368,6 +368,9 @@ def clean(text: str) -> str:
     text = sub(r"<<([^/>]+)>>", "\\1", text)
     text = sub(r"<<(?:[^/>]+)/([^>]+)>>", "\\1", text)
 
+    # {{formatnum:-1000000}} -> {{formatnum|-1000000}}
+    text = sub(r"{{(formatnum):([^}]+)}}", "{{\\1|\\2}}", text)
+
     return text.strip()
 
 
@@ -504,12 +507,6 @@ def transform(word: str, template: str, locale: str) -> str:
     parts = [p.strip("\u200e") for p in parts]  # Left-to-right mark
     tpl = parts[0]
 
-    # {{formatnum:-1000000}}
-    if ":" in tpl:
-        parts_raw = template.split(":")
-        parts = [p.strip() for p in parts_raw]
-        tpl = parts[0]
-
     # Stop early
     if not tpl or tpl in templates_ignored[locale]:
         return ""

lasconic added a commit to lasconic/ebook-reader-dict that referenced this issue Feb 9, 2021
@lasconic
Copy link
Collaborator Author

lasconic commented Feb 9, 2021

I don't get why it's better to do it in cleanup but having a way to choose which one we want to convert is definitely a win. We could first try everything but R:TLFi

See PR #696

@BoboTiG
Copy link
Owner

BoboTiG commented Feb 9, 2021

The later would ease adding patterns to ignore for each locale. The former is working too, and maybe could we move to another solution later.

@lasconic
Copy link
Collaborator Author

lasconic commented Feb 9, 2021

I'm not sure if we rely on this code for formatnum only ? Do you know ?

lasconic added a commit to lasconic/ebook-reader-dict that referenced this issue Feb 9, 2021
@BoboTiG
Copy link
Owner

BoboTiG commented Feb 9, 2021

I am not sure. Let's do it and see later if we need to adapt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants