[FR] Retrieve all italic templates #200

Closed
BoboTiG opened this issue Nov 8, 2020 · 10 comments · Fixed by #206
Comments

@BoboTiG
Owner

BoboTiG commented Nov 8, 2020

Wikicode:

{{instruments à cordes|fr}}

Output:

<i>(Instruments à cordes)</i>

Expected:

<i>(Musique)</i>
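
A minimal sketch of why the output differs from the expected one, assuming (the thread does not confirm this) that an unknown template falls back to its capitalized name. `DOMAIN_TEMPLATES` and `render_italic` are hypothetical names used only for illustration, not the project's actual code.

```python
from __future__ import annotations

# Hypothetical mapping from template name to the label to display.
DOMAIN_TEMPLATES = {
    "instruments à cordes": "Musique",
}


def render_italic(wikicode: str, known: dict[str, str]) -> str:
    """Render a {{name|lang}} template as an italic label in parentheses."""
    # Strip the surrounding braces and keep only the template name.
    name = wikicode.strip("{}").split("|")[0]
    # Assumed fallback: capitalize the template name when it is unknown.
    label = known.get(name, name.capitalize())
    return f"<i>({label})</i>"


# Without a dedicated entry, the template name itself is displayed.
without_entry = render_italic("{{instruments à cordes|fr}}", {})
# With the entry, the domain label is displayed instead.
with_entry = render_italic("{{instruments à cordes|fr}}", DOMAIN_TEMPLATES)
```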
@lasconic
Collaborator

lasconic commented Nov 8, 2020

It seems all these templates (instruments à cordes, divinités...) could be imported from https://fr.wiktionary.org/wiki/Cat%C3%A9gorie:Mod%C3%A8les_de_domaine_d%E2%80%99utilisation, no?

@BoboTiG
Owner Author

BoboTiG commented Nov 8, 2020

You are right. I was too lazy to do that at the time. Maybe a simple Python scraper would work? The thing is that (it seems to me) we need to click on each and every word to get the real "them" to use. This could be done once, and we would keep the code in the Wiki for later needs. WDYT?
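
A rough sketch of the listing step only, using the standard library. It assumes the category page links to each template with a `title` attribute starting with `Modèle:`; that markup is a guess and should be checked against the real page. `TemplateLinkParser` and `extract_template_names` are hypothetical names, and fetching the page and visiting each template to get its label are left out.

```python
from __future__ import annotations

from html.parser import HTMLParser


class TemplateLinkParser(HTMLParser):
    """Collect template names from <a title="Modèle:..."> links."""

    PREFIX = "Modèle:"  # assumed title prefix for template pages

    def __init__(self) -> None:
        super().__init__()
        self.names: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        title = dict(attrs).get("title") or ""
        if title.startswith(self.PREFIX):
            self.names.append(title[len(self.PREFIX):])


def extract_template_names(html: str) -> list[str]:
    """Return the sorted, de-duplicated template names found in the HTML."""
    parser = TemplateLinkParser()
    parser.feed(html)
    return sorted(set(parser.names))


# Made-up sample markup, only there to exercise the parser.
sample = (
    '<ul><li><a href="/wiki/x" title="Modèle:instruments à cordes">x</a></li>'
    '<li><a href="/wiki/y" title="Modèle:divinités">y</a></li>'
    '<li><a href="/wiki/z" title="Aide">z</a></li></ul>'
)
names = extract_template_names(sample)
```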

@lasconic
Collaborator

lasconic commented Nov 8, 2020

WDYT?

:)

See https://gist.github.com/lasconic/391f19dffac8605b6f5fa4c7a3c40d96 for the scraper code and the result... What do you want to do with it?

@BoboTiG
Owner Author

BoboTiG commented Nov 8, 2020

💪

As there is scraping, I would love to see that object moved to its own file, as was done with langs.py (with the same kind of docstring; I will provide a Wiki link once the script is finalized):

templates_italic = {

As done with langs.py, adding the number of entries in the dict would be cool.

Also, there is no need for the term: just generate a list in the same format as the existing one, and keep it sorted.
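
A possible shape for the generation step, as a sketch: it writes the scraped mapping as a Python module, sorted by key, with the number of entries in a trailing comment. The docstring wording, the placement of the count, and the name `render_module` are assumptions, not what langs.py actually does.

```python
from __future__ import annotations


def render_module(templates: dict[str, str], var_name: str = "templates_italic") -> str:
    """Return Python source defining a sorted dict plus its entry count."""
    lines = [
        '"""',
        "List of italic templates (auto-generated, do not edit by hand).",
        '"""',
        "",
        f"{var_name} = {{",
    ]
    # Sorted keys keep the generated file stable between runs.
    for name in sorted(templates):
        lines.append(f"    {name!r}: {templates[name]!r},")
    lines.append(f"}}  # {len(templates):,} entries")
    return "\n".join(lines) + "\n"


# Illustrative data only; the second entry is made up.
scraped = {"zzz-example": "Exemple Z", "instruments à cordes": "Musique"}
source = render_module(scraped)
```

Using `repr()` for keys and values keeps the output valid Python even when a label contains quotes.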

BoboTiG changed the title from "[FR] Handle the 'instruments à cordes' template" to "[FR] Retrieve all italic templates" Nov 8, 2020
@BoboTiG
Owner Author

BoboTiG commented Nov 8, 2020

The original dict contains more than just the entries from the new list. So you should move the result of the scraping into its own file and do something like this in fr/__init__.py:

templates_italic = {
    **basic_italic_templates,  # or whatever fitting your mind
    "abréviation": "Abréviation",
    # ...
}
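
One property of this pattern worth keeping in mind: entries written after the `**` unpacking override scraped entries with the same key, so manual corrections always win. A small self-contained illustration, with made-up data standing in for the scraped dict:

```python
# Stand-in for the scraped dict; the "abréviation" value is deliberately wrong.
basic_italic_templates = {
    "instruments à cordes": "Musique",
    "abréviation": "scraped value",
}

templates_italic = {
    **basic_italic_templates,
    # Listed after the unpacking, so this value replaces the scraped one.
    "abréviation": "Abréviation",
}
```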

@lasconic
Collaborator

lasconic commented Nov 8, 2020

Unfortunately, this last bit doesn't always work. templates_italic is handled here: https://github.com/BoboTiG/ebook-reader-dict/blob/master/scripts/utils.py#L366 and in the case of divinités (see issue #197) it will not work, since len(parts) > 1.
Any other suggestion?

@lasconic
Collaborator

lasconic commented Nov 8, 2020

For the moment, I made a special case for "divinités". See PR #206

@BoboTiG
Owner Author

BoboTiG commented Nov 8, 2020

The way you handled "divinités" is the right one ;)

I will review your PR tomorrow 👍

@lasconic
Collaborator

lasconic commented Nov 9, 2020

@BoboTiG
Owner Author

BoboTiG commented Nov 9, 2020

lasconic referenced this issue in lasconic/ebook-reader-dict Nov 9, 2020