Word wise not working correctly for French books #141

Closed
3 tasks done
e-zz opened this issue Aug 6, 2023 · 4 comments

Comments

@e-zz

e-zz commented Aug 6, 2023

Checkboxes

  • I have read the document at xxyzz.github.io/WordDumb.
  • I have not found a similar issue or discussion on GitHub.
  • Reboot doesn't fix the problem.

Describe the bug

An error message pops up after changing the Lemma language to French in Customize Word Wise.

Possibly as a direct result (though I'm not sure), Word Wise generation does not work properly for French books.
It does run, but the annotations it adds are all wrong. When I click an explanation from WordDumb, I see an English word instead of a French one. It's weird, and I don't know what I did wrong. Maybe the language setting is simply wrong somewhere, for example set to English after the plugin processes the book?

System Information

OS: win10
Calibre: 6.24.0
python: 3.8
plugin ver: 3.29.5

Error message

calibre, version 6.24.0 (win32, embedded-python: True)
Tonnerre de Brest!: An error occurred, please copy error message then report bug at GitHub.

Starting job: Saving customized lemmas 
Job: "Saving customized lemmas" failed with error: 
Traceback (most recent call last):
  File "calibre\gui2\threaded_jobs.py", line 82, in start_work
  File "calibre_plugins.worddumb.config", line 357, in dump_lemmas_job
  File "calibre_plugins.worddumb.utils", line 56, in run_subprocess
  File "subprocess.py", line 524, in run
subprocess.CalledProcessError: Command '['py', 'C:\\Users\\ez\\AppData\\Roaming\\calibre\\plugins\\WordDumb.zip', '{"is_kindle": true, "db_path": "C:\\\\Users\\\\ez\\\\AppData\\\\Roaming\\\\calibre\\\\plugins\\\\worddumb-lemmas\\\\fr\\\\wiktionary_fr_en_v0.db", "lemma_lang": "fr", "plugin_path": "C:\\\\Users\\\\ez\\\\AppData\\\\Roaming\\\\calibre\\\\plugins\\\\WordDumb.zip", "model_name": "fr_core_news_md"}', '{"use_pos": true, "search_people": true, "model_size": "md", "zh_wiki_variant": "cn", "fandom": "", "add_locator_map": false, "preferred_formats": ["KFX", "AZW3", "AZW", "MOBI", "EPUB"], "use_all_formats": false, "minimal_x_ray_count": 1, "en_ipa": "ga_ipa", "zh_ipa": "pinyin", "choose_format_manually": true, "wiktionary_gloss_lang": "en", "kindle_gloss_lang": "en", "use_gpu": false, "cuda": "cu118", "last_opened_kindle_lemmas_language": "fr", "last_opened_wiktionary_lemmas_language": "fr", "use_wiktionary_for_kindle": false, "ca_wiktionary_difficulty_limit": 5, "da_wiktionary_difficulty_limit": 5, "de_wiktionary_difficulty_limit": 5, "el_wiktionary_difficulty_limit": 5, "en_wiktionary_difficulty_limit": 5, "es_wiktionary_difficulty_limit": 5, "fi_wiktionary_difficulty_limit": 5, "fr_wiktionary_difficulty_limit": 5, "hr_wiktionary_difficulty_limit": 5, "it_wiktionary_difficulty_limit": 5, "ja_wiktionary_difficulty_limit": 5, "ko_wiktionary_difficulty_limit": 5, "lt_wiktionary_difficulty_limit": 5, "mk_wiktionary_difficulty_limit": 5, "nl_wiktionary_difficulty_limit": 5, "no_wiktionary_difficulty_limit": 5, "pl_wiktionary_difficulty_limit": 5, "pt_wiktionary_difficulty_limit": 5, "ro_wiktionary_difficulty_limit": 5, "ru_wiktionary_difficulty_limit": 5, "sl_wiktionary_difficulty_limit": 5, "sv_wiktionary_difficulty_limit": 5, "uk_wiktionary_difficulty_limit": 5, "zh_wiktionary_difficulty_limit": 5}']' returned non-zero exit status 1.
 
Called with args: (True, WindowsPath('C:/Users/ez/AppData/Roaming/calibre/plugins/worddumb-lemmas/fr/wiktionary_fr_en_v0.db'), 'fr') {'notifications': <queue.Queue object at 0x0000015AF4C24F40>, 'abort': <threading.Event object at 0x0000015AF4C25120>, 'log': <calibre.utils.logging.GUILog object at 0x0000015AF4C251E0>} 
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\ez\AppData\Roaming\calibre\plugins\WordDumb.zip\__main__.py", line 24, in <module>
  File "C:\Users\ez\AppData\Roaming\calibre\plugins\WordDumb.zip\dump_lemmas.py", line 69, in dump_spacy_docs
KeyError: 'spacy_model'
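
Reading the traceback: the options JSON passed to the subprocess contains a `model_name` key ("fr_core_news_md") and no `spacy_model` key, which is consistent with the `KeyError: 'spacy_model'` raised in `dump_spacy_docs`. The snippet below is only a minimal sketch of that mismatch. The helper `get_model_name` is hypothetical, and this is neither the plugin's code nor the fix made in the linked commit.

```python
import json

# Options payload trimmed from the failing command in the traceback above.
options_json = '{"is_kindle": true, "lemma_lang": "fr", "model_name": "fr_core_news_md"}'
options = json.loads(options_json)


def get_model_name(opts: dict) -> str:
    """Hypothetical helper: accept either key name for the spaCy model."""
    for key in ("spacy_model", "model_name"):
        if key in opts:
            return opts[key]
    raise KeyError("no spaCy model name in options")


# A direct lookup of the missing key fails the same way the traceback does.
try:
    options["spacy_model"]
    direct_lookup_failed = False
except KeyError:
    direct_lookup_failed = True

print(direct_lookup_failed)
print(get_model_name(options))
```

The point is only that the caller and the subprocess have to agree on the key name; whichever side is changed, the other must match.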

Reproduce steps

See the picture below to reproduce it.

Screenshots or videos

image

@e-zz
Author

e-zz commented Aug 6, 2023

In short, the Lemma language seems to be stuck on English in my case.

@xxyzz
Owner

xxyzz commented Aug 6, 2023

Thanks for the report! The commit linked above should fix this error.

If the Use Wiktionary definition option is enabled, you have to set the Word Wise definition language to Chinese on your Kindle: https://xxyzz.github.io/WordDumb/usage.html#create-files

@e-zz
Author

e-zz commented Aug 6, 2023

Hi, thanks for the quick fix. I tested your solution and now word wise works smoothly.

@gloverd

gloverd commented Aug 17, 2023

I'm trying to run Word Wise on a French book using the newest artifact linked above (3.29.6), but I am not able to generate correct glosses, and it looks like it's not looking up the correct lemma either. Since this is related to this ticket (Word Wise in French), which is still open, I'm listing it here, but I can open another ticket if needed since I'm posting a lot.

SUMMARY: It appears that WordDumb only looks up words that match an English lemma, even when the lemma language is set to French. As a result, Word Wise appears only for words shared between English and French (like "transparent"), and the gloss shown is most of the time either the same word repeated or a different definition.
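
To illustrate the symptom described in the summary, here is a toy sketch of what happens when a French text is matched against an English lemma list: only the words the two languages share get annotated. The word lists are made up for illustration (using words mentioned in this comment) and are not taken from the plugin's databases, and `annotated` is a hypothetical function, not WordDumb's lookup code.

```python
# Toy data only: invented for illustration, not read from any WordDumb database.
book_words = ["salon", "transparent", "ton", "berceau", "fauteuil", "consterner"]
english_lemmas = {"salon", "transparent", "ton", "cradle", "armchair"}
french_lemmas = {"salon", "transparent", "ton", "berceau", "fauteuil", "consterner"}


def annotated(words, lemmas):
    """Return the words that would get an annotation: those found in the lemma set."""
    return [w for w in words if w in lemmas]


# With the English list, only the shared words match.
print(annotated(book_words, english_lemmas))
# With the French list, the French-only words match as well.
print(annotated(book_words, french_lemmas))
```

This matches the observed pattern (Salon, Transparent and Ton annotated; Berceau, Fauteuil and Consterner missing), but it is only a hypothesis about the cause.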

The result of running "Create Word Wise" on a book with the following settings:

  • Book language metadata: English (Set manually)
  • Lemma / Gloss: French / French
  • Wiktionary Definition: Yes
  • POS to find definition: Yes

I've tried almost all combinations, some of them multiple times. My notes are pretty bad, but basically none of them gave good gloss results for French. (Note: I have en and zh/fr listed because I also wanted to see what happened when I set the language on the Kindle to English and then back to Chinese, where I expected to see the gloss in French. Not really useful, but I left the comments there.) Between tests, I opened a different book on my Kindle and deleted the book/folder before re-running WordDumb.
image

The resulting pages look like this:
image
image

What it has as Lemma/Gloss are

  1. Salon / "Salon"
  2. Transparent / "Transparent"
  3. Ton / "Unite de mesure de poids . Le symbole : t"

Looking at the "Customize Kindle Word Wise" screen that appears when first setting Lemma/Gloss there are different Gloss results for these words:
image
image

I also picked randomly some more difficult words from those two pages that I would have expected to see defined, and they are in the customize page. So these are Lemmas that are missing (Consterner, Berceau, Incessamment, Fauteuil):
image
image
Is this because the length of the definition exceeds the 3:1 ratio?

I have deleted the English and French Kindle Word Wise lemma database files (%AppData%/calibre/plugins/worddumb-lemmas/) and done a clean install of Calibre and the plugin during my testing.
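
For anyone repeating this cleanup, a minimal Python sketch of the same step follows. It assumes the default Windows location seen in the traceback earlier in this issue (`...\AppData\Roaming\calibre\plugins\worddumb-lemmas`), so a portable or relocated Calibre config directory would need a different path. The function names are hypothetical, and Calibre should be closed before deleting anything.

```python
import os
import shutil
from pathlib import Path


def lemmas_dir(appdata=None) -> Path:
    """Build the assumed worddumb-lemmas path under %APPDATA% (already ends in Roaming)."""
    base = Path(appdata or os.environ.get("APPDATA", ""))
    return base / "calibre" / "plugins" / "worddumb-lemmas"


def remove_lemma_files(folder: Path, dry_run: bool = True) -> list:
    """List every file under the folder; delete the whole folder only if dry_run is False."""
    if not folder.is_dir():
        return []
    found = sorted(p for p in folder.rglob("*") if p.is_file())
    if not dry_run:
        shutil.rmtree(folder)
    return found
```

Calling `remove_lemma_files(lemmas_dir())` only lists what would be removed; pass `dry_run=False` to actually delete the folder. Whether the plugin then rebuilds the files cleanly is not something this sketch can confirm.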

In #136, specifically this comment you mention that

The Word Wise database file on Kindle will be overwritten. If you enable "use Wiktionary definition", new database file will be copied to Kindle if the previous database is in a different language. The Kindle's default db have to be restored manually.

I'm not sure whether all this testing has corrupted something, but I want to get as close to a clean slate as possible.

Questions:

  1. If the length of the definition is greater than the 3:1 ratio, is it supposed to fall back to a pop-up, footnote-style behavior, or just not appear?
  2. Do you know how Phobooky was able to see the Word Wise results in the Calibre E-book viewer? There are screenshots in #53 ("Problem with new test feature: Word Wise for EPUB books"), and that would make this testing so much easier!
  3. How do I restore the Kindle's default database manually? What is the best way to get back to a fresh installation on Calibre and Kindle if needed?
  4. Sometimes the "Saving Customized Lemmas" or "Generating Word Wise" step just hangs. I've left it for upwards of several hours, but I end up having to close Calibre and force-stop the Python process running in the background. Does stopping Calibre/Python corrupt anything? I wonder whether a freeze in "Saving Customized Lemmas" creates a corrupted file that then gets reused later on. That is why I'm looking for the best way to do a clean reinstall.

@xxyzz xxyzz closed this as completed Aug 26, 2023