This repository has been archived by the owner on Feb 25, 2023. It is now read-only.

Various 故事ことわざの辞典 bugs #27

Open
Thermospore opened this issue Mar 8, 2021 · 2 comments

Comments

@Thermospore

Hello, here are some things I've noticed with the 故事ことわざの辞典 importer:

  1. There is a trailing \n\n\n\n at the end of each definition, which should be removed.
  2. I found this entry: 愛して〔も〕その悪を知り、憎みて〔も〕その善を知る. It looks like there isn't any handling for the 〔〕 brackets. I'm guessing they mark text that can be either included or omitted. I didn't check whether other entries have this issue.
    image
  3. Something is broken at the start of 愛縁奇縁
    image
    image
    image
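
For item 1, here is a minimal sketch of trimming the trailing newlines in Go. It assumes the definition is available as a plain string before it is written out; trimDefinition is a hypothetical helper name, not a function that exists in the importer.

```go
package main

import (
	"fmt"
	"strings"
)

// trimDefinition is a hypothetical helper (not part of yomichan-import).
// It removes any run of trailing newlines and carriage returns from a
// definition while leaving newlines inside the definition untouched.
func trimDefinition(def string) string {
	return strings.TrimRight(def, "\r\n")
}

func main() {
	fmt.Printf("%q\n", trimDefinition("line one\nline two\n\n\n\n"))
	// prints "line one\nline two"
}
```
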

Thanks -Tyler

@Thermospore
Author

@FooSoft

I think number 3 is dictionary-independent. I'm getting it with 広辞苑第六版 • 付属資料 too.

Current yomichan import on the left, old yomichan import on the right:
image

@FooSoft
Owner

FooSoft commented Mar 13, 2021

Might need to create a new regex to strip away the stuff in the 〔〕 brackets. Item 3 is weird. I imagine there may be a different failure mode during text decoding. As you mentioned, yomichan-import has switched to zero-epwing-go for parsing; we no longer have to go cross-process and everything is a lot simpler in general, so that is the likely cause here.
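
A rough sketch of what such a regex could look like in Go, assuming (as guessed above) that 〔〕 marks optional text and that the brackets are never nested. The names optionalPart and expandOptional are hypothetical and not taken from yomichan-import. Rather than only stripping the bracketed text, it returns both readings so either one could be used as a headword.

```go
package main

import (
	"fmt"
	"regexp"
)

// optionalPart matches one 〔…〕 group and captures its contents.
// Assumption: groups are not nested.
var optionalPart = regexp.MustCompile(`〔([^〔〕]*)〕`)

// expandOptional is a hypothetical helper. It returns the headword with
// every bracketed part dropped, and the headword with every bracketed
// part kept but the brackets themselves removed.
func expandOptional(headword string) (without, with string) {
	without = optionalPart.ReplaceAllString(headword, "")
	with = optionalPart.ReplaceAllString(headword, "${1}")
	return without, with
}

func main() {
	without, with := expandOptional("愛して〔も〕その悪を知り、憎みて〔も〕その善を知る")
	fmt.Println(without) // 愛してその悪を知り、憎みてその善を知る
	fmt.Println(with)    // 愛してもその悪を知り、憎みてもその善を知る
}
```

This treats all bracket groups in an entry together (all dropped or all kept); whether mixed combinations are also wanted depends on how the dictionary uses the notation.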
