Add user dic support #72

Merged
merged 3 commits into from
Nov 18, 2023

Conversation

@r9y9 r9y9 commented Nov 18, 2023

finally replaces #26 with:

  • more consistent API with the original mecab
  • more descriptive API to let the users know they update the global state of the package
  • unit tests

r9y9 commented Nov 18, 2023

Tests failed on Windows 🤔

Windows fatal exception: access violation

Thread 0x000018a8 (most recent call first):
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 316 in wait
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 581 in wait
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\tqdm\_monitor.py", line 60 in run
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 980 in _bootstrap_inner
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 937 in _bootstrap

Current thread 0x00001044 (most recent call first):
  File "d:\a\pyopenjtalk\pyopenjtalk\pyopenjtalk\__init__.py", line 245 in mecab_dict_index
  File "D:\a\pyopenjtalk\pyopenjtalk\tests\test_openjtalk.py", line 103 in test_userdic

r9y9 commented Nov 18, 2023

================================== FAILURES ===================================
________________________________ test_userdic _________________________________

    def test_userdic():
        for text, expected in [
            ("nnmn", "n a n a m i N"),
            ("GNU", "g u n u u"),
        ]:
            p = pyopenjtalk.g2p(text)
            assert p != expected
    
        with tempfile.NamedTemporaryFile(
            mode="w", encoding="utf-8", suffix=".csv", delete=False
        ) as f:
            f.write("\uff4e\uff4e\uff4d\uff4e,,,1,\u540d\u8a5e,\u4e00\u822c,*,*,*,*,\uff4e\uff4e\uff4d\uff4e,\u30ca\u30ca\u30df\u30f3,\u30ca\u30ca\u30df\u30f3,1/4,*\\n")
            f.write("\uff27\uff2e\uff35,,,1,\u540d\u8a5e,\u4e00\u822c,*,*,*,*,\uff27\uff2e\uff35,\u30b0\u30cc\u30fc,\u30b0\u30cc\u30fc,2/3,*\\n")
            f.close()
    
            with tempfile.NamedTemporaryFile(
                mode="w",
                encoding="utf-8",
                suffix=".dic",
                delete=False,
            ) as f2:
                pyopenjtalk.mecab_dict_index(f.name, f2.name)
>               pyopenjtalk.update_global_jtalk_with_user_dict(f2.name)

tests\test_openjtalk.py:110: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyopenjtalk\__init__.py:266: in update_global_jtalk_with_user_dict
    _global_jtalk = OpenJTalk(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   raise RuntimeError("Failed to initalize Mecab")
E   RuntimeError: Failed to initalize Mecab

Failures on Windows. Hard to debug without a Windows machine...

try not to use features that work differently depending on platforms

r9y9 commented Nov 18, 2023

Windows-specific behavior of tempfiles was the root cause. Worked around it by not using tempfile at all.
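
For readers hitting the same failure, here is a minimal sketch of the "plain paths" approach, not the actual change that was merged. Python's tempfile documentation notes that whether a `NamedTemporaryFile` can be opened a second time by name while it is still open varies across platforms, so the sketch writes and closes the CSV before any other code touches the path. `build_user_dict`, `fake_build`, and the `userdic_demo` directory are hypothetical names; `build` and `load` are injected so the snippet runs without pyopenjtalk installed. In the failing test they would correspond to `pyopenjtalk.mecab_dict_index` and `pyopenjtalk.update_global_jtalk_with_user_dict`.

```python
from pathlib import Path


def build_user_dict(csv_text, workdir, build, load):
    """Write csv_text to a plain file, then pass the paths to build and load.

    build(csv_path, dic_path) and load(dic_path) are injected callables
    (an assumption of this sketch, not part of pyopenjtalk's API).
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    csv_path = workdir / "user.csv"
    dic_path = workdir / "user.dic"
    # write_text opens, writes, and closes the file, so Python holds no
    # open handle when other code opens the path by name.
    csv_path.write_text(csv_text, encoding="utf-8")
    build(str(csv_path), str(dic_path))
    load(str(dic_path))
    return csv_path, dic_path


def fake_build(csv_path, dic_path):
    # Hypothetical stand-in for the dictionary compiler: copies the bytes.
    Path(dic_path).write_bytes(Path(csv_path).read_bytes())


loaded = []  # records what the stand-in loader was asked to load

# Same entry as the second f.write call in the failing test, decoded.
entry = "ＧＮＵ,,,1,名詞,一般,*,*,*,*,ＧＮＵ,グヌー,グヌー,2/3,*\n"
csv_path, dic_path = build_user_dict(entry, "userdic_demo", fake_build, loaded.append)
```

The output files stay in `userdic_demo` after the run, so a test using this pattern needs to remove them itself.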

@r9y9 r9y9 merged commit 26fcdd9 into master Nov 18, 2023
6 checks passed
@r9y9 r9y9 deleted the userdic branch November 18, 2023 13:52
@r9y9 r9y9 mentioned this pull request Nov 18, 2023