Python Binding: Tokenizer.from_file() cannot parse JSON file of tokens #1506

Closed
dwash96 opened this issue Apr 18, 2024 · 2 comments

dwash96 commented Apr 18, 2024

I don't understand why loading the following token file:
https://github.com/Plachtaa/VALL-E-X/blob/master/utils/g2p/bpe_69.json

throws this very unspecific error, given that the project definitely uses this library as a dependency and the file itself was presumably generated with Tokenizer.save at some point.

Traceback (most recent call last):
  File "/app/tts.py", line 335, in <module>
    cli()
  File "/app/tts.py", line 331, in cli
    convert_text_valle()
  File "/app/tts.py", line 294, in convert_text_valle
    from vallex.utils.generation import SAMPLE_RATE, generate_audio, preload_models
  File "/app/.venv/lib/python3.10/site-packages/vallex/utils/generation.py", line 63, in <module>
    text_tokenizer = PhonemeBpeTokenizer(tokenizer_path="./utils/g2p/bpe_69.json")
  File "/app/.venv/lib/python3.10/site-packages/vallex/utils/g2p/__init__.py", line 13, in __init__
    self.tokenizer = Tokenizer.from_file(tokenizer_path)
Exception: invalid type: integer `404`, expected struct Tokenizer at line 1 column 3

Has anyone seen this before and can you point me in the right direction? Is it somehow trying to load the file as a URL and getting a literal 404 not found error? I'm trying to read through the Rust code but so far, I have no intuition about what I am looking at.

I currently have version 0.19.1 installed, and I have tried downgrading to 0.13.1 in case the token file itself is ill-formatted, but the same error is thrown. Any help would be appreciated, thanks!

dwash96 commented Apr 19, 2024

Well, never mind, I figured out that the VALL-E-X library was trying to download the token file from the wrong URL. Closing.
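
For anyone who lands here with the same message, a quick sanity check on the file before passing it to Tokenizer.from_file might look like the sketch below. It uses only the Python standard library. The helper name check_tokenizer_file is hypothetical (it is not part of the tokenizers library), and it assumes that a failed download left an error body such as "404: Not Found" on disk instead of the tokenizer JSON.

```python
import json
from pathlib import Path


def check_tokenizer_file(path):
    """Return (ok, message) saying whether `path` holds a top-level JSON object.

    Hypothetical helper, not part of the tokenizers library. It only checks
    the outer shape of the file, not that it is a valid tokenizer definition.
    """
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # A saved HTTP error page or error string usually ends up here.
        return False, f"not valid JSON ({exc.msg}); file starts with {text[:40]!r}"
    if not isinstance(data, dict):
        # For example, a file containing only the number 404.
        return False, f"top-level JSON value is {type(data).__name__}, expected an object"
    return True, "top-level JSON object found"
```

Calling check_tokenizer_file("./utils/g2p/bpe_69.json") and printing the message shows the first characters of the file, which is usually enough to tell a failed download apart from a real tokenizer file.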

@dwash96 dwash96 closed this as completed Apr 19, 2024
@ArthurZucker
Collaborator

thanks for updating us!
