Installing Wyoming Whisper without Docker looks impossible: 404 error when downloading model #16

Open
Nardol opened this issue May 26, 2023 · 8 comments

Comments

@Nardol

Nardol commented May 26, 2023

I am posting here because the Python Package Index page lists this repository as the home page, so first of all, sorry if this is the wrong place.

I would like to test Whisper using Wyoming.
I use a Home Assistant Core installation, so I do not have Docker at all.
Installing Docker for just one thing does not seem reasonable to me, so I am trying to install Wyoming Whisper manually.

I looked at the add-on code to see how Wyoming Whisper is installed and ran the following on my side:

mkdir -p wyoming-whisper/data
cd wyoming-whisper
python3.11 -m venv venv
source venv/bin/activate
pip install wheel
pip install wyoming-faster-whisper==0.0.3
python3 -m wyoming_faster_whisper --uri 'tcp://0.0.0.0:10300' --model medium --beam-size "1" --language "fr" --data-dir ./data --download-dir ./data

But when I run the last command, I get the following:

WARNING:wyoming_faster_whisper.download:Model hashes do not match
WARNING:wyoming_faster_whisper.download:Expected: {'config.json': 'e5a2f85afc17f73960204cad2b002633', 'model.bin': '5f852c3335fbd24002ffbb965174e3d7', 'vocabulary.txt': 'c1120a13c94a8cbb132489655cdd1854'}
WARNING:wyoming_faster_whisper.download:Got: {'model.bin': '', 'config.json': '', 'vocabulary.txt': ''}
INFO:__main__:Downloading FasterWhisperModel.MEDIUM to ./data
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/pzajda/wyoming-whisper/venv/lib/python3.11/site-packages/wyoming_faster_whisper/__main__.py", line 135, in <module>
    asyncio.run(main())
  File "/home/pzajda/.pyenv/versions/3.11.2/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/pzajda/.pyenv/versions/3.11.2/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pzajda/.pyenv/versions/3.11.2/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/pzajda/wyoming-whisper/venv/lib/python3.11/site-packages/wyoming_faster_whisper/__main__.py", line 75, in main
    model_dir = download_model(model, args.download_dir)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pzajda/wyoming-whisper/venv/lib/python3.11/site-packages/wyoming_faster_whisper/download.py", line 90, in download_model
    with urlopen(model_url) as response:
         ^^^^^^^^^^^^^^^^^^
  File "/home/pzajda/.pyenv/versions/3.11.2/lib/python3.11/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pzajda/.pyenv/versions/3.11.2/lib/python3.11/urllib/request.py", line 525, in open
    response = meth(req, response)
               ^^^^^^^^^^^^^^^^^^^
  File "/home/pzajda/.pyenv/versions/3.11.2/lib/python3.11/urllib/request.py", line 634, in http_response
    response = self.parent.error(
               ^^^^^^^^^^^^^^^^^^
  File "/home/pzajda/.pyenv/versions/3.11.2/lib/python3.11/urllib/request.py", line 563, in error
    return self._call_chain(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pzajda/.pyenv/versions/3.11.2/lib/python3.11/urllib/request.py", line 496, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "/home/pzajda/.pyenv/versions/3.11.2/lib/python3.11/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
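
The "Got" line in the warnings at the top of this log shows an empty hash for every file, which is consistent with nothing having been downloaded into ./data yet. A minimal sketch of that kind of check, assuming the hashes are MD5 hex digests (their 32-character length suggests so, but nothing in this thread confirms it). The helper name md5_of_files is hypothetical and is not the package's actual code:

```python
import hashlib
from pathlib import Path


def md5_of_files(model_dir, names):
    """Return {name: md5 hex digest}, using "" for files that are missing.

    Hypothetical helper for illustration; not taken from wyoming-faster-whisper.
    """
    hashes = {}
    for name in names:
        path = Path(model_dir) / name
        if not path.is_file():
            # A missing file is reported as an empty hash, as in the log above.
            hashes[name] = ""
            continue
        digest = hashlib.md5()
        with open(path, "rb") as handle:
            # Read in chunks so a large model.bin is never fully in memory.
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        hashes[name] = digest.hexdigest()
    return hashes
```

Calling md5_of_files("./data", ["config.json", "model.bin", "vocabulary.txt"]) on an empty directory would return an empty string for each name, so a mismatch warning on a first run is not by itself a sign of a problem.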

How can it work for the add-on but not manually?
And how can I solve this?
I also posted a topic on the Home Assistant community forum, but it looks like I am the only one with this kind of setup 🙁

@synesthesiam
Contributor

synesthesiam commented May 27, 2023

The "medium" model is not available here: https://github.com/rhasspy/models/releases/tag/v1.0
It was quite large, so I didn't upload it or the large model.

You can create it yourself by following these steps: https://github.com/guillaumekln/faster-whisper#model-conversion

@Nardol
Author

Nardol commented May 28, 2023

Before trying the procedure you linked to, I tested with tiny-int8 to try a smaller model first, but I get the same 404 result.
What am I doing wrong?
python3 -m wyoming_faster_whisper --uri 'tcp://0.0.0.0:10300' --model tiny-int8 --beam-size "1" --language "fr" --data-dir ./data --download-dir ./data

@synesthesiam
Contributor

Weird. Can you get the full URL it's trying for the model?

@Nardol
Author

Nardol commented May 29, 2023

I added a print(model_url) call just before the urlopen call, which gave me the following:
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-FasterWhisperModel.TINY_INT8.tar.gz
When model is used instead of model.value, the URL contains the enum member name FasterWhisperModel.TINY_INT8 rather than the value tiny-int8.
Python version: 3.11.2
I could not find where the source code lives to make a PR.
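
For anyone debugging a similar 404, the same information can be obtained by wrapping the request so that the failing URL ends up in the error message. This is only a sketch: open_model_url and its opener parameter are hypothetical names introduced here, not part of wyoming-faster-whisper.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def open_model_url(model_url, opener=urlopen):
    """Open model_url, naming the URL in the error if the request fails.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib's urlopen.
    """
    try:
        return opener(model_url)
    except HTTPError as err:
        # Chain the original error so the full traceback stays available.
        raise RuntimeError(f"HTTP {err.code} while fetching {model_url}") from err
```

With this in place, the traceback would end with the exact URL that was requested, which makes a malformed model name visible at a glance.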

@synesthesiam
Contributor

Ok, so this must be a difference with enums and Python 3.11. Thanks!

@antlarr

antlarr commented Jun 12, 2023

Hi, I had the same problem as @Nardol. @synesthesiam, as you said, this is a difference in Python 3.11, where you now have to replace:
model_url = URL_FORMAT.format(model=model)
with
model_url = URL_FORMAT.format(model=model.value)
in download.py (in download_model)
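
A minimal sketch of that change in context, under a few assumptions: URL_FORMAT and the two enum members are copied from elsewhere in this thread, only a subset of the enum is shown, and model_url_for is a hypothetical helper rather than the real download_model. It also accepts a plain string such as "tiny-int8" by looking the member up by value first:

```python
from enum import Enum

URL_FORMAT = "https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-{model}.tar.gz"


class FasterWhisperModel(str, Enum):
    # Only two members shown here for brevity.
    TINY = "tiny"
    TINY_INT8 = "tiny-int8"


def model_url_for(model):
    """Build the download URL from the member's value, not the member itself.

    Accepts either a FasterWhisperModel member or its string value;
    an unknown name raises ValueError from the Enum lookup.
    """
    member = FasterWhisperModel(model)
    return URL_FORMAT.format(model=member.value)


print(model_url_for(FasterWhisperModel.TINY_INT8))
# https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-tiny-int8.tar.gz
```

Because the URL is built from .value, the result does not depend on how a given Python version formats enum members.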

Like @Nardol, I also couldn't find the source code to make a PR. Could you point us to where it is? Even if you've probably already fixed this, it would be nice to know where it is, just in case someone wants to submit some other fix or improvement.

Thanks!

@taha-yassine

I'm answering since no one did. The source code is sitting over in the v0.1.0 branch. It can be found here: https://github.com/rhasspy/rhasspy3/blob/v0.1.0/programs/asr/faster-whisper/script/download.py

@mweinelt

mweinelt commented Oct 26, 2023

Quoting @synesthesiam: "Ok, so this must be a difference with enums and Python 3.11. Thanks!"

That's spot on; I tested it using the following reproducer. The code should be patched to use the .value accessor as indicated by @antlarr. The code is on the wyoming-v1 branch.

from enum import Enum

URL_FORMAT = "https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-{model}.tar.gz"

class FasterWhisperModel(str, Enum):
    """Available faster-whisper models."""

    TINY = "tiny"
    TINY_INT8 = "tiny-int8"
    BASE = "base"
    BASE_INT8 = "base-int8"
    SMALL = "small"
    SMALL_INT8 = "small-int8"
    MEDIUM = "medium"
    MEDIUM_INT8 = "medium-int8"


tiny = FasterWhisperModel.TINY

print(URL_FORMAT.format(model=tiny))
print(URL_FORMAT.format(model=tiny.value))

$ python3.8 enumtest.py
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-tiny.tar.gz
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-tiny.tar.gz
$ python3.9 enumtest.py
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-tiny.tar.gz
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-tiny.tar.gz
$ python3.10 enumtest.py
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-tiny.tar.gz
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-tiny.tar.gz
$ python3.11 enumtest.py
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-FasterWhisperModel.TINY.tar.gz
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-tiny.tar.gz
$ python3.12 enumtest.py
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-FasterWhisperModel.TINY.tar.gz
https://github.com/rhasspy/models/releases/download/v1.0/asr_faster-whisper-tiny.tar.gz
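
Besides patching each call site to use .value, another option is to give the enum a __str__ that returns the value, so that str(), str.format() and f-strings all agree regardless of Python version. This is only a sketch of that alternative, not what the project did; ModelName is a hypothetical stand-in for FasterWhisperModel:

```python
from enum import Enum


class ModelName(str, Enum):
    """Hypothetical stand-in for FasterWhisperModel with a value-returning __str__."""

    TINY = "tiny"
    TINY_INT8 = "tiny-int8"

    def __str__(self):
        # Return the plain value instead of "ModelName.TINY_INT8".
        return self.value


member = ModelName.TINY_INT8
print(str(member))  # tiny-int8
print("asr_faster-whisper-{model}.tar.gz".format(model=member))  # asr_faster-whisper-tiny-int8.tar.gz
print(f"{member}")  # tiny-int8
```

On Python 3.11 and later, enum.StrEnum provides the same str() and format() behaviour out of the box, but it is not available on older versions.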

mweinelt added a commit to mweinelt/rhasspy3 that referenced this issue Oct 26, 2023
From Python 3.11, formatting a str-based enum member no longer
yields its value. Instead, the qualified member name (for example
FasterWhisperModel.TINY) is returned.

This broke model download on wyoming-faster-whisper, since it changed
the URL under which model downloads would be looked for.

Closes: rhasspy#16