Termux error with Whisperx, not in Termux Prooted Debian #1651

Open
Manamama opened this issue Feb 18, 2024 · 2 comments

Comments

@Manamama

Tested versions

This output:

Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0, yours is 2.1.0. Bad things might happen unless you revert torch to 1.x.
>>Performing transcription...

is the desired result, but the version strings need to be sanitized first:

from typing import Text

from semver import VersionInfo


def check_version(library: Text, theirs: Text, mine: Text, what: Text = "Pipeline"):

    # Added: sanitize the version strings before parsing.
    # Drop the local build tag after "+" and turn an "a0" suffix into ".0",
    # e.g. "2.1.0a0+gita8e7c98" -> "2.1.0.0".
    theirs = theirs.split("+")[0].replace("a0", ".0")
    mine = mine.split("+")[0].replace("a0", ".0")

    # Keep only major.minor.patch, e.g. "2.1.0.0" -> "2.1.0".
    theirs = ".".join(theirs.split(".")[:3])
    mine = ".".join(mine.split(".")[:3])

    theirs = VersionInfo.parse(theirs)
    mine = VersionInfo.parse(mine)

    if theirs.major > mine.major:
        print(
            f"{what} was trained with {library} {theirs}, yours is {mine}. "
            f"Bad things will probably happen unless you upgrade {library} to {theirs.major}.x."
        )

    elif theirs.major < mine.major:
        print(
            f"{what} was trained with {library} {theirs}, yours is {mine}. "
            f"Bad things might happen unless you revert {library} to {theirs.major}.x."
        )

    elif theirs.minor > mine.minor:
        print(
            f"{what} was trained with {library} {theirs}, yours is {mine}. "
            f"This should be OK but you might want to upgrade {library}."
        )

to avoid:

...kages/pyannote/audio/utils/version.py", line 34, in check_version
    mine = VersionInfo.parse(mine)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/semver/version.py", line 646, in parse
    raise ValueError(f"{version} is not valid SemVer string")
ValueError: 2.1.0a0+gita8e7c98 is not valid SemVer string
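For context, here is a small sketch of why the parser rejects this string. SemVer 2.0.0 expects a pre-release tag to be separated from the patch number by a hyphen, so `2.1.0a0+gita8e7c98` (a PEP 440 style string) does not fit the grammar. The pattern below is modelled on the regular expression suggested on semver.org. The helper name `is_valid_semver` is made up for this illustration and is not part of semver or pyannote.audio.

```python
import re

# Pattern modelled on the regex suggested on semver.org (SemVer 2.0.0).
SEMVER_RE = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"          # major.minor.patch
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"       # optional -prerelease
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"         # optional +build
)


def is_valid_semver(version: str) -> bool:
    """Hypothetical helper: True if `version` matches the SemVer 2.0.0 grammar."""
    return SEMVER_RE.match(version) is not None


for candidate in ("2.1.0", "2.1.0a0+gita8e7c98", "2.1.0-a0+gita8e7c98"):
    print(candidate, is_valid_semver(candidate))
```

Only the middle candidate fails: the `a0` follows the patch number directly, without the hyphen that would mark it as a pre-release.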

The bug occurs only in:

Operating System: Android 11 aarch64
Kernel: 4.14.186+
Shell: /data/data/com.termux/files/usr/bin/bash 5.2.26
Python: 3.11.8
------
Display Manager Display Server: Desktop Environment theme: Adwaita [GTK3] Icons theme: Adwaita [GTK3]

with

Name: pyannote.audio
Version: 3.1.1
Summary: Neural building blocks for speaker diarization.

It works fine with:

root@localhost
--------------
OS: Debian GNU/Linux 12 (bookworm) aarch64
Host: realme RMX3085
Kernel: 6.2.1-PRoot-Distro
Uptime: 2 mins
Packages: 892 (dpkg), 1 (pkg)
Shell: bash 5.2.15
Terminal: proot

  • I am not sure why, since it is the same machine and probably the same library versions.
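One way to test the "same library versions" guess is to print the installed version strings in both environments and compare them. The ValueError above shows that the Termux side reports `2.1.0a0+gita8e7c98`, which looks like a source build of torch; the PRoot Debian value is not shown in this report, so any difference is an assumption until checked. A minimal sketch using only the standard library, where `report_versions` is a hypothetical helper name:

```python
from importlib import metadata


def report_versions(packages):
    """Return {package name: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Run once in plain Termux and once inside PRoot Debian, then compare.
    for name, version in report_versions(["torch", "pyannote.audio", "semver"]).items():
        print(f"{name}: {version}")
```

The package metadata may differ from `torch.__version__`, so it is worth printing that attribute in both environments as well.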

System information

Android 11 aarch64, Termux

Issue description

See above

Minimal reproduction example (MRE)

...

@Manamama
Author

Manamama commented Mar 4, 2024

Same on a fresh pip install on a new Android device, Termux only. Without my fix above, I get:

whisperx --compute_type float32 --print_progress True --language Polish "/storage/7B27-F244/Audio/Music/Old 3/R.U.T.A. - Zew Hord.mp3"
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.0.post0. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../../.cache/torch/whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Traceback (most recent call last):
  File "/data/data/com.termux/files/usr/bin/whisperx", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/transcribe.py", line 170, in cli
    model = load_model(model_name, device=device, device_index=device_index, download_root=model_dir, compute_type=compute_type, language=args['language'], asr_options=asr_options, vad_options={"vad_onset": vad_onset, "vad_offset": vad_offset}, task=task, threads=faster_whisper_threads)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/asr.py", line 345, in load_model
    vad_model = load_vad_model(torch.device(device), use_auth_token=None, **default_vad_options)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/vad.py", line 51, in load_vad_model
    vad_model = Model.from_pretrained(model_fp, use_auth_token=use_auth_token)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/core/model.py", line 696, in from_pretrained
    model = Klass.load_from_checkpoint(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/utilities/model_helpers.py", line 125, in wrapper
    return self.method(cls, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/module.py", line 1581, in load_from_checkpoint
    loaded = _load_from_checkpoint(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/saving.py", line 91, in _load_from_checkpoint
    model = _load_state(cls, checkpoint, strict=strict, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/saving.py", line 177, in _load_state
    obj.on_load_checkpoint(checkpoint)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/core/model.py", line 295, in on_load_checkpoint
    check_version(
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/utils/version.py", line 34, in check_version
    mine = VersionInfo.parse(mine)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/semver/version.py", line 646, in parse
    raise ValueError(f"{version} is not valid SemVer string")
ValueError: 2.1.0a0+gita8e7c98 is not valid SemVer string

~/downloads/fixed $ whisperx --compute_type float32 --print_progress True "/storage/7B27-F244/Audio/Music/Old 3/R.U.T.A. - Zew Hord.mp3"
No language specified, language will be first be detected for each audio file (increases inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.0.post0. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../../.cache/torch/whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Traceback (most recent call last):
  File "/data/data/com.termux/files/usr/bin/whisperx", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/transcribe.py", line 170, in cli
    model = load_model(model_name, device=device, device_index=device_index, download_root=model_dir, compute_type=compute_type, language=args['language'], asr_options=asr_options, vad_options={"vad_onset": vad_onset, "vad_offset": vad_offset}, task=task, threads=faster_whisper_threads)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/asr.py", line 345, in load_model
    vad_model = load_vad_model(torch.device(device), use_auth_token=None, **default_vad_options)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/vad.py", line 51, in load_vad_model
    vad_model = Model.from_pretrained(model_fp, use_auth_token=use_auth_token)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/core/model.py", line 696, in from_pretrained
    model = Klass.load_from_checkpoint(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/utilities/model_helpers.py", line 125, in wrapper
    return self.method(cls, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/module.py", line 1581, in load_from_checkpoint
    loaded = _load_from_checkpoint(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/saving.py", line 91, in _load_from_checkpoint
    model = _load_state(cls, checkpoint, strict=strict, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/saving.py", line 177, in _load_state
    obj.on_load_checkpoint(checkpoint)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/core/model.py", line 295, in on_load_checkpoint
    check_version(
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/utils/version.py", line 34, in check_version
    mine = VersionInfo.parse(mine)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/semver/version.py", line 646, in parse
    raise ValueError(f"{version} is not valid SemVer string")
ValueError: 2.1.0a0+gita8e7c98 is not valid SemVer string

~/downloads/fixed $ pip show pyannote.audio
Name: pyannote.audio
Version: 3.1.1
Summary: Neural building blocks for speaker diarization
Home-page: https://github.com/pyannote/pyannote-audio
Author: Hervé Bredin
Author-email: herve.bredin@irit.fr
License: mit
Location: /data/data/com.termux/files/usr/lib/python3.11/site-packages
Requires: asteroid-filterbanks, einops, huggingface-hub, lightning, omegaconf, pyannote.core, pyannote.database, pyannote.metrics, pyannote.pipeline, pytorch-metric-learning, rich, semver, soundfile, speechbrain, tensorboardX, torch, torch-audiomentations, torchaudio, torchmetrics
Required-by: whisperx
~/downloads/fixed $

@Manamama
Author

Manamama commented Mar 4, 2024

And with semver fix:

whisperx --compute_type float32 --print_progress True "/storage/7B27-F244/Audio/Music/Old 3/R.U.T.A. - Zew Hord.mp3"
No language specified, language will be first be detected for each audio file (increases inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.0.post0. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../../../.cache/torch/whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0, yours is 2.1.0. Bad things might happen unless you revert torch to 1.x.
>>Performing transcription...
Warning: audio is shorter than 30s, language detection may be inaccurate.
...
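The sanitizing step could also be isolated into a small helper that does not depend on the exact `a0` suffix. The sketch below keeps only the leading numeric `major.minor.patch` part, so strings such as `2.1.0a0+gita8e7c98` or `2.2.0.post0` reduce to something a SemVer parser accepts. `sanitize_version` is a hypothetical name, not an existing pyannote.audio or semver function. Dropping the pre-release tag is assumed to be acceptable here because the check only compares major and minor numbers.

```python
import re

# Leading "major[.minor[.patch]]", optionally prefixed with "v".
_RELEASE_RE = re.compile(r"^\s*v?(\d+)(?:\.(\d+))?(?:\.(\d+))?")


def sanitize_version(raw: str) -> str:
    """Hypothetical helper: reduce a version string to plain 'major.minor.patch'."""
    match = _RELEASE_RE.match(raw)
    if match is None:
        raise ValueError(f"cannot find a numeric version in {raw!r}")
    # Missing minor or patch components default to 0; int() drops leading zeros.
    major, minor, patch = (int(part or 0) for part in match.groups())
    return f"{major}.{minor}.{patch}"


for raw in ("2.1.0a0+gita8e7c98", "1.10.0", "2.2.0.post0", "3.1"):
    print(raw, "->", sanitize_version(raw))
```

The result can then be passed to `VersionInfo.parse` in place of the raw string.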
