Termux error with Whisperx, not in Termux Prooted Debian #1651

Manamama · 2024-02-18T03:57:59Z

Tested versions

This output:

Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0, yours is 2.1.0. Bad things might happen unless you revert torch to 1.x.
>>Performing transcription...

is what I want, but the version strings need to be sanitized first in pyannote/audio/utils/version.py:

from typing import Text

from semver import VersionInfo


def check_version(library: Text, theirs: Text, mine: Text, what: Text = "Pipeline"):

    # Sanitize the version strings (added): drop local build metadata such as
    # "+gita8e7c98" and turn an alpha suffix like "0a0" into "0.0", so that
    # "2.1.0a0+gita8e7c98" becomes "2.1.0.0".
    theirs = theirs.split("+")[0].replace("a0", ".0")
    mine = mine.split("+")[0].replace("a0", ".0")

    # Keep only the first three components (major.minor.patch).
    theirs = ".".join(theirs.split(".")[:3])
    mine = ".".join(mine.split(".")[:3])

    theirs = VersionInfo.parse(theirs)
    mine = VersionInfo.parse(mine)

    if theirs.major > mine.major:
        print(
            f"{what} was trained with {library} {theirs}, yours is {mine}. "
            f"Bad things will probably happen unless you upgrade {library} to {theirs.major}.x."
        )

    elif theirs.major < mine.major:
        print(
            f"{what} was trained with {library} {theirs}, yours is {mine}. "
            f"Bad things might happen unless you revert {library} to {theirs.major}.x."
        )

    elif theirs.minor > mine.minor:
        print(
            f"{what} was trained with {library} {theirs}, yours is {mine}. "
            f"This should be OK but you might want to upgrade {library}."
        )

to avoid:

...kages/pyannote/audio/utils/version.py", line 34, in check_version
    mine = VersionInfo.parse(mine)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/semver/version.py", line 646, in parse
    raise ValueError(f"{version} is not valid SemVer string")
ValueError: 2.1.0a0+gita8e7c98 is not valid SemVer string
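For readers wondering why that string is rejected: SemVer requires a pre-release tag to be separated from the patch number by a hyphen, while 2.1.0a0+gita8e7c98 uses the PEP 440 style with no separator. The sketch below is only an illustration using the regular expression suggested by the SemVer 2.0.0 specification; `is_semver` is a hypothetical helper name and is not part of semver or pyannote.audio.

```python
import re

# Regular expression suggested by the SemVer 2.0.0 specification (semver.org).
SEMVER_RE = re.compile(
    r"^(?P<major>0|[1-9]\d*)\.(?P<minor>0|[1-9]\d*)\.(?P<patch>0|[1-9]\d*)"
    r"(?:-(?P<prerelease>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def is_semver(version: str) -> bool:
    """Return True if `version` matches the SemVer 2.0.0 grammar (hypothetical helper)."""
    return SEMVER_RE.match(version) is not None


print(is_semver("2.1.0a0+gita8e7c98"))   # False: "a0" follows the patch number without a hyphen
print(is_semver("2.1.0-a0+gita8e7c98"))  # True: same information written in SemVer form
print(is_semver("2.1.0"))                # True
```

This suggests the failure depends on how the installed torch reports its version, not on the audio file or the whisperx options.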

The bug occurs only in:

Operating System: Android 11 aarch64
Kernel: 4.14.186+
Shell: /data/data/com.termux/files/usr/bin/bash 5.2.26
Python: 3.11.8
------
Display Manager Display Server: Desktop Environment theme: Adwaita [GTK3] Icons theme: Adwaita [GTK3]

with

Name: pyannote.audio
Version: 3.1.1
Summary: Neural building blocks for speaker diarization.

It works fine with:

root@localhost
--------------
OS: Debian GNU/Linux 12 (bookworm) aarch64
Host: realme RMX3085
Kernel: 6.2.1-PRoot-Distro
Uptime: 2 mins
Packages: 892 (dpkg), 1 (pkg)
Shell: bash 5.2.15
Terminal: proot

I am not sure why, since it is the same machine and probably the same library versions.
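One way to test the "same library versions" assumption is to run the same small script in both environments and compare the output. This is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper, the package list is just the set of names mentioned in this issue, and the metadata version may differ from what a module reports in its own `__version__` attribute.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from this issue; adjust as needed.
    packages = ["torch", "pyannote.audio", "semver", "whisperx"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

If torch shows a suffix such as a0+git... in Termux but a plain release number in the prooted Debian, that difference alone could explain why only Termux hits the ValueError.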

System information

Android 11 aarch64, Termux

Issue description

See above

Minimal reproduction example (MRE)

...


Manamama · 2024-03-04T00:14:38Z

Same on a fresh pip install on a new Android device, Termux only. Without my fix above, I get:

whisperx --compute_type float32 --print_progress True --language Polish "/storage/7B27-F244/Audio/Music/Old 3/R.U.T.A. - Zew Hord.mp3"
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.0.post0. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../../.cache/torch/whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Traceback (most recent call last):
  File "/data/data/com.termux/files/usr/bin/whisperx", line 8, in <module>
    sys.exit(cli())
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/transcribe.py", line 170, in cli
    model = load_model(model_name, device=device, device_index=device_index, download_root=model_dir, compute_type=compute_type, language=args['language'], asr_options=asr_options, vad_options={"vad_onset": vad_onset, "vad_offset": vad_offset}, task=task, threads=faster_whisper_threads)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/asr.py", line 345, in load_model
    vad_model = load_vad_model(torch.device(device), use_auth_token=None, **default_vad_options)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/vad.py", line 51, in load_vad_model
    vad_model = Model.from_pretrained(model_fp, use_auth_token=use_auth_token)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/core/model.py", line 696, in from_pretrained
    model = Klass.load_from_checkpoint(
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/utilities/model_helpers.py", line 125, in wrapper
    return self.method(cls, *args, **kwargs)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/module.py", line 1581, in load_from_checkpoint
    loaded = _load_from_checkpoint(
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/saving.py", line 91, in _load_from_checkpoint
    model = _load_state(cls, checkpoint, strict=strict, **kwargs)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/saving.py", line 177, in _load_state
    obj.on_load_checkpoint(checkpoint)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/core/model.py", line 295, in on_load_checkpoint
    check_version(
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/utils/version.py", line 34, in check_version
    mine = VersionInfo.parse(mine)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/semver/version.py", line 646, in parse
    raise ValueError(f"{version} is not valid SemVer string")
ValueError: 2.1.0a0+gita8e7c98 is not valid SemVer string

~/downloads/fixed $ whisperx --compute_type float32 --print_progress True "/storage/7B27-F244/Audio/Music/Old 3/R.U.T.A. - Zew Hord.mp3"
No language specified, language will be first be detected for each audio file (increases inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.0.post0. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../../.cache/torch/whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Traceback (most recent call last):
  File "/data/data/com.termux/files/usr/bin/whisperx", line 8, in <module>
    sys.exit(cli())
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/transcribe.py", line 170, in cli
    model = load_model(model_name, device=device, device_index=device_index, download_root=model_dir, compute_type=compute_type, language=args['language'], asr_options=asr_options, vad_options={"vad_onset": vad_onset, "vad_offset": vad_offset}, task=task, threads=faster_whisper_threads)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/asr.py", line 345, in load_model
    vad_model = load_vad_model(torch.device(device), use_auth_token=None, **default_vad_options)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/whisperx/vad.py", line 51, in load_vad_model
    vad_model = Model.from_pretrained(model_fp, use_auth_token=use_auth_token)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/core/model.py", line 696, in from_pretrained
    model = Klass.load_from_checkpoint(
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/utilities/model_helpers.py", line 125, in wrapper
    return self.method(cls, *args, **kwargs)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/module.py", line 1581, in load_from_checkpoint
    loaded = _load_from_checkpoint(
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/saving.py", line 91, in _load_from_checkpoint
    model = _load_state(cls, checkpoint, strict=strict, **kwargs)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pytorch_lightning/core/saving.py", line 177, in _load_state
    obj.on_load_checkpoint(checkpoint)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/core/model.py", line 295, in on_load_checkpoint
    check_version(
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pyannote/audio/utils/version.py", line 34, in check_version
    mine = VersionInfo.parse(mine)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/semver/version.py", line 646, in parse
    raise ValueError(f"{version} is not valid SemVer string")
ValueError: 2.1.0a0+gita8e7c98 is not valid SemVer string

~/downloads/fixed $ pip show pyannote.audio
Name: pyannote.audio
Version: 3.1.1
Summary: Neural building blocks for speaker diarization
Home-page: https://github.com/pyannote/pyannote-audio
Author: Hervé Bredin
Author-email: herve.bredin@irit.fr
License: mit
Location: /data/data/com.termux/files/usr/lib/python3.11/site-packages
Requires: asteroid-filterbanks, einops, huggingface-hub, lightning, omegaconf, pyannote.core, pyannote.database, pyannote.metrics, pyannote.pipeline, pytorch-metric-learning, rich, semver, soundfile, speechbrain, tensorboardX, torch, torch-audiomentations, torchaudio, torchmetrics
Required-by: whisperx

Manamama · 2024-03-04T00:27:25Z

And with the semver fix:

whisperx --compute_type float32 --print_progress True "/storage/7B27-F244/Audio/Music/Old 3/R.U.T.A. - Zew Hord.mp3"
No language specified, language will be first be detected for each audio file (increases inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.0.post0. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../../../.cache/torch/whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0, yours is 2.1.0. Bad things might happen unless you revert torch to 1.x.
>>Performing transcription...
Warning: audio is shorter than 30s, language detection may be inaccurate.
...
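The sanitizing step could also be written without the replace("a0", ".0") trick, which only covers the a0 suffix. The following is a sketch under assumptions, not the pyannote.audio implementation: `semver_core` is a hypothetical helper that keeps only the leading numeric release and discards any pre-release or local build suffix. That loses the alpha marker, which seems acceptable here because check_version only compares major and minor.

```python
import re

# Leading numeric release: up to three dot-separated integers, optional "v" prefix.
_RELEASE_RE = re.compile(r"^\s*v?(\d+)(?:\.(\d+))?(?:\.(\d+))?")


def semver_core(version: str) -> str:
    """Reduce a PEP 440 style version to a plain 'major.minor.patch' string."""
    match = _RELEASE_RE.match(version)
    if match is None:
        raise ValueError(f"no numeric release found in {version!r}")
    # Missing components default to 0; int() also strips leading zeros.
    major, minor, patch = (part or "0" for part in match.groups())
    return f"{int(major)}.{int(minor)}.{int(patch)}"


print(semver_core("2.1.0a0+gita8e7c98"))  # 2.1.0
print(semver_core("2.2.0.post0"))         # 2.2.0
print(semver_core("1.10.0"))              # 1.10.0
```

The result can then be passed to VersionInfo.parse in place of the raw string.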

hbredin added the cannot_reproduce label Feb 18, 2024
