Description
Problem Description:
I am using WhisperX for speech-to-text through its Python interface on Python 3.10. On my Mac everything works perfectly, but on a machine running Arch Linux with an Nvidia GTX1080 GPU I run into problems when using WhisperX alongside OpenCV. Each library works fine on its own, but when both are imported in the same program, OpenCV fails to open windows and the rest of the program crashes.
Code to reproduce the issue:
import whisperx
import cv2
cv2.namedWindow("pippo", cv2.WINDOW_NORMAL)
Working code:
import cv2
cv2.namedWindow("pippo", cv2.WINDOW_NORMAL)
Additional Details:
- Operating System: Arch Linux
- GPU: Nvidia GTX1080
- Python Version: 3.10.16 (main, Apr 12 2025, 11:37:09) [GCC 14.2.1 20250207]
- Installed Python Libraries:
(.venv) [ifab@ifab-ms7b48 ifab]$ pip freeze
aiohappyeyeballs==2.6.1
aiohttp==3.11.16
aiosignal==1.3.2
alembic==1.15.2
antlr4-python3-runtime==4.9.3
argcomplete==3.6.1
asteroid-filterbanks==0.4.0
async-timeout==5.0.1
attrs==25.3.0
av==14.3.0
bidict==0.23.1
certifi==2025.1.31
cffi==1.17.1
charset-normalizer==3.4.1
click==8.1.8
coloredlogs==15.0.1
colorlog==6.9.0
contourpy==1.3.1
ctranslate2==4.6.0
cv2_enumerate_cameras==1.1.18.3
cycler==0.12.1
docopt==0.6.2
einops==0.8.1
faster-whisper==1.1.0
filelock==3.18.0
Flask==2.2.3
Flask-Cors==3.0.10
Flask-SocketIO==5.3.3
flatbuffers==25.2.10
fonttools==4.56.0
frozenlist==1.5.0
fsspec==2025.3.2
greenlet==3.2.1
h11==0.14.0
hf-xet==1.0.3
huggingface-hub==0.30.2
humanfriendly==10.0
HyperPyYAML==1.2.2
idna==3.10
itsdangerous==2.2.0
Jinja2==3.1.6
joblib==1.4.2
julius==0.2.7
kiwisolver==1.4.8
lightning==2.5.1
lightning-utilities==0.14.3
Mako==1.3.10
markdown-it-py==3.0.0
MarkupSafe==3.0.2
matplotlib==3.10.1
mdurl==0.1.2
mergedeep==1.3.4
mpmath==1.3.0
multidict==6.4.3
networkx==3.4.2
nltk==3.9.1
numpy==2.2.4
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-cusparselt-cu12==0.6.2
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
omegaconf==2.3.0
onnxruntime==1.21.0
opencv-python==4.11.0.86
optuna==4.2.1
packaging==24.2
pandas==2.2.3
pillow==11.1.0
pip-system-certs==4.0
piper-phonemize==1.1.0
piper-tts==1.2.0
piper_phonemize_cross==1.2.1
primePy==1.3
propcache==0.3.1
protobuf==6.30.2
pyannote.audio==3.3.2
pyannote.core==5.0.0
pyannote.database==5.1.3
pyannote.metrics==3.2.1
pyannote.pipeline==3.0.1
pycparser==2.22
Pygments==2.19.1
pyparsing==3.2.3
python-dateutil==2.9.0.post0
python-engineio==4.4.0
python-socketio==5.8.0
pytorch-lightning==2.5.1
pytorch-metric-learning==2.8.1
pytz==2025.2
PyYAML==6.0.2
regex==2024.11.6
requests==2.28.2
rich==14.0.0
ruamel.yaml==0.18.10
ruamel.yaml.clib==0.2.12
safetensors==0.5.3
scikit-learn==1.6.1
scipy==1.15.2
semver==3.0.4
sentencepiece==0.2.0
shellingham==1.5.4
simple-websocket==0.10.0
six==1.17.0
sortedcontainers==2.4.0
sounddevice==0.5.1
soundfile==0.13.1
speechbrain==1.0.3
SQLAlchemy==2.0.40
sympy==1.13.1
tabulate==0.9.0
tensorboardX==2.6.2.2
threadpoolctl==3.6.0
tokenizers==0.21.1
torch==2.6.0
torch-audiomentations==0.12.0
torch_pitch_shift==1.2.5
torchaudio==2.6.0
torchmetrics==1.7.1
tqdm==4.67.1
transformers==4.51.2
triton==3.2.0
typer==0.15.2
typing_extensions==4.13.2
tzdata==2025.2
urllib3==1.26.20
websocket-client==1.5.1
Werkzeug==2.2.3
whisperx==3.3.2
wrapt==1.17.2
wsproto==1.2.0
yarl==1.19.0
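To narrow this down, the failing and working snippets can each be run in a fresh interpreter so that a crash in one case cannot affect the others. This is only a minimal sketch: `run_snippet` and `CASES` are hypothetical names of my own, and the reversed import order ("cv2 then whisperx") is an extra case I have not tested above.

```python
import subprocess
import sys


def run_snippet(code: str, timeout: float = 120.0) -> int:
    """Run `code` in a separate Python process and return its exit code.

    Hypothetical helper, not part of WhisperX or OpenCV.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode


# Same window call as in the snippets from this report.
WINDOW = 'cv2.namedWindow("pippo", cv2.WINDOW_NORMAL)'

CASES = {
    "cv2 only": f"import cv2\n{WINDOW}",
    "whisperx then cv2": f"import whisperx\nimport cv2\n{WINDOW}",
    # Assumption: reversed order added only to see whether import order matters.
    "cv2 then whisperx": f"import cv2\nimport whisperx\n{WINDOW}",
}

if __name__ == "__main__":
    for name, code in CASES.items():
        print(f"{name}: exit code {run_snippet(code)}")
```

An exit code of 0 means the snippet finished normally. On Linux, a negative exit code means the child process was killed by a signal (for example, -11 corresponds to SIGSEGV), which would help distinguish a native crash from a Python exception.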
Please help me resolve this issue or find a workaround. I am reporting it here because OpenCV runs entirely on the CPU, and I have noticed that the import line takes a long time to execute, which leads me to believe that something happening during the import causes the problem.
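The slow import could be quantified with something like the following. It is a rough sketch, assuming wall-clock time is a good enough measure; `time_import` is a hypothetical helper name, and the module names are passed on the command line.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return the wall-clock seconds spent importing `module_name`.

    Hypothetical helper. A module that is already imported returns almost
    instantly, so each measurement is best taken in a fresh process.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Example: python time_import.py whisperx cv2
    for name in sys.argv[1:]:
        print(f"{name}: {time_import(name):.2f} s")
```

For a per-module breakdown, CPython's built-in `-X importtime` option (for example `python -X importtime -c "import whisperx"`) prints the time spent in each nested import, which may show which dependency accounts for the delay.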
Thank you for your attention.
EmaMaker