MPS word timestamp failure due to .double() conversion order #2804
janngobble started this conversation in General
Summary
When running Whisper on Apple Silicon using the MPS backend with --word_timestamps True, transcription fails because the code attempts to convert an MPS tensor to float64 before moving it to the CPU.
Changing the conversion order fixes the problem while preserving the original intent of performing the DTW calculation in double precision.
Environment
macOS (Apple Silicon)
Python 3.12.13
PyTorch 2.12.1
OpenAI Whisper 20250625
MPS backend enabled
Reproduction
Run Whisper using the MPS backend with --word_timestamps True.
Whisper aborts with:
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
The failure occurs in whisper/timing.py: the call to .double() executes while the tensor is still resident on the MPS device, but MPS does not support float64.
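The same failure can be shown outside Whisper with a few lines of PyTorch. This is only an illustrative sketch, not Whisper code: the helper name double_on_device is made up for this example, and it assumes a PyTorch build in which float64 is unsupported on MPS, as the error message above states. On a machine without MPS it falls back to the CPU, where the conversion succeeds.

```python
import torch


def double_on_device(device: str) -> str:
    """Try to widen a float32 tensor to float64 while it still lives on `device`.

    Returns "ok" on success, or the exception class name on failure.
    (Hypothetical helper, used only for this illustration.)
    """
    x = torch.ones(2, 3, dtype=torch.float32, device=device)
    try:
        x.double()  # convert first, as in the failing code path
        return "ok"
    except (TypeError, RuntimeError) as exc:
        return type(exc).__name__


# Use MPS when it is available, otherwise fall back to the CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"
print(device, double_on_device(device))
```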
Proposed fix
This code path is already invoking dtw_cpu(), so the tensor is destined for the CPU. Reordering the operations simply avoids requesting an unsupported float64 conversion while the tensor is still on the MPS device.
Move the tensor to the CPU before converting to double precision, in whisper/timing.py (around line 151 in Whisper 20250625).
This preserves the original use of double precision for the CPU DTW implementation while avoiding the unsupported MPS conversion.
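A minimal sketch of the reordering, assuming the affected expression has the form x.double().cpu().numpy(). The variable name x and the trailing .numpy() call are assumptions based on the description above, and to_cpu_float64 is a hypothetical helper standing in for the argument passed to dtw_cpu().

```python
import numpy as np
import torch


def to_cpu_float64(x: torch.Tensor) -> np.ndarray:
    """Move the tensor to the CPU first, then widen it to float64."""
    # Assumed original order (fails on MPS): x.double().cpu().numpy()
    # Reordered: the float64 conversion now runs on the CPU, where it is supported.
    return x.cpu().double().numpy()


# Use MPS when it is available, otherwise fall back to the CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"
cost = torch.rand(4, 6, dtype=torch.float32, device=device)
out = to_cpu_float64(cost)
print(out.dtype, out.shape)
```

Widening float32 to float64 is exact, so for a float32 input both orders should hand the same values to the CPU DTW routine; only the device on which the conversion runs changes.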
Validation
After applying this one-line change:
the TypeError no longer occurs;
--word_timestamps True completed successfully on my Apple Silicon test system using the MPS backend;
the DTW calculation still runs in float64 on the CPU, exactly as originally intended.
If this approach looks acceptable, I'm happy to open a pull request containing this one-line change.
Scope: This change does not attempt to address MPS decoder stability issues (such as NaN logits with larger models), which appear to be a separate problem based on my testing and should be tracked independently (Discussion #2693).