[show and tell] apple mps support #6
Comments
That's awesome, thanks for sharing @bghira! How fast was inference on your local machine?
It gets slower as the sample size increases, but this test script takes about 10 seconds to run on an M3 Max.
I got this working as well! Inference time seems to increase more than linearly with prompt size.
I think the reason is that inference itself takes a surprising amount of memory: loading the model takes the expected ~3 GB, but inference then takes 15 GB on top of that, which is probably what's slowing it down on my machine (16 GB M2).
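For anyone wanting to check where the memory goes, here is a minimal sketch using PyTorch's `torch.mps` counters; the model and input below are placeholders, not this project's model:

```python
import torch

# Placeholder model/input; substitute the actual model and prompt being profiled.
model = torch.nn.Linear(1024, 1024).to("mps")
x = torch.randn(8, 1024, device="mps")

before = torch.mps.current_allocated_memory()
with torch.no_grad():
    _ = model(x)
torch.mps.synchronize()  # wait for queued MPS work to finish before reading the counter
after = torch.mps.current_allocated_memory()

print(f"allocated during inference: {(after - before) / 1e6:.1f} MB")
print(f"total driver allocation:    {torch.mps.driver_allocated_memory() / 1e6:.1f} MB")
```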
Is swapping kicking in? I will try on a Mac Mini M2 (24 GB). Do we know the performance of CUDA on a similar machine?
On the 128 GB M3 Max I can get pretty far into the output window before the time increases to 3 minutes. It takes about a minute for 30 seconds of audio.
I am getting 11 seconds for 2s of audio, and 36 seconds for 6s of audio.
My data, on a 64 GB M2 Max (timing attachment not reproduced here).
I'm getting this error:

> NotImplementedError: Output channels > 65536 not supported at the MPS device. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

Did something change, or is it still working for you?

In [2]: torch.__version__
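A minimal sketch of the fallback that error message suggests; as far as I know the variable has to be set before `torch` is imported, or it has no effect:

```python
import os

# Must be set before `import torch`; setting it afterwards does nothing.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

print(torch.__version__)
print(torch.backends.mps.is_available())  # confirm the MPS backend is still usable
```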
My suggestion is to stick with PyTorch 2.4 unless you want things blowing up constantly.
With newer PyTorch (the 2.4 nightly) we get bfloat16 support on MPS.
I tested this:
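(The exact snippet tested isn't preserved above; a minimal sketch of exercising bfloat16 on MPS with a 2.4-era PyTorch, using a placeholder module rather than this project's model, would look something like this:)

```python
import torch

assert torch.backends.mps.is_available(), "MPS backend not available"

# Placeholder module; swap in the real model to reproduce the timings above.
model = torch.nn.Linear(256, 256).to("mps", dtype=torch.bfloat16)
x = torch.randn(4, 256, device="mps", dtype=torch.bfloat16)

with torch.no_grad():
    y = model(x)

print(y.dtype, y.device)  # expect torch.bfloat16 on mps:0
```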