
Apple Silicon support #19

Closed
iliane5 opened this issue Apr 20, 2023 · 23 comments


iliane5 commented Apr 20, 2023

Hey guys, thanks for releasing this as open-source!

Is there any plan to add Apple Silicon support and use MPS with PyTorch when available, or is CUDA a "strict" requirement?


ludos1978 commented Apr 20, 2023

This seems to run: a bit faster, but way hotter.

Add the following after each of the device definitions on lines 274, 292, 375, 545, 665 and 743:

if torch.backends.mps.is_available():
    device = "mps"

(I'd recommend putting that into a helper function in the long term, but I have no idea about AI systems... a sketch of what that could look like follows.)
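
For illustration, a minimal sketch of such a helper; the get_device name is made up here, it is not bark API:

import torch

def get_device() -> str:
    # prefer CUDA, then Apple's Metal backend (MPS), then plain CPU
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

device = get_device()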

I also use the nightly build from https://pytorch.org/get-started/locally/, but I am not sure whether that's needed.

Be aware that the machine gets really hot when both the CPU and GPU are running. iStats reported GPU temperatures of up to 100 degrees C, so maybe have a tool ready to turn up the fans. It still took 10 minutes to run, and I don't know where to find the generated content...

By the way, I don't know whether the result is anything usable, because I can't find it yet...

Contributor

mcamac commented Apr 20, 2023

Re Apple Silicon: definitely something we welcome input and suggestions on.

Re where results end up: they start in memory. For an example of saving them to WAV/MP3, check out #13.


ludos1978 commented Apr 20, 2023

There is a problem:

NotImplementedError: The operator 'aten::_weight_norm_interface' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

my "run.py" looks like this now:

# run in the console
# > export PYTORCH_ENABLE_MPS_FALLBACK=1
# before running this script

from bark import SAMPLE_RATE, generate_audio
from IPython.display import Audio
import numpy as np
from scipy.io import wavfile

text_prompt = """
     Hello
"""
audio_array = generate_audio(text_prompt)
Audio(audio_array, rate=SAMPLE_RATE)  # only renders when run inside a notebook

# convert the float audio array to a 16-bit int representation
int_audio_arr = (audio_array * np.iinfo(np.int16).max).astype(np.int16)

# save as wav
wavfile.write("my_file.wav", SAMPLE_RATE, int_audio_arr)

It generated something, but I've got to test with other texts now.

@ludos1978

Something is definitely off. The sound file is not usable in any way, but the code runs without errors and the length of the file makes sense. It could be something about the conversion, but it might be a deeper problem as well.

Maybe somebody else has an idea.

Contributor

gkucsko commented Apr 22, 2023

Maybe related to precision? Can you try running with float32?

@sagardesai90

I saw the same issue as you, @ludos1978, but I don't see an output file like I do when I don't change the device definition to mps.

NotImplementedError: The operator 'aten::_weight_norm_interface' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

If there's a way to get past this issue, let me know!
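
For what it's worth, the fallback can also be set from inside the script; setting it before torch is first imported appears to be the safe ordering:

import os
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"  # set before torch loads

import torch  # unsupported MPS ops now fall back to CPU instead of raising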


ludos1978 commented Apr 24, 2023

@gkucsko: I don't really understand where you mean I should try running it in float32. I did figure out how to write to a file, if that's what you meant.

@sagardesai90: I can't really describe it any better than I did in my script or in the error message.


With the code below, the audio generates fine on the CPU. On MPS, however, it only whooshes and crackles.

import os

if os.getenv('PYTORCH_ENABLE_MPS_FALLBACK') != '1':
    print("you really need to run > export PYTORCH_ENABLE_MPS_FALLBACK=1")
    exit(1)

from bark import SAMPLE_RATE, generate_audio
from IPython.display import Audio
from datetime import datetime as dt

text_prompt = """
     Hello World, this is a test.
"""
audio_array = generate_audio(text_prompt)
iPyAudio = Audio(audio_array, rate=SAMPLE_RATE)

# the Audio object holds rendered WAV bytes, so write them straight to disk
filename = 'audio-%s-%i.wav' % (dt.now().strftime('%Y%m%d-%H%M%S'), SAMPLE_RATE)
with open(filename, 'wb') as f:
    f.write(iPyAudio.data)
print("saved %s" % filename)

Contributor

gkucsko commented Apr 24, 2023

I meant hacking the autocast setting on top of generation.py, but it was a bit of a random guess; it's hard without access to your setup.
Could you try this branch and see if it works? #22

@ludos1978

autocast does not seem to be available on MPS. The code at the beginning of #22 is, at least, not working for me with PyTorch 2.0.0 on an M2 Max.
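
If autocast is the blocker, one hedged workaround sketch is to fall back to a no-op context on devices where it isn't supported; maybe_autocast is a made-up helper name:

import contextlib
import torch

def maybe_autocast(device: str):
    # mixed precision on CUDA, plain float32 everywhere else
    if device == "cuda":
        return torch.autocast(device_type="cuda", dtype=torch.float16)
    return contextlib.nullcontext()

with maybe_autocast("mps"):
    pass  # model inference would go here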


QKJIN commented Apr 25, 2023

I am using an M2 Mac. I put this code
device = "mps" if torch.backends.mps.is_available() else "cpu"
before preload_models(). Then I got a voice within 2 minutes with the text below.

text_prompt = """
    I have a silky smooth voice, and today I will tell you 
    about the exercise regimen of the common sloth.
"""

This is the output:

No GPU being used. Careful, inference might be extremely slow!
No GPU being used. Careful, inference might be extremely slow!
No GPU being used. Careful, inference might be extremely slow!
100%|████████████████████████████████████| 100/100 [00:19<00:00, 5.12it/s]
100%|████████████████████████████████████| 18/18 [01:29<00:00, 4.96s/it]

I don't have a computer with a GPU, so I can't compare. But if I delete the device = "mps" if torch.backends.mps.is_available() else "cpu" line, the time is a little longer, almost double:
No GPU being used. Careful, inference might be extremely slow!
No GPU being used. Careful, inference might be extremely slow!
No GPU being used. Careful, inference might be extremely slow!
100%|████████████████████████████████████| 100/100 [00:38<00:00, 2.60it/s]
100%|████████████████████████████████████| 32/32 [02:44<00:00, 5.13s/it]


MSchmidt commented Apr 25, 2023

@QKJIN just declaring a device variable without actually using it doesn't seem correct. Do you have the full snippet for comparison?

@ludos1978

In my experience, the time it takes to generate sounds varies a lot between runs, because there is no fixed seed we can use (this would be a nice addition for testing and comparison), so the voices can be really slow or fast and thus take longer or shorter to generate.


QKJIN commented Apr 25, 2023

@ludos1978 Yes, that's right. I generated several sounds and the speeds are all different.
@MSchmidt You are right, it's just a variable and it's not used correctly. I'm still figuring out how to use it.


ludos1978 commented Apr 25, 2023

I have been modifying the code a bit so I can use MPS on different parts of the generation process.
With these changes I have been able to generate useful sound files when I switch MPS on for the text_to_semantic and generate_coarse stages. However, it breaks the sound if I use it for codec_decode and crashes if I use it for generate_fine (see the sketch below).
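
A sketch of that switching logic; the USE_MPS_FOR table and pick_device helper are illustrative names, not actual bark API:

import torch

# illustrative per-stage switch: enable mps only where it produced usable output
USE_MPS_FOR = {
    "text_to_semantic": True,
    "generate_coarse": True,
    "generate_fine": False,   # crashed on mps
    "codec_decode": False,    # produced broken audio on mps
}

def pick_device(stage: str) -> str:
    if torch.cuda.is_available():
        return "cuda"
    if USE_MPS_FOR.get(stage, False) and torch.backends.mps.is_available():
        return "mps"
    return "cpu"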

I also added fixed seeds to generation.py (it generates the same sound every time). The fixed seeds, however, do not generate the same sound if you change devices.

import torch
import numpy as np

torch.manual_seed(0)
torch.use_deterministic_algorithms(True)
np.random.seed(0)

I am a bit too lazy in my free time to use a versioning tool, but if anybody wants to have a look at my code, it's attached:
bark-mps.zip

Edit: I am actually not completely sure it's accelerated by MPS, but it makes a difference when I switch the devices, so I assume it must change something under the hood.

Contributor

gkucsko commented Apr 25, 2023

Added in the most recent commit as experimental. It has to be enabled via:

import os
os.environ["SUNO_ENABLE_MPS"] = "True"

gkucsko closed this as completed Apr 25, 2023

dataf3l commented May 16, 2023

When I enable MPS I get this error:

Traceback (most recent call last):
  File "/Users/b/study/ml/bark/mine.py", line 20, in <module>
    audio_array = generate_audio(text_prompt)
  File "/Users/b/study/ml/bark/venv/lib/python3.10/site-packages/bark/api.py", line 113, in generate_audio
    out = semantic_to_waveform(
  File "/Users/b/study/ml/bark/venv/lib/python3.10/site-packages/bark/api.py", line 66, in semantic_to_waveform
    audio_arr = codec_decode(fine_tokens)
  File "/Users/b/study/ml/bark/venv/lib/python3.10/site-packages/bark/generation.py", line 826, in codec_decode
    emb = model.quantizer.decode(arr)
  File "/Users/b/study/ml/bark/venv/lib/python3.10/site-packages/encodec/quantization/vq.py", line 112, in decode
    quantized = self.vq.decode(codes)
  File "/Users/b/study/ml/bark/venv/lib/python3.10/site-packages/encodec/quantization/core_vq.py", line 361, in decode
    quantized = layer.decode(indices)
  File "/Users/b/study/ml/bark/venv/lib/python3.10/site-packages/encodec/quantization/core_vq.py", line 288, in decode
    quantize = self._codebook.decode(embed_ind)
  File "/Users/b/study/ml/bark/venv/lib/python3.10/site-packages/encodec/quantization/core_vq.py", line 202, in decode
    quantize = self.dequantize(embed_ind)
  File "/Users/b/study/ml/bark/venv/lib/python3.10/site-packages/encodec/quantization/core_vq.py", line 188, in dequantize
    quantize = F.embedding(embed_ind, self.embed)
  File "/Users/b/study/ml/bark/venv/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

This is just with the basic example. Any ideas on how to fix it?

Contributor

gkucsko commented May 16, 2023

It's because encodec doesn't work on MPS, I think, but technically that shouldn't happen and I can't reproduce it. Can you try to find the bug?
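
A speculative diagnostic sketch for narrowing it down; fine_tokens is the array from the traceback above, and the 1024-entry codebook size is an assumption based on EnCodec's defaults:

# check the token range right before codec_decode; values outside the
# codebook would explain the IndexError raised inside F.embedding
print("token range:", fine_tokens.min(), "to", fine_tokens.max())
assert 0 <= fine_tokens.min() and fine_tokens.max() < 1024, "out-of-range codes"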


swyxio commented May 26, 2023

I'm sorry, but after reading everything here I still haven't been able to run Bark on Apple Silicon.

The issue is here:

             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/swyx/.pyenv/versions/myenv/lib/python3.11/site-packages/transformers-4.29.2-py3.11.egg/transformers/utils/import_utils.py", line 1174, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.bert because of the following error (look up to see its traceback):
dlopen(/Users/swyx/.pyenv/versions/myenv/lib/python3.11/site-packages/tokenizers-0.13.3-py3.11-macosx-13.2-arm64.egg/tokenizers/tokenizers.cpython-311-darwin.so, 0x0002): tried: '/Users/swyx/.pyenv/versions/myenv/lib/python3.11/site-packages/tokenizers-0.13.3-py3.11-macosx-13.2-arm64.egg/tokenizers/tokenizers.cpython-311-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/swyx/.pyenv/versions/myenv/lib/python3.11/site-packages/tokenizers-0.13.3-py3.11-macosx-13.2-arm64.egg/tokenizers/tokenizers.cpython-311-darwin.so' (no such file), '/Users/swyx/.pyenv/versions/myenv/lib/python3.11/site-packages/tokenizers-0.13.3-py3.11-macosx-13.2-arm64.egg/tokenizers/tokenizers.cpython-311-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/Users/swyx/.pyenv/versions/3.11.3/envs/myenv/lib/python3.11/site-packages/tokenizers-0.13.3-py3.11-macosx-13.2-arm64.egg/tokenizers/tokenizers.cpython-311-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/swyx/.pyenv/versions/3.11.3/envs/myenv/lib/python3.11/site-packages/tokenizers-0.13.3-py3.11-macosx-13.2-arm64.egg/tokenizers/tokenizers.cpython-311-darwin.so' (no such file), '/Users/swyx/.pyenv/versions/3.11.3/envs/myenv/lib/python3.11/site-packages/tokenizers-0.13.3-py3.11-macosx-13.2-arm64.egg/tokenizers/tokenizers.cpython-311-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))

Does nobody else see this? I tried searching "arm64" and didn't find anyone else with this issue.

Contributor

gkucsko commented May 26, 2023

Hmm, seems to be an issue with the tokenizers package from Hugging Face. Maybe just try pip install -U tokenizers?
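
If updating alone doesn't help: the dlopen error says the installed tokenizers build is x86_64, so a hedged cleanup sketch is to force a fresh arm64 install (shell):

# remove the stale x86_64 build and reinstall without the cache
pip uninstall -y tokenizers
pip install --no-cache-dir --force-reinstall tokenizers
# sanity check: the interpreter must be arm64, not running under Rosetta
python -c "import platform; print(platform.machine())"   # expect: arm64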

@DEVANANDJALLA

After enabling MPS it has become slow.

@PhanindraParashar

RuntimeError: Placeholder storage has not been allocated on MPS device!
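
For context, this error usually means a CPU tensor was fed to a model living on the MPS device; a minimal illustrative repro (not bark code):

import torch

model = torch.nn.Linear(4, 4).to("mps")
x = torch.randn(2, 4)     # created on the cpu
# model(x)                # raises: Placeholder storage has not been allocated on MPS device!
y = model(x.to("mps"))    # moving the input to the same device fixes it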


lxgbrl commented Feb 1, 2024

Apple MacBook M2
# What worked for me

# You may have to install anaconda and ffmpeg
# Create a virtual env named bark
conda create --name bark
# Activate the virtual env
conda activate bark
# Install git
conda install git
# Clone the bark repo

git clone https://github.com/suno-ai/bark.git
cd bark
pip install .
pip install git+https://github.com/huggingface/transformers.git

# I also had to install soundfile
pip install soundfile
# Enable the CPU fallback for unsupported MPS ops
export PYTORCH_ENABLE_MPS_FALLBACK=1
# In the root folder, where setup.py is located, create a Python file called main.py with the following content:

import os
os.environ["SUNO_OFFLOAD_CPU"] = "True"
os.environ["SUNO_USE_SMALL_MODELS"] = "True"

if os.getenv('PYTORCH_ENABLE_MPS_FALLBACK') != '1':
    print("you really need to run > export PYTORCH_ENABLE_MPS_FALLBACK=1")
    exit(1)

from bark import SAMPLE_RATE, generate_audio
from IPython.display import Audio
from datetime import datetime as dt

text_prompt = """
I hope this will work for you.
"""
audio_array = generate_audio(text_prompt, history_prompt="en_speaker_8")  # {lang_code}_speaker_{0-9}
iPyAudio = Audio(audio_array, rate=SAMPLE_RATE)

# the Audio object holds rendered WAV bytes, so write them straight to disk
filename = 'audio-%s-%i.wav' % (dt.now().strftime('%Y%m%d-%H%M%S'), SAMPLE_RATE)
with open(filename, 'wb') as f:
    f.write(iPyAudio.data)
print("saved %s" % filename)

# Start your file
python main.py
# On the first start it will download the models, so it will take some time. After that it is much faster!
# Hope it works for you as well!


Huixxi commented May 29, 2024

➜ bark git:(main) ✗ ~/venv/bin/python3 -m bark --text "Hello, my name is Suno." --output_filename "example.wav"
No GPU being used. Careful, inference might be very slow!
Oops, an error occurred: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
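
If the download keeps failing, one hedged workaround is to warm the cache once while the connection works; preload_models is part of bark's public API:

# run once with a working connection; later runs load from the local cache
from bark import preload_models

preload_models()  # downloads and caches all model weights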
