How to use this project on Apple's M1 chip. #18

Open · imxw opened this issue Jan 2, 2024 · 12 comments

Comments

@imxw commented Jan 2, 2024

No description provided.

@Zengyi-Qin (Contributor) commented Jan 2, 2024

If you want to try a demo, please use the Lepton or MyShell link instead of running on M1. If you want to deploy it yourself, please use an Ubuntu 20.04 environment. In neither situation do we recommend running on M1.

@iamshreeram (Contributor)

@Zengyi-Qin, the demo appears to function well on Ubuntu. However, when attempting to run torch with "device" set to MPS on a Mac M1, I encountered the following error:

Loaded checkpoint 'checkpoints/base_speakers/EN/checkpoint.pth'
missing/unexpected keys: [] []
Loaded checkpoint 'checkpoints/converter/checkpoint.pth'
missing/unexpected keys: [] []

Considering the widespread use of Mac computers in workplaces, it would be advantageous to ensure compatibility with macOS. Supporting Mac would broaden the range of people who could benefit from the demo and make it more accessible to potential users.

Please let me know if you could provide support, as I would like to explore making OpenVoice compatible with Mac.

@Zengyi-Qin (Contributor) commented Jan 2, 2024

@iamshreeram The output you showed is not an error. Please safely ignore it. And regarding Mac compatibility - Could you make this repo Mac-compatible and create a pull request? Very much appreciated

@cchance27

@iamshreeram just to be clear

Loaded checkpoint 'checkpoints/base_speakers/EN/checkpoint.pth'
missing/unexpected keys: [] []
Loaded checkpoint 'checkpoints/converter/checkpoint.pth'
missing/unexpected keys: [] []

isn't an error; it just means the checkpoints loaded correctly, with no missing or unexpected keys. Realistically, they could wrap that print in an if statement and skip the blank arrays when there is nothing to report, as in the sketch below.
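A minimal sketch of that suggestion, assuming the loader goes through load_state_dict(..., strict=False); the function name and checkpoint handling here are illustrative, not the repo's actual code:

import torch

def load_checkpoint_quiet(model, checkpoint_path):
    # Load onto CPU first; the repo's real loader may unwrap a nested dict.
    state_dict = torch.load(checkpoint_path, map_location='cpu')
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    # Only print when there is actually something to report.
    if missing or unexpected:
        print(f"missing/unexpected keys: {missing} {unexpected}")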

@iamshreeram (Contributor)

Thank you for clarifying, @cchance27. Great, @Zengyi-Qin, I appreciate the support.

I have begun working on a fork to incorporate Mac support, which can be found here: https://github.com/iamshreeram/OpenVoice/blob/main/demo_mps.ipynb
While attempting inference, I am encountering an error. I suspect there is an issue with tensor dimensions when running on a Mac.

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[6], line 6
      4 text = "This audio is generated by OpenVoice."
      5 src_path = f'{output_dir}/tmp.wav'
----> 6 base_speaker_tts.tts(text, src_path, speaker='default', language='English', speed=1.0)
      8 # Run the tone color converter
      9 encode_message = "@MyShell"

File ~/ram/project/python/OpenVoice/api.py:90, in BaseSpeakerTTS.tts(self, text, output_path, speaker, language, speed)
     88         x_tst_lengths = torch.LongTensor([stn_tst.size(0)]).to(device)
     89         sid = torch.LongTensor([speaker_id]).to(device)
---> 90         audio = self.model.infer(x_tst, x_tst_lengths, sid=sid, noise_scale=0.667, noise_scale_w=0.6,
     91                             length_scale=1.0 / speed)[0][0, 0].data.cpu().float().numpy()
     92     audio_list.append(audio)
     93 audio = self.audio_numpy_concat(audio_list, sr=self.hps.data.sampling_rate, speed=speed)

File ~/ram/project/python/OpenVoice/models.py:466, in SynthesizerTrn.infer(self, x, x_lengths, sid, noise_scale, length_scale, noise_scale_w, sdp_ratio, max_len)
    465 def infer(self, x, x_lengths, sid=None, noise_scale=1, length_scale=1, noise_scale_w=1., sdp_ratio=0.2, max_len=None):
--> 466     x, m_p, logs_p, x_mask = self.enc_p(x, x_lengths)
    467     if self.n_speakers > 0:
    468         g = self.emb_g(sid).unsqueeze(-1) # [b, h, 1]

File /Applications/anaconda3/envs/voiceclone/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/ram/project/python/OpenVoice/models.py:53, in TextEncoder.forward(self, x, x_lengths)
     50 x = torch.transpose(x, 1, -1) # [b, h, t]
     51 x_mask = torch.unsqueeze(commons.sequence_mask(x_lengths, x.size(2)), 1).to(x.dtype)
---> 53 x = self.encoder(x * x_mask, x_mask)
     54 stats = self.proj(x) * x_mask
     56 m, logs = torch.split(stats, self.out_channels, dim=1)

File /Applications/anaconda3/envs/voiceclone/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/ram/project/python/OpenVoice/attentions.py:113, in Encoder.forward(self, x, x_mask, g)
    111     x = x + g
    112     x = x * x_mask
--> 113 y = self.attn_layers[i](x, x, attn_mask)
    114 y = self.drop(y)
    115 x = self.norm_layers_1[i](x + y)

File /Applications/anaconda3/envs/voiceclone/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/ram/project/python/OpenVoice/attentions.py:269, in MultiHeadAttention.forward(self, x, c, attn_mask)
    266 k = self.conv_k(c)
    267 v = self.conv_v(c)
--> 269 x, self.attn = self.attention(q, k, v, mask=attn_mask)
    271 x = self.conv_o(x)
    272 return x

File ~/ram/project/python/OpenVoice/attentions.py:286, in MultiHeadAttention.attention(self, query, key, value, mask)
    282 if self.window_size is not None:
    283     assert (
    284         t_s == t_t
    285     ), "Relative attention is only available for self-attention."
--> 286     key_relative_embeddings = self._get_relative_embeddings(self.emb_rel_k, t_s)
    287     rel_logits = self._matmul_with_relative_keys(
    288         query / math.sqrt(self.k_channels), key_relative_embeddings
    289     )
    290     scores_local = self._relative_position_to_absolute_position(rel_logits)

File ~/ram/project/python/OpenVoice/attentions.py:350, in MultiHeadAttention._get_relative_embeddings(self, relative_embeddings, length)
    348 slice_end_position = slice_start_position + 2 * length - 1
    349 if pad_length > 0:
--> 350     padded_relative_embeddings = F.pad(
    351         relative_embeddings,
    352         commons.convert_pad_shape([[0, 0], [pad_length, pad_length], [0, 0]]),
    353     )
    354 else:
    355     padded_relative_embeddings = relative_embeddings

IndexError: Dimension out of range (expected to be in range of [-3, 2], but got 3)  
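In case it helps with debugging: the traceback ends inside F.pad, so one possible workaround (an assumption on my part, untested against the repo) would be to run the padding on CPU whenever the tensor lives on MPS. A hypothetical sketch:

import torch
import torch.nn.functional as F

def pad_mps_safe(x, pad, mode='constant', value=0.0):
    # Assumption: F.pad raises the IndexError above on the MPS backend,
    # so do the padding on CPU and move the result back afterwards.
    if x.device.type == 'mps':
        return F.pad(x.cpu(), pad, mode=mode, value=value).to(x.device)
    return F.pad(x, pad, mode=mode, value=value)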

@AaronWard commented Jan 3, 2024

@iamshreeram Working on this too; I will let you know if I get it working on my M1.

Suggestion: missing/unexpected keys: [] [] should only be printed if there actually are missing or unexpected keys; showing it otherwise is misleading.

@iamshreeram (Contributor)

@AaronWard - Excellent. Once the code is working correctly, we can push it to the main repository. Instead of cluttering this issue, let's continue the conversation on iamshreeram#1.

cc: @Zengyi-Qin

@iamshreeram (Contributor)

@Zengyi-Qin / @wl-zhao, the app functions properly when executed through Gradio, but hits the error above when used in a Jupyter notebook. The failure occurs during inference, specifically at the step below.

base_speaker_tts.tts(text, src_path, speaker='default', language='English', speed=1.0)

What are the differences between these two execution paths? It would be appreciated if you could join the discussion to help resolve this further. Here is the link: iamshreeram#1

@pndllxzzy commented Jan 9, 2024

Setting the device to CPU is a temporary workaround:
device = 'cpu'
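A minimal sketch of defensive device selection along those lines (skipping 'mps' until the IndexError above is resolved is an assumption, not official guidance):

import torch

# Prefer CUDA when available; otherwise fall back to CPU.
# Deliberately avoid 'mps' for now because of the F.pad IndexError above.
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'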

@ehartford

Hello.
Please support Mac.

@iamshreeram (Contributor)

Hey @ehartford, are you experiencing any issues with the current code base on your Mac? I was under the impression that it should already be compatible (just that MPS is not supported due to the dependency issue). Could you please let us know if you're facing any challenges? Thanks!

@XP20225 commented Jun 7, 2024

Which code are you talking about? @iamshreeram
