[ERROR] Get target tone color error cuFFT error: CUFFT_INTERNAL_ERROR #41

Open
Zovya opened this issue Jan 3, 2024 · 9 comments

@Zovya

Zovya commented Jan 3, 2024

Installed locally on Manjaro Linux with NVIDIA drivers.
No problems during installation.

I get this error in the info box every time:
[ERROR] Get target tone color error cuFFT error: CUFFT_INTERNAL_ERROR

@Zengyi-Qin
Contributor

This repo was developed under Ubuntu 20.04.

@ahandleman

I'm encountering the same issue on Ubuntu 20.04 running on WSL.

Ubuntu 20.04.6 LTS (GNU/Linux 5.15.133.1-microsoft-standard-WSL2 x86_64)

@xaroth8088

It appears to be an issue with Torch 1.13.1 due to its dependency on CUDA 11.7. According to the PyTorch bug thread, this error does not occur when running later versions of Torch.

I have my doubts as to whether it'll be as simple as updating Torch to latest, but I'll try it out and report back.
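
To see which case a given environment falls into, a small check like the one below prints the Torch version and the CUDA version it was built against. This is only a sketch: `describe_torch_cuda` is a made-up helper name, not part of OpenVoice or Torch, and it only reports what is installed rather than diagnosing the cuFFT error itself.

```python
def describe_torch_cuda():
    """Return a dict describing the installed Torch build and the visible GPU.

    Hypothetical helper for illustration; not part of OpenVoice.
    """
    try:
        import torch
    except ImportError:
        # Torch is not installed in this environment.
        return {"torch": None}

    info = {
        "torch": str(torch.__version__),
        "cuda_build": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If `torch` reports 1.13.1 and `cuda_build` reports 11.7, the environment matches the combination described above.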

@WolfieXIII

WolfieXIII commented Jan 4, 2024

I'm getting this too - I tracked it down to the following in mel_processing.py

spec = torch.stft(
    y,
    n_fft,
    hop_length=hop_size,
    win_length=win_size,
    window=hann_window[wnsize_dtype_device],
    center=center,
    pad_mode="reflect",
    normalized=False,
    onesided=True,
    return_complex=False,
)

I added a bunch of extra logging to track it down; the output is below. Tired now, going to bed. Best of luck taking this to the next step.

(openvoice) jas@Hope:/mnt/d/repo/AI/audio/OpenVoice$ python openvoice_app.py
Loaded checkpoint 'checkpoints/base_speakers/EN/checkpoint.pth'
missing/unexpected keys: [] []
Loaded checkpoint 'checkpoints/base_speakers/ZH/checkpoint.pth'
missing/unexpected keys: [] []
Loaded checkpoint 'checkpoints/converter/checkpoint.pth'
missing/unexpected keys: [] []
/home/jas/anaconda3/envs/openvoice/lib/python3.9/site-packages/gradio/components/dropdown.py:103: UserWarning: The `max_choices` parameter is ignored when `multiselect` is False.
 warnings.warn(
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Detected language:en
[(0.0, 19.278375)]
after vad: dur = 19.27798185941043
Audio path: processed/demo_speaker0-0-100/wavs
Audio name: demo_speaker0-0-100
Device: cuda
VC model: <api.ToneColorConverter object at 0x7f0ed7dcddf0>
SE path: processed/demo_speaker0-0-100/se.pth
Audio Segments: ['processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg0.wav', 'processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg1.wav']
ref_wav_list: ['processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg0.wav', 'processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg1.wav']
se_save_path: processed/demo_speaker0-0-100/se.pth
device: cuda
hps: {'data': {'sampling_rate': 22050, 'filter_length': 1024, 'hop_length': 256, 'win_length': 1024, 'n_speakers': 0}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256}}
gs: []
proc fname: processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg0.wav
loaded audio: [ 0.          0.          0.         ... -0.00099443  0.01052232
 0.        ] 22050
torch.FloatTensor: tensor([ 0.0000,  0.0000,  0.0000,  ..., -0.0010,  0.0105,  0.0000])
y.to(device): tensor([ 0.0000,  0.0000,  0.0000,  ..., -0.0010,  0.0105,  0.0000],
      device='cuda:0')
y.unsqueeze(0): tensor([[ 0.0000,  0.0000,  0.0000,  ..., -0.0010,  0.0105,  0.0000]],
      device='cuda:0')
wnsize_dtype_device: 1024_torch.float32_cuda:0
wnsize_dtype_device adding
torch.nn.functional.pad
y.squeeze(1)
torch.stft(...)
Exception: cuFFT error: CUFFT_INTERNAL_ERROR
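
For anyone who wants to check this outside the app, the torch.stft call above can be isolated into a standalone sketch. `try_stft` is a hypothetical helper, the input is random noise rather than the demo audio, and it uses center=True and return_complex=True for brevity, so it is not identical to the call in mel_processing.py. The n_fft, hop_length, and win_length defaults are taken from the hps line in the log.

```python
def try_stft(device=None, n_fft=1024, hop_length=256, win_length=1024):
    """Run one STFT on `device` and report whether it succeeded.

    Hypothetical diagnostic helper; not part of OpenVoice.
    """
    try:
        import torch
    except ImportError:
        return "torch not installed"

    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"

    # One second of noise at 22050 Hz stands in for the real audio segment.
    y = torch.randn(1, 22050, device=device)
    window = torch.hann_window(win_length, device=device)

    try:
        spec = torch.stft(
            y,
            n_fft,
            hop_length=hop_length,
            win_length=win_length,
            window=window,
            center=True,
            pad_mode="reflect",
            normalized=False,
            onesided=True,
            return_complex=True,
        )
    except RuntimeError as exc:
        # A cuFFT failure is expected to surface here as a RuntimeError.
        return f"stft failed on {device}: {exc}"
    return f"stft ok on {device}: shape {tuple(spec.shape)}"


if __name__ == "__main__":
    print(try_stft())        # GPU if available, otherwise CPU
    print(try_stft("cpu"))   # CPU run for comparison
```

If the CPU run succeeds and the CUDA run fails, that would point at the GPU FFT path rather than at the audio data.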

@xaroth8088

@WolfieXIII : That mirrors what I found, too. 😞

Re: trying to just upgrade Torch - alas, it appears OpenVoice has a dependency on wavmark, which doesn't seem to have a version compatible with torch>2.0. So, trying to get this to work on newer cards will likely require one of the following:

  1. Wait for wavmark to create a Torch 2.x-compatible version
  2. Replace wavmark with an alternative library (and then upgrade Torch)
  3. Create a custom build of Torch 1.13.1 that depends on CUDA 11.8 or later.
  4. Your ideas here...

@JacopoMangiavacchi

Someone already made a PR on wavmark to support 2.1 <3

wavmark/wavmark#6

@yctam

yctam commented Jan 5, 2024

I found that it works after doing the following two things:

  1. Upgrade torch, torchaudio, and torchvision to the latest versions.
  2. Uninstall the default wavmark and install this version instead: https://github.com/violetdenim/wavmark (git clone the project, cd into its directory, and run "pip install -e .").

@xaroth8088

Confirmed!

Here are some simpler instructions to tide everyone over until wavmark officially updates their package:

  1. Install OpenVoice, as per the README.md instructions
  2. pip install -U torch torchvision torchaudio git+https://github.com/violetdenim/wavmark.git
  3. Run OpenVoice, as per the README.md instructions
  4. Enjoy!
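
After step 2, a quick way to confirm what actually got installed is to list the package versions. The sketch below uses only the standard library; `installed_versions` and the `PACKAGES` tuple are illustrative names, and the distribution names are assumed to match the ones in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names assumed to match the pip command in step 2.
PACKAGES = ("torch", "torchvision", "torchaudio", "wavmark")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```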

@xiangdev

Thanks! This should be included in README.md

frankdu pushed a commit to frankdu/OpenVoice that referenced this issue Feb 29, 2024
This was originally fixed by myshell-ai#6. But a recent refactoring removed the fix.
It solves myshell-ai#41 and myshell-ai#80.