## Multi-Accent and Multi-Lingual Voice Clone Demo with MeloTTS

In [1]:
import os
import torch
from openvoice import se_extractor
from openvoice.api import ToneColorConverter



Importing the dtw module. When using in academic works please cite:
  T. Giorgino. Computing and Visualizing Dynamic Time Warping Alignments in R: The dtw Package.
  J. Stat. Soft., doi:10.18637/jss.v031.i07.



### Initialization

In this example, we use the checkpoints from OpenVoiceV2. OpenVoiceV2 is trained with more aggressive augmentations and thus demonstrates better robustness in some cases.

In [2]:
ckpt_converter = 'checkpoints_v2/converter'
device = "cuda:0" if torch.cuda.is_available() else "cpu"
output_dir = 'outputs_v2'

tone_color_converter = ToneColorConverter(f'{ckpt_converter}/config.json', device=device)
tone_color_converter.load_ckpt(f'{ckpt_converter}/checkpoint.pth')

os.makedirs(output_dir, exist_ok=True)



Loaded checkpoint 'checkpoints_v2/converter/checkpoint.pth'
missing/unexpected keys: [] []
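
If the converter fails to load, a quick first check is whether the two files read in the cell above are actually present. The following is a minimal sketch using only the standard library. `missing_converter_files` is a hypothetical helper, not part of OpenVoice, and it assumes the layout used above (`config.json` and `checkpoint.pth` inside the converter directory).

```python
from pathlib import Path


def missing_converter_files(ckpt_dir, required=("config.json", "checkpoint.pth")):
    """Return the required file names that are absent from ckpt_dir.

    Hypothetical helper: the default file names mirror the ones loaded above.
    """
    root = Path(ckpt_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_converter_files("checkpoints_v2/converter")
    if missing:
        print("Missing converter files:", ", ".join(missing))
    else:
        print("All converter files found.")
```

An empty list means both files exist; it does not verify that the checkpoint itself is valid.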


### Obtain Tone Color Embedding
We only extract the tone color embedding for the target speaker. The source tone color embeddings are precomputed and can be loaded directly from the `checkpoints_v2/base_speakers/ses` folder.

In [3]:

reference_speaker = r'F:\openvoice\OpenVoice\checkpoints_v2\base_speakers\ses\vb.wav'
target_se, audio_name = se_extractor.get_se(reference_speaker, tone_color_converter, vad=True)

OpenVoice version: v2
[(0.0, 52.886)]
after vad: dur = 52.886
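
To see which precomputed source embeddings are available, the folder can be listed directly. This is a small sketch under the assumption that each base speaker is stored as one `.pth` file named after its speaker key, as the loading code later in this notebook suggests. `list_source_embeddings` is a hypothetical helper, not an OpenVoice function.

```python
from pathlib import Path


def list_source_embeddings(ses_dir):
    """Return the sorted file stems of all .pth files found in ses_dir.

    Hypothetical helper: returns an empty list if the folder has no .pth files.
    """
    return sorted(p.stem for p in Path(ses_dir).glob("*.pth"))


if __name__ == "__main__":
    print(list_source_embeddings("checkpoints_v2/base_speakers/ses"))
```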




#### Use MeloTTS as Base Speakers

MeloTTS is a high-quality multi-lingual text-to-speech library by @MyShell.ai, supporting languages including English (American, British, Indian, Australian, Default), Spanish, French, Chinese, Japanese, and Korean. In the following example, we will use the models in MeloTTS as the base speakers.

In [5]:
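import sys
import types

# 🔧 PATCH to skip MeCab since we're not using Japanese.
# Register the stub before importing melo so that any `import MeCab`
# triggered by that import resolves to the stub instead of the real package.
sys.modules['MeCab'] = types.SimpleNamespace(Tagger=lambda *args, **kwargs: None)

from melo.api import TTS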

texts = {
    'EN_NEWEST': "Did you ever hear a folk tale about a giant turtle?",  # The newest English base speaker model
    'EN': """Satoshi Nakamoto - a mysterious figure behind Bitcoin's creation. My understanding is that Satoshi designed the
initial proof-of-concept for Bitcoin in 2008. He or she released the first implementation on January 3, 2009, and
continued to contribute to the project until December 2010. After disappearing from the community, there have been
numerous attempts to uncover their real identity, but so far none of these claims have been proven conclusively.""",
}


src_path = f'{output_dir}/tmp.wav'

# Speed is adjustable
speed = 1.0

for language, text in texts.items():
    model = TTS(language=language, device=device)
    speaker_ids = model.hps.data.spk2id
    
    for speaker_key in speaker_ids.keys():
        speaker_id = speaker_ids[speaker_key]
        speaker_key = speaker_key.lower().replace('_', '-')
        
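        # Load the precomputed source tone color embedding for this base speaker
        source_se = torch.load(f'checkpoints_v2/base_speakers/ses/{speaker_key}.pth', map_location=device)
        # Workaround: when running on CPU on a machine that also has MPS,
        # make later availability checks report MPS as unavailable
        if torch.backends.mps.is_available() and device == 'cpu':
            torch.backends.mps.is_available = lambda: False
        # Synthesize the base speaker audio to a temporary file
        model.tts_to_file(text, speaker_id, src_path, speed=speed)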
        save_path = f'{output_dir}/output_v2_{speaker_key}.wav'

        # Run the tone color converter
        encode_message = "@MyShell"
        tone_color_converter.convert(
            audio_src_path=src_path, 
            src_se=source_se, 
            tgt_se=target_se, 
            output_path=save_path,
            message=encode_message)



 > Text split to sentences.
Did you ever hear a folk tale about a giant turtle?


100%|██████████| 1/1 [00:03<00:00,  3.21s/it]


 > Text split to sentences.
Satoshi Nakamoto - a mysterious figure behind Bitcoin's creation. My understanding is that Satoshi designed the initial proof-of-concept for Bitcoin in 2008. He or she released the first implementation on January 3, 2009, and continued to contribute to the project until December 2010.
After disappearing from the community, there have been numerous attempts to uncover their real identity, but so far none of these claims have been proven conclusively.


100%|██████████| 2/2 [00:45<00:00, 22.87s/it]


 > Text split to sentences.
Satoshi Nakamoto - a mysterious figure behind Bitcoin's creation. My understanding is that Satoshi designed the initial proof-of-concept for Bitcoin in 2008. He or she released the first implementation on January 3, 2009, and continued to contribute to the project until December 2010.
After disappearing from the community, there have been numerous attempts to uncover their real identity, but so far none of these claims have been proven conclusively.


100%|██████████| 2/2 [00:40<00:00, 20.40s/it]


 > Text split to sentences.
Satoshi Nakamoto - a mysterious figure behind Bitcoin's creation. My understanding is that Satoshi designed the initial proof-of-concept for Bitcoin in 2008. He or she released the first implementation on January 3, 2009, and continued to contribute to the project until December 2010.
After disappearing from the community, there have been numerous attempts to uncover their real identity, but so far none of these claims have been proven conclusively.


100%|██████████| 2/2 [00:39<00:00, 19.81s/it]


 > Text split to sentences.
Satoshi Nakamoto - a mysterious figure behind Bitcoin's creation. My understanding is that Satoshi designed the initial proof-of-concept for Bitcoin in 2008. He or she released the first implementation on January 3, 2009, and continued to contribute to the project until December 2010.
After disappearing from the community, there have been numerous attempts to uncover their real identity, but so far none of these claims have been proven conclusively.


100%|██████████| 2/2 [00:41<00:00, 20.75s/it]


 > Text split to sentences.
Satoshi Nakamoto - a mysterious figure behind Bitcoin's creation. My understanding is that Satoshi designed the initial proof-of-concept for Bitcoin in 2008. He or she released the first implementation on January 3, 2009, and continued to contribute to the project until December 2010.
After disappearing from the community, there have been numerous attempts to uncover their real identity, but so far none of these claims have been proven conclusively.


100%|██████████| 2/2 [00:36<00:00, 18.23s/it]
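
Each pass of the loop above normalizes the MeloTTS speaker key (lowercase, underscores replaced by hyphens) and uses the result to build both the embedding path and the output file name. The sketch below isolates that mapping so it can be checked without loading any model. `speaker_paths` is a hypothetical helper, and the sample keys are illustrative assumptions about what `spk2id` may contain rather than output captured from this run.

```python
def speaker_paths(speaker_key,
                  ses_dir="checkpoints_v2/base_speakers/ses",
                  output_dir="outputs_v2"):
    """Map a speaker key to (normalized key, embedding path, output path).

    Mirrors the string handling in the loop above; no files are read or written.
    """
    key = speaker_key.lower().replace("_", "-")
    return key, f"{ses_dir}/{key}.pth", f"{output_dir}/output_v2_{key}.wav"


if __name__ == "__main__":
    # Illustrative keys only (assumed, not read from a model)
    for raw_key in ("EN-US", "EN_INDIA", "EN-Default"):
        print(speaker_paths(raw_key))
```

If `torch.load` fails with a missing file, comparing the normalized key against the file names in the embeddings folder is a reasonable first check.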
