## Code for Voice Clone AI Project
### Uses a pre-trained multi-speaker text-to-speech model for voice cloning
`Works with a reference audio sample of about one minute`

- ### Required Imports

In [1]:
from TTS.api import TTS
import numpy as np
import pandas as pd

- ### User Input
> The text to be synthesized, plus the speed, emotion, and output file name.

In [2]:
outputString = input("Please Enter the Output speech string to be generated")
outputSpeed = float(input("Please Enter the speed of the audio output"))
outputEmotion = input("Please Enter the type of emotion in the output. Enter 'Neutral' for default")
outputFileName = input("Please Enter the output file name")

Please Enter the Output speech string to be generatedIdentity and Access Management (IAM) is a set of processes, policies, and tools for controlling user access to critical information within an organization. The enterprise needs IAM to support security and compliance, as well as improve organizational productivity. This applies not only to people resources, but to any entity to which an identity is assigned (e.g., Internet of Things (IoT) devices, application programming interfaces (APIs)). The proliferation of device types and locations from which applications and data are accessed also underlies the importance of identity and access management. Identity and access management reduces the number of traditional points of security failure associated with passwords. The enterprise is vulnerable not only to data breaches associated with passwords and password recovery information, but to human frailties when it comes to creating passwords – generating easy-to-remember (and easy to crack) 
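
The values read with `input()` above are used unchecked, and `float()` raises `ValueError` on non-numeric text. A minimal sketch of validating them before synthesis is shown below. The `SynthesisRequest` class, the `parse_request` helper, and the `max_speed` bound are hypothetical names and values chosen for illustration; they are not part of the notebook or of the TTS library.

```python
from dataclasses import dataclass


@dataclass
class SynthesisRequest:
    """Validated user input (hypothetical container, not a TTS type)."""
    text: str
    speed: float
    emotion: str
    file_name: str


def parse_request(text, speed, emotion, file_name, max_speed=2.0):
    """Validate the raw strings returned by input().

    max_speed is an assumed upper bound, not a documented limit.
    """
    text = text.strip()
    if not text:
        raise ValueError("The text to synthesize must not be empty")

    # An empty speed falls back to 1.0 (normal rate).
    speed_value = float(speed) if speed.strip() else 1.0
    if not 0.0 < speed_value <= max_speed:
        raise ValueError(f"Speed must be in (0, {max_speed}], got {speed_value}")

    # The notebook's prompt names 'Neutral' as the default emotion.
    emotion = emotion.strip() or "Neutral"

    file_name = file_name.strip()
    if not file_name:
        raise ValueError("The output file name must not be empty")

    return SynthesisRequest(text, speed_value, emotion, file_name)
```

In the notebook this could be called as `parse_request(outputString, str(outputSpeed), outputEmotion, outputFileName)` right after the input cell.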

- ### Loading the pre-trained model

    This notebook uses the YourTTS model, which is based on VITS: [Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech](https://arxiv.org/pdf/2106.06103.pdf).
    Other models that can be used are [TortoiseTTS](https://github.com/neonbjb/tortoise-tts) and [Bark](https://github.com/suno-ai/bark).

In [3]:
modelName = "tts_models/multilingual/multi-dataset/your_tts"
ttsModel = TTS(modelName)

 > tts_models/multilingual/multi-dataset/your_tts is already downloaded.
 > Using model: vits
 > Setting up Audio Processor...
 | > sample_rate:16000
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:0
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:None
 | > fft_size:1024
 | > power:None
 | > preemphasis:0.0
 | > griffin_lim_iters:None
 | > signal_norm:None
 | > symmetric_norm:None
 | > mel_fmin:0
 | > mel_fmax:None
 | > pitch_fmin:None
 | > pitch_fmax:None
 | > spec_gain:20.0
 | > stft_pad_mode:reflect
 | > max_norm:1.0
 | > clip_norm:True
 | > do_trim_silence:False
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:None
 | > base:10
 | > hop_length:256
 | > win_length:1024
 > Model fully restored. 
 > Setting up Audio Processor...
 | > sample_rate:16000
 | > resample:False
 | > num_mels:64
 | > log_func:np.log10
 | > min_level_db:-
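
The model identifier used above appears to follow a four-part `type/language/dataset/model` pattern. The sketch below splits such a string so the parts can be logged or checked before loading. `ModelId` and `parse_model_name` are hypothetical helpers written for this example; they do not call the TTS library.

```python
from typing import NamedTuple


class ModelId(NamedTuple):
    """Components of a model identifier string (illustrative only)."""
    model_type: str
    language: str
    dataset: str
    name: str


def parse_model_name(model_name):
    """Split 'type/language/dataset/model' into its four parts."""
    parts = model_name.split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(
            f"Expected 'type/language/dataset/model', got {model_name!r}"
        )
    return ModelId(*parts)
```

For the identifier in this notebook, `parse_model_name(modelName).language` would be `multilingual`, which matches the explicit `language="en"` argument passed during synthesis.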

- ### Place the input audio file for cloning in the `input/` folder

In [4]:
inputFile = "input/audio.wav"
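
Since the notebook expects a reference sample of about one minute, it can help to check the file before passing it to the model. The following sketch uses only the standard-library `wave` module, so it handles uncompressed PCM WAV files only. The function names and the 30 to 120 second bounds are assumptions made for illustration, not requirements of the model.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_reference_audio(path, min_seconds=30.0, max_seconds=120.0):
    """Raise if the reference clip is missing or far from one minute.

    The duration bounds are assumed values, not limits imposed by the model.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Reference audio not found: {path}")
    duration = wav_duration_seconds(path)
    if not min_seconds <= duration <= max_seconds:
        raise ValueError(
            f"Reference audio is {duration:.1f} s long; "
            f"expected between {min_seconds} and {max_seconds} s"
        )
    return duration
```

Calling `check_reference_audio(inputFile)` before synthesis would surface a missing or very short clip early instead of at model inference time.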

- ### Pass the data to the TTS model

In [10]:
# tts() returns the synthesized waveform as a list of float samples;
# it does not write a file, so no file_path is passed here.
npArray = ttsModel.tts(
    text=outputString,
    speaker_wav=inputFile,
    language="en",
)
npArray

 > Text splitted to sentences.
['Identity and Access Management (IAM) is a set of processes, policies, and tools for controlling user access to critical information within an organization.', 'The enterprise needs IAM to support security and compliance, as well as improve organizational productivity.', 'This applies not only to people resources, but to any entity to which an identity is assigned (e.g., Internet of Things (IoT) devices, application programming interfaces (APIs)).', 'The proliferation of device types and locations from which applications and data are accessed also underlies the importance of identity and access management.', 'Identity and access management reduces the number of traditional points of security failure associated with passwords.', 'The enterprise is vulnerable not only to data breaches associated with passwords and password recovery information, but to human frailties when it comes to creating passwords – generating easy-to-remember (and easy to crack) passw

[0.000465924,
 0.00013102836,
 0.00042311967,
 -0.00025383045,
 -0.0003347481,
 -0.0007343255,
 -0.0015258964,
 -0.0015715567,
 -0.0011531658,
 -0.00092200964,
 -0.001373688,
 -0.0013381504,
 -0.00023780687,
 6.4890846e-05,
 -0.00090777694,
 -0.0010043108,
 -0.00042683486,
 -0.0006924802,
 -0.00079721934,
 -0.0006186942,
 -0.0004329608,
 -0.0003335182,
 0.00036576067,
 0.000299659,
 0.00010111663,
 -4.179965e-05,
 -0.00014050686,
 -0.00022335409,
 -0.00023152486,
 -0.000109138346,
 2.0794425e-05,
 -0.00024870696,
 -0.00021389546,
 -0.00028732704,
 -8.523275e-05,
 0.00023712934,
 0.00037222708,
 7.4852134e-05,
 -0.00024515303,
 -9.47556e-05,
 7.67468e-06,
 0.00014199922,
 0.00019449831,
 -0.00044684677,
 0.00041821762,
 0.0001994796,
 0.00021569077,
 0.000602734,
 -5.5985598e-05,
 0.00030212253,
 0.00022070794,
 -0.0001581871,
 -2.8721493e-05,
 0.0005088965,
 0.00028769363,
 -3.0651136e-06,
 0.0001638858,
 0.00035193592,
 0.00022809065,
 -0.000116396215,
 -1.1984172e-05,
 1.2575867e-05,
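
The values printed above are raw audio samples in the range of roughly -1 to 1. A minimal sketch of turning such a list into a duration and a 16-bit PCM WAV file with the standard library follows. The default sample rate of 16000 Hz is taken from the `sample_rate:16000` line in the model log and is an assumption about the output rate; `duration_seconds` and `write_pcm16_wav` are hypothetical helper names.

```python
import struct
import wave


def duration_seconds(samples, sample_rate=16000):
    """Length of the clip in seconds for a mono sample sequence."""
    return len(samples) / sample_rate


def write_pcm16_wav(samples, path, sample_rate=16000):
    """Write float samples in [-1, 1] as a mono 16-bit PCM WAV file."""
    # Clip out-of-range values, then scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
```

With the notebook's variables, `duration_seconds(npArray)` gives the clip length, which is a quick sanity check that the whole input text was synthesized.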

- ### Output the audio 

In [17]:
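# Write the cloned speech to disk. Emotion and speed come from the user
# input above; they may be ignored by locally loaded models such as YourTTS.
ttsModel.tts_to_file(
    text=outputString,
    speaker_wav=inputFile,
    emotion=outputEmotion,
    speed=outputSpeed,
    language="en",
    file_path="output/"+outputFileName+".wav"
)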

 > Text splitted to sentences.
['Identity and Access Management (IAM) is a set of processes, policies, and tools for controlling user access to critical information within an organization.', 'The enterprise needs IAM to support security and compliance, as well as improve organizational productivity.', 'This applies not only to people resources, but to any entity to which an identity is assigned (e.g., Internet of Things (IoT) devices, application programming interfaces (APIs)).', 'The proliferation of device types and locations from which applications and data are accessed also underlies the importance of identity and access management.', 'Identity and access management reduces the number of traditional points of security failure associated with passwords.', 'The enterprise is vulnerable not only to data breaches associated with passwords and password recovery information, but to human frailties when it comes to creating passwords – generating easy-to-remember (and easy to crack) passw

'output/GeneratedOutput.wav'
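
The call above writes into `output/`, which has to exist beforehand. One way to guard against a missing folder and a doubled `.wav` extension is sketched here. `build_output_path` is a hypothetical helper, and the default folder name simply mirrors the path used in the notebook.

```python
from pathlib import Path


def build_output_path(file_name, output_dir="output"):
    """Create output_dir if needed and return '<output_dir>/<name>.wav'."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Keep only the final path component so the file stays inside output_dir.
    stem = Path(file_name).name
    # Drop an existing extension so '.wav' is appended exactly once.
    if stem.lower().endswith(".wav"):
        stem = stem[:-4]
    if not stem:
        raise ValueError("The output file name must not be empty")

    return str(out_dir / f"{stem}.wav")
```

The result could then be passed as `file_path=build_output_path(outputFileName)` in place of the string concatenation.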