### Part 1: Generating text responses using an LLM
In this part we generate responses to the questions asked of President Biden in the first 2024 presidential debate between Joe Biden and Donald Trump. The responses are generated with a pre-trained OpenAI LLM.

In [None]:
%%capture
# Install the library (%%capture must be the first line of the cell)
!pip install openai

In [None]:
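from openai import OpenAI
import os

# Read the key from the environment if it is set; otherwise fall back to the placeholder.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "your_openai_api_key")

client = OpenAI(
    api_key=OPENAI_API_KEY,  # if omitted, the client reads the OPENAI_API_KEY environment variable
)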

def generate_response(question, model="gpt-3.5-turbo", temperature=0.7, max_tokens=200):
    """Return the model's answer to `question`, written in the persona set by the system prompt."""
    # Make the API request to OpenAI
    response = client.chat.completions.create(
        model=model,
        messages=[
            # Adjacent string literals are concatenated, so no indentation whitespace leaks into the prompt.
            {"role": "system", "content": (
                "You are an AI simulation designed to emulate President Joe Biden, providing articulate and insightful responses to a series of questions. "
                "Your task is to respond to the same questions that President Biden recently received in a public forum. "
                "Your responses should demonstrate a deep understanding of the issues discussed, convey empathy and clarity in your communication style, and exhibit a high level of coherence. "
                "Make sure you don't leave the response sentence incomplete."
            )},
            {"role": "user", "content": question}
        ],
        temperature=temperature,
        max_tokens=max_tokens
    )

    # Extract the response text
    answer = response.choices[0].message.content
    return answer

# Example usage
if __name__ == "__main__":

    question = "What is your approach to handling the conflict in Ukraine?"
    response = generate_response(question)
    print(response)

My approach to handling the conflict in Ukraine is grounded in diplomacy, support for Ukraine's sovereignty, and working closely with our allies and partners. It is essential to engage in dialogue, uphold international agreements, and seek peaceful solutions through negotiation and dialogue. The United States stands with Ukraine in its efforts to defend its territorial integrity and build a more stable and secure future for its people.
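Because `max_tokens` caps the reply length, a response can still stop mid-sentence despite the system prompt's instruction. The following is a minimal sketch of one way to guard against that, assuming it is acceptable to drop a trailing fragment. `trim_to_last_sentence` is a hypothetical helper, not part of the OpenAI library, and its simple punctuation rule will also cut after abbreviations such as "U.S.".

```python
import re

def trim_to_last_sentence(text: str) -> str:
    """Drop a trailing sentence fragment, if any, from generated text."""
    text = text.strip()
    # A sentence end is ., ! or ? (optionally followed by a closing quote)
    # that is followed by whitespace or the end of the text.
    endings = list(re.finditer(r'[.!?]"?(?=\s|$)', text))
    if not endings:
        return text  # no complete sentence found; return the text unchanged
    return text[: endings[-1].end()]

print(trim_to_last_sentence("We stand with Ukraine. Our allies are"))
# -> We stand with Ukraine.
```

The helper could be applied to `answer` before it is passed to the TTS model in Part 2, so that the cloned voice never reads out a broken sentence.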


### Part 2: Cloning Biden's voice in a TTS model
In this part we clone President Biden's voice from a single video clip of less than 3 minutes, and then generate the audio for the response above.

In [None]:
%%capture
# TTS.api is a module inside the TTS package, so it needs no separate install
!pip install TTS
!pip install --upgrade numpy

In [None]:
# Once the model is loaded, there is no need to run this cell again.
from TTS.api import TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2", gpu=True)



 > tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
 > Using model: xtts


In [None]:
# generate speech by cloning a voice using default settings
tts.tts_to_file(text=response,
                file_path="bidenoutput_question4.wav",
                speaker_wav=["/content/bidenOriginalVoice.wav"],
                language="en",
                #split_sentences=True
                )

 > Text splitted to sentences.
["My approach to handling the conflict in Ukraine is grounded in diplomacy, support for Ukraine's sovereignty, and working closely with our allies and partners.", 'It is essential to engage in dialogue, uphold international agreements, and seek peaceful solutions through negotiation and dialogue.', 'The United States stands with Ukraine in its efforts to defend its territorial integrity and build a more stable and secure future for its people.']
 > Processing time: 21.272592544555664
 > Real-time factor: 0.6024844653705538


'bidenoutput_question4.wav'

Save the audio files for future use.
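One way to keep the generated audio is to copy each WAV file into a dedicated folder. This is a minimal sketch using only the standard library. `archive_audio` and the destination folder in the usage comment are hypothetical names chosen for illustration.

```python
import shutil
from pathlib import Path

def archive_audio(wav_path, archive_dir):
    """Copy a generated WAV file into archive_dir and return the new path."""
    src = Path(wav_path)
    dest_dir = Path(archive_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    dest = dest_dir / src.name
    shutil.copy2(src, dest)  # copy2 also preserves file timestamps
    return dest

# Example (hypothetical destination folder):
# archive_audio("bidenoutput_question4.wav", "/content/saved_audio")
```

On Colab, pointing `archive_dir` at a mounted Google Drive folder would let the files survive the session restart required in Part 3.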

### Part 3: Generating AI Biden
In this part we generate an artificially intelligent Biden bot intended to produce more coherent and effective responses than the real President Biden.

Before running the next cells, restart the session once to avoid any locale issues.

In [None]:
!git clone https://github.com/zabique/Wav2Lip

Cloning into 'Wav2Lip'...
remote: Enumerating objects: 378, done.
remote: Total 378 (delta 0), reused 0 (delta 0), pack-reused 378
Receiving objects: 100% (378/378), 526.97 KiB | 4.50 MiB/s, done.
Resolving deltas: 100% (205/205), done.


Make the following change before proceeding:

In '/content/Wav2Lip/audio.py', replace the function '_build_mel_basis()' with the code below, which passes every argument to 'librosa.filters.mel' by keyword, as newer librosa releases require.
```
def _build_mel_basis():
    assert hp.fmax <= hp.sample_rate // 2
    return librosa.filters.mel(sr=hp.sample_rate, n_fft=hp.n_fft, n_mels=hp.num_mels, fmin=hp.fmin, fmax=hp.fmax)
```
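The same edit can also be applied from a notebook cell instead of by hand. The sketch below assumes the cloned file still contains the positional call stored in `OLD_CALL`. That string is an assumption about the repository's contents, so check it against your checkout first. `patch_mel_call` and `patch_file` are hypothetical helper names.

```python
from pathlib import Path

# Assumed original (positional) call in Wav2Lip/audio.py; verify before use.
OLD_CALL = "librosa.filters.mel(hp.sample_rate, hp.n_fft,"
NEW_CALL = "librosa.filters.mel(sr=hp.sample_rate, n_fft=hp.n_fft,"

def patch_mel_call(source: str) -> str:
    """Return source with the positional mel() call rewritten to keywords."""
    return source.replace(OLD_CALL, NEW_CALL)

def patch_file(path: str) -> bool:
    """Patch the file in place; return True if anything changed."""
    file = Path(path)
    original = file.read_text()
    patched = patch_mel_call(original)
    if patched != original:
        file.write_text(patched)
    return patched != original

# Example: patch_file("/content/Wav2Lip/audio.py")
```

`patch_file` returns `False` when the assumed text is not found, which signals that the file needs to be edited manually as described above.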

In [None]:
#download the pretrained model
!wget 'https://iiitaphyd-my.sharepoint.com/personal/radrabha_m_research_iiit_ac_in/_layouts/15/download.aspx?share=EdjI7bZlgApMqsVoEUUXpLsBxqXbn5z8VTmoxp55YNDcIA' -O '/content/Wav2Lip/checkpoints/wav2lip_gan.pth'

--2024-07-15 06:24:02--  https://iiitaphyd-my.sharepoint.com/personal/radrabha_m_research_iiit_ac_in/_layouts/15/download.aspx?share=EdjI7bZlgApMqsVoEUUXpLsBxqXbn5z8VTmoxp55YNDcIA
Resolving iiitaphyd-my.sharepoint.com (iiitaphyd-my.sharepoint.com)... 13.107.136.10, 13.107.138.10, 2620:1ec:8f8::10, ...
Connecting to iiitaphyd-my.sharepoint.com (iiitaphyd-my.sharepoint.com)|13.107.136.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 435801865 (416M) [application/octet-stream]
Saving to: ‘/content/Wav2Lip/checkpoints/wav2lip_gan.pth’


2024-07-15 06:24:12 (44.2 MB/s) - ‘/content/Wav2Lip/checkpoints/wav2lip_gan.pth’ saved [435801865/435801865]



In [None]:
%%capture
!cd Wav2Lip && pip install -r requirements.txt

In [None]:
#download pretrained model for face detection
!wget "https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth" -O "/content/Wav2Lip/face_detection/detection/sfd/s3fd.pth"

--2024-07-15 06:25:51--  https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth
Resolving www.adrianbulat.com (www.adrianbulat.com)... 45.136.29.207
Connecting to www.adrianbulat.com (www.adrianbulat.com)|45.136.29.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 89843225 (86M) [application/octet-stream]
Saving to: ‘/content/Wav2Lip/face_detection/detection/sfd/s3fd.pth’


2024-07-15 06:25:56 (23.4 MB/s) - ‘/content/Wav2Lip/face_detection/detection/sfd/s3fd.pth’ saved [89843225/89843225]



In [None]:
# Get a sample video for providing background and the audio response.
!cd Wav2Lip && python inference.py --checkpoint_path checkpoints/wav2lip_gan.pth --face "/content/bidenVideo.mp4" --audio "/content/bidenoutput_question4.wav"

Using cpu for inference.
Reading video frames...
Number of frames available for inference: 3410
(80, 2377)
Length of mel chunks: 886
  0% 0/7 [00:00<?, ?it/s]
  0% 0/56 [00:00<?, ?it/s]
  2% 1/56 [02:32<2:19:51, 152.58s/it]
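The log above shows inference running on the CPU at roughly 150 seconds per batch, so it can be worth confirming that a GPU is visible before launching the command. The following is a minimal sketch, assuming an NVIDIA runtime where the `nvidia-smi` tool is on the path when a GPU is attached. `gpu_visible` is a hypothetical helper name.

```python
import shutil
import subprocess

def gpu_visible() -> bool:
    """Return True if nvidia-smi exists and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # tool not installed, so no NVIDIA driver is visible
    # `nvidia-smi -L` prints one line per GPU, each starting with "GPU <index>:".
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout

if not gpu_visible():
    print("No GPU detected: Wav2Lip inference will fall back to the CPU.")
```

This check only reports whether the driver sees a GPU. Whether the installed PyTorch build can use it is a separate question that this sketch does not answer.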

In [None]:
#Play result video -  50% scaling
from IPython.display import HTML
from base64 import b64encode
with open('/content/Wav2Lip/results/result_voice.mp4', 'rb') as f:
    mp4 = f.read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML(f"""
<video width="50%" height="50%" controls>
      <source src="{data_url}" type="video/mp4">
</video>""")

In [None]:
#Download Result.mp4 to your computer
from google.colab import files
files.download('/content/Wav2Lip/results/result_voice.mp4')

