# Whisper and Llama 2

This notebook documents our experiments with:
- Whisper, used to transcribe a test video
- Llama 2, which we could not run properly because we lacked the resources (a GPU with enough memory)
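
The resource limit noted above can be made concrete with a rough estimate of the memory needed just to hold the model weights. The sketch below is only an approximation: the 7e9 parameter count is a round figure assumed for Llama-2-7b, and activations, the KV cache, and framework overhead are ignored.

```python
# Rough weight-memory estimate. The parameter count is an assumed round
# figure, and only the weights are counted (no activations or KV cache).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gib(n_params: float, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3


n_params = 7e9  # assumption: approximate size of Llama-2-7b
for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gib(n_params, dtype):5.1f} GiB")
```

In 16-bit precision this comes to roughly 13 GiB for the weights alone, which is more than a consumer GPU with 6 to 8 GB of memory can hold.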

In [2]:
import whisper

In [3]:
whisper_model = whisper.load_model('base.en')

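def transcribe(file_path: str) -> str:
    """Transcribe an audio or video file and return the recognized text."""
    # fp16=True needs a GPU; on CPU Whisper warns and falls back to FP32.
    transcription = whisper_model.transcribe(file_path, fp16=True)
    return transcription['text']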

In [4]:
path = './video_test.mp4'
transcription = transcribe(path)
print(transcription)

 The Pink Apple Watch Series 9 is officially here, so let's unbox it together. I've been waiting to get my hands on the Pink Apple Watch for what feels like forever. Plus, we can compare the new model to the Apple Watch Series 8. The pink is subtle, but it is beautiful. Plus, we've got our hands on the upgraded charger, as well as this new carbon neutral band that I'm obsessed with. Let's unbox this sport loop style. Really chic. All right, let's hook this together. Okay, let's upgrade. We're going to remove this Series 8 for a little bit of variety, and so I can match different outfits. I also got this Nike Sportband. Let me know if in the next video you want me to compare the Apple Watch Series 8 with the Apple Watch Series 9. We've got a chat about all of the upgrades.


In [1]:
from torch import cuda, bfloat16
import torch

In [5]:
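device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'
# get_device_name() raises when no CUDA device is present, so guard it.
device_name = torch.cuda.get_device_name() if cuda.is_available() else 'CPU'
print(f"Using device: {device} ({device_name})")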

Using device: cuda:0 (NVIDIA GeForce RTX 2060)


In [6]:
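# Load the tokenizer and model directly from the Hugging Face Hub
# (the Llama 2 repository is gated, so access must be granted first).
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
# device_map places the weights on the chosen device while loading.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", device_map=device, torch_dtype=torch.bfloat16)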

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [7]:
# Redundant when device_map is set at load time; kept as an explicit safeguard.
model = model.to(device)

In [8]:
summary = 'Novak Djokovic beaten by Jannik Sinner in Australian Open semifinal. Sinner, 22, is the youngest male finalist at Australian Open since 2008. He will face either Daniil Medvedev or Alexander Zverev in Sunday\'s showpiece. Djokovic\'s bid for an outright record 25th grand slam title is put on hold after he was outplayed by the Italian across their three hour, 22-minute contest. The Serb was broken at 2-1 in the fourth set having held a 40-0 lead.'

In [13]:
# Adjacent string literals are concatenated; each piece ends with a space
# or newline so that sentences do not run together in the final prompt.
prompt = (
    "Generate 1 script based on the following inputs: \n\n"
    f"Transcript Text: {transcription}\n"
    f"Article Summary: {summary}\n"
    f"Video Duration: {30} seconds\n\n"
    "The script should follow these guidelines: "
    "1. Style Replication: Use language, sentence structure, and unique features from the transcript text. "
    "2. Tone Emulation: Reflect the emotional tone of the transcript. "
    "3. Creative Expansion: Add new ideas and narratives that align with the themes of the article summary. "
    "4. Time Consideration: The content should be appropriate for the specified video duration. "
    "5. Personality Reflection: Reflect the speaker's personality traits.\n\n"
    "The format must be: * script_1:\n"
    "That's an example of how the format should be. Please include * just before script. It is mandatory and essential to have this exact format so I can work on it later.\n"
    "* script_1: "
)

# Encode the prompt text to tensor
input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)

# Generate text using the model
output = model.generate(input_ids, max_new_tokens=512, num_return_sequences=1)

# Decode the generated text
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)


ValueError: The following `model_kwargs` are not used by the model: ['max_size'] (note: typos in the generate arguments will also show up in this list)
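
The prompt above asks for a `* script_1:` marker so the output can be processed later. A minimal sketch of that post-processing step follows; `extract_scripts` is a hypothetical helper, not part of the original notebook, and it assumes the decoded output starts with the prompt verbatim (which may not hold if decoding changes whitespace).

```python
import re
from typing import List

# Matches markers such as "* script_1:" or "*script_2 :".
MARKER = re.compile(r"\*\s*script_\d+\s*:")


def extract_scripts(generated_text: str, prompt: str) -> List[str]:
    """Return the scripts found in the model output, without the prompt."""
    # Causal language models echo the prompt, so drop it when it is present.
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text  # fallback: parse everything
    # The prompt already ends with "* script_1: ", so the first segment
    # is script_1 and any later markers start further scripts.
    segments = MARKER.split(completion)
    return [segment.strip() for segment in segments if segment.strip()]


demo_prompt = "Write something. * script_1: "
demo_output = demo_prompt + "Hello there.\n* script_2: Second one."
print(extract_scripts(demo_output, demo_prompt))
```

If the prompt is not echoed exactly, the fallback branch also returns fragments of the prompt itself, so slicing the output tokens before decoding may be a more robust alternative.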

# Llama Test

In [14]:
import torch

import transformers

from transformers import LlamaForCausalLM, LlamaTokenizer

In [15]:
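model_dir = 'meta-llama/Llama-2-7b-hf'

# Load in half precision directly onto the target device; without these
# arguments from_pretrained loads float32 weights into CPU RAM first.
model = LlamaForCausalLM.from_pretrained(model_dir, torch_dtype=torch.float16, device_map=device)
tokenizer = LlamaTokenizer.from_pretrained(model_dir)

# The model is already instantiated, so dtype and device are set above
# rather than passed to the pipeline.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)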

Loading checkpoint shards:   0%|          | 0/2 [00:15<?, ?it/s]

In [16]:
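sequences = pipeline(
    prompt,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    # max_length would also count the (long) prompt tokens, so bound
    # only the newly generated tokens instead.
    max_new_tokens=400,
)

for seq in sequences:
    print(seq['generated_text'])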

KeyboardInterrupt: 
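
In the transformers generation API, `max_length` bounds the prompt plus the generated tokens, while `max_new_tokens` bounds only the generated part. With a prompt as long as the one used here, that difference matters. The sketch below shows the arithmetic only; `new_token_budget` is a hypothetical helper and the token counts are made-up illustrations, not measurements from this notebook.

```python
def new_token_budget(prompt_tokens: int, max_length: int) -> int:
    """Tokens left for generation when max_length also covers the prompt."""
    return max(0, max_length - prompt_tokens)


# Hypothetical token counts, for illustration only.
for prompt_tokens in (100, 400, 500):
    budget = new_token_budget(prompt_tokens, max_length=400)
    print(f"prompt of {prompt_tokens} tokens leaves room for {budget} new tokens")
```

In practice the prompt length can be measured with `len(tokenizer.encode(prompt))` before choosing either limit.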