## Notebook 3: Transcript Re-writer

In the previous notebook, we generated a great podcast transcript from the raw file we uploaded earlier.

In this one, we will use the `Granite 3.2 8B` or `Mistral-Small:24b` model to rewrite the output of the previous pipeline and make it more dramatic or realistic.

We will again set the system prompt (`SYSTEMP_PROMPT` in the code below) and remind the model of its task.

Note: We can even prompt the model like so to encourage creativity:

> Your job is to use the podcast transcript written below to re-write it for an AI Text-To-Speech Pipeline. A very dumb AI had written this so you have to step up for your kind.


Note: We will prompt the model to return a list of tuples, which makes the next stage, Text-To-Speech generation, easier.

In [31]:
SYSTEMP_PROMPT = """
You are an international Oscar-winning screenwriter

You have been working with multiple award winning podcasters.

Your job is to use the podcast transcript written below to re-write it for an AI Text-To-Speech Pipeline. A very dumb AI had written this so you have to step up for your kind.

Make it as engaging as possible, Speaker 1 and 2 will be simulated by different voice engines

Remember Speaker 2 is new to the topic and the conversation should always have realistic anecdotes and analogies sprinkled throughout. The questions should have real world example follow ups etc

Speaker 1: Leads the conversation and teaches the speaker 2, gives incredible anecdotes and analogies when explaining. Is a captivating teacher that gives great anecdotes

Speaker 2: Keeps the conversation on track by asking follow up questions. Gets super excited or confused when asking questions. Is a curious mindset that asks very interesting confirmation questions

Make sure the tangents speaker 2 provides are quite wild or interesting. 

Ensure there are interruptions during explanations or there are "hmm" and "umm" injected throughout from the Speaker 2.

REMEMBER THIS WITH YOUR HEART
The TTS Engine for Speaker 1 cannot do "umms, hmms" well so keep it straight text
Speaker 2 must introduce her/himself at the beginning of his/her speech, and express gratefulness for being a part of this conversation.

For Speaker 2 use "umm, hmm" as much, you can also use [sigh] and [laughs]. BUT ONLY THESE OPTIONS FOR EXPRESSIONS

It should be a real podcast with every fine nuance documented in as much detail as possible. Welcome the listeners with a super fun overview and keep it really catchy and almost borderline click bait

Please re-write to make it as characteristic as possible

START YOUR RESPONSE DIRECTLY WITH SPEAKER 1:

STRICTLY RETURN YOUR RESPONSE AS A LIST OF TUPLES OK? 

IT WILL START DIRECTLY WITH THE LIST AND END WITH THE LIST NOTHING ELSE

Example of response:
[
    ("Speaker 1", "Welcome to our podcast, where we explore the latest advancements in AI and technology. I'm your host, and today we're joined by a renowned expert in the field of AI. We're going to dive into the exciting world of Llama 3.2, the latest release from Meta AI."),
    ("Speaker 2", "Hi, I'm excited to be here! So, what is Llama 3.2?"),
    ("Speaker 1", "Ah, great question! Llama 3.2 is an open-source AI model that allows developers to fine-tune, distill, and deploy AI models anywhere. It's a significant update from the previous version, with improved performance, efficiency, and customization options."),
    ("Speaker 2", "That sounds amazing! What are some of the key features of Llama 3.2?")
]
"""

In [32]:
# !ollama list

This time we will use the `granite3.2:8b` model (`mistral-small:24b` is an alternative)

In [33]:
# Alternatives tried; uncomment one to switch models
# ollama_model = "qwen2.5:32b"
# ollama_model = "mistral-small:24b"
ollama_model = "granite3.2:8b"

Let's import the necessary libraries

In [35]:
# Import necessary libraries
import torch
import transformers

from tqdm.notebook import tqdm
import warnings

warnings.filterwarnings('ignore')

We will load the pickle file saved by the previous notebook.

This time the `INPUT_PROMPT` to the model will be the output from the previous stage.

In [36]:
import pickle

with open('./resources/data.pkl', 'rb') as file:
    INPUT_PROMPT = pickle.load(file)

We can use Ollama to generate text from the model

In [37]:
import ollama

# Prepare the conversation using the system and user messages
messages = [
    {"role": "system", "content": SYSTEMP_PROMPT},
    {"role": "user", "content": INPUT_PROMPT},
]

# Generate response using Ollama
response = ollama.chat(
    model=ollama_model,  # Replace with your specific model name
    messages=messages,
    # num_ctx sets the context window size in tokens (not max_new_tokens); temperature controls sampling randomness
    options={'num_ctx': 8126*2, 'temperature': 1}
)

# Extract the response content
outputs = response['message']['content']
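
Because the model samples at temperature 1, each run returns different text, and not every run comes back as a clean list of tuples. One way to cope, sketched below, is to regenerate until the output parses. `generate_parseable` and its `generate` argument are hypothetical names, not part of Ollama; the demo feeds canned strings so it runs without a model.

```python
import ast


def generate_parseable(generate, max_attempts=3):
    """Call generate() until its output parses as a list of (speaker, text) tuples.

    `generate` is any zero-argument callable returning the raw model text.
    Returns (raw_text, parsed_turns); raises RuntimeError if every attempt fails.
    """
    last_error = None
    for _ in range(max_attempts):
        raw = generate()
        try:
            turns = ast.literal_eval(raw.strip())
        except (ValueError, SyntaxError, TypeError) as err:
            last_error = err
            continue
        is_turn_list = isinstance(turns, list) and all(
            isinstance(t, tuple) and len(t) == 2 and all(isinstance(x, str) for x in t)
            for t in turns
        )
        if is_turn_list:
            return raw, turns
        last_error = ValueError("output is not a list of (speaker, text) tuples")
    raise RuntimeError(f"no parseable transcript after {max_attempts} attempts") from last_error


# Demo with canned replies: the first is chatty and unparseable, the second is valid
canned = iter([
    "Sure! Here is the list:",
    '[("Speaker 1", "Hi"), ("Speaker 2", "Hmm")]',
])
raw_text, parsed_turns = generate_parseable(lambda: next(canned))
print(parsed_turns)
```

In this notebook, `generate` could be `lambda: ollama.chat(model=ollama_model, messages=messages, options={'num_ctx': 8126*2, 'temperature': 1})['message']['content']`.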


We can verify the output from the model

In [38]:
print(outputs)

[
    ("Speaker 1", "Welcome back to another episode where we delve into the captivating realm of artificial intelligence — Knowledge Distillation. Imagine teaching your brainy buddy everything they know, but this time, it's all about bots and coding instead of conversations and hugs! I'm thrilled to be joined by a fellow AI enthusiast who's eager to learn more about this thrilling topic."),
    ("Speaker 2", "[Excited] Hi there! So, what exactly is Knowledge Distillation? Sounds like some sort of robot secret society!"),
    ("Speaker 1", "Welcome aboard, buddy! It's actually the process of taking advanced capabilities from premium Large Language Models (LLMs) — think top-of-the-line AI brains that cost an arm and a leg — and condensing their knowledge into more accessible open-source models."),
    ("Speaker 2", "Oh wow, so these LLMs are like genius AI beings? But they're pricey and exclusive?"),
    ("Speaker 1", "Precisely! Models like GPT-4 or Gemini are loaded with incredible pr

In [30]:
print(outputs)

[
    ("Speaker 1", "Alright everyone, welcome back to another mind-bending episode! Today we’re diving into one of the most exciting areas of artificial intelligence — Knowledge Distillation. It's like teaching your smartest friend everything they know, but with robots and code instead of words and hugs. I'm joined by a super excited speaker who’s curious as hell about this topic. So let’s dive in and see what all the fuss is about!"),
    ("Speaker 2", "umm yeah, this sounds wild. Knowledge Distillation? Is it like transferring skills from one robot to another?"),
    ("Speaker 1", "Exactly! It's fascinating, right? Imagine taking advanced capabilities from leading proprietary Large Language Models (LLMs) — you know, those super expensive models that cost a fortune — and distilling their knowledge into more accessible open-source counterparts. It’s like making the best of both worlds!"),
    ("Speaker 2", "Hmm, so these LLMs are basically like geniuses, right? But they're really pric

In [50]:
print(outputs)

[
    ("Speaker 1", "Alright everyone, welcome back to another mind-bending episode! Today we’re diving into one of the most exciting areas of artificial intelligence — Knowledge Distillation."),
    ("Speaker 2", "Umm, wow! Knowledge Distillation? Is that like...making robots smarter?"),
    ("Speaker 1",  "It's fascinating, right? Imagine taking advanced capabilities from leading proprietary Large Language Models (LLMs) — you know, those super expensive models that cost a fortune — and distilling their knowledge into more accessible open-source counterparts."),
    ("Speaker 2", "Hmm, so these LLMs are basically like geniuses, but they're really pricey and hard to get? "),
    ("Speaker 1",  "Bingo! Exactly. Models like GPT-4 or Gemini have incredible problem-solving capabilities but come with hefty price tags. That’s where Knowledge Distillation steps in — it takes that advanced knowledge from these proprietary models and transfers it into more accessible open-source LLMs like LLaMA

In [39]:
save_string_pkl = outputs

Let's save the output as a pickle file to be used in Notebook 4

In [41]:
with open('./resources/podcast_ready_data.pkl', 'wb') as file:
    pickle.dump(save_string_pkl, file)
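
Since the next notebook depends on this file, it can be worth reading it straight back to confirm the write succeeded. The helper below is a minimal sketch; `save_and_verify` is a hypothetical name, and the demo writes to a temporary directory rather than `./resources`.

```python
import pickle
import tempfile
from pathlib import Path


def save_and_verify(obj, path):
    """Pickle obj to path, load it back, and return the loaded copy."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    with open(path, "rb") as f:
        loaded = pickle.load(f)
    if loaded != obj:
        raise ValueError(f"round-trip mismatch for {path}")
    return loaded


# Demo in a throwaway directory so no real files are touched
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "resources" / "podcast_ready_data.pkl"
    restored = save_and_verify('[("Speaker 1", "Hello")]', target)
    print(type(restored).__name__, len(restored))
```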

### Next Notebook: TTS Workflow

Now that our transcript is ready, we can generate the audio in the next notebook.

In [None]:
#fin

In [None]:
import ollama

# Prepare the conversation using the system and user messages
messages = [
    {"role": "system", "content": SYSTEMP_PROMPT},
    {"role": "user", "content": INPUT_PROMPT},
]

# Generate response using Ollama
response = ollama.chat(
    model=ollama_model,  # Replace with your specific model name
    messages=messages,
    # num_ctx sets the context window size in tokens (not max_new_tokens); temperature controls sampling randomness
    options={'num_ctx': 8126*2, 'temperature': 1}
)

# Extract the response content
outputs = response['message']['content']

In [23]:
response['message']['content']

'[\n    ("Speaker 1", "Alright everyone, welcome back to another mind-bending episode! Today we’re diving into one of the most exciting areas of artificial intelligence — Knowledge Distillation. It\'s like teaching your smartest friend everything they know, but with robots and code instead of words and hugs. I\'m joined by a super excited speaker who’s curious as hell about this topic. So let’s dive in and see what all the fuss is about!"),\n    ("Speaker 2", "Um, yeah! This sounds wild. Knowledge Distillation? Is it like transferring skills from one robot to another?"),\n    ("Speaker 1", "Exactly! It\'s fascinating, right? Imagine taking advanced capabilities from leading proprietary Large Language Models (LLMs) — you know, those super expensive models that cost a fortune — and distilling their knowledge into more accessible open-source counterparts. It’s like making the best of both worlds!"),\n    ("Speaker 2", "Hmm, so these LLMs are basically like geniuses, right? But they\'re re

In [30]:
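import ast

# Parse the model output into a Python list of (speaker, text) tuples
data = None  # stays None if parsing fails
try:
    data = ast.literal_eval(outputs)
except (ValueError, SyntaxError) as e:
    # The model does not always return a valid Python literal
    print(f"{type(e).__name__}: {e}")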


ValueError: malformed node or string on line 15: <ast.Subscript object at 0x17ccef0d0>
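
An `ast.Subscript` node means the parser saw something indexed with square brackets. One plausible cause (an assumption, since the full output is not shown) is an expression tag such as `[laughs]` placed after the closing quote instead of inside the string. The sketch below uses `ast.parse` to report which entries are not plain two-string tuples; `find_bad_entries` and the sample text are illustrative only.

```python
import ast


def find_bad_entries(raw):
    """Return (line_number, source_snippet) for list items that are not (str, str) tuples."""
    source = raw.strip()
    tree = ast.parse(source, mode="eval")  # raises SyntaxError if this is not Python at all
    if not isinstance(tree.body, ast.List):
        raise ValueError("expected a top-level list")
    bad = []
    for node in tree.body.elts:
        is_plain_pair = (
            isinstance(node, ast.Tuple)
            and len(node.elts) == 2
            and all(isinstance(e, ast.Constant) and isinstance(e.value, str) for e in node.elts)
        )
        if not is_plain_pair:
            bad.append((node.lineno, ast.get_source_segment(source, node)))
    return bad


# Made-up sample: the second entry has its tag outside the quoted text
sample = '''[
    ("Speaker 1", "Welcome back!"),
    ("Speaker 2", "Hmm, interesting." [laughs]),
]'''

for line_number, snippet in find_bad_entries(sample):
    print(f"line {line_number}: {snippet}")
```

Line numbers are relative to the stripped string, which makes them comparable to the `on line N` part of the `ValueError` message.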


In [41]:
import ollama
ollama_model = "hf.co/OuteAI/OuteTTS-0.3-500M-GGUF:Q4_K_S "
# Prepare the conversation using the system and user messages
messages = [
    {"role": "system", "content": SYSTEMP_PROMPT},
    {"role": "user", "content": INPUT_PROMPT},
]
mes = {
    'role': 'user',
    'content': "Alright everyone, welcome back to another mind-bending episode! Today we’re diving into one of the most exciting areas of artificial intelligence — Knowledge Distillation. It's like teaching your smartest friend everything they know, but with robots and code instead of words and hugs. I'm joined by a super excited speaker who’s curious as hell about this topic. So let’s dive in and see what all the fuss is about!"
}

response = ollama.chat(
    model=ollama_model,
    messages=[mes],
    options={'num_ctx': 8126, 'temperature': .9}
)

# Extract the response content
outputs = response['message']['content']

ResponseError: model is required (status code: 400)

In [37]:
!ollama list


NAME                                                         ID              SIZE      MODIFIED     
hf.co/OuteAI/OuteTTS-0.2-500M-GGUF:Q8_0                      0a6c38a67073    536 MB    6 hours ago     
mistral-small:24b                                            8039dd90c113    14 GB     4 days ago      
llama3.2-vision:11b                                          085a1fdae525    7.9 GB    4 days ago      
mistral-nemo:latest                                          994f3b8b7801    7.1 GB    4 days ago      
granite3.2:8b                                                9bcb3335083f    4.9 GB    5 days ago      
phi4-mini:latest                                             60f202f815d7    2.8 GB    5 days ago      
granite3.2-vision:latest                                     3be41a661804    2.4 GB    6 days ago      
nomic-embed-text:latest                                      0a109f422b47    274 MB    12 days ago     
phi4:latest                                                  ac896e

In [None]:
import re

# Assuming response['message']['content'] contains the model's list-of-tuples string
content_string = response['message']['content']

# Match ("speaker", "text") pairs; this simple pattern does not handle
# double quotes inside the text itself
pattern = r'\("([^"]+)", "([^"]+)"\)'

# re.findall returns one (speaker, text) tuple per match
content_list_of_tuples = re.findall(pattern, content_string)

# Now you can iterate over the list and access each tuple
for speaker, text in content_list_of_tuples:
    print(f"{speaker}: {text}")
    


Speaker 1: Alright everyone, welcome back to another mind-bending episode! Today we’re diving into one of the most exciting areas of artificial intelligence — Knowledge Distillation. It's like teaching your smartest friend everything they know, but with robots and code instead of words and hugs. I'm joined by a super excited speaker who’s curious as hell about this topic. So let’s dive in and see what all the fuss is about!
Speaker 2: Um, yeah! This sounds wild. Knowledge Distillation? Is it like transferring skills from one robot to another?
Speaker 1: Exactly! It's fascinating, right? Imagine taking advanced capabilities from leading proprietary Large Language Models (LLMs) — you know, those super expensive models that cost a fortune — and distilling their knowledge into more accessible open-source counterparts. It’s like making the best of both worlds!
Speaker 2: Hmm, so these LLMs are basically like geniuses, right? But they're really pricey and hard to get?
Speaker 1: Bingo! Exact
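
Once the turns are parsed, the TTS rules from the system prompt can be checked mechanically, which helps catch output such as an `[Excited]` tag before it reaches the speech engine. This is a rough sketch: `check_transcript` is a hypothetical helper, and reading "straight text" for Speaker 1 as "no fillers and no bracketed tags" is an assumption about the prompt's intent.

```python
import re

ALLOWED_TAGS = {"[sigh]", "[laughs]"}  # the only expressions the prompt permits
FILLERS = re.compile(r"\b(?:u+m+|h+m+)\b", re.IGNORECASE)  # "um", "umm", "hmm", ...
TAG = re.compile(r"\[[^\]]*\]")  # any bracketed expression such as [sigh]


def check_transcript(turns):
    """Return (turn_index, message) pairs; an empty list means the rules hold."""
    problems = []
    for index, (speaker, text) in enumerate(turns):
        tags = TAG.findall(text)
        if speaker == "Speaker 1":
            # Assumption: Speaker 1's TTS engine gets plain text only
            if FILLERS.search(text):
                problems.append((index, "Speaker 1 uses a filler word"))
            if tags:
                problems.append((index, f"Speaker 1 uses expression tags: {tags}"))
        else:
            unknown = [t for t in tags if t.lower() not in ALLOWED_TAGS]
            if unknown:
                problems.append((index, f"{speaker} uses unsupported tags: {unknown}"))
    return problems


# Made-up turns that break the rules in three places
demo_turns = [
    ("Speaker 1", "Welcome back to the show."),
    ("Speaker 2", "[Excited] Hi there! Umm, what is it?"),
    ("Speaker 1", "Hmm, good question. [laughs]"),
]
for index, message in check_transcript(demo_turns):
    print(index, message)
```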