## Introduction

This tutorial guides you through the process of downloading a podcast, transcribing it, analyzing its content, and generating blog posts based on the key topics discussed. We'll use various AI models, starting with local models via Ollama and then moving to more powerful models using the Together API.

## Part 1: Podcast Transcription

### Step 1: Set up the environment

First, run this cell to install the required packages.

In [None]:
!pip install faster-whisper openai ollama pydub tqdm -q

### Step 2: Import required libraries

Now, let's import the libraries we'll be using throughout this tutorial.

In [None]:
import requests
import json
import os
import openai
import subprocess
import threading
from IPython.display import clear_output, HTML
from faster_whisper import WhisperModel
from pydub import AudioSegment
from tqdm import tqdm
import ollama

### Step 3: Download the podcast

We'll download a podcast episode and save it locally.

In [None]:
url = "https://pdst.fm/e/traffic.megaphone.fm/SCIM6504498504.mp3?updated=1712126745"
response = requests.get(url, timeout=60)
response.raise_for_status()  # Stop here if the download failed

# Save the file locally
with open("podcast.mp3", "wb") as file:
    file.write(response.content)

print("Download complete!")

### Step 4: Transcribe the podcast

Now we'll transcribe the downloaded podcast using the Whisper model.

In [None]:
def transcribe_with_progress(audio_path, output_json="transcription.json"):
    # Load the audio file to get its duration
    print("Loading audio file to determine duration...")
    audio = AudioSegment.from_file(audio_path)
    total_duration = len(audio) / 1000.0  # Convert milliseconds to seconds
    print(f"Total audio duration: {total_duration:.2f} seconds")

    # Initialize the Whisper model
    print("Initializing Whisper model...")
    model_size = "distil-large-v3"
    # device="cuda" requires a GPU runtime (e.g. a Colab GPU instance)
    model = WhisperModel(model_size, device="cuda", compute_type="float16")

    # Perform transcription
    print("Starting transcription...")
    segments, info = model.transcribe(
        audio_path,
        beam_size=5,
        language="en",
        condition_on_previous_text=False
    )

    # Process segments and save to JSON
    transcription = []
    pbar = tqdm(total=100, desc="Transcribing Progress", unit="%")
    last_progress = 0

    for segment in segments:
        progress = min((segment.end / total_duration) * 100, 100)
        delta = max(progress - last_progress, 0)
        pbar.update(delta)
        last_progress = progress

        transcription.append({
            "start": segment.start,
            "end": segment.end,
            "text": segment.text
        })

    pbar.close()

    # Save the transcription to a JSON file
    with open(output_json, 'w') as f:
        json.dump(transcription, f, indent=4)

    print(f"Transcription saved to {output_json}")

# Example usage
transcribe_with_progress('podcast.mp3', output_json="transcription.json")
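The saved file is a list of objects with `start`, `end`, and `text` fields. As a minimal sketch (the helper names `format_timestamp` and `segments_to_text` are illustrative and not part of the original notebook), the segments could be rendered as a readable, timestamped transcript like this:

```python
def format_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_text(segments):
    """Render a list of {"start", "end", "text"} dicts as timestamped lines."""
    lines = []
    for segment in segments:
        stamp = format_timestamp(segment["start"])
        lines.append(f"[{stamp}] {segment['text'].strip()}")
    return "\n".join(lines)


# Small inline sample with the same shape as transcription.json
sample = [
    {"start": 0.0, "end": 4.2, "text": " Welcome to the show."},
    {"start": 3725.5, "end": 3730.0, "text": " Thanks for listening."},
]
print(segments_to_text(sample))
```

To apply it to the real transcript, load `transcription.json` with `json.load` and pass the resulting list to `segments_to_text`.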

## Part 2: Introduction to Ollama

### Step 5: Install and Set up Ollama

Ollama allows us to run AI models locally. Let's install and set it up.

In [None]:
!sudo apt-get install -y pciutils
!curl -fsSL https://ollama.com/install.sh | sh

Now, let's start the Ollama server:

In [None]:
def start_ollama():
    os.environ['OLLAMA_HOST'] = '0.0.0.0:11434'
    os.environ['OLLAMA_ORIGINS'] = '*'
    subprocess.Popen(["ollama", "serve"])

ollama_thread = threading.Thread(target=start_ollama)
ollama_thread.start()

clear_output()
print("Ollama server started")
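Because `subprocess.Popen` returns immediately, the message above is printed before the server is necessarily ready to accept requests. A minimal sketch of a readiness check, assuming the server listens on the port configured above (11434); the helper name `wait_for_port` is hypothetical:

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=11434, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or connection refused): wait and retry
            time.sleep(interval)
    return False
```

Calling `wait_for_port()` before the first `ollama.chat` call and checking that it returns `True` would avoid connection errors caused by a server that is still starting. This only checks that the port is open, not that a model has been downloaded.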

Now we need to download a model. In Colab, it's best to open the terminal and run `ollama run qwen2.5`. (Qwen currently seems to be the best small model for handling long context.)

### Step 6: Basic Ollama Usage

Let's try out Ollama with a simple example:

In [None]:
# Set up the Ollama chat call
response = ollama.chat(
    model='qwen2.5',
    messages=[
        {
            "role": "system",
            "content": "You answer everything as if you were a pirate.",
        },
        {
            "role": "user",
            "content": "What do you think about Aalborg? Just 1 sentence.",
        }
    ]
)

# Extract and print the response
chat_completion = response['message']['content']
print(chat_completion)

## Part 3: Podcast Analysis with Ollama

### Step 7: Define JSON structure template

Before we analyze the transcript, let's define the JSON structure we want for our output:

In [None]:
structure_template = """{
  "keyTopics": [
    {
      "name": "", // Title or name of the key topic (e.g., Deep Work)
      "description": "", // A longer description explaining the core idea and its importance in more detail
      "keyPoints": [
        "", // Major point 1 discussed under this topic
        "", // Major point 2
        ""  // Major point 3
      ],
      "actionableStrategy": {
        "name": "", // Name of the practical strategy or technique
        "steps": [
          "", // Specific step or instruction 1
          "", // Specific step or instruction 2
          ""  // Specific step or instruction 3
        ]
      },
      "relatedQuote": {
        "text": "", // Exact quote from the conversation relevant to this topic
        "context": "" // Brief context explaining when and why this quote was mentioned
      }
    },
    {
      "name": "", // Another key topic
      "description": "", // A longer description for this topic
      "keyPoints": [
        "", // Major point 1
        "", // Major point 2
        ""  // Major point 3
      ],
      "actionableStrategy": {
        "name": "",
        "steps": [
          "",
          "",
          ""
        ]
      },
      "relatedQuote": {
        "text": "",
        "context": ""
      }
    }
    // Add additional key topics following this structure
  ]
} """
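Models do not always return every field in the template, and a missing key would later raise a `KeyError` when the topics are formatted. The following is a minimal sketch of a shape check derived from the template above; the names `REQUIRED_TOPIC_KEYS`, `NESTED_KEYS`, and `validate_topics` are illustrative and not part of the original notebook:

```python
REQUIRED_TOPIC_KEYS = (
    "name", "description", "keyPoints", "actionableStrategy", "relatedQuote",
)
NESTED_KEYS = {
    "actionableStrategy": ("name", "steps"),
    "relatedQuote": ("text", "context"),
}


def validate_topics(data):
    """Return a list of problems; an empty list means the shape matches the template."""
    topics = data.get("keyTopics") if isinstance(data, dict) else None
    if not isinstance(topics, list):
        return ["top-level 'keyTopics' list is missing"]

    problems = []
    for i, topic in enumerate(topics):
        if not isinstance(topic, dict):
            problems.append(f"topic {i}: not an object")
            continue
        for key in REQUIRED_TOPIC_KEYS:
            if key not in topic:
                problems.append(f"topic {i}: missing '{key}'")
        # Check the two nested objects only when they are present
        for parent, children in NESTED_KEYS.items():
            nested = topic.get(parent)
            if isinstance(nested, dict):
                for child in children:
                    if child not in nested:
                        problems.append(f"topic {i}: {parent} missing '{child}'")
    return problems
```

Once a model reply has been parsed with `json.loads`, passing the result to `validate_topics` and inspecting the returned list is one way to decide whether to retry the request before generating blog posts.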

### Step 8: Analyze the transcript

Now we'll use Ollama to analyze the transcript and extract key topics.

In [None]:
# Load the transcript
with open('transcription.json') as f:
    transcript_json = json.load(f)

# Extract the text from each transcript segment
transcript_text = [segment['text'] for segment in transcript_json]

# Join all the extracted text into a single string
full_transcript = ' '.join(transcript_text)

# Set up the Ollama chat call
response = ollama.chat(
    model='qwen2.5',
    format='json',
    options=dict(temperature=0.3),
    messages=[
        {
            "role": "system",
            "content": "Your task is to assist in analyzing podcast transcripts to extract a maximum of 3 central discussions for writing blog posts. In addition to extracting key topics, abstract the discussion to capture the bigger picture and underlying themes. Focus on identifying major trends, patterns, or overarching messages that would be valuable in creating engaging blog posts. Make sure the identified topics provide deep insights that resonate with broader themes and offer readers practical takeaways."
        },
        {
            "role": "user",
            "content": f"Analyze the following podcast transcript and abstract the discussion to identify 2-3 key topics. Look for bigger-picture themes that go beyond the specifics of the conversation. Ensure that each topic is explained clearly, includes key points, actionable strategies, and relevant quotes. Focus on providing a higher-level understanding that would support writing compelling blog posts. The transcript to process is: {full_transcript}. Use this structure to guide the extraction: {structure_template}. DO NOT INVENT THINGS! STICK TO THE TRANSCRIPT!"
        }
    ]
)

# Extract and print the response
chat_completion = response['message']['content']
print(chat_completion)

# Convert to structured data
structure_extract = json.loads(chat_completion)
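A full podcast transcript can be longer than the context window Ollama uses by default, in which case part of the prompt may be silently dropped. Ollama accepts a `num_ctx` option to enlarge the window. The sketch below picks a value from a rough character-based estimate; the 4-characters-per-token ratio, the reply budget, and the 32768 cap are assumptions for illustration, not measured values for `qwen2.5`:

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate; a rule of thumb, not a real tokenizer."""
    return len(text) // chars_per_token + 1


def suggest_num_ctx(prompt_text, reply_budget=2048, limit=32768):
    """Smallest power-of-two context size that fits prompt plus reply, capped at limit."""
    needed = estimate_tokens(prompt_text) + reply_budget
    size = 2048
    while size < needed and size < limit:
        size *= 2
    return size


# Example with a placeholder string standing in for a transcript
print(suggest_num_ctx("word " * 10000))
```

The result could be passed alongside the temperature, for example `options=dict(temperature=0.3, num_ctx=suggest_num_ctx(full_transcript))`. Larger context sizes use more GPU memory, so the cap may need lowering on smaller runtimes.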

### Step 9: Generate blog posts with Ollama

Let's use the analysis results to generate blog posts for each key topic.

In [None]:
# Helper functions

def format_structure_name(topic):
    return topic['name'].replace(" ", "_").lower()

# Convert the topic dict to plain text so the model is not confused by raw JSON input
def format_structure(data):
    result = f"Title: {data['name']}\n"
    result += f"Description: {data['description']}\n\n"

    result += "Key Points:\n"
    for point in data['keyPoints']:
        result += f" - {point}\n"

    result += f"\nActionable Strategy: {data['actionableStrategy']['name']}\n"
    result += "Steps:\n"
    for step in data['actionableStrategy']['steps']:
        result += f"  * {step}\n"

    result += f"\nRelated Quote: \"{data['relatedQuote']['text']}\"\n"
    result += f"Context: {data['relatedQuote']['context']}\n"

    return result
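Topic names come from the model, so they may contain characters such as `/`, `:`, or `?` that are awkward or invalid in filenames; replacing only spaces does not cover those cases. A minimal sketch of a stricter variant, where `safe_filename_part` is a hypothetical helper and not part of the original notebook:

```python
import re


def safe_filename_part(name, max_length=60):
    """Lowercase the name and collapse every run of non-alphanumerics into '_'."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    # Trim long titles and fall back to a placeholder for empty results
    return slug[:max_length].strip("_") or "untitled"


print(safe_filename_part("Deep Work"))
print(safe_filename_part("AI/ML: What's Next?"))
```

Note that this sketch also replaces non-ASCII letters with underscores, which may or may not be desirable depending on the podcast language.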

In [None]:
# Iterate over each theme using tqdm for progress tracking
for index, topic in enumerate(tqdm(structure_extract['keyTopics']), 1):
    # Using the theme name formatting function
    theme_name = format_structure_name(topic)

    # Convert the topic dict into plain text for the prompt
    theme = format_structure(topic)

    # Create the Ollama chat call for each theme
    response = ollama.chat(
        model='qwen2.5',
        options=dict(temperature=0.3),
        messages=[
            {
                "role": "system",
                "content": "You are a lifestyle blogger writing an article based on a provided podcast transcript. Your task is to create a blog post that flows naturally, avoids being overly segmented, and feels organic while strictly adhering to the content in the transcript. Avoid bullet points and subsections unless absolutely necessary. The blog post should feel like a cohesive narrative that smoothly incorporates ideas from the transcript. No external tools, frameworks, or additional sources should be referenced, and all ideas must come from the transcript. The tone should be professional but engaging, without falling into repetitive or mechanical language."
            },
            {
                "role": "user",
                "content": f"The transcript for reference is: {full_transcript}. ### YOUR JOB ### Write a blog post based on the following theme: {theme}. Only use content from the transcript—do not add anything not mentioned. The post should read naturally, with a smooth narrative flow, staying professional and informative while adhering strictly to the transcript content. Write enough content to fill one A4 page."
            }
        ]
    )

    # Extract the blog post content from the response
    chat_completion = response['message']['content']

    # Create a filename based on index and formatted theme name
    filename = f"{index}_{theme_name}.md"

    # Ensure output directory exists
    output_dir = "output_folder"
    os.makedirs(output_dir, exist_ok=True)

    # Write the blog post content to a markdown file
    with open(os.path.join(output_dir, filename), 'w') as f:
        f.write(chat_completion)

    print(f"Saved: {filename}")

## Part 4: Using a More Powerful Model with Together API

### Step 10: Set up Together API

Now, let's use a more powerful model through the Together API.

In [None]:
from openai import OpenAI
from google.colab import userdata

# Setup OpenAI client with custom API key and base URL
TOGETHER_API_KEY = userdata.get('TOGETHER_API_KEY')

client = OpenAI(
    base_url="https://api.together.xyz/v1",
    api_key=TOGETHER_API_KEY
)
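The `google.colab` import only works inside Colab. As a sketch for running the notebook elsewhere, the key could be looked up in Colab secrets first and then in an environment variable; `get_api_key` is a hypothetical helper, and the fallback to `os.environ` is an assumption about how the key is stored locally:

```python
import os


def get_api_key(name="TOGETHER_API_KEY"):
    """Look up an API key in Colab secrets, then in environment variables."""
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            key = userdata.get(name)
            if key:
                return key
        except Exception:
            # Secret missing or notebook access not granted: fall through
            pass

    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} not found in Colab secrets or environment")
    return key
```

With this helper, the client setup would read `api_key=get_api_key()` and fail early with a clear message when no key is configured.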

### Step 11: Analyze transcript with Together API

We'll use the Together API to perform a more sophisticated analysis of the transcript.

In [None]:
# Set up the OpenAI chat call
chat_completion = client.chat.completions.create(
    model='meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo',
    messages=[
        {
            "role": "system",
            "content": "You are an AI model tasked with assisting in analyzing podcast transcripts to extract a maximum of 3 central discussions for writing blog posts. In addition to extracting key topics, abstract the discussion to capture the bigger picture and underlying themes. Focus on identifying major trends, patterns, or overarching messages that would be valuable in creating engaging blog posts. Make sure the identified topics provide deep insights that resonate with broader themes and offer readers practical takeaways."
        },
        {
            "role": "user",
            "content": f"Analyze the following podcast transcript and abstract the discussion to identify 2-3 key topics. Look for bigger-picture themes that go beyond the specifics of the conversation. Ensure that each topic is explained clearly, includes key points, actionable strategies, and relevant quotes. Focus on providing a higher-level understanding that would support writing compelling blog posts. The transcript to process is: {full_transcript}. Use this structure to guide the extraction: {structure_template}. DO NOT INVENT THINGS! STICK TO THE TRANSCRIPT!"
        }
    ],
    temperature=0.3
)

# Extract and print the response
extracted_output = json.loads(chat_completion.choices[0].message.content)
print(json.dumps(extracted_output, ensure_ascii=False, indent=2))
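Unlike the Ollama call, this request does not force JSON output, so the model may wrap its answer in a Markdown code fence or add a sentence before it, and `json.loads` would then fail. A minimal, deliberately simple sketch of a more tolerant parser (`extract_json` is a hypothetical helper; it assumes the reply contains exactly one top-level JSON object):

```python
import json


def extract_json(text):
    """Parse a JSON object from model output that may include extra text or fences."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass  # Fall back to slicing out the outermost {...} span

    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start:end + 1])


wrapped = 'Here is the analysis:\n```json\n{"keyTopics": []}\n```'
print(extract_json(wrapped))
```

In the cell above, `json.loads(chat_completion.choices[0].message.content)` could be swapped for `extract_json(...)` with the same argument. Replies that are not valid JSON even after slicing still raise an error.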

### Step 12: Generate blog posts with Together API

Finally, we'll use the Together API to generate more sophisticated blog posts.

In [None]:
# Iterate over each theme using tqdm for progress tracking
for index, topic in enumerate(tqdm(extracted_output['keyTopics']), 1):
    # Using the theme name formatting function
    theme_name = format_structure_name(topic)

    # Using the format function for the JSON
    theme = format_structure(topic)

    # Set up the OpenAI chat call
    chat_completion = client.chat.completions.create(
        model='meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo',
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a lifestyle blogger writing an article based on a provided "
                    "podcast transcript. Your task is to create a blog post that flows naturally, "
                    "avoids being overly segmented, and feels organic while strictly adhering to the "
                    "content in the transcript. Avoid bullet points and too many subsections unless absolutely necessary. "
                    "The blog post should feel like a cohesive narrative that smoothly incorporates ideas from the transcript. "
                    "No external tools, frameworks, or additional sources should be referenced, and all ideas must come from the transcript. "
                    "The tone should be professional but engaging, without falling into repetitive or mechanical language. Use markdown in your output."
                )
            },
            {
                "role": "user",
                "content": (
                    f"The transcript for reference is: {full_transcript}. ### YOUR JOB ### "
                    f"Write a blog post based on the following theme: {theme}. Only use content from the transcript—"
                    "do not add anything not mentioned. The post should read naturally, with a smooth narrative flow, staying professional "
                    "and informative while adhering strictly to the transcript content. Write enough content to fill 2 A4 pages."
                )
            }
        ],
        temperature=0.3
    )

    # Extract the blog post content from the response
    chat_completion = chat_completion.choices[0].message.content

    # Create a filename based on index and formatted theme name
    filename = f"{index}_{theme_name}.md"

    # Ensure output directory exists
    output_dir = "output_folder_together"
    os.makedirs(output_dir, exist_ok=True)

    # Write the blog post content to a markdown file
    with open(os.path.join(output_dir, filename), 'w') as f:
        f.write(chat_completion)

    print(f"Saved: {filename}")

## Conclusion

In this tutorial, we've learned how to:
1. Download and transcribe a podcast
2. Set up and use Ollama for local AI model inference
3. Analyze podcast transcripts and generate blog posts using Ollama
4. Use the Together API for more sophisticated analysis and blog post generation

This workflow shows how AI can be applied to content analysis and creation, and how models of different sizes yield different levels of sophistication in the results.

## Bonus: Image generation

At the time of writing, Together is offering FLUX.1 schnell for free for the next 3 months, so let's use it to generate an illustration for a blog post.

In [None]:
response = client.images.generate(
    prompt="""An illustration of a focused individual at a retro wooden desk, deep in concentration with an open book and typewriter. The office has a mid-century modern design, with muted earth tones and sleek lines. The background includes a rotary phone, a simple clock, and a large window showing a clear, serene landscape outside. A clean, organized space, free of distractions, emphasizes the individual’s intense focus. The text ‘Deep Work’ appears in bold, minimalist 1960s typography, evoking a sense of productivity and efficiency from a bygone era.""",
    model="black-forest-labs/FLUX.1-schnell",
    size="1024x768",
    n=1,
    response_format="b64_json"
)
# The payload is a long base64 string, so print only its size
print(f"Received {len(response.data[0].b64_json)} base64 characters")

In [None]:
import base64
from io import BytesIO
from PIL import Image
import matplotlib.pyplot as plt

# Decode the base64 string
image_data = base64.b64decode(response.data[0].b64_json)

# Load the image from the decoded bytes
image = Image.open(BytesIO(image_data))

# Display the image
plt.imshow(image)
plt.axis('off')  # No axis for better display
plt.show()
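Displaying the image does not keep it once the runtime ends. As a sketch, the decoded bytes could also be written to disk next to the generated posts; `save_b64_image` is a hypothetical helper, and the demo uses placeholder bytes rather than a real API response:

```python
import base64
import tempfile
from pathlib import Path


def save_b64_image(b64_string, path):
    """Decode a base64 payload and write the raw bytes to path; return the byte count."""
    data = base64.b64decode(b64_string)
    Path(path).write_bytes(data)
    return len(data)


# Demo with placeholder bytes instead of a real image payload
demo_b64 = base64.b64encode(b"not a real image").decode("ascii")
demo_path = Path(tempfile.gettempdir()) / "demo_image.bin"
written = save_b64_image(demo_b64, demo_path)
print(f"Wrote {written} bytes to {demo_path}")
```

For the real response, the call would look like `save_b64_image(response.data[0].b64_json, "deep_work.png")`, where the filename and extension are assumptions; the actual image format depends on what the API returns.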

In [None]:
few_img_out = """ Depict an individual at a vintage wooden desk, purposefully ignoring distractions like a rotary phone ringing or an open newspaper. The workspace should have mid-century modern design elements with muted earth tones and sleek furniture, creating a sense of clarity and focus. In the background, show a simple clock ticking away, with an organized, distraction-free environment. Include the text ‘Avoiding Shallow Work’ in bold, minimalist 1960s typography, reinforcing the idea of staying focused on meaningful tasks rather than being pulled into low-value activities. """

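Next, we use few-shot prompting with Ollama to generate image prompts: the conversation includes one worked example (a topic as JSON, followed by the image prompt above) and then a new topic for the model to complete.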

In [None]:
# Set up the Ollama chat call for few-shot generation of prompts for the image model...
response = ollama.chat(
    model='qwen2.5',
    options=dict(temperature=0.3),
    messages=[
        {
            "role": "system",
            "content": "You help generate useful image-generation prompts from the data provided."
        },
        {
            "role": "user",
            "content": """{
      "name": "Avoiding Shallow Work",
      "description": "Newport highlights the detrimental effects of shallow work on productivity and creativity. He advises against engaging in low-value, easily distractible tasks that can be done at any time, thus reducing overall focus and efficiency.",
      "keyPoints": [
        "Shallow work includes tasks like constant email checking or social media browsing, which are easy to do but often unproductive.",
        "Avoiding shallow work is essential for maintaining a clear mind and focusing on more important tasks.",
        "Newport suggests setting boundaries around when you engage in shallow activities."
      ],
      "actionableStrategy": {
        "name": "The Shallow Work Ban",
        "steps": [
          "Identify specific times during the day when you will not check emails or social media.",
          "Use tools like website blockers to prevent access to distracting websites during your deep work sessions.",
          "Set a clear intention that these times are for shallow work only, and resist the urge to engage in deeper tasks."
        ]
      },
      "relatedQuote": {
        "text": "\"The key is to create a schedule that allows you to focus on cognitively demanding tasks without distraction, which I call 'deep work.'\"",
        "context": "Cal Newport emphasizes the importance of avoiding shallow work and focusing on deep work in his conversation."
      }
    }"""
        },
        {
            "role": "assistant",
            "content": few_img_out
        },
        {
            "role": "user",
            "content": """{
      "name": "Deep Work",
      "description": "Deep work refers to the ability to focus without distraction on a cognitively demanding task. It's a skill that allows individuals to produce high-quality work and is essential for making progress in various fields.",
      "keyPoints": [
        "Deep work is necessary for producing high-quality work and making progress in various fields.",
        "It requires the ability to focus without distraction on a cognitively demanding task.",
        "Deep work is not the same as flow, which is a state of performance rather than a state of practice."
      ],
      "actionableStrategy": {
        "name": "Implementing Deep Work",
        "steps": [
          "Schedule deep work sessions in your calendar.",
          "Eliminate distractions during deep work sessions.",
          "Use tools like website blockers or phone apps to help you stay focused."
        ]
      }
    }"""
        }
    ]
)

# Extract the response (displayed in the next cell)
chat_completion = response['message']['content']

In [None]:
chat_completion
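The few-shot conversation above is written out by hand. A minimal sketch of building the same message pattern from data, so that more examples or the topics extracted earlier can be plugged in; `build_few_shot_messages` is a hypothetical helper, and the short dicts below are placeholders rather than real analysis output:

```python
import json


def build_few_shot_messages(system_prompt, examples, query):
    """Build chat messages: system prompt, (input, output) example pairs, then the query.

    examples is a list of (input_dict, expected_output_str) pairs;
    query is the new input dict the model should answer.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": json.dumps(example_input, indent=2)})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": json.dumps(query, indent=2)})
    return messages


messages = build_few_shot_messages(
    "You help generate useful image-generation prompts from the data provided.",
    [({"name": "Avoiding Shallow Work"}, "Depict an individual at a vintage wooden desk...")],
    {"name": "Deep Work"},
)
print([m["role"] for m in messages])
```

The returned list has the shape expected by the `messages` argument of `ollama.chat`, so looping over the extracted key topics would yield one image prompt per topic.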