# Lesson 3: Generating Video Summaries with OpenAI API


Welcome back! In the previous lessons, we transcribed videos with the Whisper API and downloaded them from Google Drive and LinkedIn. Building on those skills, this lesson covers generating video summaries: turning lengthy transcriptions into concise, insightful content. We'll use OpenAI's API to create detailed yet succinct summaries on the fly.

## What You'll Learn
Today, you will:
- Understand how to generate a summary from a transcription.
- Explore the use of OpenAI's API to craft well-structured summaries.
- Gain insights into designing effective prompts for better results.
- Learn about system and user roles in OpenAI API requests and their importance.

## Understanding Video Summary Generation
Summarizing transcriptions involves distilling the core messages from extensive spoken content, ensuring key points are retained while unnecessary details are filtered out. When dealing with long videos or lectures, extracting the main themes allows you to quickly grasp the essentials without listening to every word. The OpenAI API facilitates this by leveraging advanced language models capable of understanding context and summarizing long texts.

In our context, the OpenAI API will help us convert raw transcriptions into coherent summaries. Here's the system prompt we send to the API to achieve this:

```python
{
    "role": "system",
    "content": (
        "You are an expert content analyst and summarizer with these capabilities:\n"
        "- Extracting key points while maintaining context\n"
        "- Identifying main themes and core messages\n"
        "- Preserving critical details while reducing length\n"
        "- Maintaining the original tone and intent\n"
        "- Organizing information hierarchically\n\n"
        "Format your summaries with:\n"
        "1. A one-sentence overview\n"
        "2. 2-3 key takeaways\n"
        "3. Important details or quotes (if any)"
    )
}
```

The system prompt sets the stage by telling the model what role to play and what output format to produce, so every summary follows the same structure.

Let's also examine the user prompt, which specifies the task at hand:

```python
{
    "role": "user",
    "content": (
        "Create a structured summary of this transcription. "
        "Focus on the core message and key points while maintaining "
        "context and critical details.\n\n"
        # `text` holds the transcription produced in the previous lessons
        f"Transcription:\n{text}"
    )
}
```

The user prompt defines the specific requirements, such as the transcription text that needs summarization, ensuring the model understands and processes the content correctly.
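
To see how the two roles fit together, the system and user messages can be assembled by a small helper before they are sent to the API. The following is a minimal sketch, not the lesson's implementation: `SYSTEM_PROMPT` and `build_summary_messages` are hypothetical names introduced for illustration, and the system prompt is shortened from the one shown above.

```python
# Hypothetical helper; these names are illustrative, not part of the OpenAI SDK.
SYSTEM_PROMPT = (
    "You are an expert content analyst and summarizer.\n"
    "Format your summaries with:\n"
    "1. A one-sentence overview\n"
    "2. 2-3 key takeaways\n"
    "3. Important details or quotes (if any)"
)


def build_summary_messages(text):
    """Return the system and user messages for a summarization request."""
    user_prompt = (
        "Create a structured summary of this transcription. "
        "Focus on the core message and key points while maintaining "
        "context and critical details.\n\n"
        f"Transcription:\n{text}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},  # behavior for the whole conversation
        {"role": "user", "content": user_prompt},      # the single request to answer
    ]


messages = build_summary_messages("Hello and welcome to the lecture.")
print([m["role"] for m in messages])
```

Keeping the prompts in one place like this makes it easier to adjust the wording later without touching the API call itself.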

## Examples and Step-by-Step Explanations
Let's walk through summarizing a transcription using OpenAI's API. The code builds on the service class from the prior lessons and adds a method for summarizing transcribed video content. Here's the relevant part of the implementation:

```python
from openai import OpenAI

class MediaProcessorService:
    def __init__(self):
        self.client = OpenAI()

    def summarize_transcription(self, text):
        try:
            response = self.client.chat.completions.create(
                model="gpt-4o",
                messages=[
                    {
                        "role": "system",
                        "content": (
                            "You are an expert content analyst and summarizer with these capabilities:\n"
                            "- Extracting key points while maintaining context\n"
                            "- Identifying main themes and core messages\n"
                            "- Preserving critical details while reducing length\n"
                            "- Maintaining the original tone and intent\n"
                            "- Organizing information hierarchically\n\n"
                            "Format your summaries with:\n"
                            "1. A one-sentence overview\n"
                            "2. 2-3 key takeaways\n"
                            "3. Important details or quotes (if any)"
                        )
                    },
                    {
                        "role": "user",
                        "content": (
                            f"Create a structured summary of this transcription. "
                            f"Focus on the core message and key points while maintaining "
                            f"context and critical details.\n\n"
                            f"Transcription:\n{text}"
                        )
                    }
                ]
            )
            return response.choices[0].message.content
        except Exception as e:
            print(f"Error generating summary: {e}")
            return None
```

### Here’s how the process flows:
1. **Initialization**: `OpenAI()` is called without arguments, so the client reads its API key from the `OPENAI_API_KEY` environment variable.
2. **Transcription generation**: Using the flow covered in previous lessons, we generate a text transcription of the given video.
3. **Summarization**: The `summarize_transcription` method sends the text to OpenAI's GPT-4o model, with system and user prompts that guide the summary's structure and level of detail.
4. **Output**: The returned summary is a concise version of the transcript, highlighting the content's main points.
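
The flow above can be tried without network access by passing the client in as a parameter. This is a sketch under stated assumptions: `summarize_with_client` and `FakeCompletions` are hypothetical names, the prompts are shortened, and the fake client only imitates the shape of the response (`choices[0].message.content`) that the lesson's code reads. With a real key configured, you would pass `OpenAI()` instead of the fake.

```python
from types import SimpleNamespace


def summarize_with_client(client, text, model="gpt-4o"):
    """Send a transcription to a chat-completions client and return the summary."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are an expert summarizer."},
            {"role": "user", "content": f"Summarize this transcription:\n{text}"},
        ],
    )
    return response.choices[0].message.content


class FakeCompletions:
    """Stand-in for client.chat.completions that makes no network call."""

    def create(self, model, messages):
        # Report the size of the user message instead of calling a model.
        reply = f"[{model}] summary of {len(messages[-1]['content'])} characters"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))
result = summarize_with_client(fake_client, "Welcome to the lesson.")
print(result)
```

Injecting the client this way is a design choice for testing; the lesson's `MediaProcessorService` creates its own client in `__init__`.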

## Lesson Summary
This lesson highlighted the importance of efficiently distilling information from video transcriptions using the OpenAI API. By converting lengthy transcripts into concise summaries, the process facilitates quick decision-making and enhances comprehension across various fields like education and business analytics. Mastering video summarization enables you to create accessible, information-rich content while eliminating unnecessary details. By transforming raw data into actionable insights, you bridge a critical gap in information synthesis. You can now apply these concepts in practical exercises, reinforcing your skills through hands-on coding tasks.


## Practice Chat Completion Endpoint

In this task, we'll use OpenAI's gpt-4o to summarize a short story by Ambrose Bierce. The text describes a supernatural event in a cemetery, where a supposedly dead man awakens during an attempted grave robbery.

Your task here is to set up a proper system and user prompt to generate a summary for this story. Remember - the system prompt includes a general context on how the model should behave over the whole conversation, and the user prompt is just a single query to answer.


```python
from openai import OpenAI

# The text to summarize - a short story about Henry Armstrong
text = """The fact that Henry Armstrong was buried did not seem to him to prove that he was dead: he had always been a hard man to convince. That he really was buried, the testimony of his senses compelled him to admit. His posture -- flat upon his back, with his hands crossed upon his stomach and tied with something that he easily broke without profitably altering the situation -- the strict confinement of his entire person, the black darkness and profound silence, made a body of evidence impossible to controvert and he accepted it without cavil.

But dead -- no; he was only very, very ill. He had, withal, the invalid's apathy and did not greatly concern himself about the uncommon fate that had been allotted to him. No philosopher was he -- just a plain, commonplace person gifted, for the time being, with a pathological indifference: the organ that he feared consequences with was torpid. So, with no particular apprehension for his immediate future, he fell asleep and all was peace with Henry Armstrong.

But something was going on overhead. It was a dark summer night, shot through with infrequent shimmers of lightning silently firing a cloud lying low in the west and portending a storm. These brief, stammering illuminations brought out with ghastly distinctness the monuments and headstones of the cemetery and seemed to set them dancing. It was not a night in which any credible witness was likely to be straying about a cemetery, so the three men who were there, digging into the grave of Henry Armstrong, felt reasonably secure.

Two of them were young students from a medical college a few miles away; the third was a gigantic negro known as Jess. For many years Jess had been employed about the cemetery as a man-of-all-work and it was his favourite pleasantry that he knew 'every soul in the place.' From the nature of what he was now doing it was inferable that the place was not so populous as its register may have shown it to be.

Outside the wall, at the part of the grounds farthest from the public road, were a horse and a light wagon, waiting.

The work of excavation was not difficult: the earth with which the grave had been loosely filled a few hours before offered little resistance and was soon thrown out. Removal of the casket from its box was less easy, but it was taken out, for it was a perquisite of Jess, who carefully unscrewed the cover and laid it aside, exposing the body in black trousers and white shirt. At that instant the air sprang to flame, a cracking shock of thunder shook the stunned world and Henry Armstrong tranquilly sat up. With inarticulate cries the men fled in terror, each in a different direction. For nothing on earth could two of them have been persuaded to return. But Jess was of another breed.

In the grey of the morning the two students, pallid and haggard from anxiety and with the terror of their adventure still beating tumultuously in their blood, met at the medical college.

'You saw it?' cried one.

'God! yes -- what are we to do?'

They went around to the rear of the building, where they saw a horse, attached to a light wagon, hitched to a gatepost near the door of the dissecting-room. Mechanically they entered the room. On a bench in the obscurity sat the negro Jess. He rose, grinning, all eyes and teeth.

'I'm waiting for my pay,' he said.

Stretched naked on a long table lay the body of Henry Armstrong, the head defiled with blood and clay from a blow with a spade."""

# Initialize OpenAI client
client = OpenAI()

def summarize_story():
    # Make an API call to GPT-4o
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "..."},
                {"role": "user", "content": "..."}
            ]
        )
        
        # Print the response
        print(response.choices[0].message.content)
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    summarize_story()


```

To solve the task of generating a summary for the short story, we need to configure both the **system prompt** and **user prompt** in a way that provides context and sets the correct tone for the model to generate the summary. Here is how you can structure the code with proper prompts.

### Steps:
1. **System Prompt**: This will tell the model how to behave throughout the conversation. For example, it should understand that it needs to summarize a given story in a concise manner.
2. **User Prompt**: This is the specific query from the user asking the model to summarize the story.

Here is the updated code with appropriate system and user prompts:

```python
from openai import OpenAI

# The text to summarize - a short story about Henry Armstrong
text = """The fact that Henry Armstrong was buried did not seem to him to prove that he was dead: he had always been a hard man to convince. That he really was buried, the testimony of his senses compelled him to admit. His posture -- flat upon his back, with his hands crossed upon his stomach and tied with something that he easily broke without profitably altering the situation -- the strict confinement of his entire person, the black darkness and profound silence, made a body of evidence impossible to controvert and he accepted it without cavil.

But dead -- no; he was only very, very ill. He had, withal, the invalid's apathy and did not greatly concern himself about the uncommon fate that had been allotted to him. No philosopher was he -- just a plain, commonplace person gifted, for the time being, with a pathological indifference: the organ that he feared consequences with was torpid. So, with no particular apprehension for his immediate future, he fell asleep and all was peace with Henry Armstrong.

But something was going on overhead. It was a dark summer night, shot through with infrequent shimmers of lightning silently firing a cloud lying low in the west and portending a storm. These brief, stammering illuminations brought out with ghastly distinctness the monuments and headstones of the cemetery and seemed to set them dancing. It was not a night in which any credible witness was likely to be straying about a cemetery, so the three men who were there, digging into the grave of Henry Armstrong, felt reasonably secure.

Two of them were young students from a medical college a few miles away; the third was a gigantic negro known as Jess. For many years Jess had been employed about the cemetery as a man-of-all-work and it was his favourite pleasantry that he knew 'every soul in the place.' From the nature of what he was now doing it was inferable that the place was not so populous as its register may have shown it to be.

Outside the wall, at the part of the grounds farthest from the public road, were a horse and a light wagon, waiting.

The work of excavation was not difficult: the earth with which the grave had been loosely filled a few hours before offered little resistance and was soon thrown out. Removal of the casket from its box was less easy, but it was taken out, for it was a perquisite of Jess, who carefully unscrewed the cover and laid it aside, exposing the body in black trousers and white shirt. At that instant the air sprang to flame, a cracking shock of thunder shook the stunned world and Henry Armstrong tranquilly sat up. With inarticulate cries the men fled in terror, each in a different direction. For nothing on earth could two of them have been persuaded to return. But Jess was of another breed.

In the grey of the morning the two students, pallid and haggard from anxiety and with the terror of their adventure still beating tumultuously in their blood, met at the medical college.

'You saw it?' cried one.

'God! yes -- what are we to do?'

They went around to the rear of the building, where they saw a horse, attached to a light wagon, hitched to a gatepost near the door of the dissecting-room. Mechanically they entered the room. On a bench in the obscurity sat the negro Jess. He rose, grinning, all eyes and teeth.

'I'm waiting for my pay,' he said.

Stretched naked on a long table lay the body of Henry Armstrong, the head defiled with blood and clay from a blow with a spade."""

# Initialize OpenAI client
client = OpenAI()

def summarize_story():
    # Make an API call to GPT-4o
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "You are a helpful assistant. Your task is to summarize texts in a clear and concise manner, focusing on the key points and eliminating unnecessary details."},
                # Pass the story itself; without it the model has nothing to summarize
                {"role": "user", "content": f"Please summarize the following short story by Ambrose Bierce about Henry Armstrong. The story is about a supernatural event where a man who was thought to be dead awakens during a grave robbery.\n\nStory:\n{text}"}
            ]
        )
        
        # Print the response
        print(response.choices[0].message.content)
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    summarize_story()
```

### Key Changes:
1. **System Prompt**:
   - The system prompt now specifies that the model should act as a "helpful assistant" and summarize texts in a **clear and concise** manner, focusing on **key points**.

2. **User Prompt**:
   - The user prompt requests the model to summarize the short story by **Ambrose Bierce** and gives a brief description of the story about the supernatural event during a grave robbery.

### Expected Output:
When you run this code, the model should provide a summary that captures the essential events of the story, highlighting the moment when Henry Armstrong sits up in his coffin during the grave robbery, without delving into unnecessary details.


## Apply Prompt Engineering for Transcription Summary Generation

Now, let's review our summarization mechanism in practice! The app uses the system and user prompts from the lesson, so let's see how they perform on real content. You can interact with the app in the preview tab, and don't forget to check the terminal tab for more details.

```python
import subprocess
import math
import os
import tempfile
import time
from openai import OpenAI


class MediaProcessorService:
    def __init__(self):
        self.client = OpenAI()

    def summarize_transcription(self, text):
        """Generate a concise summary of the transcription"""
        try:
            response = self.client.chat.completions.create(
                model="gpt-4o",
                messages=[
                    {
                        "role": "system",
                        "content": (
                            "You are an expert content analyst and summarizer with these capabilities:\n"
                            "- Extracting key points while maintaining context\n"
                            "- Identifying main themes and core messages\n"
                            "- Preserving critical details while reducing length\n"
                            "- Maintaining the original tone and intent\n"
                            "- Organizing information hierarchically\n\n"
                            "Format your summaries with:\n"
                            "1. A one-sentence overview\n"
                            "2. 2-3 key takeaways\n"
                            "3. Important details or quotes (if any)"
                        )
                    },
                    {
                        "role": "user",
                        "content": (
                            f"Create a structured summary of this transcription. "
                            f"Focus on the core message and key points while maintaining "
                            f"context and critical details.\n\n"
                            f"Transcription:\n{text}"
                        )
                    }
                ]
            )
            return response.choices[0].message.content
        except Exception as e:
            print(f"Error generating summary: {e}")
            return None

    def run_command_with_output(self, cmd, desc=None):
        """Run a command and stream its output in real-time"""
        if desc:
            print(f"\n{desc}")
        
        process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            bufsize=1
        )
        
        output = []
        for line in iter(process.stdout.readline, ''):
            print(line, end='')
            output.append(line)
        
        process.stdout.close()
        return_code = process.wait()
        
        if return_code != 0:
            raise subprocess.CalledProcessError(return_code, cmd)
        
        return ''.join(output)

    def get_audio_duration(self, file_path):
        """Get the duration of an audio file using ffprobe"""
        cmd = [
            'ffprobe', 
            '-v', 'quiet',
            '-show_entries', 'format=duration',
            '-of', 'default=noprint_wrappers=1:nokey=1',
            file_path
        ]
        try:
            output = subprocess.check_output(cmd)
            return float(output)
        except Exception:
            return None

    def split_audio(self, file_path, chunk_size_mb=20):
        """Split audio file into chunks smaller than the API limit"""
        print("\nSplitting audio into chunks...")
        
        MAX_CHUNK_SIZE = 25 * 1024 * 1024  # 25MB in bytes
        MAX_MEDIA_DURATION_SECONDS = 40 * 60  # 40 minutes
        file_size = os.path.getsize(file_path)
        duration = self.get_audio_duration(file_path)
        
        if not duration:
            raise Exception("Could not determine audio duration")
            
        if duration > MAX_MEDIA_DURATION_SECONDS:
            raise Exception(
                "Sorry, your video is too long. "
                "To avoid extensive waiting times, "
                "for this demo application we're only transcribing videos up to 40 minutes long"
            )
        
        chunk_duration = duration * (chunk_size_mb * 1024 * 1024) / file_size
        original_ext = os.path.splitext(file_path)[1]
        chunks = []
        start_time = 0.0
        
        # Advance by start time instead of a fixed chunk count, so an oversized
        # chunk is re-extracted with a shorter duration rather than skipped.
        while start_time < duration:
            temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=original_ext)
            temp_file_path = temp_file.name
            temp_file.close()
            
            cmd = [
                'ffmpeg',
                '-i', file_path,
                '-ss', str(start_time),
                '-t', str(chunk_duration),
                '-c', 'copy',
                '-y',
                temp_file_path
            ]
            
            chunk_number = len(chunks) + 1
            self.run_command_with_output(cmd, f"Extracting chunk {chunk_number}:")
            time.sleep(0.5)
            
            chunk_size = os.path.getsize(temp_file_path)
            if chunk_size > MAX_CHUNK_SIZE:
                print(f"Chunk {chunk_number} too large ({chunk_size/1024/1024:.1f}MB), reducing duration...")
                try:
                    os.unlink(temp_file_path)
                except Exception as e:
                    print(f"Warning: Could not delete oversized chunk: {e}")
                chunk_duration *= 0.8
                continue
            
            chunks.append(temp_file_path)
            start_time += chunk_duration
        
        return chunks

    def transcribe_audio(self, audio_file):
        """Transcribe an audio file to text, handling files larger than the API limit"""
        try:
            file_size = os.path.getsize(audio_file)
            max_size = 25 * 1024 * 1024  # 25MB in bytes
            
            if file_size > max_size:
                print(f"\nFile size ({file_size / 1024 / 1024:.2f}MB) exceeds API limit. Splitting into chunks...")
                chunks = self.split_audio(audio_file)
                
                if not chunks:
                    raise Exception("Failed to split audio file into chunks")
                    
                full_transcription = []
                
                for i, chunk_path in enumerate(chunks, 1):
                    max_retries = 3
                    retry_count = 0
                    
                    while retry_count < max_retries:
                        try:
                            print(f"\nTranscribing chunk {i} of {len(chunks)}...")
                            # Use a distinct name so the audio_file parameter is not shadowed
                            with open(chunk_path, "rb") as chunk_file:
                                response = self.client.audio.transcriptions.create(
                                    model="whisper-1",
                                    file=chunk_file,
                                    timeout=60
                                )
                                full_transcription.append(response.text)
                                break
                        except Exception as e:
                            retry_count += 1
                            print(f"Error on chunk {i} (attempt {retry_count}): {str(e)}")
                            if retry_count == max_retries:
                                print(f"Failed to transcribe chunk {i} after {max_retries} attempts")
                                raise
                            print("Retrying in 5 seconds...")
                            time.sleep(5)
                    
                    try:
                        os.unlink(chunk_path)
                    except Exception as e:
                        print(f"Warning: Could not delete temporary file {chunk_path}: {e}")
                
                return ' '.join(full_transcription)
            else:
                print("\nTranscribing audio...")
                # Use a distinct name so the audio_file parameter is not shadowed
                with open(audio_file, "rb") as audio_handle:
                    response = self.client.audio.transcriptions.create(
                        model="whisper-1",
                        file=audio_handle,
                        timeout=60
                    )
                    return response.text
        except Exception as e:
            print(f"Error during transcription: {e}")
            print(f"Error type: {type(e)}")
            import traceback
            print(f"Traceback: {traceback.format_exc()}")
            return None

    def cleanup_temp_files(self, file_path):
        """Clean up temporary files and directories"""
        try:
            if os.path.isfile(file_path):
                for _ in range(5):  # Try up to 5 times
                    try:
                        os.unlink(file_path)
                        break
                    except PermissionError:
                        time.sleep(1)
                    except Exception as e:
                        print(f"Warning: Could not clean up {file_path}: {e}")
                        break
            elif os.path.isdir(file_path):
                for root, dirs, files in os.walk(file_path, topdown=False):
                    for name in files:
                        try:
                            os.unlink(os.path.join(root, name))
                        except Exception as e:
                            print(f"Warning: Could not clean up file {name}: {e}")
                    for name in dirs:
                        try:
                            os.rmdir(os.path.join(root, name))
                        except Exception as e:
                            print(f"Warning: Could not clean up directory {name}: {e}")
                try:
                    os.rmdir(file_path)
                except Exception as e:
                    print(f"Warning: Could not clean up directory {file_path}: {e}")
        except Exception as e:
            print(f"Warning: Could not clean up {file_path}: {e}")
```

The code above sets up a complete pipeline for processing audio files, transcribing them, and summarizing the transcription text using OpenAI's GPT-4o model. Let's review how the **summarization** mechanism is integrated into the process.

### Review Breakdown:

1. **Class `MediaProcessorService`**:
   - This class handles multiple tasks related to media files, such as:
     - **Transcribing audio**: The method `transcribe_audio` takes care of large audio files by splitting them into chunks and transcribing them sequentially. It uses OpenAI's Whisper model for transcription.
     - **Audio file splitting**: If the audio file exceeds the 25MB limit for API calls, it splits the file into smaller chunks using `ffmpeg`.
     - **Audio duration**: The method `get_audio_duration` retrieves the duration of the audio file using `ffprobe`.
     - **Cleanup**: After processing, temporary files are cleaned up using the `cleanup_temp_files` method.

2. **Summarization with OpenAI's GPT-4o**:
   - The `summarize_transcription` method is designed to summarize transcription text (once it is completed) using OpenAI's GPT-4o model.
   - The system and user prompts are structured as follows:
     - **System Prompt**: Defines the role of the model as an expert content analyst and summarizer. The model is instructed to focus on extracting key points, identifying themes, preserving critical details, and maintaining the original tone.
     - **User Prompt**: Requests a structured summary of the transcription, focusing on the core message and key points.

3. **Steps for Using the Summarization**:
   - **Transcription**: Once the audio file is transcribed, the resulting text is passed to the `summarize_transcription` method for summarization.
   - **Summarization Flow**:
     1. The transcription is provided to the system.
     2. The GPT-4o model processes the transcription to generate a **concise summary** with an overview, key takeaways, and important details.
   - **Output**: The method returns the summary, which is structured in a manner that includes an overview, key takeaways, and any significant details.

4. **Error Handling**:
   - **Transcription errors**: The `transcribe_audio` method handles possible errors during transcription, including retries for failed attempts to transcribe chunks of the audio.
   - **Summarization errors**: The `summarize_transcription` method catches and prints any exceptions that may arise when generating the summary.
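
The retry behavior described above can be isolated into a small reusable helper. This is a minimal sketch, not the lesson's code: `call_with_retries` is a hypothetical name, and the `sleep` parameter exists only so the demonstration runs without waiting. The defaults (3 attempts, 5 seconds) mirror the values used in `transcribe_audio`.

```python
import time


def call_with_retries(func, max_retries=3, delay_seconds=5, sleep=time.sleep):
    """Call func() until it succeeds or max_retries attempts have failed."""
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception as e:
            print(f"Attempt {attempt} failed: {e}")
            if attempt == max_retries:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(delay_seconds)


# Demonstration with a function that fails twice, then succeeds.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


outcome = call_with_retries(flaky, sleep=lambda seconds: None)
print(outcome)
```

A helper like this could wrap either the transcription call or the summarization call; note that `summarize_transcription` as written does not retry and simply returns `None` on failure.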

### How to Use the Summarization Mechanism in Practice:

1. **Preparing Audio for Transcription**:
   - The method `transcribe_audio` is responsible for handling the audio file, transcribing it using OpenAI's Whisper model, and processing large files by splitting them into smaller chunks.
   
2. **Generating Summary**:
   - After obtaining the transcription text, pass it to `summarize_transcription` for summarization:
     ```python
     transcription_text = media_processor.transcribe_audio(audio_file_path)
     summary = media_processor.summarize_transcription(transcription_text)
     print(summary)
     ```

3. **Output Format**:
   - The summary is returned in a structured format:
     - **One-sentence overview**.
     - **2-3 key takeaways**.
     - **Important details or quotes**.

### Example Usage:

```python
# Assuming we have an audio file path
audio_file_path = "path_to_audio_file.mp3"

# Initialize the media processor service
media_processor = MediaProcessorService()

# Transcribe the audio file
transcription_text = media_processor.transcribe_audio(audio_file_path)

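# Summarize the transcription (transcribe_audio returns None on failure)
if transcription_text is None:
    print("Transcription failed; nothing to summarize.")
else:
    summary = media_processor.summarize_transcription(transcription_text)

    # Print the summary
    print("Summary of the transcription:")
    print(summary)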
```

### Considerations:
- **File Size**: The audio file should not exceed the 25MB limit unless it is split into smaller chunks. This is managed by the `split_audio` method.
- **Error Handling**: The code retries failed transcription requests, but it cannot recover from a missing API key, so make sure the `OPENAI_API_KEY` environment variable is set before running it.
- **API Call Limits**: Be mindful of any API rate limits or quotas while transcribing or summarizing large numbers of audio files.
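
To make the file-size consideration concrete, here is the arithmetic that `split_audio` uses to choose a chunk length, pulled out as a standalone sketch. `plan_chunks` is a hypothetical name, and the estimate assumes a roughly constant bitrate, which is why the real method still checks each chunk's size after extraction.

```python
import math


def plan_chunks(duration_seconds, file_size_bytes, chunk_size_mb=20):
    """Estimate chunk duration and count, assuming a roughly constant bitrate."""
    target_bytes = chunk_size_mb * 1024 * 1024
    # Seconds of audio that fit in one target-sized chunk.
    chunk_duration = duration_seconds * target_bytes / file_size_bytes
    num_chunks = math.ceil(duration_seconds / chunk_duration)
    return chunk_duration, num_chunks


# A 100 MB file lasting 2,000 seconds, split into chunks of about 20 MB.
chunk_duration, num_chunks = plan_chunks(2000, 100 * 1024 * 1024)
print(chunk_duration, num_chunks)  # prints: 400.0 5
```

Targeting 20 MB rather than the full 25 MB leaves headroom for chunks that come out larger than the estimate.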

This system provides a seamless pipeline for transcribing and summarizing audio content, with robust handling for large files and error recovery. You can integrate this solution into an application to automate transcription and summarization tasks.

As a final step, let's apply everything we've learned in this lesson. Your task is to implement transcription summarization using the prompt engineering skills you've just learned. Make sure you set up proper system and user prompts and call the gpt-4o model correctly using the OpenAI client SDK. Use the terminal tab to see application logs and any errors that occur during summarization.

Let's go, we're closer than ever to the finish line!


```python
import subprocess
import math
import os
import tempfile
import time
from openai import OpenAI


class MediaProcessorService:
    def __init__(self):
        self.client = OpenAI()

    def summarize_transcription(self, text):
        """Generate a concise summary of the transcription"""

        # TODO: complete the implementation of this summarization method via OpenAI API
        try:
            response = self.client.______(
                model=____,
                messages=____
            )
            return ______
        except Exception as e:
            print(f"Error generating summary: {e}")
            return None

    def run_command_with_output(self, cmd, desc=None):
        """Run a command and stream its output in real-time"""
        if desc:
            print(f"\n{desc}")
        
        process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            bufsize=1
        )
        
        output = []
        for line in iter(process.stdout.readline, ''):
            print(line, end='')
            output.append(line)
        
        process.stdout.close()
        return_code = process.wait()
        
        if return_code != 0:
            raise subprocess.CalledProcessError(return_code, cmd)
        
        return ''.join(output)

    def get_audio_duration(self, file_path):
        """Get the duration of an audio file using ffprobe"""
        cmd = [
            'ffprobe', 
            '-v', 'quiet',
            '-show_entries', 'format=duration',
            '-of', 'default=noprint_wrappers=1:nokey=1',
            file_path
        ]
        try:
            output = subprocess.check_output(cmd)
            return float(output)
        except Exception:
            return None

    def split_audio(self, file_path, chunk_size_mb=20):
        """Split audio file into chunks smaller than the API limit"""
        print("\nSplitting audio into chunks...")
        
        MAX_CHUNK_SIZE = 25 * 1024 * 1024  # 25MB in bytes
        MAX_MEDIA_DURATION_SECONDS = 40 * 60  # 40 minutes
        file_size = os.path.getsize(file_path)
        duration = self.get_audio_duration(file_path)
        
        if not duration:
            raise Exception("Could not determine audio duration")
            
        if duration > MAX_MEDIA_DURATION_SECONDS:
            raise Exception(
                "Sorry, your video is too long. "
                "To avoid extensive waiting times, "
                "for this demo application we're only transcribing videos up to 40 minutes long."
            )
        
        # Estimate how many seconds of audio fit in one target-sized chunk
        chunk_duration = duration * (chunk_size_mb * 1024 * 1024) / file_size
        original_ext = os.path.splitext(file_path)[1]
        chunks = []
        start_time = 0.0
        
        # Advance by timestamp rather than by index so that an oversized
        # chunk is re-extracted with a shorter duration instead of skipped
        while start_time < duration:
            chunk_number = len(chunks) + 1
            estimated_total = len(chunks) + math.ceil((duration - start_time) / chunk_duration)
            
            temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=original_ext)
            temp_file_path = temp_file.name
            temp_file.close()
            
            cmd = [
                'ffmpeg',
                '-i', file_path,
                '-ss', str(start_time),
                '-t', str(chunk_duration),
                '-c', 'copy',
                '-y',
                temp_file_path
            ]
            
            self.run_command_with_output(cmd, f"Extracting chunk {chunk_number}/{estimated_total}:")
            time.sleep(0.5)
            
            chunk_size = os.path.getsize(temp_file_path)
            if chunk_size > MAX_CHUNK_SIZE:
                print(f"Chunk {chunk_number} too large ({chunk_size/1024/1024:.1f}MB), reducing duration...")
                try:
                    os.unlink(temp_file_path)
                except Exception as e:
                    print(f"Warning: Could not delete oversized chunk: {e}")
                chunk_duration *= 0.8
                continue  # Retry the same start_time with the shorter duration
            
            chunks.append(temp_file_path)
            start_time += chunk_duration
        
        return chunks

    def transcribe_audio(self, audio_file):
        """Transcribe an audio file to text, handling files larger than the API limit"""
        try:
            file_size = os.path.getsize(audio_file)
            max_size = 25 * 1024 * 1024  # 25MB in bytes
            
            if file_size > max_size:
                print(f"\nFile size ({file_size / 1024 / 1024:.2f}MB) exceeds API limit. Splitting into chunks...")
                chunks = self.split_audio(audio_file)
                
                if not chunks:
                    raise Exception("Failed to split audio file into chunks")
                    
                full_transcription = []
                
                for i, chunk_path in enumerate(chunks, 1):
                    max_retries = 3
                    retry_count = 0
                    
                    while retry_count < max_retries:
                        try:
                            print(f"\nTranscribing chunk {i} of {len(chunks)}...")
                            # Use a distinct name so the audio_file parameter is not shadowed
                            with open(chunk_path, "rb") as chunk_file:
                                response = self.client.audio.transcriptions.create(
                                    model="whisper-1",
                                    file=chunk_file,
                                    timeout=60
                                )
                                full_transcription.append(response.text)
                                break
                        except Exception as e:
                            retry_count += 1
                            print(f"Error on chunk {i} (attempt {retry_count}): {e}")
                            if retry_count == max_retries:
                                print(f"Failed to transcribe chunk {i} after {max_retries} attempts")
                                raise
                            print("Retrying in 5 seconds...")
                            time.sleep(5)
                    
                    try:
                        os.unlink(chunk_path)
                    except Exception as e:
                        print(f"Warning: Could not delete temporary file {chunk_path}: {e}")
                
                return ' '.join(full_transcription)
            else:
                print("\nTranscribing audio...")
                # Use a distinct name so the audio_file path is not shadowed
                with open(audio_file, "rb") as audio:
                    response = self.client.audio.transcriptions.create(
                        model="whisper-1",
                        file=audio,
                        timeout=60
                    )
                    return response.text
        except Exception as e:
            print(f"Error during transcription: {e}")
            print(f"Error type: {type(e)}")
            import traceback
            print(f"Traceback: {traceback.format_exc()}")
            return None

    def cleanup_temp_files(self, file_path):
        """Clean up temporary files and directories"""
        try:
            if os.path.isfile(file_path):
                for _ in range(5):  # Try up to 5 times
                    try:
                        os.unlink(file_path)
                        break
                    except PermissionError:
                        time.sleep(1)
                    except Exception as e:
                        print(f"Warning: Could not clean up {file_path}: {e}")
                        break
            elif os.path.isdir(file_path):
                for root, dirs, files in os.walk(file_path, topdown=False):
                    for name in files:
                        try:
                            os.unlink(os.path.join(root, name))
                        except Exception as e:
                            print(f"Warning: Could not clean up file {name}: {e}")
                    for name in dirs:
                        try:
                            os.rmdir(os.path.join(root, name))
                        except Exception as e:
                            print(f"Warning: Could not clean up directory {name}: {e}")
                try:
                    os.rmdir(file_path)
                except Exception as e:
                    print(f"Warning: Could not clean up directory {file_path}: {e}")
        except Exception as e:
            print(f"Warning: Could not clean up {file_path}: {e}")


```

To implement the transcription summarization task using the OpenAI API with proper prompt engineering, we'll complete the `summarize_transcription` method by correctly filling in the details for the API call, including system and user prompts.

### Key Objectives:
1. **System Prompt**: The system prompt defines the role and behavior of the model (here, an expert summarizer with a fixed output format).
2. **User Prompt**: The user prompt carries the instruction and the transcription text to be summarized.
3. **API Call**: Use OpenAI's GPT-4o model through `client.chat.completions.create` to generate a structured summary.
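
The first two objectives boil down to building a list of two message dictionaries. As a rough sketch, assuming a shortened system prompt and a hypothetical helper name `build_summary_messages`, the structure looks like this:

```python
# Shortened for illustration; the lesson's full system prompt is longer
SYSTEM_PROMPT = (
    "You are an expert content analyst and summarizer.\n"
    "Format your summaries with:\n"
    "1. A one-sentence overview\n"
    "2. 2-3 key takeaways\n"
    "3. Important details or quotes (if any)"
)


def build_summary_messages(text):
    """Return the system and user messages for a summarization request."""
    return [
        # The system message sets the model's role and output format
        {"role": "system", "content": SYSTEM_PROMPT},
        # The user message carries the instruction plus the transcription
        {
            "role": "user",
            "content": (
                "Create a structured summary of this transcription.\n\n"
                f"Transcription:\n{text}"
            ),
        },
    ]


messages = build_summary_messages("Today we discuss prompt design.")
print([message["role"] for message in messages])
# prints ['system', 'user']
```

Keeping the messages in a separate function like this makes the prompt easy to inspect and tweak without touching the API call itself.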

Here’s how to fill in the missing parts and complete the function:

### Updated `summarize_transcription` Method

```python
import subprocess
import math
import os
import tempfile
import time
from openai import OpenAI


class MediaProcessorService:
    def __init__(self):
        self.client = OpenAI()

    def summarize_transcription(self, text):
        """Generate a concise summary of the transcription"""

        # Implementing the summary generation via OpenAI API
        try:
            response = self.client.chat.completions.create(
                model="gpt-4o",  # Model chosen for the task
                messages=[
                    {
                        "role": "system",
                        "content": (
                            "You are an expert content analyst and summarizer with these capabilities:\n"
                            "- Extracting key points while maintaining context\n"
                            "- Identifying main themes and core messages\n"
                            "- Preserving critical details while reducing length\n"
                            "- Maintaining the original tone and intent\n"
                            "- Organizing information hierarchically\n\n"
                            "Format your summaries with:\n"
                            "1. A one-sentence overview\n"
                            "2. 2-3 key takeaways\n"
                            "3. Important details or quotes (if any)"
                        )
                    },
                    {
                        "role": "user",
                        "content": (
                            f"Create a structured summary of this transcription. "
                            f"Focus on the core message and key points while maintaining "
                            f"context and critical details.\n\n"
                            f"Transcription:\n{text}"
                        )
                    }
                ]
            )

            # Return the summary content from the API response
            return response.choices[0].message.content
        except Exception as e:
            print(f"Error generating summary: {e}")
            return None

    def run_command_with_output(self, cmd, desc=None):
        """Run a command and stream its output in real-time"""
        if desc:
            print(f"\n{desc}")
        
        process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            bufsize=1
        )
        
        output = []
        for line in iter(process.stdout.readline, ''):
            print(line, end='')
            output.append(line)
        
        process.stdout.close()
        return_code = process.wait()
        
        if return_code != 0:
            raise subprocess.CalledProcessError(return_code, cmd)
        
        return ''.join(output)

    def get_audio_duration(self, file_path):
        """Get the duration of an audio file using ffprobe"""
        cmd = [
            'ffprobe', 
            '-v', 'quiet',
            '-show_entries', 'format=duration',
            '-of', 'default=noprint_wrappers=1:nokey=1',
            file_path
        ]
        try:
            output = subprocess.check_output(cmd)
            return float(output)
        except Exception:
            return None

    def split_audio(self, file_path, chunk_size_mb=20):
        """Split audio file into chunks smaller than the API limit"""
        print("\nSplitting audio into chunks...")
        
        MAX_CHUNK_SIZE = 25 * 1024 * 1024  # 25MB in bytes
        MAX_MEDIA_DURATION_SECONDS = 40 * 60  # 40 minutes
        file_size = os.path.getsize(file_path)
        duration = self.get_audio_duration(file_path)
        
        if not duration:
            raise Exception("Could not determine audio duration")
            
        if duration > MAX_MEDIA_DURATION_SECONDS:
            raise Exception(
                "Sorry, your video is too long. "
                "To avoid extensive waiting times, "
                "for this demo application we're only transcribing videos up to 40 minutes long."
            )
        
        # Estimate how many seconds of audio fit in one target-sized chunk
        chunk_duration = duration * (chunk_size_mb * 1024 * 1024) / file_size
        original_ext = os.path.splitext(file_path)[1]
        chunks = []
        start_time = 0.0
        
        # Advance by timestamp rather than by index so that an oversized
        # chunk is re-extracted with a shorter duration instead of skipped
        while start_time < duration:
            chunk_number = len(chunks) + 1
            estimated_total = len(chunks) + math.ceil((duration - start_time) / chunk_duration)
            
            temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=original_ext)
            temp_file_path = temp_file.name
            temp_file.close()
            
            cmd = [
                'ffmpeg',
                '-i', file_path,
                '-ss', str(start_time),
                '-t', str(chunk_duration),
                '-c', 'copy',
                '-y',
                temp_file_path
            ]
            
            self.run_command_with_output(cmd, f"Extracting chunk {chunk_number}/{estimated_total}:")
            time.sleep(0.5)
            
            chunk_size = os.path.getsize(temp_file_path)
            if chunk_size > MAX_CHUNK_SIZE:
                print(f"Chunk {chunk_number} too large ({chunk_size/1024/1024:.1f}MB), reducing duration...")
                try:
                    os.unlink(temp_file_path)
                except Exception as e:
                    print(f"Warning: Could not delete oversized chunk: {e}")
                chunk_duration *= 0.8
                continue  # Retry the same start_time with the shorter duration
            
            chunks.append(temp_file_path)
            start_time += chunk_duration
        
        return chunks

    def transcribe_audio(self, audio_file):
        """Transcribe an audio file to text, handling files larger than the API limit"""
        try:
            file_size = os.path.getsize(audio_file)
            max_size = 25 * 1024 * 1024  # 25MB in bytes
            
            if file_size > max_size:
                print(f"\nFile size ({file_size / 1024 / 1024:.2f}MB) exceeds API limit. Splitting into chunks...")
                chunks = self.split_audio(audio_file)
                
                if not chunks:
                    raise Exception("Failed to split audio file into chunks")
                    
                full_transcription = []
                
                for i, chunk_path in enumerate(chunks, 1):
                    max_retries = 3
                    retry_count = 0
                    
                    while retry_count < max_retries:
                        try:
                            print(f"\nTranscribing chunk {i} of {len(chunks)}...")
                            # Use a distinct name so the audio_file parameter is not shadowed
                            with open(chunk_path, "rb") as chunk_file:
                                response = self.client.audio.transcriptions.create(
                                    model="whisper-1",
                                    file=chunk_file,
                                    timeout=60
                                )
                                full_transcription.append(response.text)
                                break
                        except Exception as e:
                            retry_count += 1
                            print(f"Error on chunk {i} (attempt {retry_count}): {e}")
                            if retry_count == max_retries:
                                print(f"Failed to transcribe chunk {i} after {max_retries} attempts")
                                raise
                            print("Retrying in 5 seconds...")
                            time.sleep(5)
                    
                    try:
                        os.unlink(chunk_path)
                    except Exception as e:
                        print(f"Warning: Could not delete temporary file {chunk_path}: {e}")
                
                return ' '.join(full_transcription)
            else:
                print("\nTranscribing audio...")
                # Use a distinct name so the audio_file path is not shadowed
                with open(audio_file, "rb") as audio:
                    response = self.client.audio.transcriptions.create(
                        model="whisper-1",
                        file=audio,
                        timeout=60
                    )
                    return response.text
        except Exception as e:
            print(f"Error during transcription: {e}")
            print(f"Error type: {type(e)}")
            import traceback
            print(f"Traceback: {traceback.format_exc()}")
            return None

    def cleanup_temp_files(self, file_path):
        """Clean up temporary files and directories"""
        try:
            if os.path.isfile(file_path):
                for _ in range(5):  # Try up to 5 times
                    try:
                        os.unlink(file_path)
                        break
                    except PermissionError:
                        time.sleep(1)
                    except Exception as e:
                        print(f"Warning: Could not clean up {file_path}: {e}")
                        break
            elif os.path.isdir(file_path):
                for root, dirs, files in os.walk(file_path, topdown=False):
                    for name in files:
                        try:
                            os.unlink(os.path.join(root, name))
                        except Exception as e:
                            print(f"Warning: Could not clean up file {name}: {e}")
                    for name in dirs:
                        try:
                            os.rmdir(os.path.join(root, name))
                        except Exception as e:
                            print(f"Warning: Could not clean up directory {name}: {e}")
                try:
                    os.rmdir(file_path)
                except Exception as e:
                    print(f"Warning: Could not clean up directory {file_path}: {e}")
        except Exception as e:
            print(f"Warning: Could not clean up {file_path}: {e}")
```

### Key Parts to Focus On:
1. **System and User Prompts**: 
   - The system prompt sets the model's role and fixes the summary format (overview, key takeaways, details or quotes).
   - The user prompt supplies the instruction and the transcription text to summarize.

2. **OpenAI Client Call**:
   - `client.chat.completions.create` sends both messages to OpenAI's GPT-4o model, and the summary text is read from `response.choices[0].message.content`.
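
Reading the result back can be made slightly more defensive, since the message content may be empty or `None`. The sketch below uses a hypothetical helper, `extract_summary`, and a stand-in object built with `SimpleNamespace` that mimics only the attributes used here; it is not a real SDK response.

```python
from types import SimpleNamespace


def extract_summary(response):
    """Return the stripped summary text, or None if the response has none."""
    if not response.choices:
        return None
    content = response.choices[0].message.content
    if content is None:
        return None
    # Treat whitespace-only content as missing
    return content.strip() or None


# Stand-in for an API response (assumption: only these attributes matter here)
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="  Overview: ...  "))]
)
print(extract_summary(fake_response))
# prints "Overview: ..."
```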

### How It Works:
- **Transcription**: The `transcribe_audio` method transcribes the audio file into text. If the file is too large, it is split into smaller chunks.
- **Summarization**: After transcription, the text is passed to `summarize_transcription`, which uses GPT-4o to create a concise summary based on the provided transcription.
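
Because `transcribe_audio` returns `None` on failure, it is worth guarding the hand-off between the two steps; otherwise the literal text "None" would end up in the prompt. Here is one possible way to wire them together. `transcribe_and_summarize` and `FakeProcessor` are hypothetical names, and the fake class exists only so the sketch runs without API calls.

```python
def transcribe_and_summarize(processor, audio_file_path):
    """Run transcription, then summarization, stopping early on failure."""
    transcription = processor.transcribe_audio(audio_file_path)
    if not transcription:
        print("Transcription failed or was empty; skipping summarization.")
        return None
    return processor.summarize_transcription(transcription)


class FakeProcessor:
    """Hypothetical stand-in for MediaProcessorService (no network calls)."""

    def __init__(self, transcription):
        self.transcription = transcription

    def transcribe_audio(self, audio_file_path):
        return self.transcription

    def summarize_transcription(self, text):
        return f"Summary of {len(text.split())} words"


print(transcribe_and_summarize(FakeProcessor("hello world again"), "talk.mp3"))
print(transcribe_and_summarize(FakeProcessor(None), "broken.mp3"))
```

With the real service, you would pass a `MediaProcessorService()` instance in place of the fake processor.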

With this in place, the service handles both transcription and summarization of audio files.