# Lesson 2: Making Your First Whisper API Request

Welcome back! In the previous lesson, we set up a development environment using a virtual environment and installed the necessary dependencies to interact with the OpenAI API. Today, we're diving into making your first API request using Whisper, which is crucial for creating a transcription system. This builds on your understanding of environment setup and Python scripting, and now we’ll focus on interacting with APIs.

You'll learn to transform audio data into text using the Whisper API.

## Understanding the Whisper API Request

The Whisper API from OpenAI is designed to handle audio transcription. The core idea is to send audio data to the API, which then returns the transcribed text. Every request must be authenticated with a valid API key. The API reads the uploaded audio as binary data and transcribes what’s spoken into text; how much detail you get back depends on the response format you request.

While Whisper handles diverse audio inputs, it focuses on spoken content, so non-verbal sounds may be left out of the transcript. By default, the result is a JSON object containing the transcribed text; more detailed response formats can also include metadata such as the duration of the audio.
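
To make the shape of that response concrete, here is a minimal sketch that reads a transcription payload. The sample payload is invented for illustration, and the helper name `summarize_transcription` is hypothetical. The `text` field is the one this lesson relies on; `duration` and `language` are assumed to appear only when a more detailed response format is requested, so the sketch treats them as optional.

```python
import json

# Hypothetical payload, shaped like a detailed transcription response.
SAMPLE_RESPONSE = '{"text": "Hello and welcome.", "language": "english", "duration": 2.5}'


def summarize_transcription(raw_json):
    """Return the text plus any optional metadata found in the payload."""
    payload = json.loads(raw_json)
    return {
        "text": payload["text"],              # the transcript itself
        "duration": payload.get("duration"),  # None when not provided
        "language": payload.get("language"),  # None when not provided
    }


summary = summarize_transcription(SAMPLE_RESPONSE)
print(summary["text"])
```

Using `.get()` for the optional fields means the same helper works for both the minimal and the detailed response shapes.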

## Making Your First API Request

Let’s explore a simple example demonstrating how to make your first transcription request using the Whisper API:

```python
from openai import OpenAI

client = OpenAI()

def transcribe_audio(file_path):
    """
    Transcribe an audio file using OpenAI's Whisper API.
    """
    try:
        with open(file_path, 'rb') as audio_file:
            transcript = client.audio.transcriptions.create(
                model="whisper-1",
                file=audio_file,
                timeout=60
            )
            return transcript.text
    except Exception as e:
        # Chain the original error so the full traceback is preserved
        raise Exception(f"Transcription failed: {e}") from e

if __name__ == "__main__":
    result = transcribe_audio("resources/sample_audio.mp3")
    print("Transcription:", result)
```

### This code demonstrates the transcription process:

1. **Client Initialization**: Instantiate an `openai.OpenAI` client. It manages your requests to the OpenAI API and, by default, reads the API key from the `OPENAI_API_KEY` environment variable.

2. **File Handling**: Open the audio file in binary read mode (`"rb"`). Reading the file as bytes sends the audio to the API unmodified.

3. **API Call**: The `client.audio.transcriptions.create` method submits the audio data for transcription. The `model` argument selects which Whisper model to use, in this case `"whisper-1"`. The `timeout` argument sets how many seconds the client waits before giving up on the request.

4. **Handling the Response**: The SDK parses the API's JSON response into an object. Access its `text` attribute to retrieve the transcribed content, ready for further processing or storage.
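
Before the file-handling step, it can help to check the file locally so obvious problems surface without an API call. The sketch below is one possible pre-flight check, not part of the lesson's code: the helper name `check_audio_file` is hypothetical, and the 25 MB cap and extension list are assumptions based on the API documentation at the time of writing, so verify them against the current docs.

```python
import os

# Assumptions for this sketch: a 25 MB upload limit and this extension list.
MAX_BYTES = 25 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def check_audio_file(file_path):
    """Return the file size in bytes, or raise ValueError if the file
    looks unsuitable for upload."""
    extension = os.path.splitext(file_path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file extension: {extension or '(none)'}")
    size = os.path.getsize(file_path)  # raises FileNotFoundError if missing
    if size == 0:
        raise ValueError("File is empty")
    if size > MAX_BYTES:
        raise ValueError(f"File is too large: {size} bytes")
    return size
```

Calling `check_audio_file("resources/sample_audio.mp3")` before `transcribe_audio` would turn a wrong path or an oversized file into an immediate, readable error.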

## Moving On To Practice

Now that we know how to make an API request to OpenAI, let's practice! Onward and upward!


## Add Missing Code for Whisper API Transcription

Welcome back, Explorer! Now, let's see our Whisper OpenAI model in action. Run the code and a preview window will open. There you can choose a sample audio or video file and click the transcribe button, which sends an API request to Whisper and displays the transcription. Easy as that!

Under the hood, the `transcribe` function in the code retrieves the transcription using the OpenAI SDK. Take your time to read through the code until you understand it thoroughly.

Hint: You can open and use the terminal tab to see how requests are being processed.

```python
from openai import OpenAI

client = OpenAI()


def transcribe(file_path):
    """
    Transcribe an audio file using OpenAI's API.
    """
    try:
        with open(file_path, 'rb') as audio_file:
            transcript = client.audio.transcriptions.create(
                model="whisper-1",
                file=audio_file
            )
            return transcript.text
    except Exception as e:
        raise Exception(f"Transcription failed: {e}") from e

```

Here is the same function with explanatory comments and an example entry point:

```python
from openai import OpenAI

client = OpenAI()

def transcribe(file_path):
    """
    Transcribe an audio file using OpenAI's API.
    """
    try:
        # Open the audio file in binary read mode ('rb')
        with open(file_path, 'rb') as audio_file:
            # Use the OpenAI SDK to send the transcription request
            transcript = client.audio.transcriptions.create(
                model="whisper-1",  # Specify which Whisper model to use
                file=audio_file     # Provide the file to be transcribed
            )
            return transcript.text  # Return the transcribed text
    except Exception as e:
        # Re-raise with a clearer message, keeping the original error chained
        raise Exception(f"Transcription failed: {e}") from e

if __name__ == "__main__":
    # Example usage: Provide the path to an audio file for transcription
    file_path = "path/to/your/audio/file.mp3"
    try:
        result = transcribe(file_path)
        print("Transcription:", result)  # Print the transcribed text
    except Exception as e:
        print(f"Error: {e}")  # Print any errors that occur
```

### How It Works:
- **`transcribe` function**: This function takes an audio file's path as input and sends the audio data to OpenAI's Whisper API. The transcription is then returned as text.
- **File Handling**: The audio file is opened in binary read mode ('rb') so its bytes are sent to the API unmodified.
- **API Call**: The transcription request is made with `client.audio.transcriptions.create`. This sends the audio to the Whisper model for processing.
- **Error Handling**: If the transcription fails for any reason, an exception is raised with a message describing the failure.

Now, with the code in place, run the script and test it out with your own audio file. Happy transcribing!
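
The error-handling idea can be taken one step further by raising a dedicated exception type, so callers can tell a transcription failure apart from unrelated errors. The following is a sketch under stated assumptions: `TranscriptionError` and `transcribe_with` are hypothetical names, and the function accepts any `client` object that exposes `audio.transcriptions.create`, which is how it stays independent of how the client was built.

```python
class TranscriptionError(Exception):
    """Raised when a transcription request cannot be completed."""


def transcribe_with(client, file_path):
    """Transcribe file_path using any client exposing
    audio.transcriptions.create (as the OpenAI SDK client does)."""
    try:
        with open(file_path, "rb") as audio_file:
            transcript = client.audio.transcriptions.create(
                model="whisper-1",
                file=audio_file,
            )
    except FileNotFoundError as e:
        # A missing file is a caller mistake, so report it separately
        raise TranscriptionError(f"Audio file not found: {file_path}") from e
    except Exception as e:
        # Anything else (network, authentication, API errors) lands here
        raise TranscriptionError(f"Transcription failed: {e}") from e
    return transcript.text
```

With the real SDK you would call `transcribe_with(OpenAI(), "path/to/your/audio/file.mp3")`. Passing the client in as an argument also makes the function easy to exercise with a stand-in object during testing.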

## Audio Transcription Enhancement with OpenAI Whisper API

Local files are nice, but can we process remote mp4 videos too? Let's find out! Using the remote mp4 file URL given in the code, download the video and run an OpenAI Whisper transcription on it.

Hint: Whisper needs to know the file format in order to parse the content properly. You can tell it by passing a file name along with the content: `file=('video.mp4', <content bytes>)`


```python

from openai import OpenAI
import requests

# TODO: Define the OpenAI client
client = ____


def transcribe_remote(url):
    """
    Transcribe a remote video file from a URL using OpenAI's Whisper API.
    """
    try:
        # TODO: Download the content using the `requests` library
        
        # TODO: Send the content to OpenAI API
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=______
        )
        return transcript.text
        
    except Exception as e:
        return f"Error: {str(e)}"


if __name__ == "__main__":
    remote_video_url = "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WeAreGoingOnBullrun.mp4"
    print(transcribe_remote(remote_video_url))

```

To make this script work for transcribing a remote MP4 file using OpenAI's Whisper API, you'll need to implement a couple of missing steps. Specifically, you'll download the file from the provided URL and send it to the Whisper API for transcription.

Here’s how you can complete the code:

1. **Download the file from the URL**: Use the `requests` library to get the content of the video.
2. **Provide the file content to OpenAI's Whisper API**: You'll send the file as a binary stream (using `file=('video.mp4', <content_bytes>)`).

Here’s the updated code with the missing steps filled in:

```python
from openai import OpenAI
import requests

# Define the OpenAI client
client = OpenAI(api_key='your-openai-api-key')

def transcribe_remote(url):
    """
    Transcribe a remote video file from a URL using OpenAI's Whisper API.
    """
    try:
        # Download the content using the `requests` library
        response = requests.get(url)
        
        if response.status_code == 200:
            # Send the content to OpenAI API for transcription
            transcript = client.audio.transcriptions.create(
                model="whisper-1",
                file=("video.mp4", response.content)
            )
            return transcript.text
        else:
            return f"Error: Failed to download the video (status code: {response.status_code})"
        
    except Exception as e:
        return f"Error: {str(e)}"


if __name__ == "__main__":
    remote_video_url = "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WeAreGoingOnBullrun.mp4"
    print(transcribe_remote(remote_video_url))
```

### Key Changes:
1. **Download the file**: `requests.get(url)` is used to fetch the MP4 content from the given URL.
2. **Error handling**: If the download fails (non-200 HTTP status), the function will return an error message with the status code.
3. **Send file to Whisper API**: The file is passed as a tuple `("video.mp4", response.content)` to the Whisper API for transcription.
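
The file name in that tuple is hard-coded, which is fine for a single known video. If you want the name (and therefore the extension) to follow whatever URL is passed in, one option is to derive it from the URL path. This is a small sketch using only the standard library; `file_tuple_from_url` and its `default_name` parameter are hypothetical names introduced for illustration.

```python
import posixpath
from urllib.parse import urlparse


def file_tuple_from_url(url, content, default_name="video.mp4"):
    """Build the (filename, bytes) pair, taking the name from the URL path."""
    # urlparse drops the query string, so "clip.mp3?token=1" yields "clip.mp3"
    name = posixpath.basename(urlparse(url).path)
    if "." not in name:
        # No usable extension in the URL, so fall back to the default
        name = default_name
    return (name, content)
```

Inside `transcribe_remote`, the result could be passed as `file=file_tuple_from_url(url, response.content)`.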

### Notes:
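- Replace `'your-openai-api-key'` with your actual OpenAI API key, or omit the `api_key` argument so the client reads the `OPENAI_API_KEY` environment variable instead of keeping the key in source code.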
- You can modify the `remote_video_url` to point to any other MP4 file URL you wish to transcribe.

With these changes, the script transcribes the video from a remote source via the Whisper API.

## Transcribing Audio with Whisper API

Galactic Pioneer, your mission is to journey to new frontiers by transcribing audio using the Whisper API. This time, start by downloading the mp3 file from the URL provided in the code. Once downloaded, load environment variables, set the API key, open the downloaded audio file, and create the transcription request. Follow the step-by-step guidance, and watch the magic unfold as you transform audio into text.

```python
from openai import OpenAI
import requests

client = OpenAI()


# URL of the mp3 file
url = "https://dare.wisc.edu/wp-content/uploads/sites/1051/2008/04/Arthur.mp3"

# TODO: Download the mp3 file from the URL

# TODO: Open the audio file in binary mode

# TODO: Create a transcription request with a timeout and specific model

# TODO: Print the transcribed text


```

To transcribe the audio from the provided URL using the Whisper API, here are the steps you need to follow:

1. **Download the mp3 file**: Use the `requests` library to download the audio file from the given URL.
2. **Set up environment variables**: You'll need to load your OpenAI API key from the environment variables.
3. **Open the downloaded audio file**: Open the file in binary mode for the Whisper API to process it.
4. **Create the transcription request**: Send the audio file to OpenAI's Whisper API for transcription.
5. **Handle the transcription**: Extract and print the transcribed text.

Here’s the full implementation with the missing steps:

```python
from openai import OpenAI
import requests
import sys

# Set up the OpenAI client; it reads the API key from OPENAI_API_KEY
client = OpenAI()

# URL of the mp3 file
url = "https://dare.wisc.edu/wp-content/uploads/sites/1051/2008/04/Arthur.mp3"

# Step 1: Download the mp3 file from the URL
response = requests.get(url)
if response.status_code != 200:
    print(f"Error: Failed to download the mp3 file (status code: {response.status_code})")
    sys.exit(1)  # Exit with a non-zero status to signal failure

# Step 2: Keep the downloaded bytes in memory (no local file is needed)
audio_data = response.content

# Step 3: Create a transcription request with a timeout and specific model
try:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",  # Specifying the Whisper model
        file=("Arthur.mp3", audio_data),
        timeout=120  # Optional: set a timeout in case the process takes too long
    )

    # Step 4: Print the transcribed text
    print("Transcription:\n", transcription.text)
    
except Exception as e:
    print(f"Error: {str(e)}")
```

### Key Steps:
1. **Download the mp3 file**: `requests.get(url)` fetches the audio file.
2. **Hold the audio as bytes**: `response.content` already contains the file's bytes, so they are stored in `audio_data` and sent directly, without writing a local file and reopening it in binary mode.
3. **Create transcription request**: The transcription request is created using the `client.audio.transcriptions.create` method, specifying the Whisper model (`whisper-1`) and the binary file data.
4. **Handle errors**: If the API request fails, an error message is printed.
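
The solution keeps the download in memory. If you would rather follow the task wording literally and open a downloaded file in binary mode, the bytes can be written to disk first. The sketch below shows that variant with stand-in bytes in place of `response.content`; the helper name `save_audio` is hypothetical, and a temporary directory is used only so the example cleans up after itself.

```python
import os
import tempfile


def save_audio(data, directory, filename="Arthur.mp3"):
    """Write downloaded bytes to directory/filename and return the path."""
    path = os.path.join(directory, filename)
    with open(path, "wb") as f:
        f.write(data)
    return path


# Stand-in bytes; in the exercise this would be response.content.
with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_audio(b"fake mp3 bytes", tmp)
    with open(saved_path, "rb") as audio_file:
        # audio_file is what you would pass as file=... in the request
        first_bytes = audio_file.read(4)
```

Saving to disk is useful when you want to keep the audio for later runs; the in-memory approach avoids leaving files behind.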

### Notes:
- Make sure that the `OPENAI_API_KEY` environment variable is set properly. You can load it using a `.env` file or set it directly in your environment.
- If you are running this in a local environment, you might want to use `python-dotenv` to load the environment variables from a `.env` file.
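
A quick way to confirm the key is available before making any request is to check the environment explicitly. This is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper, and the commented-out lines show where `python-dotenv` would fit if you keep the key in a `.env` file.

```python
import os

# Optional, if python-dotenv is installed and you use a .env file:
# from dotenv import load_dotenv
# load_dotenv()


def require_api_key(env=None):
    """Return the API key from the environment, or raise a clear error."""
    if env is None:
        env = os.environ
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or add it to your .env file"
        )
    return key
```

Calling `require_api_key()` at the top of the script fails fast with a readable message instead of an authentication error partway through the run.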

Now when you run this script, it will download the mp3 file, transcribe it using the Whisper API, and print the resulting text.