### Step 1: Loading OpenAI API Key from `.env`

In this cell, we securely load the OpenAI API key from a `.env` file instead of hardcoding it in the notebook.  

**Steps performed:**
1. Import required libraries: `os` for environment variables, `dotenv` to load `.env` files, and `openai` for API access.
2. Load environment variables from the `.env` file using `load_dotenv()`.
3. Set the OpenAI API key for the `openai` library with `os.getenv("OPENAI_API_KEY")`.
4. Optionally, verify that the key is loaded by printing `True` if it exists.

**Security Note:** Loading the key from `.env` keeps it out of the notebook itself. It stays out of GitHub only if `.env` is listed in `.gitignore` before the file is ever committed.


In [1]:
# Load OpenAI API key from .env
import os
from dotenv import load_dotenv
import openai

# Load environment variables from .env file
load_dotenv()

# Set OpenAI API key
openai.api_key = os.getenv("OPENAI_API_KEY")

# Verify the key is loaded (optional)
print("API Key loaded:", bool(openai.api_key))


API Key loaded: True
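
The boolean check above passes for any non-empty string, and a missing key only surfaces later when the first API call fails. A minimal sketch of a stricter check follows, assuming a hypothetical helper named `require_env` (it is not part of `dotenv` or `openai`) that raises immediately with a readable message:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank.

    `require_env` is a hypothetical helper for this notebook, not a library function.
    """
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file and call load_dotenv() first."
        )
    return value


# Demo with a throwaway variable so that no real key is needed or printed
os.environ["DEMO_API_KEY"] = "sk-demo"
print("Demo key found:", bool(require_env("DEMO_API_KEY")))
```

In the cell above, this could be used as `openai.api_key = require_env("OPENAI_API_KEY")` after `load_dotenv()`.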


### Step 2: Prep audio (manually)

### Step 3: Basic Transcription (Without Chunking and Prompt)

In this step, we transcribe a short audio file using OpenAI's Whisper API. 
We will:
- Load the audio file
- Send it to the Whisper API
- Receive the transcription
- Display the text output

This helps us understand the API response format before handling longer audio files or chunking.


In [2]:
# Step 3: Basic Transcription (Without Chunking and without Prompt)

from pathlib import Path

# Path to audio file
audio_file_path = Path("audio/CA138clip.mp3")

# Open the audio file in binary mode
with open(audio_file_path, "rb") as audio_file:
    # Call Whisper API for transcription
    transcript = openai.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file
    )

# Display the transcribed text
print("Transcription:\n")
print(transcript.text)


Transcription:

It was rather interesting just to watch them gathering their materials and bouncing around their, what they call it, kangaroo walk or something like that. Who named it that? I don't know. I bet those men are going to get quite a reception when they get back to Maine. Oh, yes. I'll be so glad when they land back now. But I think that's pretty well a fact, because they've landed so many safely now that I feel relieved. Just getting off of the moon was the thing that was. Have they met with the one that was circling? Yes, they've rendezvoused. So I understand. That wasn't shown either, but they say they have rendezvoused. So that's a matter of making the circles and then coming down. What do you sort of imagine for the future? Do you imagine them sending up ships? I think they will. I think they will do some more exploring up there. Very positive. Because that was such a very small area, when you think of it, that they just gathered rocks and samples of soil and all. And t

### Step 4: Transcription with Prompts (Guided Approach)

In this step, we guide the Whisper API transcription by providing a **prompt** with context.  

**Objectives:**
- Improve transcription accuracy for technical terms, names, or specific context.
- Compare results with the unguided transcription from Step 3.
- Understand how Whisper handles contextual guidance.

**Process:**
1. Define a prompt describing the context (e.g., meeting type, technical terms).
2. Send both the audio file and prompt to Whisper API.
3. Receive and display the guided transcription.


### NOTE

The prompt used in this step is **specifically adapted** to the content of the audio.  

Since this audio is an interview recorded after the first moon landing, the prompt reflects that context:
- Historical interview about astronauts
- Technical and mission details
- Dialogue flow between interviewer and interviewee

The aim is a transcription that is accurate, clear, and suitable for archival purposes. The prompt is a hint rather than a guarantee, so the guided output should still be checked against the audio: the Step 3 and Step 4 outputs differ in places ("get back to Maine" vs. "get back to Earth", "sending up ships" vs. "standing up").
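
One way to keep such a prompt short and focused is to combine a single context sentence with a list of terms expected in the audio. The sketch below assumes a hypothetical helper `build_prompt`; the term list is taken from words that appear in this clip's transcripts, and whether a vocabulary-style prompt helps for a given recording is something to verify by comparing outputs.

```python
def build_prompt(context: str, terms: list[str]) -> str:
    """Combine a context sentence with a de-duplicated list of expected terms.

    `build_prompt` is a hypothetical helper, not part of the OpenAI library.
    """
    # dict.fromkeys removes duplicates while preserving the original order
    unique_terms = list(dict.fromkeys(t.strip() for t in terms if t.strip()))
    if not unique_terms:
        return context
    return f"{context} Terms that may occur: {', '.join(unique_terms)}."


prompt_text_short = build_prompt(
    "Interview recorded after the first moon landing.",
    ["kangaroo walk", "rendezvoused", "moon", "rendezvoused"],
)
print(prompt_text_short)
```

The resulting string can be passed as the `prompt` argument in the same way as `prompt_text` in the cell below.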



In [None]:
# Step 4: Guided Transcription with Adapted Prompt

# Adapted prompt for the actual audio content
prompt_text = (
    "You are transcribing a historical interview about astronauts returning from the first moon landing. "
    "Focus on capturing names, mission details, technical terms, and any dialogue accurately. "
    "Preserve the flow of conversation and clearly represent questions and answers. "
    "The transcription should be clear, professional, and suitable for archival or historical records."
)

# Open the audio file in binary mode
with open(audio_file_path, "rb") as audio_file:
    # Call Whisper API with the adapted prompt
    guided_transcript = openai.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        prompt=prompt_text
    )

# Display the guided transcription text
print("Guided Transcription:\n")
print(guided_transcript.text)


Guided Transcription:

It was rather interesting just to watch them gathering their materials and bouncing around. What do they call it? Kangaroo walk? Something like that. Who named it that? I don't know. I bet those men are going to get quite a reception when they get back to Earth. Oh, yes. I'll be so glad when they land back now. I think that's pretty well a fact because they've landed so many safely now that I feel relieved. Just getting off of the moon was the thing that was... Have they met with the one that was circling? Yes, they've rendezvoused, so I understand. That wasn't shown either, but they say they have rendezvoused. That's a matter of making the circles and then coming down. What do you sort of imagine for the future? Do you imagine them standing up? I think they will. I think they will do some more exploring up there. Very positive. Because that was such a very small area, when you think of it, that they just gathered rocks and samples of soil and all. They did probe
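
To compare the guided and unguided results systematically rather than by eye, a word-level diff is one option. The following sketch uses the standard library's `difflib`; the helper name `word_diff` is an assumption for illustration, and the two sentences are excerpts from the Step 3 and Step 4 outputs.

```python
import difflib


def word_diff(before: str, after: str):
    """Return (tag, before_words, after_words) for every span where the texts differ."""
    a, b = before.split(), after.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Excerpts taken from the Step 3 (unguided) and Step 4 (guided) outputs
unguided = "I bet those men are going to get quite a reception when they get back to Maine."
guided = "I bet those men are going to get quite a reception when they get back to Earth."

for tag, old, new in word_diff(unguided, guided):
    print(f"{tag}: {old!r} -> {new!r}")
```

Running the same function on the full `transcript.text` and `guided_transcript.text` lists every changed span, which makes it easier to judge whether the prompt improved or degraded a given passage.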

### Step 5: Audio Chunking

For long recordings, we split the audio into smaller, manageable chunks for transcription.  

**Objectives:**
- Split audio into segments
- Make it easier to process with Whisper
- Preserve timestamps for combining results later
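
Whether a recording needs chunking at all can be decided from its file size. The sketch below assumes an upload limit of 25 MB, which is the figure given in OpenAI's speech-to-text guide at the time of writing; the constant and both helper names are assumptions for illustration and should be checked against the current documentation.

```python
from pathlib import Path

# Assumed upload limit in bytes (25 MB); verify against the current API documentation
UPLOAD_LIMIT_BYTES = 25 * 1024 * 1024


def exceeds_upload_limit(size_bytes: int, limit_bytes: int = UPLOAD_LIMIT_BYTES) -> bool:
    """Return True when a file of `size_bytes` is too large for a single request."""
    return size_bytes > limit_bytes


def needs_chunking(path: Path, limit_bytes: int = UPLOAD_LIMIT_BYTES) -> bool:
    """Check a file on disk against the assumed upload limit."""
    return exceeds_upload_limit(path.stat().st_size, limit_bytes)


print(exceeds_upload_limit(3 * 1024 * 1024))   # a 3 MB clip
print(exceeds_upload_limit(40 * 1024 * 1024))  # a 40 MB recording
```

For a real file, `needs_chunking(Path("audio/CA138clip.mp3"))` would perform the same check using the size on disk.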

**NOTE:**
We used a small audio file (1:26).
We split it into 20-second chunks to demonstrate chunking, even though the clip is short. The same technique can be used for larger audio files.  
This allows us to show multiple chunks in the demo.

In [None]:
# Step 5: Audio Chunking

from pydub import AudioSegment

# === Step 1: Set audio file path (relative to notebook) ===
audio_file_path = Path("audio/CA138clip.mp3")

# === Step 2: Load audio ===
audio = AudioSegment.from_file(audio_file_path)
print(f"Audio loaded successfully! Duration: {len(audio)/1000:.2f} seconds")

# === Step 3: Prepare output folder (relative) ===
output_dir = Path("audio_chunks")
output_dir.mkdir(exist_ok=True)

# === Step 4: Split audio into 20-second chunks ===
chunk_length_ms = 20 * 1000  # 20 seconds
chunks = []

for i, start in enumerate(range(0, len(audio), chunk_length_ms)):
    end = min(start + chunk_length_ms, len(audio))
    chunk = audio[start:end]
    chunk_filename = output_dir / f"chunk_{i+1}.mp3"
    chunk.export(chunk_filename, format="mp3")
    chunks.append(chunk_filename)
    print(f"Created chunk {i+1}: {chunk_filename} ({len(chunk)/1000:.2f} seconds)")


Audio loaded successfully! Duration: 86.20 seconds
Created chunk 1: audio_chunks\chunk_1.mp3 (20.00 seconds)
Created chunk 2: audio_chunks\chunk_2.mp3 (20.00 seconds)
Created chunk 3: audio_chunks\chunk_3.mp3 (20.00 seconds)
Created chunk 4: audio_chunks\chunk_4.mp3 (20.00 seconds)
Created chunk 5: audio_chunks\chunk_5.mp3 (6.20 seconds)
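
The chunk boundaries above follow from simple arithmetic on the total duration. The sketch below reproduces them without loading any audio and adds an optional overlap between neighbouring chunks, which is one possible way to reduce words being cut at a boundary. The helper `chunk_bounds` and its `overlap_ms` parameter are assumptions for illustration; the notebook itself uses non-overlapping chunks.

```python
def chunk_bounds(total_ms: int, chunk_ms: int, overlap_ms: int = 0):
    """Return (start_ms, end_ms) pairs that cover `total_ms` of audio.

    With overlap_ms > 0, each chunk starts `overlap_ms` before the previous one ends.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    if not 0 <= overlap_ms < chunk_ms:
        raise ValueError("overlap_ms must be in the range [0, chunk_ms)")

    step = chunk_ms - overlap_ms
    bounds = []
    start = 0
    while start < total_ms:
        end = min(start + chunk_ms, total_ms)
        bounds.append((start, end))
        if end == total_ms:
            break  # the last chunk reached the end of the audio
        start += step
    return bounds


# 86.2-second clip, 20-second chunks, no overlap (matches the five chunks above)
for start, end in chunk_bounds(86_200, 20_000):
    print(f"{start / 1000:.1f}s - {end / 1000:.1f}s")
```

Each pair can be used directly as a pydub slice, for example `audio[start:end]`. If overlap is used, the transcripts of neighbouring chunks will repeat some words, which then have to be de-duplicated when the text is merged.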


### Step 6: Transcribing Audio Chunks with Timestamps

We transcribe each audio chunk individually and adjust timestamps to match the position in the original audio.

- Each chunk is sent to the OpenAI Whisper API.
- Each result stores the chunk's text together with start/end times computed from the chunk durations.
- Chunks are combined into a full transcript.
- The final output includes timestamps for each segment for easy navigation and search.
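
If finer timing is needed, the API can return per-segment times (for example with `response_format="verbose_json"`), but those times are relative to the start of each chunk. Adjusting them means adding the chunk's offset in the original audio. The sketch below shows only that adjustment on plain dictionaries; the helper `shift_segments` and the sample segments are hypothetical and not actual API output.

```python
def shift_segments(segments, offset_s):
    """Return copies of `segments` with start/end moved by `offset_s` seconds."""
    return [
        {**seg, "start": seg["start"] + offset_s, "end": seg["end"] + offset_s}
        for seg in segments
    ]


# Hypothetical segments from the second 20-second chunk (times relative to that chunk)
chunk_2_segments = [
    {"start": 0.0, "end": 6.5, "text": "first sentence"},
    {"start": 6.5, "end": 20.0, "text": "second sentence"},
]

# The second chunk starts 20 seconds into the original recording
shifted = shift_segments(chunk_2_segments, offset_s=20.0)
for seg in shifted:
    print(f"[{seg['start']:.1f}s - {seg['end']:.1f}s]: {seg['text']}")
```

The input list is left unchanged, so the chunk-relative times remain available for debugging.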


In [None]:
# Step 6: Transcribing Audio Chunks with Timestamps

# Sort numerically so that chunk_10 comes after chunk_9 rather than after chunk_1
chunk_files = sorted(
    Path("audio_chunks").glob("chunk_*.mp3"),
    key=lambda p: int(p.stem.split("_")[1]),
)

full_transcript = []
current_offset_ms = 0  # Start of the current chunk within the original audio

for chunk_file in chunk_files:
    print(f"Transcribing {chunk_file}...")

    # Use a distinct name so the AudioSegment `audio` from Step 5 is not overwritten
    with open(chunk_file, "rb") as chunk_audio:
        response = openai.audio.transcriptions.create(
            model="whisper-1",
            file=chunk_audio,
            prompt="This is an interview after the moon landing. Transcribe accurately."
        )

    text = response.text

    # Decode the chunk once to get its duration in milliseconds
    chunk_duration_ms = len(AudioSegment.from_file(chunk_file))
    start_sec = current_offset_ms / 1000
    end_sec = (current_offset_ms + chunk_duration_ms) / 1000

    full_transcript.append({
        "start": start_sec,
        "end": end_sec,
        "text": text
    })

    current_offset_ms += chunk_duration_ms

# Print full transcript with timestamps
for segment in full_transcript:
    print(f"[{segment['start']:.1f}s - {segment['end']:.1f}s]: {segment['text']}\n")


Transcribing audio_chunks\chunk_1.mp3...
Transcribing audio_chunks\chunk_2.mp3...
Transcribing audio_chunks\chunk_3.mp3...
Transcribing audio_chunks\chunk_4.mp3...
Transcribing audio_chunks\chunk_5.mp3...
[0.0s - 20.0s]: It was rather interesting just to watch them gathering their materials and bouncing around. What do they call it? Kangaroo walk? Really? Something like that. Who named it that? I don't know. I bet those men are going to get quite a reception when they get back. Oh yes, I'll be so glad.

[20.0s - 40.0s]: I think that's pretty well fact, because they've landed so many safely now that I feel relieved. Just getting off of the moon was the thing that was... Have they met with the one that was circling the moon? Yes, they've round the moon.

[40.0s - 60.0s]: So I understand. That wasn't shown either, so I... But they say they have rendezvoused, so... That's a matter of making the circles and then coming down. What do you sort of imagine for the future? Do you imagine them st
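
Combining the chunk results into one continuous transcript is a matter of ordering the entries by start time and joining their text. The sketch below assumes entries shaped like those in `full_transcript`; the helper name `merge_transcript` is hypothetical, and the sample text is abbreviated from the output above.

```python
def merge_transcript(segments):
    """Join segment texts in chronological order, skipping empty entries."""
    ordered = sorted(segments, key=lambda seg: seg["start"])
    return " ".join(seg["text"].strip() for seg in ordered if seg["text"].strip())


# Abbreviated sample entries, deliberately out of order
sample_segments = [
    {"start": 20.0, "end": 40.0, "text": "I think that's pretty well fact. "},
    {"start": 0.0, "end": 20.0, "text": "Who named it that? I don't know."},
    {"start": 40.0, "end": 60.0, "text": "   "},
]

merged_text = merge_transcript(sample_segments)
print(merged_text)
```

Because the chunks are cut at fixed 20-second marks, a sentence or word can be split across two entries, so the merged text may still need manual review at the boundaries.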

### Step 7: Exporting Transcriptions with Timestamps

In this step, we take the chunked transcription data and export it in multiple formats:

1. **Human-readable text file** - easy to read and review
2. **SRT (subtitle) file** - can be used for video subtitles
3. **JSON file** - structured format for programmatic use

Each format includes timestamps for each chunk to preserve context and sequence.


In [None]:
import json

# Example transcription data (replace with `full_transcript` from Step 6 to export your actual results)
transcriptions = [
    {"start": 0.0, "end": 20.0, "text": "It was rather interesting just to watch them gathering their materials and bouncing around. What do they call it? Kangaroo walk? Really? Something like that. Who named it that? I don't know. I bet those men are going to get quite a reception when they get back. Oh yes, I'll be so glad."},
    {"start": 20.0, "end": 40.0, "text": "I think that's pretty well fact, because they've landed so many safely now that I feel relieved. Just getting off of the moon was the thing that was... Have they met with the one that was circling the moon? Yes, they've round the moon."},
    {"start": 40.0, "end": 60.0, "text": "So I understand. That wasn't shown either, so I... But they say they have rendezvoused, so... That's a matter of making the circles and then coming down. What do you sort of imagine for the future? Do you imagine them standing up? I think they will. I think they will."},
    {"start": 60.0, "end": 80.0, "text": "We'll do some more exploring up there. Very positive. Because that was such a very small area, when you think of it, that they just gathered rocks and samples of soil and all. And they did a probe."},
    {"start": 80.0, "end": 86.2, "text": "For more information, visit www.FEMA.gov"}
]

# Output folder
output_dir = Path("transcriptions")
output_dir.mkdir(exist_ok=True)

# 1. Export human-readable text file
txt_file = output_dir / "transcription.txt"
with open(txt_file, "w", encoding="utf-8") as f:
    for chunk in transcriptions:
        f.write(f"[{chunk['start']:.1f}s - {chunk['end']:.1f}s]: {chunk['text']}\n\n")
print(f"Text file exported to {txt_file}")

# 2. Export SRT file
def format_srt_time(seconds):
    # Work in whole milliseconds to avoid off-by-one truncation from floating-point subtraction
    total_ms = round(seconds * 1000)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

srt_file = output_dir / "transcription.srt"
with open(srt_file, "w", encoding="utf-8") as f:
    for i, chunk in enumerate(transcriptions, start=1):
        start_time = format_srt_time(chunk['start'])
        end_time = format_srt_time(chunk['end'])
        f.write(f"{i}\n{start_time} --> {end_time}\n{chunk['text']}\n\n")
print(f"SRT file exported to {srt_file}")

# 3. Export JSON file
json_file = output_dir / "transcription.json"
with open(json_file, "w", encoding="utf-8") as f:
    json.dump(transcriptions, f, indent=4)
print(f"JSON file exported to {json_file}")


Text file exported to transcriptions\transcription.txt
SRT file exported to transcriptions\transcription.srt
JSON file exported to transcriptions\transcription.json
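
Before relying on the exported files, it can be worth checking that the timestamps are consistent and that the data survives a JSON round trip. The sketch below is one possible check; the helper `find_timing_problems` is hypothetical, and it works in memory rather than reading the files written above.

```python
import json


def find_timing_problems(segments):
    """Return a list of human-readable problems found in segment timestamps."""
    problems = []
    previous_end = 0.0
    for i, seg in enumerate(segments, start=1):
        if seg["end"] <= seg["start"]:
            problems.append(f"segment {i}: end is not after start")
        if seg["start"] < previous_end:
            problems.append(f"segment {i}: overlaps the previous segment")
        previous_end = max(previous_end, seg["end"])
    return problems


segments = [
    {"start": 0.0, "end": 20.0, "text": "first chunk"},
    {"start": 20.0, "end": 40.0, "text": "second chunk"},
]

# Serialize and parse again, as a stand-in for writing and re-reading the JSON file
round_tripped = json.loads(json.dumps(segments))
print("Round trip preserved data:", round_tripped == segments)
print("Timing problems:", find_timing_problems(round_tripped))
```

An empty problem list means every segment ends after it starts and none begins before the previous one has finished, which is what subtitle players generally expect from an SRT file.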
