YouTube Transcript Project Setup Guide

This guide will walk you through setting up a project that extracts transcripts from YouTube videos using FastAPI (backend) and displays them in a Next.js (frontend) application.


Project Structure

Here’s the folder structure for the project:

youtube-project/
├── backend/
│   ├── env/                   # Python virtual environment
│   ├── __pycache__/           # Python cache files
│   ├── main.py                # FastAPI entry point
│   ├── routes.py              # API routes
│   └── requirements.txt       # Python dependencies
├── frontend/
│   ├── app/                   # Next.js app directory
│   ├── node_modules/          # Node.js dependencies
│   ├── public/                # Static files
│   ├── package.json           # Node.js dependencies
│   └── package-lock.json      # Node.js lock file
├── youtube_ai/
│   ├── mp3/                   # Downloaded audio files
│   ├── audio.json             # Transcript in JSON format
│   ├── audio.txt              # Transcript in text format
│   ├── extract_code.py        # Script to extract transcripts
│   └── requirements.txt       # Python dependencies for transcript extraction
└── README.md                  # Project documentation

Step 1: Set Up the Backend (FastAPI)

1.1 Create the Backend Folder

  1. Open your terminal and navigate to your project directory:
    cd C:\Users\Aamir\OneDrive\Desktop\youtube-project
  2. Create the backend folder:
    mkdir backend
    cd backend

1.2 Set Up a Python Virtual Environment

  1. Create a virtual environment:
    python -m venv env
  2. Activate the virtual environment:
    • On Windows:
      .\env\Scripts\activate
    • On macOS/Linux:
      source env/bin/activate

1.3 Install FastAPI and Dependencies

  1. Install FastAPI and Uvicorn:
    pip install fastapi uvicorn
  2. Save the dependencies to requirements.txt:
    pip freeze > requirements.txt

1.4 Create the FastAPI Application

  1. Create main.py:

    from fastapi import FastAPI
    from fastapi.middleware.cors import CORSMiddleware
    from routes import router  # Import the router
    
    app = FastAPI()
    
    # Allow the Next.js dev server (http://localhost:3000) to call this API
    # from the browser
    app.add_middleware(
        CORSMiddleware,
        allow_origins=["http://localhost:3000"],
        allow_methods=["*"],
        allow_headers=["*"],
    )
    
    # Include the router
    app.include_router(router)
    
    if __name__ == "__main__":
        import uvicorn
        uvicorn.run(app, host="127.0.0.1", port=8000)
  2. Create routes.py:

    from fastapi import APIRouter, HTTPException
    from pydantic import BaseModel
    from typing import Any
    import json
    from pathlib import Path
    
    # Create a router
    router = APIRouter()
    
    class Transcript(BaseModel):
        message: str
        # audio.json holds a list of caption segments (or a Whisper result
        # dict), so this field cannot be typed as a plain str
        transcription: Any
    
    @router.get("/transcript", response_model=Transcript)
    async def get_transcript():
        # Define the path to the audio.json file
        audio_json_path = Path("C:/Users/Aamir/OneDrive/Desktop/youtube-project/youtube_ai/audio.json")
        
        # Check if the file exists
        if not audio_json_path.exists():
            raise HTTPException(status_code=404, detail="Transcript file not found")
        
        # Read the transcript data from the file
        try:
            with open(audio_json_path, "r", encoding="utf-8") as file:
                transcript_data = json.load(file)
        except Exception as e:
            raise HTTPException(status_code=500, detail=f"Error reading transcript file: {str(e)}")
        
        return {"message": "Transcript fetched successfully!", "transcription": transcript_data}
  3. Run the FastAPI server:

    uvicorn main:app --reload
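The file-reading logic inside `get_transcript` can be exercised without starting the server. Here is a minimal standalone sketch of that logic (the helper name `load_transcript` is ours for illustration and is not part of the project files):

```python
import json
import tempfile
from pathlib import Path

def load_transcript(audio_json_path: Path) -> dict:
    """Mirror of the endpoint's logic, without FastAPI: a missing file
    raises FileNotFoundError (the 404 case); invalid JSON raises ValueError
    (the 500 case)."""
    if not audio_json_path.exists():
        raise FileNotFoundError("Transcript file not found")
    with open(audio_json_path, "r", encoding="utf-8") as file:
        transcript_data = json.load(file)
    return {"message": "Transcript fetched successfully!", "transcription": transcript_data}

# Demo with a temporary audio.json containing one caption segment
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "audio.json"
    path.write_text(json.dumps([{"text": "hello", "start": 0.0, "duration": 1.5}]))
    payload = load_transcript(path)
    print(payload["message"])
```

This separates "file problems" from "HTTP problems", which makes the two `HTTPException` branches in `routes.py` easier to reason about and test.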

Step 2: Set Up the Frontend (Next.js)

2.1 Create the Frontend Folder

  1. Navigate to your project directory:
    cd C:\Users\Aamir\OneDrive\Desktop\youtube-project
  2. Create the frontend folder:
    mkdir frontend
    cd frontend

2.2 Set Up a Next.js App

  1. Initialize a Next.js app:
    npx create-next-app .
  2. Install dependencies (create-next-app normally installs them for you; run this only if node_modules is missing):
    npm install

2.3 Create the Transcript Page

  1. Open app/page.js and replace its content with:

    'use client'; // This component uses React hooks, so it must be a Client Component
    
    import { useEffect, useState } from 'react';
    
    export default function Home() {
      const [transcript, setTranscript] = useState(null);
      const [loading, setLoading] = useState(true);
      const [error, setError] = useState(null);
    
      useEffect(() => {
        const fetchTranscript = async () => {
          try {
            const response = await fetch('http://127.0.0.1:8000/transcript');
            if (!response.ok) {
              throw new Error('Failed to fetch transcript');
            }
            const data = await response.json();
            setTranscript(data.transcription);
          } catch (error) {
            setError(error.message);
          } finally {
            setLoading(false);
          }
        };
    
        fetchTranscript();
      }, []);
    
      if (loading) {
        return <p>Loading transcript...</p>;
      }
    
      if (error) {
        return <p>Error: {error}</p>;
      }
    
      return (
        <div>
          <h1>Transcript</h1>
          <pre>{JSON.stringify(transcript, null, 2)}</pre>
        </div>
      );
    }
  2. Run the Next.js app:

    npm run dev

Step 3: Set Up the YouTube Transcript Extraction

3.1 Create the youtube_ai Folder

  1. Navigate to your project directory:
    cd C:\Users\Aamir\OneDrive\Desktop\youtube-project
  2. Create the youtube_ai folder:
    mkdir youtube_ai
    cd youtube_ai

3.2 Install Dependencies

  1. Create a virtual environment:
    python -m venv env
  2. Activate the virtual environment:
    • On Windows:
      .\env\Scripts\activate
    • On macOS/Linux:
      source env/bin/activate
  3. Install dependencies (Whisper also requires FFmpeg to be installed and available on your PATH):
    pip install youtube-transcript-api pytube openai-whisper
  4. Save the dependencies to requirements.txt:
    pip freeze > requirements.txt

3.3 Create the Transcript Extraction Script

  1. Create extract_code.py:

    from youtube_transcript_api import YouTubeTranscriptApi
    from pytube import YouTube
    import whisper
    import json
    import os
    
    def extract_transcript(video_url):
        try:
            # Extract video ID from URL (drop any query parameters after it)
            video_id = video_url.split("v=")[1].split("&")[0]
            
            # Try fetching the transcript
            transcript = YouTubeTranscriptApi.get_transcript(video_id)
            with open("audio.json", "w") as file:
                json.dump(transcript, file)
            print("Transcript saved to audio.json")
        except Exception as e:
            print(f"Error fetching transcript: {e}")
            print("Using Whisper AI to generate transcript...")
            
            # Download audio using pytube
            yt = YouTube(video_url)
            audio_stream = yt.streams.filter(only_audio=True).first()
            audio_stream.download(output_path="mp3", filename="audio.mp3")
            
            # Transcribe audio using Whisper
            model = whisper.load_model("base")
            result = model.transcribe("mp3/audio.mp3")
            with open("audio.json", "w") as file:
                json.dump(result, file)
            print("Transcript saved to audio.json")
    
    if __name__ == "__main__":
        video_url = input("Enter YouTube Video URL: ")
        extract_transcript(video_url)
  2. Run the script:

    python extract_code.py
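Splitting on "v=" only works for full watch URLs; it fails on youtu.be short links. A more robust sketch using the standard library is shown below, together with a small helper that flattens caption segments into the plain text stored as audio.txt in the project tree. Both helper names (`get_video_id`, `segments_to_text`) are ours for illustration:

```python
from urllib.parse import urlparse, parse_qs

def get_video_id(video_url: str) -> str:
    """Extract the video ID from common YouTube URL shapes."""
    parsed = urlparse(video_url)
    if parsed.hostname == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/")
    query = parse_qs(parsed.query)
    if "v" in query:
        # Watch URLs carry the ID in the ?v= query parameter
        return query["v"][0]
    raise ValueError(f"Could not find a video ID in: {video_url}")

def segments_to_text(segments: list) -> str:
    """Join youtube-transcript-api segments into one plain-text string."""
    return " ".join(seg["text"].strip() for seg in segments)

print(get_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=42s"))  # dQw4w9WgXcQ
print(get_video_id("https://youtu.be/dQw4w9WgXcQ"))                       # dQw4w9WgXcQ
```

Using `urlparse`/`parse_qs` instead of string splitting also tolerates reordered query parameters (e.g. `?t=42s&v=...`), which the naive split does not.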

Step 4: Run the Project

  1. Start the FastAPI backend:

    cd C:\Users\Aamir\OneDrive\Desktop\youtube-project\backend
    uvicorn main:app --reload
  2. Start the Next.js frontend:

    cd C:\Users\Aamir\OneDrive\Desktop\youtube-project\frontend
    npm run dev
  3. Open your browser and navigate to:

    http://localhost:3000
    

Troubleshooting

  • CORS Issues: Ensure CORS is enabled in the FastAPI backend (see Step 1.4).
  • File Not Found: Ensure the audio.json file exists in the youtube_ai folder.
  • Dependency Errors: Reinstall dependencies using pip install -r requirements.txt or npm install.
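For the File Not Found case, a quick diagnostic can distinguish a missing audio.json from a corrupt one before the backend turns both into HTTP errors. A minimal sketch (the helper name `check_transcript_file` is ours, not part of the project):

```python
import json
from pathlib import Path

def check_transcript_file(path_str: str) -> str:
    """Return a short diagnosis of the audio.json file."""
    path = Path(path_str)
    if not path.exists():
        return "missing: run extract_code.py first"
    try:
        json.loads(path.read_text(encoding="utf-8"))
    except ValueError:
        return "exists but is not valid JSON"
    return "ok"

print(check_transcript_file("youtube_ai/audio.json"))
```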

This guide walks you through the full setup; if something still fails, revisit the Troubleshooting section above.
