This guide will walk you through setting up a project that extracts transcripts from YouTube videos using FastAPI (backend) and displays them in a Next.js (frontend) application.
Here’s the folder structure for the project:

```
youtube-project/
├── backend/
│   ├── env/                # Python virtual environment
│   ├── __pycache__/        # Python cache files
│   ├── main.py             # FastAPI entry point
│   ├── routes.py           # API routes
│   └── requirements.txt    # Python dependencies
├── frontend/
│   ├── app/                # Next.js app directory
│   ├── node_modules/       # Node.js dependencies
│   ├── public/             # Static files
│   ├── package.json        # Node.js dependencies
│   └── package-lock.json   # Node.js lock file
├── youtube_ai/
│   ├── mp3/                # Downloaded audio files
│   ├── audio.json          # Transcript in JSON format
│   ├── audio.txt           # Transcript in text format
│   ├── extract_code.py     # Script to extract transcripts
│   └── requirements.txt    # Python dependencies for transcript extraction
└── README.md               # Project documentation
```
- Open your terminal and navigate to your project directory:

  ```
  cd C:\Users\Aamir\OneDrive\Desktop\youtube-project
  ```
- Create the `backend` folder:

  ```
  mkdir backend
  cd backend
  ```
- Create a virtual environment:

  ```
  python -m venv env
  ```
- Activate the virtual environment:
  - On Windows:

    ```
    .\env\Scripts\activate
    ```

  - On macOS/Linux:

    ```
    source env/bin/activate
    ```
- Install FastAPI and Uvicorn:

  ```
  pip install fastapi uvicorn
  ```
- Save the dependencies to `requirements.txt`:

  ```
  pip freeze > requirements.txt
  ```
- Create `main.py`. Note that CORS must be enabled here, since the Next.js dev server (port 3000) calls the API from the browser:

  ```python
  from fastapi import FastAPI
  from fastapi.middleware.cors import CORSMiddleware
  from routes import router  # Import the router

  app = FastAPI()

  # Allow the Next.js dev server to call this API from the browser
  app.add_middleware(
      CORSMiddleware,
      allow_origins=["http://localhost:3000"],
      allow_methods=["*"],
      allow_headers=["*"],
  )

  # Include the router
  app.include_router(router)

  if __name__ == "__main__":
      import uvicorn
      uvicorn.run(app, host="127.0.0.1", port=8000)
  ```
- Create `routes.py`. The transcript data is a list (from the YouTube API) or a dict (from Whisper), so the `transcription` field is typed as `Any` rather than `str`:

  ```python
  from fastapi import APIRouter, HTTPException
  from pydantic import BaseModel
  from typing import Any
  import json
  from pathlib import Path

  # Create a router
  router = APIRouter()

  class Transcript(BaseModel):
      message: str
      transcription: Any  # list or dict, depending on how the transcript was produced

  @router.get("/transcript", response_model=Transcript)
  async def get_transcript():
      # Define the path to the audio.json file
      audio_json_path = Path("C:/Users/Aamir/OneDrive/Desktop/youtube-project/youtube_ai/audio.json")

      # Check if the file exists
      if not audio_json_path.exists():
          raise HTTPException(status_code=404, detail="Transcript file not found")

      # Read the transcript data from the file
      try:
          with open(audio_json_path, "r", encoding="utf-8") as file:
              transcript_data = json.load(file)
      except Exception as e:
          raise HTTPException(status_code=500, detail=f"Error reading transcript file: {str(e)}")

      return {"message": "Transcript fetched successfully!", "transcription": transcript_data}
  ```
- Run the FastAPI server:

  ```
  uvicorn main:app --reload
  ```
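The error-handling pattern in `get_transcript` above can be sketched in plain Python, independent of FastAPI. The `load_transcript` helper below is hypothetical and returns `(status_code, payload)` tuples purely for illustration; the real route raises `HTTPException` instead:

```python
import json
from pathlib import Path

def load_transcript(path):
    """Read a transcript JSON file, mirroring the route's error handling.

    Returns (status_code, payload): 404 if the file is missing,
    500 if it cannot be parsed, 200 with the data on success.
    (Hypothetical helper for illustration only.)
    """
    audio_json_path = Path(path)
    if not audio_json_path.exists():
        return 404, "Transcript file not found"
    try:
        with open(audio_json_path, "r", encoding="utf-8") as file:
            transcript_data = json.load(file)
    except Exception as e:
        return 500, f"Error reading transcript file: {e}"
    return 200, transcript_data

if __name__ == "__main__":
    print(load_transcript("does_not_exist.json"))  # (404, 'Transcript file not found')
```

Checking for the file before opening it is what lets the API distinguish a missing transcript (404, the client's problem to fix by running the extractor) from a corrupt one (500, a server-side problem).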
- Navigate to your project directory:

  ```
  cd C:\Users\Aamir\OneDrive\Desktop\youtube-project
  ```
- Create the `frontend` folder:

  ```
  mkdir frontend
  cd frontend
  ```
- Initialize a Next.js app:

  ```
  npx create-next-app .
  ```

- Install dependencies:

  ```
  npm install
  ```
- Open `app/page.js` and replace its content with the following. The `'use client'` directive is required because the component uses hooks, which only run in client components in the app directory:

  ```javascript
  'use client';

  import { useEffect, useState } from 'react';

  export default function Home() {
    const [transcript, setTranscript] = useState(null);
    const [loading, setLoading] = useState(true);
    const [error, setError] = useState(null);

    useEffect(() => {
      const fetchTranscript = async () => {
        try {
          const response = await fetch('http://127.0.0.1:8000/transcript');
          if (!response.ok) {
            throw new Error('Failed to fetch transcript');
          }
          const data = await response.json();
          setTranscript(data.transcription);
        } catch (error) {
          setError(error.message);
        } finally {
          setLoading(false);
        }
      };
      fetchTranscript();
    }, []);

    if (loading) {
      return <p>Loading transcript...</p>;
    }
    if (error) {
      return <p>Error: {error}</p>;
    }
    return (
      <div>
        <h1>Transcript</h1>
        <pre>{JSON.stringify(transcript, null, 2)}</pre>
      </div>
    );
  }
  ```
- Run the Next.js app:

  ```
  npm run dev
  ```
- Navigate to your project directory:

  ```
  cd C:\Users\Aamir\OneDrive\Desktop\youtube-project
  ```
- Create the `youtube_ai` folder:

  ```
  mkdir youtube_ai
  cd youtube_ai
  ```
- Create a virtual environment:

  ```
  python -m venv env
  ```
- Activate the virtual environment:
  - On Windows:

    ```
    .\env\Scripts\activate
    ```

  - On macOS/Linux:

    ```
    source env/bin/activate
    ```
- Install dependencies:

  ```
  pip install youtube-transcript-api pytube openai-whisper
  ```
- Save the dependencies to `requirements.txt`:

  ```
  pip freeze > requirements.txt
  ```
- Create `extract_code.py`:

  ```python
  from youtube_transcript_api import YouTubeTranscriptApi
  from pytube import YouTube
  import whisper
  import json

  def extract_transcript(video_url):
      try:
          # Extract video ID from URL
          video_id = video_url.split("v=")[1]

          # Try fetching the transcript
          transcript = YouTubeTranscriptApi.get_transcript(video_id)
          with open("audio.json", "w") as file:
              json.dump(transcript, file)
          print("Transcript saved to audio.json")
      except Exception as e:
          print(f"Error fetching transcript: {e}")
          print("Using Whisper AI to generate transcript...")

          # Download audio using pytube
          yt = YouTube(video_url)
          audio_stream = yt.streams.filter(only_audio=True).first()
          audio_stream.download(output_path="mp3", filename="audio.mp3")

          # Transcribe audio using Whisper
          model = whisper.load_model("base")
          result = model.transcribe("mp3/audio.mp3")
          with open("audio.json", "w") as file:
              json.dump(result, file)
          print("Transcript saved to audio.json")

  if __name__ == "__main__":
      video_url = input("Enter YouTube Video URL: ")
      extract_transcript(video_url)
  ```
- Run the script:

  ```
  python extract_code.py
  ```
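One caveat about the script: `video_url.split("v=")[1]` keeps everything after `v=`, so a URL with extra query parameters (e.g. `&t=30s`) or a `youtu.be` short link will produce a bad video ID. A more robust extractor, sketched here as a hypothetical `extract_video_id` helper covering only the two standard URL shapes, uses the standard library's URL parsing:

```python
from urllib.parse import urlparse, parse_qs

def extract_video_id(video_url):
    """Pull the video ID out of common YouTube URL forms.

    Handles https://www.youtube.com/watch?v=ID (with any extra
    parameters) and https://youtu.be/ID short links.
    (Hypothetical helper, not part of the script above.)
    """
    parsed = urlparse(video_url)
    if parsed.hostname == "youtu.be":
        # Short links carry the ID as the path component
        return parsed.path.lstrip("/")
    # Standard links carry the ID in the ?v= query parameter
    query = parse_qs(parsed.query)
    if "v" in query:
        return query["v"][0]
    raise ValueError(f"Could not find a video ID in {video_url!r}")
```

With this in place, `video_id = extract_video_id(video_url)` could replace the `split("v=")` line.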
- Start the FastAPI backend:

  ```
  cd C:\Users\Aamir\OneDrive\Desktop\youtube-project\backend
  uvicorn main:app --reload
  ```
- Start the Next.js frontend:

  ```
  cd C:\Users\Aamir\OneDrive\Desktop\youtube-project\frontend
  npm run dev
  ```
- Open your browser and navigate to `http://localhost:3000`.
- CORS Issues: Ensure CORS is enabled in the FastAPI backend by adding `CORSMiddleware` in `main.py`.
- File Not Found: Ensure the `audio.json` file exists in the `youtube_ai` folder.
- Dependency Errors: Reinstall dependencies using `pip install -r requirements.txt` or `npm install`.
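The second troubleshooting item can be automated with a small preflight script that confirms `audio.json` exists and actually parses as JSON before you start the backend. This `check_transcript_file` helper is a sketch, and the default path is an assumption based on the folder structure above; adjust it to your project location:

```python
import json
from pathlib import Path

def check_transcript_file(path="youtube_ai/audio.json"):
    """Return a human-readable diagnosis of the transcript file."""
    p = Path(path)
    if not p.exists():
        return f"MISSING: {p} does not exist - run extract_code.py first"
    try:
        json.loads(p.read_text(encoding="utf-8"))
    except json.JSONDecodeError as e:
        return f"INVALID: {p} is not valid JSON ({e})"
    return f"OK: {p} exists and parses as JSON"

if __name__ == "__main__":
    print(check_transcript_file())
```

Running this from the project root tells you immediately whether a 404 from the `/transcript` endpoint is caused by a missing file or something else.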
This guide should help you set up the project without errors. Let me know if you need further assistance! 😊