<a href="https://colab.research.google.com/github/Ayushichadha/robocorp_project/blob/main/VideoTranscription.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Task 1: Video Analysis - Speech to Text and Sentiment Analysis * Integrate a speech-to-text engine to process videos and generate transcripts with timestamps. * Develop a sentiment analysis service to evaluate the video's dialogue and detect changes over time. * Create endpoints to retrieve speech transcripts and sentiment analysis results.

Here's a breakdown of what the code does:

1.Installs necessary dependencies.
2.Defines functions for:

a.)Downloading a YouTube video
b.)Converting video to audio
c.)Transcribing audio to text
d.)Performing sentiment analysis
e.)Plotting sentiment over time


3.Executes the main workflow:
a.)Downloads the specified TED Talk video
b.)Converts it to audio
c.)Transcribes the audio
d.)Performs sentiment analysis on the transcript
e.)Plots the sentiment scores over time
f.)Prints the transcript with sentiment scores
g.)Cleans up temporary files



In [1]:
# Install necessary packages
!pip install yt-dlp
!pip install transformers
!pip install datasets
!pip install librosa
!apt-get install -y ffmpeg

# Import necessary libraries
import os
import yt_dlp as youtube_dl
from transformers import pipeline
import librosa
import matplotlib.pyplot as plt

def download_youtube_video(url, output_path):
    """
    Download a YouTube video using yt-dlp.

    Args:
    url (str): URL of the YouTube video
    output_path (str): Path to save the downloaded video

    Returns:
    str: Path to the downloaded video file
    """
    ydl_opts = {
        'outtmpl': output_path,
        'format': 'bestaudio/best',
        'postprocessors': [{
            'key': 'FFmpegExtractAudio',
            'preferredcodec': 'wav',
            'preferredquality': '192',
        }],
    }

    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])

    return output_path

def transcribe_audio(audio_path):
    """
    Transcribe audio file to text using transformers' Wav2Vec2.

    Args:
    audio_path (str): Path to the input audio file

    Returns:
    list: List of dictionaries containing timestamps and transcribed text
    """
    # Load the pre-trained Wav2Vec2 model and tokenizer
    transcriber = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")

    # Load the audio file
    audio, sample_rate = librosa.load(audio_path, sr=16000)

    # Segment the audio into chunks
    chunk_duration = 30  # 30 seconds per chunk
    chunk_length = chunk_duration * sample_rate
    transcript = []

    for i in range(0, len(audio), chunk_length):
        chunk = audio[i:i + chunk_length]
        text = transcriber(chunk)["text"]
        transcript.append({
            'timestamp': i // sample_rate,
            'text': text
        })

    return transcript

def perform_sentiment_analysis(transcript):
    """
    Perform sentiment analysis on transcribed text using transformers' sentiment analysis model.

    Args:
    transcript (list): List of dictionaries containing timestamps and transcribed text

    Returns:
    list: List of dictionaries containing timestamps, text, and sentiment scores
    """
    sentiment_analyzer = pipeline("sentiment-analysis")

    sentiment_results = []
    for entry in transcript:
        sentiment = sentiment_analyzer(entry['text'])[0]
        sentiment_results.append({
            'timestamp': entry['timestamp'],
            'text': entry['text'],
            'sentiment': sentiment['label'],
            'score': sentiment['score']
        })

    return sentiment_results

def plot_sentiment_over_time(sentiment_results):
    """
    Plot sentiment scores over time.

    Args:
    sentiment_results (list): List of dictionaries containing timestamps and sentiment scores
    """
    timestamps = [entry['timestamp'] for entry in sentiment_results]
    sentiments = [entry['score'] if entry['sentiment'] == 'POSITIVE' else -entry['score'] for entry in sentiment_results]

    plt.figure(figsize=(12, 6))
    plt.plot(timestamps, sentiments, marker='o')
    plt.title('Sentiment Analysis Over Time')
    plt.xlabel('Time (seconds)')
    plt.ylabel('Sentiment Score')
    plt.grid(True)
    plt.show()

# Main execution
video_url = "https://youtu.be/Q2s-WLW6UxU?si=VakPzdQldEXyG-kJ"
audio_path = 'ted_talk_audio.wav'

# Download video
download_youtube_video(video_url, audio_path)

# Transcribe audio
transcript = transcribe_audio(audio_path)

# Perform sentiment analysis
sentiment_results = perform_sentiment_analysis(transcript)

# Plot sentiment over time
plot_sentiment_over_time(sentiment_results)

# Print transcript and sentiment results
for result in sentiment_results:
    print(f"Timestamp: {result['timestamp']}")
    print(f"Text: {result['text']}")
    print(f"Sentiment: {result['sentiment']}")
    print("---")

# Clean up temporary files
os.remove(audio_path)


Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 45 not upgraded.
[youtube] Extracting URL: https://youtu.be/Q2s-WLW6UxU?si=VakPzdQldEXyG-kJ
[youtube] Q2s-WLW6UxU: Downloading webpage




[youtube] Q2s-WLW6UxU: Downloading ios player API JSON
[youtube] Q2s-WLW6UxU: Downloading player 5604538d
[youtube] Q2s-WLW6UxU: Downloading web player API JSON




[youtube] Q2s-WLW6UxU: Downloading web player API JSON




[youtube] Q2s-WLW6UxU: Downloading web player API JSON
[youtube] Q2s-WLW6UxU: Downloading m3u8 information
[info] Q2s-WLW6UxU: Downloading 1 format(s): 251
[download] Destination: ted_talk_audio.wav
[download] 100% of   10.65MiB in 00:00:00 at 21.98MiB/s  
[ExtractAudio] Destination: ted_talk_audio.wav.wav
Deleting original file ted_talk_audio.wav (pass -k to keep)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
Some weights of the model checkpoint at facebook/wav2vec2-base-960h were not used when initializing Wav2Vec2ForCTC: ['wav2vec2.encoder.pos_conv_embed.conv.weight_g', 'wav2vec2.encoder.pos_conv_embed.conv.weight_v']
- This IS expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model that you expect to be exactly identical (initia

FileNotFoundError: [Errno 2] No such file or directory: 'ted_talk_audio.wav'