# 🎬 Momentify: AI-Powered Sports Highlight Generator

**Momentify** is a video processing tool that automatically identifies and extracts key moments from sports videos. It transcribes the commentary with Whisper, asks an LLM (served through Groq) which transcript timestamps match the request, and cuts the matching clips with MoviePy. Users upload a match video and request specific highlights, such as "Messi goals" or "penalty kicks", which are then compiled into a final highlight reel.

---

### 📚 Notebook Structure

1. **Install Dependencies**  
   Installs the required Python libraries: Whisper, MoviePy, Gradio, and the Groq SDK.

2. **Import Libraries & Load API Keys**  
   Load necessary modules and securely retrieve your Groq API key from Kaggle Secrets.

3. **Initialize Models**  
   Load the Whisper speech-to-text model and Groq LLM client.

4. **Define Core Functions**  
   - `transcribe_video()` – Transcribes audio from uploaded video  
   - `extract_highlights()` – Uses Groq (LLaMA 4) to find highlight-worthy timestamps  
   - `process_video()` – Cuts the video clips and combines them

5. **Build Gradio Interface**  
   Interactive web UI for uploading video, entering a query, and previewing the result.

6. **Launch App**  
   Runs the full application through a simple user-friendly interface.

---


# 📦 Install Dependencies

These are the required packages:
- `gradio` for the UI
- `moviepy` for video editing (the code below uses the 1.x `moviepy.editor` API)
- `whisper` (OpenAI Whisper, installed from GitHub) for transcription
- `groq` for using LLaMA 3/4 models via the Groq API


In [114]:
!pip install -q gradio
# Pin MoviePy 1.x: the code below relies on `moviepy.editor` and `subclip`, which 2.x removed
!pip install -q "moviepy<2"
!pip install -q git+https://github.com/openai/whisper.git
!pip install -q groq

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


# 📚 Import Libraries

We import everything needed:
- Whisper for transcription
- MoviePy for video editing
- Groq for highlight extraction
- Gradio for the interactive interface


In [116]:
import whisper
import gradio as gr
import moviepy.editor as mp
import json
import re
import os
from groq import Groq
from kaggle_secrets import UserSecretsClient

# 🔐 Load GROQ API Key from Kaggle Secrets

This is safer than hard-coding your key in the notebook.
Add a secret named `GROQ_API_KEY` to the notebook's Kaggle Secrets before running this cell.


In [117]:
user_secrets = UserSecretsClient()
groq_api_key = user_secrets.get_secret("GROQ_API_KEY")
client = Groq(api_key=groq_api_key)
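
The cell above only works inside a Kaggle session, because `kaggle_secrets` is not available elsewhere. As a minimal sketch, assuming you also want to run the notebook locally, the key lookup could fall back to an environment variable. The helper name `load_groq_api_key` is hypothetical and not part of the original notebook.

```python
import os


def load_groq_api_key(secret_name="GROQ_API_KEY"):
    """Return the API key from Kaggle Secrets, else from the environment.

    Hypothetical helper: not part of the original notebook.
    """
    try:
        # Only importable inside a Kaggle notebook session.
        from kaggle_secrets import UserSecretsClient

        key = UserSecretsClient().get_secret(secret_name)
        if key:
            return key
    except Exception:
        # Not on Kaggle, or the secret is not attached to this notebook.
        pass

    key = os.environ.get(secret_name)
    if not key:
        raise RuntimeError(
            f"{secret_name} was not found in Kaggle Secrets or the environment."
        )
    return key
```

The returned value can then be passed to `Groq(api_key=...)` exactly as in the cell above.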

# 🗣️ Load Whisper Model (once)

We use the medium-sized model for a good balance between speed and accuracy.

In [118]:
model = whisper.load_model("medium")

# 🎧 Transcribe Uploaded Video with Whisper

This function takes the path to an MP4 file and returns the full transcript text together with Whisper's timestamped segments.


In [119]:
def transcribe_video(file_path):
    result = model.transcribe(file_path, verbose=True)
    return result['text'], result['segments']
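
Each entry in `result['segments']` is a dict that carries, among other fields, `start` and `end` (float seconds) and `text`. The sketch below shows how such segments turn into the timestamped lines that are later sent to the LLM. It mirrors the formatting used in `extract_highlights()`, except that it also strips the leading space Whisper usually puts on segment text. The helper name `format_segments` is an illustration, and the sample data is copied from the transcript log of the sample run.

```python
def format_segments(segments):
    """Render Whisper-style segments as '[start --> end] text' lines."""
    lines = []
    for seg in segments:
        start = int(seg["start"])  # truncate to whole seconds, as the prompt does
        end = int(seg["end"])
        lines.append(f"[{start} --> {end}] {seg['text'].strip()}")
    return "\n".join(lines)


# Two segments taken from the sample run's transcript log
sample_segments = [
    {"start": 52.0, "end": 54.0, "text": " and Messi!"},
    {"start": 56.0, "end": 58.0, "text": " a goal for Evermore"},
]
print(format_segments(sample_segments))
# [52 --> 54] and Messi!
# [56 --> 58] a goal for Evermore
```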

# 🧠 Extract Highlights from Transcript using LLaMA 4 (Groq)

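This sends the timestamped transcript and the user query to the Groq API and asks for the matching segment timestamps.
The response is streamed, reassembled into a single string, and parsed as JSON; a `ValueError` is raised if the model returns anything else.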


In [129]:
def extract_highlights(transcript_segments, user_query):
    transcript_text = "\n".join([
        f"[{int(seg['start'])} --> {int(seg['end'])}] {seg['text']}"
        for seg in transcript_segments
    ])

    prompt = f"""
You are a JSON-only assistant.

Return only valid JSON in the following format:

{{
  "segments": [
    {{ "start": 52, "end": 58 }},
    {{ "start": 108, "end": 110 }}
  ]
}}

No markdown, no explanation, no wrapping. Just plain JSON.
User query: {user_query}
Transcript:
{transcript_text}
"""

    completion = client.chat.completions.create(
        model="meta-llama/llama-4-scout-17b-16e-instruct",
        messages=[{"role": "user", "content": prompt}],
        temperature=1,
        max_tokens=1024,
        top_p=1,
        stream=True,
        stop=None,
    )

    streamed_response = ""
    for chunk in completion:
        streamed_response += chunk.choices[0].delta.content or ""

    raw_json = streamed_response.strip()

    try:
        return json.loads(raw_json)
    except json.JSONDecodeError:
        raise ValueError("❌ Groq did not return valid JSON:\n" + raw_json)
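
Even with a JSON-only prompt, models sometimes wrap the answer in a Markdown code fence or add a sentence around it, which makes `json.loads` fail on the raw string. The following is a minimal sketch of a more forgiving parser, assuming the response contains exactly one JSON object. The helper name `parse_segments` is hypothetical and not part of the original notebook.

```python
import json
import re


def parse_segments(raw_response):
    """Extract and validate the 'segments' list from an LLM response string."""
    # Grab the outermost {...} span, which skips code fences and stray prose.
    match = re.search(r"\{.*\}", raw_response, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in model response")

    # json.JSONDecodeError is a subclass of ValueError, so callers can
    # catch ValueError for both failure modes.
    data = json.loads(match.group(0))

    cleaned = []
    for seg in data.get("segments", []):
        start, end = float(seg["start"]), float(seg["end"])
        if end > start >= 0:  # drop empty, reversed, or negative ranges
            cleaned.append({"start": start, "end": end})
    return cleaned
```

Note that this returns the cleaned list of segments directly rather than the full `{"segments": [...]}` dict, so a caller would need to adapt accordingly.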


# 🎬 Process Video: Transcribe → Detect Highlights → Cut Clips

This function:
1. Transcribes the video
2. Extracts highlights with Groq
3. Cuts clips using MoviePy
4. Returns steps and the final video path


In [133]:
def process_video(video_file, user_query):
    steps = []
    steps.append("🎬 **Step 1:** Video uploaded.")

    video_path = video_file
    steps.append("🔍 **Step 2:** Transcribing video with Whisper...")
    transcript_text, transcript_segments = transcribe_video(video_path)

    steps.append("🧠 **Step 3:** Extracting highlights using LLaMA 4...")
    results = extract_highlights(transcript_segments, user_query)

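    video = mp.VideoFileClip(video_path)

    # Pad each segment by 5 s on both sides, clamped to the video's bounds
    segments = [
        (max(0, seg["start"] - 5), min(video.duration, seg["end"] + 5))
        for seg in results["segments"]
    ]
    if not segments:
        raise ValueError("No highlights matching the query were found in the transcript.")
    steps.append(f"✂️ **Step 4:** Extracting {len(segments)} clips from video...")

    clips = [video.subclip(start, end) for start, end in segments]
    final_clip = mp.concatenate_videoclips(clips)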

    output_path = "highlight_output.mp4"
    steps.append("🎞️ **Step 5:** Rendering final highlight video...")
    final_clip.write_videofile(output_path, codec="libx264", audio_codec="aac", verbose=False, logger=None)

    steps.append("✅ **Step 6:** Done! Here is your final highlight video.")
    return "\n\n".join(steps), output_path
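
Because every segment is padded by 5 seconds on each side, two highlights that sit close together can overlap, and the same footage would then appear twice in the final reel. A minimal sketch of one way to handle this is to pad, clamp to the video length, and merge overlapping ranges before cutting. The helper name `pad_and_merge` and its `padding` parameter are assumptions for illustration, not part of the original notebook.

```python
def pad_and_merge(segments, duration, padding=5.0):
    """Pad (start, end) pairs, clamp them to [0, duration], and merge overlaps."""
    padded = sorted(
        (max(0.0, start - padding), min(duration, end + padding))
        for start, end in segments
    )
    merged = []
    for start, end in padded:
        if end <= start:
            continue  # segment fell entirely outside the video
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous clip: extend it instead
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


print(pad_and_merge([(52, 54), (56, 58), (108, 110)], duration=112.0))
# [(47.0, 63.0), (103.0, 112.0)]
```

Here the first two highlights collapse into one 16-second clip, and the last one is cut off at the end of the video instead of running past it.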


# 🌐 Gradio UI Interface

This creates a web app where users can upload a video and enter a highlight query.
The status log and the highlight video appear once processing finishes.


In [134]:
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("# ⚽️ Momentify")
    gr.Markdown("Upload a sports video and ask for highlights (e.g., `Messi goals`, `penalties`, etc.)")

    with gr.Row():
        with gr.Column():
            video_input = gr.Video(label="📁 Upload MP4 Video")
            query_input = gr.Textbox(label="📝 Highlight Query", placeholder="e.g. Messi goals")
            submit_btn = gr.Button("🚀 Generate Highlights")
            output_status = gr.Markdown()

        with gr.Column():
            output_video = gr.Video(label="🎯 Final Highlight Video")

    submit_btn.click(
        fn=process_video,
        inputs=[video_input, query_input],
        outputs=[output_status, output_video]
    )

demo.launch()

* Running on local URL:  http://127.0.0.1:7882
Kaggle notebooks require sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

* Running on public URL: https://eac42cf54d0ce588ad.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
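
In the interface above, `output_status` only updates once `process_video` returns, so the user sees all six steps at the same moment. Gradio also accepts generator functions as event handlers and refreshes the outputs on every `yield`. The sketch below shows that pattern with the pipeline stages passed in as arguments so it can be run without a video. The function name `process_video_streaming` and the `render` callable (standing in for the clip-cutting half of `process_video`) are hypothetical.

```python
def process_video_streaming(video_path, user_query, transcribe, extract, render):
    """Yield (status_markdown, video_path_or_None) after each pipeline stage."""
    steps = []

    def status(message):
        # Append to the running log and return it with no video yet
        steps.append(message)
        return "\n\n".join(steps), None

    yield status("**Step 1:** Video uploaded. Transcribing...")
    _, segments = transcribe(video_path)

    yield status("**Step 2:** Transcript ready. Asking the LLM for highlights...")
    highlights = extract(segments, user_query)["segments"]

    yield status(f"**Step 3:** Found {len(highlights)} highlight(s). Rendering...")
    output_path = render(video_path, highlights)

    # Final update carries the finished video path
    steps.append("**Step 4:** Done.")
    yield "\n\n".join(steps), output_path
```

To wire this into the UI, one option is a thin generator wrapper that takes only `(video_file, user_query)`, delegates with `yield from` while passing `transcribe_video` and `extract_highlights`, and is then given to `submit_btn.click` as `fn`.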




Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Detected language: English
[00:00.000 --> 00:06.920]  in the foothills of greatness from the Andes to the Alps from River Plate to
[00:06.920 --> 00:14.320]  the banks of the Seine our planet unites around its ultimate game
[00:21.520 --> 00:26.000]  the final of the 22nd World Cup
[00:26.000 --> 00:28.000]  France again or Argentina
[00:28.000 --> 00:30.000]  Alvarez
[00:30.000 --> 00:32.000]  What a pop!
[00:36.000 --> 00:38.000]  Di Maria away from Conde
[00:46.000 --> 00:48.000]  Di Biele had a little flick
[00:50.000 --> 00:52.000]  a breath, a heartbeat
[00:52.000 --> 00:54.000]  and Messi!
[00:56.000 --> 00:58.000]  a goal for Evermore
[01:08.000 --> 01:10.000]  perhaps his moment for infinity
[01:10.000 --> 01:12.000]  Messi
[01:12.000 --> 01:14.000]  Delightful
[01:14.000 --> 01:16.000]  Alvarez, here's McAllister
[01:16.000 --> 01:18.000]  Di Maria is the spare man
[01:18.000 --> 01:

# Demo 

In [132]:
import whisper
import gradio as gr
import moviepy.editor as mp
import json
import re
import os
from groq import Groq
from kaggle_secrets import UserSecretsClient

# --- Load Secrets ---
user_secrets = UserSecretsClient()
groq_api_key = user_secrets.get_secret("GROQ_API_KEY")
client = Groq(api_key=groq_api_key)

# --- Load Whisper model once ---
model = whisper.load_model("medium")

# --- Transcribe the uploaded video ---
def transcribe_video(file_path):
    result = model.transcribe(file_path, verbose=True)
    return result['text'], result['segments']

# --- Extract highlight segments using Groq (LLaMA3) ---
def extract_highlights(transcript_segments, user_query):
    transcript_text = "\n".join([
        f"[{int(seg['start'])} --> {int(seg['end'])}] {seg['text']}"
        for seg in transcript_segments
    ])

    prompt = f"""
You are a JSON-only assistant.

Return only valid JSON in the following format:

{{
  "segments": [
    {{ "start": 52, "end": 58 }},
    {{ "start": 108, "end": 110 }}
  ]
}}

No markdown, no explanation, no wrapping. Just plain JSON.
User query: {user_query}
Transcript:
{transcript_text}
"""

    completion = client.chat.completions.create(
        model="llama3-70b-8192",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
        max_tokens=1024,
        top_p=1,
        stream=True,
    )

    streamed_response = ""
    for chunk in completion:
        streamed_response += chunk.choices[0].delta.content or ""

    raw_json = streamed_response.strip()

    try:
        return json.loads(raw_json)
    except json.JSONDecodeError:
        raise ValueError("❌ Groq did not return valid JSON:\n" + raw_json)

# --- Full processing pipeline ---
def process_video(video_file, user_query):
    steps = []
    steps.append("🎬 **Step 1:** Video uploaded.")

    # Gradio passes the uploaded video as a file path
    video_path = video_file
    steps.append("🔍 **Step 2:** Transcribing video with Whisper...")
    transcript_text, transcript_segments = transcribe_video(video_path)

    steps.append("🧠 **Step 3:** Extracting highlights using LLaMA 3...")
    results = extract_highlights(transcript_segments, user_query)

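    video = mp.VideoFileClip(video_path)

    # Pad each segment by 5 s on both sides, clamped to the video's bounds
    segments = [
        (max(0, seg["start"] - 5), min(video.duration, seg["end"] + 5))
        for seg in results["segments"]
    ]
    if not segments:
        raise ValueError("No highlights matching the query were found in the transcript.")
    steps.append(f"✂️ **Step 4:** Extracting {len(segments)} clips from video...")

    clips = [video.subclip(start, end) for start, end in segments]
    final_clip = mp.concatenate_videoclips(clips)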

    output_path = "highlight_output.mp4"
    steps.append("🎞️ **Step 5:** Rendering final highlight video...")
    final_clip.write_videofile(output_path, codec="libx264", audio_codec="aac", verbose=False, logger=None)

    steps.append("✅ **Step 6:** Done! Here is your final highlight video.")
    return "\n\n".join(steps), output_path

# --- Gradio Interface ---
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("# ⚽️ AI Sports Highlight Generator (Groq + Whisper + MoviePy)")
    gr.Markdown("Upload a sports video and ask for highlights (e.g., `Messi goals`, `penalties`, etc.)")

    with gr.Row():
        with gr.Column():
            video_input = gr.Video(label="📁 Upload MP4 Video")
            query_input = gr.Textbox(label="📝 Highlight Query", placeholder="e.g. Messi goals")

            submit_btn = gr.Button("🚀 Generate Highlights")
            output_status = gr.Markdown()

        with gr.Column():
            output_video = gr.Video(label="🎯 Final Highlight Video")

    submit_btn.click(
        fn=process_video,
        inputs=[video_input, query_input],
        outputs=[output_status, output_video]
    )

# --- Launch the app ---
demo.launch()






### *Programmed by Hussain Alyafei*
### *Thank you!*