# Video Transcriber - Kaggle Notebook

This notebook runs the Video Transcriber application and its web interface on Kaggle.

## Features
- Transcribe videos using OpenAI's Whisper speech recognition model
- Process batches of videos
- Download videos from Instagram
- Process transcripts with Groq AI
- Save transcriptions to Notion

Let's get started!

## Step 1: Install Dependencies

Install the required packages and confirm FFmpeg is available.

In [None]:
# Check for FFmpeg (should be pre-installed on Kaggle)
!ffmpeg -version

# Install Python packages
!pip install torch openai-whisper requests gradio instaloader browser_cookie3
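
If you prefer a check from Python rather than reading the shell output, the following minimal sketch looks up the `ffmpeg` executable on the `PATH`. The helper name `find_tool` is hypothetical and is not part of the application.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


ffmpeg_path = find_tool("ffmpeg")
if ffmpeg_path is None:
    print("FFmpeg not found; Whisper needs it to decode audio.")
else:
    print(f"FFmpeg found at: {ffmpeg_path}")
```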

## Step 2: Create the Working Directory

Create the directory that will hold the application files and switch into it.

In [None]:
# Create working directory
!mkdir -p /kaggle/working/video-transcriber
%cd /kaggle/working/video-transcriber

## Step 3: Upload Application Files

You need to upload the application files to this notebook. Required files:

- main.py
- web_ui.py
- notion_integration.py
- groq_integration.py
- instaloader_integration.py

You can either:
1. Upload files using the "Add Data" button in Kaggle
2. Copy and paste the code into cells and save each cell as a file (for example with the `%%writefile` cell magic)
3. Clone from a GitHub repository

In [None]:
# Option 3: Clone from GitHub if available
# (uncomment the next line; the trailing "." clones into the current directory)
# !git clone https://github.com/haroonalhadisk/video-transcriber-app.git .

# Options 1 and 2: Reminder to upload or create the files manually
print("Please upload the application files using Kaggle's 'Add Data' feature.")
print("Or create the files directly in cells below.")
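
Whichever option you choose, it can help to confirm that every required file is present before launching. This is a minimal sketch, assuming the files from the "Required files" list sit in the current working directory; `missing_files` is a hypothetical helper introduced here for illustration.

```python
from pathlib import Path

# File names taken from the "Required files" list above.
REQUIRED_FILES = [
    "main.py",
    "web_ui.py",
    "notion_integration.py",
    "groq_integration.py",
    "instaloader_integration.py",
]


def missing_files(directory, required=REQUIRED_FILES):
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]


missing = missing_files(".")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All application files are present.")
```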

## Step 4: Create Application Settings Directory

Create the settings directory so that the application's settings persist for the duration of the Kaggle session.

In [None]:
# Create the settings directory
!mkdir -p ~/.videotranscriber

## Step 5: Run the Web Interface

Launch the application in web mode.

In [None]:
!python main.py --web

## Additional Tips

### Working with Kaggle Datasets

You can access files from Kaggle datasets directly.

In [None]:
# List files in the input directory (if you've added a dataset)
!ls -la /kaggle/input/

# Example of accessing a specific dataset
# !ls -la /kaggle/input/your-dataset-name/
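
If you uploaded the application files as a dataset, they can be copied from `/kaggle/input/` into the working directory. This is a minimal sketch; the dataset name `your-dataset-name` is a placeholder and `copy_py_files` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def copy_py_files(src_dir, dst_dir):
    """Copy every .py file from src_dir into dst_dir and return the copied names."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob("*.py")):
        shutil.copy2(path, dst / path.name)
        copied.append(path.name)
    return copied


# Placeholder dataset name; replace it with the dataset you added.
dataset_dir = Path("/kaggle/input/your-dataset-name")
if dataset_dir.is_dir():
    print(copy_py_files(dataset_dir, "/kaggle/working/video-transcriber"))
else:
    print(f"Dataset directory not found: {dataset_dir}")
```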

### Save Results to Output Directory

Kaggle preserves files written to the `/kaggle/working` directory as notebook outputs when you save a version of the notebook.

In [None]:
# Create a dedicated output directory
!mkdir -p /kaggle/working/transcriptions
print("Use this directory in the web interface for outputs: /kaggle/working/transcriptions")
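
To download many transcripts at once, the output directory can be bundled into a single archive. This is a minimal sketch using the standard library; the archive name and the helper `zip_directory` are arbitrary choices made here, not something the application produces.

```python
import shutil
from pathlib import Path


def zip_directory(src_dir, archive_base):
    """Zip the contents of src_dir and return the path of the created .zip file."""
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(src_dir))


out_dir = Path("/kaggle/working/transcriptions")
if out_dir.is_dir():
    archive = zip_directory(out_dir, "/kaggle/working/transcriptions")
    print("Archive written to:", archive)
else:
    print(f"Output directory not found: {out_dir}")
```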

### Performance Tips

Kaggle provides GPU acceleration, which can speed up Whisper transcription. Enable a GPU accelerator in the notebook settings before running the check below.

In [None]:
# Check if GPU is available
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU device: {torch.cuda.get_device_name(0)}")
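
The result of the GPU check can drive which Whisper model to load. This is a minimal sketch; the size choices (`medium` on GPU, `base` on CPU) are illustrative assumptions rather than settings taken from the application, and `pick_whisper_config` is a hypothetical helper.

```python
def pick_whisper_config(cuda_available):
    """Return (device, model_name) for Whisper based on GPU availability.

    The model sizes are illustrative assumptions; tune them for your workload.
    """
    if cuda_available:
        return "cuda", "medium"
    return "cpu", "base"


try:
    import torch
    cuda_available = torch.cuda.is_available()
except ImportError:
    # torch is not installed; fall back to CPU settings.
    cuda_available = False

device, model_name = pick_whisper_config(cuda_available)
print(f"Device: {device}, model: {model_name}")

# With openai-whisper installed, the model could then be loaded with:
# import whisper
# model = whisper.load_model(model_name, device=device)
```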

## Troubleshooting

If you encounter issues:

1. Check that all application files are correctly uploaded or created
2. Verify paths in the web interface match Kaggle's directory structure
3. For Instagram issues, try username/password login instead of browser cookies
4. Be mindful of Kaggle's session time limits for long transcription jobs
5. Use a smaller Whisper model (for example `tiny` or `base`) or smaller batches for large workloads