# Content Metadata AI Toolkit 🚀

Welcome! This Colab notebook uses AI to generate metadata (titles, summaries, keywords, key-values, and categories) for your video or content files.

**How it works:**
1.  **Setup:** Run the initial setup cell to install necessary tools and configure your API key.
2.  **Upload:** Upload your video or content file(s).
3.  **Generate:** Run the subsequent cells to generate different types of metadata based on your uploaded content.

**Prerequisites:**
* A Google Account (to use Colab).
* A Gemini API Key. You'll need to add this as a Colab Secret.

## Step 0: One-Time Setup

This first code cell performs the necessary setup:
* Clones the `video-metadata-ai-toolkit` repository from GitHub.
* Installs the required Python libraries.
* Creates a directory for your uploads.
* Imports necessary code modules.
* **Crucially, it retrieves your Gemini API key from Colab Secrets.**
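
The key lookup in the last bullet can be tried in isolation. The sketch below is only an illustration: `load_api_key` and its `secret_getter` parameter are hypothetical names, not part of the toolkit. It accepts any callable that takes a secret name and returns its value (in Colab, `userdata.get` fits that shape), so it also runs outside Colab.

```python
import os


def load_api_key(secret_getter, name="GEMINI_API_KEY", environ=None):
    """Fetch a secret and export it as an environment variable.

    Hypothetical helper, not part of the toolkit. `secret_getter` is any
    callable that takes a secret name and returns its value.
    """
    environ = os.environ if environ is None else environ
    value = secret_getter(name)
    if not value:
        raise ValueError(f"{name} is empty or missing.")
    # Writing to the mapping directly does not print the value anywhere.
    environ[name] = value
    return value


# Stand-in for Colab Secrets, for demonstration only.
fake_secrets = {"GEMINI_API_KEY": "not-a-real-key"}
env = {}
load_api_key(fake_secrets.get, environ=env)
print("GEMINI_API_KEY" in env)
```

Passing a plain dict as `environ` keeps the demonstration from touching the real process environment.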

**➡️ Action Required:**
1.  Click the **🔑 Secrets** tab in the left sidebar.
2.  Click **+ Add a new secret**.
3.  Enter the name `GEMINI_API_KEY`.
4.  Paste your Gemini API key into the value field.
5.  Enable the **Notebook access** toggle.
6.  Close the Secrets panel.
7.  Now, run the code cell below.

In [None]:
#@title Step 0: Run Setup (Installs tools & configures API Key)

# Change directory to the main content folder
%cd /content

# Remove the repository if it already exists to ensure a fresh clone
!if [[ -d "/content/video-metadata-ai-toolkit" ]]; then echo "Removing existing toolkit directory..."; rm -rf /content/video-metadata-ai-toolkit; fi

# Clone the toolkit repository from GitHub
print("Cloning repository...")
!git clone https://github.com/google-marketing-solutions/video-metadata-ai-toolkit.git

# Install required Python packages (output is hidden for cleaner logs)
print("Installing requirements...")
!pip install -r /content/video-metadata-ai-toolkit/requirements.txt &> /dev/null

# Create a directory to store uploaded files
!mkdir -p /content/uploads

# Change directory to where the AI scripts are located
%cd /content/video-metadata-ai-toolkit/ai_metadata

# Import necessary libraries
import os
import logging
from google.colab import userdata, files
import ai_metadata_generator # Toolkit's main AI functions
import file_io             # Toolkit's file handling functions

# --- API Key Configuration ---
print("Configuring API Key...")
try:
    # Attempt to get the API key from Colab Secrets
    api_key = userdata.get('GEMINI_API_KEY')
    if not api_key:
        raise ValueError("API Key not found in Colab Secrets.")
    # Set the API key as an environment variable for the toolkit to use.
    # Assigning through os.environ keeps the key out of the cell output.
    os.environ['GEMINI_API_KEY'] = api_key
    print("✅ Gemini API Key configured successfully from Secrets.")
except ValueError as e:
    print(f"❌ Error: {e}")
    print("Please ensure you have added 'GEMINI_API_KEY' to Colab Secrets (see instructions above).")
except Exception as e:
    print(f"❌ An unexpected error occurred while accessing Secrets: {e}")

# Suppress verbose logging from libraries to keep output clean
logging.getLogger().setLevel(logging.CRITICAL)

print("\nSetup complete. You can now proceed to Step 1.")

## Step 1: Upload Content File(s)

Run the cell below to upload your video or other content files.

* Click the "Choose Files" button that appears after running the cell.
* Select one or more files from your computer.
* **Note:** If you upload multiple files, the toolkit will treat them collectively as a *single* piece of content when generating metadata in the following steps.
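
The upload cell receives a mapping of file names to raw bytes and writes each entry to disk. A minimal sketch of that save step, assuming that dict shape; `save_uploads` is a hypothetical helper and the sample data is invented:

```python
import os
import tempfile


def save_uploads(uploads, upload_dir):
    """Write each {filename: bytes} entry into upload_dir; return the paths."""
    os.makedirs(upload_dir, exist_ok=True)
    paths = []
    for filename, content in uploads.items():
        # basename() drops any directory components from the name.
        path = os.path.join(upload_dir, os.path.basename(filename))
        with open(path, "wb") as f:
            f.write(content)
        paths.append(path)
    return paths


# Invented sample data standing in for a real upload.
sample = {"clip.txt": b"hello"}
with tempfile.TemporaryDirectory() as tmp:
    saved = save_uploads(sample, tmp)
    print([os.path.basename(p) for p in saved])
```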

In [None]:
#@title Step 1: Upload Your File(s)

print("Please choose file(s) to upload...")
# Use Colab's file upload widget. It returns a dict mapping each file name to its bytes;
# the files are written to the /content/uploads directory further down.
# 'input_files_dict' will store the uploaded file data.
input_files_dict = files.upload()

# Check if any files were uploaded
if not input_files_dict:
    print("\n⚠️ No files were uploaded. Please run this cell again and select files.")
    uploaded_files = []       # Ensure both lists are empty if nothing was uploaded,
    uploaded_file_paths = []  # so later steps do not reuse paths from an earlier run
else:
    # Get the names of the uploaded files
    uploaded_files = list(input_files_dict.keys())
    print(f"\n✅ Successfully uploaded: {', '.join(uploaded_files)}")
    # Store the uploaded file paths for later use (prefixed with the upload directory)
    uploaded_file_paths = [os.path.join('/content/uploads', f) for f in uploaded_files]

    # --- Save uploaded files to the '/content/uploads' directory ---
    # This step ensures the files persist in the Colab environment for the toolkit
    upload_dir = '/content/uploads'
    os.makedirs(upload_dir, exist_ok=True)

    for filename, content in input_files_dict.items():
        filepath = os.path.join(upload_dir, filename)
        with open(filepath, 'wb') as f:
            f.write(content)
        print(f"   Saved '{filename}' to {filepath}")

    print("\nProceed to the next steps to generate metadata.")

# Optional: uncomment the next line to free the in-memory copy of the uploads,
# since the files are now saved to disk.
# del input_files_dict

## Step 2: Generate Suggested Titles

This cell uses the AI to analyze your uploaded content and suggest relevant and engaging titles.

In [None]:
#@title Step 2: Suggest Titles

# Check if files were uploaded in the previous step
if 'uploaded_file_paths' not in globals() or not uploaded_file_paths:
    print("❌ No files uploaded yet. Please run Step 1 first.")
else:
    print(f"Analyzing content ({', '.join(uploaded_files)}) for title suggestions...")
    # Create File objects for the toolkit using the saved paths
    content_files = [file_io.File(f_path) for f_path in uploaded_file_paths]

    try:
        # Call the AI function to suggest titles
        titles = ai_metadata_generator.suggest_titles(content_files)
        print("\n**Suggested Titles:**")
        if titles:
            # Print each suggested title
            print("\n".join([f"- {t}" for t in titles]))
        else:
            print("No titles were generated.")

    except Exception as e:
        print(f"\n❌ An error occurred during title generation: {e}")
        print("   Please check your API key and ensure the uploaded files are valid.")

    finally:
        # Clean up temporary resources used by the File objects
        print("\nCleaning up resources...")
        for f in content_files:
            f.cleanup()


## Step 3: Generate Content Summary

This cell generates a concise summary of your content. The summary is intended for an external audience (e.g., a video description) and avoids spoilers.

In [None]:
#@title Step 3: Summarize Content (Spoiler-Free)

# Check if files were uploaded
if 'uploaded_file_paths' not in globals() or not uploaded_file_paths:
    print("❌ No files uploaded yet. Please run Step 1 first.")
else:
    print(f"Analyzing content ({', '.join(uploaded_files)}) for summary...")
    # Create File objects for the toolkit
    content_files = [file_io.File(f_path) for f_path in uploaded_file_paths]

    try:
        # Call the AI function to generate a summary
        description = ai_metadata_generator.summarize(content_files)
        print(f"\n**Content Summary ({';'.join([os.path.basename(f.name) for f in content_files])}):**")
        if description:
            print(description)
        else:
            print("No summary was generated.")

    except Exception as e:
        print(f"\n❌ An error occurred during summary generation: {e}")
        print("   Please check your API key and ensure the uploaded files are valid.")

    finally:
        # Clean up temporary resources
        print("\nCleaning up resources...")
        for f in content_files:
            f.cleanup()

## Step 4: Generate Keywords

This cell suggests relevant keywords based on the content.

* **Optional:** You can provide a comma-separated list of `allowed_keywords` below. If you do, the AI will only suggest keywords from that specific list.
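
Turning the comma-separated input into a clean list is plain string handling. A small sketch, where `parse_csv_list` is a hypothetical name (the cell below does the equivalent inline):

```python
def parse_csv_list(raw):
    """Split a comma-separated string into trimmed, non-empty items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


print(parse_csv_list(" sports, news ,, travel "))  # ['sports', 'news', 'travel']
print(parse_csv_list(""))                          # []
```

Empty entries, such as the one produced by a doubled comma, are dropped, so a blank input yields an empty list and leaves the keywords unrestricted.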

In [None]:
#@title Step 4: Generate Keywords

# Check if files were uploaded
if 'uploaded_file_paths' not in globals() or not uploaded_file_paths:
    print("❌ No files uploaded yet. Please run Step 1 first.")
else:
    # --- User Input for Allowed Keywords --- 
    #@markdown Enter a comma-separated list of allowed keywords (optional). Leave blank to allow any keywords.
    allowed_keywords_input = "" #@param {type:"string"}
    # Split the input string into a list, removing whitespace
    allowed_keywords_list = [kw.strip() for kw in allowed_keywords_input.split(",") if kw.strip()]

    if allowed_keywords_list:
        print(f"Generating keywords restricted to: {', '.join(allowed_keywords_list)}...")
    else:
        print(f"Generating keywords based on content ({', '.join(uploaded_files)})...")

    # Create File objects for the toolkit
    content_files = [file_io.File(f_path) for f_path in uploaded_file_paths]

    try:
        # Call the AI function to generate keywords (metadata)
        keywords = ai_metadata_generator.generate_metadata(content_files, allowed_keywords_list)
        print(f"\n**Suggested Keywords ({';'.join([os.path.basename(f.name) for f in content_files])}):**")
        if keywords:
            # Print each suggested keyword
            print("\n".join([f"- {kw}" for kw in keywords]))
        else:
            print("No keywords were generated.")

    except Exception as e:
        print(f"\n❌ An error occurred during keyword generation: {e}")
        print("   Please check your API key and ensure the uploaded files are valid.")

    finally:
        # Clean up temporary resources
        print("\nCleaning up resources...")
        for f in content_files:
            f.cleanup()

## Step 5: Generate Key-Values

This cell generates specific values associated with a key you define (e.g., Key: `mood`, Values: `happy`, `sad`).

* **Required:** Enter the `key` you want to find values for (e.g., `mood`, `genre`, `topic`).
* **Optional:** You can provide a comma-separated list of `allowed_values`. If you do, the AI will only suggest values from that specific list for the given key.
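
The cell below looks up the generated values by key, which suggests the result behaves like a mapping from each key to a list of values. The sketch here illustrates only that shape, using invented data; `KeyValueSpec` and `render_values` are hypothetical stand-ins, not the toolkit's own classes or functions.

```python
from dataclasses import dataclass, field


@dataclass
class KeyValueSpec:
    """Hypothetical stand-in: a key plus an optional list of allowed values."""
    key: str
    allowed_values: list = field(default_factory=list)


def render_values(result, key):
    """Format the values found for `key` as bullet lines, or a fallback message."""
    values = result.get(key) or []
    if not values:
        return [f"No values were generated for the key '{key}'."]
    return [f"- {v}" for v in values]


spec = KeyValueSpec("mood", ["happy", "sad"])
# Invented result, shaped as {key: [values]}.
fake_result = {"mood": ["happy"]}
print("\n".join(render_values(fake_result, spec.key)))
```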

In [None]:
#@title Step 5: Generate Key-Values

# Check if files were uploaded
if 'uploaded_file_paths' not in globals() or not uploaded_file_paths:
    print("❌ No files uploaded yet. Please run Step 1 first.")
else:
    # --- User Input for Key and Allowed Values ---
    #@markdown Enter the 'key' you want to generate values for (e.g., mood, genre, topic).
    key_input = "mood" #@param {type:"string"}
    #@markdown Enter a comma-separated list of allowed values for the key (optional).
    allowed_values_input = "" #@param {type:"string"}
    # Split the input string into a list, removing whitespace
    allowed_values_list = [val.strip() for val in allowed_values_input.split(",") if val.strip()]

    if not key_input:
        print("❌ Please provide a 'key' to generate values for.")
    else:
        print(f"Generating values for key '{key_input}'...")
        if allowed_values_list:
            print(f"   Restricted to values: {', '.join(allowed_values_list)}")

        # Create File objects for the toolkit
        content_files = [file_io.File(f_path) for f_path in uploaded_file_paths]

        try:
            # Define the key-value structure for the AI function
            key_value_obj = ai_metadata_generator.KeyValue(key_input, allowed_values_list)
            # Call the AI function to generate key-values
            key_values_result = ai_metadata_generator.generate_key_values(content_files, [key_value_obj])

            print(f"\n**Suggested Values for Key=`{key_input}` ({';'.join([os.path.basename(f.name) for f in content_files])}):**")
            # Check if the key exists in the result and has values
            if key_input in key_values_result and key_values_result[key_input]:
                # Print each suggested value for the key
                print("\n".join([f"- {val}" for val in key_values_result[key_input]]))
            else:
                print(f"No values were generated for the key '{key_input}'.")

        except Exception as e:
            print(f"\n❌ An error occurred during key-value generation: {e}")
            print("   Please check your API key and ensure the uploaded files are valid.")

        finally:
            # Clean up temporary resources
            print("\nCleaning up resources...")
            for f in content_files:
                f.cleanup()

## Step 6: Generate IAB Content Categories

This cell identifies relevant IAB Tech Lab taxonomy categories for your content, drawing on both the Content and Audience taxonomies. This is useful for content classification and advertising purposes.
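
The results are printed as an aligned text table. A self-contained sketch of that formatting follows; the `Category` class mirrors the three attributes the cell below reads (`taxonomy_name`, `unique_id`, `name`) but is a hypothetical stand-in, and the rows are placeholders rather than real IAB entries.

```python
from dataclasses import dataclass


@dataclass
class Category:
    """Hypothetical stand-in for a category returned by the toolkit."""
    taxonomy_name: str
    unique_id: str
    name: str


def format_table(categories):
    """Return table lines with the taxonomy column padded to a common width."""
    width = max([len("Taxonomy")] + [len(c.taxonomy_name) for c in categories])
    header = f"{'Taxonomy'.ljust(width)}  {'ID':>7}  Category"
    lines = [header, "-" * len(header)]
    for c in categories:
        lines.append(f"{c.taxonomy_name.ljust(width)}  {c.unique_id:>7}  {c.name}")
    return lines


# Placeholder rows, not real taxonomy data.
rows = [
    Category("Content", "0001", "Example category A"),
    Category("Audience", "0002", "Example category B"),
]
print("\n".join(format_table(rows)))
```

Seeding the width calculation with the header length keeps the header aligned even when every taxonomy name is shorter than the word "Taxonomy".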

In [None]:
#@title Step 6: Generate IAB Categories

# Check if files were uploaded
if 'uploaded_file_paths' not in globals() or not uploaded_file_paths:
    print("❌ No files uploaded yet. Please run Step 1 first.")
else:
    print(f"Analyzing content ({', '.join(uploaded_files)}) for IAB categories...")
    # Create File objects for the toolkit
    content_files = [file_io.File(f_path) for f_path in uploaded_file_paths]

    try:
        # Call the AI function to generate IAB categories
        iab_categories = ai_metadata_generator.generate_iab_categories(content_files)
        print(f"\n**Suggested IAB Categories ({';'.join([os.path.basename(f.name) for f in content_files])}):**")

        if iab_categories:
            # Format the output as a table for better readability
            # Width of the taxonomy column: the longest taxonomy name, or the header if longer
            max_taxonomy_len = max(
                [len('Taxonomy')] + [len(cat.taxonomy_name) for cat in iab_categories]
            )

            # Print table header
            header = f"{'Taxonomy'.ljust(max_taxonomy_len)}  {'ID':>7}  {'Category'}"
            print(header)
            print('-' * len(header))

            # Print each category row
            for category in iab_categories:
                print(
                    f"{category.taxonomy_name.ljust(max_taxonomy_len)} "
                    f" {category.unique_id:>7} "
                    f" {category.name}"
                )
        else:
            print("No IAB categories were generated.")

    except Exception as e:
        print(f"\n❌ An error occurred during IAB category generation: {e}")
        print("   Please check your API key and ensure the uploaded files are valid.")

    finally:
        # Clean up temporary resources
        print("\nCleaning up resources...")
        for f in content_files:
            f.cleanup()

## Finished!

You have successfully generated metadata for your content. You can run the steps again with different files or parameters.

*(Optional)* You can run the cell below to remove the files you uploaded from the `/content/uploads` directory if you no longer need them in this Colab session.

In [None]:
#@title (Optional) Clean Up Uploaded Files
import shutil

upload_dir = '/content/uploads'

if os.path.exists(upload_dir):
    try:
        shutil.rmtree(upload_dir)
        print(f"✅ Removed directory and all contents: {upload_dir}")
        # Optional: Also remove the list of uploaded file paths if it exists
        if 'uploaded_file_paths' in globals():
            del globals()['uploaded_file_paths']
        if 'uploaded_files' in globals():
            del globals()['uploaded_files']
    except Exception as e:
        print(f"❌ Error removing directory {upload_dir}: {e}")
else:
    print(f"ℹ️ Directory {upload_dir} does not exist, no cleanup needed.")