# 🖼️ Meme Generator Lab: Student Lab
## Getting Started
Ref repository: https://github.com/IFML-UT/MLLAcademy-2025

**What we're going to do in this notebook:**

> This notebook simulates student interactions with the meme generator pipeline. Use it to validate that the full captioning stack is working before you begin creating a meme of your own.

## 🗺️ Roadmap for the entire lab:
> The lab is broken down into 3 major sections:

1. **Working with LLMs & Inference**: You'll use an open source large language model (LLM) to generate meme captions based on general topics or themes, then select the best caption from 3 results. We'll be using Meta's `Llama` model family for this text generation.


2. **Multi-modal Generative AI:** With your caption from the first part of the lab, you will use OpenCLIP to query the top 3 matches from a library of popular meme images to select the best image, based on your text caption. You'll understand what multi-modal means, and how a pre-trained vision transformer model like `ViT-B-32` can return relevant images based on text inputs.

3. **Combine the generated text and the best image into your finished AI meme.**
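
To make the idea of text-to-image matching in step 2 concrete, here is a minimal sketch of ranking images by cosine similarity to a caption. The vectors and file names are made up for illustration; in the lab, OpenCLIP produces the real embeddings, which have many more dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_matches(text_vec, image_vecs, k=3):
    """Return the k image names whose vectors are most similar to the text vector."""
    ranked = sorted(
        image_vecs,
        key=lambda name: cosine(text_vec, image_vecs[name]),
        reverse=True,
    )
    return ranked[:k]

# Toy 3-dimensional "embeddings" (hypothetical values, not real model output).
caption_vec = [0.9, 0.1, 0.0]
image_vecs = {
    "tired_cat.jpg": [0.8, 0.2, 0.1],
    "victory_kid.jpg": [0.1, 0.9, 0.2],
    "coffee_mug.jpg": [0.7, 0.0, 0.3],
    "beach_day.jpg": [0.0, 0.2, 0.9],
}
best_three = top_matches(caption_vec, image_vecs)
print(best_three)
```

The key point is that text and images are mapped into the same vector space, so "closest image to this caption" becomes a simple similarity ranking.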


### ⚙️ How It Works:
- Takes a freeform meme idea or phrase as input
- Classifies it into a pre-approved topic
- Uses Llama 3.1 8B Instruct (via Hugging Face) to generate 3 clean meme captions
- Filters out profanity and off-topic content
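
The final filtering step can be sketched roughly as follows. This is an illustration only: `BANNED_WORDS`, `clean_caption`, `is_clean`, and `filter_captions` are hypothetical names, and the lab's own `safe_caption_generator` may filter differently.

```python
import re

# Hypothetical blocklist for illustration; the lab's real list lives in its utilities.
BANNED_WORDS = {"darn", "heck"}

def clean_caption(raw):
    """Strip whitespace and one pair of surrounding quotes from model output."""
    text = raw.strip()
    if len(text) >= 2 and text[0] in "\"'“" and text[-1] in "\"'”":
        text = text[1:-1].strip()
    return text

def is_clean(caption):
    """Return True when no banned word appears as a whole word."""
    words = set(re.findall(r"[a-z']+", caption.lower()))
    return words.isdisjoint(BANNED_WORDS)

def filter_captions(raw_captions, limit=3):
    """Clean each candidate, drop empty or flagged ones, and keep at most `limit`."""
    kept = []
    for raw in raw_captions:
        caption = clean_caption(raw)
        if caption and is_clean(caption):
            kept.append(caption)
    return kept[:limit]

candidates = [
    '"Me at 3 a.m. before finals"',
    "  'Well, heck.'  ",
    '"Group project = solo project"',
    "",
]
filtered = filter_captions(candidates)
print(filtered)
```

Matching on whole words rather than substrings avoids flagging harmless words that merely contain a banned string.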




In [None]:
# Setup -- Run this cell first to download the lab repository.
# This notebook is designed to run in a Jupyter environment, such as Google Colab or a local Jupyter setup.
# Dependencies are installed, and the functions for generating safe meme captions are imported, in the next code cell.

# --- this cell will create a folder called MLLAcademy-2025 in your current directory
!git clone https://github.com/IFML-UT/MLLAcademy-2025.git

### Load Requirements & Configure Hugging Face Inference API Token:
To keep this lab computationally light and flexible, we are using a Hugging Face Inference API token (generated by IFML) for your use during this week.
- API stands for application programming interface. Once configured, an API lets two different software applications communicate and send data to one another.
- This secret token will expire after this week. If you would like to run this lab later on your own, create a free Hugging Face account, create a token within the free tier (https://huggingface.co/settings/tokens), and paste your new token into the `getpass` prompt in the cell below.

> Paste your Hugging Face API token (provided to you) in the cell below when prompted. If you don't have one because you are trying this lab outside of our scheduled session, no worries! Visit https://huggingface.co/settings/tokens to create a free account, create a token of `type = READ`, and then copy your access token.
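
As a rough illustration of how a saved token is used, the sketch below reads a token file, runs a light sanity check, and builds the `Authorization` header that carries the token as a bearer credential. `load_hf_token` and `auth_headers` are hypothetical helpers, not part of the lab's code, and the token shown is a placeholder.

```python
import tempfile
from pathlib import Path

def load_hf_token(path):
    """Read a saved token and run a light sanity check on it."""
    token = Path(path).read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"{path} is empty; paste a token and try again.")
    # Hugging Face user access tokens normally begin with "hf_".
    if not token.startswith("hf_"):
        print("Warning: token does not start with 'hf_'; check that you copied all of it.")
    return token

def auth_headers(token):
    """Build the HTTP header that sends the token as a bearer credential."""
    return {"Authorization": f"Bearer {token}"}

# Demo with a placeholder token in a temporary folder (never print a real token).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "hf_token.txt"
    demo_path.write_text("hf_exampleplaceholder\n", encoding="utf-8")
    headers = auth_headers(load_hf_token(demo_path))
    print(sorted(headers))
```

Treat the token like a password: keep it in a file that is excluded from version control, and avoid echoing it in notebook output.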

In [1]:
# This script auto-detects your environment (Colab or local) and configures everything accordingly.

import os
import sys
import re
import json
from pathlib import Path
from getpass import getpass

# --- Detect Environment ---
def get_runtime_env():
    try:
        import google.colab
        return "colab"
    except ImportError:
        return "local"

env = get_runtime_env()
print(f"Detected environment: {env}")

# --- Install Dependencies ---
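if env == "colab":
    %pip install -r MLLAcademy-2025/requirements.txt
else:
    # Locally, this notebook sits one folder below the repository root (like ../utils and ../hf_token.txt).
    %pip install -r ../requirements.txt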

# --- Hugging Face Token Management ---
token_path = Path("/content/hf_token.txt") if env == "colab" else Path("../hf_token.txt")

if not token_path.exists():
    print("Please enter your Hugging Face API token:")
    token = getpass("Hugging Face Token: ")
    with open(token_path, "w") as f:
        f.write(token.strip())
    print(f"✅ Hugging Face token saved to {token_path}")
else:
    print(f"✅ Hugging Face token found at {token_path}")

# --- Ensure utils folder is in sys.path ---
sys.path.append(str(Path("/content/MLLAcademy-2025/utils").resolve()) if env == "colab" else str(Path("../utils").resolve()))

# --- Import the Safe Caption Generator ---
from safe_caption_generator import safe_caption_generator
print("\nSafe Caption Generator module imported successfully, ready to use!")


Detected environment: local
[notice] A new release of pip is available: 24.3.1 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'
Note: you may need to restart the kernel to use updated packages.
✅ Hugging Face token found at ../hf_token.txt
🧭 Detect

In [2]:
# Helper function: prints the captions cleanly and exports them to a JSON file.
# This file will be used later in the lab when we match captions to images.

def print_captions(captions):
    env = get_runtime_env()
    captions_path = Path("/content/MLLAcademy-2025/captions.json") if env == "colab" else Path("../captions.json")

    with open(captions_path, "w") as f:
        json.dump(captions, f)

    print(f"✅ Captions saved to {captions_path}")
    print("\n---\n\n")
    for i, c in enumerate(captions, 1):
        print(f"Caption {i}: {c}\n")

## Type in a prompt between the quotes 
This will assign your topic to the variable `user_input`
 - Running the cell below will then run the caption generator and print the captions
 
Additionally, we are going to use a Python function called `safe_caption_generator` to assist us in prompting the LLM. For example, the function contains this template, which is filled in with your input and sent to the LLM before it generates text:

```
PROMPT_TEMPLATE = (
    "Write a short, funny meme caption about this topic: {user_input}.\n"
    "Only return a single caption, in quotes, with no explanation or extra text."
)
```
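
To see what the model actually receives, here is a minimal sketch of filling in that template with Python's `str.format`. `build_prompt` is a hypothetical helper for illustration; the real function may assemble its prompt differently.

```python
PROMPT_TEMPLATE = (
    "Write a short, funny meme caption about this topic: {user_input}.\n"
    "Only return a single caption, in quotes, with no explanation or extra text."
)

def build_prompt(user_input):
    """Insert the student's topic into the template (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(user_input=user_input.strip())

prompt = build_prompt("  Monday mornings ")
print(prompt)
```

The second sentence of the template is what keeps the model from adding explanations, which makes the returned caption easier to clean up afterwards.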

### We are going to specifically guide our text generation to stay aligned on certain topics.
You may find that certain topics are blocked. If you run into a "try again" error message, please adjust your input. Here are the broad topics we are going to use within this lab for your captions:
- "final exams"
- "group projects"
- "studying late"
- "Monday mornings"
- "school cafeteria food"
- "summer break"
- "forgetting your homework"
- "getting a pop quiz"
- "trying to stay awake in class"
- "sports"
- "coding projects"
- "hackathons"
- "hanging out with friends"
- "summer weather"
- "family vacations"
- "college applications"
- "video games"

_You don't have to use these exact words in your `user_input`, but it needs to be semantically similar. For example, "Going to a baseball game instead of studying" would match our themes of both `sports` and `forgetting your homework`, and possibly even `studying late`._
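
One crude way to picture topic matching is a keyword lookup like the sketch below. This is only a stand-in under stated assumptions: `TOPIC_HINTS` and `match_topics` are hypothetical, the hint words are invented, and the lab's actual classifier may rely on a language model or embeddings rather than keywords.

```python
import re

# Hypothetical keyword hints for a few of the approved topics.
TOPIC_HINTS = {
    "sports": {"baseball", "soccer", "basketball", "game", "team"},
    "studying late": {"studying", "study", "midnight", "cram"},
    "forgetting your homework": {"homework", "forgot", "assignment"},
    "video games": {"console", "gaming", "controller"},
}

def match_topics(user_input):
    """Return approved topics whose hint words appear in the input, best match first."""
    words = set(re.findall(r"[a-z]+", user_input.lower()))
    scores = {topic: len(words & hints) for topic, hints in TOPIC_HINTS.items()}
    hits = [topic for topic, score in scores.items() if score > 0]
    return sorted(hits, key=lambda topic: (-scores[topic], topic))

matched = match_topics("Going to a baseball game instead of studying")
unmatched = match_topics("quantum tax law")
print(matched)
print(unmatched)
```

A keyword lookup misses paraphrases that a semantic matcher would catch, which is why an input only needs to be similar in meaning to an approved topic, not identical in wording.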

 > Note: This cell may take anywhere from 30 seconds to 2 minutes depending on your prompt and notebook compute resources at the time of execution.

In [1]:
# --- Now we are going to run the safe caption generator based on your input ---
# In this cell, we'll test our `safe_caption_generator` function with a sample input. It will:
#   - Use your input prompt.
#   - Check if the input matches approved topics.
#   - Generate 3 captions using a language model.
#   - Save the captions to a JSON file for use in later cells.

try:
    # modify this input to test different prompts 
    user_input = "studying all night and finally passing the exam"
    print(f"Testing prompt: '{user_input}'")
    
    # Generate and save 3 meme captions and print each
    captions = safe_caption_generator(user_input, num_captions=3)
    print_captions(captions)

except ValueError as e:
    print(f"⚠️ Error: {e}")

Testing prompt: 'studying all night and finally passing the exam'


NameError: name 'safe_caption_generator' is not defined

### Try it out!
Use the box below to enter your meme idea, click "Generate," and see three captions!

Each generation is saved in `captions.json` for use in the next part of the lab, and each new generation overwrites that file. If you want to keep a specific caption, save it in a new file within your directory. You'll have a chance to select your favorite caption in the next part of the lab.
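
Because `captions.json` is overwritten on every run, you may want to copy a caption you like into its own file. The sketch below assumes `captions.json` holds a plain list of caption strings; `save_favorite` and the file name `my_favorite.json` are hypothetical.

```python
import json
import tempfile
from pathlib import Path

def save_favorite(captions_path, favorite_path, choice):
    """Copy caption number `choice` (1-based) into its own JSON file."""
    captions = json.loads(Path(captions_path).read_text(encoding="utf-8"))
    if not 1 <= choice <= len(captions):
        raise ValueError(f"Pick a number from 1 to {len(captions)}.")
    favorite = captions[choice - 1]
    Path(favorite_path).write_text(json.dumps({"caption": favorite}), encoding="utf-8")
    return favorite

# Demo in a temporary folder so your real captions.json is left untouched.
with tempfile.TemporaryDirectory() as tmp:
    captions_file = Path(tmp) / "captions.json"
    captions_file.write_text(
        json.dumps(["Caption A", "Caption B", "Caption C"]), encoding="utf-8"
    )
    picked = save_favorite(captions_file, Path(tmp) / "my_favorite.json", choice=2)
    print(picked)
```

In the notebook you would point `captions_path` at the path printed by `print_captions` instead of a temporary file.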

In [None]:
# Interactive Prompt (for Demo in class)
from IPython.display import display
import ipywidgets as widgets

input_box = widgets.Text(value='', placeholder='Enter your meme idea...', description='Prompt:')
run_button = widgets.Button(description="Generate")
output = widgets.Output()

def run_on_click(b):
    output.clear_output()
    with output:
        try:
            captions = safe_caption_generator(input_box.value)
            # for idx, c in enumerate(captions, 1): # backup code to print each caption rather than use function
            #   print(f"{idx}. {c}")
            print_captions(captions)
        except Exception as e:
            print(f"⚠️ Error: {e}")

run_button.on_click(run_on_click)
display(input_box, run_button, output)

Text(value='', description='Prompt:', placeholder='Enter your meme idea...')

Button(description='Generate', style=ButtonStyle())

Output()

## Troubleshooting Guide

- If you get a profanity or topic error, verify the input is:
  - Clean (no banned phrases)
  - Topically close to: studying, group projects, sports, coding, school, etc.

- If you get an API error:
  - Ensure `hf_token.txt` exists and contains a valid Hugging Face token; if the token is missing, please ask for a new token.
  - Ensure `.gitignore` excludes it from version control

- If you get no captions back:
  - Check output formatting with `print(repr(captions))`
  - Rerun the cell; model output can vary from run to run

---
✅ Instructor notebook complete. Move on to Notebook A when you're ready.
