# Module 1: Meme Generation

- Note: Desired output: a meme template, the meme text, and an optional caption. The caption is plain text generation, handled by the simple text generation module.

### Sub-Modules

0. Prerequisite: Create a meme templates table with 5-6 feature columns that mirror our user inputs (location, audience, industry, etc.) plus one keywords column for each template.
    - 0.1: Get meme names from the API.
    - 0.2: Create a prompt-based AI call function that generates the chosen features for every meme template (e.g. Location, Audience, Keywords).
    - 0.3: Create a dataset of all meme templates and their respective features.

1. Get input: free-form and structured (both come from the user-facing frontend).

2. Prepare input: extract keywords from the free-form text; structured input needs no preprocessing.

3. Perform template matching by scoring each column separately for every meme template (a simple algorithm is enough). For every meme template in our dataset:
    - 3.1: Match structured inputs first, one column at a time. A utility function compares the strings case-insensitively and returns 1 (match) or 0 (no match) for each column.
    - 3.2: Match the extracted keywords against the keywords column in the dataset and return the fraction that matched (e.g. 2 of 5 input keywords found gives 0.4). One utility function is needed for this; it returns a value between 0 and 1.
    - 3.3: A final scoring function that returns the weighted sum of the column scores: $\text{score} = c_1 w_1 + c_2 w_2 + \dots + c_n w_n$, where $c_i$ is the match score for column $i$ and $w_i$ is its weight.
    - 3.4: Store the scores in a score table (created for every user query). Sort the table in descending order and return the top 5 templates by score.

4. Generate the meme text for the template using the ChatGPT API. One utility function receives [template name, user input, static prompt (pre-set for every generation)] and returns __formatted__ meme text (it must match the format of that meme template and return only one example).

5. Create a function to call the meme-creation API ([link](https://rapidapi.com/meme-generator-api-meme-generator-api-default/api/meme-generator)): pass the template name and meme text, receive the generated image (ensure a consistent format: png, jpg, jpeg, etc.). Store it temporarily while displaying it to the user and, if approved, save it to the user's DB entry.

- Note: Steps 4 and 5 are repeated for each of the 5 highest-scoring meme templates.

6. Offer caption creation as an option. Create a utility function that uses ChatGPT to generate the caption from the user input and a static prompt (a separate static prompt is needed for this function).
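
The utility functions described in steps 3.1 to 3.3 can be sketched as follows. This is a minimal illustration rather than the notebook's implementation: the function names `match_value`, `keyword_overlap`, and `weighted_score` are hypothetical, and case-insensitive comparison via `casefold()` is one assumed way to "take care of capitalization".

```python
def match_value(user_value, template_value):
    """Step 3.1: case-insensitive string match, returned as 1 or 0."""
    return int(user_value.strip().casefold() == template_value.strip().casefold())


def keyword_overlap(user_keywords, template_keywords):
    """Step 3.2: fraction of user keywords found among the template keywords."""
    if not user_keywords:
        return 0.0
    template_set = {k.strip().casefold() for k in template_keywords}
    matched = sum(1 for k in user_keywords if k.strip().casefold() in template_set)
    return matched / len(user_keywords)


def weighted_score(column_scores, weights):
    """Step 3.3: weighted sum of the per-column match scores."""
    return sum(column_scores[name] * weight for name, weight in weights.items())


# Two of five user keywords appear in the template's keyword list
overlap = keyword_overlap(
    ["Humor", "sarcasm", "cats", "dogs", "tech"],
    ["humor", "Sarcasm", "irony"],
)
score = weighted_score(
    {"Industry": match_value("tech & software", "Tech & Software"), "Keywords": overlap},
    {"Industry": 0.25, "Keywords": 0.75},  # illustrative weights only
)
```

Keeping each step as its own function makes it easy to swap in a different matching rule for a single column without touching the scoring.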

---

### Sub-Module 0

**Step 1:** Getting meme template names from the API that we're using ([Meme-Generator-Rapid-API](https://rapidapi.com/meme-generator-api-meme-generator-api-default/api/meme-generator))

In [1]:
from openai import OpenAI
import json
import time
import pandas as pd

In [None]:
import requests

url = "https://ronreiter-meme-generator.p.rapidapi.com/images"

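headers = {
    "x-rapidapi-key": "",  # RapidAPI key (left blank here)
    "x-rapidapi-host": "ronreiter-meme-generator.p.rapidapi.com"
}

response = requests.get(url, headers=headers)
response.raise_for_status()  # fail early on a bad key or HTTP error

memes = response.json()  # list of template names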

In [61]:
print(memes)



In [63]:
memes.remove('poster')

In [64]:
len(memes)

999

__Utility Func 0.1:__ ChatGPT Call for Feature Generation for Meme Template Dataset:

In [None]:
key = ''
client = OpenAI(api_key=key)

In [69]:
OPTION_SETS = {
    "Primary Audience": ["Kids", "Teens", "Young Adults", "Millennials", "Adults", "Older Adults", "General Audience (All)"],
    "Secondary Audience": ["Teens", "Young Adults", "Millennials", "Adults", "Older Adults", "General Audience (All)", "Professionals", "Tech Enthusiasts", "Gamers", "Pop Culture Fans"],
    "Industry": ["eCommerce", "Fashion & Beauty", "Food & Beverage", "Tech & Software", "Finance", "Healthcare", "Education", "Entertainment", "Automotive", "Travel", "B2B", "General"],
    "Product/Service": ["Physical Products", "Luxury Goods", "Subscription Services", "Food & Beverages", "Events & Experiences", "Tech Products", "Education & Courses", "B2B Services", "Brand Awareness"],
    "Best Platforms": ["Instagram", "Facebook", "Twitter (X)", "General"],
    "Humor Style": ["Relatable", "Sarcastic", "Wholesome", "Dark Humor", "Satirical", "Self-Deprecating", "Absurd", "Workplace", "Pop Culture", "Nostalgic", "Reaction-Based"],
    "Emotion Targeted": ["FOMO", "Excitement", "Regret", "Confidence", "Shock", "Confusion", "Happiness", "Frustration", "Empowerment", "Curiosity"],
    "Engagement Type": ["Tag a Friend", "Shareable", "Call to Action", "Comment Bait", "Self-Identification", "Educational", "Controversial", "Nostalgic", "Entertainment"],
    "Seasonality": ["Evergreen", "Summer", "Winter", "Black Friday", "Christmas", "Halloween", "Back to School", "Valentine’s Day", "April Fools", "Other Holidays"],
    "Tone Alignment": ["Playful", "Premium Yet Fun", "Professional", "Luxury", "Edgy", "Trendy", "Nostalgic", "Corporate", "General"]
}

In [None]:
def generate_meme_features_batch(meme_templates):
    json_schema = {
        "name": "meme_features_response",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "memes": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "required": list(OPTION_SETS.keys()) + ["Meme Name", "Keywords"],
                        "properties": {
                            "Meme Name": {"type": "string"},
                            "Keywords": {
                                "type": "array",
                                "items": {"type": "string"} 
                            }
                        },
                        "additionalProperties": False
                    }
                }
            },
            "required": ["memes"],
            "additionalProperties": False
        }
    }

    # Every feature in OPTION_SETS is multi-select: an array of values drawn from its enum
    for field, options in OPTION_SETS.items():
        json_schema["schema"]["properties"]["memes"]["items"]["properties"][field] = {
            "type": "array",
            "items": {"type": "string", "enum": options}
        }

    # Construct batch prompt with strict format
    formatted_memes = "\n".join([f"- Meme {i+1}: \"{meme}\"" for i, meme in enumerate(meme_templates)])

    prompt = f"""
    You are an AI that generates structured marketing insights for multiple meme templates.

    - Always return **only valid JSON** that follows the schema.
    - **Each meme should be processed separately in the JSON array.**
    - Do **not mix data between memes**; keep them completely independent.
    - If you are unsure, **pick the most relevant option from the given enums**.
    - **Each meme should have 10 keywords.**
    
    Meme Templates in this batch:
    {formatted_memes}
    """
    
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are an AI that only returns structured JSON arrays."},
            {"role": "user", "content": prompt}
        ],
        response_format={"type": "json_schema", "json_schema": json_schema}
    )

    response_text = completion.choices[0].message.content.strip()

    try:
        response_json = json.loads(response_text)  # Ensure valid JSON
        if "memes" in response_json and isinstance(response_json["memes"], list):
            return response_json["memes"]  # list of processed memes
        else:
            return {"error": "Invalid response format"}
    except json.JSONDecodeError:
        return {"error": "Invalid JSON response"}

__Utility Func 0.2:__ Creating a dataset for all meme templates and their respective features:

In [71]:
def process_meme_templates(df):
    batch_size = 5
    all_results = []
    
    if "Meme Name" not in df.columns:
        raise ValueError("Input DataFrame must contain a 'Meme Name' column.")

    # Loop through df in batches of 5:
    total_batches = (len(df) + batch_size - 1) // batch_size
    for i in range(0, len(df), batch_size):
        batch = df["Meme Name"].iloc[i:i + batch_size].tolist()  # batch of 5 meme names
        print(f"Processing batch {i // batch_size + 1} / {total_batches}...")

        try:
            batch_results = generate_meme_features_batch(batch)  
            if isinstance(batch_results, list):  # ensure valid API response
                all_results.extend(batch_results)
            else:
                print(f"Warning: Received invalid response for batch {batch}")
        except Exception as e:
            print(f"Error processing batch {batch}: {e}")
            continue  # skip to next batch in case of error
        
        # time.sleep(1) 

    memes_dataset = pd.DataFrame(all_results)

    required_columns = [
        "Meme Name", "Primary Audience", "Secondary Audience", "Industry", "Product/Service",
        "Best Platforms", "Humor Style", "Emotion Targeted", "Engagement Type",
        "Seasonality", "Tone Alignment", "Keywords"
    ]
    
    for col in required_columns:
        if col not in memes_dataset.columns:
            memes_dataset[col] = None  # Fill missing columns with None

    return memes_dataset

In [72]:
columns = ["Meme Name", "Primary Audience", "Secondary Audience", "Industry", "Product/Service", "Best Platforms", "Humor Style", "Emotion Targeted", "Engagement Type", "Seasonality", "Tone Alignment", "Keywords"]
meme_dataset = pd.DataFrame(columns=columns, index=range(len(memes)))  
meme_dataset["Meme Name"] = memes  

In [93]:
meme_dataset.head(0)

Unnamed: 0,Meme Name,Primary Audience,Secondary Audience,Industry,Product/Service,Best Platforms,Humor Style,Emotion Targeted,Engagement Type,Seasonality,Tone Alignment,Keywords


In [85]:
memes_dataset = process_meme_templates(meme_dataset.iloc[:4])

Processing batch 1 / 1...


In [86]:
memes_dataset

Unnamed: 0,Meme Name,Keywords,Primary Audience,Secondary Audience,Industry,Product/Service,Best Platforms,Humor Style,Emotion Targeted,Engagement Type,Seasonality,Tone Alignment
0,10-Guy,"[10 things, exaggeration, humor, funny, litera...","[Teens, Young Adults, Millennials]","[General Audience (All), Pop Culture Fans]","[Entertainment, General]","[Brand Awareness, Physical Products]","[Instagram, Twitter (X)]","[Relatable, Satirical]","[Happiness, Curiosity]","[Shareable, Comment Bait]",[Evergreen],[Playful]
1,1950s-Middle-Finger,"[rebellion, attitude, 1950s, nostalgia, humor,...","[Young Adults, Millennials]","[Teens, General Audience (All)]","[Fashion & Beauty, Entertainment]","[Brand Awareness, Physical Products]","[Facebook, Instagram]","[Sarcastic, Nostalgic]","[Frustration, Empowerment]","[Self-Identification, Educational]",[Evergreen],[Edgy]
2,1990s-First-World-Problems,"[complaints, first world, 90s, humor, irony, s...","[Teens, Young Adults, Millennials]","[General Audience (All), Pop Culture Fans]","[Tech & Software, General]","[Brand Awareness, Subscription Services]","[Facebook, Twitter (X)]","[Relatable, Self-Deprecating]","[Regret, Confusion]","[Comment Bait, Shareable]",[Evergreen],[Playful]
3,1st-World-Canadian-Problems,"[Canadian, problems, humor, sarcasm, culture, ...","[Young Adults, Millennials]","[Teens, General Audience (All)]","[Entertainment, General]","[Brand Awareness, Physical Products]","[Instagram, Facebook]","[Relatable, Satirical]","[Frustration, Happiness]","[Comment Bait, Shareable]",[Evergreen],[Playful]


In [95]:
memes_dataset = process_meme_templates(meme_dataset)

Processing batch 1 / 200...
Processing batch 2 / 200...
Processing batch 3 / 200...
Processing batch 4 / 200...
Processing batch 5 / 200...
Processing batch 6 / 200...
Processing batch 7 / 200...
Processing batch 8 / 200...
Processing batch 9 / 200...
Processing batch 10 / 200...
Processing batch 11 / 200...
Processing batch 12 / 200...
Processing batch 13 / 200...
Processing batch 14 / 200...
Processing batch 15 / 200...
Processing batch 16 / 200...
Processing batch 17 / 200...
Processing batch 18 / 200...
Processing batch 19 / 200...
Processing batch 20 / 200...
Processing batch 21 / 200...
Processing batch 22 / 200...
Processing batch 23 / 200...
Processing batch 24 / 200...
Processing batch 25 / 200...
Processing batch 26 / 200...
Processing batch 27 / 200...
Processing batch 28 / 200...
Processing batch 29 / 200...
Processing batch 30 / 200...
Processing batch 31 / 200...
Processing batch 32 / 200...
Processing batch 33 / 200...
Processing batch 34 / 200...
Processing batch 35 / 2

In [96]:
memes_dataset.to_csv(r"C:\Users\hp\Desktop\fyp\mvp\memes_dataset.csv", index=False)

In [102]:
memes_dataset.tail()

Unnamed: 0,Meme Name,Keywords,Primary Audience,Secondary Audience,Industry,Product/Service,Best Platforms,Humor Style,Emotion Targeted,Engagement Type,Seasonality,Tone Alignment
994,Zombie-Overly-Attached-Girlfriend,"[Obsession, Relationship, Funny, Humor, Sarcas...",[Young Adults],"[Teens, General Audience (All)]","[Entertainment, Fashion & Beauty]","[Events & Experiences, Brand Awareness]","[Instagram, Twitter (X)]","[Relatable, Absurd]","[Happiness, Excitement]","[Tag a Friend, Entertainment]",[Evergreen],"[Playful, Trendy]"
995,Zorg,"[Alien, Futuristic, Space, Funny, Weird, Sci-F...","[Teens, Young Adults, General Audience (All)]","[Pop Culture Fans, Gamers]","[Entertainment, Tech & Software]","[Tech Products, Brand Awareness]","[Instagram, Facebook]","[Absurd, Relatable]","[Curiosity, Happiness]","[Shareable, Comment Bait]",[Evergreen],[Playful]
996,Zuckerberg,"[Facebook, Tech, Innovation, Social Media, Vir...","[Young Adults, Adults]","[Professionals, Tech Enthusiasts]","[Tech & Software, B2B]","[Brand Awareness, Tech Products]","[Twitter (X), Facebook]","[Satirical, Self-Deprecating]","[Frustration, Excitement]","[Tag a Friend, Controversial]",[Evergreen],[Edgy]
997,Zura-Janai-Katsura-Da,"[Anime, Japanese, Catchphrase, Funny, Pop Cult...","[Teens, Young Adults]","[Pop Culture Fans, Gamers]",[Entertainment],"[Brand Awareness, Events & Experiences]","[Instagram, Twitter (X)]","[Nostalgic, Reaction-Based]","[Excitement, Curiosity]","[Self-Identification, Shareable]",[Evergreen],[Trendy]
998,confession-kid,"[Child, Confession, Cute, Relatable, Humor, Fu...","[Kids, Teens]","[Adults, General Audience (All)]","[Education, Entertainment]","[Brand Awareness, Food & Beverages]","[Instagram, Facebook]","[Wholesome, Relatable]","[Happiness, Frustration]","[Comment Bait, Shareable]",[Evergreen],[Playful]


- Full run over all templates: $0.14 in API cost, 25 minutes.

Adding the number of text lines for each meme (currently unused):

In [102]:
text_line_addition = pd.read_csv(r"C:\Users\hp\Desktop\fyp\mvp\memes_dataset.csv")

In [103]:
text_line_addition = text_line_addition[["Meme Name"]]

In [105]:
def get_meme_text_lines_batch(meme_names, batch_size=10):
    json_schema = {
        "name": "meme_text_lines_response",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "memes": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "required": ["Meme Name", "Text Lines"],
                        "properties": {
                            "Meme Name": {"type": "string"},
                            "Text Lines": {"type": "integer"}
                        },
                        "additionalProperties": False
                    }
                }
            },
            "required": ["memes"],
            "additionalProperties": False
        }
    }

    all_results = []

    for i in range(0, len(meme_names), batch_size):
        batch = meme_names[i:i + batch_size]

        formatted_memes = "\n".join([f"- {meme}" for meme in batch])

        prompt = f"""
        You are an AI assistant that determines the number of text lines typically present in different meme templates.

        - Your task is to return **only a single integer** indicating the number of text lines for each meme.
        - The number should match the common format of the meme.
        - Do not provide explanations or additional text.
        - Always return a valid JSON response.

        **Examples:**
        - Meme: "Drakeposting" → Response: 2
        - Meme: "Expanding Brain" → Response: 4
        - Meme: "Surprised Pikachu" → Response: 1

        Meme Templates in this batch:
        {formatted_memes}
        """

        completion = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "You are an AI that only returns structured JSON outputs."},
                {"role": "user", "content": prompt}
            ],
            response_format={"type": "json_schema", "json_schema": json_schema}
        )

        response_text = completion.choices[0].message.content.strip()

        try:
            response_json = json.loads(response_text)
            if "memes" in response_json and isinstance(response_json["memes"], list):
                all_results.extend(response_json["memes"])
            else:
                print(f"Warning: Received invalid response for batch {batch}")
        except json.JSONDecodeError:
            print(f"Error processing batch {batch}: Invalid JSON response")
            continue

    return all_results

In [106]:
text_line_results = get_meme_text_lines_batch(text_line_addition["Meme Name"].tolist()[:10])

In [107]:
text_line_results

[{'Meme Name': '10-Guy', 'Text Lines': 2},
 {'Meme Name': '1950s-Middle-Finger', 'Text Lines': 1},
 {'Meme Name': '1990s-First-World-Problems', 'Text Lines': 2},
 {'Meme Name': '1st-World-Canadian-Problems', 'Text Lines': 2},
 {'Meme Name': '2nd-Term-Obama', 'Text Lines': 2},
 {'Meme Name': 'Aaaaand-Its-Gone', 'Text Lines': 1},
 {'Meme Name': 'Ace-Primo', 'Text Lines': 2},
 {'Meme Name': 'Actual-Advice-Mallard', 'Text Lines': 2},
 {'Meme Name': 'Adalia-Rose', 'Text Lines': 1},
 {'Meme Name': 'Admiral-Ackbar-Relationship-Expert', 'Text Lines': 2}]
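
To attach these counts to the dataset, the results can be turned into a name-to-count lookup. The sketch below assumes the model may occasionally drop or rename a template, so it also collects names that received no count. The sample values are copied from the output above; the list `meme_names` and the missing entry are illustrative only.

```python
# Same structure as the list returned by get_meme_text_lines_batch
text_line_results = [
    {"Meme Name": "10-Guy", "Text Lines": 2},
    {"Meme Name": "1950s-Middle-Finger", "Text Lines": 1},
]

# Names we asked about; the last one is deliberately absent from the results
meme_names = ["10-Guy", "1950s-Middle-Finger", "Some-Missing-Template"]

# Map each returned meme name to its text-line count
lookup = {row["Meme Name"]: row["Text Lines"] for row in text_line_results}

# Names with no count, e.g. because the model skipped or altered them
missing = [name for name in meme_names if name not in lookup]
```

With pandas, the same lookup can fill a new column via `text_line_addition["Meme Name"].map(lookup)`, which leaves `NaN` for the names listed in `missing`.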

---

### Sub-Module 1

**Getting example input**: Free-form and Structured.

In [50]:
primary_audience = ["Young Adults", "Millennials"]
secondary_audience = ["Tech Enthusiasts", "Pop Culture Fans"]
industry = ["Tech & Software"]
product_service = ["Tech Products"]
best_platforms = ["Twitter (X)"]
humor_style = ["Relatable", "Satirical"]
emotion_targeted = ["Excitement", "Curiosity"]
engagement_type = ["Comment Bait"]
seasonality = ["Evergreen"]
tone_alignment = ["Playful", "Trendy"]
keywords_prompt = "We run a productivity app that helps users stay organized and get things done. It’s great for students, remote workers, and busy professionals. We want memes that highlight how frustrating staying productive can be and how our app makes it easier."

----

### Sub-Module 2

Perform __Keyword Extraction for free-form__ prompt inputs.

In [51]:
keywords = ["productivity", "organization", "time management", "students", "remote work", "professionals", "task management", "efficiency", "frustration", "workflow"]
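
The keyword list above is hard-coded, and the notebook does not show how it was produced. As a placeholder for the extraction step, here is a minimal frequency-based sketch that uses only the standard library. It is an assumption, not the method used for the list above: `extract_keywords` and `STOPWORDS` are hypothetical names, and a simple word count cannot produce multi-word or inferred keywords such as "time management".

```python
import re
from collections import Counter

# Small illustrative stopword list; a real one would be much longer
STOPWORDS = {
    "a", "an", "and", "are", "be", "can", "for", "get", "how", "is", "it",
    "our", "that", "the", "to", "we", "want", "with",
}


def extract_keywords(text, top_n=10):
    """Return up to top_n of the most frequent non-stopword words in text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


sample = "Our app helps students stay productive. Students love the app."
top_two = extract_keywords(sample, top_n=2)
```

An LLM call with a fixed prompt is a likely alternative for this step, since it can return phrases that do not appear verbatim in the input.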

In [52]:
columns = ["Keywords", "Primary Audience", "Secondary Audience", "Industry", "Product/Service", "Best Platforms", 
           "Humor Style", "Emotion Targeted", "Engagement Type", "Seasonality", "Tone Alignment"]

user_input_df = pd.DataFrame([[keywords, primary_audience, secondary_audience, industry, product_service, best_platforms, 
                               humor_style, emotion_targeted, engagement_type, seasonality, tone_alignment]], 
                              columns=columns)

In [53]:
user_input_df.head()

Unnamed: 0,Keywords,Primary Audience,Secondary Audience,Industry,Product/Service,Best Platforms,Humor Style,Emotion Targeted,Engagement Type,Seasonality,Tone Alignment
0,"[productivity, organization, time management, ...","[Young Adults, Millennials]","[Tech Enthusiasts, Pop Culture Fans]",[Tech & Software],[Tech Products],[Twitter (X)],"[Relatable, Satirical]","[Excitement, Curiosity]",[Comment Bait],[Evergreen],"[Playful, Trendy]"


----

### Sub-Module 3

In [12]:
memes_dataset_loaded = pd.read_csv(r"C:\Users\hp\Desktop\fyp\mvp\memes_dataset.csv")

In [14]:
memes_dataset_loaded.head()

Unnamed: 0,Meme Name,Keywords,Primary Audience,Secondary Audience,Industry,Product/Service,Best Platforms,Humor Style,Emotion Targeted,Engagement Type,Seasonality,Tone Alignment
0,10-Guy,"['Skepticism', 'Surprise', 'Expression', 'Humo...","['Teens', 'Young Adults', 'Millennials']","['Pop Culture Fans', 'Teens']","['Entertainment', 'General']",['Brand Awareness'],"['Instagram', 'Facebook']","['Relatable', 'Sarcastic']","['Confusion', 'Excitement']","['Shareable', 'Comment Bait']",['Evergreen'],['Playful']
1,1950s-Middle-Finger,"['Rebellion', 'Defiance', 'Vintage', 'Outrageo...","['Young Adults', 'Adults']","['General Audience (All)', 'Professionals']","['Fashion & Beauty', 'Entertainment']","['Luxury Goods', 'Brand Awareness']","['Instagram', 'Twitter (X)']","['Satirical', 'Dark Humor']","['Shock', 'Frustration']","['Controversial', 'Shareable']",['Evergreen'],['Edgy']
2,1990s-First-World-Problems,"['Entitlement', 'Sarcastic', 'Trivial', 'Strug...","['Young Adults', 'Millennials']","['Teens', 'Pop Culture Fans']",['General'],['Brand Awareness'],"['Facebook', 'Twitter (X)']","['Self-Deprecating', 'Relatable']","['Confusion', 'Happiness']","['Self-Identification', 'Comment Bait']",['Evergreen'],['Playful']
3,1st-World-Canadian-Problems,"['Stereotypes', 'Cultural', 'Humor', 'Awarenes...","['Teens', 'Young Adults']","['General Audience (All)', 'Pop Culture Fans']",['General'],['Brand Awareness'],"['Twitter (X)', 'Facebook']","['Relatable', 'Satirical']","['Happiness', 'Confusion']","['Self-Identification', 'Shareable']",['Evergreen'],['Playful']
4,2nd-Term-Obama,"['Politics', 'Satire', 'Public Figure', 'Leade...","['Adults', 'Young Adults']","['Professionals', 'General Audience (All)']","['Entertainment', 'B2B']",['Brand Awareness'],"['Twitter (X)', 'Facebook']","['Satirical', 'Reaction-Based']","['Excitement', 'Confusion']","['Controversial', 'Shareable']",['Evergreen'],['Professional']
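
Because the dataset is read back from CSV, each list column arrives as a string such as `"['Evergreen']"` rather than a Python list, as the output above shows. One way to handle this once at load time is the `converters` argument of `pd.read_csv` combined with `ast.literal_eval`. The sketch below uses a tiny in-memory CSV as a stand-in for `memes_dataset.csv`; `LIST_COLUMNS` is a hypothetical name.

```python
import ast
import io

import pandas as pd

# Tiny stand-in for memes_dataset.csv with two list-valued columns
csv_text = '''Meme Name,Keywords,Seasonality
10-Guy,"['Skepticism', 'Surprise']",['Evergreen']
'''

LIST_COLUMNS = ["Keywords", "Seasonality"]

# literal_eval safely parses the stringified lists back into Python lists
df = pd.read_csv(
    io.StringIO(csv_text),
    converters={col: ast.literal_eval for col in LIST_COLUMNS},
)
```

Note that `ast.literal_eval` raises on an empty cell, so a dataset with missing values would need a converter that returns an empty list in that case.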


__Matching Algorithm:__

In [None]:
import ast

def to_list(values):
    """Normalize a cell value (list, set, stringified list, scalar, or NaN) to a list."""
    if isinstance(values, (list, set)):
        return list(values)
    if isinstance(values, str):
        # Lists read back from CSV look like "['a', 'b']"; parse them safely (no eval)
        if values.startswith("[") and values.endswith("]"):
            try:
                return list(ast.literal_eval(values))
            except (ValueError, SyntaxError):
                return [values]  # Treat as a single string
        return [values]
    if values is None or pd.isna(values):  # Handle NaN/None cases
        return []
    return [values]  # Convert any remaining scalar into a list


def matching(user_input_df, memes_dataset_loaded):
    # Create matches_df with just column names and meme names
    matches_df = pd.DataFrame(columns=memes_dataset_loaded.columns)
    matches_df["Meme Name"] = memes_dataset_loaded["Meme Name"]

    # Iterate through each column (excluding "Meme Name")
    for col in user_input_df.columns:
        if col == "Meme Name":
            continue

        # The user input has a single row, so take its value for this column
        user_values = to_list(user_input_df[col].iloc[0])

        match_scores = []
        for meme_values in memes_dataset_loaded[col]:
            meme_values = to_list(meme_values)

            if not user_values:  # No input given
                match_score = 0
            elif len(user_values) == 1:  # Single-value match (1 or 0)
                match_score = 1 if user_values[0] in meme_values else 0
            else:  # Multi-value fractional match
                matched_count = sum(1 for value in user_values if value in meme_values)
                match_score = matched_count / len(user_values)

            match_scores.append(match_score)

        # Assign match scores to the corresponding column in matches_df
        matches_df[col] = match_scores

    return matches_df

In [71]:
matches_df = matching(user_input_df, memes_dataset_loaded)

In [79]:
matches_df

Unnamed: 0,Meme Name,Keywords,Primary Audience,Secondary Audience,Industry,Product/Service,Best Platforms,Humor Style,Emotion Targeted,Engagement Type,Seasonality,Tone Alignment
0,10-Guy,0.0,1.0,0.5,0,0,0,0.5,0.5,1,1,0.5
1,1950s-Middle-Finger,0.0,0.5,0.0,0,0,1,0.5,0.0,0,1,0.0
2,1990s-First-World-Problems,0.0,1.0,0.5,0,0,1,0.5,0.0,1,1,0.5
3,1st-World-Canadian-Problems,0.0,0.5,0.5,0,0,1,1.0,0.0,0,1,0.5
4,2nd-Term-Obama,0.0,0.5,0.0,0,0,1,0.5,0.5,0,1,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...
994,Zombie-Overly-Attached-Girlfriend,0.0,0.5,0.0,0,0,1,0.5,0.5,0,1,1.0
995,Zorg,0.0,0.5,0.5,1,1,0,0.5,0.5,1,1,0.5
996,Zuckerberg,0.0,0.5,0.5,1,1,1,0.5,0.5,0,1,0.0
997,Zura-Janai-Katsura-Da,0.0,0.5,0.5,0,0,1,0.0,1.0,0,1,0.5


__Weighting and Scoring Function:__

In [86]:
def calculate_weighted_scores(matches_df):
    WEIGHTS = {
        "Keywords": 0.1613,
        "Humor Style": 0.1613,
        "Emotion Targeted": 0.1613,
        "Tone Alignment": 0.0968,
        "Primary Audience": 0.0968,
        "Seasonality": 0.0968,
        "Best Platforms": 0.0968,
        "Secondary Audience": 0.0323,
        "Engagement Type": 0.0323,
        "Industry": 0.0323,
        "Product/Service": 0.0323
    }

    matches_df = matches_df.copy()  # avoid mutating the caller's DataFrame
    matches_df["Score"] = 0.0

    for feature, weight in WEIGHTS.items():
        matches_df["Score"] += matches_df[feature] * weight

    matches_df = matches_df.sort_values(by="Score", ascending=False).reset_index(drop=True)

    return matches_df  

In [87]:
final_score_table = calculate_weighted_scores(matches_df)

__Sorted Score-Table:__

In [88]:
final_score_table

Unnamed: 0,Meme Name,Keywords,Primary Audience,Secondary Audience,Industry,Product/Service,Best Platforms,Humor Style,Emotion Targeted,Engagement Type,Seasonality,Tone Alignment,Score
0,Macklemore-Thrift-Store,0.0,1.0,0.5,0,0,1,1.0,1.0,1,1,1.0,0.75825
1,Obama-Romney-Pointing,0.0,1.0,0.5,0,0,1,1.0,1.0,1,1,0.5,0.70985
2,Cel-Jesuno,0.0,1.0,0.5,0,0,1,1.0,1.0,1,1,0.5,0.70985
3,Terry-Davis,0.0,1.0,0.5,1,1,1,0.5,1.0,1,1,0.5,0.69380
4,RPG-Fan,0.0,1.0,0.5,1,1,1,0.5,1.0,1,1,0.5,0.69380
...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,Scary-Harry,0.0,0.5,0.0,0,0,0,0.0,0.5,0,0,0.0,0.12905
995,Scumbag-Daylight-Savings-Time,0.0,0.0,0.0,0,0,0,0.5,0.0,0,0,0.5,0.12905
996,Grumpy-Cat-Halloween,0.0,0.0,0.0,0,0,0,0.0,0.5,0,0,0.5,0.12905
997,Hohoho,0.0,0.0,0.0,0,0,0,0.0,0.5,0,0,0.5,0.12905


In [89]:
display(final_score_table[["Meme Name", "Score"]].head(10))

Unnamed: 0,Meme Name,Score
0,Macklemore-Thrift-Store,0.75825
1,Obama-Romney-Pointing,0.70985
2,Cel-Jesuno,0.70985
3,Terry-Davis,0.6938
4,RPG-Fan,0.6938
5,Look-At-Me,0.6776
6,Doge,0.6776
7,Dave-Chappelle,0.6776
8,Fifa-E-Call-Of-Duty,0.66155
9,Sweaty-Concentrated-Rage-Face,0.66155
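
As a check on the scoring function, the score of the top row can be reproduced by hand from its match values and the weights in `calculate_weighted_scores`. The values below are copied from the Macklemore-Thrift-Store row of the score table; `top_row` is a name introduced only for this check.

```python
WEIGHTS = {
    "Keywords": 0.1613, "Humor Style": 0.1613, "Emotion Targeted": 0.1613,
    "Tone Alignment": 0.0968, "Primary Audience": 0.0968,
    "Seasonality": 0.0968, "Best Platforms": 0.0968,
    "Secondary Audience": 0.0323, "Engagement Type": 0.0323,
    "Industry": 0.0323, "Product/Service": 0.0323,
}

# Match scores for Macklemore-Thrift-Store, taken from the table above
top_row = {
    "Keywords": 0.0, "Primary Audience": 1.0, "Secondary Audience": 0.5,
    "Industry": 0, "Product/Service": 0, "Best Platforms": 1,
    "Humor Style": 1.0, "Emotion Targeted": 1.0, "Engagement Type": 1,
    "Seasonality": 1, "Tone Alignment": 1.0,
}

# Weighted sum of the per-column match scores
score = sum(top_row[feature] * weight for feature, weight in WEIGHTS.items())
weight_total = sum(WEIGHTS.values())

print(round(score, 5))
print(round(weight_total, 4))
```

The weights add up to 1.0003 rather than exactly 1, so a template that matched every column perfectly would score slightly above 1. This does not affect the ranking.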


### Sub-Module 4

__Utility Function for OpenAI call:__

In [108]:
def generate_meme_text(meme_template_name, keywords_prompt, structured_inputs):
    prompt = f"""
    You are an AI meme generator that produces structured meme text for different meme templates.
    - Always return the output in **valid JSON format**.
    - Ensure the output follows the specific **meme template's format**.
    - Base your meme text on the provided **keywords** and **structured inputs**.
    - Your response must **ONLY contain the formatted meme text in JSON**.
    - The meme format ALWAYS has exactly **two lines of text**.

    Meme Template: {meme_template_name}

    Keywords Prompt: "{keywords_prompt}"

    Structured Inputs:
    {json.dumps(structured_inputs, indent=2)}

    Format your response strictly as:
    {{
        "meme_text": [
            "Line 1 of meme",
            "Line 2 of meme"
        ]
    }}
    """
    
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"}
    )

    try:
        meme_text_output = json.loads(response.choices[0].message.content.strip())
        return meme_text_output.get("meme_text", ["Error: Invalid JSON response"])
    except json.JSONDecodeError:
        return ["Error: Invalid JSON response"]
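
Since `generate_meme_text` returns a one-element error list on failure, callers cannot tell a valid result from an error without inspecting it. A small validation step before the image-generation call may help. The sketch below assumes the two-line format stated in the prompt; `validate_meme_text` is a hypothetical helper.

```python
def validate_meme_text(meme_text, expected_lines=2):
    """Return the stripped lines if meme_text is a list of exactly
    expected_lines non-empty strings; otherwise return None."""
    if not isinstance(meme_text, list) or len(meme_text) != expected_lines:
        return None
    if not all(isinstance(line, str) for line in meme_text):
        return None
    cleaned = [line.strip() for line in meme_text]
    if not all(cleaned):  # reject empty or whitespace-only lines
        return None
    return cleaned


good = validate_meme_text(["Many tasks, very overwhelmed ", "But with productivity app, wow!"])
bad = validate_meme_text(["Error: Invalid JSON response"])
```

A `None` result can then trigger a retry or cause the template to be skipped instead of rendering an error message onto the meme.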

In [109]:
def generate_meme_texts_for_top_10(final_score_table, keywords_prompt, user_input_df):
    meme_texts_dict = {}

    # final_score_table holds match scores (0-1), not the user's selections,
    # so the structured inputs for the prompt are read from the user's input row.
    user_row = user_input_df.iloc[0]
    structured_inputs = {
        "primary_audience": user_row["Primary Audience"],
        "secondary_audience": user_row["Secondary Audience"],
        "industry": user_row["Industry"],
        "product_service": user_row["Product/Service"],
        "best_platforms": user_row["Best Platforms"],
        "humor_style": user_row["Humor Style"],
        "emotion_targeted": user_row["Emotion Targeted"],
        "engagement_type": user_row["Engagement Type"],
        "tone_alignment": user_row["Tone Alignment"]
    }

    for meme_template_name in final_score_table["Meme Name"]:
        meme_texts_dict[meme_template_name] = generate_meme_text(
            meme_template_name, keywords_prompt, structured_inputs
        )

    return meme_texts_dict

In [110]:
meme_texts = generate_meme_texts_for_top_10(final_score_table.iloc[:10], keywords_prompt, user_input_df)

In [111]:
print(meme_texts)

{'Macklemore-Thrift-Store': ['When you have 10 tasks but also Netflix...', 'Our app helps you stay on track like a thrift store find!'], 'Obama-Romney-Pointing': ["When you're trying to stay productive...", '...but distractions keep calling your name.'], 'Cel-Jesuno': ["When you're trying to stay productive...", '...but your to-do list keeps multiplying!'], 'Terry-Davis': ['When you’re trying to stay productive but distractions keep attacking', 'Good thing I have this app to keep me on track!'], 'RPG-Fan': ['When you try to stay productive...', '...but your to-do list has its own agenda.'], 'Look-At-Me': ['When you try to stay productive without our app...', '...and end up in chaos instead!'], 'Doge': ['Many tasks, very overwhelmed', 'But with productivity app, wow!'], 'Dave-Chappelle': ['When you try to stay productive without our app...', '...and end up doing everything but working.'], 'Fifa-E-Call-Of-Duty': ['Trying to stay productive without our app be like:', 'Fifa-E-Call-Of-Duty 

### Sub-Module 5

- Create a call function for when the user wants more examples of a specific meme that we showed. Let the user choose between two options:
    - Did you like just the template? In this case, pass only the template name in the call prompt.
    - Did you like the meme as a whole (both the template and the message)? In this case, pass both the template name and the example we generated before.
- Generate 3 more examples for the same meme and display them to the user.
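
The two options above differ only in what goes into the prompt, so a single prompt builder can cover both. The sketch below is one possible shape for it: `build_regeneration_prompt` and its wording are assumptions, and the resulting string would still need to be sent through an OpenAI call like the one in Sub-Module 4.

```python
def build_regeneration_prompt(template_name, previous_lines=None, n_examples=3):
    """Build the prompt for generating more examples of one template.

    previous_lines is passed only when the user liked the whole meme
    (template and message); otherwise only the template name is used.
    """
    parts = [
        f"Generate {n_examples} new meme texts for the template: {template_name}.",
        "Each example must have exactly two lines of text.",
    ]
    if previous_lines:
        parts.append("The user liked this example; keep a similar message and tone:")
        parts.extend(f"- {line}" for line in previous_lines)
    else:
        parts.append("The user liked only the template; the message can change freely.")
    return "\n".join(parts)


template_only = build_regeneration_prompt("Doge")
whole_meme = build_regeneration_prompt(
    "Doge", ["Many tasks, very overwhelmed", "But with productivity app, wow!"]
)
```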