# American Pizza Project — Lloom Theme Induction Experimentation Notebook

This notebook walks through how lloom themebuilding can be applied to the American Pizza Project:
1) Setup imports, establish API key, and load dataset
2) Preprocess / slice dataset -> Experiment filtering input data to include only particular demographics and questions!
3) Induce themes with LLooM  -> Experiment inputting various seed theme terms!
4) Review results, create data visualizations


Setup imports, establish API key, and load dataset

In [8]:
# If not already done, install packages. May need to restart kernel after.
!pip install text_lloom pyyaml pandas openpyxl
!pip install python-dotenv
!pip install pandas

zsh:1: command not found: pip
zsh:1: command not found: pip
zsh:1: command not found: pip


In [None]:
# Setup: imports used throughout the notebook
import sys
import subprocess
import os
import asyncio
import pandas as pd
import text_lloom.workbench as wb



# Set working directory
os.chdir('/Users/ltraum/Documents/GitHub/AmericanPizzaProject')

# set api key
os.environ["OPENAI_API_KEY"] = "REPLACE W YOUR OWN"

#load data
data_path = "data/pizza_interviews.xlsx"
df = pd.read_excel(data_path)

# Preview data
print(df.columns)
df.head()

ModuleNotFoundError: No module named 'pandas'

Preprocess / slice dataset -> Experiment filtering input data to include only particular demographics and questions!

In [4]:
import sys, os
print("Python:", sys.executable)
print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV"))


Python: /opt/homebrew/opt/python@3.11/bin/python3.11
VIRTUAL_ENV: /Users/ltraum/Documents/GitHub/AmericanPizzaProject/venv311


In [None]:
# Preprocess / slice data
# This filtering step needs work

def filter_demographics(
    df, regions=None, ages=None, income=None, diet=None
):
    df_filtered = df.copy()
    if regions:
        df_filtered = df_filtered[df_filtered["region_of_residence"].isin(regions)]
    if ages:
        df_filtered = df_filtered[df_filtered["age"].isin(ages)]
    if income:
        df_filtered = df_filtered[df_filtered["income"].isin(income)]
    if diet:
        df_filtered = df_filtered[df_filtered["food_restrictions"].isin(diet)]
    return df_filtered.reset_index(drop=True)

# Example: filter to just Northeast region
# filtered = filter_demographics(df, regions=['Northeast'], ages=['18-40'])
# filtered.head()

In [40]:
#all columns of interest
demo_cols = [
    "participant_id",
    "age",
    "city_of_residence",
    "state_of_residence",
    "region_of_residence",
    "income",
    "pizza_consumption",
    "food_restrictions"
]

# All response columns
response_cols = ["q1_response", "q2_response", "q3_response", "q4_response", "q5_response"]

# Add a column that concatenates all Q responses (handles missing by skipping blanks)
df["all_responses"] = df[response_cols].apply(
    lambda row: " ".join([str(r) for r in row if pd.notnull(r) and r.strip() != ""]),
    axis=1
)

# Build q_all_df with all demographics and the concatenated text
q_all_df = df[demo_cols + ["all_responses"]].rename(columns={"all_responses": "text"}) #toggle with responses included!
q_4_df = df[demo_cols + ["q4_response"]].rename(columns={"all_responses": "text"})

# Show the columns and a preview
print(q_all_df.columns)
q_all_df.head()

Index(['participant_id', 'age', 'city_of_residence', 'state_of_residence',
       'region_of_residence', 'income', 'pizza_consumption',
       'food_restrictions', 'text'],
      dtype='object')


Unnamed: 0,participant_id,age,city_of_residence,state_of_residence,region_of_residence,income,pizza_consumption,food_restrictions,text
0,1,32,Boston,Massachusetts,Northeast,$65k,Weekly,No food restrictions,My big pizza moment was trying Regina's in the...
1,2,45,Atlanta,Georgia,South,$85k,Monthly,No food restrictions,I wouldn't call it a turning point exactly. Gr...
2,3,67,Miami,Florida,South,$45k,Occasionally eat pizza,No food restrictions,My relationship with pizza has gone through ph...
3,4,23,Chicago,Illinois,Midwest,$28k,Weekly,No food restrictions,I finally tried deep dish at Lou Malnati's sop...
4,5,41,Birmingham,Alabama,South,$52k,Weekly,Lactose intolerant,"Honestly, I haven't had some big pizza awakeni..."


Induce themes with LLooM  -> Experiment inputting various seed theme terms!

In [None]:
# Prepare the lloom object
l = wb.lloom(q_all_df, text_col="text", id_col="participant_id")

async def extract_lloom_concepts(l, max_concepts=5, seed=""): 
    # Use gen_auto for one-step themes. seed is optional for steering.
    score_df = await l.gen_auto(max_concepts=max_concepts, seed=seed, debug=False)
    # Export: returns a summary per concept, ready for reporting
    export_df = l.export_df()
    return score_df, export_df

score_df, export_df = await extract_lloom_concepts(l, max_concepts=5, seed="convenience") #experiment adding stearing term like "taste" or "convenience" to seed



[1mEstimated cost[0m: $0.12
**Please note that this is only an approximate cost estimate**


[48;5;117mDistill-filter[0m
✅ Done    


[48;5;117mDistill-summarize[0m
✅ Done    


[48;5;117mCluster[0m
✅ Done    


[48;5;117mSynthesize[0m
⠹ Loading 



✅ Done    
✅ Done with concept generation!


[1mActive concepts[0m (n=5):
- [1mFamily Gatherings[0m: Does the text example describe pizza being used as a meal during family gatherings or events?
- [1mConvenience for Families[0m: Does the text example emphasize pizza as a convenient meal option for busy families?
- [1mFamily Gatherings[0m: Does the text describe pizza being used as a central part of family gatherings or events?
- [1mFamily Bonding[0m: Does the text mention pizza as a means to enhance family bonding or shared experiences?
- [1mNostalgic Family Memories[0m: Does the text evoke nostalgic memories related to family and pizza?


Scoring 5 concepts for 50 documents
[1mEstimated cost[0m: $0.02
**Please note that this is only an approximate cost estimate**
100%|██████████| 5/5 [00:44<00:00,  8.84s/it]
✅ Done with concept scoring!


KeyError: 'Family Gatherings'

In [None]:
# main outputs tables
print("Score DataFrame:")
display(score_df.head(10))
print("Exported Concepts/Themes:")
display(export_df.head(10))

Score DataFrame:


Unnamed: 0,doc_id,text,concept_id,concept_name,concept_prompt,score,rationale,highlight,concept_seed
0,1,Pizza is my social food - Friday nights with m...,8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text clearly describes pizza as a social f...,It's comfort food that brings people together ...,family
1,2,Pizza is weekend or special occasion food for ...,8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text clearly describes pizza nights as a s...,Pizza is a way to connect with my family.,family
2,3,Pizza is mostly about family time now.\nWhen g...,8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text clearly emphasizes the importance of ...,Pizza nights with family are about conversatio...,family
3,4,my roommate's parents took us to Lou's\nI'm a ...,8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text discusses how pizza brings the friend...,Pizza brings our friend group together,family
4,5,Pizza is family dinner once or twice a week wh...,8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text describes pizza as a regular family d...,Pizza is family dinner once or twice a week wh...,family
5,7,"Pizza happens when the grandkids visit, mostly...",8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text emphasizes that pizza is primarily or...,"Pizza happens when the grandkids visit, mostly.",family
6,9,"If my son visits with the grandkids, we might ...",8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text discusses ordering pizza when family ...,"If my son visits with the grandkids, we might ...",family
7,11,My wife and I will pick a neighborhood and try...,8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text describes a couple's shared experienc...,My wife and I will pick a neighborhood and try...,family
8,13,Pizza is family dinner maybe once or twice a m...,8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text clearly describes pizza night as a sp...,It's comfort food that brings the family toget...,family
9,14,it was an easy dinner solution and they loved ...,8e2859c1-21b0-4f55-ac88-17abdb8e8798,Family Bonding,Does the text describe pizza as a means to enh...,1.0,The text emphasizes the nostalgic and social a...,It's more about the memories and family time t...,family


Exported Concepts/Themes:


Unnamed: 0,concept,criteria,summary,rep_examples,prevalence,n_matches,highlights
0,Family Bonding,Does the text describe pizza as a means to enh...,"Pizza fosters family bonding, creating cherish...",[Pizza is my social food - Friday nights with ...,0.48,24,[Pizza is family dinner maybe once or twice a ...
1,Family Gatherings,Is pizza described as a common food choice for...,"Pizza is a beloved family meal, often enjoyed ...",[The city has good neighborhood spots that wor...,0.48,24,"[If my son visits with the grandkids, we might..."
2,Family Occasions,Is pizza mentioned as a food associated with s...,"Pizza is a cherished family tradition, often e...",[it was an easy dinner solution and they loved...,0.34,17,"[Pizza happens maybe once every couple months,..."
3,Family Preferences,Does the text discuss specific pizza preferenc...,"Pizza is a family favorite, often chosen for g...",[Pizza is family dinner once or twice a week w...,0.48,24,[It's also easy dinner when both parents are w...
4,Family Traditions,Does the text mention pizza as part of a famil...,"Family traditions revolve around pizza nights,...","[If my son visits with the grandkids, we might...",0.36,18,[Pizza nights with family are about conversati...


In [None]:
# report per-theme results for future app
for i, row in export_df.iterrows():
    print(f"\nTheme {i+1}: {row['concept']}")
    print(f"Criteria: {row['criteria']}")
    print(f"Summary: {row['summary']}")
    print(f"Prevalence: {row['prevalence']*100:.1f}% of participants")
    print("Representative Examples:")
    for ex in row['rep_examples']:
        print("-", ex)
    print("-" * 40)

# Save results to CSV for future dashboard use
export_df.to_csv("lloom_themes_summary.csv", index=False)
score_df.to_csv("lloom_theme_scores.csv", index=False)


Theme 1: Family Bonding
Criteria: Does the text describe pizza as a means to enhance family bonding or togetherness?
Summary: Pizza fosters family bonding, creating cherished memories and connections during casual meals and gatherings.
Prevalence: 48.0% of participants
Representative Examples:
- Pizza is my social food - Friday nights with my girlfriend watching movies, or hitting spots in the North End with friends before Red Sox games.
I rarely eat it alone or on-the-go.
It's comfort food that brings people together and forces us to slow down during busy weeks.
Weekend afternoons we'll walk to our local place and sit outside with slices and beers.
- Pizza is Lions game food, family gatherings, and weekend comfort.
During football season, Buddy's delivery is basically mandatory for watching games.
It's also celebration food - promotions, birthdays, good news gets celebrated with Detroit square pizza.
Sunday dinner with extended family often includes picking up squares from our favori

In [38]:
#l.vis()
l.vis(slice_col="region_of_residence")



MatrixWidget(data='[{"id":"All","value":24,"example":"All","_my_score":0,"concept":"Family Bonding","n":24},{"…