# Mello ML - Fresh Unified Personality System

This notebook demonstrates the complete pipeline:
- **Unified personality approach**: Cultural data informs personality traits
- **768D embeddings**: One vector for interests plus one for each of 5 personality traits (4608D combined)
- **50 archetypes**: Diverse synthetic user generation
- **Real user support**: Load from JSON files
- **2D visualization**: PCA and UMAP plotting
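
As a rough illustration of the embedding layout listed above, the sketch below concatenates one interests vector with five trait vectors into a single combined vector. The helper name `combine_embeddings` and the fixed trait order are assumptions made for this example; the notebook's own `User.get_combined_embedding()` may be implemented differently.

```python
import numpy as np

EMBED_DIM = 768
# Assumed ordering for illustration; the real pipeline may order traits differently.
TRAITS = ["Openness", "Conscientiousness", "Extraversion", "Agreeableness", "Neuroticism"]

def combine_embeddings(interests, traits):
    """Concatenate the interests vector with the five trait vectors in a fixed order."""
    parts = [np.asarray(interests, dtype=float)]
    parts += [np.asarray(traits[name], dtype=float) for name in TRAITS]
    for part in parts:
        if part.shape != (EMBED_DIM,):
            raise ValueError(f"expected shape ({EMBED_DIM},), got {part.shape}")
    return np.concatenate(parts)

# Random stand-in vectors; real embeddings would come from the embedding model.
rng = np.random.default_rng(0)
interests = rng.normal(size=EMBED_DIM)
traits = {name: rng.normal(size=EMBED_DIM) for name in TRAITS}

combined = combine_embeddings(interests, traits)
print(combined.shape)  # (4608,)
```

Keeping the order fixed matters: two combined vectors are only comparable if the same slice always holds the same trait.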

## 📦 Imports and Setup

In [1]:
%load_ext autoreload
%autoreload 2
# Suppress UserWarning noise from imported libraries
import warnings
warnings.filterwarnings('ignore', category=UserWarning)

# Core imports - Plotly for interactive visualization
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import logging
from IPython.display import display, HTML

# Fresh system components
from user import User
from profile_generator import ProfileGenerator
from embedding_generator import EmbeddingGenerator
from population import Population
from visualizer import Visualizer

# Configure logging
logging.basicConfig(level=logging.WARNING)  # Reduce noise

# Initialize components
print("🚀 Initializing Mello ML Components...")
profile_generator = ProfileGenerator()
embedding_generator = EmbeddingGenerator()
population = Population("Mello Campus Population")
visualizer = Visualizer()

print(f"✅ Components initialized:")
print(f"   📝 ProfileGenerator: {profile_generator.model}")
print(f"   🔢 EmbeddingGenerator: {embedding_generator}")
print(f"   📊 Architecture: 768D interests + 5×768D traits = 4608D combined")
print(f"   🎭 Archetypes available: {len(profile_generator.archetypes)}")
print(f"   📏 Similarity metric: Euclidean Distance (L2 norm, normalized, 0=different, 1=identical)")
print(f"   🎨 Visualization: Interactive Plotly interface")

🚀 Initializing Mello ML Components...
✅ Components initialized:
   📝 ProfileGenerator: google/gemini-2.5-flash
   🔢 EmbeddingGenerator: EmbeddingGenerator(model=text-embedding-004, dims=768, requests=0)
   📊 Architecture: 768D interests + 5×768D traits = 4608D combined
   🎭 Archetypes available: 50
   📏 Similarity metric: Euclidean Distance (L2 norm, normalized, 0=different, 1=identical)
   🎨 Visualization: Interactive Plotly interface
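
The similarity metric is described only as a normalized Euclidean distance where 0 means different and 1 means identical; the exact formula used by `Population` is not shown. One plausible mapping, given here purely as an assumption, is to L2-normalize both vectors (so their distance lies in [0, 2]) and rescale it to a similarity. The function name `euclidean_similarity` is hypothetical.

```python
import numpy as np

def euclidean_similarity(a, b):
    """Map the Euclidean distance between unit-normalized vectors to a [0, 1] score."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm_a, norm_b = np.linalg.norm(a), np.linalg.norm(b)
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot normalize a zero vector")
    # For unit vectors the distance ranges from 0 (same direction) to 2 (opposite).
    distance = np.linalg.norm(a / norm_a - b / norm_b)
    return 1.0 - distance / 2.0

v = np.array([1.0, 2.0, 3.0])
print(euclidean_similarity(v, v))    # identical vectors score 1.0
print(euclidean_similarity(v, -v))   # opposite vectors score approximately 0
```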


## 🎭 Generate 50 Synthetic Users

Generates diverse synthetic users from the 50 personality archetypes, then builds a unified profile (interests plus personality) and embeddings for each.
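
One way to structure such a generation run is to count attempts and successes separately, so a progress line of the form "k/n" can never report more completions than attempts. The sketch below uses stand-in stages instead of real API calls; `generate_users`, `make_user`, and the stage callables are hypothetical names, not part of the Mello ML modules.

```python
from collections import Counter

def generate_users(n_attempts, make_user, stages):
    """Try to build n_attempts users; return the successes and a failure tally.

    make_user(i) returns a user object or None. Each stage is a (label, fn)
    pair where fn(user) returns True on success; stages run in order and
    stop at the first failure.
    """
    users, failures = [], Counter()
    for i in range(n_attempts):
        user = make_user(i)
        if user is None:
            failures["data generation"] += 1
            continue
        failed = next((label for label, fn in stages if not fn(user)), None)
        if failed is not None:
            failures[failed] += 1
            continue
        users.append(user)
    return users, failures

# Stand-in stages that fail deterministically, in place of LLM and embedding calls.
make_user = lambda i: {"id": i}
stages = [
    ("profiles", lambda u: u["id"] % 5 != 0),
    ("embeddings", lambda u: u["id"] != 7),
]

users, failures = generate_users(10, make_user, stages)
print(f"{len(users)}/10 attempts succeeded; failures: {dict(failures)}")
```

Guard clauses (`continue` on failure) keep the loop flat, and the failure tally shows which stage is the weakest link.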


In [13]:
print("🎭 Generating 50 synthetic users with unified personality approach...")
print("This may take several minutes due to API calls...")
print()


synthetic_users_generated = len(population.users)
target_count = 158 + synthetic_users_generated

for i in range(target_count):
    print(f"   Generating user {i+1}/{target_count}...", end=" ")
    
    try:
        # Generate synthetic user data
        user_data = profile_generator.generate_synthetic_user_data()
        
        if user_data:
            # Create user from generated data
            user = User.from_json_data(user_data)
            
            # Generate unified profiles (interests + personality)
            profiles_success = profile_generator.generate_complete_profiles(user)
            
            if profiles_success:
                # Generate embeddings (768D each)
                embeddings_success = embedding_generator.embed_user_complete(user)
                
                if embeddings_success:
                    population.add_user(user)
                    synthetic_users_generated += 1
                    archetype = user_data.get('metadata', {}).get('archetype', 'Unknown')
                    print(f"✅ {user.name} ({archetype[:30]}...)")
                else:
                    print(f"❌ Failed embeddings")
            else:
                print(f"❌ Failed profiles")
        else:
            print(f"❌ Failed data generation")
    
    except Exception as e:
        print(f"❌ Error: {e}")
    
    # Progress update every 10 users
    if (i + 1) % 10 == 0:
        print(f"\n📈 Progress: {synthetic_users_generated}/{i + 1} users completed\n")

print(f"\n🎉 Synthetic user generation complete!")
print(f"✅ Successfully generated: {synthetic_users_generated}/{target_count} users")
print(f"📊 Success rate: {synthetic_users_generated/target_count*100:.1f}%")
print(f"👥 Population size: {len(population)} users")

🎭 Generating 50 synthetic users with unified personality approach...
This may take several minutes due to API calls...

   Generating user 1/200... ✅ James Vargas
classYear: 2026
major: Computer Science
bio: Future-proofing the present, one line of code at a time. I'm fascinated by the intersection of technology and society, especially how our digital creations are reshaping what it means to be human. When I'm not coding, you can usually find me lost in a cyberpunk novel or researching the latest advancements in AI.
interests: Cyberpunk literature, Artificial Intelligence, Ethical Hacking, Virtual Reality, Sci-Fi films, Software Engineering, Neuromorphic Computing (tech geek drawn to cyberpunk a...)
   Generating user 2/200... ✅ Beth Kelly
classYear: 2026
major: Film Production & Studies
bio: Constantly chasing the next visual or sonic inspiration. I'm fascinated by how stories can be told without words, and the raw emotion found in a perfectly imperfect melody. Currently trying to fig



✅ Donna Miller
classYear: 2025
major: Philosophy
bio: I find beauty in the unadorned. Life's grand narratives often hide in the quiet hum of everyday moments, and I seek to understand them. Give me a good book, a quiet corner, and meaningful conversation over anything else.
interests: existentialism, acoustic guitar, indie folk, hiking, ethical consumption, journaling, classic literature (minimalist who prefers simple,...)
   Generating user 10/200... ✅ Alyssa Roberts
classYear: 2026
major: Anthropology
bio: You never know what's around the next corner, and that's the best part! I'm always down for a spontaneous road trip, exploring new places, or just seeing where the day takes me. Let's make some unforgettable memories.
interests: urban exploration, thrift store diving, hiking unknown trails, independent film, open mic nights, trying new foods, stargazing (spontaneous adventurer open to...)

📈 Progress: 52/10 users completed

   Generating user 11/200... ✅ Tammy Henderson
classYear: 



✅ Dr. Jordan Shannon
classYear: 2025
major: Philosophy and Political Science
bio: I believe in the boundless potential of humanity and the power of collective imagination to build a more just and beautiful world. My days are spent exploring audacious ideas and finding inspiration in the stories of those who dare to dream. Every sunrise is an invitation to contribute to the tapestry of a brighter future.
interests: utopian literature, social justice, sustainable communities, ethical AI, philosophical debates, intentional living, documentary filmmaking (dreamy idealist drawn to utopi...)
   Generating user 24/200... ✅ Dustin Mullen
classYear: 2026
major: Comparative Literature
bio: My natural habitat is a quiet corner of the library, preferably with a well-worn copy of Montaigne or an obscure Russian novel. I find the exploration of human thought across centuries to be the most compelling adventure one can embark on, and I often lose myself in the intricate dance of ideas within classic 



✅ Jennifer Ray
classYear: 2026
major: Digital Media Arts
bio: My sketchbook is basically a portal to another dimension, and I'm always looking for a new medium to bend to my will. Whether it's coding a generative art piece or building a sculpture out of found objects, I love the thrill of making something completely new. There are no rules, just endless possibilities.
interests: generative art, avant-garde film, experimental music, street art, performance art, graphic design, interactive installations (curious experimenter always tr...)
   Generating user 36/200... ✅ Edward Dominguez
classYear: 2026
major: Philosophy, Politics, and Economics (PPE)
bio: I believe a brighter, more equitable world isn't just a dream – it's an achievable goal if we work together with purpose and compassion. My passions lie in exploring radical solutions for social justice and imagining how we can build systems that truly nurture human potential. The future is ours to sculpt, and I'm here to contribute to s



✅ Eleanor Vance
classYear: 2026
major: Art History
bio: I find beauty in the quiet details, often overlooked. My greatest joy comes from contemplating how art, in its many subtle forms, reflects the human experience across time. 
interests: Renaissance portraiture, chiaroscuro lighting, independent film, ambient music, antique bookstores, urban sketching, black and white photography (quiet observer interested in s...)

📈 Progress: 92/50 users completed

   Generating user 51/200... ✅ Spencer White
classYear: 2026
major: Interdisciplinary Arts & Technology
bio: I'm constantly on the hunt for the next creative rabbit hole to dive down, whether it's coding a generative art piece or learning to bind custom books. The intersection of emerging tech and timeless artistic expression is where I really thrive. If it involves a new medium or a wild idea, count me in!
interests: generative art, experimental music production, letterpress printing, interactive installations, speculative fiction, dig

ERROR:profile_generator:Failed to parse JSON from personality response: ```json
{
  "Openness": "This individual exhibits a moderate degree of openness, demonstrating a selective curiosity for new experiences. While comfortable exploring novel artistic expressions, music ...
ERROR:profile_generator:Failed to generate personality profiles for Roxy "Rox" Malone
classYear: 2026
major: Comparative Literature (with a focus on underground press)
bio: Corporate conformity is a disease, and I'm the cure. You can find me lurking in forgotten corners of the city, notebook in hand, sketching the shadows of modern decay. If it's mainstream, I'm out; if it's got grit and a story to tell, I'm all in.
interests: Zine making, experimental music, street art, anarcho-punk history, found poetry, avant-garde cinema, urban exploration


❌ Failed profiles
   Generating user 60/200... ✅ Mitchell Navarro
classYear: 2025
major: Urban Studies and Media Production
bio: My city is my muse—the concrete jungle fuels my creativity and keeps me on my toes. I'm all about capturing the pulse of urban life through film and sound, exploring how our contemporary culture shapes and is shaped by the places we inhabit. When I'm not in class, you'll probably find me at a pop-up gallery or hunting for the best street art.
interests: indie film, street art, electronic music, urban exploration, contemporary art, pop-up events, documentary photography (urban dweller fascinated by ci...)

📈 Progress: 101/60 users completed

   Generating user 61/200... ✅ Denise Robinson
classYear: 2025
major: Sociology and Ethnic Studies (double major)
bio: I'm passionate about amplifying marginalized voices and working towards a more equitable world. Every day presents an opportunity to learn, challenge the status quo, and build bridges through honest dialog



✅ William Frey
classYear: 2026
major: Sociology and Political Science
bio: I believe in the power of collective action to create a more equitable world. My passion lies in amplifying marginalized voices and challenging systemic injustices through education and advocacy. I'm always looking for new ways to engage in meaningful dialogue and translate empathy into tangible change.
interests: social justice, intersectionality, community organizing, documentary filmmaking, spoken word poetry, podcasting, international relations (social activist interested in ...)
   Generating user 76/200... 



✅ Andrew "Aethelred" Sterling
classYear: 2026
major: Computer Science & Digital Media Studies (Dual Major)
bio: My brain runs on lines of code and the gritty aesthetics of neo-Tokyo. I’m endlessly fascinated by the bleeding edge of tech and how it’s shaping our future, for better or for worse. When I’m not debugging or concepting, you can probably find me lost in a cyberpunk novel or tinkering with some bizarre new gadget.
interests: Cyberpunk fiction, AI ethics, generative art, open-source software, retro-futurism, transhumanism, modular synthesizers (tech geek drawn to cyberpunk a...)
   Generating user 77/200... ✅ Sheila Clark
classYear: 2025
major: Business Administration (Concentration in Operations Management)
bio: I'm not here to write a thesis, I'm here to build something that works. Give me the problem, and I'll find the most efficient way to solve it. My brain is wired for practical applications, not theoretical debates.
interests: process improvement, organizational efficien



✅ Samuel Morrison
classYear: 2025
major: Environmental Science
bio: Hey there! I'm Sam, and there's nothing I love more than exploring the great outdoors. If I'm not in class, you can probably find me hiking a new trail, trying to identify local flora, or planning my next backpacking trip. I'm passionate about protecting our planet and learning how we can live more sustainably.
interests: hiking, backpacking, birdwatching, environmental conservation, camping, nature photography, sustainable living (nature lover interested in env...)
   Generating user 79/200... ✅ Ryan Lee
classYear: 2025
major: Theatre Arts
bio: Hey y'all, Ryan here! If you need me, I'm probably on stage, rehearsing a monologue, or brainstorming my next big performance art piece. Life's a stage, and I'm just here for the dramatic entrance (and exit, of course!).
interests: acting, directing, musical theatre, improv comedy, film analysis, costume design, playwriting (extroverted performer drawn to...)
   Generating user



✅ Erik Peterson
classYear: 2025
major: Comparative Literature
bio: Shadows hold more truth than light ever could. I seek beauty in decay, solace in the macabre, and understanding in the convoluted depths of the human psyche. Don't bother me with your trivialities.
interests: gothic literature, film noir, existential philosophy, true crime podcasts, cemeteries, dark academia fashion, psychological thrillers (dark aesthete fascinated by go...)
   Generating user 89/200... ✅ Lisa Mitchell
classYear: 2025
major: Computer Science
bio: I thrive on bringing order to complexity, whether it's optimizing algorithms or meticulously planning my semester. My ideal narrative involves clear goals, logical steps, and a satisfying, well-defined conclusion. I believe thorough preparation is the key to both academic and personal success.
interests: algorithmic puzzles, detailed itinerary planning, classic mystery novels, project management, financial planning, competitive strategy games, organized hiking



✅ Michael Watson
classYear: 2026
major: Classics
bio: I find true solace and profound understanding in the narratives and artistry of antiquity. The enduring wisdom of the Greeks and Romans continues to illuminate the human condition, offering timeless insights that resonate as powerfully today as they did millennia ago. My studies are a journey through the foundations of Western thought and aesthetics, revealing beauty and truth that transcend ephemeral trends.
interests: Roman history, Greek tragedy, classical architecture, ancient philosophy, Latin poetry, historical linguistics, epigraphy (timeless classicist who values...)
   Generating user 107/200... ✅ Linda Garcia
classYear: 2026
major: Philosophy, Politics, and Economics (PPE)
bio: I believe in the power of human ingenuity to build a more just and beautiful world. My heart beats for collective flourishing and exploring pathways to a future where everyone can thrive. I'm always seeking out new ideas and inspirations that push t



✅ Andrew Johnson
classYear: 2025
major: Theatre Arts
bio: Hey there! I'm an absolute force of nature on and off the stage. Give me a spotlight, a dramatic monologue, or a good old-fashioned improv challenge, and I'm in my element. I love pushing boundaries and making people feel something!
interests: acting, improv comedy, film noir, musical theatre, screenwriting, dramatic literature, fashion design (extroverted performer drawn to...)
   Generating user 117/200... ✅ Jamie Woodward
classYear: 2026
major: Classics
bio: Erudition is not merely a pursuit of knowledge, but a dedicated journey into the foundational texts and timeless questions that define our humanity. I am drawn to the profound wisdom embedded in ancient languages and philosophies, believing that true intellectual rigor lies in understanding the past to illuminate the present. My academic life is a devoted exploration of these classical landscapes.
interests: Ancient Greek, Latin, Roman history, Hellenistic philosophy, ety



✅ Sarah Chen
classYear: 2025
major: Forensic Psychology
bio: Give me a good whodunit and a rainy night, and I'm set. I'm endlessly fascinated by the human mind, especially the darker corners, and how it all plays out in the pursuit of justice. My ideal weekend involves a true crime podcast marathon and trying to figure out the killer before the detective does.
interests: true crime podcasts, psychological thrillers, cold case documentaries, investigative journalism, criminology, film noir, escape rooms (mystery lover obsessed with cr...)
   Generating user 123/200... ✅ Rebecca Sanchez
classYear: 2026
major: Conflict Resolution
bio: I believe that understanding each other is the first step towards a more peaceful world. I love finding common ground and helping people connect. Let's make some good things happen, together!
interests: mediation, community gardening, painting, hiking, baking, reading, mindful breathing (peaceful mediator preferring h...)
   Generating user 124/200... ✅ Came



✅ Victoria Martinez
classYear: 2026
major: Marketing
bio: Hey, I'm Victoria! Always scrolling to see what's new and what's next. From the latest TikTok dances to the hottest drops, I'm probably already obsessed and planning my next move. Gotta stay ahead of the curve, right?
interests: TikTok trends, pop music, sustainable fashion, niche beauty brands, social media strategy, indie films, cafe hopping (trend follower who stays curre...)
   Generating user 143/200... ✅ Mindy Lee
classYear: 2026
major: Communications
bio: "Obsessed with all things #PopCulture! You can usually find me scrolling TikTok for the latest trends or binge-watching the newest Netflix sensation. Let's spill some tea and make some memories!"
interests: celebrity gossip, TikTok trends, reality TV, social media, fashion, pop music, influencer culture (social butterfly who loves cel...)
   Generating user 144/200... ✅ Carla Williams
classYear: 2025
major: Classical Studies
bio: As a student of the enduring wisdom of th



✅ Maya Angelou
classYear: 2025
major: Sociology
bio: I'm passionate about amplifying marginalized voices and challenging systemic inequalities. My goal is to use my voice and education to advocate for a more just and equitable world for everyone. Let's build a better future, together!
interests: social justice, intersectionality, community organizing, poetry, documentary filmmaking, global human rights, spoken word (social activist interested in ...)
   Generating user 150/200... ✅ Ashley Austin
classYear: 2026
major: Philosophy
bio: The world is loud, but I'm listening to the whispers. I'm here to challenge the narratives, not just learn them. If it doesn't make you think, really *think*, what's the point?
interests: zine-making, experimental music, urban exploration, radical literature, street art, DIY ethics (rebel nonconformist attracted ...)

📈 Progress: 191/150 users completed

   Generating user 151/200... ✅ Tiffany Goodman
classYear: 2025
major: Communication Studies
bio: I'm a



✅ Anya Petrova
classYear: 2025
major: International Relations and East Asian Studies
bio: Originally from Sofia, Bulgaria, I'm passionate about exploring global communities and understanding different perspectives. When I'm not buried in my textbooks, you'll probably find me trying out a new recipe from a different country or debating current events. I love a good adventure and finding beauty in unexpected places!
interests: global cinema, hiking, trying new cuisines, foreign language podcasts, historical fiction, indie music, art museums (international student with div...)
   Generating user 159/200... ✅ Lindsey Gray
classYear: 2026
major: Music Composition
bio: My real life isn't quite as magical as the worlds I explore in my head, but I'm working on it! My greatest joy is creating the soundtracks to epic adventures, both real and imagined. I dream of one day scoring a sweeping fantasy film.
interests: fantasy novels, orchestral music, Dungeons & Dragons, video game soundtracks, worl



✅ Luna Everhart
classYear: 2026
major: Comparative Literature
bio: My head is almost constantly in the clouds, or more accurately, in another world. I find solace and excitement within the pages of a fantasy novel or the intricate lore of an RPG. Reality is… fine, but the possibilities of the imaginary are boundless.
interests: Fantasy novels, Worldbuilding, Role-playing games (RPGs), Digital art, Mythology, Creative writing, Stargazing (escapist dreamer seeking fanta...)
   Generating user 192/200... ✅ Anna Price
classYear: 2025
major: English Literature
bio: There's nothing quite like a well-worn classic or a story that has stood the test of time. I love delving into tales that explore universal themes and relying on familiar narratives for comfort and insight. Sometimes, the oldest stories are the newest to us.
interests: classic novels, black-and-white movies, folk music, historical fiction, letter writing, baking, antique shops (nostalgic traditionalist prefe...)
   Generating use

ERROR:profile_generator:Failed to parse JSON from personality response: ```json
{
  "Openness": "This individual demonstrates a moderate degree of openness, exhibiting a comfortable balance between novelty and familiarity. While they readily explore new music genres, sugg...
ERROR:profile_generator:Failed to generate personality profiles for Deanna Sanders
classYear: 2026
major: Sociology and Communications
bio: I truly believe in the incredible power of human connection and the boundless potential within each of us. My passion lies in understanding how people thrive and collaborating to build a more empathic and flourishing world for everyone. I'm always on the lookout for stories that highlight resilience, innovation, and the beautiful ways people lift each other up.
interests: community organizing, documentary filmmaking, ethical AI, global social movements, interfaith dialogue, creative writing, sustainable living


❌ Failed profiles
   Generating user 199/200... ✅ Ricardo Warren
classYear: 2025
major: Film & Digital Media
bio: I’m always chasing the next visual story, whether it’s through my own lens or in a hidden gem at a microcinema. My goal is to make films that challenge perception and leave you thinking long after the credits roll. If a song doesn't make me feel something visceral, it’s not for me.
interests: experimental cinema, analog photography, indie music, synthwave, obscure film festivals, screenwriting, DIY zines (creative arts student into exp...)
   Generating user 200/200... ✅ Deborah Johnson
classYear: 2026
major: Philosophy, Politics, and Economics (PPE)
bio: My heart beats for a better world! I love exploring radical ideas for social change and finding inspiration in the stories of those who dared to dream big. The future is ours to build, and I believe we can create something truly beautiful.
interests: utopian literature, social justice, sustainable living, contemplative pra
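
The two `Failed to parse JSON` errors in the log above both show a response that begins with a Markdown `json` code fence. Whether the fence itself or a cut-off body caused the failure is not visible from the truncated log, so the following is only a sketch of a more tolerant parser. `parse_model_json` is a hypothetical helper, not a function from `profile_generator`.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example contains no literal fence
_FENCED = re.compile(rf"^\s*{FENCE}(?:json)?\s*\n(.*?)\n?\s*{FENCE}\s*$", re.DOTALL)

def parse_model_json(text):
    """Parse JSON from a model reply, tolerating a surrounding Markdown code fence.

    Returns the parsed object, or None if the payload is still not valid JSON
    (for example, because the reply was cut off).
    """
    match = _FENCED.match(text)
    payload = match.group(1) if match else text
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None

fenced_reply = FENCE + 'json\n{"Openness": "moderate"}\n' + FENCE
truncated_reply = FENCE + 'json\n{"Openness": "mod'

print(parse_model_json(fenced_reply))      # parsed dict
print(parse_model_json(truncated_reply))   # None, so the caller can retry
```

Returning `None` rather than raising lets the calling loop decide whether to retry the request or skip the user.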

## 👤 Load Real User

Loads a real user from a JSON file and processes them through the same pipeline.

In [8]:
# Load real user from JSON file
real_user_path = "sofiia.json"

print(f"👤 Loading real user from: {real_user_path}")

try:
    # Load user from JSON
    real_user = User.from_json_file(real_user_path)
    real_user.special = True  # Mark as special for visualization
    
    print(f"✅ Loaded user: {real_user.name}")
    print(f"   Major: {real_user.profile_data.get('major', 'Unknown')}")
    print(f"   Bio: {real_user.profile_data.get('bio', 'No bio')[:100]}...")
    print(f"   Interests: {', '.join(real_user.profile_data.get('interests', [])[:5])}")
    
    # Generate unified personality profile from cultural data
    print(f"\n🔄 Processing {real_user.name} through unified pipeline...")
    print(f"   1. Generating unified interests profile from cultural preferences...")
    
    profiles_success = profile_generator.generate_complete_profiles(real_user)
    
    if profiles_success:
        print(f"   ✅ Generated unified profiles")
        
        # Show profile preview
        if real_user.interests_profile:
            print(f"   📖 Interests profile: {real_user.interests_profile[:150]}...")
        
        if real_user.personality_profiles:
            print(f"   🧠 Personality traits: {list(real_user.personality_profiles.keys())}")
        
        # Generate embeddings
        print(f"   2. Generating 768D embeddings...")
        embeddings_success = embedding_generator.embed_user_complete(real_user)
        
        if embeddings_success:
            print(f"   ✅ Generated embeddings (6 × 768D)")
            
            # Verify embedding dimensions
            combined = real_user.get_combined_embedding()
            if combined is not None:
                print(f"   🔢 Combined embedding shape: {combined.shape}")
            
            # Add to population
            population.add_user(real_user)
            print(f"   ✅ Added to population")
            
        else:
            print(f"   ❌ Failed to generate embeddings")
    else:
        print(f"   ❌ Failed to generate profiles")

except FileNotFoundError:
    print(f"❌ File not found: {real_user_path}")
    print(f"   Please ensure the JSON file exists in the correct location")
except Exception as e:
    print(f"❌ Error loading real user: {e}")

print(f"\n👥 Final population: {len(population)} users")
print(f"📊 Users with embeddings: {len(population.get_users_with_embeddings())}")

👤 Loading real user from: sofiia.json
✅ Loaded user: Sofiia 
   Major: Psychology 
   Bio: Interested in human psychology, global politics, music, and dance. ...
   Interests: Photography, Video-editing, Dance, Guitar, Reading

🔄 Processing Sofiia  through unified pipeline...
   1. Generating unified interests profile from cultural preferences...
   ✅ Generated unified profiles
   📖 Interests profile: This individual presents as a deeply contemplative and analytically-minded person, driven by a profound curiosity about the intricacies of human exper...
   🧠 Personality traits: ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Neuroticism']
   2. Generating 768D embeddings...
   ✅ Generated embeddings (6 × 768D)
   🔢 Combined embedding shape: (4608,)
   ✅ Added to population

👥 Final population: 263 users
📊 Users with embeddings: 255


In [2]:
# Reload a previously saved population from disk, replacing the in-memory one
population = Population()
population = population.load_from_json("mello_population.json")

## 📊 Population Statistics

Analyze the generated population and embedding quality.

In [3]:
# Get population statistics
stats = population.get_statistics()

print("📊 Population Statistics")
print("=" * 30)
#print(f"Population Name: {stats['population_name']}")
print(f"Total Users: {stats['total_users']}")
print(f"Users with Profiles: {stats['users_with_profiles']}")
print(f"Users with Embeddings: {stats['users_with_embeddings']}")

if stats['embedding_stats']:
    print(f"\n🔢 Embedding Dimensions:")
    for key, value in stats['embedding_stats'].items():
        if isinstance(value, int):
            print(f"   {key}: {value}D")
        elif isinstance(value, dict):
            print(f"   {key}:")
            for trait, dims in value.items():
                print(f"     {trait}: {dims}D")

# Find special users
special_users = [user for user in population.users if user.special]
print(f"\n⭐ Special Users: {len(special_users)}")
for user in special_users:
    print(f"   {user.name} - {user.profile_data.get('major', 'Unknown major')}")

# Embedding summary
embedding_summary = visualizer.create_embedding_summary(population)
print(f"\n🎯 Embedding Modes Available:")
for mode, info in embedding_summary['embedding_modes'].items():
    if 'users_count' in info and info['users_count'] > 0:
        print(f"   {mode}: {info['users_count']} users, {info.get('dimensions', '?')}D")

📊 Population Statistics
Total Users: 248
Users with Profiles: 248
Users with Embeddings: 240

🔢 Embedding Dimensions:
   interests_dims: 768D
   trait_dims:
     Openness: 768D
     Conscientiousness: 768D
     Extraversion: 768D
     Agreeableness: 768D
     Neuroticism: 768D
   combined_dims: 4608D

⭐ Special Users: 9
   Yahya Rahhawi - Computer science, Philosophy
   Einstein - Unknown major
   Mary Curry - Unknown major
   Bruce Wayne - Unknown major
   Jimmy McGill - Unknown major
   Leonardo da Vinci - Unknown major
   Alyosha Karamazov - Unknown major
   Sam Altman - Unknown major
   Donald Trump - Unknown major

🎯 Embedding Modes Available:
   combined: 240 users, 4608D
   interests: 240 users, 768D
   Openness: 240 users, 768D
   Conscientiousness: 240 users, 768D
   Extraversion: 240 users, 768D
   Agreeableness: 240 users, 768D
   Neuroticism: 240 users, 768D
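
The statistics above report 240 of 248 users with embeddings. A quick way to see which users are affected, and why, is to scan each stored vector for missing data, unexpected shape, or non-finite values. The sketch below works on a plain `{name: vector}` dictionary; `check_embeddings` is a hypothetical helper and does not reflect how `Population` stores its data.

```python
import numpy as np

def check_embeddings(embeddings, expected_dim=4608):
    """Return {name: problem} for entries that are missing, mis-sized, or non-finite."""
    problems = {}
    for name, vec in embeddings.items():
        if vec is None:
            problems[name] = "missing"
            continue
        arr = np.asarray(vec, dtype=float)
        if arr.shape != (expected_dim,):
            problems[name] = f"shape {arr.shape}"
        elif not np.isfinite(arr).all():
            problems[name] = "non-finite values"
    return problems

# Tiny 4-D stand-ins so the example stays readable.
embeddings = {
    "a": np.ones(4),
    "b": None,
    "c": np.ones(3),
    "d": np.array([1.0, np.nan, 0.0, 0.0]),
}
problems = check_embeddings(embeddings, expected_dim=4)
print(problems)
```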


## 🔍 Similarity Analysis

Test similarity search with the real user (if loaded).
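
As a rough picture of what a top-k similarity search involves, the sketch below ranks candidates by plain Euclidean distance to a query vector. It is an assumption-laden stand-in, not the implementation behind `population.find_similar_users`: it returns raw distances (smaller means closer) rather than the notebook's normalized score, and `top_k_similar` is a hypothetical name.

```python
import numpy as np

def top_k_similar(query, candidates, k=5):
    """Return (name, distance) pairs for the k candidates closest to `query`."""
    names = list(candidates)
    matrix = np.stack([np.asarray(candidates[n], dtype=float) for n in names])
    # Row-wise L2 distance between every candidate and the query.
    distances = np.linalg.norm(matrix - np.asarray(query, dtype=float), axis=1)
    order = np.argsort(distances)[:k]
    return [(names[i], float(distances[i])) for i in order]

candidates = {"a": [1.0, 0.0], "b": [3.0, 4.0], "c": [0.0, 0.5]}
result = top_k_similar([0.0, 0.0], candidates, k=2)
print(result)  # [('c', 0.5), ('a', 1.0)]
```

In practice the query user would be excluded from the candidate set so they are not returned as their own nearest neighbour.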

In [9]:
# Find the real user for similarity testing
real_user = None
for user in population.users:
    if user.special and user.name == "Sofiia":
        real_user = user
        break

if real_user and len(population.get_users_with_embeddings()) >= 5:
    print(f"🔍 Similarity Analysis for {real_user.name}")
    print("=" * 50)
    
    # Combined similarity (all embeddings)
    print(f"\n🎯 Most Similar Users (Combined Embeddings):")
    similar_combined = population.find_similar_users(real_user, mode='combined', top_k=5)
    
    for i, (similar_user, score) in enumerate(similar_combined, 1):
        archetype = similar_user.metadata.get('original_data', {}).get('metadata', {}).get('archetype', 'Unknown')
        print(f"   {i}. {similar_user.name}: {score:.3f} ({archetype[:40]}...)")
    
    # Interests similarity
    print(f"\n📚 Most Similar Users (Interests Only):")
    similar_interests = population.find_similar_users(real_user, mode='interests', top_k=5)
    
    for i, (similar_user, score) in enumerate(similar_interests, 1):
        archetype = similar_user.metadata.get('original_data', {}).get('metadata', {}).get('archetype', 'Unknown')
        print(f"   {i}. {similar_user.name}: {score:.3f} ({archetype[:40]}...)")
    
    # Trait-specific similarities
    print(f"\n🧠 Trait-Specific Most Similar Users:")
    traits = ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Neuroticism']
    
    for trait in traits:
        try:
            similar_trait = population.find_similar_users(real_user, mode=trait, top_k=1)
            if similar_trait:
                most_similar, score = similar_trait[0]
                print(f"   {trait}: {most_similar.name} ({score:.3f})")
        except Exception as e:
            print(f"   {trait}: Error - {str(e)[:50]}...")

else:
    print(f"⚠️  Cannot perform similarity analysis:")
    if not real_user:
        print(f"   - No real user loaded (special=True)")
    if len(population.get_users_with_embeddings()) < 5:
        print(f"   - Need at least 5 users with embeddings (have {len(population.get_users_with_embeddings())})")

⚠️  Cannot perform similarity analysis:
   - No real user loaded (special=True)
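
The earlier load step printed the user's name as `Sofiia ` with a trailing space, which would make an exact comparison against `"Sofiia"` fail. That is one plausible cause of the warning above, although the population contents at the time of this run are not shown. A whitespace- and case-tolerant lookup could look like the sketch below; `find_special_user` is a hypothetical helper and `SimpleNamespace` stands in for the real `User` class.

```python
from types import SimpleNamespace

def find_special_user(users, name):
    """Return the first special user whose name matches after trimming and case-folding."""
    wanted = name.strip().casefold()
    for user in users:
        if getattr(user, "special", False) and user.name.strip().casefold() == wanted:
            return user
    return None

# Stand-in user objects with only the attributes the lookup needs.
users = [
    SimpleNamespace(name="Sofiia ", special=True),
    SimpleNamespace(name="Einstein", special=True),
]
match = find_special_user(users, "Sofiia")
print(repr(match.name))  # 'Sofiia '
```

Normalizing names once at load time would be an alternative that keeps later comparisons simple.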


In [10]:
# Interactive Plotly Visualizations
users_with_embeddings = population.get_users_with_embeddings()

print(f"📊 Interactive Plotly Population Visualization")
print(f"Users with embeddings: {len(users_with_embeddings)}")
print(f"Similarity metric: Euclidean Distance (L2 norm, normalized)")
print()

if len(users_with_embeddings) >= 3:
    
    # 📚 Plotly PCA Visualization - Interests Only
    print(f"📚 Creating Plotly PCA - Interests Embeddings (768D → 2D):")
    try:
        fig_interests = visualizer.plot_population_pca(
            population, 
            mode='interests', 
            highlight_special=True, 
            figsize=(12, 8)
        )
        fig_interests.show()
        print(f"✅ Plotly interests PCA complete")
        
    except Exception as e:
        print(f"❌ Interests PCA failed: {e}")
    
    print()
    
    # 🧠 Plotly PCA Visualization - Combined Personality Traits
    print(f"🧠 Creating Plotly PCA - Combined Personality Traits (3840D → 2D):")
    try:
        # Build the combined personality embedding (all 5 traits concatenated).
        # numpy (np) and plotly.graph_objects (go) are already imported in the setup cell.
        from sklearn.decomposition import PCA
        from sklearn.preprocessing import StandardScaler
        
        # Get users with complete personality embeddings
        personality_users = []
        personality_embeddings = []
        
        for user in users_with_embeddings:
            trait_embeddings = []
            has_all_traits = True
            
            for trait in ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Neuroticism']:
                trait_emb = getattr(user, f'{trait.lower()}_embedding', None)
                if trait_emb is not None:
                    trait_embeddings.append(trait_emb)
                else:
                    has_all_traits = False
                    break
            
            if has_all_traits:
                personality_users.append(user)
                combined_personality = np.concatenate(trait_embeddings)
                personality_embeddings.append(combined_personality)
        
        if len(personality_embeddings) >= 3:
            personality_matrix = np.array(personality_embeddings)
            
            # Apply PCA to personality embeddings
            scaler = StandardScaler()
            embeddings_scaled = scaler.fit_transform(personality_matrix)
            pca = PCA(n_components=2)
            embeddings_2d = pca.fit_transform(embeddings_scaled)
            
            # Create Plotly figure for personality
            fig_personality = go.Figure()
            
            # Separate special and regular users
            special_indices = []
            regular_indices = []
            
            for i, user in enumerate(personality_users):
                if user.special:
                    special_indices.append(i)
                else:
                    regular_indices.append(i)
            
            # Plot regular users
            if regular_indices:
                regular_coords = embeddings_2d[regular_indices]
                regular_names = [personality_users[i].name for i in regular_indices]
                
                fig_personality.add_trace(go.Scatter(
                    x=regular_coords[:, 0],
                    y=regular_coords[:, 1],
                    mode='markers',
                    marker=dict(size=8, color='lightgreen', opacity=0.7, line=dict(width=1, color='darkgreen')),
                    name=f'Users ({len(regular_indices)})',
                    hovertext=regular_names,
                    hovertemplate='<b>%{hovertext}</b><extra></extra>'
                ))
            
            # Plot special users
            if special_indices:
                special_coords = embeddings_2d[special_indices]
                special_names = [personality_users[i].name for i in special_indices]
                
                fig_personality.add_trace(go.Scatter(
                    x=special_coords[:, 0],
                    y=special_coords[:, 1],
                    mode='markers',
                    marker=dict(size=15, color='red', opacity=0.9, symbol='star', line=dict(width=2, color='darkred')),
                    name=f'Special Users ({len(special_indices)})',
                    hovertext=special_names,
                    hovertemplate='<b>%{hovertext}</b><extra></extra>'
                ))
            
            # Update layout
            total_variance = pca.explained_variance_ratio_[:2].sum()
            fig_personality.update_layout(
                title=f'PCA Visualization - Personality Traits Embeddings<br>{population.name} ({len(personality_users)} users)',
                xaxis_title=f'PC1 ({pca.explained_variance_ratio_[0]:.1%} variance)',
                yaxis_title=f'PC2 ({pca.explained_variance_ratio_[1]:.1%} variance)',
                hovermode='closest',
                showlegend=True,
                width=960,
                height=640,
                annotations=[
                    dict(
                        text=f'Total variance explained: {total_variance:.1%}<br>Similarity metric: Euclidean Distance<br>Dimensions: 5 traits × 768D = 3840D',
                        xref="paper", yref="paper",
                        x=0.02, y=0.98, xanchor='left', yanchor='top',
                        showarrow=False,
                        font=dict(size=12),
                        bgcolor="rgba(255,255,255,0.8)",
                        bordercolor="rgba(0,0,0,0.5)",
                        borderwidth=1
                    )
                ]
            )
            
            fig_personality.show()
            print(f"✅ Plotly personality traits PCA complete")
        else:
            print(f"❌ Need at least 3 users with complete personality embeddings, got {len(personality_embeddings)}")
        
    except Exception as e:
        print(f"❌ Personality PCA failed: {e}")

else:
    print(f"⚠️  Need at least 3 users with embeddings for visualization")
    print(f"   Current: {len(users_with_embeddings)} users")
    print(f"   Generate more synthetic users in the previous cell")

print(f"\n📖 Visualization Guide:")
print(f"   📚 Interests PCA: Cultural preferences embeddings (books, movies, music)")
print(f"   🧠 Personality PCA: Big 5 personality traits embeddings (Openness, Conscientiousness, etc.)")
print(f"   ⭐ Red stars: Real users (special)")
print(f"   🔵 Blue/Green dots: Synthetic users")
print(f"   📏 Similarity: Euclidean distance (1.0 = identical, 0.0 = different)")
print(f"   🎨 Interactive: Hover for names, zoom, pan to explore")

📊 Interactive Plotly Population Visualization
Users with embeddings: 255
Similarity metric: Euclidean Distance (L2 norm, normalized)

📚 Creating Plotly PCA - Interests Embeddings (768D → 2D):


✅ Plotly interests PCA complete

🧠 Creating Plotly PCA - Combined Personality Traits (3840D → 2D):


✅ Plotly personality traits PCA complete

📖 Visualization Guide:
   📚 Interests PCA: Cultural preferences embeddings (books, movies, music)
   🧠 Personality PCA: Big 5 personality traits embeddings (Openness, Conscientiousness, etc.)
   ⭐ Red stars: Real users (special)
   🔵 Blue/Green dots: Synthetic users
   📏 Similarity: Euclidean distance (1.0 = identical, 0.0 = different)
   🎨 Interactive: Hover for names, zoom, pan to explore
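
The 3840D → 2D reduction in the cell above can be sketched in isolation. The example below uses random vectors as stand-ins for real trait embeddings and computes PCA through an SVD in NumPy rather than scikit-learn's `PCA`, so the 2D coordinates may differ from scikit-learn's by a sign flip per axis. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_traits, dim = 20, 5, 768

# Synthetic stand-ins for the five per-trait embeddings of each user
trait_embeddings = rng.normal(size=(n_users, n_traits, dim))

# Concatenate the five 768D vectors per user into one 3840D row
personality_matrix = trait_embeddings.reshape(n_users, n_traits * dim)

# Standardize each column (zero mean, unit variance), as StandardScaler does
mean = personality_matrix.mean(axis=0)
std = personality_matrix.std(axis=0)
std[std == 0] = 1.0  # guard against constant columns
scaled = (personality_matrix - mean) / std

# PCA via SVD of the centered matrix: the first two components give the 2D coordinates
U, S, Vt = np.linalg.svd(scaled, full_matrices=False)
coords_2d = U[:, :2] * S[:2]
explained_ratio = (S ** 2) / np.sum(S ** 2)

print(personality_matrix.shape)  # (20, 3840)
print(coords_2d.shape)           # (20, 2)
print(f"Variance explained by PC1 + PC2: {explained_ratio[:2].sum():.1%}")
```

With far fewer users than dimensions, two components can only capture a small share of the variance, so the percentages shown on the plot axes are worth checking before reading much into the 2D layout.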


## 💾 Save Population

Save the complete population for future use.

In [8]:
# Save population to JSON
save_path = "mello_population.json"

print(f"💾 Saving population to {save_path}...")

try:
    population.save_to_json(save_path)
    print(f"✅ Population saved successfully")
    
    # Show file info
    import os
    file_size = os.path.getsize(save_path)
    print(f"   File size: {file_size:,} bytes ({file_size/1024/1024:.1f} MB)")
    print(f"   Users saved: {len(population)}")
    print(f"   Users with embeddings: {len(population.get_users_with_embeddings())}")
    
except Exception as e:
    print(f"❌ Failed to save population: {e}")

print(f"\n🎉 Notebook complete!")
print(f"📊 Final Statistics:")
print(f"   Population: {len(population)} users")
print(f"   Architecture: 768D interests + 5×768D traits")
print(f"   Approach: Unified personality profiling")
print(f"   Embeddings: {len(population.get_users_with_embeddings())} users ready")

💾 Saving population to mello_population.json...
✅ Population saved successfully
   File size: 29,467,244 bytes (28.1 MB)
   Users saved: 248
   Users with embeddings: 240

🎉 Notebook complete!
📊 Final Statistics:
   Population: 248 users
   Architecture: 768D interests + 5×768D traits
   Approach: Unified personality profiling
   Embeddings: 240 users ready
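
As a rough sanity check on the 28.1 MB file, the reported size can be divided by the number of stored embedding values. This assumes every one of the 240 users with embeddings carries all six 768D vectors (interests plus five traits). Profile text and metadata are ignored, so the result is only an upper bound on bytes per float.

```python
file_size_bytes = 29_467_244   # from the save output above
users_with_embeddings = 240    # from the save output above
vectors_per_user = 1 + 5       # interests + five traits (assumed all present)
dims_per_vector = 768

floats_total = users_with_embeddings * vectors_per_user * dims_per_vector
bytes_per_float = file_size_bytes / floats_total

print(f"{floats_total:,} floats")                                # 1,105,920 floats
print(f"{bytes_per_float:.1f} bytes per float (upper bound)")    # 26.6 bytes per float (upper bound)
```

A few dozen bytes per value is in line with floats written as decimal text in JSON, which suggests the embeddings dominate the file size.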


In [5]:
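# Call the method on the instance rather than passing it explicitly as `self`
einstein = profile_generator.generate_profile_from_famous_person("Einstein")
einstein  # display the generated profile text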

'They are characterized by an intensely analytical and profoundly curious mind, constantly seeking to understand the fundamental principles governing the universe. Their cognitive style is deeply intuitive, often arriving at insights through imaginative leaps rather than purely linear deduction, yet they meticulously test these ideas against logical rigor. Emotionally, they possess a quiet intensity, valuing inner contemplation over outward display, though they exhibit a profound idealism and a strong sense of social justice. Their motivation stems from an intrinsic desire for truth and a deep appreciation for the elegant simplicity of underlying laws. While often perceived as introverted, they are not averse to engaging with others when intellectual discourse is involved, though they maintain a certain detachment, prioritizing their internal world. They demonstrate remarkable resilience in pursuing their convictions, undeterred by conventional thinking or initial skepticism, driven by

In [6]:
famous_people = [
    "Nikola Tesla",
    "Ada Lovelace",
    "Alan Turing",
    "Aristotle",
    "Nelson Mandela",
    "Mahatma Gandhi",
    "Cleopatra",
    "Steve Jobs",
    "Elon Musk",
    "Sherlock Holmes",
    "Tony Stark",
    "Walter White",
    "Leonardo da Vinci",
    "Greta Thunberg"]
for i, person in enumerate(famous_people, 1):
    famous_user = profile_generator.create_user_from_famous_person(person, person)
    # Mark as special so the plots highlight these users
    if hasattr(famous_user, "special"):
        famous_user.special = True
    success = embedding_generator.embed_user_complete(famous_user)
    print(i / len(famous_people) * 100, "%")  # progress
    # Only add users whose embeddings were generated successfully
    if success:
        population.add_user(famous_user)

7.142857142857142 %
14.285714285714285 %
21.428571428571427 %
28.57142857142857 %
35.714285714285715 %
42.857142857142854 %
50.0 %
57.14285714285714 %
64.28571428571429 %
71.42857142857143 %
78.57142857142857 %
85.71428571428571 %
92.85714285714286 %
100.0 %


In [11]:
# Minimal UMAP + Plotly 3D scatter for interests embeddings
from umap import UMAP
from sklearn.preprocessing import StandardScaler
import numpy as np
import plotly.graph_objects as go

# Collect (name, special, embedding) for users that have interests embeddings
data = [
    (u.name, bool(getattr(u, "special", False)), u.interests_embedding)
    for u in population.get_users_with_embeddings()
    if getattr(u, "interests_embedding", None) is not None
]

if len(data) < 3:
    print(f"Need ≥3 users with interests embeddings, got {len(data)}")
else:
    names, specials, embs = zip(*data)
    X = np.vstack(embs)
    X = StandardScaler().fit_transform(X)

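    # 3D UMAP. Note: UMAP requires min_dist <= spread; here both are set to 3,
    # which is the largest min_dist allowed for this spread.
    reducer = UMAP(
        n_components=3,
        n_neighbors=3,
        min_dist=3,
        metric="euclidean",
        random_state=42,
        spread=3,
    )
    X3 = reducer.fit_transform(X)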

    specials = np.array(specials, dtype=bool)
    reg_idx = np.where(~specials)[0]
    spc_idx = np.where(specials)[0]

    fig = go.Figure()

    if len(reg_idx):
        fig.add_trace(go.Scatter3d(
            x=X3[reg_idx, 0], y=X3[reg_idx, 1], z=X3[reg_idx, 2],
            mode="markers",
            marker=dict(size=5, opacity=0.7, color="lightgreen"),
            name=f"Users ({len(reg_idx)})",
            hovertext=[names[i] for i in reg_idx],
            hovertemplate="<b>%{hovertext}</b><extra></extra>"
        ))

    if len(spc_idx):
        fig.add_trace(go.Scatter3d(
            x=X3[spc_idx, 0], y=X3[spc_idx, 1], z=X3[spc_idx, 2],
            mode="markers",
            marker=dict(size=9, symbol="diamond", color="red", line=dict(width=2)),
            name=f"Special ({len(spc_idx)})",
            hovertext=[names[i] for i in spc_idx],
            hovertemplate="<b>%{hovertext}</b><extra></extra>"
        ))

    fig.update_layout(
        title=f"UMAP – Interests Embeddings 3D ({len(names)} users)",
        scene=dict(
            xaxis_title="UMAP-1",
            yaxis_title="UMAP-2",
            zaxis_title="UMAP-3"
        ),
        width=900, height=700,
        hovermode="closest"
    )

    fig.show()


divide by zero encountered in power

OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.


In [17]:
for user in population.users:
    if "sofiia" in user.name.lower():
        print(user.name)
        print(user.personality_profiles)
        print(user.interests_profile)

        break

Sofiia 
{'Openness': "This individual demonstrates a selective curiosity, enjoying intellectual exploration and a willingness to engage with abstract or experimental art, even when not fully understood. They are keen on learning about unfamiliar topics for personal enrichment and appreciate debating differing viewpoints to broaden their understanding. However, this openness doesn't consistently extend to practical experiences; they show a preference for familiar comforts in areas like media consumption and may not actively seek out new culinary or travel experiences, suggesting a more theoretical than experiential approach to novelty.", 'Conscientiousness': 'This individual exhibits a profound lack of conscientiousness, indicating a generally disorganized and impulsive approach to life. There is no evidence of routine planning, timely execution of tasks, or a proactive stance on responsibilities. They likely struggle with deadlines, maintain cluttered environments, and manage finances 

In [18]:
user.conscientiousness_embedding

array([-3.30094060e-02,  4.77498470e-02, -4.49435150e-02, -2.50706540e-02,
        3.47726870e-02,  6.51938240e-02, -2.96006350e-02,  4.81477980e-02,
       -1.73853470e-02, -2.31483620e-02,  2.77664510e-02,  1.06082890e-02,
        7.29340200e-03,  3.35169470e-02,  2.57603300e-02,  2.23524870e-02,
        4.99686640e-02, -7.83555300e-03,  3.75434300e-02, -5.84648400e-02,
       -2.17503300e-02,  5.06013800e-03, -1.77676900e-02,  1.80181940e-02,
       -1.57817100e-02,  2.67485510e-02,  6.65706000e-02, -4.49625660e-02,
       -8.36638100e-03, -6.65961900e-02,  2.65221330e-02,  3.22378050e-02,
       -1.45648640e-02, -3.60046630e-02,  3.18502800e-02,  1.99449030e-02,
       -2.08784300e-02, -9.54415950e-02,  3.34392640e-02, -2.47405410e-02,
       -3.88559660e-02,  8.86443500e-03, -4.52819700e-02, -4.20741600e-02,
       -5.92751000e-02,  5.78369150e-03, -2.15025200e-02, -5.64406370e-02,
        5.45081200e-02, -2.49045940e-02,  2.24926110e-03,  4.35335640e-02,
       -4.20704370e-02, -

In [20]:
# UMAP 2D for each personality trait embedding across users
# Assumes:
# - `population.get_users_with_embeddings()` returns iterable of user objects
# - Each user may have trait embeddings like `conscientiousness_embedding` (1D vector)
# - Optional boolean flag `user.special` to highlight certain users

from umap import UMAP
from sklearn.preprocessing import StandardScaler
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots

traits = [
    "conscientiousness",
    "openness",
    "agreeableness",
    "neuroticism",
    "extraversion",
]

def collect_trait_data(trait: str):
    attr = f"{trait}_embedding"
    rows = [
        (u.name, bool(getattr(u, "special", False)), getattr(u, attr))
        for u in population.get_users_with_embeddings()
        if getattr(u, attr, None) is not None
    ]
    if not rows:
        return [], [], None
    names, specials, embs = zip(*rows)
    X = np.vstack(embs)
    return list(names), np.array(specials, dtype=bool), X

def umap_2d(X: np.ndarray, metric: str = "manhattan", n_neighbors: int = 10, min_dist: float = 0.15):
    Xs = StandardScaler().fit_transform(X)
    reducer = UMAP(
        n_components=2,
        n_neighbors=n_neighbors,
        min_dist=min_dist,
        metric=metric,
        random_state=42,
    )
    return reducer.fit_transform(Xs)

# Build a subplot grid for all traits (2 rows x 3 cols)
rows, cols = 2, 3
fig = make_subplots(
    rows=rows, cols=cols,
    subplot_titles=[t.title() for t in traits] + ([""] * (rows*cols - len(traits)))
)

for idx, trait in enumerate(traits):
    r = idx // cols + 1
    c = idx % cols + 1

    names, specials, X = collect_trait_data(trait)
    if len(names) < 3:
        fig.add_annotation(
            text=f"Need ≥3 users for {trait.title()}, got {len(names)}",
            row=r, col=c, showarrow=False
        )
        continue

    X2 = umap_2d(X, metric="manhattan", n_neighbors=10, min_dist=0.15)

    reg_idx = np.where(~specials)[0]
    spc_idx = np.where(specials)[0]

    if len(reg_idx):
        fig.add_trace(
            go.Scatter(
                x=X2[reg_idx, 0], y=X2[reg_idx, 1],
                mode="markers",
                marker=dict(size=6, opacity=0.75, color="royalblue"),
                name=f"{trait}-users",
                hovertext=[names[i] for i in reg_idx],
                hovertemplate="<b>%{hovertext}</b><extra></extra>"
            ),
            row=r, col=c
        )

    if len(spc_idx):
        fig.add_trace(
            go.Scatter(
                x=X2[spc_idx, 0], y=X2[spc_idx, 1],
                mode="markers",
                marker=dict(size=9, symbol="diamond", color="crimson", line=dict(width=2)),
                name=f"{trait}-special",
                hovertext=[names[i] for i in spc_idx],
                hovertemplate="<b>%{hovertext}</b><extra></extra>"
            ),
            row=r, col=c
        )

    fig.update_xaxes(title_text="UMAP-1", row=r, col=c)
    fig.update_yaxes(title_text="UMAP-2", row=r, col=c)

fig.update_layout(
    title="UMAP – Personality Trait Embeddings (2D) by Trait",
    width=1200, height=800,
    showlegend=False,
    hovermode="closest",
    margin=dict(l=40, r=20, t=60, b=40)
)

fig.show()