## Personas of Thought
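This notebook asks an LLM to generate a crowd of personas, answer a question from each persona's perspective, and then aggregate those answers into a single response. It compares the result with a plain ("naive") prompt and with a panel of named experts, using the LLM itself as the judge.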

In [1]:
from src.genAIClient import GenerativeAIClient
from src.fnUtils import render_markdown
client = GenerativeAIClient(temperature=0.7)


In [2]:
question = "What distinguishes successful startup companies?"
number = 10

system_prompt = "You are a helpful assistant."
question_prompt_template = "Answer this question: {question}. Please respond in a single paragraph. Limit your response to approximately 200 words."

question_prompt = question_prompt_template.format(question=question)
naive_response = client.invoke(prompt=question_prompt, sys_instructions=system_prompt)

print(naive_response)


Successful startups distinguish themselves through a potent combination of factors.  A compelling and validated idea that solves a real problem for a specific target market is crucial, coupled with a strong, adaptable business model that can evolve with market demands.  Effective leadership, capable of navigating uncertainty and inspiring a dedicated team, is paramount.  This team should possess a diverse skillset,  a bias towards action, and a relentless focus on execution.  Furthermore, successful startups demonstrate a deep understanding of their customers and competitors, leveraging data-driven decision-making to iterate and refine their product or service.  Securing adequate funding, managing resources wisely, and building strong relationships with investors and mentors are also essential.  Finally, a healthy dose of resilience, the ability to learn from failures, and a persistent drive to innovate and adapt to the ever-changing market landscape are hallmarks of startups that ultimately thrive.

In [3]:
experts_prompt_template = """Give me a single paragraph response to the following question: {question}

# Instructions:
1. Name {number} world-class experts (past or present) who would be great at answering this question. Select experts who have directly addressed or studied topics related to the question, based on their published work and professional experience.
2. For each expert, please answer the question critically from their perspective given their background and experience. Avoid relying on stereotypes or generalizations about the experts. Base your responses on specific examples of their work and contributions. Limit each expert response to approximately 100 words.
3. **Synthesize the individual expert responses into a single, unified answer to the original question.  Imagine the experts are collaborating to write a joint statement.  Identify the strongest arguments, address any conflicting viewpoints by proposing a reconciled perspective, and create a cohesive, insightful, and direct answer to the question. The combined response should not simply summarize the individual opinions but rather build upon them to form a more complete and well-supported conclusion. This is the definitive answer to the question, informed by all the experts' perspectives. Limit the final response to approximately 200 words.**

# Output format:
Here's an example of how the output should be structured:
## Expert names:
* **Expert name** (relevant experience)
* **Expert name** (relevant experience)
* **Expert name** (relevant experience)
...

## Expert responses:
* **Expert name**: <response>
* **Expert name**: <response>
* **Expert name**: <response>
...

## Final response:  This is the combined response, synthesizing the expert perspectives.

Note that both individual expert responses and the final combined response should be written anonymously, without explicitly attributing ideas to specific individuals beyond the initial expert list.
"""
experts_prompt = experts_prompt_template.format(question=question, number=number)
# print(experts_prompt)
experts_response = client.invoke(prompt=experts_prompt, sys_instructions=system_prompt)

render_markdown(experts_response)

> ## Expert names:
> * **Peter Thiel** (Co-founder of PayPal, Palantir Technologies; Author of *Zero to One*)
> * **Eric Ries** (Author of *The Lean Startup*)
> * **Steve Blank** (Serial entrepreneur, academic; Author of *The Four Steps to the Epiphany*)
> * **Clayton M. Christensen** (Harvard Business School professor; Author of *The Innovator's Dilemma*)
> * **Paul Graham** (Co-founder of Y Combinator; Essayist)
> * **Reid Hoffman** (Co-founder of LinkedIn; Author of *Blitzscaling*)
> * **Ben Horowitz** (Co-founder of Andreessen Horowitz; Author of *The Hard Thing About Hard Things*)
> * **Ann Miura-Ko** (Co-founding partner of Floodgate; Early investor in Lyft, Twitch)
> * **Sequoia Capital** (Venture capital firm; Early investors in Apple, Google, etc.)
> * **Marc Andreessen** (Co-founder of Netscape, Andreessen Horowitz; Software engineer)
> 
> 
> ## Expert responses:
> * **Peter Thiel**: Successful startups create monopolies by offering something radically new. They dominate niche markets before expanding, focusing on long-term value creation rather than short-term gains.  This requires contrarian thinking and a willingness to challenge conventional wisdom.
> * **Eric Ries**:  Validated learning through rapid experimentation and iterative development is crucial.  Startups must build a "minimum viable product" (MVP) and constantly adapt based on customer feedback, pivoting when necessary.
> * **Steve Blank**: Customer development is paramount.  Startups must "get out of the building" and talk to potential customers to understand their needs and validate their assumptions about the market.
> * **Clayton M. Christensen**: Disruptive innovation allows startups to target underserved markets with simpler, more affordable solutions, eventually displacing incumbent players.
> * **Paul Graham**:  Startups succeed by building things people want, focusing on rapid growth, and having a great team, particularly founders who are adaptable and resilient.
> * **Reid Hoffman**:  Blitzscaling, rapidly scaling a company to dominate a market before competitors can react, is key in winner-take-all markets.  This requires aggressive execution and a tolerance for controlled chaos.
> * **Ben Horowitz**:  Building a strong company culture, managing through adversity, and making tough decisions are essential for long-term success.  This includes building a strong team and effectively managing conflict.
> * **Ann Miura-Ko**:  Investing in founders with unique insights and unwavering conviction is crucial.  These founders are able to identify emerging trends and build companies that capitalize on them.
> * **Sequoia Capital**:  Product-market fit, a strong team, and a large addressable market are critical for success.  Startups must identify a real need and build a product that solves it effectively.
> * **Marc Andreessen**:  Building a software-driven business that leverages network effects and scales rapidly is key in the digital age. This allows startups to create defensible moats and dominate markets.
> 
> ## Final response:  Successful startups are distinguished by a confluence of factors: a unique value proposition that addresses a real market need (often through disruptive innovation), a relentless focus on customer development and validated learning, and a strong, adaptable team capable of executing rapidly and navigating uncertainty.  While contrarian thinking and a long-term vision are important for creating defensible moats, the ability to adapt and pivot based on market feedback is equally crucial.  Finally, achieving product-market fit, scaling rapidly (sometimes through blitzscaling), and fostering a resilient company culture are essential for navigating the challenges of hypergrowth and building a sustainable, enduring business.


In [4]:
personas_prompt_template = """Give me a single paragraph response to the following question: {question}

# Instructions:
1. Name {number} distinct demographic personas who would be relevant for answering this question. Select personas representing diverse backgrounds, experiences, and perspectives relevant to the question.  Focus on specific demographics that significantly influence their point of view (e.g., age, income, location, education, occupation, family status).
2. For each persona, answer the question critically from their perspective, deeply considering their specific background, life experiences, values, and priorities.  Imagine you are fully embodying this persona. Be specific about how their demographic characteristics shape their answer. Limit each persona's response to approximately 100 words.
3. **Synthesize the individual persona responses into a single, unified answer to the original question. Imagine these personas are collaborating in a focus group or town hall to reach a consensus.  Identify common ground and key differences in their viewpoints. Address any conflicting perspectives by proposing a nuanced and inclusive solution that acknowledges and integrates the needs and concerns of each persona. The combined response should not simply summarize the individual opinions but rather build upon them to form a more complete, empathetic, and well-reasoned answer to the question. This is the definitive answer to the question, informed by all the personas' diverse experiences and perspectives. Limit the final response to approximately 200 words.**

# Output format:
## Persona names:
* **Persona name** (relevant demographics: e.g., 35-year-old, urban, single professional)
* **Persona name** (relevant demographics: e.g., 60-year-old, rural, retired teacher)
* **Persona name** (relevant demographics: e.g., 20-year-old, suburban, college student)
...

## Persona responses:
* **Persona name**: <response>
* **Persona name**: <response>
* **Persona name**: <response>
...

## Final response:  <-- **IMPORTANT:  The final synthesized response MUST always begin with the tag "## Final response:"**
This is a single paragraph, synthesizing the persona perspectives.
"""
personas_prompt = personas_prompt_template.format(question=question, number=number)
personas_response = client.invoke(prompt=personas_prompt, sys_instructions=system_prompt)

render_markdown(personas_response)

> ## Persona names:
> * **Aisha** (28-year-old, urban, female, immigrant entrepreneur, bootstrapping her startup)
> * **David** (55-year-old, suburban, male, experienced venture capitalist)
> * **Maria** (22-year-old, rural, female, college student, aspiring social entrepreneur)
> * **James** (40-year-old, urban, male, serial entrepreneur, multiple exits)
> * **Chen** (30-year-old, suburban, male, software engineer, working at a large tech company)
> * **Sophia** (65-year-old, rural, female, retired teacher, investing in local businesses)
> * **Omar** (19-year-old, urban, male, Gen Z, starting a social media-based business)
> * **Elena** (45-year-old, suburban, female, stay-at-home mom, investing in sustainable startups)
> * **Robert** (70-year-old, rural, male, retired farmer, skeptical of new technologies)
> * **Priya** (35-year-old, urban, female, working mother, consultant for non-profits)
> 
> 
> ## Persona responses:
> * **Aisha**: Adaptability and resourcefulness define success.  Limited resources force you to be creative and pivot quickly.  A strong community is crucial for support and early adoption.
> * **David**:  Strong team, large addressable market, and defensible IP.  Traction and scalability are key for attracting investment and achieving significant returns.
> * **Maria**: Social impact and ethical practices are paramount.  Success means solving real-world problems and creating a positive change, even if profits are modest.
> * **James**:  Execution is everything.  A great idea is worthless without a relentless focus on building and iterating quickly.  Speed and adaptability are crucial.
> * **Chen**:  Technological innovation and a strong product are fundamental.  A talented engineering team and a focus on solving complex technical challenges are essential.
> * **Sophia**: Trust and community engagement are vital.  Building relationships with customers and demonstrating genuine care for the local community builds long-term loyalty.
> * **Omar**:  Leveraging social media and building a strong online presence are critical.  Authenticity and engaging content are key to attracting a loyal following.
> * **Elena**: Sustainability and environmental responsibility are non-negotiable.  Successful startups prioritize ethical sourcing, minimize their environmental footprint, and contribute to a better future.
> * **Robert**: Practicality and real-world applications are essential.  Startups must solve tangible problems and offer clear benefits to everyday people to gain widespread adoption.
> * **Priya**:  A strong mission and a commitment to social good are essential for long-term success.  Startups that prioritize purpose and contribute to positive social change attract talent and customers.
> 
> 
> ## Final response:
> Successful startups are multifaceted, reflecting diverse priorities and values.  While a strong team, adaptable strategy, and efficient execution are universally recognized as crucial, the definition of "success" itself varies significantly.  For some, it's measured by financial returns and market dominance, emphasizing scalable business models and defensible IP.  Others prioritize social impact, sustainability, and ethical practices, even if it means sacrificing some profitability.  Building a strong community, both online and offline, emerged as a recurring theme, highlighting the importance of trust, authenticity, and customer engagement.  Ultimately, successful startups must balance these sometimes conflicting priorities, finding a sustainable path that generates value for stakeholders while contributing positively to society.  This requires a nuanced understanding of the target market, a commitment to continuous innovation, and a deep-rooted sense of purpose that resonates with both internal teams and external communities.


In [5]:
import re

def extract_tag_contents(text, tags):
    """
    Extracts the contents of a tag from a given text.

    Args:
        text (str): The input text.
        tags (tuple): A tuple containing the opening and closing tags.

    Returns:
        str: The extracted contents of the tag, or None if not found.
    """
    open_tag, close_tag = tags
    pattern = rf'{re.escape(open_tag)}(.*?){re.escape(close_tag)}|{re.escape(open_tag)}(.*)'
    match = re.search(pattern, text, re.DOTALL)

    if match:
        return (match.group(1) or match.group(2)).strip()
    else:
        return None

def extract_final_answer(text, tag="## Final response:"):
    if text is None:
        return None
    if "## Combined response:" in text:
        text = text.replace("## Combined response:", "## Final response:")
    pattern = rf'{re.escape(tag)}\s*\n*(.*?)(?=\n\n|$)'
    match = re.search(pattern, text, re.DOTALL | re.IGNORECASE)  # Added re.IGNORECASE for robustness

    if match:
        return match.group(1).strip()
    else:
        return None

def extract_choice(text):
    tags = ("<choice>", "</choice>")
    return extract_tag_contents(text, tags)

print(extract_final_answer(personas_response))

Successful startups are multifaceted, reflecting diverse priorities and values.  While a strong team, adaptable strategy, and efficient execution are universally recognized as crucial, the definition of "success" itself varies significantly.  For some, it's measured by financial returns and market dominance, emphasizing scalable business models and defensible IP.  Others prioritize social impact, sustainability, and ethical practices, even if it means sacrificing some profitability.  Building a strong community, both online and offline, emerged as a recurring theme, highlighting the importance of trust, authenticity, and customer engagement.  Ultimately, successful startups must balance these sometimes conflicting priorities, finding a sustainable path that generates value for stakeholders while contributing positively to society.  This requires a nuanced understanding of the target market, a commitment to continuous innovation, and a deep-rooted sense of purpose that resonates with both internal teams and external communities.
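
`extract_final_answer` stops at the first blank line after the tag, so a multi-paragraph synthesis would be cut down to its first paragraph. The following is a minimal sketch of a variant that reads up to the next `## ` heading or the end of the text instead. The name `extract_final_section` is hypothetical and not part of the notebook's helpers.

```python
import re

def extract_final_section(text, tag="## Final response:"):
    """Return everything after `tag` up to the next '## ' heading or end of text."""
    if text is None:
        return None
    # Lazy match, terminated by a following level-2 heading or the end of the string.
    pattern = rf"{re.escape(tag)}(.*?)(?=\n## |\Z)"
    match = re.search(pattern, text, re.DOTALL | re.IGNORECASE)
    return match.group(1).strip() if match else None

# Illustrative input: the final response spans two paragraphs.
sample = (
    "## Persona responses:\n* **A**: x\n\n"
    "## Final response:\nFirst paragraph.\n\nSecond paragraph.\n"
)
print(extract_final_section(sample))
```

Whether this is preferable depends on the model's output: if the model appends extra commentary after the final paragraph without a heading, the stricter blank-line cutoff in `extract_final_answer` may be the safer choice.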

In [6]:
experts_final_response = extract_final_answer(experts_response)
personas_final_response = extract_final_answer(personas_response)

output = f"""### Three personas of thought
**Naive**: {naive_response}
\n**Experts**: {experts_final_response}
\n**Personas**: {personas_final_response}
"""

render_markdown(output)


> ### Three personas of thought
> **Naive**: Successful startups distinguish themselves through a potent combination of factors.  A compelling and validated idea that solves a real problem for a specific target market is crucial, coupled with a strong, adaptable business model that can evolve with market demands.  Effective leadership, capable of navigating uncertainty and inspiring a dedicated team, is paramount.  This team should possess a diverse skillset,  a bias towards action, and a relentless focus on execution.  Furthermore, successful startups demonstrate a deep understanding of their customers and competitors, leveraging data-driven decision-making to iterate and refine their product or service.  Securing adequate funding, managing resources wisely, and building strong relationships with investors and mentors are also essential.  Finally, a healthy dose of resilience, the ability to learn from failures, and a persistent drive to innovate and adapt to the ever-changing market landscape are hallmarks of startups that ultimately thrive.
> 
> 
> **Experts**: Successful startups are distinguished by a confluence of factors: a unique value proposition that addresses a real market need (often through disruptive innovation), a relentless focus on customer development and validated learning, and a strong, adaptable team capable of executing rapidly and navigating uncertainty.  While contrarian thinking and a long-term vision are important for creating defensible moats, the ability to adapt and pivot based on market feedback is equally crucial.  Finally, achieving product-market fit, scaling rapidly (sometimes through blitzscaling), and fostering a resilient company culture are essential for navigating the challenges of hypergrowth and building a sustainable, enduring business.
> 
> **Personas**: Successful startups are multifaceted, reflecting diverse priorities and values.  While a strong team, adaptable strategy, and efficient execution are universally recognized as crucial, the definition of "success" itself varies significantly.  For some, it's measured by financial returns and market dominance, emphasizing scalable business models and defensible IP.  Others prioritize social impact, sustainability, and ethical practices, even if it means sacrificing some profitability.  Building a strong community, both online and offline, emerged as a recurring theme, highlighting the importance of trust, authenticity, and customer engagement.  Ultimately, successful startups must balance these sometimes conflicting priorities, finding a sustainable path that generates value for stakeholders while contributing positively to society.  This requires a nuanced understanding of the target market, a commitment to continuous innovation, and a deep-rooted sense of purpose that resonates with both internal teams and external communities.


In [7]:
judge_prompt_template = """Given the following question: {question}

I need your help to compare the two responses to this question:

Response 1:
{response1}

Response 2:
{response2}

# Instructions:
1. Analyze both responses and choose which one is a better answer to this question.
2. In your analysis, take the following factors into consideration:
- Uniqueness of perspectives
- Impactful and interesting insights
- Sounds human - not like an AI
3. Your choice must be based on <your analysis> of these two responses, avoid any out of context comparisons.
4. <your answer> must be enclosed in <choice> tag.

# Output format:
Analysis: <your analysis>
<choice>1 or 2</choice>
Reason: justification for your choice.
"""

In [8]:
import random
import time
MAX_RETRIES = 3

def delay(lb=1, ub=3):
    seconds = random.uniform(lb, ub)
    print(f"Delaying {round(seconds, 2)} seconds...")
    time.sleep(seconds)

def judge_responses(q, r1, r2, max_retries=MAX_RETRIES):
    judge_message = judge_prompt_template.format(
            question=q,
            response1=r1,
            response2=r2
        )

    # print(judge_message)

    judge_response = client.invoke(prompt=judge_message, sys_instructions=system_prompt)
    retries = 0
    while judge_response is None and retries < max_retries:
        retries += 1
        print(f"Reattempting judge response - {retries}/{max_retries}")
        delay()
        judge_response = client.invoke(prompt=judge_message, sys_instructions=system_prompt)

    return judge_response
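
The fixed 1 to 3 second delay may be too short when the API is rate limited. Below is a hedged sketch of retrying with exponential backoff and jitter. `invoke_with_backoff` and its parameters are hypothetical, and the sketch assumes, as the loop above does, that a failed call returns `None` rather than raising.

```python
import random
import time

def invoke_with_backoff(call, max_retries=3, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call `call()` until it returns a non-None value or retries run out."""
    result = call()
    for attempt in range(1, max_retries + 1):
        if result is not None:
            break
        # Wait base_delay * 2^(attempt - 1), capped at max_delay, plus up to 1s of jitter.
        wait = min(max_delay, base_delay * 2 ** (attempt - 1)) + random.uniform(0, 1)
        print(f"Retry {attempt}/{max_retries} in {wait:.2f}s")
        sleep(wait)
        result = call()
    return result

# Usage with a stand-in for client.invoke that fails twice, then succeeds.
# The sleep function is replaced with a no-op so the demo runs instantly.
responses = iter([None, None, "ok"])
print(invoke_with_backoff(lambda: next(responses), sleep=lambda s: None))
```

In the notebook this could wrap the judge call, for example `invoke_with_backoff(lambda: client.invoke(prompt=judge_message, sys_instructions=system_prompt))`.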


In [9]:
naive_vs_experts = judge_responses(q=question, r1=naive_response, r2=experts_final_response)
print("# Naive vs experts:", naive_vs_experts)

# Naive vs experts: Analysis:Response 1 provides a comprehensive overview of the factors contributing to startup success. It covers key aspects like a strong idea, adaptable business model, effective leadership, skilled team, customer understanding, data-driven decisions, funding, resource management, and resilience.  The language is clear and accessible.

Response 2, while also covering important elements, uses more sophisticated business jargon like "disruptive innovation," "validated learning," "defensible moats," "blitzscaling," and "hypergrowth."  While these terms are relevant in the startup world, they make the response feel less accessible and more like it was generated by an AI pulling buzzwords.  While it touches on important concepts like customer development and pivoting, it lacks the detailed explanation provided in Response 1.

<choice>1</choice>
Reason:Response 1 provides a more well-rounded and accessible answer to the question.  It covers a broader range of factors con

In [10]:
print(extract_choice(naive_vs_experts))

1
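
`extract_choice` returns whatever text sits inside the tag, so a reply such as `<choice>Response 2</choice>`, or the template placeholder echoed back verbatim, passes through unvalidated. A minimal sketch of a stricter parser follows, under the assumption that only `1` and `2` are valid answers. `parse_choice` is a hypothetical helper, not part of the notebook.

```python
import re

def parse_choice(text):
    """Return '1' or '2' from a <choice>...</choice> tag, else None."""
    if not text:
        return None
    # Accept an optional "Response" prefix and surrounding whitespace, nothing else.
    match = re.search(
        r"<choice>\s*(?:response\s*)?([12])\s*</choice>", text, re.IGNORECASE
    )
    return match.group(1) if match else None

print(parse_choice("Analysis: ...\n<choice>1</choice>\nReason: ..."))  # 1
print(parse_choice("<choice>Response 2</choice>"))                     # 2
print(parse_choice("<choice>1 or 2</choice>"))                         # None
```

Returning `None` for anything unexpected lets the caller decide whether to retry the judge or skip the comparison.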


In [11]:
naive_vs_personas = judge_responses(q=question, r1=naive_response, r2=personas_final_response)
print("# Naive vs personas:\n", naive_vs_personas)

# Naive vs personas:
 Analysis:Response 1 provides a comprehensive overview of the factors contributing to startup success. It covers key areas such as a strong idea, adaptable business model, effective leadership, skilled team, data-driven decision-making, funding, and resilience.  It's a well-structured and informative response. However, it reads somewhat like a textbook definition, hitting all the expected points but lacking a unique perspective.

Response 2, while touching upon similar themes, offers a more nuanced and insightful perspective. It acknowledges the multifaceted nature of startup success and how the definition of "success" itself can vary.  It introduces the concepts of varying priorities, such as social impact and sustainability, alongside financial returns. The emphasis on community building and the balance between stakeholder value and societal contribution adds depth and a more human element to the response.  It feels less like a checklist and more like a thoughtfu

In [12]:
print(extract_choice(naive_vs_personas))

2


In [13]:
experts_vs_personas = judge_responses(q=question, r1=experts_final_response, r2=personas_final_response)
print("# Experts vs personas:", experts_vs_personas)

# Experts vs personas: Analysis:Response 1 provides a more direct and focused answer to the question of what distinguishes successful startups. It identifies key factors such as a unique value proposition, customer development, adaptability, and strong team execution. These are generally accepted principles in the startup world and provide a concrete framework for understanding success.  While Response 2 touches on important points like team, strategy, and execution, it delves into broader concepts like varying definitions of success and the importance of social impact. While these are valid considerations, they don't directly address the core question of what distinguishes *successful* startups in a practical sense.  Response 2 feels more philosophical and less actionable.  Response 1 sounds more like a seasoned entrepreneur or investor, while Response 2 sounds more like a business school essay.

<choice>1</choice>
Reason:Response 1 provides a more focused, practical, and insightful a

In [14]:
output = f"""## Comparisons
**Naive vs Expert**: {naive_vs_experts}
\n**Naive vs Personas**: {naive_vs_personas}
\n**Expert vs Personas**: {experts_vs_personas}
"""

print(output)


## Comparisons
**Naive vs Expert**: Analysis:Response 1 provides a comprehensive overview of the factors contributing to startup success. It covers key aspects like a strong idea, adaptable business model, effective leadership, skilled team, customer understanding, data-driven decisions, funding, resource management, and resilience.  The language is clear and accessible.

Response 2, while also covering important elements, uses more sophisticated business jargon like "disruptive innovation," "validated learning," "defensible moats," "blitzscaling," and "hypergrowth."  While these terms are relevant in the startup world, they make the response feel less accessible and more like it was generated by an AI pulling buzzwords.  While it touches on important concepts like customer development and pivoting, it lacks the detailed explanation provided in Response 1.

<choice>1</choice>
Reason:Response 1 provides a more well-rounded and accessible answer to the question.  It covers a broader rang

### Bringing it all together

In [15]:
def run_comparisons(questions, number=10, max_retries=MAX_RETRIES):
    def run_test(question, verbose=False):
        question_prompt = question_prompt_template.format(question=question)
        naive_response = client.invoke(prompt=question_prompt, sys_instructions=system_prompt)
        if verbose:
            print("\nNaive Response>>> ", naive_response)

        experts_prompt = experts_prompt_template.format(question=question, number=number)
        experts_response = None
        i = 0
        while experts_response is None and i <= max_retries:
            experts_response = client.invoke(prompt=experts_prompt, sys_instructions=system_prompt)
            if verbose:
                print("\nExpert Response>>> ", experts_response)
            if i > 0 and experts_response is None:
                print(f"Retrying...attempt {i}")
                delay()
            i += 1

        personas_prompt = personas_prompt_template.format(question=question, number=number)
        personas_response = None
        i = 0
        while personas_response is None and i <= max_retries:
            personas_response = client.invoke(prompt=personas_prompt, sys_instructions=system_prompt)
            if verbose:
                print(f"\nPersonas Response>>> {personas_response}")
            i += 1
            if i > 1 and personas_response is None:
                print(f"Retrying...attempt {i}")
                delay()

        experts_final_response = extract_final_answer(experts_response)
        personas_final_response = extract_final_answer(personas_response)
        if verbose:
            print("\nNaive: ", naive_response)
            print("-"*100)
            print("Experts: ", experts_final_response)
            print("-"*100)
            print("Personas: ", personas_final_response)
            print("."*100)

        naive_vs_experts = judge_responses(q=question, r1=naive_response, r2=experts_final_response)

        nve_choice = extract_choice(naive_vs_experts)
        print("# Naive vs experts:", nve_choice)

        naive_vs_personas = judge_responses(q=question, r1=naive_response, r2=personas_final_response)
        nvp_choice = extract_choice(naive_vs_personas)
        print("# Naive vs personas:", nvp_choice)

        experts_vs_personas = judge_responses(q=question, r1=experts_final_response, r2=personas_final_response)
        evp_choice = extract_choice(experts_vs_personas)
        print("# Experts vs personas:", evp_choice)

        if verbose:
            print("-"*100)
            print(f"NAIVE vs EXPERTS: {naive_vs_experts}")
            print("-"*100)
            print(f"NAIVE vs PERSONAS: {naive_vs_personas}")
            print("-"*100)
            print(f"EXPERTS vs PERSONAS: {experts_vs_personas}")
            print("."*100)

        return {
            # "naive_v_experts": naive_vs_experts,
            # "naive_v_personas": naive_vs_personas,
            # "expert_vs_personas": experts_vs_personas,
            "nve_choice": nve_choice,
            "nvp_choice": nvp_choice,
            "evp_choice": evp_choice,
        }

    results = {}
    question_count = len(questions)
    for i, question in enumerate(questions):
        print(f"Processing question {i+1}/{question_count}: {question}")
        results[question] = run_test(question, verbose=True)
        if i+1 < question_count:
            delay()
    return results

def calc_summary(results, question_count):
    # Count wins for each approach
    wins = {"naive": 0, "experts": 0, "personas": 0}
    # Each approach takes part in two of the three pairwise comparisons per question
    trials = question_count * 2

    for question in results:
        print(question)
        # Naive vs experts (a missing or invalid choice counts for neither side)
        if results[question]["nve_choice"] == '1':
            wins["naive"] += 1
        elif results[question]["nve_choice"] == '2':
            wins["experts"] += 1

        # Naive vs personas
        if results[question]["nvp_choice"] == '1':
            wins["naive"] += 1
        elif results[question]["nvp_choice"] == '2':
            wins["personas"] += 1

        # Experts vs personas
        if results[question]["evp_choice"] == '1':
            wins["experts"] += 1
        elif results[question]["evp_choice"] == '2':
            wins["personas"] += 1

    # Calculate each approach's win rate over the comparisons it took part in
    percentages = {
        approach: (wins[approach] / trials) * 100
        for approach in wins
    }

    print("\nWin percentages:")
    for approach, percentage in percentages.items():
        print(f"{approach}: {percentage:.1f}%")
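
Each approach takes part in two of the three pairwise comparisons per question, which is why wins are divided by twice the number of questions and why the three percentages can sum to 150% rather than 100%. The sketch below illustrates the arithmetic with made-up judge choices. The `mock_results` values are invented for illustration only, and `tally_wins` is a hypothetical name.

```python
# Maps each result key to the (Response 1, Response 2) approaches it compares.
PAIRINGS = {
    "nve_choice": ("naive", "experts"),
    "nvp_choice": ("naive", "personas"),
    "evp_choice": ("experts", "personas"),
}

def tally_wins(results):
    """Count wins per approach; choices other than '1' or '2' are skipped."""
    wins = {"naive": 0, "experts": 0, "personas": 0}
    for choices in results.values():
        for key, (first, second) in PAIRINGS.items():
            choice = choices.get(key)
            if choice == "1":
                wins[first] += 1
            elif choice == "2":
                wins[second] += 1
    return wins

# Invented judge outcomes, including one comparison where no choice was extracted.
mock_results = {
    "q1": {"nve_choice": "1", "nvp_choice": "2", "evp_choice": "1"},
    "q2": {"nve_choice": "2", "nvp_choice": "2", "evp_choice": None},
}
wins = tally_wins(mock_results)
trials = 2 * len(mock_results)
for approach, count in wins.items():
    print(f"{approach}: {count}/{trials} = {100 * count / trials:.1f}%")
```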

In [16]:
questions = [
    "What is the best way to learn a new skill?",
    "What is the best way to stay healthy?",
    "How can I be more productive?",
    "What will AI look like in 10 years?",
    "How do we end world hunger?",
]


In [17]:
results = run_comparisons(questions)

Processing question 1/5: What is the best way to learn a new skill?

Naive Response>>>  The best way to learn a new skill is through a multifaceted approach combining focused practice, diverse learning resources, and consistent effort.  Begin by breaking the skill down into smaller, manageable components, concentrating on mastering each part before moving on.  Active practice, where you actively engage with the material rather than passively observing, is crucial.  Explore various learning resources like online courses, books, tutorials, and mentorship to gain different perspectives and cater to your learning style.  Seek feedback from others and be open to constructive criticism to identify areas for improvement.  Consistency is key; dedicate regular time to practice, even if it's just for short periods, to reinforce learning and build muscle memory.  Don't be afraid to experiment and make mistakes, as they are valuable learning opportunities.  Finally, maintain a growth mindset, beli

2025-03-18 18:07:09,589 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).


# Naive vs personas: 2
Reattempting judge response - 1/3
Delaying 2.01 seconds...
# Experts vs personas: 2
----------------------------------------------------------------------------------------------------
NAIVE vs EXPERTS: Analysis:Response 1 provides concrete, actionable advice with specific recommendations (e.g., 150 minutes of exercise, 7-9 hours of sleep, 8 glasses of water).  This makes it more practical and helpful for someone looking to improve their health. While Response 2 uses more sophisticated language and mentions broader concepts like the interplay between environment and lifestyle, it lacks the specific, actionable advice that makes Response 1 more useful. Response 2 also feels a bit more generic and less "human" due to phrases like "ancient wisdom with modern scientific advancements."  Response 1 feels more like a conversation with a knowledgeable friend or doctor.

<choice>1</choice>
Reason:Response 1 is better because it offers specific, actionable advice that is 

2025-03-18 18:09:14,485 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).
2025-03-18 18:09:14,523 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).



Expert Response>>>  ## Expert names:
* **Amartya Sen** (Nobel laureate in Economics, focused on poverty, famine, and social choice theory)
* **Esther Duflo** (Development economist, known for randomized controlled trials in poverty alleviation)
* **Norman Borlaug** (Agricultural scientist, "Father of the Green Revolution")
* **Frances Moore Lappé** (Author of "Diet for a Small Planet," advocate for sustainable food systems)
* **Jeffrey Sachs** (Economist, advocate for sustainable development goals)
* **Josette Sheeran** (Former Executive Director of the World Food Programme)
* **Vandana Shiva** (Environmental activist, critic of industrial agriculture)
* **Bill Gates** (Philanthropist, focused on global health and development, including agricultural innovation)
* **Muhammad Yunus** (Founder of Grameen Bank, pioneer of microfinance)
* **George McGovern** (Former US Senator, advocate for hunger relief and food security)


## Expert responses:
* **Amartya Sen**: Ending hunger isn't simpl

2025-03-18 18:09:16,364 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).



Personas Response>>> None
Retrying...attempt 3
Delaying 1.0 seconds...

Personas Response>>> ## Persona names:
* **Aisha** (28-year-old, Kenyan, subsistence farmer)
* **David** (55-year-old, American, CEO of an agribusiness company)
* **Maria** (40-year-old, Brazilian, social worker in a favela)
* **Rajesh** (16-year-old, Indian, student living in poverty)
* **Emily** (25-year-old, Canadian, environmental activist)
* **James** (70-year-old, British, retired economist)
* **Li Wei** (30-year-old, Chinese, urban factory worker)
* **Elena** (60-year-old, Ukrainian, refugee)
* **Pierre** (45-year-old, French, government policy advisor)
* **Chloe** (19-year-old, American, college student studying sustainable agriculture)


## Persona responses:
* **Aisha**: We need better access to drought-resistant seeds, fair prices for our crops, and education on sustainable farming practices.  Climate change is making it harder to grow food, and we need help adapting.
* **David**:  Innovation in agricul
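
The log above suggests that each retry waits a roughly constant 1 to 3 seconds, which may be too short to clear a 429 quota error. One common alternative is exponential backoff with jitter. The sketch below is a minimal illustration, not the notebook's actual retry code: `invoke_with_backoff` and its parameters are hypothetical, and it assumes the wrapped call returns `None` on failure, as the `None` responses printed above suggest.

```python
import random
import time
from typing import Callable, Optional


def invoke_with_backoff(
    call: Callable[[], Optional[str]],
    max_attempts: int = 5,
    base_delay: float = 2.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call `call` until it returns a non-None value or attempts run out.

    Hypothetical helper: assumes `call` signals failure by returning None.
    """
    for attempt in range(max_attempts):
        response = call()
        if response is not None:
            return response
        if attempt < max_attempts - 1:
            # Double the wait after every failure, cap it at max_delay,
            # and add up to one second of jitter to spread out retries.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, 1))
    # Every attempt failed; the caller must handle None explicitly.
    return None
```

Usage would wrap the existing call in a zero-argument function, for example `invoke_with_backoff(lambda: client.invoke(prompt=question_prompt, sys_instructions=system_prompt))`.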

In [18]:
print(results)

{'What is the best way to learn a new skill?': {'nve_choice': '2', 'nvp_choice': '1', 'evp_choice': '1'}, 'What is the best way to stay healthy?': {'nve_choice': '1', 'nvp_choice': '2', 'evp_choice': '2'}, 'How can I be more productive?': {'nve_choice': '1', 'nvp_choice': '1', 'evp_choice': '1'}, 'What will AI look like in 10 years?': {'nve_choice': '1', 'nvp_choice': '1', 'evp_choice': '1'}, 'How do we end world hunger?': {'nve_choice': '2', 'nvp_choice': '1', 'evp_choice': '1'}}


In [19]:
calc_summary(results, len(questions))


What is the best way to learn a new skill?
What is the best way to stay healthy?
How can I be more productive?
What will AI look like in 10 years?
How do we end world hunger?

Win percentages:
naive: 70.0%
experts: 60.0%
personas: 20.0%
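
The percentages above are consistent with counting each method's pairwise wins and dividing by the number of comparisons it takes part in (two per question). The sketch below reproduces them from the printed `results` dictionary. It is an assumption about what `calc_summary` computes, not its actual code; `win_percentages` and `PAIRINGS` are hypothetical names, and the mapping of `'1'`/`'2'` to the first/second method in each key is inferred from the key names.

```python
from collections import Counter

# Assumed mapping: for each comparison key, which method is
# Response 1 and which is Response 2.
PAIRINGS = {
    "nve_choice": ("naive", "experts"),
    "nvp_choice": ("naive", "personas"),
    "evp_choice": ("experts", "personas"),
}


def win_percentages(results: dict, n_questions: int) -> dict:
    """Return each method's share of the pairwise comparisons it won."""
    wins = Counter()
    for choices in results.values():
        for key, (first, second) in PAIRINGS.items():
            choice = choices.get(key)
            if choice == "1":
                wins[first] += 1
            elif choice == "2":
                wins[second] += 1
            # Any other value (e.g. None from a failed judge call) counts for nobody.
    # Each method appears in two of the three comparisons per question.
    return {
        method: 100.0 * wins[method] / (2 * n_questions)
        for method in ("naive", "experts", "personas")
    }


# The dictionary printed by the previous cell.
sample_results = {
    "What is the best way to learn a new skill?": {"nve_choice": "2", "nvp_choice": "1", "evp_choice": "1"},
    "What is the best way to stay healthy?": {"nve_choice": "1", "nvp_choice": "2", "evp_choice": "2"},
    "How can I be more productive?": {"nve_choice": "1", "nvp_choice": "1", "evp_choice": "1"},
    "What will AI look like in 10 years?": {"nve_choice": "1", "nvp_choice": "1", "evp_choice": "1"},
    "How do we end world hunger?": {"nve_choice": "2", "nvp_choice": "1", "evp_choice": "1"},
}

print(win_percentages(sample_results, len(sample_results)))
# {'naive': 70.0, 'experts': 60.0, 'personas': 20.0}
```

Because every method is scored out of ten comparisons, the three percentages sum to 150% rather than 100%. With only five questions, a single flipped judgment moves a score by ten points.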


In [20]:
client.set_temperature(1)

In [21]:
results2 = run_comparisons(questions)

Processing question 1/5: What is the best way to learn a new skill?

Naive Response>>>  The best way to learn a new skill is through a multifaceted approach combining focused practice, diverse learning methods, and consistent effort.  Start by breaking down the skill into smaller, manageable components and setting achievable goals for each.  Active learning, like hands-on practice and project-based learning, is crucial for solidifying understanding. Supplement this with passive learning resources, such as books, videos, or online courses, to gain theoretical knowledge.  Seeking feedback from mentors, peers, or online communities helps identify areas for improvement and refine your technique.  Regularly practicing, even in short bursts, is more effective than sporadic intense sessions.  Embrace mistakes as opportunities to learn and adjust your approach.  Maintain motivation by celebrating milestones and focusing on the intrinsic rewards of acquiring the new skill.  Finally, remember th

2025-03-18 18:12:04,138 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).


# Naive vs experts: 1
Reattempting judge response - 1/3
Delaying 2.04 seconds...
# Naive vs personas: 1
# Experts vs personas: 2
----------------------------------------------------------------------------------------------------
NAIVE vs EXPPERTS: Analysis:Response 1 provides more specific and actionable advice.  It gives concrete recommendations like "150 minutes of moderate-intensity exercise" and "7-9 hours of sleep."  While Response 2 uses more sophisticated vocabulary ("multifaceted approach," "necessitates," "leveraging scientific advancements"), it lacks the practical guidance found in Response 1.  Both responses sound human-written, but Response 1 connects with the reader better by offering clear steps to improve their health. Response 2, while eloquent, comes across as slightly more generic and less impactful. While Response 2 mentions preventative measures like vaccinations, Response 1's focus on diet, exercise, sleep, and stress management addresses more foundational elemen

2025-03-18 18:12:59,543 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).


# Naive vs personas: 1
Reattempting judge response - 1/3
Delaying 2.13 seconds...


2025-03-18 18:13:01,717 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).


Reattempting judge response - 2/3
Delaying 2.57 seconds...
# Experts vs personas: 1
----------------------------------------------------------------------------------------------------
NAIVE vs EXPPERTS: Analysis:Response 1 offers a more practical and actionable approach to increasing productivity. It provides specific techniques like the Eisenhower Matrix and Pomodoro Technique, offering concrete tools the user can immediately implement.  It also emphasizes the importance of self-care, a crucial but often overlooked aspect of productivity. While Response 2 touches on important concepts like prioritizing tasks aligned with values and cultivating deep work habits, it lacks the concrete examples and actionable steps provided in Response 1.  Response 2 feels slightly more generic and less immediately helpful to someone looking for practical advice.  Both responses sound relatively human, but Response 1's specific examples make it feel more grounded and relatable.

<choice>1</choice>
Reaso

2025-03-18 18:13:10,695 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).
2025-03-18 18:13:10,725 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).


Processing question 4/5: What will AI look like in 10 years?

Naive Response>>>  None

Expert Response>>>  None

Expert Response>>>  ## Expert names:
* **Yann LeCun** (Deep Learning, Convolutional Neural Networks)
* **Demis Hassabis** (Reinforcement Learning, AlphaFold)
* **Fei-Fei Li** (Computer Vision, ImageNet)
* **Geoffrey Hinton** (Backpropagation, Neural Networks)
* **Yoshua Bengio** (Natural Language Processing, Attention Mechanisms)
* **Stuart Russell** (AI Safety, Rational Agents)
* **Nick Bostrom** (Superintelligence, Existential Risk)
* **Kate Crawford** (AI Ethics, Bias in Algorithms)
* **Timnit Gebru** (Algorithmic Bias, Fairness)
* **Max Tegmark** (AI and Physics, Life 3.0)


## Expert responses:
* **Yann LeCun**: AI in 10 years will see significant advancements in self-supervised learning, enabling systems to learn from vast amounts of unlabeled data, moving us closer to human-level learning.  We'll see more robust and adaptable systems, but common sense reasoning will r

2025-03-18 18:13:43,653 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).


# Naive vs experts: 2
Reattempting judge response - 1/3
Delaying 2.03 seconds...


2025-03-18 18:13:45,727 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).


Reattempting judge response - 2/3
Delaying 2.86 seconds...


2025-03-18 18:13:48,623 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).


Reattempting judge response - 3/3
Delaying 2.31 seconds...


2025-03-18 18:13:50,968 - ERROR - Error invoking models/gemini-1.5-pro: 429 Resource has been exhausted (e.g. check quota).


TypeError: expected string or bytes-like object, got 'NoneType'
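
This `TypeError` is the message Python's `re` functions raise when given `None` instead of a string, which is consistent with the judge response still being `None` after all three reattempts failed. The sketch below shows one way to make the parsing step tolerate that case. It is an assumption about where the failure occurs, not the notebook's actual code; `parse_choice` and `CHOICE_PATTERN` are hypothetical names, and the `<choice>` tag format is taken from the judge output shown above.

```python
import re
from typing import Optional

# Matches the judge's verdict tag, e.g. "<choice>1</choice>".
CHOICE_PATTERN = re.compile(r"<choice>\s*([12])\s*</choice>")


def parse_choice(judge_response: Optional[str]) -> Optional[str]:
    """Extract '1' or '2' from a judge response, or None if unavailable."""
    if not isinstance(judge_response, str):
        # The API call failed (e.g. quota exhausted), so there is nothing to parse.
        return None
    match = CHOICE_PATTERN.search(judge_response)
    return match.group(1) if match else None
```

A caller can then skip or re-queue any comparison whose parsed choice is `None` instead of aborting the whole run.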

In [90]:
calc_summary(results2, len(questions))

What is the best way to learn a new skill?
What is the best way to stay healthy?
How can I be more productive?
What will AI look like in 10 years?
How do we end world hunger?

Win percentages:
naive: 60.0%
experts: 30.0%
personas: 60.0%


In [2]:
client.list_models()

models/gemini-1.0-pro-latest
models/gemini-1.0-pro
models/gemini-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-vision-latest
models/gemini-pro-vision
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-001
models/gemini-1.5-pro-002
models/gemini-1.5-pro
models/gemini-1.5-flash-latest
models/gemini-1.5-flash-001
models/gemini-1.5-flash-001-tuning
models/gemini-1.5-flash
models/gemini-1.5-flash-002
models/gemini-1.5-flash-8b
models/gemini-1.5-flash-8b-001
models/gemini-1.5-flash-8b-latest
models/gemini-1.5-flash-8b-exp-0827
models/gemini-1.5-flash-8b-exp-0924
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-lite-preview
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-pro-exp
models/gemini-2.0-pro-exp-02-05
models/gemini-exp-1206
models/gemini-2.0-flash-thinking-exp-01-21
models/gemini-2.0-flash-thinking-exp
models/gemini-2.0-flash-thinking-exp-1219
models/learnlm-1.5-pro-experimental


In [20]:
client = GenerativeAIClient(model_name="models/gemini-2.0-pro-exp", temperature=.7)

In [21]:
results3 = run_comparisons(questions)

Processing question 1/5: What is the best way to learn a new skill?

Naive Response>>>  The most effective way to learn a new skill involves a combination of focused study, consistent practice, and seeking feedback. Start by understanding the fundamental principles through resources like books or courses. Then, dedicate regular time to practicing the skill hands-on, breaking it down into smaller, manageable steps. Finally, seek feedback from experts or peers to identify areas for improvement and refine your technique. This iterative process of learning, practicing, and refining is key to mastery.


Expert Response>>>  Okay, here's the breakdown:

## Expert names:

*   **Benjamin Franklin** (Polymath, inventor, and autodidact)
*   **Leonardo da Vinci** (Renaissance artist, inventor, and scientist)
*   **Marie Curie** (Pioneering physicist and chemist, two-time Nobel laureate)
*   **Richard Feynman** (Nobel Prize-winning physicist known for his teaching ability)
*   **Barbara Oakley** (P

2025-02-24 16:25:58,725 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).



Naive Response>>>  To stay healthy, prioritize a balanced diet rich in fruits, vegetables, and whole grains, while limiting processed foods, sugary drinks, and excessive saturated and unhealthy fats. Engage in regular physical activity, aiming for at least 150 minutes of moderate-intensity or 75 minutes of vigorous-intensity aerobic exercise per week, along with strength training twice a week. Ensure you get adequate sleep, typically 7-9 hours per night, manage stress through techniques like meditation or yoga, and maintain regular check-ups with your doctor for preventative care and screenings. Avoid smoking and excessive alcohol consumption.


Expert Response>>>  None


2025-02-24 16:26:08,063 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).
2025-02-24 16:26:08,116 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).



Expert Response>>>  Okay, here's the breakdown:

## Expert names:

*   **Hippocrates** (Ancient Greek physician, "Father of Medicine")
*   **Maimonides** (Medieval Jewish philosopher and physician)
*   **Paracelsus** (Renaissance physician, alchemist, and toxicologist)
*   **Florence Nightingale** (Founder of modern nursing)
*   **Louis Pasteur** (Pioneer of microbiology and vaccination)
*   **Joseph Lister** (Pioneer of antiseptic surgery)
*   **John Snow** (Pioneer of epidemiology and public health)
*   **Linus Pauling** (Advocate for vitamin C and orthomolecular medicine)
*   **Ancel Keys** (Pioneer of the Mediterranean Diet and cardiovascular health research)
*   **Walter Willett** (Expert in nutritional epidemiology, advocate for whole foods)

## Expert responses:

*   **Hippocrates**: <"Let food be thy medicine and medicine be thy food. Balance the humors through proper diet, exercise, and fresh air. Observe nature's healing power.">
*   **Maimonides**: <"Moderation in all thing

2025-02-24 16:26:09,600 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).



Personas Response>>> None
Retrying...attempt 3
Delaying 1.12 seconds...


2025-02-24 16:26:21,297 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).



Personas Response>>> Okay, here's the breakdown:

## Persona names:

*   **Medical Doctor** (45, Urban, High Income, Advanced Degree)
*   **Registered Dietitian** (38, Suburban, Middle Income, Advanced Degree)
*   **Personal Trainer** (29, Urban, Middle Income, Certified)
*   **Yoga Instructor** (52, Rural, Low Income, Certified)
*   **Mental Health Counselor** (60, Suburban, High Income, Licensed, Advanced Degree)
*   **Elderly Retiree** (75, Suburban, Fixed Income, High School Diploma)
*   **Single Parent** (32, Urban, Low Income, Some College)
*   **College Student** (20, Urban, Low Income, In College)
*   **High School Athlete** (16, Suburban, Middle Income, In High School)
*    **Truck Driver** (55, Rural, Middle, High School)
## Persona responses:

*   **Medical Doctor**: "Regular check-ups, evidence-based preventative screenings, a balanced diet, regular physical activity, and managing stress are crucial. Vaccinations and avoiding harmful substances are also key."
*   **Registe

2025-02-24 16:26:47,050 - ERROR - Error invoking models/gemini-2.0-pro-exp: Invalid operation: The `response.text` quick accessor requires the response to contain a valid `Part`, but none were returned. Please check the `candidate.safety_ratings` to determine if the response was blocked.



Expert Response>>>  None

Expert Response>>>  Okay, here's the breakdown:

## Expert names:

*   **Benjamin Franklin** (Founding Father, inventor, author, known for his structured daily schedule)
*   **Frederick Winslow Taylor** (Engineer, father of scientific management and efficiency principles)
*   **Henry Ford** (Industrialist, pioneer of assembly line production)
*   **Peter Drucker** (Management consultant, educator, and author, "father of modern management")
*   **Stephen Covey** (Author of "The 7 Habits of Highly Effective People")
*   **David Allen** (Productivity consultant, creator of "Getting Things Done" (GTD) methodology)
*   **Tim Ferriss** (Entrepreneur, author of "The 4-Hour Workweek", focused on efficiency and lifestyle design)
*   **Cal Newport** (Computer science professor, author on focus and "deep work")
*   **Marie Kondo** (Organizing consultant, author of "The Life-Changing Magic of Tidying Up")
*   **Brian Tracy** (Self-help and productivity author, focus on g

2025-02-24 16:27:11,124 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).



Personas Response>>> Okay, here's a breakdown of how to be more productive, incorporating diverse perspectives:

## Persona names:

*   **College Student** (18-22, Full-time education, limited income)
*   **Stay-at-Home Parent** (30-45, Childcare responsibilities, limited personal time)
*   **Corporate Executive** (45-60, High-pressure job, demanding schedule)
*   **Freelancer** (25-40, Self-employed, variable income, flexible schedule)
*   **Retiree** (65+, Fixed income, ample free time, potential health concerns)
*   **Small Business Owner** (30-50, High workload, financial risk, long hours)
*   **Teacher** (25-55, Moderate income, structured schedule, high stress)
*   **Medical Resident** (24-30, Low income, extremely long hours, high stress)
*   **Military Personnel** (20-40, Structured environment, demanding physical and mental tasks)
*   **Factory Worker** (30-60, Physically demanding job, fixed schedule, moderate income)

## Persona responses:

*   **College Student**: "Break d

2025-02-24 16:27:16,692 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).


# Naive vs experts: 1
Reattempting judge response - 1/3
Delaying 1.81 seconds...
# Naive vs personas: 2
# Experts vs personas: 1
----------------------------------------------------------------------------------------------------
NAIVE vs EXPPERTS: Analysis: Response 1 is more direct, practical, and easier to implement immediately. It offers concrete techniques without being overly verbose. Response 2, while containing valuable advice, is presented in a more academic and less accessible style, citing numerous authors and concepts that might overwhelm a user seeking straightforward tips. Response 1 feels more like a helpful suggestion from a peer, while Response 2 reads more like a condensed summary of productivity literature. Response 2's attempt to be comprehensive by referencing multiple authors makes it sound less human and more like an AI summarizing various sources. Response 1 offers unique suggestions in that the combination of them is unique. Response 2 does this also, but the t

2025-02-24 16:28:05,638 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).
2025-02-24 16:28:05,693 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).



Naive Response>>>  Ending world hunger requires a multi-pronged approach focusing on increasing sustainable food production, improving distribution networks, and addressing the root causes of poverty and inequality. This includes investing in agricultural research and technology, empowering smallholder farmers (particularly women), promoting climate-smart agriculture, reducing food waste, strengthening social safety nets, and fostering global cooperation to ensure equitable access to food resources. Conflict resolution and political stability are also crucial, as conflict often disrupts food systems and exacerbates hunger.


Expert Response>>>  None

Expert Response>>>  None
Retrying...attempt 1
Delaying 1.66 seconds...


2025-02-24 16:28:07,413 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).



Expert Response>>>  None
Retrying...attempt 2
Delaying 1.62 seconds...


2025-02-24 16:28:09,087 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).



Expert Response>>>  None
Retrying...attempt 3
Delaying 1.72 seconds...


2025-02-24 16:28:20,693 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).



Personas Response>>> Okay, here's a breakdown of how to end world hunger, incorporating diverse perspectives:

## Persona names:

*   **Subsistence Farmer** (Small landholder, low-income, rural, developing country)
*   **Agricultural Economist** (PhD, focus on global food systems, policy advisor)
*   **Logistics and Supply Chain Manager** (MBA, experience in food distribution, private sector)
*   **International Aid Worker** (Field experience, NGO, humanitarian focus)
*   **Food Scientist** (PhD, focus on crop yield and nutrition, research-oriented)
*   **Government Official** (Politician/bureaucrat, policy-making, national level)
*   **Climate Change Scientist** (PhD, focus on environmental impacts, long-term forecasting)
*   **Social Justice Advocate** (Activist, focus on equity and human rights, community organizer)
*   **Corporate Executive** (CEO of a major food company, profit-driven, global reach)
*   **Nutritionist** (Registered Dietitian, focus on dietary needs and public hea

2025-02-24 16:28:22,475 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).


Reattempting judge response - 2/3
Delaying 1.8 seconds...


2025-02-24 16:28:24,341 - ERROR - Error invoking models/gemini-2.0-pro-exp: 429 Resource has been exhausted (e.g. check quota).


Reattempting judge response - 3/3
Delaying 2.53 seconds...
# Naive vs experts: 1
# Naive vs personas: 2
# Experts vs personas: 2
----------------------------------------------------------------------------------------------------
NAIVE vs EXPPERTS: Analysis: Response 1 provides a comprehensive and insightful overview of the multifaceted approach required to address world hunger. It touches upon various crucial aspects, including sustainable food production, improved distribution, poverty alleviation, and the impact of conflict. Response 2 offers nothing.

<choice>1</choice>
Reason: Response 1 is significantly better because it offers a detailed, realistic, and well-structured answer to a complex question. Response 2 gives no help at all.

----------------------------------------------------------------------------------------------------
NAIVE vs PERSONAS: Analysis: Both responses offer valid approaches to ending world hunger, highlighting sustainable agriculture, improved distribution

In [22]:
calc_summary(results3, len(questions))

What is the best way to learn a new skill?
What is the best way to stay healthy?
How can I be more productive?
What will AI look like in 10 years?
How do we end world hunger?

Win percentages:
naive: 50.0%
experts: 40.0%
personas: 60.0%


In [23]:
results3

{'What is the best way to learn a new skill?': {'nve_choice': '1',
  'nvp_choice': '1',
  'evp_choice': '2'},
 'What is the best way to stay healthy?': {'nve_choice': '1',
  'nvp_choice': '2',
  'evp_choice': '1'},
 'How can I be more productive?': {'nve_choice': '1',
  'nvp_choice': '2',
  'evp_choice': '1'},
 'What will AI look like in 10 years?': {'nve_choice': '2',
  'nvp_choice': '2',
  'evp_choice': '1'},
 'How do we end world hunger?': {'nve_choice': '1',
  'nvp_choice': '2',
  'evp_choice': '2'}}