## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submitted a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use this to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [2]:
# Always remember to do this!
load_dotenv(override=True)

True

In [4]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('gemini_api_key')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key not set
Anthropic API Key not set (and this is optional)
Google API Key exists and begins AI
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
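
The five near-identical `if`/`else` blocks above could be collapsed into one helper. This is a minimal sketch, not part of the original lab: `describe_key` is a hypothetical name, and the values below are made-up placeholders rather than real keys. It takes the value as an argument instead of reading the environment, so it can be tried without any keys set.

```python
def describe_key(name, value, prefix_len, optional=True):
    """Return a one-line status for an API key without revealing the whole key."""
    if value:
        return f"{name} API Key exists and begins {value[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{name} API Key not set{suffix}"

# Hypothetical example values; in the notebook these would come from os.getenv(...)
example_keys = [
    ("OpenAI", None, 8, False),
    ("Google", "AIza-example-not-a-real-key", 2, True),
]
status_lines = [describe_key(*entry) for entry in example_keys]
for line in status_lines:
    print(line)
```

Only a short prefix is printed, which keeps the debugging benefit of the original cell without exposing a full key in the notebook output.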


In [5]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.0-flash"
competitors = []
answers = []

In [6]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [7]:
# openai = OpenAI()
# response = openai.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=messages,
# )
# question = response.choices[0].message.content
# print(question)

response = gemini.chat.completions.create(model=model_name, messages=messages)
question = response.choices[0].message.content
print(question)
messages = [{"role": "user", "content": question}]


Consider a fictional society where individuals are born with a predetermined, publicly known "Potential Score" that quantifies their aptitude for various skills. How would you structure a just and equitable educational system in this society, acknowledging both the predictive power and inherent limitations of the Potential Score, while maximizing individual fulfillment and societal benefit? Justify your design, addressing potential biases, ethical considerations, and unintended consequences.



In [None]:
# The API we know well

# model_name = "gpt-4o-mini"

# response = openai.chat.completions.create(model=model_name, messages=messages)
# answer = response.choices[0].message.content

# display(Markdown(answer))
# competitors.append(model_name)
# answers.append(answer)

In [10]:
# Anthropic has a slightly different API, and the max_tokens argument is required

# model_name = "claude-3-7-sonnet-latest"

# claude = Anthropic()
# response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
# answer = response.content[0].text

# display(Markdown(answer))
# competitors.append(model_name)
# answers.append(answer)
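
The cells above read the reply from two different places: `response.choices[0].message.content` for the OpenAI-style clients and `response.content[0].text` for Anthropic. One way to hide that difference is a small adapter. This is a sketch under assumptions: `extract_text` is a hypothetical helper, and the stand-in objects only mimic the two attribute paths used in this notebook, not the real SDK response classes.

```python
from types import SimpleNamespace

def extract_text(response):
    """Return the reply text from either response shape used in this notebook."""
    if hasattr(response, "choices"):
        # OpenAI-style: response.choices[0].message.content
        return response.choices[0].message.content
    # Anthropic-style: response.content[0].text
    return response.content[0].text

# Stand-in objects (not real SDK responses) that mimic the two shapes
openai_style = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="hello from chat.completions"))]
)
anthropic_style = SimpleNamespace(content=[SimpleNamespace(text="hello from messages")])

print(extract_text(openai_style))
print(extract_text(anthropic_style))
```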

In [8]:
response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Okay, this is a complex and fascinating challenge! Here's my approach to designing a just and equitable educational system in a society governed by a "Potential Score," along with the justifications, ethical considerations, and potential drawbacks:

**I. Guiding Principles:**

*   **Human Dignity and Intrinsic Worth:** Every individual possesses inherent worth, regardless of their Potential Score. The educational system must respect and nurture this.
*   **Potential as a Guide, Not a Determinant:** The Potential Score should *inform* educational pathways, not dictate them. It's a tool for personalized learning, not a preordained destiny.
*   **Holistic Development:** Education must extend beyond vocational training and focus on critical thinking, creativity, ethical reasoning, emotional intelligence, and well-being.
*   **Equity of Opportunity:** Strive to provide equal access to quality education, resources, and support, regardless of their Potential Score, socioeconomic background, or location.
*   **Flexibility and Adaptability:** The system must allow for reassessment, skill diversification, and career changes, recognizing that potential can evolve over time.
*   **Continuous Improvement:**  Regular evaluation and feedback mechanisms should be integrated to identify biases, refine the system, and ensure it aligns with evolving societal needs.

**II. The Educational System Structure:**

**A. Early Childhood Education (Ages 0-5):**

*   **Focus:**  Universal access to high-quality, play-based education. Emphasis on social-emotional development, early literacy, numeracy, and exploration of diverse interests.
*   **Potential Score Role:**  None publicly accessible.  Observation and assessment by trained professionals (pediatricians, psychologists, educators) to identify potential learning disabilities or developmental delays and provide early intervention.  Data collected is *strictly confidential* and used only for individual support.
*   **Rationale:** Critical foundation for future learning and development.  Shielding young children from potential score labels promotes self-esteem and a love of learning.

**B. Primary Education (Ages 6-12):**

*   **Focus:**  Core curriculum encompassing language arts, mathematics, science, social studies, arts, and physical education. Personalized learning approaches tailored to individual learning styles and pace. Introduction to potential career pathways in broad strokes, such as through exploring different professions, reading about biographies, and engaging in age-appropriate simulations.
*   **Potential Score Role:** The Potential Score (which should be regularly updated and re-assessed) is used *internally* by educators to inform instructional strategies and provide personalized support. Students are grouped *flexibly* based on specific skills and interests, not rigidly by Potential Score.  Emphasis is placed on identifying and nurturing individual strengths.
*   **Rationale:** Provides a broad foundation of knowledge and skills. Personalized learning ensures that all students can reach their potential, regardless of the initial score. Flexible grouping prevents segregation and fosters collaboration.

**C. Secondary Education (Ages 13-18):**

*   **Focus:**  More specialized curriculum with a greater degree of student choice.  Multiple pathways:
    *   **Academic Pathway:**  Focus on preparation for higher education. Advanced courses in specific disciplines.
    *   **Vocational/Technical Pathway:**  Hands-on training in specific trades or technical fields. Apprenticeships and internships.
    *   **Arts/Humanities Pathway:**  In-depth study of art, music, literature, history, philosophy, and other humanities disciplines.
    *   **Hybrid Pathway:**  A blend of academic, vocational, and arts courses tailored to individual interests.
*   **Potential Score Role:** The Potential Score is used as a guide for pathway selection but is *not* the sole determinant.  Students receive comprehensive career counseling and are encouraged to explore different options.  Emphasis is placed on self-assessment, interest inventories, and real-world experiences.  The Potential Score is *transparent* to the student and their parents/guardians, but its limitations are emphasized.  The system includes a clear appeals process if a student believes the score is inaccurate or unfairly limiting.
*   **Rationale:**  Allows students to pursue their passions and develop specialized skills. Multiple pathways ensure that all students have access to a relevant and engaging education. Career counseling helps students make informed decisions about their future.

**D. Higher Education and Lifelong Learning:**

*   **Focus:**  Universities, colleges, vocational schools, and online learning platforms offer a wide range of degree programs, certifications, and continuing education courses.  Scholarships and financial aid are widely available to ensure affordability.  Emphasis on lifelong learning and skill development.
*   **Potential Score Role:**  The Potential Score may be considered as part of the admissions process, but it is *not* the only factor.  Colleges and universities use a holistic review process that considers academic record, extracurricular activities, personal essays, letters of recommendation, and demonstrated skills.  The Potential Score may be used to provide targeted support and mentorship to students who may need additional assistance.
*   **Rationale:**  Provides opportunities for advanced learning and skill development. Financial aid ensures that all students have access to higher education. Lifelong learning is essential for adapting to a rapidly changing world.

**III. Addressing Potential Biases and Ethical Considerations:**

*   **Bias Detection and Mitigation:**
    *   **Data Audits:** Regularly audit the algorithms used to calculate the Potential Score for bias against specific demographic groups (e.g., race, gender, socioeconomic status).
    *   **Expert Review:**  Subject the algorithms to review by independent experts in statistics, education, and ethics.
    *   **Diverse Data Sets:** Ensure that the training data used to develop the algorithms is diverse and representative of the population.
    *   **Multiple Assessments:** Use multiple assessment methods to measure potential, including standardized tests, performance-based assessments, portfolios, and interviews.
    *   **Qualitative Feedback:**  Collect qualitative feedback from students, parents, and educators about their experiences with the educational system.
*   **Ethical Considerations:**
    *   **Transparency:**  Be transparent about how the Potential Score is calculated and how it is used.
    *   **Privacy:**  Protect the privacy of student data.
    *   **Autonomy:**  Respect the autonomy of students and their families to make decisions about their education.
    *   **Fairness:**  Strive for fairness in all aspects of the educational system.
    *   **Social Justice:**  Address systemic inequalities that may affect student outcomes.
*   **Unintended Consequences:**
    *   **Self-Fulfilling Prophecy:**  Guard against the Potential Score becoming a self-fulfilling prophecy, where students are limited by their perceived potential.
    *   **Stratification:**  Prevent the Potential Score from creating a rigid social hierarchy.
    *   **Devaluation of Certain Skills:**  Avoid devaluing skills that are not easily quantifiable or measured by the Potential Score (e.g., creativity, empathy, leadership).
    *   **Reduced Motivation:**  Address the potential for students with lower scores to become demotivated.
*   **Countermeasures:**
    *   **Emphasize Growth Mindset:**  Promote the belief that intelligence and abilities can be developed through effort and learning.
    *   **Celebrate Success:**  Recognize and celebrate the achievements of all students, regardless of their Potential Score.
    *   **Foster Collaboration:**  Encourage students to work together and learn from each other.
    *   **Promote Empathy:**  Teach students to understand and appreciate the perspectives of others.
    *   **Invest in Mental Health:**  Provide access to mental health services to support students' well-being.
    *   **Challenge the System:**  Encourage students to challenge the system and advocate for their needs.

**IV. Monitoring and Evaluation:**

*   **Key Performance Indicators (KPIs):**
    *   Student achievement (measured through standardized tests, grades, and other assessments).
    *   Graduation rates.
    *   College enrollment rates.
    *   Employment rates.
    *   Student satisfaction.
    *   Equity gaps (differences in outcomes between different demographic groups).
    *   Social mobility (the ability of individuals to move up or down the socioeconomic ladder).
*   **Data Collection Methods:**
    *   Surveys.
    *   Interviews.
    *   Focus groups.
    *   Statistical analysis.
    *   Qualitative research.
*   **Regular Evaluation:** Conduct regular evaluations of the educational system to identify areas for improvement.
*   **Stakeholder Engagement:** Involve students, parents, educators, and community members in the evaluation process.

**V. Constant Adaptation and Improvement:**

The educational system must be a living, breathing entity, constantly adapting to new knowledge, changing societal needs, and evolving ethical considerations.  Regular feedback loops, data analysis, and open dialogue are essential for ensuring that the system remains just, equitable, and effective. The Potential Score is merely a tool; the ultimate goal is to empower individuals to reach their full potential and contribute meaningfully to society.

**In Conclusion:**

Designing an educational system in a society defined by a "Potential Score" is a tightrope walk.  It requires acknowledging the potential benefits of personalized learning while mitigating the risks of bias, discrimination, and self-fulfilling prophecies. By prioritizing human dignity, equity of opportunity, and continuous improvement, we can strive to create a system that empowers all individuals to thrive, regardless of their perceived potential.


In [12]:
# deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
# model_name = "deepseek-chat"

# response = deepseek.chat.completions.create(model=model_name, messages=messages)
# answer = response.choices[0].message.content

# display(Markdown(answer))
# competitors.append(model_name)
# answers.append(answer)

In [13]:
# groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
# model_name = "llama-3.3-70b-versatile"

# response = groq.chat.completions.create(model=model_name, messages=messages)
# answer = response.choices[0].message.content

# display(Markdown(answer))
# competitors.append(model_name)
# answers.append(answer)
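
Every model cell so far repeats the same steps: call the model, pull out the text, then record the competitor and the answer. A minimal sketch of folding that into one function follows. `ask_competitor` and `FakeClient` are hypothetical names introduced here; the fake client only mimics the `chat.completions.create(model=..., messages=...)` call so the example runs without any API key. In the notebook you would pass `gemini` or another OpenAI-compatible client instead.

```python
from types import SimpleNamespace

class FakeClient:
    """Stand-in that mimics client.chat.completions.create, for illustration only."""
    def __init__(self, reply):
        self._reply = reply
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    def _create(self, model, messages):
        message = SimpleNamespace(content=f"{self._reply} ({model})")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

def ask_competitor(client, model_name, messages, competitors, answers):
    """Send messages to one model and record its name and answer."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    competitors.append(model_name)
    answers.append(answer)
    return answer

demo_competitors, demo_answers = [], []
demo_messages = [{"role": "user", "content": "A placeholder question"}]
for client, name in [(FakeClient("first"), "model-a"), (FakeClient("second"), "model-b")]:
    ask_competitor(client, name, demo_messages, demo_competitors, demo_answers)

print(list(zip(demo_competitors, demo_answers)))
```

Keeping the two `append` calls inside one function also makes it harder for `competitors` and `answers` to drift out of step when a cell is re-run or skipped.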


## For the next cell, we will use Ollama

Ollama runs a local web service that provides an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit here: http://localhost:11434 and see the message "Ollama is running"
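
The browser check above can also be done from Python. This is a minimal sketch using only the standard library; `looks_like_ollama` and `ollama_is_running` are hypothetical helper names, and the URL and banner text are the ones quoted above.

```python
from http.client import HTTPException
from urllib.error import URLError
from urllib.request import urlopen

def looks_like_ollama(body):
    """True if a response body contains the banner quoted above."""
    return "Ollama is running" in body

def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the local Ollama service answers with its banner."""
    try:
        with urlopen(base_url, timeout=timeout) as resp:
            return looks_like_ollama(resp.read().decode("utf-8", errors="replace"))
    except (URLError, OSError, HTTPException):
        # Connection refused, timeout, malformed reply, and similar failures
        return False

print("Ollama reachable:", ollama_is_running())
```

If this prints `False`, starting the service with `ollama serve` in a terminal is the first thing to try.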

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [None]:
!ollama pull llama3.2

In [9]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In a society where individuals are born with publicly known "Potential Scores," I would propose the following structured educational system to ensure that it is just, equitable, and maximizes individual fulfillment while also benefiting society:

**Principles:**

1. **Complementary education**: Recognize that Potential Scores do not dictate success or failure, but rather provide a starting point for exploration. The goal is to help individuals discover their passions, develop talents, and adapt to changing circumstances.
2. **Contextualization**: Account for the potential biases associated with Potential Scores by incorporating contextual factors into the educational system, such as socioeconomic status, location, and access to resources.
3. **Emphasis on skills development**: Focus on building skills that are transferable across various fields, rather than solely emphasizing a student's Potential Score.

**Educational Structure:**

1. **Counseling and Resource Allocation (ages 0-6)**:
	* Provide unbiased counseling to families based on their potential child's score.
	* Allocate resources fairly, considering contextual factors such as family income, location, and access to facilities.
2. **Core Education Program (ages 7-14)**:
	* Offer a standardized curriculum emphasizing core skills, including critical thinking, problem-solving, communication, creativity, and emotional intelligence.
	* Utilize Adaptive Learning Technology (ALT) that adjusts the learning pace and content based on individual Performance Scores.
3. **Clustered Learning Model (ages 15-18)**:
	* Divide students into small clusters based on their Potential Score ranges to create diverse peer groups.
	* Train educators on the potential biases associated with Potential Scores, ensuring a fair assessment of student capabilities and adaptability.
4. **Post-Federation Training and Placement (ages 19+)**:
	* Provide accessible, contextual learning experiences that foster soft skills development and real-world applications.
	* Facilitate connections between individuals with diverse backgrounds and experience.

**Potential Score-Related Considerations:**

1. **Potential Score Analysis**: The system would need to recognize that Potential Scores are just an estimate and may not accurately reflect a person's full potential or the potential that has yet to be fulfilled.
2. **Tracking Progress**: Regular tracking of individual Performance Scores and skill development will help ensure that educators can monitor student progress, adjust instruction techniques accordingly, and provide additional support as needed.

**Mitigating Bias:**

1. **Fairness Review Board**: Establish an independent committee composed of experts from diverse backgrounds to review potential Score-based biases and advocate for adjustments when necessary.
2. **Resource-Based Reforms**: Provide equitable access to resources based on socio-economic status, geographic location, and available opportunity, eliminating or minimizing inherent biases.

**Preventing Unintended Consequences:**

1. **Adaptive and Open Education Systems**: Develop adaptable educational models that adjust to changes in society, addressing the challenges of evolving needs.
2. **Mentoring Programs for Educators**: Implement mentorship programs to educate educators on the limitations of Potential Scores and the benefits of diverse group settings.

The overall system aims to create harmony between quantifying talents with respecting individuality.

In [25]:
# So where are we?
for comp, an in zip(competitors, answers):
    print(f"Competitor: {comp}\n\n{an}\n")

Competitor: gemini-2.0-flash

Okay, this is a complex and fascinating challenge! Here's my approach to designing a just and equitable educational system in a society governed by a "Potential Score," along with the justifications, ethical considerations, and potential drawbacks:

**I. Guiding Principles:**

*   **Human Dignity and Intrinsic Worth:** Every individual possesses inherent worth, regardless of their Potential Score. The educational system must respect and nurture this.
*   **Potential as a Guide, Not a Determinant:** The Potential Score should *inform* educational pathways, not dictate them. It's a tool for personalized learning, not a preordained destiny.
*   **Holistic Development:** Education must extend beyond vocational training and focus on critical thinking, creativity, ethical reasoning, emotional intelligence, and well-being.
*   **Equity of Opportunity:** Strive to provide equal access to quality education, resources, and support, regardless of their Potential Sco

In [26]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"

In [27]:
print(together)

# Response from competitor 1

Okay, this is a complex and fascinating challenge! Here's my approach to designing a just and equitable educational system in a society governed by a "Potential Score," along with the justifications, ethical considerations, and potential drawbacks:

**I. Guiding Principles:**

*   **Human Dignity and Intrinsic Worth:** Every individual possesses inherent worth, regardless of their Potential Score. The educational system must respect and nurture this.
*   **Potential as a Guide, Not a Determinant:** The Potential Score should *inform* educational pathways, not dictate them. It's a tool for personalized learning, not a preordained destiny.
*   **Holistic Development:** Education must extend beyond vocational training and focus on critical thinking, creativity, ethical reasoning, emotional intelligence, and well-being.
*   **Equity of Opportunity:** Strive to provide equal access to quality education, resources, and support, regardless of their Potential Scor

In [28]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


In [29]:
print(judge)

You are judging a competition between 2 competitors.
Each model has been given this question:

Consider a fictional society where individuals are born with a predetermined, publicly known "Potential Score" that quantifies their aptitude for various skills. How would you structure a just and equitable educational system in this society, acknowledging both the predictive power and inherent limitations of the Potential Score, while maximizing individual fulfillment and societal benefit? Justify your design, addressing potential biases, ethical considerations, and unintended consequences.

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}

Here are the responses from each competitor:

# Response from competitor 1

Okay, this is a complex and fascinating challenge! Here's my approach to designing a just and equitable educational system in a society governed by a "Potential Score," along with the justifications, ethical considerations, and potential drawbacks:

**I. Guiding Principles:**

*   **Human Dignity and Intrinsic Worth:** Every individual possesses inherent worth, regardless o

In [30]:
judge_messages = [{"role": "user", "content": judge}]

In [31]:
# Judgement time!

response = gemini.chat.completions.create(
    model="gemini-2.0-flash",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


{"results": ["1", "2"]}


In [32]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")

Rank 1: gemini-2.0-flash
Rank 2: llama3.2
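
The parsing cell above assumes the judge returns bare JSON with valid competitor numbers. Models sometimes wrap the JSON in a Markdown code fence despite the instruction, so a slightly more defensive version may be useful. This is a sketch, not part of the original lab: `parse_ranking` is a hypothetical helper, and the fence-stripping rule is an assumption about how such replies tend to look.

```python
import json
import re

def parse_ranking(raw, competitors):
    """Turn the judge's reply into an ordered list of competitor names."""
    # Strip an optional Markdown code fence around the JSON (assumed reply shape)
    cleaned = re.sub(r"^`{3}(?:json)?\s*|\s*`{3}$", "", raw.strip())
    ranks = json.loads(cleaned)["results"]
    ordered = []
    for rank in ranks:
        index = int(rank) - 1  # competitor numbers in the prompt are 1-based
        if not 0 <= index < len(competitors):
            raise ValueError(f"Judge returned unknown competitor number: {rank}")
        ordered.append(competitors[index])
    return ordered

names = ["gemini-2.0-flash", "llama3.2"]
print(parse_ranking('{"results": ["1", "2"]}', names))

# A fenced reply, built programmatically to keep this example readable
fenced = "`" * 3 + 'json\n{"results": ["2", "1"]}\n' + "`" * 3
print(parse_ranking(fenced, names))
```

Raising a `ValueError` for an out-of-range number is a design choice here: without the check, a reply of `"0"` would silently index the last competitor through Python's negative indexing.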


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">These kinds of patterns, where you send a task to multiple models and evaluate the results,
            are common where you need to improve the quality of your LLM response. This approach can be applied
            to many business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>