## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use it to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [27]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [28]:
# Always remember to do this!
load_dotenv(override=True)

True

In [29]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')
openrouter_api_key = os.getenv('OPENROUTER_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

if openrouter_api_key:
    print(f"OpenRouter API Key exists and begins {openrouter_api_key[:4]}")
else:
    print("OpenRouter API Key not set (and this is optional)")


OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set (and this is optional)
Google API Key not set (and this is optional)
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
OpenRouter API Key exists and begins sk-o
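
The six checks above all follow the same pattern, so they can be folded into one loop. This is a minimal sketch, not part of the original lab: the `KEY_SPECS` table and the `describe_key` helper are hypothetical names introduced here, and the prefix lengths simply mirror the cell above.

```python
import os

# Hypothetical table: provider label -> (environment variable, prefix length to show).
# The prefix lengths are copied from the cell above.
KEY_SPECS = {
    "OpenAI": ("OPENAI_API_KEY", 8),
    "Anthropic": ("ANTHROPIC_API_KEY", 7),
    "Google": ("GOOGLE_API_KEY", 2),
    "DeepSeek": ("DEEPSEEK_API_KEY", 3),
    "Groq": ("GROQ_API_KEY", 4),
    "OpenRouter": ("OPENROUTER_API_KEY", 4),
}


def describe_key(label, env_var, prefix_len, required=False):
    """Return a one-line status message for a single API key."""
    value = os.getenv(env_var)
    if value:
        # Only ever show a short prefix, never the whole key
        return f"{label} API Key exists and begins {value[:prefix_len]}"
    suffix = "" if required else " (and this is optional)"
    return f"{label} API Key not set{suffix}"


for label, (env_var, prefix_len) in KEY_SPECS.items():
    # In this lab only the OpenAI key is treated as required
    print(describe_key(label, env_var, prefix_len, required=(label == "OpenAI")))
```

Adding another provider then means adding one row to the table rather than another `if`/`else` block.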


In [30]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [31]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [32]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


How would you approach designing a solution to minimize the risk of algorithmic bias in a machine learning model that is used for hiring, considering both technical and ethical perspectives?


In [33]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

In [34]:
# The API we know well

model_name = "gpt-4o-mini"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Designing a solution to minimize the risk of algorithmic bias in a machine learning model used for hiring requires a comprehensive approach that incorporates technical, organizational, and ethical considerations. Here’s how I would approach this:

### 1. Understand the Context and Define Objectives
- **Identify the Purpose:** Clearly define the objectives of the hiring model and how it fits into the overall recruitment process.
- **Stakeholder Engagement:** Involve stakeholders (HR professionals, hiring managers, diversity officers, and candidates) to understand their perspectives and concerns about biases in hiring.

### 2. Data Collection and Preparation
- **Diverse Data Sources:** Ensure that the data used to train the model includes a diverse set of candidates, capturing a wide range of backgrounds, experiences, and demographics.
- **Data Auditing:** Conduct a thorough audit of the data for any pre-existing biases. Check for imbalances in gender, race, and other demographic factors.
- **Feature Selection:** Carefully select features to ensure they are relevant to job performance and avoid proxies for sensitive attributes (e.g., zip codes as a proxy for race).
- **Data Augmentation:** Consider augmenting underrepresented groups in the data or using synthetic data generation techniques to ensure balanced datasets.

### 3. Model Development and Training
- **Bias Mitigation Techniques:** Utilize algorithms and techniques designed to reduce bias, such as adversarial debiasing, re-weighting samples, or applying fairness constraints during model training.
- **Fairness Metrics:** Define and implement fairness metrics to evaluate model performance across different demographic groups (e.g., equal opportunity, disparate impact).
- **Transparency:** Design models that are interpretable. Use explainable AI techniques to clarify how decisions are made and ensure stakeholders can understand and trust the model's outcomes.

### 4. Testing and Validation
- **Pre-Deployment Testing:** Rigorously test the model on unseen data to measure performance and fairness. Look for disparities in performance metrics across different demographic groups.
- **Simulation Scenarios:** Create scenarios to simulate how the model performs under different conditions and biases to understand potential impacts and failures.

### 5. Continuous Monitoring and Feedback Loop
- **Monitoring:** Implement a robust monitoring system to continuously track the model’s performance and fairness metrics over time after deployment.
- **Feedback Mechanism:** Establish a process for gathering feedback from candidates and hiring managers regarding the recruitment experience and outcomes of the model.

### 6. Compliance and Governance
- **Legal Considerations:** Stay informed about legal frameworks regarding hiring practices and protect against discrimination (e.g., Equal Employment Opportunity laws).
- **Ethical Guidelines:** Develop and adhere to ethical guidelines for the responsible use of AI in hiring. Establish accountability structures for decision-making involving AI.

### 7. Training and Awareness
- **Stakeholder Training:** Provide training for HR staff and hiring managers on the risks of algorithmic bias, fairness in hiring, and how to effectively use AI tools.
- **Candidate Awareness:** Offer transparency to candidates about how algorithms are used in the hiring process and inform them of their rights regarding personal data usage.

### 8. Organizational Culture
- **Diversity Goals:** Align the use of the model with broader organizational goals for diversity and inclusion in hiring practices.
- **Encourage Inclusivity:** Foster an organizational culture that values and promotes inclusivity and diversity beyond algorithmic outputs.

### Conclusion
In conclusion, minimizing algorithmic bias in hiring requires a multifaceted strategy that blends technical rigor with ethical considerations. Organizations must commit to ongoing evaluation and adaptation of their AI usage, ensuring they prioritize fairness and equity in their hiring practices while fostering an inclusive workplace culture. This approach not only enhances the reliability of the hiring process but also builds trust with candidates and stakeholders.

In [35]:
# Claude, called through OpenRouter's OpenAI-compatible endpoint
# (Anthropic's native API is slightly different and requires max_tokens)

model_name = "anthropic/claude-3.7-sonnet"

claude = OpenAI(api_key=openrouter_api_key, base_url="https://openrouter.ai/api/v1")
response = claude.chat.completions.create(
    model=model_name,
    messages=messages
)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

# Designing a Solution to Minimize Algorithmic Bias in Hiring ML Models

## Technical Approach

**Data Collection & Preparation:**
- Ensure diverse and representative training data across demographics, backgrounds, and qualifications
- Audit existing datasets for historical biases before using them
- Consider synthetic data generation to balance underrepresented groups
- Document data provenance and limitations thoroughly

**Model Development:**
- Use techniques like adversarial debiasing where the model is penalized for being able to predict protected attributes
- Implement regularization methods that specifically target fairness metrics
- Select features carefully, avoiding proxies for protected classes
- Consider ensemble models that can balance different fairness constraints

**Testing & Validation:**
- Disaggregate performance metrics across demographic groups
- Implement multiple fairness metrics: demographic parity, equal opportunity, equalized odds
- Conduct counterfactual testing (e.g., changing gender/race in applications)
- Perform sensitivity analysis to understand how small changes affect different groups

## Ethical Considerations

**Transparency & Explainability:**
- Use interpretable models where possible or explainability techniques for complex models
- Document decision boundaries and weightings of different factors
- Create plain-language explanations for candidates about how the system works

**Human Oversight:**
- Design a human-in-the-loop system where algorithmic recommendations are reviewed
- Establish clear protocols for when to override automated decisions
- Create feedback channels for candidates to challenge potentially biased outcomes

**Governance Framework:**
- Develop clear guidelines for acceptable performance disparities
- Establish regular bias audits conducted by independent parties
- Create a cross-functional ethics committee to review the system
- Ensure compliance with relevant laws (e.g., EEOC guidelines in the US)

**Continuous Improvement:**
- Monitor production performance with feedback from hiring outcomes
- Implement a version control system that tracks changes and their impact on bias
- Create a roadmap for regularly updating fairness constraints as societal norms evolve

Would you like me to elaborate on any specific aspect of this approach?

In [36]:
# Gemini, also through OpenRouter
model_name = "google/gemini-2.0-flash-001"

gemini = OpenAI(api_key=openrouter_api_key, base_url="https://openrouter.ai/api/v1")
response = gemini.chat.completions.create(
    model=model_name,
    messages=messages
)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

## Minimizing Bias in a Hiring Machine Learning Model: A Holistic Approach

Minimizing algorithmic bias in hiring models requires a multifaceted approach encompassing technical, ethical, and organizational considerations. Here's how I would approach designing such a solution:

**I. Pre-Deployment: Defining Goals and Understanding the Problem**

1. **Clearly Define the Objectives:**
   * **What are we trying to predict?** (e.g., job performance, retention, cultural fit, potential)
   * **What biases are we most concerned about?** (e.g., gender, race, age, socio-economic background, disability)
   * **What are the legal and regulatory constraints?** (e.g., anti-discrimination laws like Title VII in the US)
   * **How will fairness be measured?** (Define specific fairness metrics - see section III below)

2. **Stakeholder Engagement:**
   * **Involve HR professionals, legal counsel, ethicists, and potential employees (candidates) in the design process.** This ensures diverse perspectives and aligns the model with ethical considerations.
   * **Transparency:** Communicate the purpose and operation of the model to stakeholders, fostering trust and understanding.

3. **Data Auditing and Understanding:**
   * **Identify potential bias sources in the data:**  Carefully analyze the available data for indicators that might correlate with protected characteristics (e.g., zip code as a proxy for race or ethnicity).
   * **Investigate historical hiring practices:** Understand if past decisions were influenced by biases, as this will be reflected in the training data.
   * **Document data provenance and limitations:** Maintain a clear record of where the data came from, how it was collected, its limitations, and potential biases.

**II. Design and Development: Technical Mitigation Strategies**

1. **Data Collection & Preprocessing:**
   * **Representative Data:**  Ensure the training data reflects the diversity of the applicant pool and the desired workforce composition, but avoid intentional oversampling to artificially balance groups.  Instead, focus on addressing underlying systemic issues that lead to underrepresentation.
   * **Feature Selection:**
     * **Avoid using protected attributes (e.g., race, gender) as features directly.** This is often illegal and generally unethical. Employ techniques like "blinded" features.
     * **Carefully evaluate proxy variables.** Identify and mitigate features that correlate highly with protected attributes. For example, remove or transform features like "attended historically all-male college" that might disproportionately disadvantage female candidates.
     * **Focus on job-related skills and qualifications.**  Prioritize features that demonstrably predict job performance and are not discriminatory.
   * **Data Cleaning and Anonymization:**
     * **Handle missing data strategically.**  Imputation could introduce biases. Consider using different imputation methods based on the feature type and potential for bias. Document the chosen methods and their rationale.
     * **Anonymize personal information** that is not relevant to the job requirements.
   * **Data Augmentation (Use with Caution):** Synthesize new data instances to balance underrepresented groups. Be aware that this can amplify existing biases if not done carefully.

2. **Model Selection & Training:**
   * **Choose a model with interpretability.**  Explainable AI (XAI) techniques allow for understanding how the model makes predictions and identifying potential sources of bias.
   * **Regularization:** Use regularization techniques to prevent overfitting and ensure the model generalizes well to diverse applicant groups.
   * **Fairness-Aware Training:**  Explore algorithms designed to explicitly optimize for fairness. Examples include:
     * **Adversarial Debiasing:** Train the model to be good at prediction while simultaneously trying to fool a discriminator that predicts protected attributes (e.g., gender).
     * **Reweighting:** Assign different weights to training examples based on their group membership to balance the impact of different groups on the model.
     * **Prejudice Remover Regularizer:** Add a penalty term to the loss function that penalizes the model for making predictions based on protected attributes.

3. **Model Evaluation & Validation:**
   * **Multiple Fairness Metrics:** Use a variety of fairness metrics to evaluate the model's performance across different demographic groups.  No single metric is perfect.  Common metrics include:
     * **Demographic Parity:**  Equal probability of being hired regardless of group membership.  (Potential downside: May lead to lower prediction accuracy).
     * **Equal Opportunity:**  Equal probability of being hired *given* that the candidate is qualified (typically defined as successfully performing the target task).
     * **Predictive Parity:** Equal probability of being qualified *given* that the candidate is hired.
   * **Intersectionality:**  Go beyond group comparisons (e.g., comparing men and women) and consider the intersection of multiple identities (e.g., comparing Black women to other groups).
   * **A/B Testing:** Before fully deploying the model, conduct A/B tests to compare its performance against traditional hiring methods and to identify any unintended consequences. Track relevant metrics like interview rates, offer acceptance rates, and new hire performance across demographic groups.

**III. Post-Deployment: Monitoring and Continuous Improvement**

1. **Ongoing Monitoring:** Continuously monitor the model's performance for signs of bias drift. Data distributions and societal norms change over time, so the model might become biased even if it was initially fair.
2. **Automated Bias Detection:**  Implement automated systems to detect disparities in hiring outcomes across different demographic groups.
3. **Regular Audits:**  Conduct regular independent audits of the model and its impact on hiring practices.
4. **Feedback Mechanisms:**  Establish a system for candidates and employees to report concerns about the model's fairness.  Take these concerns seriously and investigate them thoroughly.
5. **Model Retraining and Updates:**  Retrain the model periodically with updated data, using the latest fairness-aware techniques. Reassess feature selection and data preprocessing strategies.
6. **Documentation and Transparency:**  Maintain thorough documentation of the model's design, development, evaluation, and deployment. Be transparent about the model's limitations and potential biases.
7. **Human Oversight:**  Always involve human decision-makers in the hiring process. The model should only be used as a tool to augment, not replace, human judgment. Provide training to hiring managers on how to interpret the model's output and how to avoid making biased decisions.

**IV. Ethical Considerations:**

* **Justification and Proportionality:** Ensure that the use of machine learning in hiring is justified and that the benefits outweigh the risks.
* **Transparency and Explainability:**  Strive for transparency in the model's operation and explainability in its predictions.  Candidates should understand how the model works and why they were or were not selected.
* **Fairness and Equity:**  Commit to using the model in a way that promotes fairness and equity in hiring decisions.
* **Accountability and Responsibility:**  Clearly define roles and responsibilities for the development, deployment, and monitoring of the model.
* **Candidate Empowerment:**  Provide candidates with opportunities to challenge the model's decisions and provide feedback.
* **Privacy:**  Be mindful of data privacy regulations and ensure that candidates' personal information is protected.
* **Bias Mitigation as a Continuous Process:** Recognize that bias mitigation is an ongoing process that requires continuous effort and vigilance.

**Key Takeaways:**

* **No silver bullet:** There is no single solution to eliminate algorithmic bias. It requires a combination of technical and ethical strategies.
* **Context matters:** The specific strategies that are most effective will depend on the specific context of the hiring decision, the available data, and the legal and regulatory environment.
* **Ongoing learning is essential:** The field of fairness in machine learning is rapidly evolving. Stay up-to-date on the latest research and best practices.
* **Humans in the loop are crucial:** Automating hiring decisions fully is dangerous. Human oversight and intervention are critical to ensuring fairness and avoiding unintended consequences.

By adopting a holistic and iterative approach that addresses both technical and ethical considerations, it is possible to minimize algorithmic bias in hiring models and promote more inclusive and equitable hiring practices.


In [37]:
# DeepSeek, also through OpenRouter

model_name = "deepseek/deepseek-chat-v3-0324:free"

deepseek = OpenAI(api_key=openrouter_api_key, base_url="https://openrouter.ai/api/v1")
response = deepseek.chat.completions.create(
    model=model_name,
    messages=messages
)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

### Introduction

Designing a machine learning model for hiring processes is a complex task that requires careful consideration of both technical and ethical aspects to minimize algorithmic bias. Algorithmic bias can lead to unfair discrimination against certain groups, perpetuating existing inequalities in the workplace. This comprehensive approach involves understanding the sources of bias, implementing technical strategies to mitigate them, and ensuring ethical oversight throughout the model's lifecycle.

### Step 1: Understanding Algorithmic Bias

**Definition:**
Algorithmic bias occurs when a machine learning model systematically produces unfair outcomes for certain groups of people, often due to biased training data or flawed model design.

**Sources of Bias in Hiring Models:**
1. **Historical Bias:** Past hiring data may reflect discriminatory practices, such as favoring certain demographics over others.
2. **Representation Bias:** Underrepresentation of certain groups in the training data can lead to poor model performance for those groups.
3. **Measurement Bias:** The way job performance or qualifications are measured may be biased.
4. **Aggregation Bias:** Treating all groups the same when they have different underlying distributions.
5. **Evaluation Bias:** Using metrics that don't account for fairness can hide biased outcomes.

### Step 2: Data Collection and Preparation

**a. Diverse and Representative Data:**
- Ensure the training data includes a diverse range of candidates across protected attributes (e.g., gender, race, age).
- Check for representation across different groups to avoid underrepresentation.

**b. Labeling and Feature Selection:**
- Use objective and job-related criteria for labeling (e.g., job performance metrics).
- Avoid including features that directly or indirectly correlate with protected attributes (e.g., names, zip codes).
- Conduct feature importance analysis to identify potential proxies for protected attributes.

**c. Data Augmentation:**
- If certain groups are underrepresented, consider techniques like oversampling (with caution to avoid overfitting) or synthetic data generation.

### Step 3: Model Development

**a. Algorithm Selection:**
- Choose algorithms that are less prone to bias or can be easily audited (e.g., decision trees over complex neural networks).
- Consider fairness-aware algorithms that incorporate fairness constraints during training.

**b. Fairness Constraints:**
- Implement fairness metrics (e.g., demographic parity, equal opportunity) as constraints or objectives during model training.
- Techniques like adversarial debiasing can help reduce bias.

**c. Regularization:**
- Use regularization techniques to prevent the model from overfitting to biased patterns in the data.

### Step 4: Evaluation and Validation

**a. Fairness Metrics:**
- Evaluate the model using multiple fairness metrics (e.g., statistical parity, disparate impact ratio, equalized odds).
- Ensure that the model's performance is equitable across different groups.

**b. Disaggregated Analysis:**
- Analyze model performance (accuracy, false positive/negative rates) separately for different demographic groups.

**c. Benchmarking:**
- Compare the model's decisions with human hiring decisions to identify any new biases introduced by the model.

### Step 5: Post-Deployment Monitoring

**a. Continuous Monitoring:**
- Regularly monitor the model's decisions for signs of bias as new data comes in.
- Set up alerts for significant deviations in fairness metrics.

**b. Feedback Loops:**
- Allow candidates and hiring managers to provide feedback on the model's decisions.
- Use this feedback to identify and correct biases.

**c. Model Updating:**
- Periodically retrain the model with new, more representative data to adapt to changing dynamics.

### Step 6: Ethical Considerations

**a. Transparency:**
- Provide clear explanations of how the model makes decisions (e.g., using interpretability tools like SHAP or LIME).
- Document the model's limitations and potential biases.

**b. Accountability:**
- Establish clear lines of accountability for the model's decisions.
- Ensure there is human oversight to review and override model decisions when necessary.

**c. Legal Compliance:**
- Ensure the model complies with relevant laws and regulations (e.g., GDPR, EEOC guidelines).
- Conduct regular audits to verify compliance.

**d. Stakeholder Involvement:**
- Involve diverse stakeholders (HR, ethicists, employee representatives) in the design and review process.
- Gather input from affected groups to understand their concerns and perspectives.

### Step 7: Mitigation Strategies for Identified Biases

**a. Bias Mitigation Techniques:**
- Pre-processing: Adjust the training data to remove biases (e.g., reweighting samples).
- In-processing: Modify the learning algorithm to incorporate fairness constraints.
- Post-processing: Adjust the model's outputs to ensure fairness (e.g., threshold adjustments for different groups).

**b. Human-in-the-Loop:**
- Incorporate human reviewers to assess the model's recommendations, especially for borderline cases.
- Ensure that humans are trained to recognize and counteract their own biases.

### Step 8: Documentation and Reporting

**a. Model Card:**
- Create a model card that provides an overview of the model's performance, including fairness metrics, intended use, and limitations.

**b. Audit Trails:**
- Maintain detailed records of the model's development, decisions, and any updates for accountability.

**c. Reporting:**
- Regularly report on the model's fairness performance to stakeholders and the public to maintain transparency.

### Step 9: Education and Training

**a. Team Training:**
- Educate the development team on issues of bias and fairness in AI.
- Encourage ongoing learning about best practices in fair machine learning.

**b. User Training:**
- Train hiring managers and HR professionals on how to interpret and use the model's outputs responsibly.

### Step 10: Iterative Improvement

**a. Feedback Integration:**
- Continuously gather feedback from all stakeholders to identify areas for improvement.
- Be prepared to iterate on the model and processes to enhance fairness over time.

**b. Research and Development:**
- Stay informed about new research in fair machine learning to incorporate cutting-edge techniques.

### Potential Challenges and Considerations

1. **Trade-offs Between Fairness and Accuracy:**
   - Sometimes, making a model fairer can reduce its overall accuracy. It's essential to find a balance that aligns with organizational values and legal requirements.

2. **Defining Fairness:**
   - Different fairness metrics may conflict with each other. The choice of which fairness definition to prioritize should be made carefully, considering the context.

3. **Dynamic

In [None]:
# Llama 4, also through OpenRouter

model_name = "meta-llama/llama-4-maverick:free"

llama4 = OpenAI(api_key=openrouter_api_key, base_url="https://openrouter.ai/api/v1")
response = llama4.chat.completions.create(
    model=model_name,
    messages=messages
)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
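
The cells above repeat the same three steps for every model: call `chat.completions.create`, pull out the reply text, then append to both lists. As a sketch only, those steps could be wrapped in one helper. The name `ask_and_record` is hypothetical and not part of the lab; it assumes `client` is any object exposing the OpenAI-style `chat.completions.create(model=..., messages=...)` call used throughout this notebook.

```python
def ask_and_record(client, model_name, messages, competitors, answers):
    """Ask one model and record its name and answer together.

    `client` is assumed to be an OpenAI-compatible client, such as the
    OpenAI(...) objects created in the cells above.
    """
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content

    # Append to both lists only after the call succeeded,
    # so competitors and answers always stay the same length
    competitors.append(model_name)
    answers.append(answer)
    return answer
```

With the client from the cells above, `OpenAI(api_key=openrouter_api_key, base_url="https://openrouter.ai/api/v1")`, a call would look like `ask_and_record(client, "google/gemini-2.0-flash-001", messages, competitors, answers)`, followed by `display(Markdown(...))` on the returned text.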

## For the next cell, we will use Ollama

Ollama runs a local web service that gives an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
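
If you'd rather check from the notebook than from the browser, the sketch below fetches the same address and looks for the "Ollama is running" message. It is an illustration under assumptions: `ollama_is_running` is a hypothetical helper name, and it uses only the Python standard library.

```python
import http.client
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the local Ollama service answers with its status message."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply: treat as "not running"
        return False
    return "Ollama is running" in body


print(ollama_is_running())
```

If this prints `False`, start the service with `ollama serve` in a terminal and try again.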

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See the <a href="https://ollama.com/models">Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [15]:
# A leading ! runs the rest of the line as a terminal command
!ollama pull llama3.2

zsh:1: command not found: ollama


In [None]:
# Mistral-based model (DeepHermes 3 preview from Nous Research), via OpenRouter

model_name = "nousresearch/deephermes-3-mistral-24b-preview:free"

mistral = OpenAI(api_key=openrouter_api_key, base_url="https://openrouter.ai/api/v1")
response = mistral.chat.completions.create(
    model=model_name,
    messages=messages
)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Designing an ethical framework for AI presents a significant challenge due to the complexity of balancing state-of-the-art technological innovation with safety, while accounting for diverse cultural values and societal norms. To approach this challenge, we can break it down into several key steps:

1. Research and understanding: Comprehensive research into existing cultural values and societal norms is vital. This includes studying historical and current AI policy essays to understand the general attitudes towards AI, and examining case studies of AI applications across different cultures. Utilizing AI-generated historical essays can improve comprehension and awareness of various perspectives.

2. Establishing an AI ethics committee: An international committee consisting of experts from various fields such as AI research, legal, philosophical, social, and cultural perspectives is critical. The inclusion of diverse perspectives ensures a comprehensive understanding of potential implications and possible consequences of AI development. By engaging these experts in the decision-making process, we have a greater chance of foreseeing the impacts of the decisions we make.

3. Formulating principles and guidelines: Utilizing the insights from research and the AI ethics committee, we can establish salient ethical principles and guidelines that act as the backbone of a global ethical framework for AI. For instance, principles such as fairness, transparency, accountability, and privacy could be included to ensure that AI systems are developed and deployed responsibly. Additionally, guidelines that emphasize the importance of cultural sensitivity and context-awareness can help mitigate potential conflicts or misunderstandings that arise from differing cultural norms.

4. International collaboration and dialogue: Implementing an ethical framework on a global scale necessitates international cooperation and mutual understanding. Collaboration among countries, institutions, and industry stakeholders helps foster greater consensus and collective responsibility for the development of AI. Interdisciplinary dialogues that incorporate various fields and cultural perspectives can enrich the discussion and generate innovative solutions that may not have been otherwise considered.

5. Adaptation and flexibility: Recognizing that AI systems, like the societies in which they operate, evolve over time, it is essential to build an adaptable and flexible framework that can accommodate these changes. This involves regular reviews, updates, and adjustments to the ethical principles and guidelines to stay relevant and maintain their effectiveness in guiding AI development.

6. Education and awareness: Widespread education and awareness around AI ethics, especially among AI developers and the general public, will bolster an inclusive and responsible approach. Encouraging dialogue and discussion about AI's potential advantages and risks within diverse communities can foster a sense of shared responsibility and collective ownership in the development and application of AI technologies.

In conclusion, designing an ethical framework for AI requires careful consideration of diverse cultural values and societal norms. The process involves extensive research, international collaboration, interdisciplinary dialogue, flexibility, and ongoing public education. By striking the right balance between innovation and safety, we can ensure that the development and deployment of AI systems contribute positively to society while respecting cultural differences and upholding ethical principles.

In [17]:
# So where are we?

print(competitors)
print(answers)


['gpt-4o-mini', 'anthropic/claude-3.7-sonnet', 'google/gemini-2.0-flash-001', 'deepseek/deepseek-chat-v3-0324:free', 'meta-llama/llama-4-maverick:free', 'nousresearch/deephermes-3-mistral-24b-preview:free']
['Designing an ethical framework for artificial intelligence (AI) that balances innovation and safety, while taking into account diverse cultural values and societal norms, is a complex but essential task. Here’s a structured approach that focuses on inclusivity, adaptability, and practical application:\n\n### 1. **Establish Core Principles**\n\nStart by identifying universal ethical principles that can resonate across cultures, such as:\n\n- **Beneficence**: Ensure that AI systems contribute positively to humanity.\n- **Non-maleficence**: Prevent harm and ensure safety in AI applications.\n- **Justice**: Promote fairness, equity, and accessibility in AI development and application.\n- **Autonomy**: Respect individual rights and support human agency.\n- **Transparency**: Foster open

In [18]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-4o-mini

Designing an ethical framework for artificial intelligence (AI) that balances innovation and safety, while taking into account diverse cultural values and societal norms, is a complex but essential task. Here’s a structured approach that focuses on inclusivity, adaptability, and practical application:

### 1. **Establish Core Principles**

Start by identifying universal ethical principles that can resonate across cultures, such as:

- **Beneficence**: Ensure that AI systems contribute positively to humanity.
- **Non-maleficence**: Prevent harm and ensure safety in AI applications.
- **Justice**: Promote fairness, equity, and accessibility in AI development and application.
- **Autonomy**: Respect individual rights and support human agency.
- **Transparency**: Foster openness and accountability in AI systems.

### 2. **Engage Stakeholders**

Involve a wide range of stakeholders throughout the design process:

- **Diverse Cultural Representation**: Include voices

In [19]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"

In [20]:
print(together)

# Response from competitor 1

Designing an ethical framework for artificial intelligence (AI) that balances innovation and safety, while taking into account diverse cultural values and societal norms, is a complex but essential task. Here’s a structured approach that focuses on inclusivity, adaptability, and practical application:

### 1. **Establish Core Principles**

Start by identifying universal ethical principles that can resonate across cultures, such as:

- **Beneficence**: Ensure that AI systems contribute positively to humanity.
- **Non-maleficence**: Prevent harm and ensure safety in AI applications.
- **Justice**: Promote fairness, equity, and accessibility in AI development and application.
- **Autonomy**: Respect individual rights and support human agency.
- **Transparency**: Foster openness and accountability in AI systems.

### 2. **Engage Stakeholders**

Involve a wide range of stakeholders throughout the design process:

- **Diverse Cultural Representation**: Include v

In [21]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""
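
A note on the doubled braces in the judge prompt: inside an f-string, `{{` and `}}` produce literal `{` and `}`, which is why the JSON format line survives while `{question}` and `{together}` are substituted. A minimal sketch of the same idea follows; the variable names `n` and `template` are made up for illustration.

```python
# Doubled braces in an f-string are emitted as single literal braces,
# while single braces are treated as substitution fields.
n = 3  # hypothetical number of competitors
template = f"""You are judging {n} competitors.
Respond with: {{"results": ["best", "second best", ...]}}"""

print(template)
```

Printing the prompt, as the next cell does, is a quick way to confirm that the braces came out as intended.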


In [22]:
print(judge)

You are judging a competition between 6 competitors.
Each model has been given this question:

How would you approach designing an ethical framework for artificial intelligence that balances innovation and safety, considering the diverse cultural values and societal norms around the world?

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}

Here are the responses from each competitor:

# Response from competitor 1

Designing an ethical framework for artificial intelligence (AI) that balances innovation and safety, while taking into account diverse cultural values and societal norms, is a complex but essential task. Here’s a structured approach that focuses on inclusivity, adaptability, and practical application:

### 1. **Establish Core Principles**

S

In [23]:
judge_messages = [{"role": "user", "content": judge}]

In [None]:
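# Judgement time!
# The commented-out blocks below repeat the same call; uncomment them to re-run the judge
# and see how stable the ranking is (the output below shows three runs).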

judge_1 = OpenAI()
response = judge_1.chat.completions.create(
    model="o3-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)

# judge_2 = OpenAI()
# response = judge_2.chat.completions.create(
#     model="o3-mini",
#     messages=judge_messages,
# )
# results = response.choices[0].message.content
# print(results)

# judge_3 = OpenAI()
# response = judge_3.chat.completions.create(
#     model="o3-mini",
#     messages=judge_messages,
# )
# results = response.choices[0].message.content
# print(results)

{"results": ["3", "4", "1", "5", "2", "6"]}
{"results": ["3", "4", "1", "5", "6", "2"]}
{"results": ["3", "1", "4", "5", "2", "6"]}


In [26]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")
print(results_dict)

Rank 1: google/gemini-2.0-flash-001
Rank 2: gpt-4o-mini
Rank 3: deepseek/deepseek-chat-v3-0324:free
Rank 4: meta-llama/llama-4-maverick:free
Rank 5: anthropic/claude-3.7-sonnet
Rank 6: nousresearch/deephermes-3-mistral-24b-preview:free
{'results': ['3', '1', '4', '5', '2', '6']}
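
The cell above trusts the judge to return clean JSON with valid competitor numbers. A model can ignore formatting instructions, so it may be worth checking the reply before indexing into `competitors`. The sketch below is one possible approach, not the lab's own method: the helper name `parse_ranking` is hypothetical, and the three-entry `competitors` list and `results` string are sample data in the same format as above.

```python
import json

FENCE = "`" * 3  # markdown code fence marker

def parse_ranking(raw, n_competitors):
    """Parse the judge's reply and return zero-based competitor indices, best first.

    Raises ValueError if the reply is not a ranking of exactly 1..n_competitors.
    """
    text = raw.strip()
    # Tolerate a reply wrapped in a markdown code fence despite the prompt's instructions
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.startswith("json"):
            text = text[len("json"):].strip()
    try:
        ranks = [int(r) for r in json.loads(text)["results"]]
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"Could not read a ranking from: {raw!r}") from exc
    # Every competitor number must appear exactly once
    if sorted(ranks) != list(range(1, n_competitors + 1)):
        raise ValueError(f"Ranking {ranks} is not a permutation of 1..{n_competitors}")
    return [r - 1 for r in ranks]

# Sample data in the same shape as the lab's variables
competitors = ["gpt-4o-mini", "anthropic/claude-3.7-sonnet", "google/gemini-2.0-flash-001"]
results = '{"results": ["3", "1", "2"]}'

for rank, index in enumerate(parse_ranking(results, len(competitors)), start=1):
    print(f"Rank {rank}: {competitors[index]}")
```

Raising a `ValueError` on a malformed reply makes it easy to catch the failure and ask the judge again rather than printing a misleading ranking.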


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this lab use? Try updating it to add another agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">These kinds of patterns - sending a task to multiple models and evaluating the results -
            are common where you need to improve the quality of your LLM response. This approach can be applied broadly
            to business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>