## Welcome to the Second Lab - Week 1, Day 3

Today we will call several different models. This is a good way to get comfortable with their APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submitted a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use it to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [5]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [6]:
# Always remember to do this!
load_dotenv(override=True)

True

In [7]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AI
DeepSeek API Key exists and begins sk-
Groq API Key exists and begins gsk_
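
The repeated if/else checks above could be folded into one loop. This is only a minimal sketch: `describe_key` and `KEY_PREFIX_LENGTHS` are hypothetical names that are not part of the lab, and the prefix lengths are copied from the cell above.

```python
import os

# Prefix lengths copied from the cell above; the table itself is a hypothetical convenience.
KEY_PREFIX_LENGTHS = {
    "OPENAI_API_KEY": 8,
    "ANTHROPIC_API_KEY": 7,
    "GOOGLE_API_KEY": 2,
    "DEEPSEEK_API_KEY": 3,
    "GROQ_API_KEY": 4,
}


def describe_key(name, prefix_length, env=None):
    """Return a one-line status for an API key, showing only its first few characters."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return f"{name} exists and begins {value[:prefix_length]}"
    return f"{name} not set"


# Report on every key in the table using the real environment.
for name, prefix_length in KEY_PREFIX_LENGTHS.items():
    print(describe_key(name, prefix_length))
```

Passing a plain dictionary as `env` makes it possible to try the helper without touching real keys.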


In [8]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [9]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [10]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


How do you reconcile the tension between the need for individual privacy and the necessity of data collection for societal benefits, particularly in the context of emerging technologies like AI and big data?


In [11]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

In [12]:
# The API we know well

model_name = "gpt-4o-mini"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Reconciling the tension between individual privacy and the need for data collection for societal benefits, especially in the context of emerging technologies like AI and big data, requires a multifaceted approach. Here are some strategies to address this challenge:

1. **Regulatory Frameworks**: Establishing clear regulations that govern data collection, use, and sharing is crucial. Laws like the General Data Protection Regulation (GDPR) in the European Union provide a model for balancing privacy rights with data utilization for societal good. This includes principles such as data minimization, transparency, and user consent.

2. **Ethical Standards**: Organizations should adopt ethical guidelines related to data usage. This includes conducting regular ethics assessments, adhering to principles of fairness, accountability, and transparency, and ensuring that algorithms used in AI systems are free from bias and discrimination.

3. **Data Anonymization**: When collecting data for analysis, organizations can employ techniques like anonymization or pseudonymization. This allows for the insights derived from data without exposing individual identities, preserving privacy while still enabling valuable societal benefits.

4. **User Control and Consent**: Providing individuals with control over their data is essential. This includes clear options for users to consent to data collection, easily manage their preferences, and be informed about how their data will be used and for what purposes.

5. **Public Awareness and Education**: Increasing public awareness about the significance of both data privacy and data utility is vital. Educating individuals about their rights and the benefits of data sharing can lead to more informed consent and participation in data-driven initiatives.

6. **Focus on Purpose Limitation**: Collection of data should be limited to specific, legitimate purposes that are communicated clearly to users. This helps ensure that data is not used in ways individuals do not foresee or agree to.

7. **Stakeholder Engagement**: Involving diverse stakeholders in the data governance process, including civil society, regulatory bodies, technologists, and the community, can lead to more balanced approaches. Collaborative discussions can help identify shared values and concerns related to privacy and data use.

8. **Use of Technology for Privacy Protection**: Emerging technologies, including blockchain and differential privacy techniques, can enhance data security and privacy. Incorporating these technologies can help create systems that respect individual privacy while allowing for effective data analysis.

9. **Impact Assessments**: Conducting regular assessments of the impact of data collection practices on privacy and society can help organizations adapt their practices to mitigate any negative consequences. This includes evaluating the potential for harm and implementing measures to address privacy concerns.

10. **Promoting a Culture of Privacy**: Creating a culture within organizations that prioritizes privacy and ethical data use can lead to practices that respect individual rights while still pursuing innovative uses of data for societal benefits.

By implementing these strategies, it is possible to strike a balance that respects individual privacy while still harnessing the potential of data for the greater good. The key lies in fostering an environment where ethical considerations are at the forefront of data collection and use.

In [13]:
# Anthropic has a slightly different API, and max_tokens is required

model_name = "claude-3-7-sonnet-latest"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

# Balancing Privacy and Data Utility

This tension requires a multifaceted approach that doesn't treat privacy and data utility as a zero-sum game:

**Foundational principles:**
- Privacy by design: Building privacy protections into systems from the ground up
- Informed consent: Ensuring people understand what data is collected and how it's used
- Data minimization: Collecting only what's necessary for the stated purpose

**Promising technical approaches:**
- Differential privacy: Adding calibrated noise to datasets to protect individuals while preserving aggregate insights
- Federated learning: Training AI models across devices without centralizing sensitive data
- Synthetic data: Creating artificial datasets that maintain statistical properties without exposing real individuals

The most sustainable path forward likely involves:
1. Contextual privacy norms that vary by domain and sensitivity
2. Democratic oversight of data governance
3. Technological solutions that can extract societal value while minimizing individual exposure

What aspects of this tension interest you most? The technical solutions, ethical frameworks, or perhaps the policy considerations?
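
The two response shapes used so far differ: the OpenAI client returns the text at `response.choices[0].message.content`, while the Anthropic client returns it at `response.content[0].text`. Below is a rough sketch of a helper that hides the difference. `extract_text` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins that only mimic the attribute paths used in the cells above; they are not real SDK responses.

```python
from types import SimpleNamespace


def extract_text(response):
    """Return the reply text from either response shape used in this lab."""
    if hasattr(response, "choices"):
        # OpenAI-style: response.choices[0].message.content
        return response.choices[0].message.content
    # Anthropic-style: response.content[0].text
    return response.content[0].text


# Stand-in objects that only mimic the attribute paths, so this runs without API keys.
openai_style = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="hello from OpenAI shape"))]
)
anthropic_style = SimpleNamespace(content=[SimpleNamespace(text="hello from Anthropic shape")])

print(extract_text(openai_style))
print(extract_text(anthropic_style))
```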

In [14]:
# The OpenAI package is a Python client library for calling HTTPS endpoints.
# Many other providers offer OpenAI-compatible endpoints, so the same client works with a different base_url.
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.0-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Reconciling the tension between individual privacy and the societal benefits of data collection is a complex challenge, especially with the rise of AI and big data. It requires a multi-faceted approach involving technological solutions, legal frameworks, ethical guidelines, and public education. Here's a breakdown of how we can navigate this tension:

**1. Technological Solutions:**

*   **Privacy-Enhancing Technologies (PETs):** These technologies aim to protect privacy while still enabling data analysis. Examples include:
    *   **Differential Privacy:**  Adds noise to data to protect individual identities while preserving aggregate trends.
    *   **Homomorphic Encryption:**  Allows computations on encrypted data without decrypting it.
    *   **Federated Learning:**  Trains AI models on decentralized data sources (e.g., individual devices) without directly accessing the raw data.
    *   **Secure Multi-Party Computation (SMPC):**  Allows multiple parties to compute a function on their private inputs without revealing those inputs to each other.
*   **Data Minimization:**  Collecting only the necessary data for a specific purpose and deleting data when it is no longer needed.
*   **Data Anonymization and Pseudonymization:** Removing or replacing identifying information with pseudonyms. While not perfect, these techniques can reduce the risk of re-identification.
*   **Transparency Tools:**  Tools that allow individuals to understand what data is being collected about them, how it is being used, and with whom it is being shared.  These can empower individuals to make informed choices.

**2. Legal Frameworks and Regulations:**

*   **Strong Data Protection Laws:**  Comprehensive laws like GDPR (General Data Protection Regulation) in Europe and CCPA (California Consumer Privacy Act) establish clear rules for data collection, processing, and storage.  These laws grant individuals rights over their data, including the right to access, rectify, and delete their data.
*   **Purpose Limitation:**  Restricting the use of collected data to the specific purpose for which it was originally collected and with the individual's informed consent.
*   **Accountability and Transparency:**  Requiring organizations to be transparent about their data practices and accountable for complying with data protection laws.  Independent oversight bodies (e.g., data protection authorities) can help enforce these laws.
*   **Data Breach Notification Laws:**  Mandatory reporting of data breaches to both individuals and regulatory authorities. This helps to mitigate the damage from breaches and encourages organizations to improve their security practices.

**3. Ethical Guidelines and Principles:**

*   **Ethical AI Development:**  Adhering to ethical principles when developing and deploying AI systems. This includes fairness, transparency, accountability, and respect for human rights.
*   **Privacy by Design:**  Incorporating privacy considerations into the design of systems and services from the outset. This proactive approach is more effective than trying to retrofit privacy protections later.
*   **Data Ethics Frameworks:** Developing and implementing clear ethical frameworks for data collection, use, and sharing.  These frameworks should consider the potential impact on individuals and society as a whole.
*   **Focus on Societal Good:**  Prioritizing data collection and AI development that benefits society as a whole, rather than solely focusing on commercial interests.

**4. Public Education and Awareness:**

*   **Data Literacy:**  Educating the public about data privacy, security, and the implications of data collection and AI.  This empowers individuals to make informed decisions about their data and to advocate for stronger privacy protections.
*   **Critical Thinking Skills:**  Encouraging critical thinking about the use of data and AI in society.  This includes questioning the potential biases and unintended consequences of these technologies.
*   **Promoting Trust and Transparency:**  Building trust between individuals and organizations that collect and use data.  Transparency about data practices is essential for building trust.

**How to Reconcile the Tension:**

*   **Risk-Based Approach:** Evaluate the potential risks to individual privacy associated with data collection and AI development.  Implement safeguards that are proportional to the level of risk.
*   **Balancing Competing Interests:**  Recognize that there is often a legitimate need for data collection to achieve societal benefits.  However, this need must be balanced against the individual's right to privacy.
*   **Context Matters:**  The appropriate balance between privacy and data collection will vary depending on the context.  For example, the level of privacy protection required for medical data may be different from the level of protection required for marketing data.
*   **Ongoing Dialogue and Adaptation:**  The debate over privacy and data collection is ongoing.  It is important to continue the dialogue and to adapt our legal frameworks, ethical guidelines, and technological solutions as new technologies emerge.
*   **Empowering Individuals:**  Give individuals more control over their data and the ability to make informed decisions about how it is used.

**Challenges and Considerations:**

*   **Re-identification:**  Even when data is anonymized, there is a risk of re-identifying individuals.  This is particularly true with big data, where there is a large amount of information available.
*   **Algorithmic Bias:**  AI algorithms can perpetuate and amplify existing biases in data.  This can lead to unfair or discriminatory outcomes.
*   **The "Nothing to Hide" Argument:**  The argument that "I have nothing to hide, so I don't care about privacy" is flawed.  Privacy is essential for freedom of thought, expression, and association.  Moreover, even if someone has nothing to hide, they may still be vulnerable to discrimination, surveillance, or other harms if their data is misused.
*   **Global Differences:**  Privacy laws and cultural norms vary across different countries.  This can make it difficult to develop consistent data privacy policies.
*   **Enforcement:**  Enforcing data protection laws can be challenging, particularly in the context of cross-border data flows.

In conclusion, reconciling the tension between individual privacy and the societal benefits of data collection requires a holistic approach that combines technological innovation, robust legal frameworks, ethical guidelines, and public education. It's an ongoing balancing act requiring continuous evaluation and adaptation as technology evolves.  The goal is to harness the power of data and AI while safeguarding fundamental rights and promoting a just and equitable society.


In [15]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

The tension between individual privacy and the societal benefits of data collection—especially in the context of AI and big data—is one of the defining challenges of the digital age. Reconciling these competing priorities requires a nuanced, principled approach that balances innovation with ethical safeguards. Here are key strategies to address this tension:

### 1. **Privacy by Design & Default**
   - Embed privacy protections into the architecture of technologies from the outset (e.g., data minimization, anonymization, encryption).
   - Ensure systems default to the least intrusive data collection necessary for their function.

### 2. **Transparency & Consent**
   - Provide clear, accessible explanations of what data is collected, how it’s used, and who has access.
   - Move beyond "clickwrap" consent to meaningful user control (e.g., granular opt-ins, dynamic consent models).

### 3. **Purpose Limitation & Data Minimization**
   - Collect only data strictly necessary for defined, legitimate purposes (e.g., public health, urban planning).
   - Avoid "function creep" where data collected for one purpose is repurposed without oversight.

### 4. **Differential Privacy & Synthetic Data**
   - Use techniques like differential privacy (adding statistical noise to datasets) to enable analysis without exposing individual identities.
   - Generate synthetic datasets that mimic real-world patterns but contain no actual personal data.

### 5. **Strong Governance & Accountability**
   - Implement independent oversight (e.g., ethics boards, algorithmic audits) to assess risks and benefits.
   - Enforce strict penalties for misuse (e.g., GDPR-style regulations with teeth).

### 6. **Public Interest vs. Commercial Exploitation**
   - Distinguish between data collection for genuine public good (e.g., pandemic response) versus corporate surveillance.
   - Advocate for public-sector stewardship of high-impact datasets (e.g., health, mobility) with democratic oversight.

### 7. **Technological Empowerment**
   - Develop tools that let individuals retain ownership of their data (e.g., self-sovereign identity, personal data vaults).
   - Explore federated learning, where AI models are trained on decentralized data without raw data being shared.

### **Ethical Trade-offs in Practice**
- **Healthcare AI**: Anonymized medical data can accelerate drug discovery, but re-identification risks must be mitigated.
- **Smart Cities**: Traffic sensors improve urban planning, but location tracking could enable mass surveillance.
- **Generative AI**: Training on public data enables innovation, but scraping personal data without consent violates privacy.

### **The Way Forward**
The goal isn’t a binary choice between privacy and progress but a framework where both coexist. This requires:
- **Legal clarity**: Updated regulations that keep pace with technology.
- **Public dialogue**: Inclusive debates about societal red lines (e.g., facial recognition bans in some cities).
- **Innovation in privacy tech**: Advances in cryptography (e.g., homomorphic encryption) that enable data utility without exposure.

Ultimately, the measure of success is whether we can harness data’s potential *without* normalizing a surveillance society. The solutions lie in policy, technology, and cultural shifts toward valuing privacy as a collective good.

In [16]:
groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "llama-3.3-70b-versatile"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)


Reconciling the tension between individual privacy and data collection is a complex issue that requires a multifaceted approach. Here are some ways to address this challenge, particularly in the context of emerging technologies like AI and big data:

1. **Data minimization**: Collect only the data that is necessary for the intended purpose, and avoid collecting extraneous information that could be used to identify individuals.
2. **Anonymization and pseudonymization**: Anonymize or pseudonymize data to prevent identification of individuals, while still allowing for analysis and insights to be gleaned.
3. **Opt-in and opt-out mechanisms**: Provide individuals with the option to opt-in or opt-out of data collection, and make sure they understand the implications of their choices.
4. **Transparent data collection**: Clearly communicate what data is being collected, how it will be used, and who will have access to it.
5. **Data protection by design**: Incorporate data protection principles and safeguards into the design of systems and technologies from the outset, rather than as an afterthought.
6. **Regulatory frameworks**: Establish and enforce robust regulatory frameworks that protect individual privacy and ensure accountability for data collection and use.
7. **AI and machine learning audits**: Regularly audit AI and machine learning systems to ensure they are transparent, explainable, and fair, and that they do not perpetuate biases or discrimination.
8. **Employee education and training**: Educate employees on data protection and privacy principles, and ensure they understand their roles and responsibilities in handling sensitive data.
9. **Individual control and agency**: Empower individuals to take control of their personal data, including the ability to access, correct, and delete their data as needed.
10. **Benefits-sharing models**: Explore benefits-sharing models that compensate individuals for their data, or provide them with direct benefits, such as improved healthcare outcomes or personalized services.
11. **Public engagement and awareness**: Engage with the public to raise awareness about the benefits and risks associated with data collection, and involve them in decision-making processes around data governance.
12. **Independent oversight**: Establish independent oversight mechanisms to ensure that data collection and use are compliant with regulations and ethical standards.
13. **Data portability**: Allow individuals to transfer their data between services, promoting competition and innovation while maintaining control over their personal information.
14. **Decentralized data storage**: Explore decentralized data storage solutions that enable individuals to store and manage their data in a secure and private manner.
15. **Continuous evaluation and improvement**: Regularly evaluate and improve data collection and use practices, incorporating feedback from individuals, civil society, and regulatory bodies.

To balance individual privacy with societal benefits, consider the following:

1. **Contextualize data collection**: Consider the specific context in which data is being collected, and ensure that the benefits of collection outweigh the potential risks to individual privacy.
2. **Risk-benefit analyses**: Conduct thorough risk-benefit analyses to evaluate the potential benefits of data collection against the potential risks to individual privacy.
3. **Proportionality**: Ensure that data collection is proportionate to the intended purpose, and that the most intrusive methods are used only when necessary.
4. **Legitimate interests**: Establish clear guidelines for what constitutes a legitimate interest in data collection, and ensure that collection is limited to those interests.
5. **Accountability**: Hold organizations accountable for their data collection and use practices, and ensure that they are transparent about their activities.

Ultimately, reconciling individual privacy with data collection for societal benefits requires a nuanced and context-dependent approach, involving ongoing dialogue and collaboration among stakeholders, including individuals, organizations, governments, and civil society.

## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".
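
The same check can be made from Python instead of a browser. A small sketch, assuming the default address mentioned above; `is_ollama_running` is a hypothetical helper, not part of Ollama or the lab.

```python
import http.client
import urllib.request


def is_ollama_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the URL answers with the 'Ollama is running' message."""
    # Bypass any configured proxy, since the service is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url, timeout=timeout) as reply:
            body = reply.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error or malformed URL: treat as "not running".
        return False
    return "Ollama is running" in body


print(is_ollama_running())
```

If this prints `False`, starting the service as described next is the first thing to try.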

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See the <a href="https://ollama.com/models">Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [17]:
!ollama pull llama3.2

pulling manifest
pulling dde5aa3fc5ff:   0% ▕                  ▏ 9.6 KB/2.0 GB

In [18]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Reconciling the tension between individual privacy and the necessity of data collection for societal benefits is a complex issue that requires careful consideration of various ethical, legal, and technical aspects. Here are some potential solutions to address this tension:

1. **Transparency and accountability**: Ensure that companies collecting personal data do so in a transparent manner, providing individuals with clear information about how their data will be used, shared, and protected. Accountability mechanisms should be put in place to prevent misuse of collected data.
2. **Consent and control**: Provide individuals with meaningful consent options, allowing them to decide whether and how their data can be collected, used, and shared. This includes the right to opt out of certain data collection practices or to delete their personal data altogether.
3. **Data minimization**: Collect only the necessary data for a specific purpose, reducing the privacy risks associated with collecting unnecessary information.
4. **Anonymization and pseud anonymization**: Use techniques like anonymization and pseudonymization to protect individual identity while still allowing for data analysis and insights.
5. **Regulatory frameworks**: Establish robust regulatory frameworks that safeguard individuals' rights, such as the General Data Protection Regulation (GDPR) in the European Union or the California Consumer Privacy Act (CCPA) in the United States.
6. **Data governance**: Implement effective data governance structures to ensure that data is handled responsibly and with proper oversight, ensuring that benefits are equitably distributed and risks minimized.
7. **Innovation incubators for data-driven solutions**: Support innovation in designing more secure, transparent, and responsible AI systems that prioritize individual privacy while still leveraging the potential of big data.
8. **Data literacy education**: Educate individuals about data protection best practices, online safety, and digital literacy to empower them with the knowledge needed to navigate the complex data landscape.
9. **Public engagement and participation**: Foster public engagement and participation in decision-making processes around AI and data collection to ensure that societal values are reflected in policies and practices.
10. **Encouraging responsible AI design**: Incentivize companies to adopt responsible AI design practices, such as designing systems that prioritize fairness, transparency, and explainability.

By adopting a multi-faceted approach that incorporates these solutions, we can reconcile the tension between individual privacy and the necessity of data collection for societal benefits. Here are some specific considerations for emerging technologies like AI and big data:

1. **AI explainability**: Develop techniques to provide transparent explanations for AI decisions to identify biases and errors.
2. **Data fairness**: Implement measures to prevent bias in AI systems, including fairness metrics and audits to monitor progress.
3. **Security by design**: Embed security principles into the development of AI systems, ensuring that data is protected throughout its lifecycle.
4. **Ethics-focused education**: Integrate ethics-focused education into curricula for technologists and policy-makers to develop a broader understanding of AI's social and moral implications.

Ultimately, reconciliation will depend on individual values and societal priorities for privacy protections. We must weigh the benefits against risks associated with data collection, ensuring that technological advancements are developed sustainably, ethically, and responsibly.
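
The Gemini, DeepSeek, Groq and Ollama cells above all follow the same pattern: an `OpenAI` client pointed at a different `base_url`. One possible way to fold them into a single table is sketched below. The names `PROVIDERS` and `ask` are hypothetical; the base URLs, model names and environment variable names are the ones used in the cells above. `ask` is defined but not called here, because calling it needs the `openai` package, valid keys and network access.

```python
import os

# model name -> (base_url, environment variable holding the key, or None for local Ollama)
PROVIDERS = {
    "gemini-2.0-flash": ("https://generativelanguage.googleapis.com/v1beta/openai/", "GOOGLE_API_KEY"),
    "deepseek-chat": ("https://api.deepseek.com/v1", "DEEPSEEK_API_KEY"),
    "llama-3.3-70b-versatile": ("https://api.groq.com/openai/v1", "GROQ_API_KEY"),
    "llama3.2": ("http://localhost:11434/v1", None),
}


def ask(model_name, messages):
    """Send messages to an OpenAI-compatible endpoint and return the reply text."""
    from openai import OpenAI  # imported lazily so the table can be inspected without the package

    base_url, key_variable = PROVIDERS[model_name]
    # The local Ollama endpoint gets the placeholder key 'ollama', as in the cell above.
    api_key = os.getenv(key_variable) if key_variable else "ollama"
    client = OpenAI(api_key=api_key, base_url=base_url)
    response = client.chat.completions.create(model=model_name, messages=messages)
    return response.choices[0].message.content


# Show which endpoint each model would be sent to.
for model_name, (base_url, _) in PROVIDERS.items():
    print(f"{model_name} -> {base_url}")
```

In the notebook, a call such as `ask("deepseek-chat", messages)` would then stand in for the three lines that create the client, send the request and read the answer.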

In [19]:
# So where are we?

print(competitors)
print(answers)


['gpt-4o-mini', 'claude-3-7-sonnet-latest', 'gemini-2.0-flash', 'deepseek-chat', 'llama-3.3-70b-versatile', 'llama3.2']
['Reconciling the tension between individual privacy and the need for data collection for societal benefits, especially in the context of emerging technologies like AI and big data, requires a multifaceted approach. Here are some strategies to address this challenge:\n\n1. **Regulatory Frameworks**: Establishing clear regulations that govern data collection, use, and sharing is crucial. Laws like the General Data Protection Regulation (GDPR) in the European Union provide a model for balancing privacy rights with data utilization for societal good. This includes principles such as data minimization, transparency, and user consent.\n\n2. **Ethical Standards**: Organizations should adopt ethical guidelines related to data usage. This includes conducting regular ethics assessments, adhering to principles of fairness, accountability, and transparency, and ensuring that alg

In [20]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-4o-mini

Reconciling the tension between individual privacy and the need for data collection for societal benefits, especially in the context of emerging technologies like AI and big data, requires a multifaceted approach. Here are some strategies to address this challenge:

1. **Regulatory Frameworks**: Establishing clear regulations that govern data collection, use, and sharing is crucial. Laws like the General Data Protection Regulation (GDPR) in the European Union provide a model for balancing privacy rights with data utilization for societal good. This includes principles such as data minimization, transparency, and user consent.

2. **Ethical Standards**: Organizations should adopt ethical guidelines related to data usage. This includes conducting regular ethics assessments, adhering to principles of fairness, accountability, and transparency, and ensuring that algorithms used in AI systems are free from bias and discrimination.

3. **Data Anonymization**: When co

In [21]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"
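
As a variation, `enumerate` accepts a `start` argument, which avoids the `index+1` arithmetic. The sketch below builds the same kind of string from a made-up list (`sample_answers` is a hypothetical stand-in for `answers`) and joins the pieces in one go rather than appending in a loop:

```python
# Hypothetical stand-in for the notebook's list of answers.
sample_answers = ["first answer", "second answer"]

# enumerate(..., start=1) numbers from 1, so no index+1 is needed.
sections = [
    f"# Response from competitor {number}\n\n{answer}\n\n"
    for number, answer in enumerate(sample_answers, start=1)
]
combined = "".join(sections)
print(combined)
```

Both versions produce the same text; this one is just a matter of taste.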

In [22]:
print(together)

# Response from competitor 1

Reconciling the tension between individual privacy and the need for data collection for societal benefits, especially in the context of emerging technologies like AI and big data, requires a multifaceted approach. Here are some strategies to address this challenge:

1. **Regulatory Frameworks**: Establishing clear regulations that govern data collection, use, and sharing is crucial. Laws like the General Data Protection Regulation (GDPR) in the European Union provide a model for balancing privacy rights with data utilization for societal good. This includes principles such as data minimization, transparency, and user consent.

2. **Ethical Standards**: Organizations should adopt ethical guidelines related to data usage. This includes conducting regular ethics assessments, adhering to principles of fairness, accountability, and transparency, and ensuring that algorithms used in AI systems are free from bias and discrimination.

3. **Data Anonymization**: Wh

In [23]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""
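
The doubled braces in the prompt above are there because this is an f-string: single braces mark a substitution, so literal JSON braces have to be written as `{{` and `}}`. A small sketch with a made-up value (`count` and `template` are hypothetical names for illustration) shows the effect:

```python
import json

count = 3  # hypothetical number of competitors

# Inside an f-string, {{ and }} produce literal braces; {count} is substituted.
template = f"""Rank {count} competitors.
Respond with JSON in this format:
{{"results": ["1", "2", "3"]}}"""

print(template)

# The last line of the rendered prompt is itself valid JSON.
example = json.loads(template.splitlines()[-1])
print(example["results"])
```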


In [24]:
print(judge)

You are judging a competition between 6 competitors.
Each model has been given this question:

How do you reconcile the tension between the need for individual privacy and the necessity of data collection for societal benefits, particularly in the context of emerging technologies like AI and big data?

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}

Here are the responses from each competitor:

# Response from competitor 1

Reconciling the tension between individual privacy and the need for data collection for societal benefits, especially in the context of emerging technologies like AI and big data, requires a multifaceted approach. Here are some strategies to address this challenge:

1. **Regulatory Frameworks**: Establishing clear regulations tha

In [25]:
judge_messages = [{"role": "user", "content": judge}]

In [26]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="o3-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


{"results": ["4", "3", "1", "2", "6", "5"]}


In [27]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    # The judge returns 1-based competitor numbers as strings, so convert to a 0-based index
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")

Rank 1: deepseek-chat
Rank 2: gemini-2.0-flash
Rank 3: gpt-4o-mini
Rank 4: claude-3-7-sonnet-latest
Rank 5: llama3.2
Rank 6: llama-3.3-70b-versatile
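
The parsing above trusts the judge to return clean JSON that ranks every competitor exactly once. Models do not always comply, so here is a rough sketch of a more defensive version. The helper name `parse_ranking` and its checks are my own assumptions for illustration, not part of any library:

```python
import json


def parse_ranking(raw: str, n_competitors: int) -> list[int]:
    """Return 1-based competitor numbers, best first, or raise ValueError."""
    text = raw.strip()
    fence = "`" * 3
    # Some models wrap JSON in a markdown fence despite being told not to.
    if text.startswith(fence):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    try:
        ranks = [int(item) for item in json.loads(text)["results"]]
    except (KeyError, TypeError, ValueError) as exc:
        # json.JSONDecodeError is a subclass of ValueError, so it lands here too.
        raise ValueError(f"Could not parse judge output: {exc}") from exc
    # A valid ranking mentions every competitor exactly once.
    if sorted(ranks) != list(range(1, n_competitors + 1)):
        raise ValueError(f"Not a complete ranking: {ranks}")
    return ranks


ranking = parse_ranking('{"results": ["4", "3", "1", "2", "6", "5"]}', 6)
print(ranking)
```

In the notebook you could call it as `parse_ranking(results, len(competitors))` and then index into `competitors` with each number minus one, as the cell above does.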


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">These kinds of patterns - sending a task to multiple models and evaluating the results -
            are common where you need to improve the quality of your LLM responses. This approach can be applied
            to most business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>