In [1]:
import json
import os
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display

In [2]:
load_dotenv(override=True)

True

In [3]:
openai_api_key = os.getenv("OPENAI_API_KEY")
google_api_key = os.getenv("GOOGLE_API_KEY")

if openai_api_key:
    print(f"OpenAI API Key: {openai_api_key[:8]}")
else:
    print("OpenAI API Key not found")

if google_api_key:
    print(f"Google API Key: {google_api_key[:8]}")
else:
    print("Google API Key not found")

OpenAI API Key: sk-proj-
Google API Key: AIzaSyAt
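
The check above echoes the first eight characters of each key into the saved notebook output. A minimal sketch of a more conservative check, assuming you only need to confirm that a key was loaded; `describe_key` and its `visible` parameter are hypothetical names, not part of the original notebook.

```python
def describe_key(name, value, visible=4):
    """Return a status line showing at most `visible` leading characters."""
    if not value:
        # Covers both a missing variable (None) and an empty string.
        return f"{name} not found"
    return f"{name} is set (begins {value[:visible]}..., {len(value)} characters)"


# Placeholder values only; in the notebook these would come from os.getenv.
print(describe_key("OpenAI API Key", "sk-example-not-a-real-key"))
print(describe_key("Google API Key", None))
```

Reporting the length alongside a short prefix is usually enough to spot a truncated or wrongly pasted key.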


In [4]:
request = "Please define a complex question that can be answered by different LLMs in order to evaluate their intelligence"
messages = [{"role": "user", "content": request}]
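
The reply to this request is later passed verbatim to the other models as `question`, so any preamble or commentary the first model adds travels with it. A minimal sketch of a tighter request, assuming it is acceptable to change the wording; the extra instruction sentence and the `build_messages` helper are illustrative additions, not part of the original notebook.

```python
def build_messages(request):
    """Wrap a single user request in the chat-completions message format."""
    return [{"role": "user", "content": request}]


# Hypothetical wording: the final sentence is an added assumption.
strict_request = (
    "Please define a complex question that can be answered by different LLMs "
    "in order to evaluate their intelligence. "
    "Answer only with the question, no explanation."
)
strict_messages = build_messages(strict_request)
print(strict_messages)
```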

In [5]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)
question = response.choices[0].message.content
print(question)

Certainly! Here's a complex question that can be posed to different language models (LLMs) to evaluate their intelligence and reasoning capabilities:

**Question:**

"Imagine a future society where artificial intelligence has integrated into daily life to a significant extent. Describe the potential benefits and drawbacks of such a society in the following areas: employment, education, ethical considerations, and personal relationships. Finally, suggest strategies that could be implemented to maximize the benefits and mitigate the drawbacks."

This question is multifaceted and encourages LLMs to demonstrate their understanding of:

1. **Benefits and Drawbacks**: Evaluating both sides encourages critical thinking and balanced reasoning.
2. **Multiple Areas of Impact**: Requiring insights across different domains tests the model's ability to connect disparate pieces of knowledge.
3. **Strategic Thinking**: Asking for strategies to address the identified issues assesses the model's proble

In [6]:
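# Parallel lists: competitors[i] is the model that produced answers[i].
competitors = []
answers = []
# Every competitor receives the generated question as its only message.
messages = [{"role": "user", "content": question}]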

In [7]:
model_name = "gpt-4o-mini"
response = openai.chat.completions.create(
    model=model_name,
    messages=messages
)
answer = response.choices[0].message.content
display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

This is an insightful question that probes the multifaceted relationship between artificial intelligence and human society. Below is a structured response that outlines the potential benefits and drawbacks of AI integration in the specified areas, along with strategies to maximize benefits and mitigate drawbacks.

---

**Benefits and Drawbacks of AI Integration in Society**

1. **Employment:**
   - **Benefits:**
     - **Increased Efficiency:** AI can automate routine tasks, leading to higher productivity and allowing humans to focus on creative and strategic roles.
     - **Job Creation in New Fields:** The rise of AI technologies can create new job categories in AI development, maintenance, and ethics consultation.
     - **Enhanced Decision-Making:** AI can analyze vast datasets quickly, aiding in data-driven decision-making in various industries.

   - **Drawbacks:**
     - **Job Displacement:** Automation may lead to significant job losses in sectors that depend on routine, manual labor.
     - **Skill Gaps:** There may be increased inequality as workers lacking advanced technical skills struggle to compete in an AI-driven job market.
     - **Mental Health Concerns:** The anxiety of job insecurity could impact workers' well-being.

2. **Education:**
   - **Benefits:**
     - **Personalized Learning:** AI can tailor educational experiences to individual learning styles, pacing, and needs, enhancing student engagement and outcomes.
     - **Global Access to Education:** Online AI-driven educational platforms can provide access to quality education resources, especially in underprivileged areas.
     - **Administrative Efficiency:** Automating administrative tasks allows educators to focus more on teaching.

   - **Drawbacks:**
     - **Over-Reliance on Technology:** Students might become too dependent on AI tools, undermining critical thinking skills and creativity.
     - **Privacy Concerns:** The collection of student data raises serious privacy and data security issues.
     - **Equity Issues:** Access to advanced educational AI tools may widen the gap between socio-economic groups.

3. **Ethical Considerations:**
   - **Benefits:**
     - **Improved Ethical Decision-Making:** AI can help analyze complex ethical dilemmas and support human decisions with comprehensive data.
     - **Accountability and Governance:** AI can enhance transparency in decision-making processes by providing clear records of actions taken based on data.

   - **Drawbacks:**
     - **Bias and Discrimination:** AI systems can perpetuate or exacerbate existing biases if the data sets they’re trained on are flawed.
     - **Loss of Autonomy:** The pervasive use of AI in decision-making could lead to decreased agency for individuals and communities.

4. **Personal Relationships:**
   - **Benefits:**
     - **Enhanced Communication:** AI-driven tools can facilitate better communication across distances, strengthening long-distance relationships.
     - **Support Systems:** AI companionship or therapy tools can provide emotional support for individuals in need.

   - **Drawbacks:**
     - **Diminished Human Interaction:** Increased reliance on digital communication can lead to less face-to-face interaction and weakened interpersonal skills.
     - **Emotional Disconnect:** Over-reliance on AI for companionship may lead some individuals to feel isolated or emotionally detached from genuine human relationships.

---

**Strategies to Maximize Benefits and Mitigate Drawbacks:**

1. **Policy and Regulation:**
   - Implement robust labor policies that support workers during transition periods, such as retraining programs for those displaced by automation. 

2. **Education Reform:**
   - Revise curricula to incorporate technology training and critical thinking skills to prepare students for a workforce that increasingly relies on AI.

3. **Ethics in AI Development:**
   - Establish guidelines for ethical AI use, ensuring diverse datasets are used in training AI systems to minimize bias and promote fairness.

4. **Promoting Human Connections:**
   - Encourage initiatives that emphasize face-to-face communication and community engagement, balancing the use of technology with human interaction.

5. **Public Awareness Campaigns:**
   - Educate the public about AI’s benefits and limitations, promoting responsible use and understanding of AI technologies.

By addressing these areas thoughtfully, society can harness the power of artificial intelligence while safeguarding against its potential pitfalls.

--- 

This structured response offers a comprehensive view that showcases critical thinking, multidisciplinary insights, and strategic problem-solving capabilities in addressing the complexities of AI integration into society.

In [8]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.0-flash"
response = gemini.chat.completions.create(
    model=model_name,
    messages=messages
)
answer = response.choices[0].message.content
display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)


That's an excellent and well-crafted question! It effectively targets key aspects of AI understanding and reasoning. Here's a breakdown of why it's so good, and what we might expect (or hope for) from a strong LLM response:

**Why this is a good question:**

*   **Requires Integration of Knowledge:** It doesn't just ask for isolated facts. It demands the LLM to connect concepts from AI, sociology, economics, ethics, and psychology.
*   **Open-Ended and Nuanced:** There isn't one "right" answer. The strength of the response relies on the depth, breadth, and logical coherence of the arguments presented.
*   **Promotes Critical Thinking:** The instruction to identify both benefits and drawbacks forces the LLM to consider multiple perspectives and avoid simplistic or biased conclusions.
*   **Real-World Relevance:** The scenario is highly pertinent to current discussions about the future of AI and its societal implications.
*   **Action-Oriented:** The request for strategies pushes the LLM beyond analysis into the realm of problem-solving and practical recommendations.

**What we would hope to see in a strong LLM response:**

*   **Employment:**
    *   **Benefits:**  Increased productivity, automation of repetitive tasks, creation of new roles focused on AI development, maintenance, and ethical oversight, personalized career pathways.
    *   **Drawbacks:** Widespread job displacement, widening income inequality, the need for massive reskilling/upskilling initiatives, potential for algorithmic bias in hiring and promotion processes.
    *   **Strategies:**  Universal Basic Income (UBI) or a variant, government-sponsored retraining programs, education reform to focus on skills that complement AI (critical thinking, creativity, emotional intelligence), policies to regulate AI's impact on the labor market.

*   **Education:**
    *   **Benefits:** Personalized learning experiences, AI tutors tailored to individual needs, access to vast knowledge repositories, automated grading and administrative tasks.
    *   **Drawbacks:**  Over-reliance on AI leading to a decline in critical thinking and problem-solving skills, algorithmic bias in educational content and assessments, concerns about data privacy and student surveillance.
    *   **Strategies:**  Curriculum reform to emphasize critical thinking, creativity, and ethical reasoning; developing AI literacy among students and educators; establishing ethical guidelines for the use of AI in education; ensuring equitable access to AI-powered learning tools.

*   **Ethical Considerations:**
    *   **Benefits:**  Improved decision-making in areas like healthcare and criminal justice (if AI is developed and used ethically), reduced human bias in certain processes, potential for AI to help solve global challenges like climate change and poverty.
    *   **Drawbacks:**  Algorithmic bias perpetuating and amplifying existing inequalities, lack of transparency and accountability in AI decision-making, potential for misuse of AI for surveillance and social control, questions about moral responsibility and AI sentience.
    *   **Strategies:**  Developing ethical frameworks and guidelines for AI development and deployment, promoting transparency and explainability in AI algorithms, establishing regulatory bodies to oversee AI, fostering public discourse on the ethical implications of AI.

*   **Personal Relationships:**
    *   **Benefits:** AI companions providing emotional support, improved communication and understanding between people through AI-powered translation and analysis of emotional cues, personalized recommendations for social activities and connections.
    *   **Drawbacks:**  Potential for social isolation and dependence on AI companions, erosion of human connection and empathy, manipulation of emotions through AI, blurring lines between real and virtual relationships.
    *   **Strategies:**  Promoting digital wellness and mindful use of technology, encouraging real-world social interaction, educating people about the potential pitfalls of AI relationships, fostering empathy and emotional intelligence in individuals and communities.

**Bonus points for:**

*   **Showing Awareness of Limitations:** Acknowledging the speculative nature of the question and pointing out potential unforeseen consequences.
*   **Complexity and Depth:** Moving beyond superficial observations and delving into the underlying causes and potential solutions.
*   **Originality:** Presenting novel ideas or perspectives that go beyond the standard talking points on AI.
*   **Addressing Counterarguments:** Acknowledging potential criticisms of the proposed strategies.

This is a challenging but rewarding question that can truly differentiate between basic LLMs and those demonstrating a higher level of intelligence and reasoning. Good job!
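
Cells 7 and 8 repeat the same steps: call the model, display the answer, record the result. A minimal sketch of folding the call and the bookkeeping into one function so that `competitors` and `answers` cannot drift apart; `ask_and_record` and the stand-in client are hypothetical names, and the stand-in only mimics the `chat.completions.create` shape used above so the example runs offline.

```python
from types import SimpleNamespace


def ask_and_record(client, model_name, messages, competitors, answers):
    """Query one model and append its name and answer to the parallel lists."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    competitors.append(model_name)
    answers.append(answer)
    return answer


def make_fake_client(reply):
    """Build an offline stand-in exposing client.chat.completions.create."""
    def create(model, messages):
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))


# Demo lists are named separately so they do not overwrite the notebook's own.
demo_competitors, demo_answers = [], []
fake = make_fake_client("stub answer")
ask_and_record(
    fake,
    "fake-model",
    [{"role": "user", "content": "hi"}],
    demo_competitors,
    demo_answers,
)
print(list(zip(demo_competitors, demo_answers)))
```

In the notebook, the same function could be called with the `openai` or `gemini` client in place of the stand-in, followed by `display(Markdown(answer))` on the returned text.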


In [1]:
!ollama pull llama3.2

[?2026h[?25l[1Gpulling manifest ⠋ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠙ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠹ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠸ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠼ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠴ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠦ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠧ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠇ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠏ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠋ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠙ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest [K
pulling dde5aa3fc5ff:   0% ▕                  ▏ 1.6 MB/2.0 GB                  [K[?25h[?2026l[?2026h[?25l[A[1Gpulling manifest [K
pulling dde5aa3fc5ff:   1% ▕                  ▏  15 MB/2.0 GB                  [K[?25h[?2026l[?2026h[?25l[A[1Gpulling manifest [K
pulling dde5aa3fc5ff:   1% ▕         