## Lab 4: OpenAI for non-OpenAI

In [1]:
import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import display, Markdown, update_display
import requests
load_dotenv(override=True)

False

### We're now going to ask lots of models a hard question

In [2]:
message = "In 1 sentence, describe a rainbow to someone who's never been able to see. \
Then in 1 sentence, describe the imaginary number i to someone who doesn't understand math. \
Then in 1 sentence, find a connection between rainbows and imaginary numbers. \
Then end by stating how many words are in your answer."

messages = [{"role": "user", "content": message}]

In [3]:
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')
grok_api_key = os.getenv('GROK_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

if grok_api_key:
    print(f"Grok API Key exists and begins {grok_api_key[:4]}")
else:
    print("Grok API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AI
DeepSeek API Key exists and begins sk-
Groq API Key not set (and this is optional)
Grok API Key not set (and this is optional)
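
The six near-identical checks above could be folded into a single loop. This is a minimal sketch, assuming the same environment variable names and prefix lengths as the cell above. `KEY_SPECS` and `describe_key` are hypothetical names introduced here for illustration.

```python
import os

# (label, environment variable, number of key characters to reveal, optional?)
KEY_SPECS = [
    ("OpenAI", "OPENAI_API_KEY", 8, False),
    ("Anthropic", "ANTHROPIC_API_KEY", 7, True),
    ("Google", "GOOGLE_API_KEY", 2, True),
    ("DeepSeek", "DEEPSEEK_API_KEY", 3, True),
    ("Groq", "GROQ_API_KEY", 4, True),
    ("Grok", "GROK_API_KEY", 4, True),
]

def describe_key(label, key, prefix_len, optional):
    """Return the same status line that the individual checks print."""
    if key:
        return f"{label} API Key exists and begins {key[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{label} API Key not set{suffix}"

for label, env_var, prefix_len, optional in KEY_SPECS:
    print(describe_key(label, os.getenv(env_var), prefix_len, optional))
```

A check like this only confirms that a key is present. It cannot tell whether the provider will accept it.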


#### <span style="color: green;">Question: what's going on below? Why are we calling OpenAI with details about Anthropic?</span>

In [4]:
anthropic_url = "https://api.anthropic.com/v1/"
gemini_url = "https://generativelanguage.googleapis.com/v1beta/openai/"
deepseek_url = "https://api.deepseek.com"
groq_url = "https://api.groq.com/openai/v1"
grok_url = "https://api.x.ai/v1"
ollama_url = 'http://localhost:11434/v1'

openai = OpenAI()
anthropic = OpenAI(api_key=anthropic_api_key, base_url=anthropic_url)
gemini = OpenAI(api_key=google_api_key, base_url=gemini_url)
deepseek = OpenAI(api_key=deepseek_api_key, base_url=deepseek_url)
groq = OpenAI(api_key=groq_api_key, base_url=groq_url)
grok = OpenAI(api_key=grok_api_key, base_url=grok_url)
ollama = OpenAI(base_url=ollama_url, api_key='ollama')
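
Each provider in the cell above is reached through the same `OpenAI` client class, with only `api_key` and `base_url` changing. One way to avoid creating clients for providers that have no key is to build the constructor arguments first and skip the missing ones. This is a sketch under that assumption. `BASE_URLS`, `ENV_VARS` and `client_settings` are hypothetical names, and the URLs are copied from the cell above.

```python
import os

# Base URLs copied from the cell above; None means "use the client's default endpoint"
BASE_URLS = {
    "openai": None,
    "anthropic": "https://api.anthropic.com/v1/",
    "gemini": "https://generativelanguage.googleapis.com/v1beta/openai/",
    "deepseek": "https://api.deepseek.com",
    "groq": "https://api.groq.com/openai/v1",
    "grok": "https://api.x.ai/v1",
}

ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "groq": "GROQ_API_KEY",
    "grok": "GROK_API_KEY",
}

def client_settings(env):
    """Return {provider: keyword arguments} for every provider whose key is set in `env`."""
    settings = {}
    for provider, env_var in ENV_VARS.items():
        key = env.get(env_var)
        if not key:
            continue  # no key: leave this provider out instead of failing later with a 401
        kwargs = {"api_key": key}
        if BASE_URLS[provider] is not None:
            kwargs["base_url"] = BASE_URLS[provider]
        settings[provider] = kwargs
    return settings

settings = client_settings(os.environ)
print(sorted(settings))
```

Each entry can then be passed on as `OpenAI(**settings["gemini"])`. A key that is set but invalid will still be rejected at request time.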


In [5]:
models = []
answers = []

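def answer(client, model):
    """Stream a reply from `model`, render it live, and record it for the judge."""
    stream = client.chat.completions.create(model=model, messages=messages, stream=True)
    prefix = f"### Response from {model}:\n\n"
    reply = ""
    display_handle = display(Markdown(prefix), display_id=True)
    for chunk in stream:
        # A chunk can arrive with an empty choices list; skip it rather than raise IndexError
        if not chunk.choices:
            continue
        reply += chunk.choices[0].delta.content or ''
        update_display(Markdown(prefix+reply), display_id=display_handle.display_id)
    # Count only the visible answer: drop any reasoning that precedes a closing </think> tag
    words = reply.split('</think>')[1] if '</think>' in reply else reply
    reply += f"\n\n#### Calculated true word count: {len(words.split())}"
    update_display(Markdown(prefix+reply), display_id=display_handle.display_id)

    # Record the model name and its reply (including the computed count) for the judge
    models.append(model)
    answers.append(reply)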
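
The word count in `answer` skips any reasoning text that ends with a `</think>` tag. That step can be isolated as a small helper. This is a sketch, and `visible_word_count` is a hypothetical name. Unlike the cell above, it keeps the text after the last `</think>` tag if more than one appears.

```python
def visible_word_count(reply):
    """Count whitespace-separated words, ignoring everything up to the last </think> tag."""
    visible = reply.rsplit("</think>", 1)[-1]
    return len(visible.split())

print(visible_word_count("<think>let me count</think>Two words"))  # prints 2
```

Like the notebook's own count, this includes the model's closing sentence that states the word count.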

In [6]:
answer(openai, "gpt-4.1-mini")

### Response from gpt-4.1-mini:

A rainbow is a graceful arc of shimmering colors that appears in the sky when sunlight passes through raindrops, creating a beautiful spectrum. The imaginary number i is a special number that, when multiplied by itself, equals negative one, allowing us to solve problems involving square roots of negative numbers. Both rainbows and imaginary numbers reveal hidden patterns and possibilities beyond everyday experience, blending reality with imagination. My answer contains 56 words.

#### Calculated true word count: 72

In [7]:
answer(openai, "gpt-5-nano")

### Response from gpt-5-nano:

A rainbow is the curved shape of light in the air after rain, felt not by sight but as a cool breeze, a gentle warmth, and a sense of wonder.  
An imaginary number i is a kind of number that, when multiplied by itself, gives a negative one, and it helps solve equations that real numbers alone can't.  
Both invite us to think beyond ordinary sense; rainbows reveal hidden patterns in light and water, while imaginary numbers reveal hidden patterns in math.  
There are 89 words in this answer.

#### Calculated true word count: 89

In [8]:
answer(openai, "gpt-5")

BadRequestError: Error code: 400 - {'error': {'message': 'Your organization must be verified to stream this model. Please go to: https://platform.openai.com/settings/organization/general and click on Verify Organization. If you just verified, it can take up to 15 minutes for access to propagate.', 'type': 'invalid_request_error', 'param': 'stream', 'code': 'unsupported_value'}}
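
The 400 error above is tied to the `stream` parameter (`'param': 'stream'`), so one option is to retry without streaming when a streamed request is rejected. This is a minimal sketch, assuming the non-streamed call is permitted for the account, which the error message does not confirm. `complete_with_fallback` is a hypothetical helper, not part of the lab.

```python
def complete_with_fallback(client, model, messages):
    """Try a streamed request; if it is rejected, retry once without streaming.

    Returns the full reply text. Catching Exception is deliberately broad for this
    sketch; real code would catch the client library's specific error type.
    """
    try:
        stream = client.chat.completions.create(model=model, messages=messages, stream=True)
        parts = []
        for chunk in stream:
            if chunk.choices:  # skip chunks that carry no choices
                parts.append(chunk.choices[0].delta.content or "")
        return "".join(parts)
    except Exception:
        # Fall back to a single, non-streamed response
        response = client.chat.completions.create(model=model, messages=messages)
        return response.choices[0].message.content
```

The trade-off is that the fallback path shows nothing until the whole reply has arrived.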

In [None]:
answer(anthropic, "claude-sonnet-4-20250514")

In [None]:
answer(gemini, "gemini-2.5-flash-lite")

In [9]:
answer(gemini, "gemini-2.5-pro")

BadRequestError: Error code: 400 - [{'error': {'code': 400, 'message': 'API key not valid. Please pass a valid API key.', 'status': 'INVALID_ARGUMENT', 'details': [{'@type': 'type.googleapis.com/google.rpc.ErrorInfo', 'reason': 'API_KEY_INVALID', 'domain': 'googleapis.com', 'metadata': {'service': 'generativelanguage.googleapis.com'}}, {'@type': 'type.googleapis.com/google.rpc.LocalizedMessage', 'locale': 'en-US', 'message': 'API key not valid. Please pass a valid API key.'}]}}]

In [10]:
# DeepSeek V3.1 Terminus via deepseek-chat (reasoning mode is skipped because it takes too long)

answer(deepseek, "deepseek-chat")

### Response from deepseek-chat:

A rainbow is a vast, gentle arch in the sky painted with the soft, layered light of the sun showing all its hidden colors at once. The imaginary number i is a mathematical idea that answers the question "what number, when multiplied by itself, equals -1?" Both concepts reveal a hidden dimension of reality—colors in pure light and solutions in negative numbers—that is beautiful and real, yet initially invisible to our direct perception. There are 100 words.

#### Calculated true word count: 77

In [None]:
answer(groq, "deepseek-r1-distill-llama-70b")

#### <span style="color: orange;">Question: what's the difference between Grok and Groq and why do they have such similar names?</span>

In [None]:
answer(grok, "grok-4")

In [None]:
!ollama pull llama3.2
!ollama pull gpt-oss

In [12]:
from langchain_ollama import OllamaLLM
ollama_llm = OllamaLLM(
    model="llama3.2:latest",
    base_url="http://host.docker.internal:11434",  # Connect to Ollama on host
    temperature=0.0,  # Low temperature for consistent structured output
    num_predict=2048,  # Adjust based on your needs
    format="json"  # Force JSON output format
)
        

ModuleNotFoundError: No module named 'langchain_ollama'

In [11]:
requests.get("http://localhost:11434").content

ConnectionError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xfbea77e68170>: Failed to establish a new connection: [Errno 111] Connection refused'))
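
The `Connection refused` error above means nothing is listening on port 11434 of `localhost` from where the notebook runs. A quick reachability check before calling the local models can make that explicit. This is a sketch using only the standard library, and `port_is_open` is a hypothetical helper name.

```python
import socket

def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 11434 is the port used in ollama_url above
if not port_is_open("localhost", 11434):
    print("Ollama is not reachable on localhost:11434; skipping the local models")
```

This only confirms that something accepts connections on the port. It does not verify that it is Ollama or that a given model has been pulled.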

In [13]:
answer(ollama, "llama3.2")

APIConnectionError: Connection error.

In [14]:
answer(ollama, "gpt-oss:20b")

APIConnectionError: Connection error.

In [None]:
answer(groq, "openai/gpt-oss-120b")

In [None]:
len(models)

## LLM as a Judge

In [15]:
together = ""
# The loop variable is named `reply` so that it does not overwrite the answer() function
for index, reply in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += reply + "\n\n"

In [16]:
display(Markdown(together))

# Response from competitor 1

A rainbow is a graceful arc of shimmering colors that appears in the sky when sunlight passes through raindrops, creating a beautiful spectrum. The imaginary number i is a special number that, when multiplied by itself, equals negative one, allowing us to solve problems involving square roots of negative numbers. Both rainbows and imaginary numbers reveal hidden patterns and possibilities beyond everyday experience, blending reality with imagination. My answer contains 56 words.

#### Calculated true word count: 72

# Response from competitor 2

A rainbow is the curved shape of light in the air after rain, felt not by sight but as a cool breeze, a gentle warmth, and a sense of wonder.  
An imaginary number i is a kind of number that, when multiplied by itself, gives a negative one, and it helps solve equations that real numbers alone can't.  
Both invite us to think beyond ordinary sense; rainbows reveal hidden patterns in light and water, while imaginary numbers reveal hidden patterns in math.  
There are 89 words in this answer.

#### Calculated true word count: 89

# Response from competitor 3

A rainbow is a vast, gentle arch in the sky painted with the soft, layered light of the sun showing all its hidden colors at once. The imaginary number i is a mathematical idea that answers the question "what number, when multiplied by itself, equals -1?" Both concepts reveal a hidden dimension of reality—colors in pure light and solutions in negative numbers—that is beautiful and real, yet initially invisible to our direct perception. There are 100 words.

#### Calculated true word count: 77



In [17]:
judge = f"""You are judging a competition between {len(models)} competitors.
Each model has been given this question:

{message}

Your job is to evaluate each response for clarity and strength of argument and accuracy of word count, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""

In [18]:
display(Markdown(judge))

You are judging a competition between 3 competitors.
Each model has been given this question:

In 1 sentence, describe a rainbow to someone who's never been able to see. Then in 1 sentence, describe the imaginary number i to someone who doesn't understand math. Then in 1 sentence, find a connection between rainbows and imaginary numbers. Then end by stating how many words are in your answer.

Your job is to evaluate each response for clarity and strength of argument and accuracy of word count, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}

Here are the responses from each competitor:

# Response from competitor 1

A rainbow is a graceful arc of shimmering colors that appears in the sky when sunlight passes through raindrops, creating a beautiful spectrum. The imaginary number i is a special number that, when multiplied by itself, equals negative one, allowing us to solve problems involving square roots of negative numbers. Both rainbows and imaginary numbers reveal hidden patterns and possibilities beyond everyday experience, blending reality with imagination. My answer contains 56 words.

#### Calculated true word count: 72

# Response from competitor 2

A rainbow is the curved shape of light in the air after rain, felt not by sight but as a cool breeze, a gentle warmth, and a sense of wonder.  
An imaginary number i is a kind of number that, when multiplied by itself, gives a negative one, and it helps solve equations that real numbers alone can't.  
Both invite us to think beyond ordinary sense; rainbows reveal hidden patterns in light and water, while imaginary numbers reveal hidden patterns in math.  
There are 89 words in this answer.

#### Calculated true word count: 89

# Response from competitor 3

A rainbow is a vast, gentle arch in the sky painted with the soft, layered light of the sun showing all its hidden colors at once. The imaginary number i is a mathematical idea that answers the question "what number, when multiplied by itself, equals -1?" Both concepts reveal a hidden dimension of reality—colors in pure light and solutions in negative numbers—that is beautiful and real, yet initially invisible to our direct perception. There are 100 words.

#### Calculated true word count: 77



Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks.

In [19]:
judge_messages = [{"role": "user", "content": judge}]
response = groq.chat.completions.create(model="openai/gpt-oss-120b", messages=judge_messages)
results = response.choices[0].message.content
results


AuthenticationError: Error code: 401 - {'error': {'message': 'Invalid API Key', 'type': 'invalid_request_error', 'code': 'invalid_api_key'}}

In [None]:
# Parse the judge's JSON and map each 1-based competitor number back to its model name
results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = models[int(result)-1]
    print(f"Rank {index+1}: {competitor}")
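
The parsing step above assumes the judge returns bare JSON with a valid ranking. A more defensive version might tolerate a Markdown code fence and check that every competitor appears exactly once. This is a sketch. `parse_ranking` is a hypothetical helper, and the judge reply in the usage lines is a made-up input for illustration, not output from a real run.

```python
import json

FENCE = "`" * 3  # a Markdown code fence marker

def parse_ranking(raw, competitor_names):
    """Turn the judge's JSON reply into an ordered list of competitor names.

    Tolerates a Markdown code fence around the JSON and raises ValueError if the
    ranking is not a permutation of 1..len(competitor_names).
    """
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with any language tag) and the closing fence
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    ranks = [int(r) for r in json.loads(text)["results"]]
    if sorted(ranks) != list(range(1, len(competitor_names) + 1)):
        raise ValueError(f"Unexpected ranking: {ranks}")
    return [competitor_names[r - 1] for r in ranks]

# Illustrative input only: a hypothetical judge reply for the three models that answered
names = ["gpt-4.1-mini", "gpt-5-nano", "deepseek-chat"]
for position, name in enumerate(parse_ranking('{"results": ["2", "3", "1"]}', names), start=1):
    print(f"Rank {position}: {name}")
```

Validating the permutation catches a judge that repeats or omits a competitor, which would otherwise produce a misleading ranking without any error.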