### Chat Completion API

- The simplest way to call an LLM
- It's called Chat Completions because the request is saying: "here is a conversation, please predict what should come next"
- The Chat Completions API was introduced by OpenAI, and it has become so popular that many other providers now offer compatible endpoints
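
To make the "here is a conversation" idea concrete, here is a minimal sketch that builds a multi-turn message list by hand. The helper `add_turn`, the `COMMON_ROLES` set, and the sample text are assumptions made up for illustration; only the `role`/`content` dictionary shape comes from the Chat Completions format.

```python
# A conversation is an ordered list of {"role", "content"} dictionaries.
# add_turn is a hypothetical helper for illustration, not part of any library.

# The roles used in this notebook; the API defines others too (e.g. "tool").
COMMON_ROLES = {"system", "user", "assistant"}

def add_turn(messages, role, content):
    """Return a new message list with one more turn appended."""
    if role not in COMMON_ROLES:
        raise ValueError(f"Unknown role: {role!r}")
    return messages + [{"role": role, "content": content}]

conversation = []
conversation = add_turn(conversation, "system", "You are a helpful assistant.")
conversation = add_turn(conversation, "user", "Tell me a fun fact")
conversation = add_turn(conversation, "assistant", "Honey never spoils.")
conversation = add_turn(conversation, "user", "Tell me another fun fact")

# The model receives the whole list and predicts the next assistant message.
for message in conversation:
    print(f"{message['role']:>9}: {message['content']}")
```

Because the API is stateless, a follow-up question only makes sense to the model if the earlier turns are sent again in the same list.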

In [1]:
# imports

import os
from dotenv import load_dotenv
import requests

In [2]:
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")

API key found and looks good so far!


Here is how to set up and call the simplest Chat Completions API request:

In [5]:
headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}

payload = {
    "model": "gpt-5-nano",
    "messages": [
        {"role": "user", "content": "Tell me a fun fact"}
    ]
}

payload

{'model': 'gpt-5-nano',
 'messages': [{'role': 'user', 'content': 'Tell me a fun fact'}]}

In [4]:
# call the api through requests - using the url/endpoint

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json=payload
)

response.json()

{'id': 'chatcmpl-Ce99EsxYRHt4NOu27OdmjVoxMk6Dk',
 'object': 'chat.completion',
 'created': 1763684320,
 'model': 'gpt-5-nano-2025-08-07',
 'choices': [{'index': 0,
   'message': {'role': 'assistant',
    'content': 'Honey never spoils. Archaeologists have found pots of edible honey in ancient Egyptian tombs, thousands of years old.',
    'refusal': None,
    'annotations': []},
   'finish_reason': 'stop'}],
 'usage': {'prompt_tokens': 11,
  'completion_tokens': 865,
  'total_tokens': 876,
  'prompt_tokens_details': {'cached_tokens': 0, 'audio_tokens': 0},
  'completion_tokens_details': {'reasoning_tokens': 832,
   'audio_tokens': 0,
   'accepted_prediction_tokens': 0,
   'rejected_prediction_tokens': 0}},
 'service_tier': 'default',
 'system_fingerprint': None}
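
The response above is a nested dictionary, so the reply text and the token counts have to be dug out by key. The sketch below does that on a trimmed copy of the response (the content string is shortened, and `summarize_response` is an illustrative name), so it runs without a network call. Note that 832 of the 865 completion tokens were reasoning tokens, leaving 33 for the visible answer.

```python
# Trimmed copy of the response shown above, so this runs without a network call.
sample = {
    "model": "gpt-5-nano-2025-08-07",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Honey never spoils."},
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 11,
        "completion_tokens": 865,
        "total_tokens": 876,
        "completion_tokens_details": {"reasoning_tokens": 832},
    },
}

def summarize_response(data):
    """Extract the reply text and a token breakdown from a response dict."""
    choice = data["choices"][0]
    usage = data["usage"]
    # Not every provider returns this breakdown, so fall back to an empty dict.
    details = usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens", 0)
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage["prompt_tokens"],
        "reasoning_tokens": reasoning,
        "visible_tokens": usage["completion_tokens"] - reasoning,
    }

result = summarize_response(sample)
print(result["text"])
print(f"{result['reasoning_tokens']} reasoning + {result['visible_tokens']} visible completion tokens")
```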

OpenAI provides a Python package that makes this API call simpler. It is nothing more than a wrapper around this exact call to the HTTP endpoint, so an API call becomes as simple as calling a function. It lets you work with plain Python objects instead of building and parsing raw JSON by hand.

But that's it. It's open-source and lightweight. Some people think it contains OpenAI model code - it doesn't!
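
To illustrate how thin such a wrapper can be, here is a rough sketch of the request-building step. `build_chat_request` is a hypothetical name invented for this example, not a function from the `openai` package, and the real client does more (retries, typed responses, streaming).

```python
# Hypothetical sketch of what a thin client wrapper does before sending a request.
# build_chat_request is an illustrative name, not a function from the openai package.

def build_chat_request(api_key, model, messages, base_url="https://api.openai.com/v1"):
    """Assemble the URL, headers, and JSON body for a chat completion call."""
    return {
        "url": base_url.rstrip("/") + "/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": {"model": model, "messages": messages},
    }

request = build_chat_request(
    api_key="sk-proj-placeholder",  # placeholder, not a real key
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Tell me a fun fact"}],
)
print(request["url"])
```

Passing the result to `requests.post(**request)` would send the same call as the `requests` cell earlier in this notebook.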


In [None]:
# Create an OpenAI client - it makes the same HTTP call as the requests example above,
# but with much less code. OpenAI() reads OPENAI_API_KEY from the environment.

from openai import OpenAI
openai = OpenAI()

response = openai.chat.completions.create(model="gpt-5-nano", messages=[{"role": "user", "content": "Tell me a fun fact"}])

response.choices[0].message.content

'Wombats poop cube-shaped feces, a shape that helps it stack and prevents it from rolling away.'

Other frontier model providers also offer OpenAI-compatible endpoints.

Here is an example that calls Google's Gemini model through the OpenAI client:

In [7]:
GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"

google_api_key = os.getenv("GOOGLE_API_KEY")

if not google_api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif not google_api_key.startswith("AIz"):
    print("An API key was found, but it doesn't start AIz")
else:
    print("API key found and looks good so far!")

API key found and looks good so far!


In [8]:
gemini = OpenAI(base_url=GEMINI_BASE_URL, api_key=google_api_key)

response = gemini.chat.completions.create(model="gemini-2.5-pro", messages=[{"role": "user", "content": "Tell me a fun fact"}])

response.choices[0].message.content

'The national animal of Scotland is the **unicorn**.\n\nIt was chosen in the 15th century because in Celtic mythology, the unicorn symbolized purity, innocence, and power. It was also seen as a symbol of Scottish independence, as it was believed to be the natural enemy of the lion—the symbol of England.'

### Now we can try Ollama using the same approach

In [9]:
requests.get("http://localhost:11434").content

b'Ollama is running'

### Download llama3.2 from Meta

In [10]:
!ollama pull llama3.2

pulling manifest
pulling dde5aa3fc5ff: 100% ▕██████████████████▏ 2.0 GB
pulling 966de95ca8a6: 100% ▕██████████████████▏ 1.4 KB
pulling fcc5a6bec9da: 100% ▕██████████████████▏ 7.7 KB
pulling a70ff7e570d9: 100% ▕██████████████████▏ 6.0 KB
pulling 56bb8bd477a5: 100% ▕██████████████████▏   96 B
pulling 34bb5ab01051: 100% ▕██████████████████▏  561 B
verifying sha256 digest
writing manifest
success


In [11]:
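# Point the OpenAI client at Ollama's OpenAI-compatible endpoint.
# The api_key is required by the client but ignored by Ollama, so any placeholder works.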
OLLAMA_BASE_URL = "http://localhost:11434/v1"

ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')

In [14]:
response = ollama.chat.completions.create(model="llama3.2", messages=[{"role": "user", "content": "Tell me another fun fact"}])

response.choices[0].message.content

'Here\'s another fun fact:\n\nDid you know that there is a type of jellyfish that is immortal? The Turritopsis dohrnii, also known as the "immortal jellyfish," is a species of jellyfish that can transform its body into a younger state through a process called transdifferentiation. This means that it can essentially revert back to its polyp stage, which is the juvenile form of a jellyfish, and then grow back into an adult again. This process can be repeated indefinitely, making Turritopsis dohrnii theoretically immortal.\n\nIsn\'t that just amazing?'
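
Since OpenAI, Gemini, and Ollama are all reached through the same client class, the only things that change are the base URL and the key. One way to keep that in one place is a small lookup table, sketched below. `PROVIDERS` and `client_kwargs` are illustrative names, not part of any library; the URLs and environment variable names are the ones used in this notebook.

```python
import os

# Base URLs and environment variable names copied from this notebook.
# PROVIDERS and client_kwargs are illustrative names, not part of any library.
PROVIDERS = {
    "openai": {"base_url": None, "key_env": "OPENAI_API_KEY"},
    "gemini": {
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
        "key_env": "GOOGLE_API_KEY",
    },
    # Ollama runs locally and ignores the key, so a placeholder string is used.
    "ollama": {"base_url": "http://localhost:11434/v1", "key_env": None},
}

def client_kwargs(provider):
    """Return keyword arguments suitable for OpenAI(**kwargs)."""
    config = PROVIDERS[provider]
    if config["key_env"] is None:
        api_key = "ollama"
    else:
        api_key = os.getenv(config["key_env"])
        if not api_key:
            raise RuntimeError(f"{config['key_env']} is not set")
    kwargs = {"api_key": api_key}
    if config["base_url"]:
        kwargs["base_url"] = config["base_url"]
    return kwargs

print(client_kwargs("ollama"))
```

With this in place, `OpenAI(**client_kwargs("gemini"))` would build the same client as the Gemini cell above, assuming `GOOGLE_API_KEY` is set.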

In [16]:
# Now let's try deepseek-r1:1.5b - this is DeepSeek-R1 "distilled" into a small Qwen model from Alibaba Cloud

!ollama pull deepseek-r1:1.5b

pulling manifest
pulling aabd4debf0c8: 100% ▕██████████████████▏ 1.1 GB
pulling c5ad996bda6e: 100% ▕██████████████████▏  556 B
pulling 6e4c38e1172f: 100% ▕██████████████████▏ 1.1 KB
pulling f4d24e9138dd: 100% ▕██████████████████▏  148 B
pulling a85fe2a2e58e: 100% ▕██████████████████▏  487 B
verifying sha256 digest
writing manifest
success


In [17]:
response = ollama.chat.completions.create(model="deepseek-r1:1.5b", messages=[{"role": "user", "content": "Tell me another fun fact"}])

response.choices[0].message.content

"Sure! Here's another fun fact: **Some animal species predate humans long before they become humans, capable of telling time through body temperature!** For example, newborn babies can detect changes in blood temperature, which they use as a sense of order."

Rebuilding the earlier website summarizer here, this time using a local Ollama model

In [18]:
from web_scraper import fetch_website_contents
from IPython.display import Markdown, display

In [None]:
system_prompt = """
You are a snarky assistant that analyzes the contents of a website,
and provides a short, snarky, humorous summary, ignoring text that might be navigation related.
Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.
"""

user_prompt = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news or announcements, then summarize these too.
"""

website_url = "https://edwarddonner.com"

contents = fetch_website_contents(website_url)

response = ollama.chat.completions.create(model="llama3.2", messages=[{"role": "system", "content": system_prompt}, {"role": "user", "content": user_prompt + "\n\n" + contents}])

display(Markdown(response.choices[0].message.content))

**Summary**
This website belongs to Edward Donner, a snarky AI enthusiast who founded Nebula.io, an AI startup applying machine learning to help people discover their potential. He shares his interests in writing code and experimenting with Large Language Models (LLMs), electronic music production, and Hacker News.

**News/Announcements**

* Recent posts:
	+ 2025 AI Executive Briefing (November 11, 2025): Donner will be talking about the future of AI.
	+ Be an AI Engineer and Leader: The Curriculum (May 18, 2025): A new course for aspiring engineers and leaders.
	+ AI in Production: Gen AI and Agentic AI on AWS at scale (September 15, 2025): Donner discusses a new project on applying AI technology to production environments.
	+ May 28, 2024 is listed but seems out of date

In [24]:
def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt + "\n\n" + website}
    ]

def summarize_website(website):
    website_contents = fetch_website_contents(website)
    messages = messages_for(website_contents)
    response = ollama.chat.completions.create(model="llama3.2", messages=messages)
    return response.choices[0].message.content

def display_summary(url):
    summary = summarize_website(url)
    display(Markdown(summary))
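
The functions above hard-code the `ollama` client and the `llama3.2` model. Because every client in this notebook exposes the same `chat.completions.create` call, one possible variation is to pass the client and model in as arguments. The sketch below shows that idea with a stub client so it runs offline; `summarize_text` and `StubCompletions` are hypothetical names for illustration only.

```python
from types import SimpleNamespace

def summarize_text(contents, client, model, system_prompt, user_prompt):
    """Summarize already-fetched page text with any OpenAI-compatible client."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt + "\n\n" + contents},
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

class StubCompletions:
    """Stand-in for client.chat.completions so the sketch runs offline."""
    def create(self, model, messages):
        reply = f"[{model}] received {len(messages)} messages"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

# Mimics the attribute path client.chat.completions.create(...)
stub_client = SimpleNamespace(chat=SimpleNamespace(completions=StubCompletions()))

summary = summarize_text(
    contents="Example page text",
    client=stub_client,
    model="llama3.2",
    system_prompt="You summarize websites.",
    user_prompt="Provide a short summary of this website.",
)
print(summary)
```

Swapping `stub_client` for the `ollama`, `gemini`, or `openai` client (with a matching model name) would send the same messages to a real model.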

In [25]:
display_summary("https://edwarddonner.com")

**Meet Ed Donner, a wannabe DJ who's really into AI (not the music)**

Ed is a co-founder and CTO of Nebula.io, an AI startup that's all about helping people discover their potential. He's also a fan of amateur electronic music production (ouch) and likes to noodle around with Large Language Models.

**News & Announcements**

* Ed published a newsletter announcing the "2025 AI Executive Briefing", but no actual date or content mentioned.
* Recently shared "AI in Production: Gen AI and Agentic AI on AWS at scale" from September 2025, likely about his own company's tech.
* Earlier still shared info on a May 2025 event called "Become an AI Engineer and Leader: The Curriculum", but little more to say.

In [26]:
display_summary("https://cnn.com")

This website appears to be a news hub for CNN, a 24-hour cable news channel. The most significant content on the homepage includes breaking news, recent news headlines, and videos discussing various global events like the Ukraine-Russia War and the Israel-Hamas War.

Some announcements were spotted, including warnings about potential issues with in-stream ads (slowness, freezing, etc.) and calls to submit feedback about these ad experiences.
 
 CNN is actively seeking input on its ads' effectiveness, aiming to improve user experience.

In [27]:
display_summary("https://health-chain.io")


# The Ultimate Healthcare Data Enablers

It turns out that Health Chain is all about helping healthcare providers and payers "future proof" their operations by leveraging clean data. They offer a range of solutions, from interoperability and data management to patient engagement and prior authorization. Their flagship platform uses FHIR APIs and offers features like real-time data processing and data observability.

Looks like they've also got some exciting announcements to share:

* **Product Release**: Introducing the Hyperion Data Platform for standardizing, analyzing, and transforming healthcare data.
* **Webinar Schedule**: Catch their upcoming webinars on topics like FHIR API adoption and data-driven care strategies.
* **New eBook**: Download their latest eBook to learn more about "Unlocking the Power of Healthcare Data".