Alternate way to call the OpenAI API (not recommended)

In [1]:
import os
from dotenv import load_dotenv
from IPython.display import Markdown, display

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!


In [2]:
import requests

headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}

payload = {
    "model": "gpt-5-nano",
    "messages": [
        {"role": "user", "content": "Tell me a fun fact"}]
}

payload

{'model': 'gpt-5-nano',
 'messages': [{'role': 'user', 'content': 'Tell me a fun fact'}]}

In [3]:
response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json=payload
)

response.json()

{'id': 'chatcmpl-CbVh7SUNKA9ek8FQFO9aygk77hAWT',
 'object': 'chat.completion',
 'created': 1763056005,
 'model': 'gpt-5-nano-2025-08-07',
 'choices': [{'index': 0,
   'message': {'role': 'assistant',
    'content': 'Fun fact: Bananas are technically berries, while strawberries aren’t. Botanically, a berry comes from a single ovary with seeds embedded in the flesh—which applies to bananas, but not strawberries. Want another fun fact?',
    'refusal': None,
    'annotations': []},
   'finish_reason': 'stop'}],
 'usage': {'prompt_tokens': 11,
  'completion_tokens': 566,
  'total_tokens': 577,
  'prompt_tokens_details': {'cached_tokens': 0, 'audio_tokens': 0},
  'completion_tokens_details': {'reasoning_tokens': 512,
   'audio_tokens': 0,
   'accepted_prediction_tokens': 0,
   'rejected_prediction_tokens': 0}},
 'service_tier': 'default',
 'system_fingerprint': None}
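
The `usage` block above is worth a closer look: most of the completion tokens were reasoning tokens rather than visible answer text. The sketch below copies the relevant numbers from the response and separates the two. The helper name `summarize_usage` is made up for illustration, and it assumes that reasoning tokens are counted inside `completion_tokens`, which is consistent with the totals shown (11 + 566 = 577).

```python
# Numbers copied from the 'usage' field of the response above.
usage = {
    "prompt_tokens": 11,
    "completion_tokens": 566,
    "total_tokens": 577,
    "completion_tokens_details": {"reasoning_tokens": 512},
}


def summarize_usage(usage):
    """Split completion tokens into reasoning tokens and visible answer tokens."""
    details = usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens", 0)
    completion = usage["completion_tokens"]
    return {
        "prompt": usage["prompt_tokens"],
        "reasoning": reasoning,
        "visible_answer": completion - reasoning,
        "total": usage["total_tokens"],
    }


summary = summarize_usage(usage)
print(summary)
# {'prompt': 11, 'reasoning': 512, 'visible_answer': 54, 'total': 577}
```

So only 54 of the 566 completion tokens correspond to the short fun fact itself.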

In [4]:
funfact = response.json()["choices"][0]["message"]["content"]
display(Markdown(funfact))

Fun fact: Bananas are technically berries, while strawberries aren’t. Botanically, a berry comes from a single ovary with seeds embedded in the flesh—which applies to bananas, but not strawberries. Want another fun fact?
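
Indexing straight into `response.json()["choices"][0]` works when the call succeeds, but it raises an unhelpful `KeyError` when the API returns an error body instead. A minimal sketch of a more defensive extraction follows. The helper name `extract_content` is hypothetical, and it assumes an error body shaped like `{"error": {"message": ...}}`.

```python
def extract_content(body):
    """Return the assistant text from a chat-completions JSON body, or raise a readable error."""
    if "error" in body:
        message = body["error"].get("message", "unknown error")
        raise RuntimeError(f"API returned an error: {message}")
    choices = body.get("choices") or []
    if not choices:
        raise RuntimeError("Response contained no choices")
    return choices[0]["message"]["content"]


# A trimmed-down success body, shaped like the response shown earlier.
ok_body = {"choices": [{"index": 0, "message": {"role": "assistant", "content": "hi"}}]}
print(extract_content(ok_body))
```

In the notebook this could be used as `funfact = extract_content(response.json())`.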

OpenAI Python package (openai): this is the official SDK for accessing OpenAI’s API (including GPT-4, GPT-5, DALL·E, Whisper, etc.).
The package is essentially a wrapper around the HTTP requests we made earlier.
Internally, it does the same work we did by hand with the HTTP request.
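
To make the "wrapper" idea concrete, here is a rough sketch of what such a client does: it stores the key and base URL once, then builds the same headers and payload we wrote by hand. `TinyChatClient` is a hypothetical stand-in written for this illustration, not the real `openai` package, and the fake transport exists only so the example runs without network access.

```python
class TinyChatClient:
    """Hypothetical, stripped-down stand-in for an SDK client (not the real openai package)."""

    def __init__(self, api_key, base_url="https://api.openai.com/v1", post=None):
        self.api_key = api_key
        self.base_url = base_url.rstrip("/")
        if post is None:
            import requests  # lazy import: only needed for real network calls
            post = requests.post
        self._post = post

    def chat(self, model, messages):
        """Send one chat-completions request and return the assistant's text."""
        response = self._post(
            f"{self.base_url}/chat/completions",
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
            json={"model": model, "messages": messages},
        )
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]


# Offline demo: a fake transport records the request instead of sending it.
class FakeResponse:
    def raise_for_status(self):
        pass

    def json(self):
        return {"choices": [{"message": {"role": "assistant", "content": "hello"}}]}


sent = []


def fake_post(url, headers, json):
    sent.append({"url": url, "headers": headers, "json": json})
    return FakeResponse()


client = TinyChatClient("sk-test", post=fake_post)
reply = client.chat("gpt-5-nano", [{"role": "user", "content": "Tell me a fun fact"}])
print(reply, sent[0]["url"])
```

The real SDK adds much more on top (typed response objects, for instance), but the request it sends has the same shape.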

In [5]:
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(model="gpt-4o-mini", messages=[{"role": "user", "content": "Tell me a fun fact"}])
response.choices[0].message.content

"Sure! Did you know that honey never spoils? Archaeologists have discovered pots of honey in ancient Egyptian tombs that are over 3,000 years old and still perfectly edible! Honey's high sugar content, low moisture, and natural acidity create an inhospitable environment for bacteria and other microorganisms, allowing it to last indefinitely."

# The best part about the OpenAI Python package :

The best part about the OpenAI Python package is that it is not limited to GPT models. Other providers expose OpenAI-compatible endpoints, so the same client can call their models as well, for example Gemini.
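
One way to picture this is as a small table of providers, where the only things that change are the base URL and the API key. The sketch below uses the Gemini and Ollama endpoints that appear later in this notebook. The names `PROVIDERS` and `client_kwargs` are invented for illustration and are not part of any library.

```python
# Hypothetical registry: base URLs are the ones used in this notebook.
PROVIDERS = {
    "openai": {"base_url": None, "key_env": "OPENAI_API_KEY"},
    "gemini": {
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
        "key_env": "GOOGLE_API_KEY",
    },
    "ollama": {"base_url": "http://localhost:11434/v1", "key_env": None},
}


def client_kwargs(provider, env):
    """Build keyword arguments for OpenAI(...) for the given provider."""
    config = PROVIDERS[provider]
    kwargs = {}
    if config["base_url"]:
        kwargs["base_url"] = config["base_url"]
    if config["key_env"]:
        kwargs["api_key"] = env[config["key_env"]]  # KeyError if the key is missing
    else:
        kwargs["api_key"] = "ollama"  # placeholder; a local Ollama server does not check it
    return kwargs


print(client_kwargs("ollama", {}))
```

With the package installed, a client could then be created as `OpenAI(**client_kwargs("gemini", os.environ))`.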

In [6]:
GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"

google_api_key = os.getenv("GOOGLE_API_KEY")

if not google_api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif not google_api_key.startswith("AIz"):
    print("An API key was found, but it doesn't start AIz")
else:
    print("API key found and looks good so far!")

No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!


In [7]:
# This call fails unless GOOGLE_API_KEY is set in your ".env" file:
gemini = OpenAI(base_url=GEMINI_BASE_URL, api_key=google_api_key)

response = gemini.chat.completions.create(model="gemini-2.5-pro", messages=[{"role": "user", "content": "Tell me a fun fact"}])

response.choices[0].message.content

BadRequestError: Error code: 400 - [{'error': {'code': 400, 'message': 'API key not valid. Please pass a valid API key.', 'status': 'INVALID_ARGUMENT', 'details': [{'@type': 'type.googleapis.com/google.rpc.ErrorInfo', 'reason': 'API_KEY_INVALID', 'domain': 'googleapis.com', 'metadata': {'service': 'generativelanguage.googleapis.com'}}, {'@type': 'type.googleapis.com/google.rpc.LocalizedMessage', 'locale': 'en-US', 'message': 'API key not valid. Please pass a valid API key.'}]}}]
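
The 400 error above only appears after the request has already been sent without a valid Google key. A small guard can fail earlier with a clearer message. This is a sketch, and `require_env` is a hypothetical helper name.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file before calling the API")
    return value


# Demo with an explicit dict so the example does not depend on the real environment.
print(require_env("GOOGLE_API_KEY", env={"GOOGLE_API_KEY": "AIz-example"}))
```

In the cell above, the client could then be built as `OpenAI(base_url=GEMINI_BASE_URL, api_key=require_env("GOOGLE_API_KEY"))`.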

We can also use the OpenAI Python package to call a local LLM server such as Ollama.

First, let's check whether the local LLM server (Ollama) is running.

It should print the response from the local LLM server as follows:
# b'Ollama is running'

If the local LLM server is not running, you can start it with the following command:
# ollama serve

In [8]:
requests.get("http://localhost:11434").content

b'Ollama is running'
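
If the server is not running, the `requests.get` call above raises a connection error instead of returning a value. The sketch below turns the check into a True/False answer using only the standard library. The function name `ollama_is_running` is made up for this example, and the banner text is the one shown in the output above.

```python
import http.client
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the server at base_url answers with the Ollama banner."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return b"Ollama is running" in response.read()
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, HTTP error status, or a malformed reply.
        return False
```

Calling `ollama_is_running()` before creating the client makes it easy to print a reminder to run `ollama serve`.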

# Pull an Ollama model :
We need to download an Ollama model before using it. Otherwise, the request will fail with an error.

We can download an Ollama model using the following command.
# !ollama pull llama3.2
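
Before pulling, it can be useful to check whether the model is already present locally. Ollama lists local models at its `/api/tags` endpoint. The sketch below only parses such a reply. It assumes the reply is a JSON object with a `models` list whose entries carry a `name` field, and the helper name `model_is_available` and the sample data are illustrative.

```python
def model_is_available(tags, name):
    """Check a parsed /api/tags reply for a model, treating 'llama3.2' as 'llama3.2:latest'."""
    wanted = name if ":" in name else f"{name}:latest"
    return any(model.get("name") == wanted for model in tags.get("models", []))


# Sample shaped like an /api/tags reply (illustrative values, not real output).
sample_tags = {"models": [{"name": "llama3.2:latest"}, {"name": "deepseek-r1:1.5b"}]}

print(model_is_available(sample_tags, "llama3.2"))
print(model_is_available(sample_tags, "mistral"))
```

Against a running server, the input could come from `requests.get("http://localhost:11434/api/tags").json()`.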

In [9]:
!ollama pull llama3.2

pulling manifest

In [10]:
OLLAMA_BASE_URL = "http://localhost:11434/v1"

# Ollama does not use the api_key for authentication, but the OpenAI Python package requires one, so a placeholder such as 'ollama' works.
ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')

response = ollama.chat.completions.create(model="llama3.2", messages=[{"role": "user", "content": "Tell me a fun fact"}])

response.choices[0].message.content

"Here's one:\n\nDid you know that honey never spoils? Archaeologists have found pots of honey in ancient Egyptian tombs that are over 3,000 years old and still perfectly edible! Honey's unique properties, such as its low moisture content and acidity, make it virtually immortal. Is that sweet or what?"

Now let's try another open-source LLM:

### 🧩 deepseek-r1:1.5b

DeepSeek-R1 is a language model from the DeepSeek project, a set of open-source AI models.

1.5b means the model has about 1.5 billion parameters, which indicates its approximate size and capacity.

So deepseek-r1:1.5b is a relatively small but capable DeepSeek model.

# DeepSeek is ‘distilled’ into Qwen from Alibaba Cloud.

Let's learn about distillation first.
# Distillation (or knowledge distillation) :
Training a smaller model (the student) to mimic the behavior of a larger, more capable model (the teacher).

# How they did it :
They took the 1.5-billion-parameter version of Qwen (the smaller model) and trained it further on extra data. That extra data was generated by DeepSeek (the big model), which produced a large volume of examples demonstrating its reasoning. The small Qwen model was then trained to replicate DeepSeek's behavior. This process is known as "distillation".

So, the DeepSeek model’s capabilities or style were compressed or transferred into another model: Qwen from Alibaba Cloud.

# Detailed explanation of how it works :

The DeepSeek model created its own training examples: it produced outputs that show step-by-step reasoning, explanations, or problem-solving.

This self-generated data was then used to train or refine a smaller model, Qwen.

Example: the large DeepSeek-R1 model is used to generate reasoning traces (detailed thought steps), which are then distilled into a smaller “student” model (like deepseek-r1:1.5b).

The large amount of generated reasoning data helps make the smaller model appear more intelligent, because it has “learned” from those reasoning patterns.
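
The teacher/student idea can be shown with a deliberately tiny analogy. This is not how DeepSeek was actually trained, since real distillation fine-tunes a neural network on generated text. Here the "teacher" is just a function, the "student" is a two-parameter line, and the student only ever sees data the teacher produced. All names are hypothetical.

```python
def teacher(x):
    """Stand-in for the large model: maps an input to the output we want to imitate."""
    return 2.0 * x + 1.0


def generate_teacher_data(inputs):
    """The teacher labels raw inputs, producing the student's training set."""
    return [(x, teacher(x)) for x in inputs]


def train_student(data):
    """Fit a two-parameter student (slope, intercept) by least squares on teacher-labelled data."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in data)
    var = sum((x - mean_x) ** 2 for x, _ in data)
    slope = cov / var
    return slope, mean_y - slope * mean_x


data = generate_teacher_data([0.0, 1.0, 2.0, 3.0, 4.0])
slope, intercept = train_student(data)


def student(x):
    """The distilled model: imitates the teacher using only what it learned from the data."""
    return slope * x + intercept


print(student(10.0), teacher(10.0))
```

The student never looks inside the teacher. It reproduces the teacher's behavior purely from generated examples, which is the core of the distillation idea described above.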


# Pull deepseek-r1:1.5b model :
To use the deepseek-r1:1.5b model, we first need to pull it with this command:

# !ollama pull deepseek-r1:1.5b

In [11]:
!ollama pull deepseek-r1:1.5b

pulling manifest
pulling aabd4debf0c8: 100% 1.1 GB
pulling c5ad996bda6e: 100% 556 B
pulling 6e4c38e1172f: 100% 1.1 KB
pulling f4d24e9138dd: 100%

In [12]:
OLLAMA_BASE_URL = "http://localhost:11434/v1"

# Ollama does not use the api_key for authentication, but the OpenAI Python package requires one, so a placeholder such as 'ollama' works.
ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')

response = ollama.chat.completions.create(model="deepseek-r1:1.5b", messages=[{"role": "user", "content": "Tell me a fun fact"}])

response.choices[0].message.content

"Sure! Here's a fun fact for you:\n\nDid you know that it takes over 30 years to shovel dirt in a full-circle wheel (called a shovellah) in the dense mud of an avalanche? And even then, it could take thousands of rotations!\n\nThat’s pretty awesome when you consider how challenging the terrain can be! 😊"

# A Life Saver Trick:
### Use "Think Step by Step" for better accuracy ###

If you add the phrase "Think step by step" to the chat prompt, you will often get a better result, because the model is nudged into deliberate, multi-step logical reasoning before it produces an answer. This tends to improve accuracy, but the tradeoff is speed.
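
A small helper can append the phrase consistently. This is a sketch: `with_step_by_step` is a hypothetical name, and the sample question is only an illustration.

```python
def with_step_by_step(prompt, hint="Think step by step."):
    """Return a messages list whose user prompt ends with the reasoning hint."""
    text = prompt.rstrip()
    if not text.lower().endswith(hint.lower()):
        text = f"{text}\n\n{hint}"
    return [{"role": "user", "content": text}]


messages = with_step_by_step(
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost?"
)
print(messages[0]["content"])
```

The resulting list can be passed as `messages=messages` to any of the `chat.completions.create` calls above. Timing the call with and without the hint is a simple way to see the speed tradeoff.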