In [36]:
# Pull a specific model locally
!ollama pull llama3.2
# The leading ! runs a shell command from the notebook; omit it when running this in a terminal

[?2026h[?25l[1Gpulling manifest ⠋ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠙ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠹ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest [K
pulling dde5aa3fc5ff: 100% ▕██████████████████▏ 2.0 GB
pulling 966de95ca8a6: 100% ▕██████████████████▏ 1.4 KB
pulling fcc5a6bec9da: 100% ▕██████████████████▏ 7.7 KB
pulling a70ff7e570d9: 100% ▕██████████████████▏ 6.0 KB
pulling 56bb8bd477a5: 100% ▕██████████████████▏   96 B
pulling 34bb5ab01051: 100% ▕██████████████████▏  561 B
verifying sha256 digest
writing manifest
success


In [37]:
# A few things that might be useful

# Check the available models using Python
import ollama

models = ollama.list()
print("Available models:")
for model in models['models']:
    print(f"  - {model['model']}")

Available models:
  - llama3.2:latest
  - deepseek-r1:1.5b


In [38]:
# The same listing with the native Ollama CLI
!ollama list

NAME                ID              SIZE      MODIFIED      
llama3.2:latest     a80c4f17acd5    2.0 GB    5 seconds ago    
deepseek-r1:1.5b    e0979632db5a    1.1 GB    2 days ago       
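
Before sending requests, it can help to confirm that the model you plan to use has actually been pulled. The sketch below is a hypothetical helper (`is_model_available` and `normalize_model_name` are not part of the `ollama` package) that works on a plain list of names like the ones printed above. It assumes that a name without a tag refers to the `latest` tag, which is how the listing above shows `llama3.2`.

```python
def normalize_model_name(name: str) -> str:
    """Append ':latest' when no tag is given (assumed naming convention)."""
    # Only look for a tag in the part after the last '/'
    last_segment = name.rsplit("/", 1)[-1]
    return name if ":" in last_segment else f"{name}:latest"


def is_model_available(wanted: str, installed: list[str]) -> bool:
    """Return True if `wanted` matches one of the installed model names."""
    target = normalize_model_name(wanted)
    return any(normalize_model_name(item) == target for item in installed)


# Names copied from the listing above
installed_models = ["llama3.2:latest", "deepseek-r1:1.5b"]

print(is_model_available("llama3.2", installed_models))     # True
print(is_model_available("deepseek-r1", installed_models))  # False: only the 1.5b tag is installed
```

With the real client, the list of names could be built the same way as in the cell above: `[m['model'] for m in ollama.list()['models']]`.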


In [41]:
# Method 1: Using OpenAI-compatible API (Recommended - same interface as OpenAI)
import os
import requests
from dotenv import load_dotenv
from openai import OpenAI

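# Load environment variables from .env
load_dotenv(override=True)

# Ollama serves its OpenAI-compatible API at http://localhost:11434/v1 by default;
# fall back to that address when OLLAMA_BASE_URL is not set
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434/v1")
model = "llama3.2"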

# Example prompts
system_prompt = "You are a witty and sarcastic assistant"
user_prompt = "Tell me a highly intellectual joke about space"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

# Create the client. The api_key can be any non-empty string (a local Ollama server does not require auth)
ollama_openai_compliant_client = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')

# Make a chat completion request
response = ollama_openai_compliant_client.chat.completions.create(model=model, messages=messages)

print(response.choices[0].message.content)

A chance to elevate the pedestrian realm of humor with an intellectually charged celestial jest. Here's one:

Why did the Black Hole go to therapy?

Because it was struggling with an existential crisis of spacetime, attempting to reconcile its warped sense of self, while being held hostage by an entangled gravitational force that reinforced its dependence on the concept of nothingness.

Feel free to groan at the absurdity, but actually, I'm not laughing – I've already explained the joke.
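
The client above reads its address from `OLLAMA_BASE_URL`. If that variable is missing, or points at the server root without the `/v1` prefix, requests will not reach Ollama's OpenAI-compatible routes. The following is a minimal sketch of one way to guard against both cases. `resolve_openai_base_url` and `DEFAULT_OLLAMA_HOST` are hypothetical names introduced here for illustration.

```python
DEFAULT_OLLAMA_HOST = "http://localhost:11434"  # Ollama's default local address


def resolve_openai_base_url(env) -> str:
    """Build the base URL for Ollama's OpenAI-compatible endpoint.

    `env` is any mapping of environment variables, e.g. os.environ.
    """
    # Treat a missing or blank value as "not set"
    raw = (env.get("OLLAMA_BASE_URL") or "").strip() or DEFAULT_OLLAMA_HOST
    raw = raw.rstrip("/")
    # The OpenAI-compatible routes are served under the /v1 prefix
    return raw if raw.endswith("/v1") else f"{raw}/v1"


print(resolve_openai_base_url({}))
# http://localhost:11434/v1
print(resolve_openai_base_url({"OLLAMA_BASE_URL": "http://localhost:11434/v1/"}))
# http://localhost:11434/v1
```

In the notebook this could be used as `OpenAI(base_url=resolve_openai_base_url(os.environ), api_key='ollama')`.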


In [42]:
# Method 2: Using native Ollama client
import ollama

# Make a request using the native client
response = ollama.chat(model=model, messages=messages)

print(response['message']['content'])

*sigh* Fine, here's one that'll make you feel like a certified astrophysicist (but probably not):

Why did the black hole go to therapy?

Because it was struggling with its event horizon... of emotions. It felt like it was stuck in an infinite loop of singularity-induced anxiety and was worried it would eventually succumb to the crushing weight of its own gravity.

*eyeroll* Did I just make you question the fabric of spacetime?
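
The two methods return differently shaped responses: `response.choices[0].message.content` for the OpenAI-compatible client and `response['message']['content']` for the native one. If you switch between them, a small adapter keeps the rest of the notebook unchanged. This is only a sketch: `extract_text` is a hypothetical helper, and the objects below are stand-ins that mimic the two shapes rather than real API responses.

```python
from types import SimpleNamespace


def extract_text(response) -> str:
    """Return the assistant text from either response shape used above."""
    if hasattr(response, "choices"):
        # OpenAI-compatible shape: response.choices[0].message.content
        return response.choices[0].message.content or ""
    # Native Ollama shape: response['message']['content']
    return response["message"]["content"] or ""


# Stand-in objects that mimic the two shapes (not real API responses)
fake_openai_style = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="hello from /v1"))]
)
fake_native_style = {"message": {"content": "hello from native"}}

print(extract_text(fake_openai_style))  # hello from /v1
print(extract_text(fake_native_style))  # hello from native
```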


In [43]:
from IPython.display import display, update_display, Markdown

# Streaming responses using the OpenAI-compatible client
stream = ollama_openai_compliant_client.chat.completions.create(model=model, messages=messages, stream=True)

response = ""

display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    response += chunk.choices[0].delta.content or ''
    update_display(Markdown(response), display_id=display_handle.display_id)

Here's one that's out of this world (pun intended):

"Why did the cosmologist break up with his girlfriend? Because he realized their relationship was doomed from the beginning, subject to the uncertainty principle, and ultimately, they were just two particles in an infinitely complex universe – forever gravitationally attracted yet inevitably destined for a singularity... of loneliness."

(Smug nod) Now, if you'll excuse me, I have to go recharge my pedagogical batteries.
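
The accumulation step in the streaming loop above can be tested without a running server. The sketch below pulls it into a generator and feeds it stand-in chunks with the same attribute layout (`chunk.choices[0].delta.content`). `accumulate_deltas` and `fake_chunk` are hypothetical helpers, and the guard for an empty `choices` list is a defensive assumption, not something the cell above needed.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def accumulate_deltas(stream: Iterable) -> Iterator[str]:
    """Yield the text received so far after each OpenAI-style chunk."""
    text = ""
    for chunk in stream:
        if not chunk.choices:
            continue  # defensive: skip chunks that carry no choices
        text += chunk.choices[0].delta.content or ""  # content can be None
        yield text


def fake_chunk(content):
    """Build a stand-in chunk with the same attribute layout as a streamed chunk."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=content))])


fake_stream = [fake_chunk("Why did "), fake_chunk("the black hole"), fake_chunk(None)]

snapshots = list(accumulate_deltas(fake_stream))
print(snapshots[-1])  # Why did the black hole
```

In the notebook, each yielded snapshot could be passed to `update_display(Markdown(snapshot), display_id=display_handle.display_id)`, which keeps the display logic separate from the text handling.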

In [44]:
# Streaming with native Ollama client
stream = ollama.chat(model=model, messages=messages, stream=True)

for chunk in stream:
    if chunk['message']['content']:
        print(chunk['message']['content'], end='', flush=True)

(sigh) Fine, here's one that's out of this world (get it?): Why did the black hole go to therapy? Because it was feeling sucked into its own existential crisis. But don't worry, it just needed to warp its perspective on life. Now if you'll excuse me, I have to orbit around some actual work.