# LOCAL LLM EXAMPLE

**Advantages of using a local LLM**
- No API charges - open-source
- Data doesn't leave your box

**Disadvantages**
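- Less capable than frontier models such as ChatGPT and Gemini
- Far fewer parameters and less training data (llama3.2 defaults to about 3 billion parameters; frontier models are widely reported to be orders of magnitude larger, although their exact sizes are not published)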

In [5]:
# imports

import requests
from IPython.display import Markdown, display

In [7]:
# Constants
OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"

In [9]:
# Create a user prompt and the messages to send to Llama
user_prompt = """
Suggest a list of domain names for a company that provides AI-based innovative solutions worldwide.
Limit the domain names to .com and .ai domains, the number of domains to 10, and the length of a domain to 10 characters.
"""

messages = [
    {"role": "user", "content": user_prompt}
]

payload = {
    "model": MODEL,
    "messages": messages,
    "stream": False
}

In [12]:
# Send the request to the Ollama API and fail loudly on an HTTP error
response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)
response.raise_for_status()
print(response.json()['message']['content'])
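
The payload above sets `"stream": False`, so Ollama returns the whole reply as a single JSON object. With `"stream": True`, the `/api/chat` endpoint instead sends newline-delimited JSON objects, each carrying a fragment of the reply in `message.content`, with the last one marked `"done": true`. The following is a minimal sketch of how those fragments could be reassembled. `join_stream_chunks` is a hypothetical helper name, and the sample lines are hand-written stand-ins for a real response.

```python
import json
from typing import Iterable


def join_stream_chunks(lines: Iterable[bytes]) -> str:
    """Concatenate the message fragments from a streamed /api/chat reply."""
    parts = []
    for raw in lines:
        if not raw:  # skip blank keep-alive lines
            continue
        chunk = json.loads(raw)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):  # the final object signals the end of the reply
            break
    return "".join(parts)


# Hand-written sample lines standing in for a real streamed response
sample = [
    b'{"message": {"role": "assistant", "content": "Hello"}, "done": false}',
    b'{"message": {"role": "assistant", "content": ", world"}, "done": false}',
    b'{"message": {"role": "assistant", "content": ""}, "done": true}',
]
print(join_stream_chunks(sample))
```

Against a live server, the lines could come from `requests.post(OLLAMA_API, json={**payload, "stream": True}, stream=True).iter_lines()`, which yields one bytes object per line.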

## Send a message using the ollama package

- Instead of making a direct HTTP call, use the ollama Python package.
- Under the hood, it makes the same call as above to the Ollama server running at localhost:11434.

In [15]:
import ollama

response = ollama.chat(model=MODEL, messages=messages)
print(response['message']['content'])

## Use the OpenAI Python library to connect to Ollama

In [18]:
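from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint under /v1. The client requires
# an api_key value, but Ollama does not check it, so any placeholder works.
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')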

response = ollama_via_openai.chat.completions.create(
    model=MODEL,
    messages=messages
)

print(response.choices[0].message.content)
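
All three approaches talk to the same local server, but the reply text sits in a different place depending on the endpoint: the native `/api/chat` response keeps it under `message.content`, while the OpenAI-compatible `/v1/chat/completions` response keeps it under `choices[0].message.content`. The sketch below shows one way to read either shape from the raw JSON. `extract_text` is a hypothetical helper, and the two sample dictionaries are trimmed, hand-written examples rather than captured output.

```python
def extract_text(body: dict) -> str:
    """Return the assistant text from either response shape."""
    if "choices" in body:  # OpenAI-compatible /v1/chat/completions
        return body["choices"][0]["message"]["content"]
    if "message" in body:  # native /api/chat
        return body["message"]["content"]
    raise ValueError(f"Unrecognised response keys: {sorted(body)}")


# Trimmed, hand-written examples of the two response shapes
native = {
    "model": "llama3.2",
    "message": {"role": "assistant", "content": "hi"},
    "done": True,
}
openai_style = {
    "choices": [{"index": 0, "message": {"role": "assistant", "content": "hi"}}],
}

print(extract_text(native))
print(extract_text(openai_style))
```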

## Use alternative models in Ollama (e.g. DeepSeek)
- Ollama can serve many other models.
- For example, DeepSeek-R1 is available in sizes ranging from 1.5B parameters to 671B parameters. See the [model page](https://ollama.com/library/deepseek-r1).
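
Parameter count is a useful first guide to whether a model will fit on your machine. As a rough rule of thumb, the weights alone take about (parameters × bits per weight ÷ 8) bytes. The sketch below applies that arithmetic, assuming 4-bit quantized weights, which is an assumption made here for illustration. Actual download sizes and memory use will differ because of mixed-precision layers, the context cache, and runtime overhead. `approx_weight_gib` is a hypothetical helper.

```python
def approx_weight_gib(params_billions: float, bits_per_weight: int = 4) -> float:
    """Rough size of the weights alone, in GiB (ignores runtime overhead)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Illustrative sizes only; 4 bits per weight is an assumed quantization level
for size in (1.5, 7, 70, 671):
    print(f"{size:>6}B params ~ {approx_weight_gib(size):7.1f} GiB at 4 bits/weight")
```

By this estimate the 1.5B model needs well under 1 GiB for its weights, which is why it is a convenient choice for a laptop.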

In [21]:
!ollama pull deepseek-r1:1.5b

In [25]:
response = ollama_via_openai.chat.completions.create(
    model="deepseek-r1:1.5b",
    messages=[{"role": "user", "content": "Please give definitions of some core concepts behind LLMs: a neural network, attention and the transformer"}]
)

print(response.choices[0].message.content)
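
DeepSeek-R1 is a reasoning model, and depending on the Ollama version its reply may include the model's intermediate reasoning wrapped in `<think>...</think>` tags ahead of the final answer. Assuming that format, the following sketch strips the reasoning so that only the answer is printed. `strip_reasoning` is a hypothetical helper, and the sample string is made up for illustration.

```python
import re

# Matches a <think>...</think> block, including newlines inside it
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove <think>...</think> blocks and surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()


# Made-up sample in the assumed output format
sample = (
    "<think>\nThe user wants definitions.\n</think>\n\n"
    "A neural network is a stack of layers."
)
print(strip_reasoning(sample))
```

With the response above, this could be applied as `strip_reasoning(response.choices[0].message.content)`. Text without any `<think>` block passes through unchanged apart from trimmed whitespace.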