# Local-LLM Examples

Choose a model from the models list and pass its name as the `model` parameter in the API calls. The first cell below fetches the list of available models.

**Note: you do not need an OpenAI API key. The API key is the one you defined for your own server, if you defined one.**
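
If the server was started with an API key, each request has to carry it. The following is a minimal sketch, assuming the server accepts the key as a standard `Authorization: Bearer` header (the form the `openai` client sends) and that the key is kept in an environment variable. The variable name `LOCAL_LLM_API_KEY` and the helper `auth_headers` are hypothetical and not defined by the server.

```python
import os


def auth_headers(api_key: str = "") -> dict:
    """Build request headers; leave out Authorization when no key is set."""
    if not api_key:
        return {}
    return {"Authorization": f"Bearer {api_key}"}


# LOCAL_LLM_API_KEY is a hypothetical variable name chosen for this sketch.
api_key = os.environ.get("LOCAL_LLM_API_KEY", "")
headers = auth_headers(api_key)
```

The resulting `headers` dictionary can be passed to `requests.get(..., headers=headers)`; with no key configured it is simply empty.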


In [1]:
import requests

models = requests.get("http://localhost:8091/v1/models")
print(models.json())

['Mistralic-7B-1', 'Llama-2-7B-vietnamese-20k', 'Dans-TotSirocco-7B', 'Dans-AdventurousWinds-7B', 'airoboros-mistral2.2-7B', 'TinyLlama-1.1B-intermediate-step-480k-1T', 'em_german_mistral_v01', 'TinyLlama-1.1B-python-v0.1', 'TinyLlama-1.1B-Chat-v0.3', 'Nous-Hermes-13B', 'Mistral-7B-OpenOrca', 'dolphin-2.0-mistral-7B', 'Nous-Capybara-7B', 'Inkbot-13B-8k-0.2', 'em_german_7b_v01', 'em_german_70b_v01', 'em_german_13b_v01', 'UltraLM-13B-v2.0', 'MythoMakiseMerged-13B', 'lince-zero', 'sheep-duck-llama-2-70B-v1.1', 'MegaMix-T1-13B', 'MegaMix-S1-13B', 'Megamix-A1-13B', 'Kimiko-Mistral-7B', 'Pandalyst_13B_V1.0', 'Pandalyst-7B-V1.1', 'samantha-mistral-instruct-7B', 'samantha-mistral-7B', 'Synthia-7B-v1.3', 'NexusRaven-13B', 'Mistral-7B-Instruct-v0.1', 'Mistral-7B-v0.1', 'leo-hessianai-7B', 'leo-hessianai-7B-chat', 'leo-hessianai-7B-chat-bilingual', 'leo-hessianai-13B', 'leo-hessianai-13B-chat', 'leo-hessianai-13B-chat-bilingual', 'Emerhyst-20B', 'Emerhyst-13B', 'openbuddy-openllama-7B-v12-bf16', 
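
Because model names must match the list exactly, it can help to validate a name before using it in a request. This is a small sketch under the assumption that the models endpoint returns a plain list of strings, as in the output above; `pick_model` is a hypothetical helper, not part of any library.

```python
def pick_model(available: list, preferred: str) -> str:
    """Return `preferred` if the server lists it, otherwise raise a clear error."""
    if preferred in available:
        return preferred
    raise ValueError(f"Model {preferred!r} is not in the server's model list")


# A shortened stand-in for the list returned by the models call above.
available = ["Mistral-7B-OpenOrca", "TinyLlama-1.1B-Chat-v0.3", "Nous-Hermes-13B"]
model = pick_model(available, "Mistral-7B-OpenOrca")
print(model)
```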

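Install the OpenAI Python package. The examples below use the pre-1.0 interface (`openai.api_base`, `openai.Completion.create`), so the version is pinned to 0.28.1.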


In [1]:
%pip install openai==0.28.1

Defaulting to user installation because normal site-packages is not writeable
DEPRECATION: nb-black 1.0.7 has a non-standard dependency specifier black>='19.3'; python_version >= "3.6". pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of nb-black or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Note: you may need to restart the kernel to use updated packages.


## Completions

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/completions)


In [2]:
import openai

openai.api_base = "http://localhost:8091/v1"
openai.api_key = ""
prompt = "What is the capital of Ohio?"

response = openai.Completion.create(
    model="Mistral-7B-OpenOrca",
    prompt=prompt,
    temperature=1.31,
    max_tokens=8192,
    top_p=1.0,
    frequency_penalty=0,
    presence_penalty=0,
    stream=False,
)
print(response)

{
  "id": "cmpl-e0d62942-0cc0-4f5a-9cba-ec18efe98792",
  "object": "text_completion",
  "created": 1696438643,
  "model": "Mistral-7B-OpenOrca",
  "choices": [
    {
      "text": "The capital of Ohio is Columbus.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 62,
    "completion_tokens": 18,
    "total_tokens": 80
  }
}
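
Most callers only need the generated text and the token counts rather than the whole response. A minimal sketch, assuming the response has the shape printed above; `completion_text` and `token_usage` are hypothetical helper names, and the `sample` dictionary is copied from that output.

```python
def completion_text(response: dict) -> str:
    """Return the generated text of the first choice."""
    return response["choices"][0]["text"]


def token_usage(response: dict) -> tuple:
    """Return (prompt_tokens, completion_tokens, total_tokens)."""
    usage = response["usage"]
    return usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"]


# Trimmed copy of the response shown above.
sample = {
    "choices": [
        {"text": "The capital of Ohio is Columbus.", "index": 0, "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 62, "completion_tokens": 18, "total_tokens": 80},
}

print(completion_text(sample))
print(token_usage(sample))
```

The response object returned by `openai==0.28.1` supports the same dictionary-style indexing, so the helpers should work on it directly.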


## Chat Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/chat)


In [3]:
import openai

openai.api_base = "http://localhost:8091/v1"
openai.api_key = ""
prompt = "What is the capital of Ohio?"
messages = [{"role": "user", "content": prompt}]

response = openai.ChatCompletion.create(
    model="Mistral-7B-OpenOrca",
    messages=messages,
    temperature=1.31,
    max_tokens=8192,
    top_p=1.0,
    n=1,
    stream=False,
)
print(response)

{
  "id": "cmpl-b0b718e5-4a1c-4a92-bd7e-458219ccfbaf",
  "object": "text_completion",
  "created": 1696438651,
  "model": "Mistral-7B-OpenOrca",
  "usage": {
    "prompt_tokens": 62,
    "completion_tokens": 128,
    "total_tokens": 190
  },
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of Ohio?"
    },
    {
      "role": "assistant",
      "content": "The capital of Ohio is Columbus."
    }
  ]
}
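
Note that the response printed above carries the conversation in a top-level `messages` list rather than the `choices` list used by the OpenAI chat format. A small sketch for pulling out the reply, assuming that shape; `last_assistant_reply` is a hypothetical helper name.

```python
def last_assistant_reply(response: dict) -> str:
    """Return the content of the most recent assistant message."""
    for message in reversed(response.get("messages", [])):
        if message.get("role") == "assistant":
            return message["content"]
    raise ValueError("No assistant message found in response")


# Trimmed copy of the response shown above.
sample = {
    "messages": [
        {"role": "user", "content": "What is the capital of Ohio?"},
        {"role": "assistant", "content": "The capital of Ohio is Columbus."},
    ]
}

print(last_assistant_reply(sample))
```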


## Embeddings

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/embeddings)

The embeddings endpoint currently uses an ONNX embedder with a maximum input length of 256 tokens.
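
Given the 256-token limit, longer texts need to be split before they are embedded. The sketch below uses whitespace-separated words as a rough stand-in for tokens, which is an assumption: the embedder's real tokenizer will count differently, so the default limit is set conservatively. `chunk_words` is a hypothetical helper.

```python
def chunk_words(text: str, max_words: int = 128) -> list:
    """Split text into pieces of at most `max_words` whitespace-separated words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


chunks = chunk_words("Columbus is the capital of Ohio.", max_words=4)
print(chunks)
```

Each chunk can then be sent to the embeddings endpoint as a separate `input`.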


In [4]:
import openai

openai.api_base = "http://localhost:8091/v1"
openai.api_key = ""
prompt = "Columbus is the capital of Ohio."

response = openai.Embedding.create(
    input=prompt,
    model="Mistral-7B-OpenOrca",
)

print(response)

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        3.403564453125,
        -4.5200324058532715,
        2.4046950340270996,
        -4.322293758392334,
        2.5373435020446777,
        4.239781379699707,
        4.84455680847168,
        0.49894267320632935,
        5.877126693725586,
        -3.241041660308838,
        -3.259755849838257,
        0.8434664011001587,
        -1.2886568307876587,
        1.1771708726882935,
        0.5773480534553528,
        -3.9502012729644775,
        0.6329552531242371,
        0.5341236591339111,
        2.8664753437042236,
        -0.07409809529781342,
        -1.8799495697021484,
        -2.0990095138549805,
        -1.5911470651626587,
        7.454763412475586,
        -7.549945831298828,
        3.36108136177063,
        3.672377824783325,
        0.664371907711029,
        -8.769789695739746,
        0.7213315367698669,
        0.8416232466697693,
        -4.0401291847229,
        -3.338854074
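
In the response shape shown above, the vector sits at `response["data"][0]["embedding"]`. A common next step is comparing two such vectors. The following is a plain-Python sketch of cosine similarity; `cosine_similarity` is a hypothetical helper, and the short vectors are illustrative values, not real embedder output.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Illustrative vectors only; real embeddings are much longer.
same = cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```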

## Python Examples

The examples below mirror the API examples above, but call the `local-llm` Python package directly.


In [None]:
%pip install local-llm --upgrade

In [None]:
from local_llm import LLM

LLM().models()

In [None]:
from local_llm import LLM

ai = LLM(
    models_dir="./models",
    model="Mistral-7B-OpenOrca",
)
ai.completion(prompt="What is the capital of Ohio?")

In [None]:
from local_llm import LLM

messages = [{"role": "user", "content": "What is the capital of Ohio?"}]

ai = LLM(
    models_dir="./models",
    model="Mistral-7B-OpenOrca",
)
ai.chat(messages=messages)

In [None]:
from local_llm import LLM

ai = LLM(
    models_dir="./models",
    model="Mistral-7B-OpenOrca",
)
ai.embedding("Columbus is the capital of Ohio.")