# Local-LLM Examples

Choose a model from the models list and pass its name as the `model` argument in the API calls. You can retrieve the list of available models with the request below.

**Note: you do not need an OpenAI API key. The API key here is the one for your own server, if you defined one.**


In [None]:
import requests

models = requests.get("http://localhost:8091/v1/models")
print(models.json())
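The raw JSON can be noisy when all you need is the model names. The sketch below pulls them out, assuming the server returns an OpenAI-style payload with a `data` list whose entries carry an `id` field (this shape is an assumption, not confirmed by the notebook). It also shows a hypothetical `auth_headers` helper for servers that have an API key configured, assuming the key is expected as a standard `Bearer` token.

```python
def auth_headers(api_key=""):
    """Return a Bearer Authorization header, or no headers when no key is set.

    Assumption: the server accepts the key as 'Authorization: Bearer <key>'.
    """
    return {"Authorization": f"Bearer {api_key}"} if api_key else {}


def extract_model_ids(payload):
    """Collect the `id` of every entry under `data`.

    Assumption: the payload follows the OpenAI-style {"data": [{"id": ...}]} shape.
    Entries without an `id` are skipped rather than raising.
    """
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]
```

With the `models` response from the cell above, `extract_model_ids(models.json())` would give a plain list of names to paste into the `model` argument. If your server requires a key, the same request can be sent as `requests.get(url, headers=auth_headers("your-key"))`.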

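Install the OpenAI Python client. The examples below use the pre-1.0 interface (`openai.Completion`, `openai.ChatCompletion`, `openai.Embedding`), so the version is pinned.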


In [None]:
%pip install openai==0.28.1

## Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/completions/create)


In [3]:
import openai

openai.api_base = "http://localhost:8091/v1"
openai.api_key = ""
prompt = "Tell me a short story about dragons."

response = openai.Completion.create(
    model="Mistral-7B-OpenOrca",
    prompt=prompt,
    temperature=1.31,
    max_tokens=8192,
    top_p=1.0,
    n=1,
    stream=False,
)
print(response)

{
  "id": "cmpl-d21f4dab-fa94-4c6d-8f39-5f8c8fd6035e",
  "object": "text_completion",
  "created": 1705594518,
  "model": "Mistral-7B-OpenOrca",
  "choices": [
    {
      "text": "Once upon a time in a land filled with marvelous creatures, there was a magnificent dragon named Fireball. He was known for his unparalleled fiery breath, vast collection of hidden treasures, and an insatiable desire to build legendary castles in abandoned caves. Each day, he would scavenge for precious gems and coins, all while combating countless mythical adversaries that dared to intrude upon his kingdoms.\n\nOne enchanted evening, Fireball flew across a breathtaking panorama of rolling meadows as the sun dipped down below the horizon, painting the clouds with shades of amethyst and gold. In search of new treasures, he noticed a small cave hiding among the boulders a hundred meters beneath him. Excited by this unexpected find, his keen intuition suggested that the serene cavern was uncharted territory. Fi

## Chat Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/chat)


In [2]:
import openai

openai.api_base = "http://localhost:8091/v1"
openai.api_key = ""
prompt = "What is the capital of Ohio?"
messages = [{"role": "user", "content": prompt}]

response = openai.ChatCompletion.create(
    model="Mistral-7B-OpenOrca",
    messages=messages,
    temperature=1.31,
    max_tokens=8192,
    top_p=1.0,
    n=1,
    stream=False,
)
print(response)

{
  "id": "cmpl-f3952fdb-40bf-47ad-8333-8dbe79751cb9",
  "object": "text_completion",
  "created": 1705594403,
  "model": "Mistral-7B-OpenOrca",
  "usage": {
    "prompt_tokens": 62,
    "completion_tokens": 17,
    "total_tokens": 79
  },
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of Ohio?"
    },
    {
      "role": "assistant",
      "content": "The capital of Ohio is Columbus."
    }
  ]
}
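The output above places the conversation in a top-level `messages` list, whereas the standard OpenAI response nests the reply under `choices`. A small helper can extract the assistant's text from either shape. This is a minimal sketch based only on the output shown here; `last_assistant_reply` is a hypothetical name, and the fallback to `choices` is an assumption for servers that return the standard layout.

```python
def last_assistant_reply(response):
    """Return the content of the final assistant message, or None if absent."""
    # Shape shown in this notebook: a top-level "messages" list.
    for message in reversed(response.get("messages") or []):
        if message.get("role") == "assistant":
            return message.get("content")
    # Fallback (assumption): standard OpenAI shape with choices[0].message.
    choices = response.get("choices") or []
    if choices:
        return (choices[0].get("message") or {}).get("content")
    return None
```

Because the pre-1.0 client returns dict-like objects, `last_assistant_reply(response)` should work directly on the value returned by `openai.ChatCompletion.create`.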


## Embeddings

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/embeddings)

The embeddings endpoint currently uses an ONNX embedder with a maximum input length of 256 tokens.
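Given the 256-token limit, longer texts may need to be split before embedding. The sketch below is a rough approximation: it counts whitespace-separated words rather than real tokens, because the embedder's tokenizer is not specified here. Tokenizers usually produce more tokens than words, so a smaller `max_tokens` value may be safer in practice. `chunk_words` is a hypothetical helper, not part of the server or the OpenAI client.

```python
def chunk_words(text, max_tokens=256):
    """Split text into pieces of at most max_tokens whitespace-separated words.

    Assumption: words stand in for tokens; the real tokenizer may count differently.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    words = text.split()
    # Step through the word list in fixed-size windows and rejoin each window.
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]
```

Each chunk can then be sent to the embeddings endpoint separately, giving one vector per chunk.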


In [4]:
import openai

openai.api_base = "http://localhost:8091/v1"
openai.api_key = ""
prompt = "Columbus is the capital of Ohio."

response = openai.Embedding.create(
    input=prompt,
    engine="Mistral-7B-OpenOrca",
)

print(response)

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        3.3848612308502197,
        -4.3109331130981445,
        2.2499334812164307,
        -4.221327781677246,
        2.5889885425567627,
        4.153291702270508,
        4.620810031890869,
        0.45830175280570984,
        5.6938252449035645,
        -3.1267006397247314,
        -3.21240496635437,
        0.8036205172538757,
        -1.1429270505905151,
        1.1674814224243164,
        0.8567964434623718,
        -3.5292932987213135,
        0.7877870798110962,
        0.5426158905029297,
        2.838111162185669,
        -0.09259846061468124,
        -1.9333360195159912,
        -2.2051005363464355,
        -1.7372665405273438,
        6.985084533691406,
        -7.461802959442139,
        3.2471537590026855,
        3.647839307785034,
        0.7010669112205505,
        -8.589995384216309,
        0.8142480254173279,
        0.771185576915741,
        -4.17358922958374,
        -3.5