# Local-LLM Tests and Examples

Choose a model from the models list and pass its name as the `model` argument in the API calls. The Models section below shows how to list the available models.

Install the `openai` and `requests` Python packages:

```bash
pip install openai requests
```

**Note: you do not need an OpenAI API key. The API key is the `LOCAL_LLM_API_KEY` for the server, if you defined one in your `.env` file.**
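To avoid pasting the key into every example, you could read it from the environment instead. This is a minimal sketch that assumes `LOCAL_LLM_API_KEY` has been exported in your shell (or loaded from `.env` by a tool of your choice). The helper name `get_api_key` and the `"not-needed"` placeholder are illustrative and not part of Local-LLM.

```python
import os


def get_api_key(default: str = "not-needed") -> str:
    """Return LOCAL_LLM_API_KEY from the environment, or a placeholder.

    Assumption: when no key is defined on the server, the value is not
    checked, so the placeholder only satisfies clients that insist on a
    non-empty key.
    """
    return os.environ.get("LOCAL_LLM_API_KEY") or default


api_key = get_api_key()
```

The returned value can then be assigned to `openai.api_key` in the examples that follow.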

## Models

If you don't already know which model you want to use, list the models the server offers.


In [1]:
import requests

models = requests.get("http://localhost:8091/v1/models")
print(models.json())

['bakllava-1-7b', 'llava-v1.5-7b', 'llava-v1.5-13b', 'FusionNet_34Bx2_MoE', 'Everyone-Coder-33B-Base', 'FusionNet_34Bx2_MoE', 'WestLake-7B-v2', 'WestSeverus-7B-DPO', 'DiscoLM_German_7b_v1', 'Garrulus', 'DareVox-7B', 'NexoNimbus-7B', 'Lelantos-Maid-DPO-7B', 'stable-code-3b', 'Dr_Samantha-7B', 'NeuralBeagle14-7B', 'tigerbot-13B-chat-v5', 'Nous-Hermes-2-Mixtral-8x7B-SFT', 'Thespis-13B-DPO-v0.7', 'Code-290k-13B', 'Nous-Hermes-2-Mixtral-8x7B-DPO', 'Venus-120b-v1.2', 'LLaMA2-13B-Estopia', 'medicine-LLM', 'finance-LLM-13B', 'Yi-34B-200K-DARE-megamerge-v8', 'phi-2-orange', 'laser-dolphin-mixtral-2x7b-dpo', 'bagel-dpo-8x7b-v0.2', 'Everyone-Coder-4x7b-Base', 'phi-2-electrical-engineering', 'Cosmosis-3x34B', 'HamSter-0.1', 'Helion-4x34B', 'Bagel-Hermes-2x34b', 'deepmoney-34b-200k-chat-evaluator', 'deepmoney-34b-200k-base', 'TowerInstruct-7B-v0.1', 'PiVoT-SUS-RP', 'Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss', 'TenyxChat-7B-v1', 'UNA-TheBeagle-7B-v1', 'WhiteRabbitNeo-33B-v1', 'WinterGoliath-123b', '
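The list is long, so it can help to filter it by keyword. The sketch below assumes the endpoint returns a flat list of model names, as the printed output suggests. `find_models` is a hypothetical helper, and `sample` holds a few names copied from that output.

```python
def find_models(model_names, keyword):
    """Return names containing keyword (case-insensitive), without duplicates."""
    keyword = keyword.lower()
    seen = set()
    matches = []
    for name in model_names:
        if keyword in name.lower() and name not in seen:
            seen.add(name)
            matches.append(name)
    return matches


# A few names copied from the output above (one appears twice there).
sample = [
    "bakllava-1-7b",
    "llava-v1.5-7b",
    "FusionNet_34Bx2_MoE",
    "FusionNet_34Bx2_MoE",
    "stable-code-3b",
    "Code-290k-13B",
]
print(find_models(sample, "code"))  # ['stable-code-3b', 'Code-290k-13B']
```

With a running server, `find_models(models.json(), "code")` would apply the same filter to the full list.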

## Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/completions/create)


In [2]:
import openai

# Modify this prompt to generate different outputs
prompt = "Write a haiku about dragons, then end with </s>."

openai.base_url = "http://localhost:8091/v1/"
openai.api_key = "Your LOCAL_LLM_API_KEY from your .env file"

completion = openai.completions.create(
    model="phi-2-dpo",
    prompt=prompt,
    temperature=0.3,
    max_tokens=1024,
    top_p=0.90,
    n=1,
    stream=False,
)
print(completion.choices[0].text)

Majestic wings unfold,
  Fire-breathing dragon roars loud,
  Ancient wisdom reigns.


## Chat Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/chat)


In [3]:
import openai

# Modify this prompt to generate different outputs
prompt = "Write a haiku about Taco Bell's Doritos Locos Tacos, then end with </s>."

openai.api_key = "Your LOCAL_LLM_API_KEY from your .env file"
openai.base_url = "http://localhost:8091/v1/"

messages = [{"role": "user", "content": prompt}]
response = openai.chat.completions.create(
    model="phi-2-dpo",
    messages=messages,
    temperature=0.3,
    max_tokens=1024,
    top_p=0.90,
    stream=False,
)
print(response.choices[0].message.content)

Crunchy shells burst,
  Luscious fillings dance inside,
  Doritos' Locos Tacos.


## Embeddings

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/embeddings)


In [4]:
import openai

# Modify this prompt to generate different outputs
prompt = "Tacos are great."

openai.base_url = "http://localhost:8091/v1/"
openai.api_key = "Your LOCAL_LLM_API_KEY from your .env file"

response = openai.embeddings.create(
    input=prompt,
    model="phi-2-dpo",
)
print(response.data[0].embedding)

[0.3308050036430359, 1.2592976093292236, 2.400346279144287, -0.4338093101978302, 0.9918550252914429, -0.43417197465896606, 1.0468409061431885, 1.3336766958236694, -0.9791282415390015, 1.4600270986557007, 0.6578080058097839, -0.03904183581471443, 0.7186003923416138, -0.0029883310198783875, -0.5159909725189209, -0.3166074752807617, 0.7350625395774841, 1.6164231300354004, 0.14988793432712555, -0.45527321100234985, -1.9803181886672974, -0.838558554649353, -0.37335023283958435, -0.39204782247543335, 0.7431477904319763, 1.3098536729812622, 0.501659631729126, 1.0154064893722534, 0.23192773759365082, -0.40380898118019104, -0.8571327328681946, -1.2263654470443726, -0.09275045245885849, 0.4803667962551117, 0.8520298600196838, -0.49855491518974304, -0.8620067238807678, -0.6521469950675964, 1.5076617002487183, -1.9226653575897217, 0.5018820762634277, 0.6102170348167419, 0.7899760603904724, 1.1262562274932861, -0.5613922476768494, 0.4535752832889557, 0.7436857223510742, -0.5657961368560791, 0.09088
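A common use for vectors like the one printed above is comparing two texts by cosine similarity. The sketch below is a plain-Python illustration, not something the server provides. `cosine_similarity` is a hypothetical helper, and the short vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here: a zero vector is similar to nothing.
    return dot / (norm_a * norm_b)


# Toy vectors standing in for two embeddings.
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

To compare real texts, request an embedding for each one and pass the two `response.data[0].embedding` lists to the function.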

## Cloning Text to Speech

Any `wav` file in the `voices` directory can be used as a voice.


In [9]:
import requests

voices = requests.get("http://localhost:8091/v1/audio/voices")
print(voices.json())

{'voices': ['default', 'DukeNukem', 'Hal9000_Mono', 'Hal_voice_9000_Synthetic', 'SyntheticStarTrekComputerVoice', 'Synthetic_DukeNukem', 'Synthetic_Female_Hybrid_4_Phonetics_0001', 'Synthetic_Female_Phonetics_0001']}


In [26]:
import requests

response = requests.post(
    "http://localhost:8091/v1/audio/generation",
    json={
        "text": "I'm sorry Dave, I'm afraid I can't do that.",
        "voice": "default",
        "language": "en",
    },
)
voice_response = response.json()
print(f"Voice response: {voice_response}\n")

Voice response: {'status': 'success', 'data': 'UklGRkgkBwBXQVZFZm10IBAAAAADAAEAwF0AAAB3AQAEACAAZmFjdAQAAAAAyQEAUEVBSxAAAAABAAAAI6OzZZsCVj9acQAAZGF0YQAkBwAHELc7YCyoO5j64TtN1+k7V9IAPMDgAjzdIAU8clTuO81o1DsUZbw7hMyjO/mfmTvv5HY7QudSO2uhPDsbk4w6WFOxOSCtkbiBurG6VEjxugCPPLsYkX27GAeYu5QvxbucGu+7bWX8uz7KFrxlPBi8zhccvFlPHbz/Tia81ZkxvHfINbxxm0G8EpNHvOKoTrzcW1W8ozVmvCiQZbxuwWq8XA9+vCKPhLyteIW874GEvOPihbz41Ie8ukqHvMHkiLyYWoq8okyHvDJZgLyzyYC8yy15vBt8a7w2wle8B/hBvGGcLbxm+Bi8YysMvKoQ87vtf9C7cpSnuyYceLu2sie7fAnOuu/d5bkx7lw6+VLuOg+NOTtuFHQ7a26SO8lwtjtfdeE7T1YFPLcBETyy8BY8v+0gPM7mLjypUjw8R3Y8PA+IRTxrx1o8OH5oPKClgDzNGok8iAKNPCfelDwTTpk8WWKfPBUSoDySL6A8aCykPHVmpDxoV6I8mhSkPD2NnzzvnJ48JTejPD+qoDyoYKI88S2lPORhoDzvcpw8eGqWPPO8kzyqtIo84rOAPIHefDxO7Gc8xttUPNpfUTww4Uc8w1hAPHDxKjyZFRc8Wqz2OzTTpzv9hX87BOwKO0AZYTkI4ai601gOu8o4TLtL+327UIKgu2PrsLsWMcO7AXUEvAoJGbxtwCe8ASpFvGAkUryEJ2q8CJaHvHLGjbyp+Je84XGdvKN/oLwqoqS87vClvPO2rLwjgLO8/fu3vAk1vbyb6sC82qrCvCdCwLwR6Me8E7XIvFtVx7w8kca8kAu7vO9tuLwXVK68TBqkvBAQl7xomIe8FlOIvFMUg
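The `data` field appears to be base64-encoded WAV audio: the string above begins with `UklGR`, which is how the `RIFF` header of a WAV file looks in base64. Under that assumption, a minimal sketch for saving it to disk follows. `save_wav` and the default filename `output.wav` are illustrative names.

```python
import base64


def save_wav(b64_audio: str, path: str = "output.wav") -> int:
    """Decode base64 audio, check for a RIFF header, and write it to path.

    Returns the number of bytes written.
    """
    audio_bytes = base64.b64decode(b64_audio)
    if audio_bytes[:4] != b"RIFF":
        raise ValueError("decoded data does not look like a WAV file")
    with open(path, "wb") as f:
        f.write(audio_bytes)
    return len(audio_bytes)
```

After the request above, `save_wav(voice_response["data"])` would write the generated speech to `output.wav`.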

## Speech to Text


In [None]:
# Work in progress.
"""
transcription = requests.post(
    "http://localhost:8091/v1/audio/transcriptions",
    json={"file": voice_response["data"], "audio_format": "wav", "model": "base.en"},
)
print(transcription.json())
"""