# Local-LLM Tests and Examples

Choose a model from the models list and pass its name as the `model` argument in the API calls. The Language Models section below shows how to retrieve the list.

Install the `openai` and `requests` packages:

```bash
pip install openai requests
```

**Note: you do not need an OpenAI API key. The API key is the `LOCAL_LLM_API_KEY` for the server, if you defined one in your `.env` file.**
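
One way to avoid hard-coding the key in every example is to read it from the environment. The sketch below assumes the `LOCAL_LLM_API_KEY` value from your `.env` file has also been exported as an environment variable. The helper name `get_api_key` and the `"not-needed"` placeholder are illustrative and not part of Local-LLM.

```python
import os


def get_api_key(environ=None, placeholder="not-needed"):
    """Return LOCAL_LLM_API_KEY if it is set, otherwise a placeholder.

    The placeholder is an assumption: the OpenAI client expects some key
    to be set, even when the server was started without one.
    """
    env = os.environ if environ is None else environ
    key = env.get("LOCAL_LLM_API_KEY", "").strip()
    return key or placeholder


# Reports whether a real key was found in the current environment.
print("Key found in environment:", get_api_key() != "not-needed")
```

The returned value can then be assigned to `openai.api_key` in the examples that follow.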

## Language Models

If you don't already know which model you want to use, retrieve the list of available models from the server.


In [61]:
import requests
import time

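# Poll until the server is up instead of failing on the first connection error.
while True:
    try:
        models = requests.get("http://localhost:8091/v1/models")
        if models.status_code == 200:
            break
    except requests.exceptions.RequestException:
        # The server is not accepting connections yet; retry after a short pause.
        pass
    time.sleep(1)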

print(models.json())

['bakllava-1-7b', 'llava-v1.5-7b', 'llava-v1.5-13b', 'EstopianMaid-13B', 'Etheria-55b-v0.1', 'EstopianMaid-13B', 'Everyone-Coder-33B-Base', 'FusionNet_34Bx2_MoE', 'WestLake-7B-v2', 'WestSeverus-7B-DPO', 'DiscoLM_German_7b_v1', 'Garrulus', 'DareVox-7B', 'NexoNimbus-7B', 'Lelantos-Maid-DPO-7B', 'stable-code-3b', 'Dr_Samantha-7B', 'NeuralBeagle14-7B', 'tigerbot-13B-chat-v5', 'Nous-Hermes-2-Mixtral-8x7B-SFT', 'Thespis-13B-DPO-v0.7', 'Code-290k-13B', 'Nous-Hermes-2-Mixtral-8x7B-DPO', 'Venus-120b-v1.2', 'LLaMA2-13B-Estopia', 'medicine-LLM', 'finance-LLM-13B', 'Yi-34B-200K-DARE-megamerge-v8', 'phi-2-orange', 'laser-dolphin-mixtral-2x7b-dpo', 'bagel-dpo-8x7b-v0.2', 'Everyone-Coder-4x7b-Base', 'phi-2-electrical-engineering', 'Cosmosis-3x34B', 'HamSter-0.1', 'Helion-4x34B', 'Bagel-Hermes-2x34b', 'deepmoney-34b-200k-chat-evaluator', 'deepmoney-34b-200k-base', 'TowerInstruct-7B-v0.1', 'PiVoT-SUS-RP', 'Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss', 'TenyxChat-7B-v1', 'UNA-TheBeagle-7B-v1', 'WhiteRabbi
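
The polling loop above waits forever if the server never starts. A bounded variant is sketched below; `wait_until`, its `timeout` and `interval` parameters, and the commented usage line are illustrative assumptions rather than part of Local-LLM.

```python
import time


def wait_until(check, timeout=60.0, interval=1.0):
    """Call check() until it returns a truthy value or `timeout` seconds pass.

    Returns True on success and False if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if check():
                return True
        except Exception:
            # Treat any failure (for example, connection refused) as "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical usage against the endpoint from this section:
# import requests
# ready = wait_until(
#     lambda: requests.get("http://localhost:8091/v1/models", timeout=5).status_code == 200
# )
```

Returning `False` instead of raising leaves the caller free to decide whether a missing server is fatal.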

## Voices

Any `wav` file in the `voices` directory will be available to use as a voice.


In [62]:
import requests

voices = requests.get("http://localhost:8091/v1/audio/voices")
print(voices.json())

{'voices': ['default', 'DukeNukem', 'Hal9000_Mono', 'Hal_voice_9000_Synthetic', 'SyntheticStarTrekComputerVoice', 'Synthetic_DukeNukem', 'Synthetic_Female_Hybrid_4_Phonetics_0001', 'Synthetic_Female_Phonetics_0001']}
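
Before requesting speech, it can help to confirm that the voice you want is actually listed. This is a minimal sketch that assumes the response shape shown in the output above; `pick_voice` is a hypothetical helper, and falling back to `default` is a design choice made for this example.

```python
def pick_voice(payload, preferred, fallback="default"):
    """Return `preferred` if the server lists it, otherwise `fallback`."""
    available = payload.get("voices", [])
    return preferred if preferred in available else fallback


# Response shape copied from the output above, with the list shortened.
sample = {"voices": ["default", "DukeNukem", "Hal9000_Mono"]}
print(pick_voice(sample, "Hal9000_Mono"))   # prints Hal9000_Mono
print(pick_voice(sample, "Missing_Voice"))  # prints default
```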


## Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/completions/create)


In [63]:
import openai

# Modify this prompt to generate different outputs
prompt = "Write a haiku about dragons, then end with </s>."

openai.base_url = "http://localhost:8091/v1/"
openai.api_key = "Your LOCAL_LLM_API_KEY from your .env file"

completion = openai.completions.create(
    model="phi-2-dpo",
    prompt=prompt,
    temperature=0.3,
    max_tokens=1024,
    top_p=0.90,
    n=1,
    stream=False,
    extra_body={"system_message": "You are a creative assistant."},
)
print(completion.choices[0].text)

Golden wings unfold,
  Fire breathes from ancient soul,
  Dragon's lair stands tall.
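
The prompt asks the model to finish with `</s>`. The output above does not contain the marker, but if it ever shows up in the returned text, a small post-processing step can remove it. The helper below is a sketch; `strip_stop_marker` is a hypothetical name, and whether the server strips the marker itself is not something this document states.

```python
def strip_stop_marker(text, marker="</s>"):
    """Cut `text` at the first occurrence of `marker` and trim whitespace."""
    head, _, _ = text.partition(marker)
    return head.strip()


raw = "Golden wings unfold,\n  Fire breathes from ancient soul,\n  Dragon's lair stands tall.</s>"
print(strip_stop_marker(raw))
```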


## Chat Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/chat)


In [64]:
import openai

# Modify this prompt to generate different outputs
prompt = "Write a haiku about Taco Bell's Doritos Locos Tacos, then end with </s>."

openai.api_key = "Your LOCAL_LLM_API_KEY from your .env file"
openai.base_url = "http://localhost:8091/v1/"

# Send the prompt as the user turn; the system message is passed via extra_body below.
messages = [{"role": "user", "content": prompt}]
response = openai.chat.completions.create(
    model="phi-2-dpo",
    messages=messages,
    temperature=0.3,
    max_tokens=1024,
    top_p=0.90,
    stream=False,
    extra_body={"system_message": "You are a creative assistant."},
)
print(response.messages[1]["content"])

Crunchy Doritos shells
  Locos Tacos burst with flavor
  Fast food delight


## Embeddings

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/embeddings)


In [65]:
import openai

# Modify this prompt to generate different outputs
prompt = "Tacos are great."

openai.base_url = "http://localhost:8091/v1/"
openai.api_key = "Your LOCAL_LLM_API_KEY from your .env file"

response = openai.embeddings.create(
    input=prompt,
    model="phi-2-dpo",
)
print(response.data[0].embedding)

[0.3090707063674927, 1.1029287576675415, 2.3656630516052246, -0.4048413932323456, 1.0616954565048218, -0.38123324513435364, 1.000983476638794, 1.3528428077697754, -0.8648228645324707, 1.5190966129302979, 0.675586998462677, -0.19846737384796143, 0.5568936467170715, 0.06692102551460266, -0.31205207109451294, -0.31372374296188354, 0.7479310035705566, 1.7122050523757935, 0.11156223714351654, -0.38010597229003906, -2.0285604000091553, -0.9324357509613037, -0.26883846521377563, -0.41842585802078247, 0.7723556160926819, 1.2539050579071045, 0.4468786418437958, 1.082922339439392, 0.3584991693496704, -0.42915067076683044, -0.6801806092262268, -1.2013731002807617, 0.0065938811749219894, 0.5070878267288208, 0.6012413501739502, -0.36088356375694275, -0.8070286512374878, -0.42242664098739624, 1.5350435972213745, -1.8791346549987793, 0.43988990783691406, 0.705127477645874, 0.6156477332115173, 1.2618300914764404, -0.5392323732376099, 0.47247421741485596, 0.51447594165802, -0.5469154715538025, 0.130050
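
Embedding vectors like the one above are usually compared rather than read directly. The sketch below computes cosine similarity in plain Python; `cosine_similarity` is an illustrative helper, and the short vectors stand in for real embeddings returned by the server.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors in place of two `response.data[0].embedding` lists.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

Values near 1 indicate that two texts were embedded in similar directions, and values near 0 indicate little relation.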

## Cloning Text to Speech

Any `wav` file in the `voices` directory can be used as a voice. The response contains the generated audio as a base64-encoded string in its `data` field.


In [66]:
import requests

# Request speech for the given text using the selected voice.
generation = requests.post(
    "http://localhost:8091/v1/audio/generation",
    json={
        "text": "I'm sorry Dave, I'm afraid I can't do that.",
        "voice": "default",
        "language": "en",
    },
)
voice_response = generation.json()
print(f"Voice response: {voice_response}\n")

Voice response: {'data': 'UklGRkY0AgBXQVZFZm10IBAAAAABAAEAwF0AAIC7AAACABAATElTVBoAAABJTkZPSVNGVA4AAABMYXZmNTguNzYuMTAwAGRhdGEANAIABQD2/xUAFAAmAC0AKQAdABkACgDz////9f/x/+//4//g/93/3f/o//P/7P/h/8X/s/+2/8D/3v/+/yoASgBbAF8AcABoAEQAUwBXAE4AWAA1ADUARwBjAIMAagBnAJQAggBoADYADAAWABgAMgAyADAAMgAxAAkAEwAKAPD/DAAOAAUAIQAlABcA///1/xMABgDt/+H/vP+7/6T/if9S/zL/Nf9E/0L/Sf9b/0z/Vf9C/0P/Y/91/3//h/+S/7D/vf/e/+D//f8iADsAagBwAIgAhwCPAJkAcwBMAD8ATQBXAH4AmAC6AL4A5QABAeUA5ADMALQAuwDGAMgAvwCMAIsAkQB/AHUAjACAAFcAKgAYAN3/qf+f/4L/jP+A/2b/Wv9y/37/af9Q/0L/Uf9S/1X/Rf88/wr/8f7g/s7+uv60/qD+uP7e/iD/KP9G/3T/qf+2/77/6//f/+L/6//4//b/GwBEAG8AdABrAJUAvAC7AN8A/AAKAREB8QDaAMkAsQC/AMgArwCkAK4AkQBeAGcAegB/AJIAkgBmAC8AAgDz//r/6f/R/7z/wf/Z/8r/7P/p/9H/4f/O/6X/iv9O/0z/QP82/z3/H/8t/xP/Iv86/1//k/+h/8f/7f8aAA8ABgDr//X/CwAiAEkAagB/AKsArgCmAJsAjQCIAIEAaQBrAJYAjQCWAKYAiACWAIQApAC5AJcAjgBcAD4APgBBAGkAcQCPAKoAggBxAFcATQBVADsANgA1ADIAFwALAOL/vf92/yz/8v64/rX+uf6Z/p/+pP6Q/pv+qP67/sb+o/6t/uH+Jv9D/1X/Uv9N/2T/cP+J/5z/pP+4/9n/8P/g/+b/8//m/+n/IQAvAD
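
The `data` string in the response above is base64 text, and its first characters decode to a `RIFF`/`WAVE` header. The sketch below decodes such a string and writes it to disk so it can be played. The function names and the `speech.wav` path are illustrative assumptions.

```python
import base64


def decode_wav(b64_audio):
    """Decode base64 audio and check that it starts with a RIFF/WAVE header."""
    raw = base64.b64decode(b64_audio)
    if raw[:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("decoded data does not look like a WAV file")
    return raw


def save_wav(b64_audio, path):
    """Write the decoded audio to `path` and return the number of bytes written."""
    raw = decode_wav(b64_audio)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)


# Hypothetical usage with the response from this section:
# save_wav(voice_response["data"], "speech.wav")
```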

## Speech to Text

This example transcribes the audio generated in the previous section.


In [67]:
import requests

transcription = requests.post(
    "http://localhost:8091/v1/audio/transcriptions",
    json={
        "file": voice_response["data"],
        "audio_format": "wav",
        "model": "base.en",
    },
)
print(transcription.json())

{'data': " I'm sorry Dave, I'm afraid I can't do that."}
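
Because the transcription is produced from speech generated out of known text, the two strings can be compared as a round-trip check. The comparison below is a loose sketch: `normalize` is a hypothetical helper that ignores case, punctuation, and extra whitespace, since transcriptions often differ in those details.

```python
import string


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


original = "I'm sorry Dave, I'm afraid I can't do that."
transcribed = " I'm sorry Dave, I'm afraid I can't do that."  # from the output above
print(normalize(original) == normalize(transcribed))  # prints True
```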
