# Local-LLM Tests and Examples

Choose a model from the models list and pass its name as the `model` argument in the API calls. You can get a list of models below.

Install the `openai` and `requests` Python packages:

```bash
pip install openai requests
```

**Note: You do not need an OpenAI API key. The API key is the `LOCAL_LLM_API_KEY` for the server, if you defined one in your `.env` file.**
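If you would rather not paste the key into the notebook, one option is to read it from the environment. The sketch below is an assumption about your setup, not part of the server: `build_headers` is a hypothetical helper, it assumes the key is exported as a `LOCAL_LLM_API_KEY` environment variable, and it sends the raw key in the `Authorization` header the same way the `HEADERS` dictionary in this notebook does. Leaving the header out when no key is set is also an assumption about how the server behaves.

```python
import os


def build_headers(api_key=None):
    """Build request headers, reading the key from the environment if not given.

    Hypothetical helper: assumes LOCAL_LLM_API_KEY is exported in the shell.
    """
    key = api_key if api_key is not None else os.environ.get("LOCAL_LLM_API_KEY", "")
    headers = {"Content-Type": "application/json"}
    if key:
        # Mirrors the notebook's HEADERS: the raw key is the header value.
        headers["Authorization"] = key
    return headers
```

With this in place, `HEADERS = build_headers()` could stand in for the hard-coded dictionary in the next cell.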

## Global definitions and helpers


In [131]:
import openai
import requests
import time

# Set your LOCAL_LLM_SERVER and LOCAL_LLM_API_KEY here for using the notebook.
LOCAL_LLM_SERVER = "http://localhost:8091"
LOCAL_LLM_API_KEY = "Your LOCAL_LLM_API_KEY from your .env file"


# ------------------- DO NOT EDIT BELOW THIS LINE IN THIS CELL ------------------- #
openai.base_url = f"{LOCAL_LLM_SERVER}/v1/"
openai.api_key = LOCAL_LLM_API_KEY if LOCAL_LLM_API_KEY else LOCAL_LLM_SERVER
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": f"{LOCAL_LLM_API_KEY}",
}


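def display_content(content):
    """Print text content and render any audio, image, or video it references."""
    outputs_url = f"{LOCAL_LLM_SERVER}/outputs/"
    try:
        from IPython.display import Audio, display, Image, Video
    except ImportError:
        # Outside Jupyter/IPython, fall back to plain text output.
        print(content)
        return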
    if "<audio controls>" in content or " " not in content:
        import base64
        from datetime import datetime

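        try:
            # Extract the base64 payload from the data URI inside the <audio> tag.
            audio_response = content.split("data:audio/wav;base64,")[1].split('" type')[0]
        except IndexError:
            # No data URI present: treat the whole string as base64 audio.
            audio_response = content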
        file_name = f"{datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}.wav"
        with open(file_name, "wb") as fh:
            fh.write(base64.b64decode(audio_response))
        display(Audio(filename=file_name, autoplay=True))
    if outputs_url in content:
        file_name = content.split(outputs_url)[1].split('"')[0]
        url = f"{outputs_url}{file_name}"
        if url.endswith(".jpg") or url.endswith(".png"):
            content = content.replace(url, "")
            display(Image(url=url))
        elif url.endswith(".mp4"):
            content = content.replace(url, "")
            display(Video(url=url, autoplay=True))
        elif url.endswith(".wav"):
            content = content.replace(url, "")
            print(f"URL: {url}")
            display(Audio(url=url, autoplay=True))
    print(content)

## Language Models

Get a list of models to choose from if you don't already know what model you want to use.


In [124]:
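# Wait for the server to come up instead of failing on the first connection error.
while True:
    try:
        models = requests.get(f"{LOCAL_LLM_SERVER}/v1/models", headers=HEADERS)
        if models.status_code == 200:
            break
    except requests.exceptions.RequestException:
        # Server not reachable yet; retry after a short pause.
        pass
    time.sleep(1)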

print(models.json())

['bakllava-1-7b', 'llava-v1.5-7b', 'llava-v1.5-13b', 'yi-vl-6b', 'Goliath-longLORA-120b-rope8-32k-fp16', 'Goliath-longLORA-120b-rope8-32k-fp16', 'Etheria-55b-v0.1', 'EstopianMaid-13B', 'Everyone-Coder-33B-Base', 'FusionNet_34Bx2_MoE', 'WestLake-7B-v2', 'WestSeverus-7B-DPO', 'DiscoLM_German_7b_v1', 'Garrulus', 'DareVox-7B', 'NexoNimbus-7B', 'Lelantos-Maid-DPO-7B', 'stable-code-3b', 'Dr_Samantha-7B', 'NeuralBeagle14-7B', 'tigerbot-13B-chat-v5', 'Nous-Hermes-2-Mixtral-8x7B-SFT', 'Thespis-13B-DPO-v0.7', 'Code-290k-13B', 'Nous-Hermes-2-Mixtral-8x7B-DPO', 'Venus-120b-v1.2', 'LLaMA2-13B-Estopia', 'medicine-LLM', 'finance-LLM-13B', 'Yi-34B-200K-DARE-megamerge-v8', 'phi-2-orange', 'laser-dolphin-mixtral-2x7b-dpo', 'bagel-dpo-8x7b-v0.2', 'Everyone-Coder-4x7b-Base', 'phi-2-electrical-engineering', 'Cosmosis-3x34B', 'HamSter-0.1', 'Helion-4x34B', 'Bagel-Hermes-2x34b', 'deepmoney-34b-200k-chat-evaluator', 'deepmoney-34b-200k-base', 'TowerInstruct-7B-v0.1', 'PiVoT-SUS-RP', 'Noromaid-v0.4-Mixtral-Ins
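The polling loop above retries forever, which can hang a notebook if the server never starts. A bounded variant is sketched below. `wait_until` is a hypothetical helper, not part of the server or the `openai` package, and the 60-second default is an arbitrary assumption.

```python
import time


def wait_until(probe, timeout_s=60.0, interval_s=1.0):
    """Call probe() until it returns True or timeout_s elapses.

    Returns True on success and False on timeout. Exceptions raised by
    probe (for example connection errors) count as "not ready yet".
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Possible use with the names defined in this notebook:
# ready = wait_until(
#     lambda: requests.get(f"{LOCAL_LLM_SERVER}/v1/models", headers=HEADERS).status_code == 200
# )
```

Passing the check in as a callable keeps the helper independent of `requests`, so the same loop could wait on any endpoint.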

## Voices

Any `wav` file in the `voices` directory will be available to use as a voice.


In [125]:
voices = requests.get(f"{LOCAL_LLM_SERVER}/v1/audio/voices", headers=HEADERS)
print(voices.json())

{'voices': ['default', 'DukeNukem', 'Hal9000_Mono', 'Hal_voice_9000_Synthetic', 'SyntheticStarTrekComputerVoice', 'Synthetic_DukeNukem', 'Synthetic_Female_Hybrid_4_Phonetics_0001', 'Synthetic_Female_Phonetics_0001']}
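Because the voice list depends on which `wav` files are present, it can help to check a voice name before using it in a request. The sketch below assumes the response shape printed above (a dictionary with a `voices` list). `pick_voice` is a hypothetical helper, and falling back to `default` is an assumption based on that name appearing in the list.

```python
def pick_voice(voices_payload, preferred, fallback="default"):
    """Return preferred if the server lists it, otherwise the fallback voice."""
    available = voices_payload.get("voices", [])
    if preferred in available:
        return preferred
    if fallback in available:
        return fallback
    raise ValueError(f"Neither {preferred!r} nor {fallback!r} is available: {available}")


# Possible use with the response from the cell above:
# voice = pick_voice(voices.json(), "DukeNukem")
```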


## Embeddings

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/embeddings)


In [126]:
# Modify this prompt to generate different outputs
prompt = "Tacos are great."

response = openai.embeddings.create(
    input=prompt,
    model="phi-2-dpo",
)
print(response.data[0].embedding)

[0.3090707063674927, 1.1029287576675415, 2.3656630516052246, -0.4048413932323456, 1.0616954565048218, -0.38123324513435364, 1.000983476638794, 1.3528428077697754, -0.8648228645324707, 1.5190966129302979, 0.675586998462677, -0.19846737384796143, 0.5568936467170715, 0.06692102551460266, -0.31205207109451294, -0.31372374296188354, 0.7479310035705566, 1.7122050523757935, 0.11156223714351654, -0.38010597229003906, -2.0285604000091553, -0.9324357509613037, -0.26883846521377563, -0.41842585802078247, 0.7723556160926819, 1.2539050579071045, 0.4468786418437958, 1.082922339439392, 0.3584991693496704, -0.42915067076683044, -0.6801806092262268, -1.2013731002807617, 0.0065938811749219894, 0.5070878267288208, 0.6012413501739502, -0.36088356375694275, -0.8070286512374878, -0.42242664098739624, 1.5350435972213745, -1.8791346549987793, 0.43988990783691406, 0.705127477645874, 0.6156477332115173, 1.2618300914764404, -0.5392323732376099, 0.47247421741485596, 0.51447594165802, -0.5469154715538025, 0.130050
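Embedding vectors like the one printed above are usually compared with each other, not read directly. A common measure is cosine similarity. The sketch below is a plain-Python illustration with a hypothetical `cosine_similarity` helper. It is not something the server provides.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (range -1 to 1)."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Possible use with two embeddings from the cell above:
# first = openai.embeddings.create(input="Tacos are great.", model="phi-2-dpo").data[0].embedding
# second = openai.embeddings.create(input="Burritos are great.", model="phi-2-dpo").data[0].embedding
# print(cosine_similarity(first, second))
```

Values near 1 mean the two vectors point in nearly the same direction. How well that tracks semantic similarity depends on the model that produced the embeddings.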

## Chat Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/chat)


In [127]:
# Modify this prompt to generate different outputs
prompt = "Write a haiku about Taco Bell's Doritos Locos Tacos."


response = openai.chat.completions.create(
    model="phi-2-dpo",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.2,
    max_tokens=128,
    top_p=0.90,
    stream=False,
    extra_body={
        "system_message": "Act as a creative writer. After each request is fulfilled, end with </s> before further explanation.",
    },
)
display_content(response.choices[0].message.content)

Crunchy shells ring,
  Doritos locos in hand,
  Taco Bell delight.


**Note: A haiku is a traditional Japanese form of poetry consisting of three lines with 5 syllables in the first line, 7 syllables in the second line, and 5 syllables in the third line. The haiku above does not strictly follow this structure but attempts to capture the essence of Taco Bell's Doritos Locos Tacos.**


## Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/completions/create)


In [128]:
# Modify this prompt to generate different outputs
prompt = "Write a haiku about Taco Bell's Doritos Locos Tacos."

completion = openai.completions.create(
    model="phi-2-dpo",
    prompt=prompt,
    temperature=0.2,
    max_tokens=128,
    top_p=0.90,
    n=1,
    stream=False,
    extra_body={
        "system_message": "Act as a creative writer. After each request is fulfilled, end with </s> before further explanation.",
    },
)
display_content(completion.choices[0].text)

Crunchy shells burst,
  Cheesy salsa and meat sizzle,
  Loyalty never fades.


Taco Bell's Locos Tacos,
A guilty pleasure for many,
But true love remains strong.


## Cloning Text to Speech

Any `wav` file in the `voices` directory can be used as a voice.


In [129]:
prompt = "Write a haiku about Taco Bell's Doritos Locos Tacos."
response = requests.post(
    f"{LOCAL_LLM_SERVER}/v1/audio/generation",
    headers=HEADERS,
    json={
        "text": prompt,
        "voice": "DukeNukem",
        "language": "en",
    },
)
audio_response = response.json()
display_content(audio_response["data"])

UklGRkY6AwBXQVZFZm10IBAAAAABAAEAwF0AAIC7AAACABAATElTVBoAAABJTkZPSVNGVA4AAABMYXZmNTguNzYuMTAwAGRhdGEAOgMAFQAMABoAKgAtADMAPABIAEUASgBYAFwAYQBdAGUAYABcAGEAWwBYAFAAWABWAFcAUwBXAFYAUgBRAE8ARgBFAEQAPAA7AEQARAA9ADcAKQAlAB0AFgAVAAsADAAOAAkADAALABQADgALAAQABAAFAAIAAQAAAAUAAwANAAoAAQD7/wEAAQD//wQACwAQAAoAFAAOAAwAEwARAAMABQAVABAADgAEAAoABgAWABcAGAAZABMAGQAdABwAHgAaABgAGwAcACcAJgAeACUAIwAfABoAIQAnACUALwAvACoANQA2AC4ALQAtACoAJAAYABcAFQAfACcAIAAUAA8ADwAKAAgAFAAUABwAHQAXABkAIQAqABwAFgAZAA0AEQAaABkAGQATAAoACQAOABQAEQAHAAAAAQAAAP7/AwAKAPz/AwAFAP7//P8EAP7/BAAEAPv/9f/x/+X/2f/U/9z/1//O/8z/zP/S/8z/0f/e/+D/5//4//P/9f/0/+//9v/y/+X/6v/b/9z/5P/e/9n/1P/a/9r/3P/b/9b/0//Z/9b/4v/g/+L/3v/W/9D/yf/A/8b/zP/F/8T/u//U/9L/2v/p/+v/8//s/+3/+f/+//v//P////7//f8NACIAIQAmAC8AMgBGAFAAWwBbAFwAWwBZAFcAXABjAGcAZwBvAHgAdwCBAH0AhQCDAIAAgQB4AHYAcwBqAG0AcgB/AHYAcgB1AF4AUQBdAGEAawCGAJIAiACFAI4AkACOAH8AdgBjAGEAUwBHAEsAQAA5AEUAPAA1ACEAHwAeADQANAA1AC8AOwBEAD0ALQAkAB0AHAAPAP//BwD2//f/5f/X/9X/2v/h/+X/5f/0/xYAIwAvAEMATgBXAGAAbwBtAHwAkgCnAKUA
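The string printed above is the base64-encoded WAV file. Its `UklGR` prefix is what the ASCII bytes `RIFF` look like in base64. A quick sanity check before saving or replaying the audio is sketched below. `looks_like_wav` is a hypothetical helper that inspects only the 12-byte RIFF/WAVE header and does not validate the rest of the file.

```python
import base64


def looks_like_wav(b64_audio):
    """Return True if the base64 string starts with a RIFF/WAVE header."""
    try:
        # 16 base64 characters decode to exactly the 12 header bytes.
        header = base64.b64decode(b64_audio[:16])
    except (ValueError, TypeError):
        return False
    return len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


# Possible use with the response from the cell above:
# if looks_like_wav(audio_response["data"]):
#     with open("haiku.wav", "wb") as fh:
#         fh.write(base64.b64decode(audio_response["data"]))
```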

## Speech to Text


In [130]:
# Transcribe the base64 WAV audio generated in the previous cell.
transcription = requests.post(
    f"{LOCAL_LLM_SERVER}/v1/audio/transcriptions",
    headers=HEADERS,
    json={
        "file": audio_response["data"],
        "audio_format": "wav",
        "model": "base.en",
    },
)
print(transcription.json())

{'data': " Ride a haiku about Taco Bell's Doritos Locos Tacos."}


## Voice Completion Example


In [132]:
# Reuse the base64 audio from the Cloning Text to Speech cell as the prompt.
completion = openai.completions.create(
    model="phi-2-dpo",
    prompt=audio_response["data"],
    temperature=0.2,
    max_tokens=256,
    top_p=0.90,
    n=1,
    stream=False,
    extra_body={
        "system_message": "Act as a creative writer Duke Nukem. All of your responses are transcribed to audio and sent to the user. After each request is fulfilled, end with </s> before further explanation.",
        "audio_format": "wav",
        "voice": "DukeNukem",
    },
)

response_text = completion.choices[0].text
display_content(response_text)

URL: http://localhost:8091/outputs/8cb84e5b0c1b4f70bdd2b43eef6d21e5.wav


Crunchy shells ring,
  Cheesy salsa and meat delight,
  Taco Bell's Locos Tacos.

A taste that ignites the soul,
Spicy, savory, and oh so good,
Doritos Locos Tacos!

