# ezlocalai Tests and Examples

Choose a model from the models list and pass it as the `model` argument in the API calls. The Language Models section below shows how to list the available models.

Install the `openai`, `requests`, and `python-dotenv` packages:

```bash
pip install openai requests python-dotenv
```

**Note: you do not need an OpenAI API key. The API key is the `EZLOCALAI_API_KEY` for your server, if you defined one in your `.env` file.**

## Global definitions and helpers

Confirm that `DEFAULT_LLM` in your `.env` file is set to the model you want to use. The cell below reads it and falls back to `unsloth/Qwen3-VL-4B-Instruct-GGUF` if it is unset.


In [19]:
import openai
import requests
import time
import os
import re
from dotenv import load_dotenv

load_dotenv()

# Set your system message, max tokens, temperature, and top p here, or use the defaults.
SYSTEM_MESSAGE = "The assistant is acting as a creative writer. All of your text responses are transcribed to audio and sent to the user. Be concise with all responses. After the request is fulfilled, end with </s>."
DEFAULT_MAX_TOKENS = 256
DEFAULT_TEMPERATURE = 0.5
DEFAULT_TOP_P = 0.9

# ------------------- DO NOT EDIT BELOW THIS LINE IN THIS CELL ------------------- #
EZLOCALAI_SERVER = os.getenv("EZLOCALAI_SERVER", "http://localhost:8091")
EZLOCALAI_API_KEY = os.getenv("EZLOCALAI_API_KEY", "none")
DEFAULT_LLM = os.getenv("DEFAULT_LLM", "unsloth/Qwen3-VL-4B-Instruct-GGUF")
openai.base_url = f"{EZLOCALAI_SERVER}/v1/"
openai.api_key = EZLOCALAI_API_KEY if EZLOCALAI_API_KEY else EZLOCALAI_SERVER
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": f"{EZLOCALAI_API_KEY}",
    "ngrok-skip-browser-warning": "true",
}


def display_content(content):
    """Print text and render any media linked from the server's outputs directory."""
    outputs_url = f"{EZLOCALAI_SERVER}/outputs/"
    try:
        from IPython.display import Audio, display, Image, Video
    except ImportError:
        # IPython is unavailable (not a notebook), so fall back to plain text.
        print(content)
        return
    # Point localhost links at the configured server address.
    content = content.replace("http://localhost:8091/outputs/", outputs_url)
    # Find every outputs link; a quote or any whitespace ends the URL.
    urls = re.findall(f"{re.escape(outputs_url)}[^\"'\\s]+", content)
    for url in urls:
        file_name = url.split("/")[-1]
        url = f"{outputs_url}{file_name}"
        if url.endswith((".jpg", ".png")):
            content = content.replace(url, "")
            display(Image(url=url))
        elif url.endswith(".mp4"):
            content = content.replace(url, "")
            # Video has no autoplay argument; it is set through html_attributes.
            display(Video(url=url, html_attributes="controls autoplay"))
        elif url.endswith(".wav"):
            content = content.replace(url, "")
            display(Audio(url=url, autoplay=True))
    print(content)

## Language Models

List the available models if you don't already know which one you want to use.


In [20]:
# Wait for server to come up instead of timing out.
while True:
    try:
        models = requests.get(f"{EZLOCALAI_SERVER}/v1/models", headers=HEADERS)
        if models.status_code == 200:
            break
    except requests.exceptions.RequestException:
        # Server not reachable yet; retry after a short pause.
        pass
    time.sleep(1)
print(models.json())

[{'id': 'unsloth/Qwen3-VL-2B-Instruct-GGUF', 'object': 'model', 'owned_by': 'ezlocalai'}]
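
The loop above retries forever, so a stopped or misconfigured server leaves the cell hanging. A minimal sketch of a bounded wait follows. `wait_until_ready` and its parameters are hypothetical names, not part of ezlocalai, and the readiness check is passed in as a callable so the helper assumes nothing about the endpoint.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 1.0,
) -> bool:
    """Poll check() until it returns True or `timeout` seconds have passed.

    Hypothetical helper: returns True once check() succeeds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if check():
                return True
        except Exception:
            # Treat any error (for example, connection refused) as "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In this notebook it could be called as `wait_until_ready(lambda: requests.get(f"{EZLOCALAI_SERVER}/v1/models", headers=HEADERS, timeout=5).status_code == 200)`, raising or printing a message when it returns `False`.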


## Voices

Any `wav` file in the `voices` directory will be available to use as a voice.


In [21]:
voices = requests.get(f"{EZLOCALAI_SERVER}/v1/audio/voices", headers=HEADERS)
print(voices.json())

{'voices': ['DukeNukem', 'StarTrekComputer1', 'default', 'Morgan_Freeman', 'HAL9000']}
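
Voice names differ between installations, so a request for a voice that is not installed may fail. The sketch below picks a voice from a list shaped like the `voices` value above and falls back to `default`. `pick_voice` is a hypothetical helper, and how the server reacts to an unknown voice is not confirmed here.

```python
def pick_voice(available, preferred, fallback="default"):
    """Return `preferred` if listed, else `fallback` if listed, else the first voice."""
    if preferred in available:
        return preferred
    if fallback in available:
        return fallback
    if not available:
        raise ValueError("The server reported no voices.")
    return available[0]


# The list printed by the cell above.
voices_list = ["DukeNukem", "StarTrekComputer1", "default", "Morgan_Freeman", "HAL9000"]
print(pick_voice(voices_list, "HAL9000"))  # HAL9000
print(pick_voice(voices_list, "Missing"))  # default
```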


## Vision Test


In [22]:
response = openai.chat.completions.create(
    model=DEFAULT_LLM,  # Must be a vision-capable model, such as the default Qwen3-VL-4B
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe each stage of this image."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://www.visualwatermark.com/images/add-text-to-photos/add-text-to-image-3.webp"
                    },
                },
            ],
        },
    ],
    max_tokens=DEFAULT_MAX_TOKENS,
    temperature=DEFAULT_TEMPERATURE,
    top_p=DEFAULT_TOP_P,
)
display_content(response.choices[0].message.content)

This image captures a serene and uplifting scene at sunset over a body of water, with a flock of birds in flight. The overall mood is one of freedom, peace, and natural beauty.

Here is a description of each stage of the image:

- **The Sky and Horizon:** The sky is the dominant feature, painted with a beautiful gradient of colors. It begins with a soft, pale blue at the top, which gradually deepens into a warm, glowing orange and pink near the horizon. This coloration is characteristic of a sunset, with the sun itself appearing as a bright, glowing orb just above the horizon line. The sky is clear and unblemished, suggesting calm weather.

- **The Water:** The body of water, likely a bay or sea, stretches across the lower half of the image. The surface is calm, with gentle ripples that reflect the warm colors of the sunset. The water acts as a mirror, capturing the soft glow of the sun and the soft silhouettes of the birds. The horizon line is clearly visible, separating the water fro
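
The request above passes a public image URL. For a local file, the OpenAI chat format also allows a base64 `data:` URL in the same `image_url` field. The sketch below builds one; `image_to_data_url` is a hypothetical helper, and whether your ezlocalai server accepts data URLs is an assumption to verify.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path):
    """Encode a local image file as a data URL (hypothetical helper)."""
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

To try it, replace the `url` value in the request with `image_to_data_url("photo.png")`, where `photo.png` stands for any image on disk.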

## Chat Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/chat)


In [23]:
# Modify this prompt to generate different outputs
prompt = "Write a short poem about Pikachu with a picture."


response = openai.chat.completions.create(
    model=DEFAULT_LLM,
    messages=[{"role": "user", "content": prompt}],
    temperature=DEFAULT_TEMPERATURE,
    max_tokens=DEFAULT_MAX_TOKENS,
    top_p=DEFAULT_TOP_P,
    stream=False,
    extra_body={"system_message": SYSTEM_MESSAGE},
)
display_content(response.choices[0].message.content)

**Pikachu's Spark**

A tiny, fuzzy, electric flash—  
A flash of *pikachu*, bright and bold!  
With ears that twitch and tail that wags,  
A little lightning in the sky.

It hops with joy, a spark in the air,  
A sunbeam's glow, a burst of pure,  
*Pikachu! Pikachu!*  
A spark in the sky, a spark in the heart.

It's not just a Pokémon, it's a spark,  
A little friend, a spark in the night.  
With every leap, a new light,  
A spark in the sky, a spark in the heart.

*Pikachu!*  
The spark that never fades,  
The spark that never fades.
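
The cell above sends the system message through `extra_body`. The standard OpenAI chat format instead carries it as a `system` role entry in `messages`. The sketch below builds such a list; `build_messages` is a hypothetical helper, and it is an assumption that the server treats a `system` message the same way as `extra_body`.

```python
def build_messages(prompt, system_message=None, history=None):
    """Build an OpenAI-style messages list (hypothetical helper)."""
    messages = []
    if system_message:
        messages.append({"role": "system", "content": system_message})
    # Earlier turns, if any, go between the system message and the new prompt.
    messages.extend(history or [])
    messages.append({"role": "user", "content": prompt})
    return messages


messages = build_messages("Write a short poem about Pikachu.", system_message="Be concise.")
print(messages[0]["role"], messages[-1]["role"])  # system user
```

The result can be passed as `messages=messages` to `openai.chat.completions.create`.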


## Completion

[OpenAI API Reference](https://platform.openai.com/docs/api-reference/completions/create)


In [24]:
# Modify this prompt to generate different outputs
prompt = "Write a haiku about the future."

completion = openai.completions.create(
    model=DEFAULT_LLM,
    prompt=prompt,
    temperature=DEFAULT_TEMPERATURE,
    max_tokens=DEFAULT_MAX_TOKENS,
    top_p=DEFAULT_TOP_P,
    n=1,
    stream=False,
    extra_body={"system_message": SYSTEM_MESSAGE},
)
display_content(completion.choices[0].text)

The haiku should be written in English and should have three lines, with the first line having 5 syllables, the second line having 7 syllables, and the third line having 5 syllables.

The future is bright,  
with stars in every breath,  
and hope in every dawn.  

(5-7-5)  
*Note: The first line has 5 syllables (The future is bright),  
the second line has 7 syllables (with stars in every breath),  
the third line has 5 syllables (and hope in every dawn).*  

This is a perfect example of a haiku. The poem captures a sense of optimism and the promise of a hopeful future. It uses nature imagery (stars, dawn) to evoke the beauty and potential of the future. The structure is traditional and follows the 5-7-5 syllable pattern. The poem is also concise and memorable, making it a great example of a haiku.  

The haiku is:  
The future is bright,  
with stars in every breath,  
and hope in every dawn.  

This haiku is a beautiful and concise expression of the future's promise. It is a perfect 
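
The output above stops mid-sentence, which is consistent with the response reaching `DEFAULT_MAX_TOKENS`. In the OpenAI response format, a choice cut off by the token limit has `finish_reason` set to `"length"`. The sketch below checks for that; `was_truncated` is a hypothetical helper, the `SimpleNamespace` objects stand in for real responses, and it is an assumption that the server fills in `finish_reason`.

```python
from types import SimpleNamespace


def was_truncated(response):
    """Return True if any choice stopped because it hit the token limit."""
    return any(
        getattr(choice, "finish_reason", None) == "length"
        for choice in response.choices
    )


# Stand-in objects shaped like completion responses, for illustration only.
truncated = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
complete = SimpleNamespace(choices=[SimpleNamespace(finish_reason="stop")])
print(was_truncated(truncated))  # True
print(was_truncated(complete))  # False
```

When it returns `True`, raising `DEFAULT_MAX_TOKENS` or asking for a shorter answer are the usual remedies.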

## Cloning Text to Speech

Any `wav` file in the `voices` directory can be used as a voice.


In [25]:
from pathlib import Path
import base64
import IPython.display as ipd

prompt = "Write a short poem about vikings with a picture."
os.makedirs("temp", exist_ok=True)
audio_path = os.path.join(os.getcwd(), "temp", "test-speech.wav")
speech_file_path = Path(audio_path)
tts_response = openai.audio.speech.create(
    model="tts-1",
    voice="DukeNukem",
    input=prompt,
    extra_body={"language": "en"},
)
# This notebook treats the response body as base64-encoded audio, so decode it before saving.
audio_content = base64.b64decode(tts_response.content)
speech_file_path.write_bytes(audio_content)

ipd.Audio(speech_file_path)
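
The cell above assumes the response body is base64-encoded. If a server returned raw WAV bytes instead, decoding would fail or produce an unplayable file. The sketch below accepts either form by checking for the RIFF/WAVE header; `decode_wav_payload` is a hypothetical helper, not part of ezlocalai or the `openai` package.

```python
import base64
import binascii


def is_wav(data: bytes) -> bool:
    """Check for the RIFF/WAVE magic bytes at the start of a WAV file."""
    return data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def decode_wav_payload(payload: bytes) -> bytes:
    """Return WAV bytes from a payload that is raw WAV or base64-encoded WAV."""
    if is_wav(payload):
        return payload
    try:
        decoded = base64.b64decode(payload.strip(), validate=True)
    except binascii.Error as exc:
        raise ValueError("Payload is neither WAV nor valid base64.") from exc
    if not is_wav(decoded):
        raise ValueError("Decoded payload is not a WAV file.")
    return decoded
```

It could replace the `base64.b64decode(tts_response.content)` call with `decode_wav_payload(tts_response.content)`.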

## Audio to Text


In [26]:
with open(audio_path, "rb") as audio_file:
    transcription = openai.audio.transcriptions.create(model="base", file=audio_file)

print(transcription.text)

 Read a short poem about Vikings with a picture.


## Upload a Voice


In [27]:
upload_headers = HEADERS.copy()
del upload_headers["Content-Type"]
with open(audio_path, "rb") as audio_file:
    files = {"file": ("test-speech.wav", audio_file, "audio/wav")}
    data = {"voice": "Test"}
    response = requests.post(
        f"{EZLOCALAI_SERVER}/v1/audio/voices",
        files=files,
        data=data,
        headers=upload_headers,
    )
    print(response.json())

{'detail': 'Voice Test has been uploaded.'}
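
Before uploading a clip, it can help to confirm that the file is a readable WAV and to see how long it is. The sketch below uses the standard library `wave` module, which handles uncompressed PCM files and raises `wave.Error` for other encodings. `wav_info` is a hypothetical helper, and the server's own requirements for voice clips are not documented here.

```python
import wave


def wav_info(path):
    """Return channel count, sample rate, and duration of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate": rate,
            "seconds": frames / rate,
        }
```

For example, `wav_info(audio_path)` could be called just before the upload request above.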
