# Generating Completions

The whole point of `zyx` is to make working with LLMs as chat models as quick as possible.

---

There are two ways to use all of the LLM functions in `zyx`:

<div class="grid cards" markdown>

- Running the function through the `Completions` client
- Running the function straight from the `zyx` namespace

</div>

<br/>

Calling a function straight from the `zyx` namespace is simpler, but the `Completions` client has the advantage that it only needs to instantiate a single client object. Furthermore, the `Agentic Framework` is only available as a method of the `Completions` client.

The following sections walk through the various features of the `.completion()` method.

---

## Standard Chat Completions

### Completion with the `Completions` Client

```python
# import the completions client
from zyx import Completions

# create a completions client
client = Completions()

# create a completion
completion = client.completion(
    messages = [{"role": "user", "content": "Hello!"}],
    model = "gpt-4o-mini"
)

# print the completion content
# responses are returned as the openai `ChatCompletion` object
print(completion.choices[0].message.content)
```

### Using `.completion()` Directly

```python
# import zyx (or just import the `completion` function)
import zyx as z

# create a completion
completion = z.completion(
    messages = "Hello!", # messages can be passed in as a string

    # supports all standard chat completion arguments
    temperature = 0.5,
    max_completion_tokens = 10, # openai has swapped `max_tokens` for `max_completion_tokens`, this is the new default
    top_p = 0.9,
)

# print the completion content
print(completion.choices[0].message.content)
```
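
Passing `messages` as a plain string is shorthand for a single user message. The sketch below illustrates the idea with a hypothetical helper, `normalize_messages`; it is not part of `zyx`, and how `zyx` performs this conversion internally is an assumption here.

```python
from typing import Dict, List, Union

Message = Dict[str, str]


def normalize_messages(messages: Union[str, List[Message]]) -> List[Message]:
    """Hypothetical helper (not a zyx function): accept a string or a message list."""
    if isinstance(messages, str):
        # assumption: a bare string is treated as one message with the "user" role
        return [{"role": "user", "content": messages}]
    # already a list of chat messages: return a shallow copy so the input is not mutated
    return [dict(message) for message in messages]


print(normalize_messages("Hello!"))
```

With a helper like this, `"Hello!"` and `[{"role": "user", "content": "Hello!"}]` end up as the same request payload.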

## Use Any LiteLLM Model

The completions client is a unified wrapper around both the OpenAI and LiteLLM clients. If you are using a LiteLLM model, pass its name as you normally would; the client automatically detects the model and uses the appropriate underlying client.

```python
# we'll import the function itself this time
from zyx import completion

# create a completion using an ollama model
response = completion("hi whats up?", model = "ollama/granite3-moe")

# print the response
print(response.choices[0].message.content)
```
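
To give a rough intuition for what "detecting the model" could mean, here is a minimal sketch of one possible routing rule. It relies only on the LiteLLM convention of writing model names as `provider/model` (as in `ollama/granite3-moe`). The function `pick_backend` is hypothetical, and the actual detection logic inside `zyx` may well differ.

```python
def pick_backend(model: str) -> str:
    """Hypothetical routing rule for illustration only, not zyx's actual logic."""
    # LiteLLM model identifiers are conventionally written as "<provider>/<model>"
    provider, separator, _ = model.partition("/")
    if separator and provider:
        return "litellm"
    # assumption: names without a provider prefix go to the OpenAI client
    return "openai"


for name in ("gpt-4o-mini", "ollama/granite3-moe"):
    print(name, "->", pick_backend(name))
```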

## Streaming

Streaming is handled by the underlying `openai.OpenAI.chat.completions.create()` method, so it works out of the box: pass `stream = True` and iterate over the returned chunks.

```python
# import zyx
import zyx as z

# create a streaming completion
response = z.completion(
    messages = [
        {"role" : "system", "content" : "You only respond in French."},
        {"role" : "user", "content" : "hi whats up?"}
    ],
    model = "ollama/granite3-moe",
    stream = True
)

# print the response as the chunks arrive
for chunk in response:
    print(chunk.choices[0].delta.content or "", end = "")
```
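
When the full text is needed after streaming finishes, the chunks can be joined as they arrive. The sketch below assumes each chunk follows the OpenAI streaming shape used above (`chunk.choices[0].delta.content`, which may be `None`). `collect_stream` and `fake_chunk` are hypothetical names introduced for illustration, and the fake chunks stand in for a real response so the example runs without a model.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Hypothetical helper: join the text deltas of a streamed completion."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            # some chunks may carry no choices at all; skip them
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            # content can be None or empty (for example on the final chunk)
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streaming chunk (illustration only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_stream = [fake_chunk("Bon"), fake_chunk("jour"), fake_chunk(None)]
print(collect_stream(demo_stream))
```

A stream can only be consumed once, so either print the chunks as they arrive or collect them, not both in separate loops.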


The following section will cover generating structured outputs through the `completion()` function.