# How to call text-to-text DIAL applications

This notebook covers how to call text-to-text applications via the [DIAL API chat/completions](https://epam-rail.com/dial_api#/paths/~1openai~1deployments~1%7BDeployment%20Name%7D~1chat~1completions/post) endpoint.

## Setup

First, we install the necessary dependencies and import the libraries we will be using.

In [1]:
%pip install -r ../python-notebooks-runner/requirements.txt > /dev/null

Note: you may need to restart the kernel to use updated packages.


In [2]:
import requests
import openai
import langchain_openai

Run docker compose in a separate terminal to start the DIAL Core server locally along with the `echo` application.

`echo` is a simple text-to-text application which returns the content of the last user message.

```sh
(cd ..; docker compose up core echo)
```

Let's now configure the DIAL Core URL. It is `http://localhost:8080` when DIAL Core runs locally.

In [4]:
import os
dial_url = os.environ.get("DIAL_URL", "http://localhost:8080")
os.environ["DIAL_URL"] = dial_url

## Curl

The echo application deployment is called `echo`.

The local DIAL Core server URL is stored in `dial_url` and exported as the `DIAL_URL` environment variable.

The OpenAI API version we are going to use is `2023-03-15-preview`.

Hence the echo application is accessible via the URL:

```
${DIAL_URL}/openai/deployments/echo/chat/completions?api-version=2023-03-15-preview
```
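
The URL above can also be assembled in Python from its three parts: the DIAL Core URL, the deployment name, and the API version. This is a minimal sketch; the helper name `chat_completions_url` is hypothetical and not part of any DIAL library.

```python
from urllib.parse import urlencode


def chat_completions_url(dial_url: str, deployment: str, api_version: str) -> str:
    """Build the chat/completions URL for a deployment (hypothetical helper)."""
    # Drop a trailing slash so the path does not start with a double slash.
    base = dial_url.rstrip("/")
    query = urlencode({"api-version": api_version})
    return f"{base}/openai/deployments/{deployment}/chat/completions?{query}"


url = chat_completions_url("http://localhost:8080", "echo", "2023-03-15-preview")
print(url)
```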

The corresponding curl command with a single message in the request is:

In [5]:
!curl -X POST "${DIAL_URL}/openai/deployments/echo/chat/completions?api-version=2023-03-15-preview" \
  -H "Api-Key:dial_api_key" \
  -H "Content-Type:application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello world!"}]}'

{"choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Hello world!"}}],"usage":null,"id":"15ab357a-c760-4cab-8925-c693db2b0bfb","created":1706095229,"object":"chat.completion"}

## Requests


Let's make an HTTP request from Python using the `requests` library and make sure the returned message matches the message in the request.

The arguments are identical to the curl command above.

Let's call the application in non-streaming mode first:


In [6]:
response = requests.post(
    f"{dial_url}/openai/deployments/echo/chat/completions?api-version=2023-03-15-preview",
    headers={"Api-Key": "dial_api_key"},
    json={"messages": [{"role": "user", "content": "Hello world!"}]},
)
body = response.json()
display(body)
completion = body["choices"][0]["message"]["content"]
print(f"Completion: {completion!r}")
assert completion == "Hello world!", "Unexpected completion"

{'choices': [{'index': 0,
   'finish_reason': 'stop',
   'message': {'role': 'assistant', 'content': 'Hello world!'}}],
 'usage': None,
 'id': 'ee2e401b-aebe-47e1-bc9d-ad4461da8988',
 'created': 1706095232,
 'object': 'chat.completion'}

Completion: 'Hello world!'
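
Indexing with `body["choices"][0]["message"]["content"]` raises a bare `KeyError` or `IndexError` if the server returns an error body instead of a completion. One possible way to fail with a clearer message is sketched here; the helper name `extract_completion` is hypothetical, and the sample body is copied from the output above.

```python
def extract_completion(body: dict) -> str:
    """Return the first choice's message content (hypothetical helper)."""
    choices = body.get("choices") or []
    if not choices:
        raise ValueError(f"No choices in response body: {body!r}")
    content = (choices[0].get("message") or {}).get("content")
    if content is None:
        raise ValueError(f"First choice has no message content: {choices[0]!r}")
    return content


# Sample body copied from the non-streaming output above.
sample_body = {
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Hello world!"},
        }
    ],
    "usage": None,
    "id": "ee2e401b-aebe-47e1-bc9d-ad4461da8988",
    "created": 1706095232,
    "object": "chat.completion",
}
print(extract_completion(sample_body))
```

Calling `response.raise_for_status()` before `response.json()` would additionally surface HTTP-level errors before the body is parsed.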


When streaming is enabled, the chat completion returns a sequence of messages, each containing a chunk of a generated response:

In [7]:
response = requests.post(
    f"{dial_url}/openai/deployments/echo/chat/completions?api-version=2023-03-15-preview",
    headers={"Api-Key": "dial_api_key"},
    json={"messages": [{"role": "user", "content": "Hello world!"}], "stream": True},
    stream=True,  # let `requests` read the response body incrementally
)
for chunk in response.iter_lines():
    print(chunk)

b'data: {"choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}],"usage":null,"id":"90efebc8-6857-4cc5-b1c0-79383f385f22","created":1706095234,"object":"chat.completion.chunk"}'
b''
b'data: {"choices":[{"index":0,"finish_reason":null,"delta":{"content":"Hello world!"}}],"usage":null,"id":"90efebc8-6857-4cc5-b1c0-79383f385f22","created":1706095234,"object":"chat.completion.chunk"}'
b''
b'data: {"choices":[{"index":0,"finish_reason":"stop","delta":{}}],"usage":null,"id":"90efebc8-6857-4cc5-b1c0-79383f385f22","created":1706095234,"object":"chat.completion.chunk"}'
b''
b'data: [DONE]'
b''
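
Each non-empty line in the output above is a server-sent event: the prefix `data: ` followed by either a JSON chunk or the `[DONE]` terminator. A minimal sketch of how such lines could be folded into the full completion follows; the helper name `collect_completion` is hypothetical, and the sample lines are abbreviated copies of the output above.

```python
import json
from typing import Iterable


def collect_completion(lines: Iterable[bytes]) -> str:
    """Concatenate the content deltas from SSE lines (hypothetical helper)."""
    prefix = b"data: "
    parts = []
    for line in lines:
        # Skip the blank separator lines between events.
        if not line.startswith(prefix):
            continue
        payload = line[len(prefix):]
        # The stream ends with a literal [DONE] marker, which is not JSON.
        if payload == b"[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # The first and last chunks carry only role / finish_reason.
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


# Abbreviated copies of the streamed lines shown above.
sample = [
    b'data: {"choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}]}',
    b"",
    b'data: {"choices":[{"index":0,"finish_reason":null,"delta":{"content":"Hello world!"}}]}',
    b"",
    b'data: {"choices":[{"index":0,"finish_reason":"stop","delta":{}}]}',
    b"",
    b"data: [DONE]",
]
print(collect_completion(sample))
```

With a live server, `response.iter_lines()` from the cell above could be passed in place of `sample`.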


## OpenAI Python SDK

The DIAL deployment can also be called using the [OpenAI Python SDK](https://pypi.org/project/openai/).


In [9]:
openai_client = openai.AzureOpenAI(
    azure_endpoint=dial_url,
    azure_deployment="echo",
    api_key="dial_api_key",
    api_version="2023-03-15-preview",
)

In non-streaming mode:

In [10]:

chat_completion = openai_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Hello world!",
        }
    ],
    model="echo",
)
print(chat_completion)
completion = chat_completion.choices[0].message.content
print(f"Completion: {completion!r}")
assert completion == "Hello world!", "Unexpected completion"

ChatCompletion(id='5d17d2d5-f4ea-4a3d-b6e0-b2feafd70dd8', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Hello world!', role='assistant', function_call=None, tool_calls=None))], created=1706095241, model=None, object='chat.completion', system_fingerprint=None, usage=None)
Completion: 'Hello world!'


And in streaming mode:

In [11]:
chat_completion = openai_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Hello world!",
        }
    ],
    stream=True,
    model="echo",
)
completion = ""
for chunk in chat_completion:
    print(chunk)
    content = chunk.choices[0].delta.content
    if content:
        completion += content
print(f"Completion: {completion!r}")
assert completion == "Hello world!", "Unexpected completion"

ChatCompletionChunk(id='62b16151-58ed-4f68-ad67-ca9a69effb13', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706095243, model=None, object='chat.completion.chunk', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='62b16151-58ed-4f68-ad67-ca9a69effb13', choices=[Choice(delta=ChoiceDelta(content='Hello world!', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706095243, model=None, object='chat.completion.chunk', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='62b16151-58ed-4f68-ad67-ca9a69effb13', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1706095243, model=None, object='chat.completion.chunk', system_fingerprint=None, usage=None)
Completion: 'Hello world!'


## LangChain

Let's call the application via the [LangChain](https://pypi.org/project/langchain-openai/) library.

In [13]:
from langchain_core.messages import HumanMessage

llm = langchain_openai.AzureChatOpenAI(
    azure_endpoint=dial_url,
    azure_deployment="echo",
    api_key="dial_api_key",
    api_version="2023-03-15-preview",
)

In non-streaming mode:

In [14]:
output = llm.generate(messages=[[HumanMessage(content="Hello world!")]])
print(output)
completion = output.generations[0][0].text
print(f"Completion: {completion!r}")
assert completion == "Hello world!", "Unexpected completion"

generations=[[ChatGeneration(text='Hello world!', generation_info={'finish_reason': 'stop', 'logprobs': None}, message=AIMessage(content='Hello world!'))]] llm_output={'token_usage': {}, 'model_name': 'gpt-3.5-turbo'} run=[RunInfo(run_id=UUID('c92dacc5-ffa9-4498-99ca-15966453bda0'))]
Completion: 'Hello world!'


In streaming mode:

In [16]:
output = llm.stream(input=[HumanMessage(content="Hello world!")])
completion = ""
for chunk in output:
    print(chunk.dict())
    completion += chunk.content
print(f"Completion: {completion!r}")
assert completion == "Hello world!", "Unexpected completion"

{'content': '', 'additional_kwargs': {}, 'type': 'AIMessageChunk', 'example': False}
{'content': 'Hello world!', 'additional_kwargs': {}, 'type': 'AIMessageChunk', 'example': False}
{'content': '', 'additional_kwargs': {}, 'type': 'AIMessageChunk', 'example': False}
Completion: 'Hello world!'
