# How to call text-to-text DIAL applications

This notebook covers how to call a text-to-text application via the [DIAL API chat/completions](https://epam-rail.com/dial_api#/paths/~1openai~1deployments~1%7BDeployment%20Name%7D~1chat~1completions/post) endpoint.

## Setup

Install the client libraries, which we will use to call the API.


In [1]:
%pip install -r ../python-notebooks-runner/requirements.txt > /dev/null

Note: you may need to restart the kernel to use updated packages.


Run docker compose in a separate terminal to start the DIAL Core server locally along with the echo application.

```sh
(cd ..; docker compose up echo core)
```

Configure the DIAL Core URL. If DIAL Core runs locally via `docker compose`, the URL is `http://localhost:8080`.

In [2]:
import os
dial_url = os.environ.get("DIAL_URL", "http://localhost:8080")
os.environ["DIAL_URL"] = dial_url

## Curl

The echo application deployment is called `echo`.

The local DIAL Core server URL is stored in `dial_url` and exported as the `DIAL_URL` environment variable.

The OpenAI API version we are going to use is `2023-03-15-preview`.

Hence the echo application is accessible via the URL:

```
${DIAL_URL}/openai/deployments/echo/chat/completions?api-version=2023-03-15-preview
```
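
As a rough sketch, the same URL can be assembled in Python from its three parts: the base URL, the deployment name, and the API version. The helper name `chat_completions_url` is hypothetical and not part of any DIAL library; it only mirrors the path pattern shown above.

```python
from urllib.parse import quote, urlencode


def chat_completions_url(base_url: str, deployment: str, api_version: str) -> str:
    """Build a chat/completions URL for a deployment (hypothetical helper)."""
    # Drop a trailing slash so the path is not joined with a double slash.
    base = base_url.rstrip("/")
    # Percent-encode the deployment name in case it contains special characters.
    path = f"/openai/deployments/{quote(deployment, safe='')}/chat/completions"
    query = urlencode({"api-version": api_version})
    return f"{base}{path}?{query}"


url = chat_completions_url("http://localhost:8080", "echo", "2023-03-15-preview")
print(url)
```

Note that the base URL already carries the scheme (`http://`), so the helper does not add one.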

The corresponding curl command with a single message in the request is:

In [3]:
!curl -X POST "${DIAL_URL}/openai/deployments/echo/chat/completions?api-version=2023-03-15-preview" -H "Api-Key:dial_api_key" -H "Content-Type:application/json" -d '{"messages": [{"role": "user", "content": "Hello world!"}]}'

{"choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Hello world!"}}],"usage":null,"id":"867d7593-ef73-4fb0-a948-aec6cb761b10","created":1706025457,"object":"chat.completion"}

## Requests


Let's make an HTTP request from Python using the `requests` library and make sure the returned message matches the message in the request.

The arguments are identical to the curl command above.

### Non-streaming


In [4]:
import requests

response = requests.post(
    f"{dial_url}/openai/deployments/echo/chat/completions?api-version=2023-03-15-preview",
    headers={"Api-Key": "dial_api_key"},
    json={"messages": [{"role": "user", "content": "Hello world!"}]},
)
body = response.json()
display(body)
completion = body["choices"][0]["message"]["content"]
print(f"Completion: {completion!r}")
assert completion == "Hello world!", "Unexpected completion"

{'choices': [{'index': 0,
   'finish_reason': 'stop',
   'message': {'role': 'assistant', 'content': 'Hello world!'}}],
 'usage': None,
 'id': 'cc2b4161-7dce-48b4-811d-af372f4327e7',
 'created': 1706025457,
 'object': 'chat.completion'}

Completion: 'Hello world!'
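
The indexing `body["choices"][0]["message"]["content"]` raises a bare `KeyError` or `IndexError` if the response has an unexpected shape. Below is a minimal sketch of a more defensive extraction, assuming the response layout printed above. The function name `extract_completion` is hypothetical, and the sample body is abridged from the output of the previous cell.

```python
def extract_completion(body: dict) -> str:
    """Return the content of the first choice, or raise ValueError (hypothetical helper)."""
    choices = body.get("choices") or []
    if not choices:
        raise ValueError("Response contains no choices")
    message = choices[0].get("message") or {}
    content = message.get("content")
    if content is None:
        raise ValueError("First choice has no message content")
    return content


# Abridged copy of the response body shown above.
sample_body = {
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Hello world!"},
        }
    ],
    "usage": None,
    "object": "chat.completion",
}
completion = extract_completion(sample_body)
print(f"Completion: {completion!r}")
```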


### Streaming

When streaming is enabled, the chat completion returns a sequence of messages, each containing a chunk of the generated response.


In [5]:
response = requests.post(
    f"{dial_url}/openai/deployments/echo/chat/completions?api-version=2023-03-15-preview",
    headers={"Api-Key": "dial_api_key"},
    json={"messages": [{"role": "user", "content": "Hello world!"}], "stream": True},
    # Ask requests not to buffer the whole body, so lines are read as they arrive.
    stream=True,
)
for chunk in response.iter_lines():
    print(chunk)

b'data: {"choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}],"usage":null,"id":"fd9ce1fe-aeb9-44eb-b8f0-e50ee14eb919","created":1706025457,"object":"chat.completion.chunk"}'
b''
b'data: {"choices":[{"index":0,"finish_reason":null,"delta":{"content":"Hello world!"}}],"usage":null,"id":"fd9ce1fe-aeb9-44eb-b8f0-e50ee14eb919","created":1706025457,"object":"chat.completion.chunk"}'
b''
b'data: {"choices":[{"index":0,"finish_reason":"stop","delta":{}}],"usage":null,"id":"fd9ce1fe-aeb9-44eb-b8f0-e50ee14eb919","created":1706025457,"object":"chat.completion.chunk"}'
b''
b'data: [DONE]'
b''
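
Each non-empty line in the output above is a server-sent event of the form `data: <JSON>`, and the stream ends with `data: [DONE]`. The following sketch shows one way to reassemble the full completion from such lines. The function name `collect_stream_content` is hypothetical, the sample lines are abridged from the output above, and the code assumes a single choice per chunk.

```python
import json
from typing import Iterable


def collect_stream_content(lines: Iterable[bytes]) -> str:
    """Concatenate the content deltas of a streamed chat completion (hypothetical helper)."""
    parts = []
    for raw in lines:
        line = raw.decode("utf-8").strip()
        # Skip the blank separator lines between events.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # The terminal event carries no JSON payload.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Abridged copy of the streamed lines shown above.
sample_lines = [
    b'data: {"choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}]}',
    b"",
    b'data: {"choices":[{"index":0,"finish_reason":null,"delta":{"content":"Hello world!"}}]}',
    b"",
    b'data: {"choices":[{"index":0,"finish_reason":"stop","delta":{}}]}',
    b"",
    b"data: [DONE]",
    b"",
]
completion = collect_stream_content(sample_lines)
print(f"Completion: {completion!r}")
```

With a live server, the same function could be applied to `response.iter_lines()` from the streaming request.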


## OpenAI Python SDK

The DIAL deployment can also be called using the [OpenAI Python SDK](https://pypi.org/project/openai/).


In [6]:
import openai

openai_client = openai.AzureOpenAI(
    azure_endpoint=dial_url,
    azure_deployment="echo",
    api_key="dial_api_key",
    api_version="2023-03-15-preview",
)

chat_completion = openai_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Hello world!",
        }
    ],
    model="echo",
)
print(chat_completion)
completion = chat_completion.choices[0].message.content
print(f"Completion: {completion!r}")
assert completion == "Hello world!", "Unexpected completion"

ChatCompletion(id='503668de-f941-4f70-bd32-9fea93db1a42', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Hello world!', role='assistant', function_call=None, tool_calls=None))], created=1706025457, model=None, object='chat.completion', system_fingerprint=None, usage=None)
Completion: 'Hello world!'


## LangChain

The deployment can also be called via the [LangChain](https://pypi.org/project/langchain-openai/) library.


In [7]:
from langchain_core.messages import HumanMessage
from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    azure_endpoint=dial_url,
    azure_deployment="echo",
    api_key="dial_api_key",
    api_version="2023-03-15-preview",
)

output = llm.generate(messages=[[HumanMessage(content="Hello world!")]])
print(output)
completion = output.generations[0][0].text
print(f"Completion: {completion!r}")
assert completion == "Hello world!", "Unexpected completion"

generations=[[ChatGeneration(text='Hello world!', generation_info={'finish_reason': 'stop', 'logprobs': None}, message=AIMessage(content='Hello world!'))]] llm_output={'token_usage': {}, 'model_name': 'gpt-3.5-turbo'} run=[RunInfo(run_id=UUID('0499fc9d-dcff-4355-a51e-f9eb72265789'))]
Completion: 'Hello world!'
