# Elastic's Recipe: Smarter Orders with Phi-3 small models

In this notebook we will learn how to deploy [Phi-3](https://azure.microsoft.com/en-us/products/phi-3) models on [Azure AI Studio](https://ai.azure.com) and use them with the Elastic Open Inference Service to build a RAG application. This notebook accompanies the article [Elastic's Recipe: Smarter Orders with Phi-3 small models](https://www.elastic.co/search-labs/blog/utilizing-phi3-models).


## Install packages and import necessary modules


In [None]:
# install packages
!python3 -m pip install elasticsearch==8.14

from elasticsearch import Elasticsearch, exceptions
from elasticsearch.helpers import bulk
from getpass import getpass
import json

Collecting elasticsearch==8.14
  Downloading elasticsearch-8.14.0-py3-none-any.whl (480 kB)
Collecting elastic-transport<9,>=8.13 (from elasticsearch==8.14)
  Downloading elastic_transport-8.13.1-py3-none-any.whl (64 kB)
Installing collected packages: elastic-transport, elasticsearch
Successfully installed elastic-transport-8.13.1 elasticsearch-8.14.0


## Declaring variables

The cell below prompts for your credentials. `getpass` hides each value as you type it, so secrets do not end up in the notebook output.

Here you can learn how to retrieve your Elasticsearch credentials: [Finding Your Cloud ID](https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id).


In [None]:
ELASTIC_CLUSTER_ID = getpass("Elastic Cloud ID: ")
ELASTIC_API_KEY = getpass("Elastic Api Key: ")

AZURE_API_KEY = getpass("Azure API Key: ")
AZURE_TARGET_URL = getpass("Azure target URL: ")

Elastic Cloud ID: ··········
Elastic Api Key: ··········
Azure API Key: ··········
Azure target URL: ··········
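
If you run the notebook unattended, typing credentials at a prompt may not be practical. The following is a minimal sketch, not part of the original notebook, that reads a value from an environment variable and falls back to `getpass` only when the variable is missing. The helper name `read_secret` and the environment variable name in the usage comment are hypothetical.

```python
import os
from getpass import getpass


def read_secret(env_name, prompt, environ=None, ask=getpass):
    """Return the value of env_name if it is set and non-empty, else prompt for it."""
    environ = os.environ if environ is None else environ
    value = environ.get(env_name, "").strip()
    if value:
        return value
    # Fall back to an interactive, hidden prompt
    return ask(prompt).strip()


# Hypothetical usage; the variable name is an assumption, pick your own:
# ELASTIC_CLUSTER_ID = read_secret("ELASTIC_CLOUD_ID", "Elastic Cloud ID: ")
```

The `environ` and `ask` parameters exist only so the helper can be exercised without real secrets or a terminal.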


## Instantiate the Elasticsearch client


In [None]:
es_client = Elasticsearch(
    cloud_id=ELASTIC_CLUSTER_ID,
    api_key=ELASTIC_API_KEY,
)

## Creating embeddings endpoint


In [None]:
try:
    es_client.options(
        request_timeout=60, max_retries=3, retry_on_timeout=True
    ).inference.put_model(
        task_type="sparse_embedding",
        inference_id="elser-embeddings",
        body={
            "service": "elser",
            "service_settings": {
                "num_allocations": 1,
                "num_threads": 1,
            },
        },
    )

    print("Embedding endpoint created successfully.")
except exceptions.BadRequestError as e:
    if e.error == "resource_already_exists_exception":
        print("Embedding endpoint already created.")
    else:
        raise e

Embedding endpoint created successfully.


## Creating completion endpoint


In [None]:
try:
    es_client.options(
        request_timeout=60, max_retries=3, retry_on_timeout=True
    ).inference.put_model(
        task_type="completion",
        inference_id="phi3-completion",
        body={
            "service": "azureaistudio",
            "service_settings": {
                "api_key": AZURE_API_KEY,
                "target": AZURE_TARGET_URL,
                "provider": "microsoft_phi",
                "endpoint_type": "token",
            },
        },
    )
    print("Completion endpoint created successfully")
except exceptions.BadRequestError as e:
    if e.error == "resource_already_exists_exception":
        print("Completion endpoint already created.")
    else:
        raise e


## Creating index


In [None]:
try:
    es_client.indices.create(
        index="lasticco-menu",
        body={
            "mappings": {
                "properties": {
                    "code": {"type": "keyword"},
                    "title": {"type": "text"},
                    "description": {
                        "type": "semantic_text",
                        "inference_id": "elser-embeddings",
                    },
                    "price": {"type": "double"},
                    "customizations": {"type": "object"},
                }
            }
        },
    )
except exceptions.BadRequestError as e:
    if e.error == "resource_already_exists_exception":
        print("Index already exists.")
    else:
        raise e
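
The three cells above repeat the same pattern: try to create a resource, and treat `resource_already_exists_exception` as success. As a sketch of how that pattern could be factored out, the hypothetical helper below takes any zero-argument callable and mirrors the `e.error` check used in those cells. It is an illustration under that assumption, not code from the original notebook.

```python
def create_if_missing(create, label, already_exists="resource_already_exists_exception"):
    """Run create(); treat an 'already exists' error as a non-failure.

    Returns True if the resource was created and False if it already existed.
    Any other exception is re-raised unchanged.
    """
    try:
        create()
    except Exception as e:
        # Same check as the notebook cells: compare the error type exposed as `.error`
        if getattr(e, "error", None) == already_exists:
            print(f"{label} already exists.")
            return False
        raise
    print(f"{label} created successfully.")
    return True


# Hypothetical usage with the client from this notebook:
# create_if_missing(
#     lambda: es_client.indices.create(index="lasticco-menu", body=index_body),
#     "Index",
# )
```

Here `index_body` stands for the mappings dictionary shown in the index creation cell.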

## Indexing data


In [None]:
menu_dishes = [
    {
        "code": "carbonara",
        "title": "Pasta Carbonara",
        "description": "Pasta Carbonara \n Perfectly al dente spaghetti enrobed in a velvety sauce of farm-fresh eggs, aged Pecorino Romano, and smoky guanciale. Finished with a kiss of cracked black pepper for a classic Roman indulgence.",
        "price": 14.99,
        "customizations": {
            "vegetarian": [True, False],
            "cream": [True, False],
            "extras": ["cheese", "garlic", "ham"],
        },
    },
    {
        "code": "alfredo",
        "title": "Chicken Alfredo",
        "description": "Chicken Alfredo \n Golden pan-fried seasoned chicken breasts and tender fettuccine, coated in a velvety garlic and Parmesan cream sauce.",
        "price": 18.99,
        "customizations": {
            "vegetarian": [True, False],
            "cream": [True, False],
            "extras": ["cheese", "onions", "olives"],
        },
    },
    {
        "code": "gnocchi",
        "title": "Four Cheese Gnocchi",
        "description": "Four Cheese Gnocchi \n soft pillowy potato gnocchi coated in a silken cheesy sauce made of four different cheeses: Gouda, Parmigiano, Brie, and the star, Gorgonzola. The combination of four different types of cheese will make your tastebuds dance for joy.",
        "price": 15.99,
        "customizations": {
            "vegetarian": [True, False],
            "cream": [True, False],
            "extras": ["cheese", "bacon", "mushrooms"],
        },
    },
]

In [None]:
from elasticsearch.helpers import BulkIndexError


# Build a bulk action for the given document ID and body
def build_bulk_obj(doc_id, body):
    return {"_index": "lasticco-menu", "_id": doc_id, "_source": body}


# Construct one bulk action per dish, with IDs starting at 1
data = [build_bulk_obj(i + 1, dish) for i, dish in enumerate(menu_dishes)]

try:
    # Use the bulk API to index the data
    bulk(es_client, data)
    print("Data indexed successfully.")
except (BulkIndexError, exceptions.ApiError) as e:
    # bulk() raises BulkIndexError when one or more documents fail to index
    print("Error indexing data.")
    print(e)

Data indexed successfully.
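
The cell above numbers documents by their position in the list, so reordering `menu_dishes` and re-running it would overwrite documents under different IDs. One possible alternative, sketched here as an assumption rather than the notebook's approach, is to use each dish's `code` as the document ID and to yield the actions lazily. The function name `dish_actions` is hypothetical.

```python
def dish_actions(dishes, index="lasticco-menu"):
    """Yield one bulk action per dish, using the dish code as the document ID."""
    for dish in dishes:
        yield {"_index": index, "_id": dish["code"], "_source": dish}


# Abbreviated sample dish, for illustration only
sample = [{"code": "carbonara", "title": "Pasta Carbonara", "price": 14.99}]
actions = list(dish_actions(sample))
print(actions[0]["_id"])

# With the notebook's client, the generator can be passed straight to the helper:
# bulk(es_client, dish_actions(menu_dishes))
```

Because the ID is derived from the dish itself, re-running the indexing step updates the same documents instead of creating or shuffling them.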


### Retrieving relevant dishes

We use a semantic query to retrieve the most relevant dishes based on the customer request.

In [None]:
try:
    response = es_client.search(
        index="lasticco-menu",
        body={
            "query": {
                "semantic": {
                    "field": "description",
                    "query": "may I have a carbonara with cream and bacon?",
                }
            },
        },
    )
    dishes = []

    for r in response.body["hits"]["hits"]:
        dishes.append(r["_source"])

    print(f"Response: {json.dumps(dishes, indent=2)}")
except Exception as e:
    print(e)

Response: [
  {
    "code": "carbonara",
    "price": 14.99,
    "description": {
      "text": "Pasta Carbonara \n Perfectly al dente spaghetti enrobed in a velvety sauce of farm-fresh eggs, aged Pecorino Romano, and smoky guanciale. Finished with a kiss of cracked black pepper for a classic Roman indulgence.",
      "inference": {
        "inference_id": "elser-embeddings",
        "model_settings": {
          "task_type": "sparse_embedding"
        },
        "chunks": [
          {
            "text": "Pasta Carbonara \n Perfectly al dente spaghetti enrobed in a velvety sauce of farm-fresh eggs, aged Pecorino Romano, and smoky guanciale. Finished with a kiss of cracked black pepper for a classic Roman indulgence.",
            "embeddings": {
              "carbon": 2.0847998,
              "pasta": 2.0838325,
              "spaghetti": 1.9527067,
              "##ara": 1.7632319,
              "romano": 1.6877614,
              "al": 1.6518246,
              "dent": 1.5832088,
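
As the output above shows, in this version the `semantic_text` field comes back as an object holding the original `text` next to the ELSER tokens, which makes the raw response very verbose. The sketch below shows one way to reduce a hit's `_source` to plain text on the client side. It assumes the response shape printed above (other Elasticsearch versions may return the field differently), and the helper name `flatten_description` is hypothetical.

```python
def flatten_description(source):
    """Return a copy of a hit's _source with `description` reduced to its plain text."""
    flat = dict(source)
    description = flat.get("description")
    if isinstance(description, dict):
        # Keep only the original text and drop the inference payload
        flat["description"] = description.get("text", "")
    return flat


# Abbreviated sample mirroring the structure of the response above
hit_source = {
    "code": "carbonara",
    "price": 14.99,
    "description": {
        "text": "Pasta Carbonara",
        "inference": {"inference_id": "elser-embeddings", "chunks": []},
    },
}
print(flatten_description(hit_source))
```

The same effect can be had on the server side by excluding the inference fields from `_source` in the search request.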
  

### Putting everything together

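This script repeatedly asks the user what they would like to order, retrieves the most relevant dishes with a semantic query, and sends the menu, the current order, and the request to the Phi-3 completion endpoint, which returns the updated order as JSON. The loop has no exit condition, so interrupt the cell to stop it.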

In [None]:
current_order = {"order": []}

while True:
    query = input("What would you like to order? ")

    try:
        response = es_client.search(
            index="lasticco-menu",
            body={
                "size": 3,
                "_source": {"excludes": ["*embeddings", "*chunks", "*inference"]},
                "query": {
                    "semantic": {
                        "field": "description",
                        "query": query,
                    }
                },
            },
        )

        dishes = []

        for r in response.body["hits"]["hits"]:
            dishes.append(r["_source"])

        # Build prompt
        example_order = {
            "order": [
                {
                    "code": "carbonara",
                    "qty": 1,
                    "customizations": [{"vegetarian": True}],
                },
                {
                    "code": "alfredo",
                    "qty": 2,
                    "customizations": [{"extras": ["cheese"]}],
                },
                {
                    "code": "gnocchi",
                    "qty": 1,
                    "customizations": [{"extras": ["mushrooms"]}],
                },
            ],
        }

        input_content = f"""
            Your task is to manage an order based on the AVAILABLE DISHES in the MENU and the USER REQUEST. Follow these strict rules:

              1. ONLY add dishes to the order that are explicitly listed in the MENU.
              2. If the requested dish is not in the MENU, do not add anything to the order.
              3. The response must always be a valid JSON object containing an "order" array, even if it's empty.
              4. Do not invent or hallucinate any dishes that are not in the MENU.
              5. Respond only with the updated order object, nothing else.

            Example of an order object:
            {json.dumps(example_order, indent=2)}

            MENU:
            {json.dumps(dishes, indent=2)}

            CURRENT ORDER:
            {json.dumps(current_order, indent=2)}

            USER REQUEST: {query}


            Remember:

            If the requested dish is not in the MENU, return the current order unchanged.
            Customizations should be added as an object with the same structure as in the MENU.
            For boolean customizations, use true/false values.
            For array customizations, use an array with the selected items.
        """

        response = es_client.options(
            request_timeout=60, max_retries=3, retry_on_timeout=True
        ).inference.inference(
            task_type="completion",
            inference_id="phi3-completion",
            input=input_content,
        )

        completion_result = response["completion"][0]["result"]
        print(f"Result: \n {completion_result}\n")

        current_order = json.loads(completion_result)
        print(
            f"The current order status is: \n {json.dumps(current_order, indent=2)}\n"
        )

    except Exception as e:
        print(e)

What would you like to order? pasta with cheese
Result: 
 {
  "order": [
    {
      "code": "carbonara",
      "qty": 1,
      "customizations": [
        {
          "extras": [
            "cheese"
          ]
        }
      ]
    }
  ]
}

The current order status is: 
 {
  "order": [
    {
      "code": "carbonara",
      "qty": 1,
      "customizations": [
        {
          "extras": [
            "cheese"
          ]
        }
      ]
    }
  ]
}

What would you like to order? alfredo
Result: 
 {
  "order": [
    {
      "code": "carbonara",
      "qty": 1,
      "customizations": [
        {
          "extras": [
            "cheese"
          ]
        }
      ]
    },
    {
      "code": "alfredo",
      "qty": 1,
      "customizations": []
    }
  ]
}

The current order status is: 
 {
  "order": [
    {
      "code": "carbonara",
      "qty": 1,
      "customizations": [
        {
          "extras": [
            "cheese"
          ]
        }
      ]
    },
    {
      "

KeyboardInterrupt: Interrupted by user
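
The loop above passes the model's reply straight to `json.loads`, so a reply wrapped in extra prose, or one that contains a dish that is not on the menu, would either raise an error or corrupt the order. The following is a minimal sketch of a more defensive parser, not part of the original notebook. The function name `parse_order` and its parameters are hypothetical, and it assumes `valid_codes` is built from the `code` values in `menu_dishes`.

```python
import json


def parse_order(raw, valid_codes, fallback):
    """Parse a model reply into an order dict, or return fallback if it is unusable."""
    # Keep only the outermost {...} span, in case the model adds fences or prose
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return fallback
    try:
        order = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return fallback
    items = order.get("order") if isinstance(order, dict) else None
    if not isinstance(items, list):
        return fallback
    # Drop any item whose code is not on the menu
    kept = [i for i in items if isinstance(i, dict) and i.get("code") in valid_codes]
    return {"order": kept}


codes = {"carbonara", "alfredo", "gnocchi"}
reply = 'Here is the order: {"order": [{"code": "carbonara", "qty": 1}, {"code": "pizza", "qty": 2}]}'
print(parse_order(reply, codes, {"order": []}))
```

In the loop, this could replace the `json.loads` call, with the previous `current_order` passed as `fallback` so a bad reply leaves the order unchanged.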

## Cleanup

Finally, we can delete the index and the two inference endpoints created in this notebook so they stop consuming cluster resources.


In [None]:
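# Passing `ignore` to individual API calls is deprecated in the 8.x client;
# set the ignored status codes through `.options()` instead.

# Cleanup - Delete index
es_client.options(ignore_status=[400, 404]).indices.delete(index="lasticco-menu")

# Cleanup - Delete completion endpoint
es_client.options(ignore_status=[400, 404]).inference.delete_model(
    inference_id="phi3-completion"
)

# Cleanup - Delete embeddings endpoint
es_client.options(ignore_status=[400, 404]).inference.delete_model(
    inference_id="elser-embeddings"
)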



ObjectApiResponse({'acknowledged': True, 'pipelines': [], 'indexes': []})