<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/customization/llms/AzureOpenAI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Azure OpenAI

Azure OpenAI resources differ from standard OpenAI resources: you cannot generate embeddings unless you deploy an embedding model. The regions where these models are available are listed here: https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models#embeddings-models

Furthermore, the regions that support embedding models do not support the latest versions (<*>-003) of OpenAI models, so we have to use one region for embeddings and another for text generation.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [1]:
%pip install llama-index-embeddings-azure-openai
%pip install llama-index-llms-azure-openai

In [2]:
!pip install llama-index

In [3]:
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
import logging
import sys

logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
)  # logging.DEBUG for more verbose output
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

Here, we set up the embedding model (for retrieval) and the LLM (for text generation).
Note that you need not only model names (e.g. "text-embedding-ada-002"), but also model deployment names (the ones you chose when deploying the models in Azure).
You must pass the deployment name as a parameter when you initialize `AzureOpenAI` and `AzureOpenAIEmbedding`.

In [10]:
azure_endpoint = "https://oai-muse-stg-uks.openai.azure.com"
model_name = "gpt-4o-stg"  # not a valid OpenAI model name; do not pass it as `model`
api_version = "2024-03-01-preview"
api_key = "<your-api-key>"  # placeholder; never commit a real key

ADA_EMBEDDING_BASE_URL = "https://o1-muse-test.openai.azure.com"
ADA_EMBEDDING_MODEL_NAME = "text-embedding-ada-002"
ADA_EMBEDDING_API_KEY = "<your-embedding-api-key>"  # placeholder; never commit a real key
ADA_EMBEDDING_API_VERSION = "2023-05-15"

llm = AzureOpenAI(
    model="gpt-4o",  # the underlying OpenAI model name
    deployment_name="gpt-4o",  # the name you chose when deploying the model in Azure
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

# You need to deploy your own embedding model as well as your own chat completion model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name=ADA_EMBEDDING_MODEL_NAME,
    api_key=ADA_EMBEDDING_API_KEY,
    azure_endpoint=ADA_EMBEDDING_BASE_URL,
    api_version=ADA_EMBEDDING_API_VERSION,
)
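
Rather than writing endpoints and keys directly into a cell, the settings can be read from environment variables. The following is a minimal sketch of that idea, not part of LlamaIndex: the helper `load_azure_config` and the variable names `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_API_VERSION`, and `AZURE_OPENAI_DEPLOYMENT` are hypothetical and chosen only for illustration. The returned keys mirror the keyword arguments used in the cell above.

```python
import os
from typing import Mapping

# Hypothetical variable names; use whatever naming your environment follows.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_DEPLOYMENT",
)


def load_azure_config(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the Azure settings from `env`, failing early if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    # Keys match the keyword arguments passed to AzureOpenAI in this notebook.
    return {
        "azure_endpoint": env["AZURE_OPENAI_ENDPOINT"],
        "api_key": env["AZURE_OPENAI_API_KEY"],
        "api_version": env["AZURE_OPENAI_API_VERSION"],
        "deployment_name": env["AZURE_OPENAI_DEPLOYMENT"],
    }
```

With the variables set, the result could be unpacked into the constructor, for example `AzureOpenAI(model="gpt-4o", **load_azure_config())`. A second set of variables would be needed for the embedding resource, since it lives at a different endpoint.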

In [5]:
from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = embed_model

In [8]:
documents = SimpleDirectoryReader(
    input_files=["sample_data/README.md"]
).load_data()
index = VectorStoreIndex.from_documents(documents)

In [11]:
query = "What is most interesting about this essay?"
query_engine = index.as_query_engine()
answer = query_engine.query(query)

print(answer.get_formatted_sources())
print("query was:", query)
print("answer was:", answer)

Passing a name that is not a valid OpenAI model name as `model`, such as "gpt-4o-stg", raises the following error:

ValueError: Unknown model 'gpt-4o-stg'. Please provide a valid OpenAI model name in: o1, o1-2024-12-17, o1-preview, o1-preview-2024-09-12, o1-mini, o1-mini-2024-09-12, gpt-4, gpt-4-32k, gpt-4-1106-preview, gpt-4-0125-preview, gpt-4-turbo-preview, gpt-4-vision-preview, gpt-4-1106-vision-preview, gpt-4-turbo-2024-04-09, gpt-4-turbo, gpt-4o, gpt-4o-2024-05-13, gpt-4o-2024-08-06, gpt-4o-2024-11-20, chatgpt-4o-latest, gpt-4o-mini, gpt-4o-mini-2024-07-18, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0125, gpt-3.5-turbo-1106, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, gpt-3.5-turbo-instruct, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo, gpt-35-turbo-0125, gpt-35-turbo-1106, gpt-35-turbo-0613, gpt-35-turbo-16k-0613
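
The error above only surfaces at query time, after the index has already been built. One way to catch it earlier is to check the `model` value before constructing the LLM. The sketch below is illustrative only: `check_model_name` and `KNOWN_MODELS` are hypothetical names, not part of LlamaIndex, and the set holds just a handful of the names listed in the error message rather than the full list.

```python
# A small subset of the model names listed in the error message above.
KNOWN_MODELS = {
    "gpt-4",
    "gpt-4-turbo",
    "gpt-4o",
    "gpt-4o-mini",
    "gpt-35-turbo",
    "gpt-35-turbo-16k",
}


def check_model_name(model: str, known: set = KNOWN_MODELS) -> str:
    """Return `model` unchanged if it is a known model name, else raise ValueError."""
    if model not in known:
        raise ValueError(
            f"{model!r} is not a known model name; "
            "pass the deployment name via `deployment_name` instead."
        )
    return model
```

Here `check_model_name("gpt-4o")` returns the name unchanged, while `check_model_name("gpt-4o-stg")` raises a `ValueError` before any request is made.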