## Step 2: Vectorize the input data - Azure version

In this step, we will create a JSON file containing mathematical vectors (embeddings) for all the text in the files we downloaded and cleaned in Step 1. This type of vectorization is powered by large language models and provides the basis for next-generation search capabilities. We will call the OpenAI Ada-002 model to produce the vectors, and a separate query model deployment to answer questions against the resulting index.
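
To make the idea of "mathematical vectors" a little more concrete, here is a minimal sketch of how two pieces of text can be compared once they have been turned into vectors. The three-dimensional vectors and their values are made up for illustration (real Ada-002 vectors have 1536 dimensions), and cosine similarity is one common way to compare them, not necessarily the exact scoring used later in this notebook.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings"; the numbers are invented for this example.
toy_vectors = {
    "install the platform": [0.9, 0.1, 0.0],
    "platform installation guide": [0.8, 0.2, 0.1],
    "release party photos": [0.0, 0.2, 0.9],
}

# Compare every text against the first one and list the closest matches first.
query = toy_vectors["install the platform"]
scores = {text: cosine_similarity(query, vec) for text, vec in toy_vectors.items()}
for text, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{score:.3f}  {text}")
```

Texts with similar meaning end up with vectors that point in similar directions, so they score close to 1, while unrelated texts score close to 0.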

- Execute the following cell to install required packages

In [None]:
!pip install openai
!pip install langchain
!pip install llama_index
message = "The required packages have been installed"
print(message)

- Execute the following cell to import required dependencies

In [None]:
import os
import json
import openai
from langchain.llms import AzureOpenAI
from langchain.chat_models import AzureChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from llama_index import LangchainEmbedding
from llama_index import (
    GPTSimpleVectorIndex,
    SimpleDirectoryReader, 
    LLMPredictor,
    PromptHelper,
    ServiceContext
)
message = "The dependencies have been imported"
print(message)

- Enter your API Key and Azure OpenAI Configuration Details Below

In [None]:
# azure usage copied from examples: https://github.com/jerryjliu/llama_index/blob/main/examples/azure_demo/AzureOpenAI.ipynb
# and https://github.com/hwchase17/langchain/issues/2377
# and https://python.langchain.com/en/latest/modules/models/chat/integrations/azure_chat_openai.html
os.environ["OPENAI_API_BASE"] = "https://my_endpoint.openai.azure.com/"
os.environ["OPENAI_API_VERSION"] = "2023-03-15-preview" # enter your preferred API version if needed
os.environ["OPENAI_API_KEY"] = "REPLACE_WITH_YOUR_AZURE_OPENAI_KEY" 
# Note: the deployment names are the names you gave your Azure deployments, not the names of the underlying models. Replace the values on the following 2 lines with your own deployment names.
os.environ["AZURE_QUERY_MODEL_DEPLOYMENT_NAME"] = "REPLACE_WITH_YOUR_AZURE_DEPLOYMENT_NAME_FOR_QUERY_MODEL"
os.environ["AZURE_EMBEDDINGS_MODEL_DEPLOYMENT_NAME"] = "REPLACE_WITH_YOUR_AZURE_DEPLOYMENT_NAME_FOR_EMBEDDINGS_MODEL"
openai.api_base = os.getenv("OPENAI_API_BASE")
openai.api_version = os.getenv("OPENAI_API_VERSION")
openai.api_type = "azure"
openai.api_key = os.getenv("OPENAI_API_KEY")
openai.azure_query_deployment = os.getenv("AZURE_QUERY_MODEL_DEPLOYMENT_NAME")
openai.azure_embeddings_deployment = os.getenv("AZURE_EMBEDDINGS_MODEL_DEPLOYMENT_NAME")
message = "The api settings have been loaded"
print(message)
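
Before moving on, it can help to confirm that none of the placeholder values are still in place. The sketch below is one way to do that; the `find_unset_settings` helper and the example deployment name are hypothetical and not part of the original notebook. In the notebook you would pass the relevant `os.environ` entries instead of the example dictionary.

```python
def find_unset_settings(settings):
    """Return the names of settings that are empty or still hold a placeholder."""
    return sorted(
        name for name, value in settings.items()
        if not value or value.startswith("REPLACE_WITH")
    )

# Example values only, chosen to show one placeholder and one empty setting.
example_settings = {
    "OPENAI_API_KEY": "REPLACE_WITH_YOUR_AZURE_OPENAI_KEY",
    "OPENAI_API_VERSION": "2023-03-15-preview",
    "AZURE_QUERY_MODEL_DEPLOYMENT_NAME": "my-chat-deployment",  # hypothetical name
    "AZURE_EMBEDDINGS_MODEL_DEPLOYMENT_NAME": "",
}

missing = find_unset_settings(example_settings)
print("Settings still to fill in:", missing)
```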

- The code block below defines the model parameters for: 
  - "llm" which is used for queries 
  - "embedding_llm" which is used for creating the vector db and indices
- Execute the below code block

In [None]:
llm = AzureChatOpenAI(
    openai_api_base=openai.api_base,
    openai_api_version=openai.api_version,
    deployment_name=openai.azure_query_deployment,
    openai_api_key=openai.api_key,
    openai_api_type=openai.api_type,
)

# llm = AzureOpenAI(deployment_name=openai.azure_query_deployment, model_kwargs={
#     "api_key": openai.api_key,
#     "api_base": openai.api_base,
#     "api_type": openai.api_type,
#     "api_version": openai.api_version,
# })
llm_predictor = LLMPredictor(llm=llm)

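# Both documents and queries must be embedded with the embeddings deployment
# (not the chat deployment) so that their vectors are comparable.
embedding_llm = LangchainEmbedding(OpenAIEmbeddings(
    document_model_name=openai.azure_embeddings_deployment,
    query_model_name=openai.azure_embeddings_deployment
))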

documents = SimpleDirectoryReader('html_downloads').load_data()
message = "The model settings have been defined, and the documents are loaded for indexing"
print(message)

- This block is currently unused and is left here for reference. To customize the prompt helper, uncomment this section and pass `prompt_helper` to the service_context in the next code block.
- Otherwise, you can skip to the next block.

In [None]:
# # max LLM token input size
# max_input_size = 500
# # set number of output tokens
# num_output = 48
# # set maximum chunk overlap
# max_chunk_overlap = 20

# prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
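
If the "chunk overlap" setting above is unfamiliar, the following sketch shows the general idea: text is split into fixed-size chunks, and each chunk repeats a few tokens from the end of the previous one so that context is not lost at the boundaries. The `chunk_with_overlap` function is a hypothetical illustration that splits on whole words; it is not how llama_index implements its splitting.

```python
def chunk_with_overlap(tokens, chunk_size, overlap):
    """Split a token list into chunks that share `overlap` tokens with their neighbor."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a chunk has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks

# Words stand in for tokens here to keep the example readable.
words = "the quick brown fox jumps over the lazy dog again".split()
chunks = chunk_with_overlap(words, chunk_size=4, overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

Each printed chunk starts with the last word of the chunk before it, which is what an overlap of 1 means in this sketch.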

- The following code block calls the OpenAI Ada-002 model to generate embeddings, then creates the vector db and indices and populates them with those embeddings.
- It may take several seconds before you see any return text.
- You may see errors indicating you have exceeded the Azure OpenAI rate limit. You can most likely ignore these errors. The rate limit is more than enough for most jobs, but this code may occasionally send requests a little too fast. It should retry until it completes successfully.

In [None]:
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embedding_llm
)
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
index.save_to_disk('testindex.azure.json')
message = "The index has been saved"
print(message)
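
The retry behavior mentioned above for rate limit errors usually follows a pattern called exponential backoff: wait a little, try again, and double the wait after each failure. The sketch below shows that pattern in isolation. The `RateLimitError` class, the `call_with_backoff` helper, and the flaky function are all hypothetical stand-ins, not the actual retry code used by the libraries in this notebook.

```python
import time

class RateLimitError(Exception):
    """Stand-in for the rate limit error raised by an API client."""

def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func, retrying on RateLimitError with a delay that doubles each time."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)

# Hypothetical call that hits the rate limit twice before succeeding.
attempts = {"count": 0}

def flaky_embedding_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("rate limit exceeded")
    return [0.1, 0.2, 0.3]

# Record the delays instead of really sleeping so the example runs instantly.
delays = []
result = call_with_backoff(flaky_embedding_call, sleep=delays.append)
print("result:", result)
print("delays that would have been waited:", delays)
```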

### Now, query ChatGPT with context from your data source!

In [None]:
response = index.query("how can I install tanzu application platform?")
print(response)
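
Conceptually, a query like the one above does two things: it finds the stored text chunks whose vectors are closest to the question's vector, and it places those chunks into the prompt sent to the query model. The sketch below walks through that flow with made-up two-dimensional vectors and a made-up prompt template. The `top_k_chunks` and `build_prompt` helpers are hypothetical, and the real prompt wording and scoring used by llama_index will differ.

```python
def top_k_chunks(query_vector, indexed_chunks, k=2):
    """Return the text of the k chunks whose vectors score highest against the query."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    ranked = sorted(
        indexed_chunks,
        key=lambda item: dot(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:k]]

def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the question into a single prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Context:\n{context}\n\nUsing only the context above, answer: {question}"

# Hypothetical index contents; the texts and vectors are invented for this example.
toy_index = [
    {"text": "Installation steps are described in the install guide.", "vector": [0.9, 0.1]},
    {"text": "Release notes list known issues.", "vector": [0.2, 0.8]},
    {"text": "Prerequisites must be met before installing.", "vector": [0.7, 0.3]},
]

question = "how can I install tanzu application platform?"
question_vector = [1.0, 0.0]  # stand-in for the embedding of the question

selected = top_k_chunks(question_vector, toy_index, k=2)
prompt = build_prompt(question, selected)
print(prompt)
```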