## Azure LangChain

In [1]:
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [6]:
# `!export` runs in a short-lived subshell, so the variables never reach the kernel.
# `%env` sets them in the kernel process itself. Keep real keys out of the notebook.
%env AZURE_API_BASE=https://tresen-test-1.openai.azure.com/
%env AZURE_OPEN_API_KEY=<your-azure-openai-key>

In [7]:
import os
import openai

# openai.api_key = os.environ['API_KEY']
# os.environ["OPENAI_API_KEY"] = os.environ['API_KEY']

# Point the openai client at the Azure OpenAI resource.
openai.api_type = "azure"
openai.api_base = os.environ['AZURE_API_BASE']
openai.api_version = "2023-03-15-preview"
openai.api_key = os.environ['AZURE_OPEN_API_KEY']
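
Indexing `os.environ` directly raises a bare `KeyError` as soon as one variable is missing, and says nothing about any others that are also absent. A minimal sketch of a stricter check follows. `require_env` is a hypothetical helper name (not part of `os` or `openai`), and the demo values are placeholders rather than real credentials.

```python
import os


def require_env(*names, environ=None):
    """Return {name: value} for each variable, or raise listing all missing ones.

    `require_env` is a hypothetical helper for this notebook, not a library API.
    """
    env = os.environ if environ is None else environ
    # Treat unset and empty variables the same way.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}


# Explicit mapping so the sketch runs without real credentials.
demo_env = {
    "AZURE_API_BASE": "https://example.invalid/",
    "AZURE_OPEN_API_KEY": "placeholder",
}
settings = require_env("AZURE_API_BASE", "AZURE_OPEN_API_KEY", environ=demo_env)
print(sorted(settings))
```

Called without `environ`, the helper reads from `os.environ`, so it could sit in front of the `openai.api_base` and `openai.api_key` assignments.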

In [4]:
from llama_index import (
    GPTVectorStoreIndex, 
    GPTSimpleKeywordTableIndex, 
    SimpleDirectoryReader,
    LLMPredictor,
    ServiceContext
)

from langchain.llms.openai import OpenAIChat


In [5]:
wiki_titles = ["Toronto", "Seattle", "Chicago", "Boston", "Houston"]


In [6]:
from pathlib import Path

import requests

# Create the output directory once, before downloading anything.
data_path = Path('data')
data_path.mkdir(exist_ok=True)

for title in wiki_titles:
    # Fetch the plain-text extract of the article from the MediaWiki API.
    response = requests.get(
        'https://en.wikipedia.org/w/api.php',
        params={
            'action': 'query',
            'format': 'json',
            'titles': title,
            'prop': 'extracts',
            # 'exintro': True,
            'explaintext': True,
        }
    ).json()
    page = next(iter(response['query']['pages'].values()))
    wiki_text = page['extract']

    # Wikipedia text contains non-ASCII characters, so write it as UTF-8.
    with open(data_path / f"{title}.txt", 'w', encoding='utf-8') as fp:
        fp.write(wiki_text)
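
The download loop assumes every title resolves to a page with an `extract` field. For a title that does not exist, the MediaWiki API returns a page entry marked `missing` and without `extract`, so `page['extract']` raises `KeyError`. A minimal sketch of a defensive accessor is shown here. `extract_page_text` is a hypothetical helper, and the two payloads are hand-written to mirror the response shape read by the loop.

```python
def extract_page_text(payload):
    """Return the plain-text extract of the first page in a MediaWiki query response.

    Hypothetical helper: raises ValueError if the page is missing or has no extract.
    """
    pages = payload.get("query", {}).get("pages", {})
    if not pages:
        raise ValueError("response contains no pages")
    page = next(iter(pages.values()))
    if "missing" in page or "extract" not in page:
        raise ValueError(f"no extract for title {page.get('title')!r}")
    return page["extract"]


# Hand-written payloads mirroring the shape read by the download loop.
found = {"query": {"pages": {"100": {"title": "Toronto", "extract": "Toronto is a city."}}}}
absent = {"query": {"pages": {"-1": {"title": "Nowhereville", "missing": ""}}}}

print(extract_page_text(found))
try:
    extract_page_text(absent)
except ValueError as err:
    print(err)
```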
        

In [7]:
# Load all wiki documents
city_docs = {}
for wiki_title in wiki_titles:
    city_docs[wiki_title] = SimpleDirectoryReader(input_files=[f"data/{wiki_title}.txt"]).load_data()

In [8]:
# LLM predictor (gpt-3.5-turbo)
llm_predictor_chatgpt = LLMPredictor(llm=OpenAIChat(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor_chatgpt, chunk_size_limit=1024
)

# Alternative: GPT-4
# llm_predictor_gpt4 = LLMPredictor(llm=OpenAIChat(temperature=0, model_name="gpt-4"))
# service_context = ServiceContext.from_defaults(
#     llm_predictor=llm_predictor_gpt4, chunk_size_limit=1024
# )



In [9]:
# Build city document index
vector_indices = {}
for wiki_title in wiki_titles:
    
    # build vector index
    vector_indices[wiki_title] = GPTVectorStoreIndex.from_documents(
        city_docs[wiki_title], service_context=service_context
    )

    # set id for vector index
    vector_indices[wiki_title].set_index_id(wiki_title)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 20744 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 16942 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 26082 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 18648 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 21844 tokens

In [10]:
index_summaries = {
    wiki_title: (
        f"This content contains Wikipedia articles about {wiki_title}. "
        f"Use this index if you need to lookup specific facts about {wiki_title}.\n"
        "Do not use this index if you want to analyze multiple cities."
    )
    for wiki_title in wiki_titles
}

In [11]:
query_engine = vector_indices["Toronto"].as_query_engine()
response = query_engine.query("What are the sports teams in Toronto?")

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 8 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1835 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens

In [12]:
print(str(response))

Toronto is represented in five major league sports, with teams in the National Hockey League (NHL), Major League Baseball (MLB), National Basketball Association (NBA), Canadian Football League (CFL), and Major League Soccer (MLS). The specific teams are the Toronto Maple Leafs, Toronto Blue Jays, Toronto Raptors, Toronto Argonauts, and Toronto FC.


In [13]:
from llama_index.indices.composability import ComposableGraph

graph = ComposableGraph.from_indices(
    GPTSimpleKeywordTableIndex,
    [index for _, index in vector_indices.items()], 
    [summary for _, summary in index_summaries.items()],
    max_keywords_per_chunk=50
)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 0 tokens


In [14]:
# get root index
root_index = graph.get_index(graph.root_id)

# set id of root index
root_index.set_index_id("compare_contrast")

In [15]:
# define decompose_transform
from llama_index.indices.query.query_transform.base import DecomposeQueryTransform

decompose_transform = DecomposeQueryTransform(
    llm_predictor_chatgpt, verbose=True
)


In [16]:
# define custom query engines, keyed by index id
from llama_index.query_engine.transform_query_engine import TransformQueryEngine

custom_query_engines = {}

for index in vector_indices.values():
    query_engine = index.as_query_engine(service_context=service_context)
    query_engine = TransformQueryEngine(
        query_engine,
        query_transform=decompose_transform,
        transform_extra_info={'index_summary': index.index_struct.summary},
    )
    custom_query_engines[index.index_id] = query_engine

custom_query_engines[graph.root_id] = graph.root_index.as_query_engine(
    retriever_mode='simple',
    response_mode='tree_summarize',
    service_context=service_context,
    verbose=True,
)


In [17]:
# define graph
graph_query_engine = graph.as_query_engine(
    custom_query_engines=custom_query_engines
)



In [18]:
query_str = (
    "Compare and contrast the arts and culture of Houston and Boston. "
)
response = graph_query_engine.query(query_str)



INFO:llama_index.indices.keyword_table.retrievers:> Starting query: Compare and contrast the arts and culture of Houston and Boston. 
INFO:llama_index.indices.keyword_table.retrievers:query keywords: ['boston', 'culture', 'compare', 'arts', 'contrast', 'houston']
INFO:llama_index.indices.keyword_table.retrievers:> Extracted keywords: ['boston', 'houston']


> Current query: Compare and contrast the arts and culture of Houston and Boston.
> New query: What are some notable cultural institutions or events in Boston?

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 11 tokens

> Current query: Compare and contrast the arts and culture of Houston and Boston.
> New query: What are some notable cultural institutions or events in Boston?

INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1949 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens

> Current query: Compare and contrast the arts and culture of Houston and Boston.
> New query: What are some notable cultural institutions or events in Houston?

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 11 tokens

> Current query: Compare and contrast the arts and culture of Houston and Boston.
> New query: What are some notable cultural institutions or events in Houston?

INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1856 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens

INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 521 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 521 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens


In [19]:
print(response)


Both Houston and Boston have a rich arts and culture scene. Boston is known for its classical music institutions such as Boston Symphony Hall and the Handel and Haydn Society, as well as its museums like the Museum of Fine Arts and the Isabella Stewart Gardner Museum. It also has a vibrant theater scene with venues like the Colonial Theater and the Cutler Majestic Theatre. Boston also hosts events like the Boston Early Music Festival and the Boston Arts Festival.

Houston, on the other hand, is known for its rodeo and livestock show, as well as its diverse festivals like the Houston Greek Festival and the Art Car Parade. It also has a thriving theater district and museums like the Museum of Fine Arts and the Houston Museum of Natural Science. Houston also has a strong focus on contemporary art with institutions like the Contemporary Arts Museum Houston and the Station Museum of Contemporary Art.

Overall, while both cities have a strong arts and culture scene, Boston's focus is more on
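
The retriever log for this query lists six query keywords but only two extracted ones, `boston` and `houston`. The other four are apparently absent from the root keyword table, so only the two matching city indices are queried. A simplified sketch of that filtering step is given here. The table contents and the `match_keywords` function are illustrative assumptions, not LlamaIndex internals.

```python
def match_keywords(query_keywords, keyword_table):
    """Keep only query keywords present in the table and collect the entries they point to."""
    matched = [kw for kw in query_keywords if kw in keyword_table]
    targets = set()
    for kw in matched:
        targets |= keyword_table[kw]
    return matched, targets


# Toy table: keyword -> ids of child indices whose summary mentions it (assumed contents).
keyword_table = {
    "toronto": {"Toronto"},
    "seattle": {"Seattle"},
    "chicago": {"Chicago"},
    "boston": {"Boston"},
    "houston": {"Houston"},
}
# Keywords copied from the "query keywords" log line.
query_keywords = ["boston", "culture", "compare", "arts", "contrast", "houston"]

matched, targets = match_keywords(query_keywords, keyword_table)
print(matched)          # ['boston', 'houston']
print(sorted(targets))  # ['Boston', 'Houston']
```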

In [20]:
from llama_index.tools.query_engine import QueryEngineTool

query_engine_tools = []

# add vector index tools
for wiki_title in wiki_titles:
    index = vector_indices[wiki_title]
    summary = index_summaries[wiki_title]
    
    query_engine = index.as_query_engine(service_context=service_context)
    vector_tool = QueryEngineTool.from_defaults(query_engine, description=summary)
    query_engine_tools.append(vector_tool)


# add graph tool
graph_description = (
    "This tool contains Wikipedia articles about multiple cities. "
    "Use this tool if you want to compare multiple cities. "
)
graph_tool = QueryEngineTool.from_defaults(graph_query_engine, description=graph_description)
query_engine_tools.append(graph_tool)


In [21]:
from llama_index.query_engine.router_query_engine import RouterQueryEngine
from llama_index.selectors.llm_selectors import LLMSingleSelector

router_query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(service_context=service_context),
    query_engine_tools=query_engine_tools
)


In [22]:
# ask a compare/contrast question 
response = router_query_engine.query(
    "Compare and contrast the arts and culture of Houston and Boston.",
)


INFO:llama_index.query_engine.router_query_engine:Selecting query engine 5: This tool contains Wikipedia articles about multiple cities. Use this tool if you want to compare multiple cities..
INFO:llama_index.indices.keyword_table.retrievers:> Starting query: Compare and contrast the arts and culture of Houston and Boston.
INFO:llama_index.indices.keyword_table.retrievers:query keywords: ['boston', 'culture', 'compare', 'arts', 'contrast', 'houston']
INFO:llama_index.indices.keyword_table.retrievers:> Extracted keywords: ['boston', 'houston']


> Current query: Compare and contrast the arts and culture of Houston and Boston.
> New query: What are some notable cultural institutions or events in Boston?

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 11 tokens

> Current query: Compare and contrast the arts and culture of Houston and Boston.
> New query: What are some notable cultural institutions or events in Boston?

INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1949 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens

> Current query: Compare and contrast the arts and culture of Houston and Boston.
> New query: What are some notable cultural institutions or events in Houston?

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 11 tokens

> Current query: Compare and contrast the arts and culture of Houston and Boston.
> New query: What are some notable cultural institutions or events in Houston?

INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1856 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens

INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 459 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 459 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens


In [23]:
print(response)


Houston and Boston both have a rich arts and culture scene. In Boston, notable institutions and events include Boston Symphony Hall, Boston Ballet, Museum of Fine Arts, and the Boston Marathon. Houston also has a diverse range of cultural institutions and events, such as the Houston Livestock Show and Rodeo, Museum of Fine Arts, and the Houston Theater District. Both cities have museums, theaters, and festivals that celebrate their unique cultures. However, Houston has events such as the Houston Greek Festival and the Art Car Parade that are not found in Boston, while Boston has events such as the Boston Early Music Festival and the Italian summer feasts in the North End that are not found in Houston. Overall, both cities offer a vibrant arts and culture scene that is worth exploring.


In [24]:
response = router_query_engine.query("What are the sports teams in Toronto?")


INFO:llama_index.query_engine.router_query_engine:Selecting query engine 0: This content contains Wikipedia articles about Toronto. Use this index if you need to lookup specific facts about Toronto. Do not use this index if you want to analyze multiple cities..


INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 8 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1835 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens

In [25]:
print(response)


Toronto is represented in five major league sports, with teams in the National Hockey League (NHL), Major League Baseball (MLB), National Basketball Association (NBA), Canadian Football League (CFL), and Major League Soccer (MLS). The specific teams are the Toronto Maple Leafs, Toronto Blue Jays, Toronto Raptors, Toronto Argonauts, and Toronto FC.
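
The two router logs report `Selecting query engine 5` for the comparison question and `Selecting query engine 0` for the Toronto question. Judging by the descriptions logged next to them, those numbers appear to be zero-based positions in `query_engine_tools`: the five city tools are appended in `wiki_titles` order, then the graph tool. A small sketch that rebuilds only that ordering with plain strings follows. `tool_names` is an illustrative stand-in and involves no LlamaIndex objects.

```python
wiki_titles = ["Toronto", "Seattle", "Chicago", "Boston", "Houston"]

# Stand-in labels in the same order the notebook appends its tools:
# one per city, then the graph tool last.
tool_names = [f"vector:{title}" for title in wiki_titles] + ["graph:compare_contrast"]

for position, name in enumerate(tool_names):
    print(position, name)

print(tool_names[0])  # vector:Toronto
print(tool_names[5])  # graph:compare_contrast
```

Because the selection depends on list position, reordering `wiki_titles` or appending the graph tool first would change which number shows up in the log for the same question.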
