In [None]:
%pip install -U git+https://github.com/santiagxf/llama_index.git@santiagxf/azure-ai-inference#subdirectory=llama-index-integrations/llms/llama-index-llms-azureai
%pip install -U git+https://github.com/santiagxf/llama_index.git@santiagxf/azure-ai-inference#subdirectory=llama-index-integrations/embeddings/llama-index-embeddings-azureai

In [4]:
import nest_asyncio

nest_asyncio.apply()

In [5]:
from llama_index.core.selectors import LLMSingleSelector

from llama_index.llms.azureai.inference import AzureAIModelInference as AzureAIModelInferenceLLM
from llama_index.embeddings.azureai.inference import AzureAIModelInference as AzureAIModelInferenceEmbedding

In [48]:
import os
from dotenv import load_dotenv

load_dotenv(".env")

True
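
The cells that follow read six environment variables with `os.environ[...]`, which raises a `KeyError` on the first one that is missing. As a minimal sketch, a check like the one below could report every missing variable at once. The helper name `missing_env_vars` is hypothetical (it is not part of LlamaIndex or python-dotenv); the variable names are the ones used in this notebook.

```python
import os

# Variable names taken from the cells in this notebook.
REQUIRED_VARS = [
    "AZURE_AI_COHERE_CMDR_ENDPOINT_URL",
    "AZURE_AI_COHERE_CMDR_ENDPOINT_KEY",
    "AZURE_AI_PHI3_MINI_ENDPOINT_URL",
    "AZURE_AI_PHI3_MINI_ENDPOINT_KEY",
    "AZURE_AI_COHERE_EMBED_ENDPOINT_URL",
    "AZURE_AI_COHERE_EMBED_ENDPOINT_KEY",
]


def missing_env_vars(names, env=None):
    """Hypothetical helper: return the names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this right after `load_dotenv(".env")` would surface an incomplete `.env` file before any client is constructed.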

In [49]:
# Default LLM, configured from the COHERE_CMDR endpoint variables.
llm = AzureAIModelInferenceLLM(
    endpoint=os.environ["AZURE_AI_COHERE_CMDR_ENDPOINT_URL"],
    credential=os.environ["AZURE_AI_COHERE_CMDR_ENDPOINT_KEY"],
)

In [51]:
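# Second LLM, configured from the PHI3_MINI endpoint variables.
# It is used further down as the router's selector.
phi3 = AzureAIModelInferenceLLM(
    endpoint=os.environ["AZURE_AI_PHI3_MINI_ENDPOINT_URL"],
    credential=os.environ["AZURE_AI_PHI3_MINI_ENDPOINT_KEY"],
)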

In [52]:
from llama_index.core import Settings

# Make the first LLM and an Azure AI embedding model the global defaults.
Settings.llm = llm
Settings.embed_model = AzureAIModelInferenceEmbedding(
    endpoint=os.environ["AZURE_AI_COHERE_EMBED_ENDPOINT_URL"],
    credential=os.environ["AZURE_AI_COHERE_EMBED_ENDPOINT_KEY"],
    client_kwargs={},
)

In [53]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader("data/paul_graham").load_data()

In [54]:
Settings.chunk_size = 1024
nodes = Settings.node_parser.get_nodes_from_documents(documents)

In [55]:
from llama_index.core import StorageContext

# initialize storage context (by default it's in-memory)
storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

In [56]:
from llama_index.core import SummaryIndex
from llama_index.core import VectorStoreIndex

summary_index = SummaryIndex(nodes, storage_context=storage_context)
vector_index = VectorStoreIndex(nodes, storage_context=storage_context)

In [57]:
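# Summary index: synthesize an answer over all nodes with tree summarization.
list_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
# Vector index: the default engine answers from the most similar nodes.
vector_query_engine = vector_index.as_query_engine()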

In [58]:
from llama_index.core.tools import QueryEngineTool

list_tool = QueryEngineTool.from_defaults(
    query_engine=list_query_engine,
    description=(
        "Useful for summarization questions related to Paul Graham essay on"
        " What I Worked On."
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from Paul Graham essay on What"
        " I Worked On."
    ),
)

In [65]:
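from llama_index.core.query_engine import RouterQueryEngine

# phi3 picks exactly one tool per query from the tool descriptions;
# the selected query engine then answers with the default Settings.llm.
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(llm=phi3),
    query_engine_tools=[
        list_tool,
        vector_tool,
    ],
)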
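
The router above delegates each query to a single tool chosen from the tool descriptions. The sketch below illustrates only that control flow. It is not how LlamaIndex implements selection: the LLM-based choice is swapped for a naive word-overlap score, and every name in it (`ToyTool`, `select_single`, `route`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyTool:
    """Hypothetical stand-in for a query engine tool."""
    description: str
    run: Callable[[str], str]


def select_single(query: str, tools: List[ToyTool]) -> int:
    """Return the index of the tool whose description shares the most words
    with the query. A real selector would ask an LLM instead."""
    query_words = set(query.lower().split())
    scores = [
        len(query_words & set(tool.description.lower().split()))
        for tool in tools
    ]
    return scores.index(max(scores))


def route(query: str, tools: List[ToyTool]) -> str:
    """Pick one tool, then let only that tool answer the query."""
    return tools[select_single(query, tools)].run(query)


tools = [
    ToyTool("summarization questions about the essay", lambda q: "summary engine"),
    ToyTool("retrieving specific context from the essay", lambda q: "vector engine"),
]
print(route("give me a summarization of the essay", tools))
```

The point of the sketch is that the tool descriptions are the only signal the selector sees, so wording them distinctly matters more than the engines' internals.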

In [66]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

The document is a collection of essays by Paul Graham, a computer programmer, entrepreneur, and investor, detailing his journey as a writer, programmer, and entrepreneur. Graham reflects on his early experiences with programming and writing, his education, and his evolving interests. He discusses his decision to pursue philosophy and AI in college, his graduate school experience, and his realization that the field of AI was a hoax, which led him to focus on Lisp. Graham also writes about his time at art school and his career at Interleaf, a software company. He founded Viaweb, an e-commerce software company, and later co-founded Y Combinator, a successful startup investment and mentoring program. Graham also worked on open-source projects, such as a new dialect of Lisp called Arc, and discovered the power of publishing essays online. He explores the impact of customs and traditions in various fields and shares his thoughts on independent thinking and rapid change. The essays offer insi

In [67]:
response = query_engine.query("What did Paul Graham do after RICS?")
print(str(response))

After leaving RICS, Paul Graham returned to New York and resumed his previous life, but with the added benefit of financial security. He continued painting and experimented with new techniques, combining traditional painting with photography and printmaking. He also began searching for a new apartment to buy, contemplating which neighborhood to live in. During this time, he had an idea for a new startup, which led him to eventually found a new company called Aspra.
