# Learn Llama_index

Docs: [LlamaIndex with Azure OpenAI](https://docs.llamaindex.ai/en/stable/examples/customization/llms/AzureOpenAI/)

## Start from the basic examples, using Azure OpenAI

In [1]:
print ('Hello')

from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
import logging
import sys

logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
)  # logging.DEBUG for more verbose output
# Note: this second stdout handler makes every log record print twice in the outputs below.
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

print ('World')

Hello
World


In [2]:
import os

api_key = os.environ["api_key"]
azure_endpoint = os.environ["azure_endpoint"]
api_version = os.environ["api_version"]

llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="gpt35",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

# You need to deploy your own embedding model as well as your own chat completion model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="embedding-ada-002",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
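
The cell above reads `api_key`, `azure_endpoint`, and `api_version` straight from `os.environ`, so a missing variable surfaces as a bare `KeyError`. A minimal sketch of a friendlier check, assuming the same three variable names; the helper `read_azure_settings` is hypothetical and not part of LlamaIndex:

```python
import os
from typing import Mapping

# Variable names taken from the cell above.
REQUIRED_VARS = ("api_key", "azure_endpoint", "api_version")


def read_azure_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, or raise one error naming everything missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Example with an explicit mapping instead of the real environment.
example_env = {
    "api_key": "<your-key>",
    "azure_endpoint": "https://<your-resource>.openai.azure.com/",
    "api_version": "2023-07-01-preview",
}
settings = read_azure_settings(example_env)
print(sorted(settings))
```

Calling `read_azure_settings()` with no argument checks the real environment instead of the example mapping.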

In [3]:
from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = embed_model

In [4]:
documents = SimpleDirectoryReader(
    input_files=["./paul_graham_essay.txt"]
).load_data()
index = VectorStoreIndex.from_documents(documents)

INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embedding-ada-002/embeddings?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embedding-ada-002/embeddings?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embedding-ada-002/embeddings?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embedding-ada-002/embeddings?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embedding-ada-002/embeddings?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embedding-ada-002/embeddings?api-version=2023-07-01-preview "

In [5]:
query = "What is most interesting about this essay?"
query_engine = index.as_query_engine()
answer = query_engine.query(query)

print(answer.get_formatted_sources())
print("query was:", query)
print("answer was:", answer)

INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embedding-ada-002/embeddings?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embedding-ada-002/embeddings?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
> Source (Doc id: 26c98266-13e2-4abb-9856-22f4f42848f5): Notes

[1] My experience skipped a step in the evolution of computers: time-sharing machines wi...

> Source (Doc id: c3313b5e-f7d9-4847-8817-914db725fe6b): What I Worked On

February 2021

Before college the two main things I worked on, outside of s...
query was: What is most

## Learn 'Query pipeline'

[llama_index/blob/main/docs/docs/examples/pipeline/query_pipeline.ipynb](https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/pipeline/query_pipeline.ipynb)

Todo:
- Store the graph representation of the pipeline with `networkx` or `pygraphviz`.
- Find out why a Cohere API key is needed.
- Finish the walkthrough.

### Setup

Running `import phoenix as px` raises this error:

```shell
Traceback (most recent call last):

  File c:\Users\haiyang\miniconda3\envs\llamaindex\Lib\site-packages\IPython\core\interactiveshell.py:3553 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  Cell In[2], line 2
    import phoenix as px

  File c:\Users\haiyang\miniconda3\envs\llamaindex\Lib\site-packages\phoenix\__init__.py:56
    except PhoenixError, e:
           ^
SyntaxError: multiple exception types must be parenthesized
```

Try `pip install llama-index-callbacks-arize-phoenix`.

Installing `llama-index-callbacks-arize-phoenix` then fails with:

```shell
      building 'hdbscan._hdbscan_tree' extension
      error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for hdbscan
Failed to build hdbscan
ERROR: Could not build wheels for hdbscan, which is required to install pyproject.toml-based projects
```

Try installing MS Visual Studio with only the C++ related options selected ([ref](https://github.com/run-llama/llama_index/issues/10602#issuecomment-1939692627)).

This fixes the installation of `llama-index-callbacks-arize-phoenix`.

Once `llama-index-callbacks-arize-phoenix` is installed, the `import phoenix as px` error is gone as well.

In [6]:
# setup Arize Phoenix for logging/observability
import phoenix as px

px.launch_app()
import llama_index.core

llama_index.core.set_global_handler("arize_phoenix")

INFO:phoenix.datasets.dataset:Dataset: phoenix_dataset_c99fd02a-0f34-4aa7-9401-34c234f5f601 initialized
Dataset: phoenix_dataset_c99fd02a-0f34-4aa7-9401-34c234f5f601 initialized


  from .autonotebook import tqdm as notebook_tqdm


🌍 To view the Phoenix app in your browser, visit http://localhost:6006/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix


In [7]:
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.core import Settings

# Settings.llm = OpenAI(model="gpt-3.5-turbo")
# Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="gpt35",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

# You need to deploy your own embedding model as well as your own chat completion model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="embedding-ada-002",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

Settings.llm = llm
Settings.embed_model = embed_model

In [8]:
from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader("./paul_graham")

docs = reader.load_data()

import os
from llama_index.core import (
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

if not os.path.exists("storage"):
    index = VectorStoreIndex.from_documents(docs)
    # save index to disk
    index.set_index_id("vector_index")
    index.storage_context.persist("./storage")
else:
    # rebuild storage context
    storage_context = StorageContext.from_defaults(persist_dir="storage")
    # load index
    index = load_index_from_storage(storage_context, index_id="vector_index")

INFO:llama_index.core.indices.loading:Loading indices with ids: ['vector_index']
Loading indices with ids: ['vector_index']


### Chain together Prompt and LLM

In [9]:
from llama_index.core.query_pipeline import QueryPipeline
from llama_index.core import PromptTemplate

# try chaining basic prompts
prompt_str = "Please generate related movies to {movie_name}"
prompt_tmpl = PromptTemplate(prompt_str)
llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="gpt35",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

p = QueryPipeline(chain=[prompt_tmpl, llm], verbose=True)

In [10]:
output = p.run(movie_name="The Departed")

> Running module cf1bf591-492b-4a96-b50f-33b33aee0ca5 with input:
movie_name: The Departed

> Running module 252752da-239d-4752-8bb9-829779b1ca55 with input:
messages: Please generate related movies to The Departed

INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"


In [11]:
print(str(output))

assistant: 1. Infernal Affairs (2002) - This is the original Hong Kong film that inspired The Departed. It follows a similar storyline of undercover cops infiltrating the criminal underworld.

2. Internal Affairs (1990) - This American crime thriller, starring Richard Gere and Andy Garcia, revolves around a corrupt cop and an internal affairs officer determined to expose him.

3. The Town (2010) - Directed by and starring Ben Affleck, this crime drama follows a group of bank robbers in Boston who find themselves in a dangerous situation when a heist goes wrong.

4. Heat (1995) - Directed by Michael Mann, this crime thriller features Al Pacino and Robert De Niro as a detective and a professional thief, respectively, whose paths cross in a high-stakes game of cat and mouse.

5. The Departed (2006) - Although it's the movie that inspired this list, it's worth mentioning The Departed itself. Directed by Martin Scorsese, it tells the story of an undercover cop and a mole in the police force
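
Conceptually, `QueryPipeline(chain=[prompt_tmpl, llm])` feeds each module's output into the next one. The toy sketch below mimics that flow with plain functions; `fake_llm` and `run_chain` are stand-ins invented for illustration and say nothing about how LlamaIndex implements the chain.

```python
from typing import Callable, List


def format_prompt(movie_name: str) -> str:
    """Fill the same template string used in the cell above."""
    return "Please generate related movies to {movie_name}".format(movie_name=movie_name)


def fake_llm(prompt: str) -> str:
    """Stand-in for the chat model: echoes the prompt instead of calling Azure."""
    return "assistant: (reply to) " + prompt


def run_chain(steps: List[Callable[[str], str]], value: str) -> str:
    """Pass the value through each step in order, like a sequential chain."""
    for step in steps:
        value = step(value)
    return value


result = run_chain([format_prompt, fake_llm], "The Departed")
print(result)
```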

Try output parsing

In [12]:
from typing import List
from pydantic import BaseModel, Field
from llama_index.core.output_parsers import PydanticOutputParser


class Movie(BaseModel):
    """Object representing a single movie."""

    name: str = Field(..., description="Name of the movie.")
    year: int = Field(..., description="Year of the movie.")


class Movies(BaseModel):
    """Object representing a list of movies."""

    movies: List[Movie] = Field(..., description="List of movies.")

llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="gpt35",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
output_parser = PydanticOutputParser(Movies)
json_prompt_str = """\
Please generate related movies to {movie_name}. Output with the following JSON format: 
"""
json_prompt_str = output_parser.format(json_prompt_str)

# add JSON spec to prompt template
json_prompt_tmpl = PromptTemplate(json_prompt_str)

p = QueryPipeline(chain=[json_prompt_tmpl, llm, output_parser], verbose=True)
output = p.run(movie_name="Toy Story")

> Running module a03729e8-a136-43af-a01a-e5b00fc00705 with input:
movie_name: Toy Story

> Running module 87fad292-b887-4e2b-a8cf-79f9c9c5ff00 with input:
messages: Please generate related movies to Toy Story. Output with the following JSON format: 



Here's a JSON schema to follow:
{"$defs": {"Movie": {"description": "Object representing a single movie.", "prop...

INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
> Running module bc4a1b9c-e9e9-49d3-8415-4b9ea75ed595 with input:
input: assistant: {
  "movies": [
    {
      "name": "Finding Nemo",
      "year": 2003
    },
    {
      "name": "Cars",
      "year": 2006
    },
    {
 

In [13]:
output

Movies(movies=[Movie(name='Finding Nemo', year=2003), Movie(name='Cars', year=2006), Movie(name='Up', year=2009), Movie(name='Inside Out', year=2015), Movie(name='Coco', year=2017)])
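
The last module receives the raw reply, which starts with an `assistant:` prefix, and returns a validated `Movies` object. A rough standard-library sketch of that idea, using `dataclasses` in place of Pydantic; `MovieRecord` and `parse_movies` are hypothetical names, and this is not the `PydanticOutputParser` implementation:

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class MovieRecord:
    name: str
    year: int


def parse_movies(reply: str) -> List[MovieRecord]:
    """Extract the outermost JSON object from an LLM reply and build records."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    payload = json.loads(reply[start : end + 1])
    return [MovieRecord(name=str(m["name"]), year=int(m["year"])) for m in payload["movies"]]


reply = 'assistant: {"movies": [{"name": "Finding Nemo", "year": 2003}, {"name": "Cars", "year": 2006}]}'
movies = parse_movies(reply)
print(movies)
```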

In [14]:
prompt_str = "Please generate related movies to {movie_name}"
prompt_tmpl = PromptTemplate(prompt_str)
# let's add some subsequent prompts for fun
prompt_str2 = """\
Here's some text:

{text}

Can you rewrite this with a summary of each movie?
"""
prompt_tmpl2 = PromptTemplate(prompt_str2)
llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="gpt35",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
llm_c = llm.as_query_component(streaming=True)

p = QueryPipeline(
    chain=[prompt_tmpl, llm_c, prompt_tmpl2, llm_c], verbose=True
)
# p = QueryPipeline(chain=[prompt_tmpl, llm_c], verbose=True)

output = p.run(movie_name="The Dark Knight")
for o in output:
    print(o.delta, end="")

> Running module 70ba6463-ca9e-4d07-b30c-cb7f53741b7b with input:
movie_name: The Dark Knight

> Running module 4db51fd4-b8b8-4b53-98e5-ee54aaf3ab18 with input:
messages: Please generate related movies to The Dark Knight

> Running module e4c13609-9df9-4487-832b-60d7e49b90ae with input:
text: <generator object llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen at 0x00000184639E7670>

INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
> Running module 218d70ba-1942-45cc-85d2-b7c1ee9d0ab9 with input:
messages: Here's some text:

1. Batman Begins (2005)
2. The Dark Knight Rises (2012)
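
With `streaming=True`, the LLM component hands back a generator of chunks rather than a finished string, which is why the log above shows `<generator object ...>` as the `text` input. A toy sketch of producing and consuming such a stream; `Chunk` and `fake_stream` are invented here and only imitate the `.delta` attribute used in the loop above:

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Chunk:
    delta: str  # the newly generated piece of text


def fake_stream(text: str, size: int = 8) -> Iterator[Chunk]:
    """Yield the text in small pieces, the way a streaming reply arrives."""
    for i in range(0, len(text), size):
        yield Chunk(delta=text[i : i + size])


stream = fake_stream("1. Batman Begins (2005)\n2. The Dark Knight Rises (2012)")
full_text = "".join(chunk.delta for chunk in stream)
print(full_text)
```

A generator can be consumed only once, so the joined text has to be kept if it is needed again later.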

In [15]:
p = QueryPipeline(
    chain=[
        json_prompt_tmpl,
        llm.as_query_component(streaming=True),
        output_parser,
    ],
    verbose=True,
)
output = p.run(movie_name="Toy Story")
print(output)

> Running module 656d886f-1746-4a28-9d5c-86b3a7766e5f with input:
movie_name: Toy Story

> Running module 5e82954c-4839-4925-b419-cacf8bd2e97b with input:
messages: Please generate related movies to Toy Story. Output with the following JSON format: 



Here's a JSON schema to follow:
{"$defs": {"Movie": {"description": "Object representing a single movie.", "prop...

> Running module 44c435dd-a261-4cac-bda4-6bc9ca9e3a9e with input:
input: <generator object llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen at 0x0000018463AD9250>

INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
movies=[Movie(name='Findi

### Chain Together Query Rewriting Workflow (prompt + LLM) with Retrieval

In [16]:
#!pip install llama-index-postprocessor-cohere-rerank

In [17]:
from llama_index.postprocessor.cohere_rerank import CohereRerank

# generate question regarding topic
prompt_str1 = "Please generate a concise question about Paul Graham's life regarding the following topic {topic}"
prompt_tmpl1 = PromptTemplate(prompt_str1)
# use HyDE to hallucinate answer.
prompt_str2 = (
    "Please write a passage to answer the question\n"
    "Try to include as many key details as possible.\n"
    "\n"
    "\n"
    "{query_str}\n"
    "\n"
    "\n"
    'Passage:"""\n'
)
prompt_tmpl2 = PromptTemplate(prompt_str2)

llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="gpt35",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

retriever = index.as_retriever(similarity_top_k=5)
p = QueryPipeline(
    chain=[prompt_tmpl1, llm, prompt_tmpl2, llm, retriever], verbose=True
)

In [18]:
nodes = p.run(topic="college")
len(nodes)

> Running module d8b63681-2b37-4cda-94dc-105b0e213a6d with input:
topic: college

> Running module befa5e03-7ac4-47e1-a360-8edb156b59bc with input:
messages: Please generate a concise question about Paul Graham's life regarding the following topic college

INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
> Running module 2ed88a25-8b01-4f6a-8b04-b1937ddb6264 with input:
query_str: assistant: How did Paul Graham's college experience shape his career and entrepreneurial mindset?

> Running module a052e8d3-3be4-4182-80c1-a5bc1adb84d2 with input:
messages: Please write a passage to answer the question
Try to inc

5
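
The chain above follows the HyDE idea: instead of retrieving with the short question, the LLM first writes a hypothetical passage, and that passage is used as the retrieval query. A toy sketch with word-overlap scoring in place of embeddings; the documents, `fake_hyde_passage`, and the scoring function are all made up for illustration:

```python
import re
from typing import List


def tokens(text: str) -> set:
    """Lowercase word set; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two texts."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def fake_hyde_passage(question: str) -> str:
    """Stand-in for the LLM call that writes a hypothetical answer."""
    return question + " He studied philosophy and then switched to artificial intelligence."


def retrieve(query: str, docs: List[str], top_k: int = 1) -> List[str]:
    """Return the top_k documents ranked by overlap with the query."""
    return sorted(docs, key=lambda d: overlap(query, d), reverse=True)[:top_k]


docs = [
    "In college he planned to study philosophy but switched to artificial intelligence.",
    "Y Combinator funded startups in batches twice a year.",
]
question = "What did Paul Graham do in college?"
print(retrieve(fake_hyde_passage(question), docs))
```

The hypothetical passage shares many more words with the relevant document than the bare question does, which is the effect HyDE relies on.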

### Create a Full RAG Pipeline as a DAG

In [19]:
from llama_index.postprocessor.cohere_rerank import CohereRerank
from llama_index.core.response_synthesizers import TreeSummarize

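# TODO: set COHERE_API_KEY to a real Cohere key. The Azure OpenAI key reused below is only
# a placeholder and is probably not accepted by Cohere (see the 403 error in the last cell).
os.environ["COHERE_API_KEY"] = api_key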

# define modules
prompt_str = "Please generate a question about Paul Graham's life regarding the following topic {topic}"
prompt_tmpl = PromptTemplate(prompt_str)
llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="gpt35",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
retriever = index.as_retriever(similarity_top_k=3)
reranker = CohereRerank()
summarizer = TreeSummarize(llm=llm)

In [20]:
# define query pipeline
p = QueryPipeline(verbose=True)
p.add_modules(
    {
        "llm": llm,
        "prompt_tmpl": prompt_tmpl,
        "retriever": retriever,
        "summarizer": summarizer,
        "reranker": reranker,
    }
)

In [21]:
p.add_link("prompt_tmpl", "llm")
p.add_link("llm", "retriever")
p.add_link("retriever", "reranker", dest_key="nodes")
p.add_link("llm", "reranker", dest_key="query_str")
p.add_link("reranker", "summarizer", dest_key="nodes")
p.add_link("llm", "summarizer", dest_key="query_str")

# look at summarizer input keys
print(summarizer.as_query_component().input_keys)

required_keys={'query_str', 'nodes'} optional_keys=set()
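
The six `add_link` calls define a DAG, and a module can only run once all of its inputs are available. The sketch below recomputes a valid execution order from the same edges with the standard-library `graphlib` (Python 3.9+). It illustrates the dependency structure and is not a description of how `QueryPipeline` schedules modules internally.

```python
from graphlib import TopologicalSorter

# (source, destination) pairs copied from the add_link calls above.
links = [
    ("prompt_tmpl", "llm"),
    ("llm", "retriever"),
    ("retriever", "reranker"),
    ("llm", "reranker"),
    ("reranker", "summarizer"),
    ("llm", "summarizer"),
]

sorter = TopologicalSorter()
for src, dst in links:
    sorter.add(dst, src)  # dst depends on src

order = list(sorter.static_order())
print(order)
```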


In [22]:
# !pip install pyvis

In [23]:
# !pip install pygraphviz 

In [24]:
# TODO: fix this part so the graph representation of the pipeline can be stored
# using pyvis: encoding error
# using pygraphviz: cannot install pygraphviz in the current environment

## create graph
# from pyvis.network import Network

# net = Network(notebook=True, cdn_resources="in_line", directed=True)
# net.from_nx(p.dag)
# net.show("rag_dag.html")

# # another option using `pygraphviz`
# from networkx.drawing.nx_agraph import to_agraph
# from IPython.display import Image
# agraph = to_agraph(p.dag)
# agraph.layout(prog="dot")
# agraph.draw('rag_dag.png')
# display(Image('rag_dag.png'))
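
Until the `pyvis` and `pygraphviz` problems are sorted out, one dependency-free fallback is to write the edges as Graphviz DOT text, which can be rendered later with any Graphviz tool. A minimal sketch, assuming the edge list is copied by hand from the `add_link` calls; `to_dot` is a hypothetical helper:

```python
from typing import Iterable, Tuple


def to_dot(edges: Iterable[Tuple[str, str]], name: str = "rag_dag") -> str:
    """Render directed edges as a Graphviz DOT document."""
    lines = ["digraph " + name + " {"]
    for src, dst in edges:
        lines.append('    "{}" -> "{}";'.format(src, dst))
    lines.append("}")
    return "\n".join(lines)


edges = [
    ("prompt_tmpl", "llm"),
    ("llm", "retriever"),
    ("retriever", "reranker"),
    ("llm", "reranker"),
    ("reranker", "summarizer"),
    ("llm", "summarizer"),
]
dot_text = to_dot(edges)
print(dot_text)
```

When saving the text, passing an explicit `encoding="utf-8"` to `open()` avoids depending on the platform default encoding.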

In [25]:
# TODO: Needs cohere api key.
# TODO: fix the issue with the graph representation

# response = p.run(topic="YC")

> Running module prompt_tmpl with input:
topic: YC

> Running module llm with input:
messages: Please generate a question about Paul Graham's life regarding the following topic YC

INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/gpt35/chat/completions?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
> Running module retriever with input:
input: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?

INFO:httpx:HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embedding-ada-002/embeddings?api-version=2023-07-01-preview "HTTP/1.1 200 OK"
HTTP Request: POST https://haiyang-azopenai-test.openai.azure.com//openai/deployments/embe

ApiError: status_code: 403, body: <!doctype html><meta charset="utf-8"><meta name=viewport content="width=device-width, initial-scale=1"><title>403</title>403 Forbidden