# LlamaIndex + Pinecone + TruLens

In this quickstart you will create a simple LlamaIndex app with Pinecone to answer complex queries over multiple data sources. You will also log it with TruLens and get feedback on the LLM's responses.

* While Pinecone provides a powerful and efficient retrieval engine, it remains challenging to answer complex questions that require multi-step reasoning and synthesis over many data sources.

* With LlamaIndex, we combine the power of vector similarity search and multi-step reasoning to deliver higher-quality, richer responses.

* On top of it all, TruLens allows us to track and manage our experiments and get feedback on the quality of our app.

Here, we show 2 specific use-cases:

1. Compare-and-contrast queries over Wikipedia articles about different cities.

2. Temporal queries that require reasoning over time.

## Setup
### Add API keys
For this quickstart you will need OpenAI, Hugging Face, and Pinecone API keys.

In [1]:
# The imports below come from trulens_eval, llama_index and pinecone.
# Note: this notebook uses older APIs such as `pinecone.init`; newer releases
# may not provide them, so you may need to pin older versions.
! pip install trulens-eval llama-index pinecone-client


In [2]:
import os
os.environ["OPENAI_API_KEY"] = ""
os.environ["HUGGINGFACE_API_KEY"] = ""

PINECONE_API_KEY = ""
PINECONE_ENV = ""
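
Hardcoding keys in a notebook makes them easy to leak when the notebook is shared. As a minimal sketch, assuming the keys are exported in the shell before the notebook starts, the check below reports which required variables are still missing. The helper `missing_keys` and the use of `PINECONE_API_KEY` and `PINECONE_ENV` as environment variable names are assumptions for illustration, not part of any of the three libraries.

```python
import os

# Environment variable names this quickstart is assumed to read.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "HUGGINGFACE_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_ENV",
]


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the names in `required` that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Hypothetical usage: warn early instead of failing on the first API call.
missing = missing_keys(os.environ)
if missing:
    print("Set these variables before running the notebook:", ", ".join(missing))
```

With this in place, `PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]` could replace the empty string literals above.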

### Import from Pinecone, LlamaIndex and TruLens

In [3]:
# Pinecone
import pinecone
# TruLens
from trulens_eval import TruLlama, Feedback, Huggingface, Tru
tru = Tru()
# LlamaIndex
from llama_index import VectorStoreIndex
from llama_index import StorageContext
from llama_index.vector_stores import PineconeVectorStore
from llama_index.indices.composability import ComposableGraph
from llama_index.indices.keyword_table.simple_base import SimpleKeywordTableIndex
from llama_index.indices.query.query_transform.base import DecomposeQueryTransform
from llama_index.query_engine.transform_query_engine import TransformQueryEngine

# Others
from pathlib import Path
import requests





### Initialize Pinecone Index

In [4]:
pinecone.init(api_key=PINECONE_API_KEY, environment=PINECONE_ENV)

index_name = "quickstart-index"

# create index if it does not already exist
# dimensions are for text-embedding-ada-002
if index_name not in pinecone.list_indexes():
    pinecone.create_index(
        index_name,
        dimension=1536,
        metric="euclidean",
        pod_type="starter",
    )

pinecone_index = pinecone.Index(index_name)
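
The index is created with the `euclidean` metric. For unit-length vectors, squared Euclidean distance and cosine similarity are tied by d² = 2 − 2·cos, so both metrics rank neighbours in the same order. The sketch below checks this in plain Python with made-up 3-dimensional vectors standing in for the 1536-dimensional embeddings. It assumes the embeddings are normalised to unit length; if they are not, the two metrics can disagree.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    # A plain dot product equals the cosine only for unit-length inputs.
    return sum(x * y for x, y in zip(a, b))


# Made-up vectors for illustration only.
query = normalize([1.0, 2.0, 3.0])
docs = {
    "doc_a": normalize([1.0, 2.0, 2.5]),
    "doc_b": normalize([3.0, -1.0, 0.5]),
    "doc_c": normalize([0.0, 1.0, 1.0]),
}

# Nearest first under each metric.
by_distance = sorted(docs, key=lambda k: squared_euclidean(query, docs[k]))
by_cosine = sorted(docs, key=lambda k: -cosine(query, docs[k]))

print(by_distance == by_cosine)
```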

## Load Dataset

In [5]:
from llama_index import SimpleDirectoryReader

In [6]:
wiki_titles = ["Toronto", "Seattle", "San Francisco", "Chicago", "Boston", "Washington, D.C.", "Cambridge, Massachusetts", "Houston"]

data_path = Path('data_wiki')

for title in wiki_titles:
    response = requests.get(
        'https://en.wikipedia.org/w/api.php',
        params={
            'action': 'query',
            'format': 'json',
            'titles': title,
            'prop': 'extracts',
            'explaintext': True,
        }
    ).json()
    page = next(iter(response['query']['pages'].values()))
    wiki_text = page['extract']

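    # create the data directory on first use
    data_path.mkdir(exist_ok=True)

    # write as UTF-8 so non-ASCII characters in the articles are preserved
    with open(data_path / f"{title}.txt", 'w', encoding='utf-8') as fp:
        fp.write(wiki_text)

# Load all wiki documents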
city_docs = {}
all_docs = []
for wiki_title in wiki_titles:
    city_docs[wiki_title] = SimpleDirectoryReader(input_files=[data_path / f"{wiki_title}.txt"]).load_data()
    all_docs.extend(city_docs[wiki_title])
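
The download loop takes the first value of `response['query']['pages']` because the API keys that mapping by page ID rather than by title. The sketch below isolates that step using a hand-written payload shaped like the JSON response; the page ID and text are made up. The helper name `extract_text` is hypothetical, and returning `None` for a page without an `extract` field is an assumption about how a missing title should be handled.

```python
def extract_text(api_response):
    """Return the plain-text extract of the first page in a MediaWiki
    `action=query` response, or None if that page carries no extract."""
    pages = api_response["query"]["pages"]
    page = next(iter(pages.values()))  # pages are keyed by page ID
    return page.get("extract")


# Hand-written payload for illustration; not real API output.
sample = {
    "query": {
        "pages": {
            "12345": {
                "pageid": 12345,
                "title": "Toronto",
                "extract": "Toronto is a city in Canada.",
            }
        }
    }
}

print(extract_text(sample))
```

Checking for `None` before writing the file would avoid a `KeyError` if one of the titles does not resolve to an article.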


### Build Indices

In [7]:
# Build index for each city document
city_indices = {}
index_summaries = {}
for wiki_title in wiki_titles:
    print(f"Building index for {wiki_title}")
    # create storage context
    vector_store = PineconeVectorStore(pinecone_index=pinecone_index, namespace=wiki_title)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    
    # build index
    city_indices[wiki_title] = VectorStoreIndex.from_documents(city_docs[wiki_title], storage_context=storage_context)

    # set summary text for city
    index_summaries[wiki_title] = f"Wikipedia articles about {wiki_title}"


Building index for Toronto
Building index for Seattle
Building index for San Francisco
Building index for Chicago
Building index for Boston
Building index for Washington, D.C.
Building index for Cambridge, Massachusetts
Building index for Houston

### Build Graph Query Engine for Compare & Contrast Query

In [8]:
graph = ComposableGraph.from_indices(
    SimpleKeywordTableIndex,
    [index for _, index in city_indices.items()], 
    [summary for _, summary in index_summaries.items()],
    max_keywords_per_chunk=50
)



decompose_transform = DecomposeQueryTransform(verbose=True)

custom_query_engines = {}
for wiki_title in wiki_titles:
    index = city_indices[wiki_title]
    query_engine = index.as_query_engine()
    query_engine = TransformQueryEngine(
        query_engine,
        query_transform=decompose_transform,
        transform_extra_info={'index_summary': index_summaries[wiki_title]},
    )
    custom_query_engines[index.index_id] = query_engine

custom_query_engines[graph.root_id] = graph.root_index.as_query_engine(
    retriever_mode='simple',
    response_mode='tree_summarize',
)

# with query decomposition in subindices
query_engine = graph.as_query_engine(custom_query_engines=custom_query_engines)
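
Conceptually, the graph query engine does three things: it routes the question to the city indices that look relevant, asks each one a simpler sub-question, and then combines the partial answers. The toy below mirrors that flow in plain Python. It is not LlamaIndex's implementation: word overlap stands in for keyword extraction, and the `decompose`, `sub_engines`, and `synthesize` callables are hypothetical stand-ins for the LLM-backed components.

```python
import re


def tokenize(text):
    """Lowercase word set; a crude stand-in for keyword extraction."""
    return set(re.findall(r"[a-z]+", text.lower()))


def route(query, summaries):
    """Return the keys whose summary shares at least one word with the query."""
    query_words = tokenize(query)
    return [key for key, summary in summaries.items() if tokenize(summary) & query_words]


def answer(query, summaries, sub_engines, decompose, synthesize):
    """Route, ask each selected sub-engine a simpler question, then combine."""
    partial_answers = []
    for key in route(query, summaries):
        sub_question = decompose(query, summaries[key])
        partial_answers.append(sub_engines[key](sub_question))
    return synthesize(query, partial_answers)


# Stand-ins for illustration only; the real components call an LLM.
cities = ["Toronto", "Seattle", "Chicago", "Houston"]
summaries = {c: f"Wikipedia articles about {c}" for c in cities}
sub_engines = {c: (lambda q, c=c: f"[{c}] answer to: {q}") for c in cities}
decompose = lambda query, summary: f"What are the demographics of {summary.split()[-1]}?"
synthesize = lambda query, parts: "\n".join(parts)

question = "Compare and contrast the demographics in Seattle, Houston, and Toronto."
result = answer(question, summaries, sub_engines, decompose, synthesize)
print(result)
```

In this toy, Chicago is skipped because its summary shares no word with the question, which is the role the root keyword index plays in the graph.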

### Run Query

In [9]:
response = query_engine.query("Compare and contrast the demographics in Seattle, Houston, and Toronto.")

from llama_index.response.pprint_utils import pprint_response

pprint_response(response)

> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Houston?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Houston?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Toronto?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Toronto?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Seattle?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.

## Initialize Feedback Function(s)

In [10]:
# Initialize Huggingface-based feedback function collection class:
hugs = Huggingface()

# Define a language match feedback function using HuggingFace.
f_lang_match = Feedback(hugs.language_match).on_input_output()
# By default this will check language match on the main app input and main app
# output.

✅ In language_match, input text1 will be set to *.__record__.main_input or `Select.RecordInput` .
✅ In language_match, input text2 will be set to *.__record__.main_output or `Select.RecordOutput` .
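
A feedback function is, at its core, a callable that scores an input/output pair with a number between 0 and 1. As a hedged sketch of what a custom one for this app might look like, the function below measures how many of the cities named in the question are also mentioned in the answer. The name `city_coverage` and the scoring rule are hypothetical, and the registration shown in the final comment is an assumption modelled on the `language_match` pattern above rather than something this notebook runs.

```python
WIKI_TITLES = [
    "Toronto", "Seattle", "San Francisco", "Chicago", "Boston",
    "Washington, D.C.", "Cambridge, Massachusetts", "Houston",
]


def city_coverage(question: str, answer: str) -> float:
    """Fraction of the cities named in `question` that `answer` also mentions.

    Returns 1.0 when the question names none of the known cities.
    """
    q, a = question.lower(), answer.lower()
    asked = [city for city in WIKI_TITLES if city.lower() in q]
    if not asked:
        return 1.0
    covered = [city for city in asked if city.lower() in a]
    return len(covered) / len(asked)


score = city_coverage(
    "Compare and contrast the demographics in Seattle, Houston, and Toronto.",
    "Seattle is smaller than Toronto.",
)
print(score)

# Assumption: it could be registered like the built-in function, e.g.
# f_city_coverage = Feedback(city_coverage).on_input_output()
```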


## Instrument query engine for logging with TruLens

In [11]:
tru_query_engine = TruLlama(query_engine,
    app_id='LlamaIndex_with_Pinecone_App1',
    feedbacks=[f_lang_match])

✅ app LlamaIndex_with_Pinecone_App1 -> default.sqlite
✅ feedback def. feedback_definition_hash_81275c68ccfb6a7f48908e7d3841f7e0 -> default.sqlite


In [12]:
# Instrumented query engine can operate like the original:
llm_response = tru_query_engine.query("Compare and contrast the demographics in Seattle, Houston, and Toronto.")

print(llm_response)

> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Houston?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Houston?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Toronto?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Toronto?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.
> New query:  What is the population of Seattle?
> Current query: Compare and contrast the demographics in Seattle, Houston, and Toronto.

## Explore in a Dashboard

In [13]:
tru.run_dashboard() # open a local streamlit app to explore

# tru.stop_dashboard() # stop if needed


Starting dashboard ...



Waiting for {'error': 'Model papluca/xlm-roberta-base-language-detection is currently loading', 'estimated_time': 44.49275207519531} (44.49275207519531) second(s).


Dashboard started at http://192.168.4.23:8501 .

