# 📓 Llama-Index Quickstart

In this quickstart you will create a simple LlamaIndex app, log it with TruLens, and get feedback on an LLM response.

For evaluation, we will leverage the "hallucination triad" of groundedness, context relevance and answer relevance.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/main/trulens_eval/examples/quickstart/llama_index_quickstart.ipynb)

## Setup

### Install dependencies
Install the dependencies for this notebook if you don't have them already:

In [1]:
# pip install trulens_eval llama_index openai

### Add API keys
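For this quickstart, you will need an OpenAI key. It is used for embeddings, for the GPT model behind the query engine, and for the OpenAI-based feedback functions defined below. Replace the placeholder with your own key, and avoid committing a real key to a shared notebook.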

In [2]:
import os
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder: substitute your own key
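
Rather than pasting a key into a cell, one option is to read it from the environment and prompt for it only when it is missing. The following is a minimal sketch of that idea. `ensure_env_key` is a hypothetical helper written for this notebook and is not part of TruLens, LlamaIndex, or the OpenAI client.

```python
import os
from getpass import getpass


def ensure_env_key(name, prompt=getpass):
    """Return environment variable `name`, prompting once if it is unset.

    Hypothetical helper for this notebook, not a library function.
    """
    value = os.environ.get(name)
    if not value:
        # getpass hides the typed key, so it never appears in the cell output.
        value = prompt(f"Enter {name}: ")
        os.environ[name] = value
    return value


# Usage (uncomment in the notebook):
# ensure_env_key("OPENAI_API_KEY")
```

Because the key is typed at run time, it is not stored in the notebook file itself.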

### Import from TruLens

In [3]:
from trulens_eval import Tru
tru = Tru()

🦑 Tru initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of Tru` to prevent this.


### Download data

This example uses the text of Paul Graham’s essay, [“What I Worked On”](https://paulgraham.com/worked.html), and is the canonical llama-index example.

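The easiest way to get it is to [download it via this link](https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt) and save it in a folder called data. Note that the loading cell further below reads from `rag_api/uploads`, so either save the file there or adjust that path. The next cell prints the current working directory, which is where relative paths are resolved: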

In [4]:
import os
os.getcwd()

'c:\\Users\\Evan\\Desktop\\training_llm\\rag_api'
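
If you prefer to fetch the essay from Python instead of the browser, a small helper along these lines could do it. This is a sketch using only the standard library. `download_to_folder` is a hypothetical name, and the default folder `data` is an assumption that should match whatever directory `SimpleDirectoryReader` is pointed at.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL taken from the download link in the text above.
ESSAY_URL = (
    "https://raw.githubusercontent.com/run-llama/llama_index/main/"
    "docs/examples/data/paul_graham/paul_graham_essay.txt"
)


def download_to_folder(url, folder="data"):
    """Download `url` into `folder` (created if missing) and return the file path."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Use the last URL path segment as the local file name.
    target = target_dir / url.rsplit("/", 1)[-1]
    if not target.exists():  # skip the download when the file is already there
        urlretrieve(url, str(target))
    return target


# Usage (requires network access):
# download_to_folder(ESSAY_URL, folder="data")
```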

### Create Simple LLM Application

This example uses LlamaIndex, which internally uses an OpenAI LLM.

In [5]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load every document found in the folder (path is relative to the working directory)
documents = SimpleDirectoryReader("rag_api/uploads").load_data()
# Embed the documents and build an in-memory vector index
index = VectorStoreIndex.from_documents(documents)
# Wrap the index in a query engine that retrieves context and calls the LLM
query_engine = index.as_query_engine()


### Send your first request

In [6]:
response = query_engine.query("What did the author do growing up?")
print(response)

The author attended government schools for 9 out of 12 years before spending 2.5 years at a particular school.


## Initialize Feedback Function(s)

In [7]:
import numpy as np

# Initialize provider class
from trulens_eval.feedback.provider.openai import OpenAI
openai = OpenAI()

# Select the context to be used in feedback. The location of the context is app-specific.
from trulens_eval.app import App
context = App.select_context(query_engine)

# imports for feedback
from trulens_eval import Feedback

# Define a groundedness feedback function, reusing the provider created above
from trulens_eval.feedback import Groundedness
grounded = Groundedness(groundedness_provider=openai)
f_groundedness = (
    Feedback(grounded.groundedness_measure_with_cot_reasons)
    .on(context.collect()) # collect context chunks into a list
    .on_output()
    .aggregate(grounded.grounded_statements_aggregator)
)

# Question/answer relevance between overall question and answer.
f_qa_relevance = Feedback(openai.relevance).on_input_output()

# Question/statement relevance between question and each context chunk.
f_qs_relevance = (
    Feedback(openai.qs_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)

✅ In groundedness_measure_with_cot_reasons, input source will be set to __record__.app.query.rets.source_nodes[:].node.text.collect() .
✅ In groundedness_measure_with_cot_reasons, input statement will be set to __record__.main_output or `Select.RecordOutput` .
✅ In relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In relevance, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In qs_relevance, input question will be set to __record__.main_input or `Select.RecordInput` .
✅ In qs_relevance, input statement will be set to __record__.app.query.rets.source_nodes[:].node.text .
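
The context relevance feedback is evaluated once per retrieved chunk, and `.aggregate(np.mean)` then collapses those per-chunk scores into a single value for the record. The sketch below illustrates only that aggregation step. The scores are hypothetical, `aggregate_chunk_scores` is an illustrative name, and `statistics.fmean` stands in for `np.mean` so that the example needs no third-party packages.

```python
from statistics import fmean


def aggregate_chunk_scores(chunk_scores, aggregator=fmean):
    """Collapse per-chunk relevance scores into one record-level score."""
    if not chunk_scores:
        raise ValueError("need at least one context chunk score")
    return aggregator(chunk_scores)


# Hypothetical per-chunk scores for two retrieved chunks (not real TruLens output).
example_scores = [0.1, 0.2]
context_relevance = aggregate_chunk_scores(example_scores)
print(round(context_relevance, 2))
```

Swapping the aggregator (for example `min` or `max`) changes how strictly a single irrelevant chunk is penalized.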


## Instrument app for logging with TruLens

In [8]:
from trulens_eval import TruLlama
tru_query_engine_recorder = TruLlama(query_engine,
    app_id='LlamaIndex_App1',
    feedbacks=[f_groundedness, f_qa_relevance, f_qs_relevance])

In [9]:
# Use the recorder as a context manager: calls made inside the block are recorded
with tru_query_engine_recorder as recording:
    query_engine.query("What did the author do growing up?")

## Retrieve records and feedback

In [10]:
# The record of the app invocation can be retrieved from the `recording`:

rec = recording.get() # use .get if only one record
# recs = recording.records # use .records if multiple

display(rec)

Record(record_id='record_hash_56e53381af40be526ab632ae206ada68', app_id='LlamaIndex_App1', cost=Cost(n_requests=2, n_successful_requests=2, n_classes=0, n_tokens=2056, n_stream_chunks=0, n_prompt_tokens=2031, n_completion_tokens=25, cost=0.0030845000000000004), perf=Perf(start_time=datetime.datetime(2024, 2, 29, 16, 11, 59, 342173), end_time=datetime.datetime(2024, 2, 29, 16, 12, 3, 630191)), ts=datetime.datetime(2024, 2, 29, 16, 12, 3, 631727), tags='-', meta=None, main_input='What did the author do growing up?', main_output='The author attended government schools for 9 out of 12 years before spending 2.5 years at a particular school.', main_error=None, calls=[RecordAppCall(stack=[RecordAppCallMethod(path=Lens().app, method=Method(obj=Obj(cls=llama_index.core.query_engine.retriever_query_engine.RetrieverQueryEngine, id=2416097729744, init_bindings=None), name='query')), RecordAppCallMethod(path=Lens().app, method=Method(obj=Obj(cls=llama_index.core.query_engine.retriever_query_engine.

In [11]:
tru.run_dashboard()

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.

Dashboard started at http://192.168.1.29:8501 .


<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

In [12]:
# The results of the feedback functions can be retrieved from
# `Record.feedback_results` or with the `wait_for_feedback_results` method.
# Results retrieved directly are `Future` instances (see
# `concurrent.futures`). You can use `as_completed` to wait until they have
# finished evaluating, or use the utility method:

for feedback, feedback_result in rec.wait_for_feedback_results().items():
    print(feedback.name, feedback_result.result)

# See more about wait_for_feedback_results:
# help(rec.wait_for_feedback_results)

groundedness_measure_with_cot_reasons 0.5
relevance 0.8
qs_relevance 0.15000000000000002
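
For readers unfamiliar with `concurrent.futures`, the following standalone sketch shows the `as_completed` pattern mentioned in the comment above. It does not call TruLens. `fake_feedback` and `collect_results` are hypothetical stand-ins, and the names and scores are for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_feedback(name, score):
    """Stand-in for a slow feedback evaluation; returns (name, score)."""
    return name, score


def collect_results(jobs):
    """Run each (name, score) job in a thread and gather results as they finish."""
    results = {}
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(fake_feedback, name, score) for name, score in jobs]
        # as_completed yields each Future once its result is ready.
        for future in as_completed(futures):
            name, score = future.result()
            results[name] = score
    return results


# Hypothetical feedback names and scores, for illustration only.
results = collect_results([("relevance", 0.8), ("groundedness", 0.5)])
print(results)
```

Results arrive in completion order rather than submission order, which is why they are keyed by name here.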


In [13]:
records, feedback = tru.get_records_and_feedback(app_ids=["LlamaIndex_App1"])

records.head()

|  | app_id | app_json | type | record_id | input | output | tags | record_json | cost_json | perf_json | ts | relevance | qs_relevance | groundedness_measure_with_cot_reasons | relevance_calls | qs_relevance_calls | groundedness_measure_with_cot_reasons_calls | latency | total_tokens | total_cost |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | LlamaIndex_App1 | `{"tru_class_info": {"name": "TruLlama", "modul...` | `RetrieverQueryEngine(llama_index.core.query_en...` | record_hash_ddf49602dea32aa32f54f9a400508f6b | `"What did the author do growing up?"` | `"The author attended government schools for 9 ...` | - | `{"record_id": "record_hash_ddf49602dea32aa32f5...` | `{"n_requests": 2, "n_successful_requests": 2, ...` | `{"start_time": "2024-02-29T16:09:24.217268", "...` | 2024-02-29T16:09:29.803582 | 0.8 | 0.15 | 0.5 | `[{'args': {'prompt': 'What did the author do g...` | `[{'args': {'question': 'What did the author do...` | `[{'args': {'source': ['@Teslarati @KlenderJoey...` | 5 | 2045 | 0.003063 |
| 1 | LlamaIndex_App1 | `{"tru_class_info": {"name": "TruLlama", "modul...` | `RetrieverQueryEngine(llama_index.core.query_en...` | record_hash_56e53381af40be526ab632ae206ada68 | `"What did the author do growing up?"` | `"The author attended government schools for 9 ...` | - | `{"record_id": "record_hash_56e53381af40be526ab...` | `{"n_requests": 2, "n_successful_requests": 2, ...` | `{"start_time": "2024-02-29T16:11:59.342173", "...` | 2024-02-29T16:12:03.631727 | 0.8 | 0.15 | 0.5 | `[{'args': {'prompt': 'What did the author do g...` | `[{'args': {'question': 'What did the author do...` | `[{'args': {'source': ['@Teslarati @KlenderJoey...` | 4 | 2056 | 0.003085 |


In [14]:
tru.get_leaderboard(app_ids=["LlamaIndex_App1"])

| app_id | relevance | groundedness_measure_with_cot_reasons | qs_relevance | latency | total_cost |
|---|---|---|---|---|---|
| LlamaIndex_App1 | 0.8 | 0.5 | 0.15 | 4.5 | 0.003074 |
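
The leaderboard figures are consistent with a per-app mean of the record-level columns: the two records above have latencies 5 and 4 and costs 0.003063 and 0.003085. The sketch below reproduces that arithmetic in plain Python. It is an inference from the numbers shown, not a description of how TruLens computes the leaderboard, and `leaderboard` here is a hypothetical function.

```python
from collections import defaultdict
from statistics import fmean

# Latency and cost of the two records shown in the records table above.
records = [
    {"app_id": "LlamaIndex_App1", "latency": 5, "total_cost": 0.003063},
    {"app_id": "LlamaIndex_App1", "latency": 4, "total_cost": 0.003085},
]


def leaderboard(rows, metrics=("latency", "total_cost")):
    """Average each metric per app_id (hypothetical re-implementation)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["app_id"]].append(row)
    return {
        app_id: {m: fmean([r[m] for r in app_rows]) for m in metrics}
        for app_id, app_rows in grouped.items()
    }


board = leaderboard(records)
print(board["LlamaIndex_App1"])
```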


## Explore in a Dashboard

In [15]:
tru.run_dashboard() # open a local streamlit app to explore

# tru.stop_dashboard() # stop if needed

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.
Dashboard already running at path:   Network URL: http://192.168.1.29:8501



<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

Alternatively, you can run `trulens-eval` from a command line in the same folder to start the dashboard.