# 📓 🦙 LlamaIndex Integration

TruLens provides TruLlama, a deep integration with LlamaIndex that lets you
inspect and evaluate the internals of an application built with LlamaIndex.
This is done by instrumenting key LlamaIndex classes and methods. For the full
list of instrumented classes and methods, see *Appendix: LlamaIndex
Instrumented Classes and Methods*.

In addition to the default instrumentation, TruLlama exposes the
*select_context* and *select_source_nodes* methods for evaluations that require
access to retrieved context or source nodes. Exposing these methods removes the
need to know the JSON structure of your app ahead of time and makes your
evaluations reusable across different apps.
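
To illustrate the problem these selectors avoid, here is a minimal sketch in plain Python. It is not TruLens code: the record layouts and the `resolve` helper are hypothetical, chosen only to show that a path hard-coded against one app's structure does not carry over to another app.

```python
from functools import reduce

# Hypothetical, simplified records from two differently structured apps.
# Real TruLens records are richer; these dicts are for illustration only.
record_a = {
    "app": {"retriever": {"nodes": [{"text": "chunk 1"}, {"text": "chunk 2"}]}}
}
record_b = {
    "app": {"query_engine": {"retriever": {"nodes": [{"text": "chunk 3"}]}}}
}


def resolve(record, path):
    """Follow a sequence of keys into a nested dict (hypothetical helper)."""
    return reduce(lambda node, key: node[key], path, record)


# A hard-coded path works only for the app whose layout it was written for.
path_a = ["app", "retriever", "nodes"]
texts_a = [node["text"] for node in resolve(record_a, path_a)]

try:
    resolve(record_b, path_a)
    portable = True
except KeyError:
    # record_b nests its retriever one level deeper, so the path breaks.
    portable = False
```

A helper that derives the path from the app object itself, as the selector methods described above do, lets the evaluation definition stay the same when the app changes.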


## Example usage

Below is a quick example of usage. First, we'll create a standard LlamaIndex query engine from Paul Graham's essay, *What I Worked On*.

In [4]:
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader

documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()

To instrument a LlamaIndex query engine, all that's required is to wrap it using TruLlama.

In [5]:
from trulens.ext.instrument.llamaindex import TruLlama

tru_query_engine_recorder = TruLlama(query_engine)

with tru_query_engine_recorder as recording:
    print(query_engine.query("What did the author do growing up?"))

🦑 Tru initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of Tru` to prevent this.
The author, growing up, worked on writing short stories and programming.


To properly evaluate LLM apps, we often need to point our evaluation at an
internal step of our application, such as the retrieved context. Doing so allows
us to compute metrics such as context relevance and groundedness.

For LlamaIndex applications where the source nodes are used, `select_context`
can be used to access the retrieved text for evaluation.

In [None]:
import numpy as np
from trulens.core import Feedback
from trulens.ext.provider.openai import OpenAI

provider = OpenAI()

context = TruLlama.select_context(query_engine)

f_context_relevance = (
    Feedback(provider.context_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)
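
As the `.aggregate(np.mean)` call suggests, a feedback applied to a selected context can produce one score per retrieved chunk, which is then reduced to a single value. The sketch below illustrates only that scoring-then-aggregating pattern. The `toy_context_relevance` function is a hypothetical word-overlap stand-in for an LLM-based score, and `statistics.mean` is used in place of `np.mean` to keep the example dependency-free.

```python
from statistics import mean


def toy_context_relevance(question: str, chunk: str) -> float:
    """Hypothetical stand-in for an LLM-based relevance score in [0, 1].

    Scores a chunk by the fraction of question words it contains.
    """
    q_words = set(question.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / len(q_words) if q_words else 0.0


question = "author childhood programming"
chunks = [
    "the author wrote short stories in childhood",
    "programming on the IBM 1401",
    "an unrelated paragraph about painting",
]

# One score per retrieved chunk, then a single aggregate for the record.
scores = [toy_context_relevance(question, chunk) for chunk in chunks]
aggregate = mean(scores)
```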

For added flexibility, the `select_context` method is also made available through
`trulens.core.app.App`. This allows you to switch between frameworks without
changing your context selector:

In [None]:
from trulens.core.app import App

context = App.select_context(query_engine)

The full quickstart is available here: [LlamaIndex Quickstart](../../../getting_started/quickstarts/llama_index_quickstart)

## Async Support
TruLlama also provides async support for LlamaIndex through the `aquery`,
`achat`, and `astream_chat` methods. This allows you to track and evaluate async
applications.

As an example, below is a LlamaIndex chat engine that we will call asynchronously through `achat`.

In [6]:
# Imports main tools:
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
from trulens.core import Tru
from trulens.ext.instrument.llamaindex import TruLlama

tru = Tru()
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)

chat_engine = index.as_chat_engine()

To instrument a LlamaIndex chat engine for use with `achat`, all that's required is to wrap it using TruLlama, just like with the query engine.

In [7]:
tru_chat_recorder = TruLlama(chat_engine)

with tru_chat_recorder as recording:
    llm_response_async = await chat_engine.achat(
        "What did the author do growing up?"
    )

print(llm_response_async)

A new object of type ChatMemoryBuffer at 0x2bf581210 is calling an instrumented method put. The path of this call may be incorrect.
Guessing path of new object is app.memory based on other object (0x2bf5e5050) using this function.
Could not determine main output from None.
Could not determine main output from None.
Could not determine main output from None.
Could not determine main output from None.


The author worked on writing short stories and programming while growing up.
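
The top-level `await` in the cell above works because a notebook already has a running event loop. In a plain Python script, the same call would need to run inside a coroutine. A minimal sketch, using a hypothetical `StubChatEngine` in place of a real LlamaIndex chat engine so that it runs without network access:

```python
import asyncio


class StubChatEngine:
    """Hypothetical stand-in for a chat engine exposing an async `achat`."""

    async def achat(self, message: str) -> str:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"echo: {message}"


async def main() -> str:
    chat_engine = StubChatEngine()
    # With a real app, this call would sit inside
    # `with tru_chat_recorder as recording:`.
    return await chat_engine.achat("What did the author do growing up?")


# asyncio.run creates an event loop, runs the coroutine, and closes the loop.
result = asyncio.run(main())
```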


## Streaming Support

TruLlama also provides streaming support for LlamaIndex. This allows you to track and evaluate streaming applications.

As an example, below is a LlamaIndex chat engine with streaming.

In [8]:
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader

documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)

chat_engine = index.as_chat_engine(streaming=True)

As with the other engines, wrap your streaming chat engine with TruLlama and use it as before.

You can also print the response tokens as they are generated using the `response_gen` attribute.
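
If you also need the complete text, the tokens can be collected as they are printed. A minimal sketch of that pattern, with a hypothetical `fake_response_gen` generator standing in for a real `response_gen`:

```python
from typing import Iterator


def fake_response_gen() -> Iterator[str]:
    """Hypothetical stand-in for `response.response_gen`; yields text pieces."""
    yield from ["The author ", "wrote short stories ", "and programmed."]


pieces = []
for token in fake_response_gen():
    print(token, end="", flush=True)  # show tokens on one line as they arrive
    pieces.append(token)
print()  # finish the line once the stream is exhausted

# Join the collected pieces to recover the full response text.
full_text = "".join(pieces)
```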

In [9]:
tru_chat_engine_recorder = TruLlama(chat_engine)

with tru_chat_engine_recorder as recording:
    response = chat_engine.stream_chat("What did the author do growing up?")

for c in response.response_gen:
    print(c)

A new object of type ChatMemoryBuffer at 0x2c1df9950 is calling an instrumented method put. The path of this call may be incorrect.
Guessing path of new object is app.memory based on other object (0x2c08b04f0) using this function.
Could not find usage information in openai response:
<openai.Stream object at 0x2bf5f3ed0>
Could not find usage information in openai response:
<openai.Stream object at 0x2bf5f3ed0>


For more usage examples, check out the [LlamaIndex examples directory](https://github.com/truera/trulens/tree/main/trulens_eval/examples/frameworks/llama_index).

## Appendix: LlamaIndex Instrumented Classes and Methods

The modules, classes, and methods that TruLens instruments can be retrieved from
the appropriate `Instrument` subclass.

In [14]:
from trulens.ext.instrument.llamaindex import LlamaInstrument

LlamaInstrument().print_instrumentation()

Module langchain*
  Class langchain.agents.agent.BaseMultiActionAgent
    Method plan: (self, intermediate_steps: 'List[Tuple[AgentAction, str]]', callbacks: 'Callbacks' = None, **kwargs: 'Any') -> 'Union[List[AgentAction], AgentFinish]'
    Method aplan: (self, intermediate_steps: 'List[Tuple[AgentAction, str]]', callbacks: 'Callbacks' = None, **kwargs: 'Any') -> 'Union[List[AgentAction], AgentFinish]'
  Class langchain.agents.agent.BaseSingleActionAgent
    Method plan: (self, intermediate_steps: 'List[Tuple[AgentAction, str]]', callbacks: 'Callbacks' = None, **kwargs: 'Any') -> 'Union[AgentAction, AgentFinish]'
    Method aplan: (self, intermediate_steps: 'List[Tuple[AgentAction, str]]', callbacks: 'Callbacks' = None, **kwargs: 'Any') -> 'Union[AgentAction, AgentFinish]'
  Class langchain.chains.base.Chain
    Method invoke: (self, input: Dict[str, Any], config: Optional[langchain_core.runnables.config.RunnableConfig] = None, **kwargs: Any) -> Dict[str, Any]
    Method ainvoke: (sel

### Instrumenting other classes/methods
Additional classes and methods can be instrumented using the
`trulens_eval.instruments.Instrument` methods and decorators. Examples of
such usage can be found in the custom app used in the `custom_example.ipynb`
notebook. The app itself is defined in
`trulens_eval/examples/expositional/end2end_apps/custom_app/custom_app.py`. More
information about these decorators can be found in the
`docs/trulens_eval/tracking/instrumentation/index.ipynb` notebook.
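
Conceptually, instrumenting a method means wrapping it so that each call is recorded. The sketch below shows that general pattern in plain Python. The `record_calls` decorator and the `CALL_LOG` list are hypothetical and do not reflect how TruLens implements its decorators; refer to the notebook above for the actual API.

```python
import functools

CALL_LOG = []  # hypothetical in-memory store of recorded calls


def record_calls(func):
    """Wrap a method so that its arguments and return value are logged."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        CALL_LOG.append(
            {
                "name": func.__qualname__,
                "args": args[1:],  # drop `self` from the recorded arguments
                "kwargs": kwargs,
                "result": result,
            }
        )
        return result

    return wrapper


class CustomRetriever:
    """Hypothetical custom component whose `retrieve` method is recorded."""

    @record_calls
    def retrieve(self, query: str) -> list:
        return [f"document about {query}"]


CustomRetriever().retrieve("llamas")
```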

### Inspecting instrumentation
The specific objects (of the above classes) and methods instrumented for a
particular app can be inspected using `App.print_instrumented`, as shown in
the next cell. Unlike `Instrument.print_instrumentation`, this function shows
only what was actually instrumented in the app.

In [11]:
tru_chat_engine_recorder.print_instrumented()

Components:
	TruLlama (Other) at 0x2bf5d5d10 with path __app__
	OpenAIAgent (Other) at 0x2bf535a10 with path __app__.app
	ChatMemoryBuffer (Other) at 0x2bf537210 with path __app__.app.memory
	SimpleChatStore (Other) at 0x2be6ef710 with path __app__.app.memory.chat_store

Methods:
Object at 0x2bf537210:
	<function ChatMemoryBuffer.put at 0x2b14c19e0> with path __app__.app.memory
	<function BaseMemory.put at 0x2b1448f40> with path __app__.app.memory
Object at 0x2bf535a10:
	<function BaseQueryEngine.query at 0x2b137dc60> with path __app__.app
	<function BaseQueryEngine.aquery at 0x2b137e2a0> with path __app__.app
	<function AgentRunner.chat at 0x2bf5aa160> with path __app__.app
	<function AgentRunner.achat at 0x2bf5aa2a0> with path __app__.app
	<function AgentRunner.stream_chat at 0x2bf5aa340> with path __app__.app
	<function BaseQueryEngine.retrieve at 0x2b137e340> with path __app__.app
	<function BaseQueryEngine.synthesize at 0x2b137e3e0> with path __app__.app
	<function BaseChatEngine.