# 📓 Llama-Index Integration

TruLens provides TruLlama, a deep integration with Llama-Index to allow you to
inspect and evaluate the internals of your application built using Llama-Index.
This is done through the instrumentation of key Llama-Index classes and methods.
For the full list of instrumented classes and methods, see *Appendix:
Llama-Index Instrumented Classes and Methods*.

In addition to the default instrumentation, TruLlama exposes the
`select_context` and `select_source_nodes` methods for evaluations that require
access to retrieved context or source nodes. With these methods you do not need
to know the JSON structure of your app ahead of time, and your evaluations
become reusable across different apps.
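To make the reusability point concrete, here is a toy sketch that uses plain dictionaries rather than real TruLens records. The record layouts and the helpers `get_by_path` and `find_retriever_path` are hypothetical and only illustrate the idea: a hand-written path is tied to one app's structure, while a path derived from the app itself carries over.

```python
# Toy illustration only: these dicts are NOT real TruLens records.
from typing import Any, List, Optional

# Two hypothetical apps that store retrieved text in different places.
record_a = {"calls": {"retriever": {"retrieve": {"rets": ["chunk 1", "chunk 2"]}}}}
record_b = {"calls": {"engine": {"retriever": {"retrieve": {"rets": ["chunk 3"]}}}}}


def get_by_path(record: dict, path: List[str]) -> Any:
    """Follow a list of keys into a nested dict."""
    node: Any = record
    for key in path:
        node = node[key]
    return node


def find_retriever_path(
    node: Any, prefix: Optional[List[str]] = None
) -> Optional[List[str]]:
    """Depth-first search for the first 'retriever' key; return the path to its results."""
    prefix = prefix or []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "retriever":
                return prefix + [key, "retrieve", "rets"]
            found = find_retriever_path(value, prefix + [key])
            if found is not None:
                return found
    return None


# A path written by hand for app A breaks on app B ...
hard_coded = ["calls", "retriever", "retrieve", "rets"]
try:
    get_by_path(record_b, hard_coded)
    hard_coded_works_on_b = True
except KeyError:
    hard_coded_works_on_b = False

# ... while a path discovered from each app's structure works for both.
contexts_a = get_by_path(record_a, find_retriever_path(record_a))
contexts_b = get_by_path(record_b, find_retriever_path(record_b))
```

In this sketch the feedback definition only ever sees the selected contexts, which is why it does not need to change when the app's layout does.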


## Example usage

Below is a quick example of usage. First, we'll create a standard Llama-Index query engine from Paul Graham's essay, *What I Worked On*.

```python
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader

documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
```

To instrument a Llama-Index query engine, all that's required is to wrap it using TruLlama.

```python
from trulens_eval import TruLlama
tru_query_engine_recorder = TruLlama(query_engine)

with tru_query_engine_recorder as recording:
    print(query_engine.query("What did the author do growing up?"))
```

To properly evaluate LLM apps we often need to point our evaluation at an
internal step of our application, such as the retrieved context. Doing so allows
us to evaluate for metrics including context relevance and groundedness.

For Llama-Index applications where the source nodes are used, `select_context`
can be used to access the retrieved text for evaluation.

Example usage:

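```python
import numpy as np
from trulens_eval import Feedback
from trulens_eval.feedback.provider import OpenAI

# Assumption: an OpenAI-backed provider; requires an OpenAI API key.
provider = OpenAI()

# Select the retrieved context from the query engine.
context = TruLlama.select_context(query_engine)

# Score each retrieved chunk against the input, then average the scores.
f_context_relevance = (
    Feedback(provider.qs_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)
```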
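To show what the final `.aggregate(np.mean)` step does, the following standalone sketch scores each retrieved chunk separately and then averages the scores. It does not call TruLens or an LLM: `toy_relevance` is a hypothetical word-overlap scorer standing in for the provider's relevance function, and `statistics.mean` is used in place of `np.mean` to keep the example dependency-free.

```python
from statistics import mean
from typing import List


def toy_relevance(question: str, context: str) -> float:
    """Stand-in scorer (assumption): fraction of question words found in the context."""
    q_words = set(question.lower().split())
    c_words = set(context.lower().split())
    return len(q_words & c_words) / len(q_words) if q_words else 0.0


def mean_context_relevance(question: str, contexts: List[str]) -> float:
    """Score each retrieved chunk on its own, then aggregate with the mean."""
    scores = [toy_relevance(question, c) for c in contexts]
    return mean(scores)


question = "where did the author work"
contexts = [
    "the author did work on lisp",  # overlaps with most of the question
    "painting in florence",         # no overlap with the question
]

per_chunk = [toy_relevance(question, c) for c in contexts]
overall = mean_context_relevance(question, contexts)
print(per_chunk, overall)
```

Averaging means that a single irrelevant chunk lowers the overall score rather than being hidden by a relevant one; other aggregators (for example a minimum or maximum) would emphasize different failure modes.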

For added flexibility, the `select_context` method is also available through
`trulens_eval.app.App`. This allows you to switch between frameworks without
changing your context selector:

```python
from trulens_eval.app import App
context = App.select_context(query_engine)
```

The full quickstart is available here: [Llama-Index Quickstart](/trulens_eval/llama_index_quickstart)

## Async Support
TruLlama also provides async support for Llama-Index through the `aquery`,
`achat`, and `astream_chat` methods. This allows you to track and evaluate async
applications.

As an example, below is a Llama-Index async chat engine (`achat`).

```python
# Imports main tools:
from trulens_eval import TruLlama, Tru
tru = Tru()

from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader

documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)

chat_engine = index.as_chat_engine(streaming=True)
```

To instrument a Llama-Index `achat` engine, all that's required is to wrap it using TruLlama, just like with the query engine.

```python
tru_chat_recorder = TruLlama(chat_engine)

with tru_chat_recorder as recording:
    # Chat engines expose `achat`; `aquery` belongs to query engines.
    llm_response_async = await chat_engine.achat("What did the author do growing up?")

print(llm_response_async)
```
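The cell above relies on top-level `await`, which works in a notebook but not in a plain Python script. A minimal sketch of the script form follows, using `asyncio.run` to start the event loop. `StubChatEngine` and `main` are hypothetical stand-ins so the example runs without Llama-Index or an API key; in a real app the awaited call would be made on the actual chat engine inside the recorder's `with` block.

```python
import asyncio


class StubChatEngine:
    """Hypothetical stand-in for a chat engine exposing an async `achat` method."""

    async def achat(self, message: str) -> str:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"stub answer to: {message}"


async def main() -> str:
    chat_engine = StubChatEngine()
    # In the real app this call would sit inside `with tru_chat_recorder as recording:`.
    return await chat_engine.achat("What did the author do growing up?")


# Outside a notebook there is no running event loop, so start one explicitly.
llm_response_async = asyncio.run(main())
print(llm_response_async)
```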

## Streaming Support

TruLlama also provides streaming support for Llama-Index. This allows you to track and evaluate streaming applications.

As an example, below is a Llama-Index query engine with streaming.

```python
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
from trulens_eval import TruLlama

documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(streaming=True)
```

As with the other engines, wrap your streaming query engine with TruLlama and use it as before.

You can also print the response tokens as they are generated using the `response_gen` attribute.

```python
tru_query_engine_recorder = TruLlama(query_engine)

with tru_query_engine_recorder as recording:
    response = query_engine.query("What did the author do growing up?")

for c in response.response_gen:
    print(c)
```
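Because a Python generator can only be consumed once, it is often useful to collect the tokens while printing them so the full text is still available afterwards. The sketch below shows that pattern with `fake_response_gen`, a hypothetical stand-in for `response.response_gen`, so it runs without Llama-Index.

```python
from typing import Iterator, List


def fake_response_gen() -> Iterator[str]:
    """Hypothetical stand-in for `response.response_gen`: yields tokens one at a time."""
    for token in ["I ", "wrote ", "short ", "stories."]:
        yield token


tokens: List[str] = []
for token in fake_response_gen():
    print(token, end="", flush=True)  # show each token as soon as it arrives
    tokens.append(token)              # keep it so the full text is available later
print()

full_text = "".join(tokens)
```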

For more usage examples, check out the [Llama-Index examples directory](https://github.com/truera/trulens/tree/main/trulens_eval/examples/frameworks/llama_index).

## Appendix: Llama-Index Instrumented Classes and Methods

As of `trulens_eval` version 0.20.0, TruLlama instruments the following classes by default:

* `BaseComponent`
* `BaseLLM`
* `BaseQueryEngine`
* `BaseRetriever`
* `BaseIndex`
* `BaseChatEngine`
* `Prompt`
* `llama_index.prompts.prompt_type.PromptType` # enum
* `BaseQuestionGenerator`
* `BaseSynthesizer`
* `Refine`
* `LLMPredictor`
* `LLMMetadata`
* `BaseLLMPredictor`
* `VectorStore`
* `ServiceContext`
* `PromptHelper`
* `BaseEmbedding`
* `NodeParser`
* `ToolMetadata`
* `BaseTool`
* `BaseMemory`
* `WithFeedbackFilterNodes`


TruLlama instruments the following methods:

* `query`
* `aquery`
* `chat`
* `achat`
* `stream_chat`
* `astream_chat`
* `complete`
* `stream_complete`
* `acomplete`
* `astream_complete`
* `__call__`
* `call`
* `acall`
* `put`
* `get_response`
* `predict`
* `retrieve`
* `synthesize`

### Instrumenting other classes/methods
Additional classes and methods can be instrumented using the
`trulens_eval.utils.instruments.Instrument` methods and decorators. For an
example, see the custom app in
`trulens_eval/examples/expositional/end2end_apps/custom_app/custom_app.py`,
which is used by the `custom_example.ipynb` notebook. More information about
these decorators can be found in the `trulens_instrumentation.ipynb` notebook.
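As a conceptual illustration of what instrumenting a method means, the toy decorator below records the arguments and return value of each call. This is not the TruLens `Instrument` API: `record_calls`, `CALL_LOG`, and `CustomRetriever` are hypothetical names used only for this sketch.

```python
import functools
from typing import Any, Callable, Dict, List

CALL_LOG: List[Dict[str, Any]] = []  # hypothetical in-memory record store


def record_calls(func: Callable) -> Callable:
    """Toy decorator: log the method name, arguments, and return value of each call."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        result = func(*args, **kwargs)
        CALL_LOG.append(
            {
                "method": func.__name__,
                "args": args[1:],  # drop `self`
                "kwargs": kwargs,
                "rets": result,
            }
        )
        return result

    return wrapper


class CustomRetriever:
    """Hypothetical custom component that default instrumentation would not cover."""

    @record_calls
    def retrieve(self, query: str) -> List[str]:
        return [f"chunk about {query}"]


CustomRetriever().retrieve("lisp")
print(CALL_LOG)
```

The wrapped method behaves exactly as before from the caller's point of view; the only difference is that each call leaves a record that an evaluation can later select from.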

### Inspecting instrumentation
The specific objects (of the above classes) and methods instrumented for a
particular app can be inspected using `App.print_instrumented`, as shown in the
next cell.

```python
tru_query_engine_recorder.print_instrumented()
```