# Llama-Index Integration

TruLens provides TruLlama, a deep integration with Llama-Index that lets you inspect and evaluate the internals of applications built with Llama-Index.

TruLlama captures all of the metrics and metadata listed in the [instrumentation overview](../basic_instrumentation). In addition, TruLlama provides the `select_source_nodes` method to capture the source nodes of your query.

## Supported methods
TruLlama supports both sync and async modes through the following Llama-Index query engine and chat engine methods:
* `query`
* `aquery`
* `chat`
* `achat`
* `stream_chat`
* `astream_chat`

## Example usage

Below is a quick usage example. First, we'll create a standard Llama-Index query engine from Paul Graham's essay, *What I Worked On*.

In [None]:
from llama_index import VectorStoreIndex, SimpleWebPageReader
from trulens_eval import TruLlama

documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()

To instrument a Llama-Index query engine, all that's required is to wrap it using TruLlama. The instrumented query engine can then be used like the original.

In [None]:
tru_query_engine = TruLlama(query_engine)

llm_response = tru_query_engine.query("What did the author do growing up?")
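
To make the "wrap it, then use it like the original" idea concrete, here is a minimal, standard-library-only sketch of the general wrapper pattern. It is not TruLlama's implementation: `RecordingWrapper` and `EchoEngine` are hypothetical names invented for this illustration, and the real integration records far more than an input, an output, and a latency.

```python
import time


class RecordingWrapper:
    """Toy stand-in for an instrumenting wrapper (NOT TruLlama itself)."""

    def __init__(self, engine):
        self._engine = engine
        self.records = []  # one dict per recorded call

    def query(self, prompt):
        # Time the call, delegate to the wrapped engine, then store a record.
        start = time.perf_counter()
        response = self._engine.query(prompt)
        self.records.append(
            {
                "input": prompt,
                "output": str(response),
                "latency_s": time.perf_counter() - start,
            }
        )
        return response

    def __getattr__(self, name):
        # Anything not overridden falls through to the wrapped engine.
        return getattr(self._engine, name)


class EchoEngine:
    """Hypothetical engine used only to make the sketch runnable."""

    def query(self, prompt):
        return f"echo: {prompt}"


wrapped = RecordingWrapper(EchoEngine())
answer = wrapped.query("What did the author do growing up?")
print(answer)
print(len(wrapped.records))
```

The caller's code is unchanged apart from the wrapping step, which is the property the TruLlama example above relies on.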

You can find the full quickstart available here: [Llama-Index Quickstart](../llama_index_quickstart)

## Async Support
TruLlama also provides async support for Llama-Index through the `aquery`, `achat`, and `astream_chat` methods. This allows you to track and evaluate async applications.

As an example, below is a Llama-Index async chat engine (`achat`).

In [None]:
# Imports main tools:
from trulens_eval import TruLlama, Feedback, Tru, feedback, Select
tru = Tru()

from llama_index import VectorStoreIndex, SimpleWebPageReader

documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)

chat_engine = index.as_chat_engine(streaming=True)

To instrument a Llama-Index chat engine, all that's required is to wrap it using TruLlama, just like with the query engine. The instrumented chat engine can then be used like the original, including through `achat`.

In [None]:
tru_achat_engine = TruLlama(chat_engine)

# The instrumented chat engine can operate like the original:
llm_response_async = await tru_achat_engine.achat("What did the author do growing up?")

print(llm_response_async)
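
The top-level `await` in the cell above works in a notebook, but a plain Python script needs an event loop to drive the coroutine. The sketch below shows one way to do that with `asyncio.run`. `StubChatEngine` is a hypothetical stand-in that only mimics the shape of an `achat` coroutine so the example runs without Llama-Index or an API key; in a real script you would pass the instrumented chat engine instead.

```python
import asyncio


class StubChatEngine:
    """Hypothetical stand-in exposing an `achat` coroutine."""

    async def achat(self, message):
        await asyncio.sleep(0)  # yield control once, as real I/O would
        return f"stub reply to: {message}"


async def main(engine):
    # Inside a coroutine, `await` behaves as it does in a notebook cell.
    return await engine.achat("What did the author do growing up?")


# asyncio.run creates an event loop, runs main() to completion, and closes the loop.
reply = asyncio.run(main(StubChatEngine()))
print(reply)
```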

## Streaming Support

TruLlama also provides streaming support for Llama-Index. This allows you to track and evaluate streaming applications.

As an example, below is a Llama-Index query engine with streaming.

In [None]:
from llama_index import VectorStoreIndex, SimpleWebPageReader
from trulens_eval import TruLlama

documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(streaming=True)

As with the other methods, wrap your streaming query engine with TruLlama and use it as before.

You can also print the response tokens as they are generated using the `response_gen` attribute.

In [None]:
tru_query_engine = TruLlama(query_engine)

response = tru_query_engine.query("What did the author do growing up?")

for c in response.response_gen:
    print(c)
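
If you also need the complete text after streaming, one option is to collect the tokens while printing them. The sketch below assumes only that `response_gen` is an ordinary Python generator of strings. `StubStreamingResponse` and its tokens are hypothetical placeholders so the example runs without Llama-Index.

```python
class StubStreamingResponse:
    """Hypothetical response object exposing a `response_gen` generator."""

    def __init__(self, tokens):
        self.response_gen = (t for t in tokens)


response = StubStreamingResponse(["first ", "second ", "third"])

chunks = []
for token in response.response_gen:
    print(token, end="", flush=True)  # show tokens on one line as they arrive
    chunks.append(token)
print()

# Join the collected tokens to recover the full streamed text.
full_text = "".join(chunks)

# A generator can be consumed only once, so a second pass yields nothing.
leftover = list(response.response_gen)
```

Because a generator is exhausted after one pass, keep the collected `chunks` (or `full_text`) if the output is needed again later.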

For more usage examples, check out the [Llama-Index examples directory](https://github.com/truera/trulens/tree/main/trulens_eval/examples/frameworks/llama_index).