# RAG Workbench - LlamaIndex Quickstart

In this notebook, we show how to use the **LastMile Tracing SDK** and **RAG Workbench** to manage, evaluate, and debug your LlamaIndex applications. With tracing set up automatically, you can inspect each step of your application in LastMile's RAG Workbench.

## Notebook Outline
* [Step 1: Install and Setup](#setup)
* [Step 2: Configure LastMile Instrumentor](#step2)
* [Step 3: Load Document](#step3)
* [Step 4: Create an Index and Query Engine](#step4)
* [Step 5: Query the Index](#step5)
* [Step 6: View Results in RAG Workbench UI](#step6)


<a name="setup"></a>
# Step 1: Install and Setup

First, install the required packages.

In [1]:
%pip install -q --upgrade llama-index-embeddings-openai
%pip install -q html2text llama-index pandas pyarrow tqdm
%pip install -q llama-index-readers-web
%pip install -q llama-index-callbacks-openinference
%pip install -q --upgrade openai
%pip install -q --upgrade "tracing-auto-instrumentation[llama-index]"
# Needed for the local (.env) setup path in the next cell
%pip install -q python-dotenv



You need the following API tokens/keys:
1. **LastMile AI API Token:** Get from the [Settings page](https://lastmileai.dev/settings?page=tokens) after creating a free LastMile AI account.
2. **OpenAI API Key:** Create from the [API Keys page](https://platform.openai.com/account/api-keys).

Set up your keys using one of the following methods:

* **Google Colab:** Add secrets `OPENAI_API_KEY` and `LASTMILE_API_TOKEN` in "Secrets Manager" (lock icon on left).
* **Local Notebook:** Create a `.env` file and add a line for each key (e.g., `LASTMILE_API_TOKEN=your-api-token`).

Run the code cell below after setting the keys. Avoid inputting keys directly in the notebook.

In [2]:
import os

try:
    # If running on Google Colab, use userdata to securely input keys
    from google.colab import userdata
    OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
    LASTMILE_API_TOKEN = userdata.get('LASTMILE_API_TOKEN')
except ModuleNotFoundError:
    # If running locally, load keys from .env file
    from dotenv import load_dotenv
    load_dotenv()
    OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
    LASTMILE_API_TOKEN = os.getenv('LASTMILE_API_TOKEN')

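# Fail early with a clear message; assigning None to os.environ raises a TypeError
if not OPENAI_API_KEY or not LASTMILE_API_TOKEN:
    raise RuntimeError("Set OPENAI_API_KEY and LASTMILE_API_TOKEN before continuing.")

os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
os.environ['LASTMILE_API_TOKEN'] = LASTMILE_API_TOKEN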
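
To confirm the keys were picked up without printing them in full, a small helper like the one below can be handy. This is a minimal sketch: `mask_secret` is a hypothetical name introduced here for illustration and is not part of any LastMile or OpenAI library.

```python
import os


def mask_secret(value, visible=4):
    """Return a display-safe version of a secret (hypothetical helper)."""
    if not value:
        return "<missing>"
    if len(value) <= visible:
        # Too short to reveal any prefix safely
        return "*" * len(value)
    # Keep a short prefix so the key is recognizable, mask the rest
    return value[:visible] + "*" * (len(value) - visible)


for name in ("OPENAI_API_KEY", "LASTMILE_API_TOKEN"):
    print(f"{name}: {mask_secret(os.environ.get(name))}")
```

If either line prints `<missing>`, revisit the key setup above before continuing.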

<a name="step2"></a>

# Step 2: Configure LastMile Instrumentor

Next, we configure the LastMile Instrumentor, which auto-instruments tracing for your LlamaIndex application. This lets us see each step of our RAG system in the RAG Workbench UI.

In [3]:
import llama_index.core
from tracing_auto_instrumentation.llama_index import LlamaIndexCallbackHandler

llama_index.core.global_handler = LlamaIndexCallbackHandler(
    project_name="LlamaIndex-QuickStart",
)
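
For intuition about what a trace contains, here is a purely conceptual toy that records nested, timed steps ("spans"). It is not the LastMile SDK and does not reflect its internals; `ToyTracer` and the span names are made up for illustration.

```python
import time
from contextlib import contextmanager


class ToyTracer:
    """Records finished spans as (name, depth, duration_seconds) tuples."""

    def __init__(self):
        self.spans = []
        self._depth = 0

    @contextmanager
    def span(self, name):
        depth = self._depth
        self._depth += 1
        start = time.perf_counter()
        try:
            yield
        finally:
            self._depth -= 1
            # A span is recorded when it finishes, so children appear before parents
            self.spans.append((name, depth, time.perf_counter() - start))


tracer = ToyTracer()
with tracer.span("query"):
    with tracer.span("retrieve"):
        pass  # stand-in for fetching relevant chunks
    with tracer.span("synthesize"):
        pass  # stand-in for the LLM call

for name, depth, duration in tracer.spans:
    print(f"{'  ' * depth}{name}: {duration:.6f}s")
```

With auto-instrumentation, you do not write spans like these by hand; the callback handler configured above records the application's steps for you.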

<a name="step3"></a>

# Step 3: Load Document


In [4]:
from llama_index.core.node_parser import SentenceSplitter
from llama_index.readers.web import SimpleWebPageReader

# Download the essay as a document
documents = SimpleWebPageReader().load_data(["https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt"])

# Split the document into sentence-aware chunks (nodes)
parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(documents)
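
To illustrate the idea behind sentence-aware chunking, here is a simplified, standard-library-only sketch. It is not the `SentenceSplitter` algorithm (which, among other differences, measures size in tokens and supports overlap between chunks); `split_into_chunks` and its `max_chars` parameter are hypothetical names used only for this example.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Greedily pack whole sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars becomes its own oversized chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = "I wrote short stories. I also programmed. Later I got a TRS-80."
chunks = split_into_chunks(sample, max_chars=45)
for chunk in chunks:
    print(repr(chunk))
```

The point is that chunk boundaries fall between sentences rather than mid-sentence, which tends to keep each retrieved chunk readable on its own.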

<a name="step4"></a>

# Step 4: Create an Index and Query Engine

In [5]:
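from llama_index.core import VectorStoreIndex

# Embed the documents and build an in-memory vector index
index = VectorStoreIndex.from_documents(documents)
# The query engine retrieves relevant chunks and asks the LLM to answer from them
query_engine = index.as_query_engine()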
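
Conceptually, a vector index answers a query by comparing the query's embedding against the stored chunk embeddings and returning the closest matches. The toy below sketches that idea with bag-of-words counts in place of a learned embedding model; `embed`, `cosine`, and `top_k` are illustrative names, not LlamaIndex APIs, and the sample chunks are invented.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts (a real index uses a learned embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query as (score, chunk) pairs."""
    query_vector = embed(query)
    scored = [(cosine(query_vector, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


chunks = [
    "he wrote short stories as a kid",
    "yc funds startups in batches",
    "he painted still lifes in florence",
]
results = top_k("what startups does yc fund", chunks, k=1)
print(results)
```

The retrieved chunks are then passed to the LLM as context, which is the retrieval step you will later see in the trace.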

<a name="step5"></a>

# Step 5: Query the Index

In [6]:
import textwrap

max_characters_per_line = 80
queries = [
    "What did Paul Graham do growing up?",
    "When and how did Paul Graham's mother die?",
    "What, in Paul Graham's opinion, is the most distinctive thing about YC?"
]

for query in queries:
    response = query_engine.query(query)
    print("Query")
    print("=====")
    print(textwrap.fill(query, max_characters_per_line))
    print()
    print("Response")
    print("========")
    print(textwrap.fill(str(response), max_characters_per_line))
    print()

Query
=====
What did Paul Graham do growing up?

Response
========
Paul Graham wrote short stories and began programming on the IBM 1401 in 9th
grade using an early version of Fortran. Later, he convinced his father to buy a
TRS-80, where he wrote simple games, a rocket prediction program, and a word
processor.

Query
=====
When and how did Paul Graham's mother die?

Response
========
Paul Graham's mother died when he was 18 years old, from a brain tumor.

Query
=====
What, in Paul Graham's opinion, is the most distinctive thing about YC?

Response
========
The most distinctive thing about Y Combinator, according to Paul Graham, is that
it provided a way to scale startup funding by funding startups in batches.



<a name="step6"></a>

# Step 6: View Results in RAG Workbench UI
We can view the results in the RAG Workbench UI.

From your terminal, export your `LASTMILE_API_TOKEN`:

```bash
export LASTMILE_API_TOKEN="<your-api-token>"
```

Next, run the following command in your terminal to launch the UI:

```bash
rag-debug launch
```

Navigate to the URL provided by RAG Workbench, which opens in your web browser. It will look like http://localhost:8080/.

1. Click the **Traces** tab.
2. Select the project **'LlamaIndex-QuickStart'**. You can see the logged traces in the UI.

<img width="850" alt="llama_overview" src="https://github.com/lastmile-ai/aiconfig/assets/81494782/01c084be-01a6-4d77-a4e4-a59e00e55af7">



3. Click on a Trace to see the steps of our LlamaIndex application and other metadata.

<img width="850" alt="llama_trace_view" src="https://github.com/lastmile-ai/aiconfig/assets/81494782/a379f572-4998-4bf1-8a97-1f044a98f97c">

Now you can dive into specific steps and check out logs, parameters, and other aspects of your application!

