# Llama 3.1 RAG Agent with LlamaIndex

<a target="_blank" href="https://colab.research.google.com/github/ytang07/ai_agents_cookbooks/blob/main/llamaindex/llama31_8b_rag_agent.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

This notebook walks you through building a LlamaIndex ReActAgent using Llama 3.1 70B. We use [OctoAI](https://octo.ai) as the provider for both the embeddings and the LLM.

## Install Dependencies

In [8]:
# ! pip install -qU llama-index llama-index-llms-openai llama-index-readers-file octoai llama-index-llms-octoai llama-index-embeddings-octoai llama-index-embeddings-openai llama-index-llms-openai-like

# ! pip freeze | grep llama-index-core
# ! pip freeze | grep embeddings-openai

In [9]:
# Additional imports
import logging
from httpx import HTTPStatusError, ConnectError

# logging.basicConfig(level=logging.DEBUG)
logging.getLogger().setLevel(logging.WARNING)


## Set Up API Keys
To run the rest of the notebook you will need an OctoAI API key. You can sign up for an account [here](https://octoai.cloud/). If you need further guidance, see OctoAI's [documentation page](https://octo.ai/docs/getting-started/how-to-create-octoai-access-token). The cell below reads the key from the `OCTOAI_API_KEY` environment variable, loading a local `.env` file first if one exists.

In [11]:
from os import environ
from getpass import getpass
# environ["OCTOAI_API_KEY"] = getpass("Input your OCTOAI API key: ")
from dotenv import load_dotenv

load_dotenv()

OCTOAI_API_KEY = environ["OCTOAI_API_KEY"]
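
The cell above raises a `KeyError` when `OCTOAI_API_KEY` is not set. As a minimal sketch of a more forgiving setup, the helper below checks the environment first and only prompts when the variable is missing. The name `resolve_api_key` and its `prompt` parameter are illustrative assumptions, not part of OctoAI or LlamaIndex.

```python
from getpass import getpass
from typing import Callable, Mapping, Optional


def resolve_api_key(
    env: Mapping[str, str],
    name: str = "OCTOAI_API_KEY",
    prompt: Optional[Callable[[str], str]] = None,
) -> str:
    """Return the key from `env`, or ask for it interactively when it is absent."""
    key = env.get(name, "").strip()
    if key:
        return key
    # Hypothetical fallback: prompt without echoing the key to the screen.
    ask = prompt if prompt is not None else getpass
    key = ask(f"Input your {name}: ").strip()
    if not key:
        raise RuntimeError(f"{name} is not set and no key was entered")
    return key
```

In the notebook this could be used as `OCTOAI_API_KEY = resolve_api_key(environ)` after `load_dotenv()` has run.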

## Import Libraries and Set Up LlamaIndex

In [12]:
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.embeddings.octoai import OctoAIEmbedding
from llama_index.core import Settings as LlamaGlobalSettings
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai_like import OpenAILike

# Set the default model to use for embeddings
LlamaGlobalSettings.embed_model = OctoAIEmbedding()

# Create an llm object to use for the QueryEngine and the ReActAgent
llm = OpenAILike(
    model="meta-llama-3.1-70b-instruct",
    api_base="https://text.octoai.run/v1",
    api_key=environ["OCTOAI_API_KEY"],
    context_window=40000,
    is_function_calling_model=True,
    is_chat_model=True,
)


## Load Documents

In [13]:
try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/food"
    )
    food_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/shampoo"
    )
    shampoo_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:
    # Storage is missing or unreadable, so the indexes are built in the next cell
    index_loaded = False
print("Indexes loaded:", index_loaded)

Indexes loaded: False
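
The cell above treats the two indexes as a single unit: if either one fails to load, both are rebuilt. A possible alternative, sketched here under assumptions, is a per-index helper that falls back to building only the index that is missing. `load_or_build`, `load`, and `build` are hypothetical names. In this notebook `load` would wrap `StorageContext.from_defaults` plus `load_index_from_storage`, and `build` would wrap the read, index, and persist steps.

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(
    persist_dir: str,
    load: Callable[[str], T],
    build: Callable[[str], T],
) -> T:
    """Use the persisted copy when it loads cleanly; otherwise build a fresh one."""
    if Path(persist_dir).is_dir():
        try:
            return load(persist_dir)
        except Exception as err:
            # Corrupt or partial storage: report it and fall through to a rebuild.
            print(f"Could not load {persist_dir}: {err}")
    return build(persist_dir)
```

With this shape, a missing `./storage/shampoo` directory would not force the food index to be re-embedded.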


This is where we create the vector indexes by computing an embedding vector for each chunk of the documents. The indexes are persisted under `./storage`, so this step only needs to run once.

In [14]:
if not index_loaded:
    # load data
    food_docs = SimpleDirectoryReader(
        input_files=["./food/foodInfo.pdf"]
    ).load_data()

    # build index
    food_index = VectorStoreIndex.from_documents(food_docs, show_progress=True)

    # persist index
    food_index.storage_context.persist(persist_dir="./storage/food")

    # load data
    shampoo_docs = SimpleDirectoryReader(
        input_files=["./shampoo/shampooInfo.pdf"]
    ).load_data()

    # build index
    shampoo_index = VectorStoreIndex.from_documents(shampoo_docs, show_progress=True)

    # persist index
    shampoo_index.storage_context.persist(persist_dir="./storage/shampoo")


Parsing nodes: 100%|██████████| 16/16 [00:00<00:00, 1247.77it/s]
Generating embeddings: 100%|██████████| 16/16 [00:02<00:00,  7.42it/s]
Parsing nodes: 100%|██████████| 6/6 [00:00<00:00, 631.85it/s]
Generating embeddings: 100%|██████████| 6/6 [00:00<00:00,  7.27it/s]
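
To make "chunks" concrete, here is a toy sketch of splitting text into overlapping windows before embedding. It is a simplification for illustration only: it splits on whitespace with a fixed window, whereas LlamaIndex's own node parser handles this step inside `VectorStoreIndex.from_documents`. The function name `chunk_words` and its default sizes are assumptions.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into windows of `chunk_size` words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of embedding some words twice.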


Now create the query engines.

In [15]:
food_engine = food_index.as_query_engine(similarity_top_k=3, llm=llm)
shampoo_engine = shampoo_index.as_query_engine(similarity_top_k=3, llm=llm)
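
The `similarity_top_k=3` argument asks each query engine to retrieve the three chunks whose embeddings are closest to the query embedding. The sketch below illustrates that idea with plain Python lists, assuming cosine similarity as the distance measure. The helpers `cosine` and `top_k` are illustrative and do not reproduce the library's retriever.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks, best first."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A larger `k` gives the LLM more context to answer from but also lengthens the prompt, so small values such as 3 are a common starting point.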

We can now wrap the query engines as tools for the agent to use.

Because there is one query engine per document, we define one tool for each.

In [16]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=food_engine,
        metadata=ToolMetadata(
            name="food",
            description=(
                "Provides information about ingredients in food. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=shampoo_engine,
        metadata=ToolMetadata(
            name="shampoo",
            description=(
                "Provides information about ingredients in shampoo. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]

## Creating the Agent
Now we have all the elements needed to create a LlamaIndex ReActAgent.

In [17]:
agent = ReActAgent.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    max_iterations=10,  # upper bound on reasoning/tool-call steps per query
)
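
To show roughly what a ReAct-style agent does with these tools, here is a toy loop: the model emits a thought and an action, the named tool runs, its result is fed back as an observation, and the loop ends on an answer or when a step cap is hit. This is a sketch under assumptions, not LlamaIndex's implementation. `run_react`, the `llm` callable, the `tools` dictionary, and `max_steps` are all hypothetical.

```python
import re
from typing import Callable, Dict, List


def run_react(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> str:
    """Alternate between model replies and tool calls until an answer appears."""
    transcript: List[str] = [f"Question: {question}"]
    for _ in range(max_steps):
        reply = llm("\n".join(transcript))
        transcript.append(reply)

        # A line starting with "Answer:" ends the loop.
        answer = re.search(r"^Answer:\s*(.*)", reply, re.M | re.S)
        if answer:
            return answer.group(1).strip()

        action = re.search(r"^Action:\s*(\w+)\s*$", reply, re.M)
        action_input = re.search(r"^Action Input:\s*(.*)$", reply, re.M)
        if not action or not action_input:
            transcript.append("Observation: could not parse an action.")
            continue

        # Look the tool up by the name the model chose and feed the result back.
        tool = tools.get(action.group(1))
        if tool is None:
            observation = f"unknown tool {action.group(1)!r}"
        else:
            observation = tool(action_input.group(1).strip())
        transcript.append(f"Observation: {observation}")
    raise RuntimeError("step limit reached without an answer")
```

The tool names and descriptions defined earlier matter because the model chooses a tool by name, based only on the text it has been given about each one.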

Now we can interact with the agent and ask a question.

In [19]:
response = agent.chat("Tell me about the Sodium Laureth Sulfate ingredients that is in some shampoo?")
print(str(response))

> Running step c5f13dff-5fb0-45bc-ba99-7fa205c7b048. Step input: Tell me about the Sodium Laureth Sulfate ingredients that is in some shampoo?
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: shampoo
Action Input: {'input': 'Sodium Laureth Sulfate ingredients in some shampoo'}
Observation: Sodium Laureth Sulfate (SLES) is a milder alternative to Sodium Lauryl Sulfate (SLS), providing good cleansing and foaming properties while being less irritating to the skin and hair. It is currently the most widely and largely used surfactant in shampoos.
> Running step 576c44b3-3a9a-421c-a0e4-d8f2e16ca8ce. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: Sodium Laureth Sulfate (SLES) is a mild alternative to Sodium Lauryl Sulfate (SLS), providing good cleansing and foaming properties while being less irritating to the sk